ice: handle flushing stale Tx timestamps in ice_ptp_tx_tstamp
In the event of a PTP clock time change due to .adjtime or .settime, the
ice driver needs to update the cached copy of the PHC time and also discard
any outstanding Tx timestamps.
This is required because otherwise the wrong copy of the PHC time will be
used when extending the Tx timestamp. This could result in reporting
incorrect timestamps to the stack.
The current approach taken to handle this is to call
ice_ptp_flush_tx_tracker, which will discard any timestamps which are not
yet complete.
This is problematic for two reasons:
1) it could lead to a potential race condition where the wrong timestamp is
associated with a future packet.
This can occur with the following flow:
1. Thread A gets request to transmit a timestamped packet, and picks an
index and transmits the packet
2. Thread B calls ice_ptp_flush_tx_tracker and sees the index in use,
marking is as disarded. No timestamp read occurs because the status
bit is not set, but the index is released for re-use
3. Thread A gets a new request to transmit another timestamped packet,
picks the same (now unused) index and transmits that packet.
4. The PHY transmits the first packet and updates the timestamp slot and
generates an interrupt.
5. The ice_ptp_tx_tstamp thread executes and sees the interrupt and a
valid timestamp but associates it with the new Tx SKB and not the one
that actual timestamp for the packet as expected.
This could result in the previous timestamp being assigned to a new
packet producing incorrect timestamps and leading to incorrect behavior
in PTP applications.
This is most likely to occur when the packet rate for Tx timestamp
requests is very high.
2) on E822 hardware, we must avoid reading a timestamp index more than once
each time its status bit is set and an interrupt is generated by
hardware.
We do have some extensive checks for the unread flag to ensure that only
one of either the ice_ptp_flush_tx_tracker or ice_ptp_tx_tstamp threads
read the timestamp. However, even with this we can still have cases
where we "flush" a timestamp that was actually completed in hardware.
This can lead to cases where we don't read the timestamp index as
appropriate.
To fix both of these issues, we must avoid calling ice_ptp_flush_tx_tracker
outside of the teardown path.
Rather than using ice_ptp_flush_tx_tracker, introduce a new state bitmap,
the stale bitmap. Start this as cleared when we begin a new timestamp
request. When we're about to extend a timestamp and send it up to the
stack, first check to see if that stale bit was set. If so, drop the
timestamp without sending it to the stack.
When we need to update the cached PHC timestamp out of band, just mark all
currently outstanding timestamps as stale. This will ensure that once
hardware completes the timestamp we'll ignore it correctly and avoid
reporting bogus timestamps to userspace.
With this change, we fix potential issues caused by calling
ice_ptp_flush_tx_tracker during normal operation.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>