netstacklat: Add sanity check for out-of-order sequence

The logic for excluding samples from TCP reads that may have been
delayed by HOL blocking relies on reading a number of fields from the
TCP socket outside of the socket lock. This may be prone to errors due
to the socket state being updated at another place in the kernel while
our eBPF program is running. To reduce the risk that a data race
causes the filter to fail, add a sanity check for the maximum
out-of-order sequence number used to exclude future TCP reads from
monitoring.

The most problematic of the read fields in the tcp_sock is
ooo_last_skb, as that is a pointer to another SKB rather than a direct
value. This pointer is only valid as long as the out_of_order_queue is
non-empty. Due to a data race, we may check that the ooo-queue is
non-empty while there are still SKBs in it, then have the kernel clear
out the ooo-queue, and finally attempt to read the ooo_last_skb
pointer later when it is no longer valid (and may now point to a
freed/recycled SKB). This may result in an incorrect value being used
for the sequence limit that excludes future reads of
ooo-segments. A faulty sequence limit may either cause reads of
HOL-blocked segments to be included, or cause an unnecessarily large
amount of future reads (up to 2 GB) to be excluded.
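
As a rough illustration of where the 2 GB figure comes from, the
sketch below assumes the filter compares 32-bit sequence numbers with
the usual wrap-safe signed-difference test. The helper names
seq_before() and excluded_bytes() and the copied_seq parameter are
hypothetical and are not taken from the netstacklat source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a comes before b" for 32-bit TCP sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper (not from netstacklat): number of bytes of future
 * reads that an exclusion limit would cover, counted from copied_seq.
 * A limit that is not ahead of copied_seq excludes nothing.
 */
static uint32_t excluded_bytes(uint32_t copied_seq, uint32_t limit)
{
	return seq_before(copied_seq, limit) ? limit - copied_seq : 0;
}
```

Under this comparison a limit can be treated as "ahead" by at most
2^31 bytes, so a garbage limit can wrongly exclude up to roughly 2 GB
of future reads before the filter recovers.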

To reduce the risk that garbage data from an invalid SKB is used,
introduce two sanity checks for end_seq in the ooo_last_skb. First,
check whether the sequence number is zero, and if so assume it is
invalid (even though zero can be a valid sequence number). While we
will get an error code if reading the data from this SKB fails
altogether, we may still succeed in reading from a no-longer-valid
SKB, in which case there is a high risk that the data has been zeroed.
Second, if the sequence number is non-zero, check that it is within
the current receive window, and clamp it to the receive window if it
is not.
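
The two checks can be sketched in plain C as below. This is a minimal
userspace illustration, not the eBPF code from this commit: the
function name sanitize_ooo_end_seq() is hypothetical, rcv_nxt and
rcv_wnd stand in for the corresponding tcp_sock values, and clamping
to the nearest window edge is an assumption about what "clamp to the
receive window" means.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a comes before b" for 32-bit TCP sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper (not from netstacklat). Returns false if end_seq
 * looks invalid; otherwise stores a limit inside the receive window
 * [rcv_nxt, rcv_nxt + rcv_wnd] in *limit and returns true.
 */
static bool sanitize_ooo_end_seq(uint32_t end_seq, uint32_t rcv_nxt,
				 uint32_t rcv_wnd, uint32_t *limit)
{
	uint32_t win_end = rcv_nxt + rcv_wnd; /* may wrap, which is fine */

	/* Check 1: zero most likely means the SKB memory was zeroed. */
	if (end_seq == 0)
		return false;

	/* Check 2: clamp to the nearest edge of the receive window
	 * (assumed interpretation of the clamping). */
	if (seq_before(end_seq, rcv_nxt))
		end_seq = rcv_nxt;
	else if (seq_before(win_end, end_seq))
		end_seq = win_end;

	*limit = end_seq;
	return true;
}
```

For example, with rcv_nxt = 1000 and rcv_wnd = 500, an end_seq of 5000
would be clamped to 1500, while an end_seq of 0 would be rejected.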
Signed-off-by: Simon Sundberg <[email protected]>