Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cubic start bounce #1831

Merged
merged 2 commits into from
Feb 4, 2025
Merged

Fix cubic start bounce #1831

merged 2 commits into from
Feb 4, 2025

Conversation

huitema
Copy link
Collaborator

@huitema huitema commented Feb 3, 2025

Fix issue #1830

@huitema
Copy link
Collaborator Author

huitema commented Feb 3, 2025

@hfstco there are 4 fixes in this PR:

  1. Tighten the bound of the satellite_cubic and satellite_cubic_loss case. The maximum duration parameters were set at 10 seconds, while the actual value was much better. Reflecting the actual value helped detect a potential regression noticed in the discussion of issue Cubic needs multiple stabilization steps after Hystart #1830.

  2. When exiting startup with HyStart, use path->cwin/2 instead of the full value. This is in line with the discussion in issue Cubic needs multiple stabilization steps after Hystart #1830, and avoids the succession of packet losses and window decrease observed on the traces.

  3. After reducing the CWIN, enter recovery. This is logical, because we expect to experience packet losses. The window was 'just right' when the packet acknowledgment provided an RTT measurement, but it doubled between the moment this packet was sent and the moment the ack was received. We have to wait until after recovery to see signals that match the new value of CWIN.

  4. Update the "ack frequency" test. It expect to observe an ACK interval of at least 1500us (in fact 1800), but we now observe an interval just above 1000us. This is 100 times the 10us RTT, probably triggered by the 1ms floor for the ACK frequency. The previous value was probably inflated due to excess delays after HyStart.

cubic_enter_avoidance(cubic_state, current_time);
/* enter recovery to ignore the losses expected if the window grew
* too large after the acknowleded packet was sent. */
cubic_enter_recovery(cnx, path_x, notification, cubic_state, current_time);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarification. We are entering recovery when HyStart trigger?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It is logical. On one hand, we know that Hystart triggered based on the CWIN one RTT before, that the CWIN will generally have doubled, and thus that there is very high risk of observing congestion losses after that. On the other hand, we just reduced the CWIN, so these likely losses do not reflect the new state of congestion control. That's the classic reasoning behind recovery.

@hfstco
Copy link
Collaborator

hfstco commented Feb 4, 2025

Tighten the bound of the satellite_cubic and satellite_cubic_loss case. The maximum duration parameters were set at 10 seconds, while the actual value was much better. Reflecting the actual value helped detect a potential regression noticed in the discussion of issue #1830.

Sounds good.

When exiting startup with HyStart, use path->cwin/2 instead of the full value. This is in line with the discussion in issue #1830, and avoids the succession of packet losses and window decrease observed on the traces.

After reducing the CWIN, enter recovery. This is logical, because we expect to experience packet losses. The window was 'just right' when the packet acknowledgment provided an RTT measurement, but it doubled between the moment this packet was sent and the moment the ack was received. We have to wait until after recovery to see signals that match the new value of CWIN.

Shared my thoughts in #1830.
But in general I like these changes. Let's see if HyStart++ does better.

@huitema
Copy link
Collaborator Author

huitema commented Feb 4, 2025

You are correct about Hystart++. I did not want to mix two issues in one PR, but there is definitely a risk of exiting the slow start too soon due to some transient condition. And then Cubic takes a long time to ramp up and recover.

@huitema huitema merged commit cd0223f into master Feb 4, 2025
11 checks passed
@huitema huitema deleted the fix-cubic-start-bounce branch February 4, 2025 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants