Skip to content

feat(quinn): Pass in &mut self for AsyncUdpSocket's poll_recv and re-introduce poll_send #2259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

matheus23
Copy link
Contributor

@matheus23 matheus23 commented Jun 3, 2025

This PR contains some changes to the AsyncUdpSocket trait. In short:

  • poll_recv now takes &mut self instead of &self
  • try_send is removed from AsyncUdpSocket
  • create_io_poller and the UdpPoller trait are now replaced with create_sender and the UdpSender trait
  • the UdpSender trait has a try_send as well as a poll_send function (that takes self: Pin<&mut Self>) instead of a poll_writable function

The final traits look like this:

pub trait AsyncUdpSocket: Send + Sync + Debug + 'static {
    fn create_sender(&self) -> Pin<Box<dyn UdpSender>>;

    fn poll_recv(
        &mut self,
        cx: &mut Context,
        bufs: &mut [IoSliceMut<'_>],
        meta: &mut [RecvMeta],
    ) -> Poll<io::Result<usize>>;

    fn local_addr(&self) -> io::Result<SocketAddr>;

    fn max_receive_segments(&self) -> usize;

    fn may_fragment(&self) -> bool;
}

pub trait UdpSender: Send + Sync + Debug + 'static {
    fn poll_send(
        self: Pin<&mut Self>,
        transmit: &Transmit,
        cx: &mut Context,
    ) -> Poll<io::Result<()>>;

    fn max_transmit_segments(&self) -> usize;
    
    fn try_send(self: Pin<&mut Self>, transmit: &Transmit) -> io::Result<()>;
}

Motivation

The general idea is to encode the invariants that the runtime and socket implementations need to adhere to into rust's type system.
One of the harder things for me was to understand initially when I was working with quinn's AsyncUdpSocket was when you're allowed to overwrite the waker you're getting from poll_recv and when you can't, and similarly for the old UdpPoller.
Also, with iroh we had annoying issues with passing state from poll_writable to try_send. With a single poll_send that gives you (essentially) &mut self access, this gets much easier.

With all of these changes it's also a lot clearer when it's fine to overwrite wakers in your custom implementation. If &mut self is passed in from your poll function, then you know you can only be called from a single task that owns you at a time, thus you can safely overwrite old wakers (and even do so without any use of mutexes or fancy linked lists or anything like that).

Performance

I've tried testing the performance of this on various systems (my laptop, dedicated bare metal servers, etc.). I wasn't able to determine any differences, but at the same time, I always got very inconsistent results (both on quinn main and this PR).
I'd really appreciate any help from someone who knows how to benchmark quinn effectively. There's some more things I can try, such as fiddling with the constants mentioned in #1126, but other than that I'm out of ideas.
Looking at the code, this should not have a negative performance impact. I think there's technically 1 more Arc clone for initiating connections, but other than that it should result in pretty much the same amount of work/same sequence of syscalls.

@matheus23 matheus23 force-pushed the matheus23/mut-self-upstream branch from 8e11fe1 to 86f4953 Compare June 3, 2025 14:11
@matheus23
Copy link
Contributor Author

When manually setting const MAX_TRANSMIT_DATAGRAMS: usize = 100, I get more consistent benchmarking results.
Here's what the criterion part of cargo bench says:

gso_false_gro_false_recvmmsg_false/throughput
                        Change within noise threshold.
gso_false_gro_false_recvmmsg_true/throughput
                        No change in performance detected.
gso_false_gro_true_recvmmsg_false/throughput
                        Change within noise threshold.
gso_false_gro_true_recvmmsg_true/throughput
                        No change in performance detected.
gso_true_gro_false_recvmmsg_false/throughput
                        Change within noise threshold.
gso_true_gro_false_recvmmsg_true/throughput
                        No change in performance detected.
gso_true_gro_true_recvmmsg_false/throughput
                        Change within noise threshold.
gso_true_gro_true_recvmmsg_true/throughput
                        Change within noise threshold.

(this is a comparison between main and this branch with the max transmit datagrams patch. I've removed lines in between.)

The other benchmark is very similar as well (but much noisier!).

# main:
test large_data_10_streams  ... bench:  13,008,394 ns/iter (+/- 6,523,637) = 806 MB/s
test large_data_1_stream    ... bench:   1,271,029 ns/iter (+/- 5,250,573) = 824 MB/s
test small_data_100_streams ... bench:  11,856,811 ns/iter (+/- 19,145,692)
test small_data_1_stream    ... bench:  23,830,707 ns/iter (+/- 1,572,245)
# matheus23/mut-self-upstream:
test large_data_10_streams  ... bench:  12,480,406 ns/iter (+/- 1,986,101) = 840 MB/s
test large_data_1_stream    ... bench:   1,294,919 ns/iter (+/- 5,450,900) = 809 MB/s
test small_data_100_streams ... bench:  11,770,875 ns/iter (+/- 21,641,366)
test small_data_1_stream    ... bench:  24,368,665 ns/iter (+/- 1,775,232)

@Ralith Ralith added the breaking label Jun 5, 2025
Copy link
Collaborator

@Ralith Ralith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this is a good direction overall, though we may want to delay merging somewhat as it's breaking.

@Ralith
Copy link
Collaborator

Ralith commented Jun 5, 2025

Performance

I see no reason to expect that these changes present any performance hazard, and I'm satisfied with the smoke-testing you've done to check that assumption.

@matheus23 matheus23 force-pushed the matheus23/mut-self-upstream branch 4 times, most recently from 814245c to 8bea58b Compare June 5, 2025 12:36
@matheus23 matheus23 requested a review from Ralith June 5, 2025 12:37
@matheus23 matheus23 force-pushed the matheus23/mut-self-upstream branch from 8bea58b to 51559ee Compare June 9, 2025 08:06
@matheus23
Copy link
Contributor Author

matheus23 commented Jul 11, 2025

Thanks! I think this is a good direction overall, though we may want to delay merging somewhat as it's breaking.

Just to spell it out: I think I've addressed all reviewing concerns, and I think we're waiting only for the right moment to merge these breaking changes?

I realized I might be wrong though, given you're only saying "good direction" and requesting changes, so I'm double-checking now.

@djc
Copy link
Member

djc commented Jul 11, 2025

Just to spell it out: I think I've addressed all reviewing concerns, and I think we're waiting only for the right moment to merge these breaking changes?

That matches my understanding. We want to get a release out before we start merging incompatible changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants