Rust peer is announcing wrong multiaddrs #131

Closed
2color opened this issue Apr 29, 2024 · 9 comments

Comments

@2color
Collaborator

2color commented Apr 29, 2024

Problem

The Rust bootstrap peer (12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3), which is deployed on a machine with a public IP (/ip4/147.28.186.157), has been announcing many multiaddrs with the same IP but different ports.

For example, sending an identify request to the peer returns the following response:

➜  ~ ipfs id 12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3
{
	"ID": "12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
	"PublicKey": "CAESIGSBdOTiQ/D6caRtIjkMAVWIhxhNTDZMuAbEbAiLgaUi",
	"Addresses": [
		"/ip4/147.28.186.157/udp/1101/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/1598/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/18555/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/19258/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/19731/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/26140/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/29765/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/3007/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/30296/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/42782/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/46865/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/4833/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/4960/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/54845/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/60956/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/61471/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/9090/webrtc-direct/certhash/uEiBbC9bbdvraVWDvcvCEdJAWDymmUqiJQ964FuyEq0hELw/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/9091/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip6/2604:1380:4642:6600::3/udp/48641/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip6/64:ff9b::931c:ba9d/udp/9091/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3"
	],
	"AgentVersion": "rust-libp2p/0.43.0",
	"Protocols": [
		"/ipfs/id/1.0.0",
		"/ipfs/id/push/1.0.0",
		"/ipfs/kad/1.0.0",
		"/libp2p/circuit/relay/0.2.0/hop",
		"/meshsub/1.0.0",
		"/meshsub/1.1.0"
	]
}

This happens even though the node is only listening on 147.28.186.157 port 9091 for QUIC connections.


Potential cause

The rust peer may be announcing/responding to peer routing requests with wrong multiaddrs.

When the rust peer receives an Identify protocol event, it takes the observed address from the event and calls swarm.add_external_address:

} else if let identify::Event::Received {
    peer_id,
    info:
        identify::Info {
            listen_addrs,
            protocols,
            observed_addr,
            ..
        },
} = e
{
    debug!("identify::Event::Received observed_addr: {}", observed_addr);
    swarm.add_external_address(observed_addr);

When looking at the logs, there were cases where observed_addr had a different port, e.g. /ip4/147.28.186.157/udp/5272/quic-v1.

I suspect that the observed port in this example is wrongly 5272 instead of 9091 because of connections that were initiated (outgoing) by the bootstrap rust-peer, where 5272 is in fact the outgoing source port and not the one that is publicly dialable.
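
The suspicion above can be illustrated with plain UDP sockets from the Rust standard library. This is only a sketch of general OS behaviour, assuming the dialing side uses its own socket bound to port 0. It does not show what rust-libp2p's QUIC transport actually does, and the function name `observe_source_port` is made up for this example.

```rust
use std::io;
use std::net::UdpSocket;
use std::time::Duration;

/// Returns (listen_port, source_port_seen_by_remote) for one datagram sent
/// from a separate "outgoing" socket on loopback.
fn observe_source_port() -> io::Result<(u16, u16)> {
    // Stand-in for the socket the node explicitly listens on.
    let node_listener = UdpSocket::bind("127.0.0.1:0")?;
    let listen_port = node_listener.local_addr()?.port();

    // Stand-in for a remote peer that reports the source address it sees.
    let remote = UdpSocket::bind("127.0.0.1:0")?;
    remote.set_read_timeout(Some(Duration::from_secs(2)))?;

    // Assumption: the node dials out from a different socket, so the OS
    // assigns it an ephemeral port of its own.
    let node_dialer = UdpSocket::bind("127.0.0.1:0")?;
    node_dialer.send_to(b"hello", remote.local_addr()?)?;

    let mut buf = [0u8; 16];
    let (_len, source) = remote.recv_from(&mut buf)?;
    Ok((listen_port, source.port()))
}

fn main() -> io::Result<()> {
    let (listen_port, observed_port) = observe_source_port()?;
    println!("listening on {}, remote observed {}", listen_port, observed_port);
    Ok(())
}
```

If the transport instead dials from the listening socket, the remote would observe the listening port and this explanation would not apply.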

Open questions

  • Is this correct usage of the swarm.add_external_address call? According to the code docs:

/// Add a confirmed external address for the local node.
///
/// This function should only be called with addresses that are guaranteed to be reachable.
/// The address is broadcast to all [NetworkBehaviour]s via [FromSwarm::ExternalAddrConfirmed].

  • How does libp2p prevent gossiping outgoing ports that a peer/node isn't explicitly listening on? (I suppose for relayed connections, this is desired, but it seems undesired in this case)
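
One possible answer to the second question, for a node with a dedicated public IP, is to accept an observed address only when its port matches a port the node is listening on. The sketch below works on textual multiaddrs with the standard library only. The helper names `transport_port` and `is_plausible_external_addr` are hypothetical and are not part of rust-libp2p. The thread does not say this is what the deployed fix does.

```rust
/// Extracts the port following the first "udp" or "tcp" component of a
/// textual multiaddr, e.g. "/ip4/147.28.186.157/udp/9091/quic-v1" -> 9091.
fn transport_port(multiaddr: &str) -> Option<u16> {
    let mut parts = multiaddr.split('/');
    while let Some(part) = parts.next() {
        if part == "udp" || part == "tcp" {
            return parts.next()?.parse().ok();
        }
    }
    None
}

/// Accepts an observed address only if its port is one we listen on.
fn is_plausible_external_addr(observed: &str, listen_ports: &[u16]) -> bool {
    transport_port(observed).map_or(false, |port| listen_ports.contains(&port))
}

fn main() {
    // Ports taken from the issue: 9090 (webrtc-direct) and 9091 (quic-v1).
    let listen_ports = [9090, 9091];
    let observed = [
        "/ip4/147.28.186.157/udp/9091/quic-v1",
        "/ip4/147.28.186.157/udp/5272/quic-v1",
    ];
    for addr in observed {
        println!("{} -> {}", addr, is_plausible_external_addr(addr, &listen_ports));
    }
}
```

A filter like this would reject valid addresses for a node behind a NAT that maps the listening port to a different external port, so it only suits setups like the one described here.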
@2color
Collaborator Author

2color commented Apr 29, 2024

I've deployed this branch of the rust-peer and so far it seems to have solved the problem:

➜  ~ ipfs id 12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3
{
	"ID": "12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
	"PublicKey": "CAESIGSBdOTiQ/D6caRtIjkMAVWIhxhNTDZMuAbEbAiLgaUi",
	"Addresses": [
		"/ip4/147.28.186.157/udp/9090/webrtc-direct/certhash/uEiBbC9bbdvraVWDvcvCEdJAWDymmUqiJQ964FuyEq0hELw/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3",
		"/ip4/147.28.186.157/udp/9091/quic-v1/p2p/12D3KooWGahRw3ZnM4gAyd9FK75v4Bp5keFYTvkcAwhpEm28wbV3"
	],
	"AgentVersion": "rust-libp2p/0.43.0",
	"Protocols": [
		"/ipfs/id/1.0.0",
		"/ipfs/id/push/1.0.0",
		"/libp2p/circuit/relay/0.2.0/hop",
		"/meshsub/1.0.0",
		"/meshsub/1.1.0"
	]
}

The same is true from the perspective of the delegated routing endpoint.

@AgeManning

This looks to me like you are running this application over the internet.
I suspect you have not set up correct port forwards when running this application and what is likely happening is that your router is NAT'ing your outbound connections.

It may establish a new port per connection and so the external peer is witnessing this new port and reporting it back to you.

I would try setting up a port forward on your router to avoid this kind of NATing and see if that helps.

@MarcoPolo

I also originally thought it was a NAT issue, but there may be something else going on.

When I tried this briefly, most peers reported my correct port and only a few did not. I then tried to identify those peers directly with vole (go-libp2p), thinking they might be giving weird/wrong responses, but they return the correct response there. I'll try to dig a little deeper.

The core issue of course is that we are seeing weird/wrong observed addresses after Identifying. In general it should be safe to only announce our observed addr after we see the same addr a handful of times (in go-libp2p we set this to 4).

@2color
Collaborator Author

2color commented Apr 30, 2024

> This looks to me like you are running this application over the internet.

Yes

> I suspect you have not set up correct port forwards when running this application and what is likely happening is that your router is NAT'ing your outbound connections.

I have the Rust peer deployed to a VM with a dedicated public IPv4. No NAT.

> The core issue of course is that we are seeing weird/wrong observed addresses after Identifying. In general it should be safe to only announce our observed addr after we see the same addr a handful of times (in go-libp2p we set this to 4).

Right, is that generally something that is abstracted by libp2p or done in user space?


So what's the root problem?

Other peers observe the Rust peer at the correct IP, but with a port that differs from the ones it is explicitly listening on (9091 and 9090).

I'm not intimately familiar with how connections are initiated in rust-libp2p and the underlying OS network stack, but it seems reasonable that outgoing connections from the Rust peer get a dynamically allocated source port that is different from the one the Rust peer is explicitly listening on.

This raises the question, if only for my own curiosity, of how this problem is generally avoided in the identify protocol. Is the convention to follow a certain "threshold consensus" of 4 (or more) peers reporting the same observed address to verify your observed address?

@MarcoPolo

@2color can you try an experiment for me?

Run the rust-peer and save the output to a file (e.g. cargo run 2>/tmp/rust-peer.log ). Then run this over the output (forgive my hacky one-liner):

cat /tmp/rust-peer.log | ag -o "observed_addr(.*)quic-v1" | ag -o "/ip4/.*quic-v1"  | awk -F'/' '{print $5}' | sort | uniq -c | sort -r    

Here's my output: https://gist.github.com/MarcoPolo/1eb35eddd9bda917cd052fc7bb7b108e

I'm running this on a forwarded port on a home server. There should be no NAT involved.

This is also interesting:

cat /tmp/rust-peer.log | ag -o "observed_addr(.*)quic-v1" | ag -o "/ip4/.*quic-v1"  | awk -F'/' '{print $3}' | sort | uniq -c | sort -r
64367 23.xx.xx.25 # my ip
  35 10.51.2.1
  17 10.0.1.3
  12 10.0.2.100
  11 45.148.17.39
   9 192.168.1.1
   7 10.129.2.1
   6 10.244.0.63
   6 10.0.203.167
   5 10.203.176.1
   3 84.54.179.206
   1 192.168.20.1

This makes me think that what we might be seeing is the NAT on our remote peer's end.
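
One way to act on that reading is to separate observed IPs in private ranges from public ones, since a private address reported by a remote peer cannot be this node's public address. The sketch below uses `Ipv4Addr::is_private` from the Rust standard library on a few of the IPs from the output above. The function name `is_private_ip` is hypothetical.

```rust
use std::net::Ipv4Addr;

/// Returns true if `ip` parses as an IPv4 address in one of the private
/// ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
/// Unparseable input is treated as "not private".
fn is_private_ip(ip: &str) -> bool {
    ip.parse::<Ipv4Addr>().map_or(false, |addr| addr.is_private())
}

fn main() {
    // A sample of the observed IPs listed in the output above.
    let observed = ["10.51.2.1", "10.0.1.3", "45.148.17.39", "192.168.1.1", "84.54.179.206"];
    for ip in observed {
        let kind = if is_private_ip(ip) { "private" } else { "public" };
        println!("{} is {}", ip, kind);
    }
}
```

This only explains the private-range entries. The public IPs that differ from the node's own address would need a separate explanation.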

@2color
Collaborator Author

2color commented Apr 30, 2024

@MarcoPolo Here's the output for both commands:

https://gist.github.com/2color/f9cd261d13d994ebd5abb918f972e5fe

@MarcoPolo

I did a similar experiment on the Go peer in this repo (using this patch) and here are the addrs I see reported back:

cat log.txt | ag -o '=/ip4/.*' | sort | uniq -c | sort -r                                                                   
1463 =/ip4/23.XXXXXXXX5/udp/9095/quic-v1
   4 =/ip4/23.XXXXXXXX5/udp/9095/quic-v1/p2p/12D3...
   3 =/ip4/70.XXXXXXXX0/udp/11181/quic-v1/p2p/12D3.../p2p-circuit
   1 =/ip4/23.XXXXXXXX5/udp/40754/quic-v1
   1 =/ip4/23.XXXXXXXX5/udp/1946/quic-v1

So the go peer does see random ports reported back as well.

Another interesting thing is that the rust-peer seems to go through a lot more connections in the same period of time. Here the Go peer connected to about 1500 peers. The Rust one connected to about 15k peers in the same time! That's probably a bug?

@2color do you also see the rust one connecting to ~10x the number of peers in the same period of time?

Unrelated to the many-conns bug, I think this experiment shows that the weird multiaddrs we see in the observed addrs are coming from remote peers and are (probably) not due to rust-libp2p, since we see the same thing in go-libp2p.

@2color
Collaborator Author

2color commented May 2, 2024

> I did a similar experiment on the Go peer in this repo (using this patch) and here are the addrs I see reported back:
>
> cat log.txt | ag -o '=/ip4/.*' | sort | uniq -c | sort -r
> 1463 =/ip4/23.XXXXXXXX5/udp/9095/quic-v1
>    4 =/ip4/23.XXXXXXXX5/udp/9095/quic-v1/p2p/12D3...
>    3 =/ip4/70.XXXXXXXX0/udp/11181/quic-v1/p2p/12D3.../p2p-circuit
>    1 =/ip4/23.XXXXXXXX5/udp/40754/quic-v1
>    1 =/ip4/23.XXXXXXXX5/udp/1946/quic-v1
>
> So the go peer does see random ports reported back as well.

Interesting. What's your explanation for these random ports then?

> @2color do you also see the rust one connecting to ~10x the number of peers in the same period of time?

Off the cuff, yes, but I haven't tested it enough to say.

> Unrelated to the many-conns bug, I think this experiment shows that the weird multiaddrs we see in the observed addrs are coming from remote peers and are (probably) not due to rust-libp2p, since we see the same thing in go-libp2p.

Yeah. I wonder why the swarm.add_external_address(observed_addr) call was added and what the best practices around its use are. Presumably it's useful when the Rust peer is behind a NAT and you don't know your public IP.

@MarcoPolo

> Interesting. What's your explanation for these random ports then?

I think it's a NAT on their end. Their router maps the port differently, maybe for "security".

> Yeah. I wonder why the swarm.add_external_address(observed_addr) call was added and what the best practices around its use are. Presumably it's useful when the Rust peer is behind a NAT and you don't know your public IP.

It's to learn your public IP addresses if you don't know them. On EC2, for example, your machine doesn't know its public address.

The correct logic is complicated and subtle. But a first-order approximation is probably something like:

  1. Limit the maximum number of external addresses you add.
  2. Limit the duration an observed address is valid.
  3. Only add an observed address if there are multiple observations.
  4. Limit the observation count to 1 per remote ip range (subtleties here in what the range should be).
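
The four rules above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the go-libp2p or rust-libp2p implementation: the `ObservedAddrs` type and its methods are hypothetical, the "remote IP range" is assumed to be an IPv4 /24, and the observer IPs in `main` are invented.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;
use std::time::{Duration, Instant};

/// Hypothetical tracker for addresses reported by remote peers via identify.
struct ObservedAddrs {
    max_external: usize,     // rule 1: cap on confirmed addresses
    ttl: Duration,           // rule 2: how long one observation stays valid
    min_observations: usize, // rule 3: distinct observers required
    // observed address -> (observer /24 prefix -> time of latest report)
    seen: HashMap<String, HashMap<[u8; 3], Instant>>,
}

impl ObservedAddrs {
    fn new(max_external: usize, ttl: Duration, min_observations: usize) -> Self {
        Self { max_external, ttl, min_observations, seen: HashMap::new() }
    }

    /// Records that `observer` reported `addr` as our address at time `now`.
    fn record(&mut self, addr: &str, observer: Ipv4Addr, now: Instant) {
        let o = observer.octets();
        // Rule 4 (assumption: /24): observers sharing a prefix count once.
        let range = [o[0], o[1], o[2]];
        self.seen.entry(addr.to_string()).or_default().insert(range, now);
    }

    /// Returns the addresses considered confirmed at `now`, most observed first.
    fn confirmed(&mut self, now: Instant) -> Vec<String> {
        let (ttl, min, max) = (self.ttl, self.min_observations, self.max_external);
        // Rule 2: forget observations older than the TTL.
        for observers in self.seen.values_mut() {
            observers.retain(|_, at| now.saturating_duration_since(*at) <= ttl);
        }
        self.seen.retain(|_, observers| !observers.is_empty());

        // Rule 3: keep only addresses with enough distinct observer ranges.
        let mut candidates: Vec<(&String, usize)> = self
            .seen
            .iter()
            .map(|(addr, observers)| (addr, observers.len()))
            .filter(|(_, count)| *count >= min)
            .collect();
        candidates.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
        // Rule 1: never return more than `max` addresses.
        candidates.into_iter().take(max).map(|(addr, _)| addr.clone()).collect()
    }
}

fn main() {
    let start = Instant::now();
    let mut tracker = ObservedAddrs::new(2, Duration::from_secs(3600), 4);
    let listening = "/ip4/147.28.186.157/udp/9091/quic-v1";
    let stray = "/ip4/147.28.186.157/udp/5272/quic-v1";
    // Four observers in different /24 ranges report the listening address.
    for i in 0..4u8 {
        tracker.record(listening, Ipv4Addr::new(198, 18, i, 1), start);
    }
    // A single observer reports the stray port.
    tracker.record(stray, Ipv4Addr::new(198, 18, 0, 7), start);
    println!("{:?}", tracker.confirmed(start));
}
```

In a real node, only the addresses returned by `confirmed` would be passed on to swarm.add_external_address. The thresholds used here (2 addresses, one hour, 4 observers) are placeholders, with 4 borrowed from the go-libp2p figure mentioned earlier in the thread.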

2color closed this as completed in bcce1fb on May 7, 2024