-
Notifications
You must be signed in to change notification settings - Fork 748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce MPTCP #1811
base: unstable
Are you sure you want to change the base?
Introduce MPTCP #1811
Conversation
Multipath TCP (MPTCP) is an extension of the standard TCP protocol that allows a single transport connection to use multiple network interfaces or paths. MPTCP is useful for applications like bandwidth aggregation, failover, and more resilient connections. Linux kernel starts to support MPTCP since v5.6, it's time to support it. The test report shows that MPTCP reduces latency by ~25% in a 1% networking packet drop environment. Proposed-by: Geliang Tang <[email protected]> Tested-by: Gang Yan <[email protected]> Signed-off-by: zhenwei pi <[email protected]> Signed-off-by: zhenwei pi <[email protected]>
This PR could be tested by https://github.com/pizhenwei/valkey/tree/mptcp-with-hiredis |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## unstable #1811 +/- ##
============================================
+ Coverage 70.87% 70.98% +0.11%
============================================
Files 123 123
Lines 65651 65669 +18
============================================
+ Hits 46529 46618 +89
+ Misses 19122 19051 -71
🚀 New features to boost your workflow:
|
Not sure how useful MPTCP is for low-latency data center inner networks. It was built for mobile devices to switch between WiFi and cellular, handling packet loss. I do see the benefit of bandwidth aggregation though (which can be fixed externally, eg. openmptcprouter) |
src/config.c
Outdated
@@ -3276,6 +3276,7 @@ standardConfig static_configs[] = { | |||
createIntConfig("timeout", NULL, MODIFIABLE_CONFIG, 0, INT_MAX, server.maxidletime, 0, INTEGER_CONFIG, NULL, NULL), /* Default client timeout: infinite */ | |||
createIntConfig("replica-announce-port", "slave-announce-port", MODIFIABLE_CONFIG, 0, 65535, server.replica_announce_port, 0, INTEGER_CONFIG, NULL, NULL), | |||
createIntConfig("tcp-backlog", NULL, IMMUTABLE_CONFIG, 0, INT_MAX, server.tcp_backlog, 511, INTEGER_CONFIG, NULL, NULL), /* TCP listen backlog. */ | |||
createIntConfig("mptcp", NULL, IMMUTABLE_CONFIG, 0, INT_MAX, server.mptcp, 0, INTEGER_CONFIG, NULL, NULL), /* Multipath TCP. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it makes sense to have a dedicated config for replication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have the same opinion in the next step.
Because server and client benchmark MPTCP support could be tested easily, we can know the test result in the commit message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the first step - server side supports MPTCP. If this is merged into valkey, then I'll support:
- explicit MPTCP config for replication. This will be helpful for replication data synchronization over a long distance.
- support libvalkey. After valkey uses libvalkey instead of hiredis, the client tools will support MPTCP easily. Use the wrapper from cli_common instead of hiredis #1802 is ready to review, this will help MPTCP to support MPTCP with a few code changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config should be a bool config, createBoolConfig. Values specified as yes/no.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, fix it and append a commit. Thanks!
MPTCP leads additional ~10% latency on Linux-6.11 from test over no packet drop stable network, so it is NOT suitable for any client/replication. If server listens on MPTCP, and the client side uses TCP, the connection will be TCP only. This means the client/replication side has a chance to decide TCP/MPTCP.. Hi @matttbe, is it possible to improve the performance of MPTCP as fast as TCP? |
Hello,
@xbasel: MPTCP can also be useful in data centre for fast recoveries when some links fail, to use the path with the lowest latency, or even to combine multiple paths. There were a few scientific papers on the subject (using MPTCP in data center). I guess with Valkey, there will be a higher focused on the latency aspect. Please note that on the server side, supporting MPTCP is "cheap": when clients don't request to use MPTCP, server applications will receive a "plain" TCP sockets from the kernel when connections are accepted, making the performance impact minimal.
To be able to use multiple paths for the same connection, a few bytes need to be added to the header of each TCP packet. More details here. It means that if only one path is used per connection, MPTCP cannot be as fast as TCP. But that's normal, that's the small cost to pay to leverage multiple network paths, potentially increasing throughput and reliability. Of course, if only one path is used per MPTCP connection, no need to use Multipath TCP ;) Still, we are working on reducing the gap between TCP and MPTCP single path, e.g. this recent modification here. |
Unrelated to this PR, any idea if its possible create subflows on the same interface ? |
Yes it is. This page should explain what is possible with the default path-manager: https://www.mptcp.dev/pm.html Some ideas:
Don't hesitate to share the use-case if something is not supported ;-) |
Some cloud providers and some internet routers enforce per-flow bandwidth limits. |
src/anet.c
Outdated
* it when MPTCP is enabled. | ||
* Ref: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/mptcp/mptcp_connect.c | ||
*/ | ||
static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'mptcp' parameter is redundant; the function should assume MPTCP was requested (the function name suggests this).
or alternatively change the function name (anetTcpSet...) and keep the param
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better:
static int _anetTcpServer(char *err, int port, char *bindaddr, int af, int backlog, int mptcp) {
...
memset(&hints, 0, sizeof(hints));
hints.ai_family = af;
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_PASSIVE; /* No effect if bindaddr != NULL */
hints.ai_protocol = mptcp ? IPPROTO_MPTCP : IPPROTO_IP;
...
getaddrinfo(bindaddr, _port, &hints, &servinfo);
However, there is a lack of getaddrinfo MPTCP support, getaddrinfo
will fail.
As the comment, currently we have to adapt this scenario, then we can use a single function static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp)
to handle this rather than test mptcp flag in a address resolving loop:
for (p = servinfo; p != NULL; p = p->ai_next) {
if (mptcp) {
...
} else {
...
}
if ((s = socket(p->ai_family, p->ai_socktype, rv)) == -1) continue;
if (af == AF_INET6 && anetV6Only(err, s) == ANET_ERR) goto error;
if (anetSetReuseAddr(err, s) == ANET_ERR) goto error;
if (anetListen(err, s, p->ai_addr, p->ai_addrlen, backlog, 0, NULL) == ANET_ERR) s = ANET_ERR;
goto end;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the code but for hints.ai_protocol
, can you not always set 0 or modulo 256.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's the reason or not, from glibc source:
static const struct gaih_typeproto gaih_inet_typeproto[] =
{
{ 0, 0, 0, false, "" },
{ SOCK_STREAM, IPPROTO_TCP, 0, true, "tcp" },
{ SOCK_DGRAM, IPPROTO_UDP, 0, true, "udp" },
#if defined SOCK_DCCP && defined IPPROTO_DCCP
{ SOCK_DCCP, IPPROTO_DCCP, 0, false, "dccp" },
#endif
#ifdef IPPROTO_UDPLITE
{ SOCK_DGRAM, IPPROTO_UDPLITE, 0, false, "udplite" },
#endif
#ifdef IPPROTO_SCTP
{ SOCK_STREAM, IPPROTO_SCTP, 0, false, "sctp" },
{ SOCK_SEQPACKET, IPPROTO_SCTP, 0, false, "sctp" },
#endif
{ SOCK_RAW, 0, GAI_PROTO_PROTOANY|GAI_PROTO_NOSERVICE, true, "raw" },
{ 0, 0, 0, false, "" }
};
there is a lack of MPTCP support...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I send a patch to glibc reviewing list.
Even this patch is right to handle getaddrinfo MPTCP issue, it will be a long time to wait glibc upgrade. So I guess current style (static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp)) is still needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pizhenwei thank you for the patch for glibc!
I don't know the code but for
hints.ai_protocol
, can you not always set 0 or modulo 256.
I just looked at the code, I better understand what you were doing there. So maybe it is just the name function name that is misleading?
protocol = anetTcpGetProtocol(err, p->ai_protocol, mptcp);
if (protocol == ANET_ERR) goto error;
if ((s = socket(p->ai_family, p->ai_socktype, protocol)) == -1) continue;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @matttbe @Dwyane-Yan @zuiderkwast
getaddrinfo MPTCP issue is fixed in glibc, please see commit.
But I think this work around logic is still needed, because Linux distributions may upgrade glibc in several years. maybe use the correct logic instead someday ...
src/anet.c
Outdated
* it when MPTCP is enabled. | ||
* Ref: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/mptcp/mptcp_connect.c | ||
*/ | ||
static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pizhenwei thank you for the patch for glibc!
I don't know the code but for
hints.ai_protocol
, can you not always set 0 or modulo 256.
I just looked at the code, I better understand what you were doing there. So maybe it is just the name function name that is misleading?
protocol = anetTcpGetProtocol(err, p->ai_protocol, mptcp);
if (protocol == ANET_ERR) goto error;
if ((s = socket(p->ai_family, p->ai_socktype, protocol)) == -1) continue;
Signed-off-by: zhenwei pi <[email protected]>
No. LGTM now. |
Hi @PingXie @zuiderkwast Would you please take a look? |
I like this because it can be done without additional libraries. Currently we're busy with the 8.1 release and I will be away for the next week. This one will have to wait to the next release. I have some questions though:
|
We do recommend this, that's what GoLang is doing since v1.24 for example. But then, it is required to change the way the socket creation is handled in this PR: for the moment, there is an error if MPTCP is not supported by the kernel. Instead, if it is not possible to create an MPTCP socket, I suggest to simply retry with a plain TCP socket, then continue like before, e.g. #ifdef IPPROTO_MPTCP
if ((s = socket(p->ai_family, p->ai_socktype, IPPROTO_MPTCP)) == -1)
#endif
{
if ((s = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) continue;
} Note that I think it would still make sense to have an option to easily disable MPTCP support, just in case. Disabling MPTCP globally is not difficult, simply with |
@zuiderkwast : Oops, I forgot about the two other items
No need to change anything on the application side for that, the kernel will do that automatically once it has been configured, see here. There are tools that can automatically configure the MPTCP endpoints for the kernel, e.g. NetworkManager >=1.40. If these tools are not available, this can be easily done manually thanks to the
This should not change anything compared to before. If the If an MPTCP endpoint has been configured to accept connections to a different port, then this port will only be used for additional subflows (paths). Of course, there are ways for the application to get info about the connection and which subflows are being used, see here. |
Hi @matttbe If we change valkey as this style, valkey will listen on MPTCP by default on a higher Linux kernel implicitly, and client compiled by golang v1.24 (or higher) will get a few higher latency. Maybe lots of golang users need additional work for trouble shoot to find the root cause. So I prefer explicit
|
With GoLang 1.24, only the server side is using MPTCP by default, not the client side. That's good to have servers supporting MPTCP by default, so the clients, the ones who usually benefits the most from MPTCP, can use it if they want to. The latency is not supposed to increase with MPTCP, it should even decrease when it is possible to pick a path with a lower latency or less loaded. If the latency increases significantly with MPTCP single path compared to TCP, that's not normal. In this case, please report a new issue on our side. |
Based on the test results, if MPTCP connection is used by default, it may incur some performance loss.
The above test results were obtained from tests conducted in a virtual environment set up using a script.
valkey-server 、valkey-benchmark、valkey-cli is compiled in https://github.com/pizhenwei/valkey/tree/mptcp-with-hiredis This script has parameters: Usage example:
which simulates a scenario where only one network interface exists using MPTCP. By the way, there is a simpler testing method here: using Thanks the community for attention to MPTCP. Hope we can work together to make this happen. |
@Dwyane-Yan Thank you for the tests and sharing the results! Do you have more details about the benchmark tool? It looks like it will create many "small" requests (1KB), but will it create one connection per request, or one (or a few) connections doing many requests? Because you already contributed to MPTCP in the Linux kernel, maybe would like to analyse how the kernel is behaving with TCP and MPTCP? It would be really good to analyse the performances in this use-case -- e.g. with If that's OK, can we move this discussion to a new issue on MPTCP side? |
Signed-off-by: zhenwei pi <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
rv = anetTcpGetProtocol(err, p->ai_protocol, mptcp); | ||
if (rv == ANET_ERR) goto error; | ||
|
||
if ((s = socket(p->ai_family, p->ai_socktype, rv)) == -1) continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pizhenwei if Valkey maintainers are still OK with that, I think it would make sense to create an MPTCP socket by default, if available. The option can still be used to turn it off if really needed. Then, it will be up to the client to decide to use MPTCP or not: that will ease the deployments. If MPTCP is only needed for some specific clients, then they can use it, and the others can continue to use "plain" TCP like before.
So here, anetTcpGetProtocol
can be dropped, and something like this can be done:
rv = anetTcpGetProtocol(err, p->ai_protocol, mptcp); | |
if (rv == ANET_ERR) goto error; | |
if ((s = socket(p->ai_family, p->ai_socktype, rv)) == -1) continue; | |
#ifdef IPPROTO_MPTCP | |
if (!mptcp || (s = socket(p->ai_family, p->ai_socktype, IPPROTO_MPTCP)) == -1) | |
#endif | |
{ | |
if ((s = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) continue; | |
} |
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @matttbe
- explicit mptcp (not strong opinion): I still prefer explicit mptcp config for the server side, because TCP and MPTCP are different transmission protocol theoretically. The maintainers (who deploy valkey-server) of valkey services should know the transmission protocol of their service and necessary operations for trouble shooting.
- default no (not strong opinion): coping the config from 8.0 for the new version, but it listens on different socket protocol, it will make user confused.
anetTcpGetProtocol
is needed (strong opinion): if user specifiesmptcp yes
on a lower kernel or other OS, it should be an error, because the valkey-server could NOT satisfy the user's request.
Let's leave this to the maintainer to decide...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pizhenwei thank you for your reply.
explicit mptcp (not strong opinion): I still prefer explicit mptcp config for the server side, because TCP and MPTCP are different transmission protocol theoretically. The maintainers (who deploy valkey-server) of valkey services should know the transmission protocol of their service and necessary operations for trouble shooting.
MPTCP is an extension to TCP, it is not a radical change. On the wire, TCP is still used, but with extra options in the header. But indeed, sysadmins should be told about the behaviour change, but at the end, they should not worry about that. Please note that when the client doesn't request to use MPTCP, the server application will get a "plain" TCP socket after the accept()
. So no modification on that side.
anetTcpGetProtocol
is needed (strong opinion): if user specifiesmptcp yes
on a lower kernel or other OS, it should be an error, because the valkey-server could NOT satisfy the user's request.
If MPTCP is used by default, the application should not fail if it is not available. Or you need tree choices: auto (fallback to TCP if MPTCP is not available), on (error if not available), off (don't try to use MPTCP).
Let's leave this to the maintainer to decide...
Indeed, my proposal was just a suggestion to ease MPTCP deployments without extra steps on the server side, only on the client side.
@@ -3276,6 +3276,7 @@ standardConfig static_configs[] = { | |||
createIntConfig("timeout", NULL, MODIFIABLE_CONFIG, 0, INT_MAX, server.maxidletime, 0, INTEGER_CONFIG, NULL, NULL), /* Default client timeout: infinite */ | |||
createIntConfig("replica-announce-port", "slave-announce-port", MODIFIABLE_CONFIG, 0, 65535, server.replica_announce_port, 0, INTEGER_CONFIG, NULL, NULL), | |||
createIntConfig("tcp-backlog", NULL, IMMUTABLE_CONFIG, 0, INT_MAX, server.tcp_backlog, 511, INTEGER_CONFIG, NULL, NULL), /* TCP listen backlog. */ | |||
createBoolConfig("mptcp", NULL, IMMUTABLE_CONFIG, server.mptcp, 0, NULL, NULL), /* Multipath TCP. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linked to my previous comment:
createBoolConfig("mptcp", NULL, IMMUTABLE_CONFIG, server.mptcp, 0, NULL, NULL), /* Multipath TCP. */ | |
createBoolConfig("mptcp", NULL, IMMUTABLE_CONFIG, server.mptcp, 1, NULL, NULL), /* Multipath TCP. */ |
# failover, and more resilient connections. | ||
# Note that MPTCP is supported in the official Linux kernel starting with version 5.6. | ||
# | ||
# mptcp yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also linked to my previous comment
# mptcp yes | |
# MPTCP will be used if available. If not, "plain" TCP connections will be used. | |
mptcp yes |
Apply Matthieu's suggestion Signed-off-by: zhenwei pi <[email protected]>
Force push the last commit to fix DCO error. |
Multipath TCP (MPTCP) is an extension of the standard TCP protocol that allows a single transport connection to use multiple network interfaces or paths. MPTCP is useful for applications like bandwidth aggregation, failover, and more resilient connections.
Linux kernel starts to support MPTCP since v5.6, it's time to support it.
The test report shows that MPTCP reduces latency by ~25% in a 1% networking packet drop environment.
Proposed-by: Geliang Tang [email protected]
Tested-by: Gang Yan [email protected]
Signed-off-by: zhenwei pi [email protected]
Signed-off-by: zhenwei pi [email protected]
Cc Linux kernel MPTCP maintainer @matttbe