Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce MPTCP #1811

Open
wants to merge 5 commits into
base: unstable
Choose a base branch
from
Open

Introduce MPTCP #1811

wants to merge 5 commits into from

Conversation

pizhenwei
Copy link
Contributor

Multipath TCP (MPTCP) is an extension of the standard TCP protocol that allows a single transport connection to use multiple network interfaces or paths. MPTCP is useful for applications like bandwidth aggregation, failover, and more resilient connections.

Linux kernel starts to support MPTCP since v5.6, it's time to support it.

The test report shows that MPTCP reduces latency by ~25% in a 1% networking packet drop environment.

Proposed-by: Geliang Tang [email protected]
Tested-by: Gang Yan [email protected]
Signed-off-by: zhenwei pi [email protected]
Signed-off-by: zhenwei pi [email protected]

Cc Linux kernel MPTCP maintainer @matttbe

Multipath TCP (MPTCP) is an extension of the standard TCP protocol that allows
a single transport connection to use multiple network interfaces or paths.
MPTCP is useful for applications like bandwidth aggregation, failover, and more
resilient connections.

Linux kernel starts to support MPTCP since v5.6, it's time to support it.

The test report shows that MPTCP reduces latency by ~25% in a 1% networking
packet drop environment.

Proposed-by: Geliang Tang <[email protected]>
Tested-by: Gang Yan <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
@pizhenwei
Copy link
Contributor Author

This PR could be tested by https://github.com/pizhenwei/valkey/tree/mptcp-with-hiredis

Copy link

codecov bot commented Mar 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.98%. Comparing base (0cc0bf7) to head (079da6d).
Report is 11 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1811      +/-   ##
============================================
+ Coverage     70.87%   70.98%   +0.11%     
============================================
  Files           123      123              
  Lines         65651    65669      +18     
============================================
+ Hits          46529    46618      +89     
+ Misses        19122    19051      -71     
Files with missing lines Coverage Δ
src/anet.c 72.62% <100.00%> (+0.31%) ⬆️
src/config.c 78.39% <ø> (ø)
src/server.c 87.57% <100.00%> (+0.02%) ⬆️
src/server.h 100.00% <ø> (ø)

... and 16 files with indirect coverage changes

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xbasel
Copy link
Member

xbasel commented Mar 3, 2025

Not sure how useful MPTCP is for low-latency data center inner networks. It was built for mobile devices to switch between WiFi and cellular, handling packet loss.
How does MPTCP perform when there is no packet drop? Does it introduce any overhead or impact latency in such stable network conditions?

I do see the benefit of bandwidth aggregation though (which can be fixed externally, eg. openmptcprouter)

src/config.c Outdated
@@ -3276,6 +3276,7 @@ standardConfig static_configs[] = {
createIntConfig("timeout", NULL, MODIFIABLE_CONFIG, 0, INT_MAX, server.maxidletime, 0, INTEGER_CONFIG, NULL, NULL), /* Default client timeout: infinite */
createIntConfig("replica-announce-port", "slave-announce-port", MODIFIABLE_CONFIG, 0, 65535, server.replica_announce_port, 0, INTEGER_CONFIG, NULL, NULL),
createIntConfig("tcp-backlog", NULL, IMMUTABLE_CONFIG, 0, INT_MAX, server.tcp_backlog, 511, INTEGER_CONFIG, NULL, NULL), /* TCP listen backlog. */
createIntConfig("mptcp", NULL, IMMUTABLE_CONFIG, 0, INT_MAX, server.mptcp, 0, INTEGER_CONFIG, NULL, NULL), /* Multipath TCP. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to have a dedicated config for replication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have the same opinion in the next step.

Because server and client benchmark MPTCP support could be tested easily, we can know the test result in the commit message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xbasel

This is the first step - server side supports MPTCP. If this is merged into valkey, then I'll support:

  • explicit MPTCP config for replication. This will be helpful for replication data synchronization over a long distance.
  • support libvalkey. After valkey uses libvalkey instead of hiredis, the client tools will support MPTCP easily. Use the wrapper from cli_common instead of hiredis #1802 is ready to review, this will help MPTCP to support MPTCP with a few code changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The config should be a bool config, createBoolConfig. Values specified as yes/no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fix it and append a commit. Thanks!

@pizhenwei
Copy link
Contributor Author

Not sure how useful MPTCP is for low-latency data center inner networks. It was built for mobile devices to switch between WiFi and cellular, handling packet loss. How does MPTCP perform when there is no packet drop? Does it introduce any overhead or impact latency in such stable network conditions?

I do see the benefit of bandwidth aggregation though (which can be fixed externally, eg. openmptcprouter)

MPTCP leads additional ~10% latency on Linux-6.11 from test over no packet drop stable network, so it is NOT suitable for any client/replication. If server listens on MPTCP, and the client side uses TCP, the connection will be TCP only. This means the client/replication side has a chance to decide TCP/MPTCP..

Hi @matttbe, is it possible to improve the performance of MPTCP as fast as TCP?

@matttbe
Copy link

matttbe commented Mar 3, 2025

Hello,

Not sure how useful MPTCP is for low-latency data center inner networks. It was built for mobile devices to switch between WiFi and cellular, handling packet loss. How does MPTCP perform when there is no packet drop? Does it introduce any overhead or impact latency in such stable network conditions?

@xbasel: MPTCP can also be useful in data centre for fast recoveries when some links fail, to use the path with the lowest latency, or even to combine multiple paths. There were a few scientific papers on the subject (using MPTCP in data center). I guess with Valkey, there will be a higher focused on the latency aspect.

Please note that on the server side, supporting MPTCP is "cheap": when clients don't request to use MPTCP, server applications will receive a "plain" TCP sockets from the kernel when connections are accepted, making the performance impact minimal.

is it possible to improve the performance of MPTCP as fast as TCP?

To be able to use multiple paths for the same connection, a few bytes need to be added to the header of each TCP packet. More details here. It means that if only one path is used per connection, MPTCP cannot be as fast as TCP. But that's normal, that's the small cost to pay to leverage multiple network paths, potentially increasing throughput and reliability. Of course, if only one path is used per MPTCP connection, no need to use Multipath TCP ;)

Still, we are working on reducing the gap between TCP and MPTCP single path, e.g. this recent modification here.

@xbasel
Copy link
Member

xbasel commented Mar 3, 2025

Unrelated to this PR, any idea if its possible create subflows on the same interface ?

@matttbe
Copy link

matttbe commented Mar 3, 2025

Unrelated to this PR, any idea if its possible create subflows on the same interface ?

Yes it is. This page should explain what is possible with the default path-manager: https://www.mptcp.dev/pm.html

Some ideas:

  • the server can announce multiple addresses v4/v6, even with a port. So it can announce the same IP with different ports
  • the client can use multiple addresses, v4/v6. They can also be assigned to the same interface or not. The fullmesh mode can help to create even more paths than "supposedly" needed.
  • technically, the client could establish multiples subflows from the same source address and to the same destination using different source ports, but that seems very particular, and only the userspace PM will let you do that.

Don't hesitate to share the use-case if something is not supported ;-)

@xbasel
Copy link
Member

xbasel commented Mar 3, 2025

Some cloud providers and some internet routers enforce per-flow bandwidth limits.
MPTCP can "fix" this issue... Basically to use MPTCP as "download accelerator".

@pizhenwei pizhenwei requested review from zuiderkwast and PingXie March 4, 2025 00:31
src/anet.c Outdated
* it when MPTCP is enabled.
* Ref: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/mptcp/mptcp_connect.c
*/
static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'mptcp' parameter is redundant; the function should assume MPTCP was requested (the function name suggests this).

or alternatively change the function name (anetTcpSet...) and keep the param

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be better:

static int _anetTcpServer(char *err, int port, char *bindaddr, int af, int backlog, int mptcp) {
    ...
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = af; 
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE; /* No effect if bindaddr != NULL */
    hints.ai_protocol = mptcp ? IPPROTO_MPTCP : IPPROTO_IP;
    ...
    getaddrinfo(bindaddr, _port, &hints, &servinfo);

However, there is a lack of getaddrinfo MPTCP support, getaddrinfo will fail.

As the comment, currently we have to adapt this scenario, then we can use a single function static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp) to handle this rather than test mptcp flag in a address resolving loop:

    for (p = servinfo; p != NULL; p = p->ai_next) {
        if (mptcp) {
              ...
        } else {
              ...
        }

        if ((s = socket(p->ai_family, p->ai_socktype, rv)) == -1) continue;

        if (af == AF_INET6 && anetV6Only(err, s) == ANET_ERR) goto error;
        if (anetSetReuseAddr(err, s) == ANET_ERR) goto error;
        if (anetListen(err, s, p->ai_addr, p->ai_addrlen, backlog, 0, NULL) == ANET_ERR) s = ANET_ERR;
        goto end;
    }

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the code but for hints.ai_protocol, can you not always set 0 or modulo 256.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's the reason or not, from glibc source:

static const struct gaih_typeproto gaih_inet_typeproto[] =
{
  { 0, 0, 0, false, "" },
  { SOCK_STREAM, IPPROTO_TCP, 0, true, "tcp" },
  { SOCK_DGRAM, IPPROTO_UDP, 0, true, "udp" },
#if defined SOCK_DCCP && defined IPPROTO_DCCP
  { SOCK_DCCP, IPPROTO_DCCP, 0, false, "dccp" },
#endif
#ifdef IPPROTO_UDPLITE
  { SOCK_DGRAM, IPPROTO_UDPLITE, 0, false, "udplite" },
#endif
#ifdef IPPROTO_SCTP
  { SOCK_STREAM, IPPROTO_SCTP, 0, false, "sctp" },
  { SOCK_SEQPACKET, IPPROTO_SCTP, 0, false, "sctp" },
#endif
  { SOCK_RAW, 0, GAI_PROTO_PROTOANY|GAI_PROTO_NOSERVICE, true, "raw" },
  { 0, 0, 0, false, "" }
};

there is a lack of MPTCP support...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I send a patch to glibc reviewing list.
Even this patch is right to handle getaddrinfo MPTCP issue, it will be a long time to wait glibc upgrade. So I guess current style (static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp)) is still needed.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pizhenwei thank you for the patch for glibc!

I don't know the code but for hints.ai_protocol, can you not always set 0 or modulo 256.

I just looked at the code, I better understand what you were doing there. So maybe it is just the name function name that is misleading?

protocol = anetTcpGetProtocol(err, p->ai_protocol, mptcp);
if (protocol == ANET_ERR) goto error;
if ((s = socket(p->ai_family, p->ai_socktype, protocol)) == -1) continue;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @matttbe @Dwyane-Yan @zuiderkwast

getaddrinfo MPTCP issue is fixed in glibc, please see commit.
But I think this work around logic is still needed, because Linux distributions may upgrade glibc in several years. maybe use the correct logic instead someday ...

src/anet.c Outdated
* it when MPTCP is enabled.
* Ref: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/mptcp/mptcp_connect.c
*/
static int anetTcpSetMptcp(char *err, int ai_protocol, int mptcp) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pizhenwei thank you for the patch for glibc!

I don't know the code but for hints.ai_protocol, can you not always set 0 or modulo 256.

I just looked at the code, I better understand what you were doing there. So maybe it is just the name function name that is misleading?

protocol = anetTcpGetProtocol(err, p->ai_protocol, mptcp);
if (protocol == ANET_ERR) goto error;
if ((s = socket(p->ai_family, p->ai_socktype, protocol)) == -1) continue;

@pizhenwei
Copy link
Contributor Author

After renaming anetTcpSetMptcp to anetTcpGetProtocol, this function gets clearer.

Thanks to @matttbe for this suggestion!

Hi @xbasel
Do you have any suggestion about this change?

@xbasel
Copy link
Member

xbasel commented Mar 4, 2025

Hi @xbasel Do you have any suggestion about this change?

No. LGTM now.

@pizhenwei
Copy link
Contributor Author

Hi @PingXie @zuiderkwast
This change is reviewed by @xbasel and Linux kernel MPTCP maintainer @matttbe , also tested by another Linux kernel MPTCP maintainer @geliangtang 's team.

Would you please take a look?

@zuiderkwast
Copy link
Contributor

I like this because it can be done without additional libraries.

Currently we're busy with the 8.1 release and I will be away for the next week. This one will have to wait to the next release. I have some questions though:

  1. Is there any harm in enabling MPTCP by default? I.e. can we skip the mptcp on/off config and just allow it to be used on supported platforms?

    As @matttbe mentioned, it falls back to plain TCP when the client doesn't explicitly request MPTCP.

    Please note that on the server side, supporting MPTCP is "cheap": when clients don't request to use MPTCP, server applications will receive a "plain" TCP sockets from the kernel when connections are accepted, making the performance impact minimal.

  2. I'm wondering how clients and servers can announce and find the peer's additional IPs and/or ports. Do we need to add more configs for this?

  3. If we need to expose multiple IPs and ports, how can we report this in commands like CLUSTER SLOTS?

@matttbe
Copy link

matttbe commented Mar 5, 2025

  1. Is there any harm in enabling MPTCP by default? I.e. can we skip the mptcp on/off config and just allow it to be used on supported platforms?
    As @matttbe mentioned, it falls back to plain TCP when the client doesn't explicitly request MPTCP.

We do recommend this, that's what GoLang is doing since v1.24 for example. But then, it is required to change the way the socket creation is handled in this PR: for the moment, there is an error if MPTCP is not supported by the kernel. Instead, if it is not possible to create an MPTCP socket, I suggest to simply retry with a plain TCP socket, then continue like before, e.g.

#ifdef IPPROTO_MPTCP
    if ((s = socket(p->ai_family, p->ai_socktype, IPPROTO_MPTCP)) == -1)
#endif
    {
        if ((s = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) continue;
    }

Note that I think it would still make sense to have an option to easily disable MPTCP support, just in case. Disabling MPTCP globally is not difficult, simply with sysctl net.mptcp.enabled=0, but it might require different permissions.

@matttbe
Copy link

matttbe commented Mar 5, 2025

@zuiderkwast : Oops, I forgot about the two other items

  1. I'm wondering how clients and servers can announce and find the peer's additional IPs and/or ports. Do we need to add more configs for this?

No need to change anything on the application side for that, the kernel will do that automatically once it has been configured, see here.

There are tools that can automatically configure the MPTCP endpoints for the kernel, e.g. NetworkManager >=1.40. If these tools are not available, this can be easily done manually thanks to the ip mptcp endpoint add <IP address> dev <interface> <type> command.

  1. If we need to expose multiple IPs and ports, how can we report this in commands like CLUSTER SLOTS?

This should not change anything compared to before. If the bind() commands restricts to a specific IP, the restriction will still be the same. If there is no such restrictions, it means the service can be reached from different IPs, like before.

If an MPTCP endpoint has been configured to accept connections to a different port, then this port will only be used for additional subflows (paths).

Of course, there are ways for the application to get info about the connection and which subflows are being used, see here.

@pizhenwei
Copy link
Contributor Author

  1. Is there any harm in enabling MPTCP by default? I.e. can we skip the mptcp on/off config and just allow it to be used on supported platforms?
    As @matttbe mentioned, it falls back to plain TCP when the client doesn't explicitly request MPTCP.

We do recommend this, that's what GoLang is doing since v1.24 for example. But then, it is required to change the way the socket creation is handled in this PR: for the moment, there is an error if MPTCP is not supported by the kernel. Instead, if it is not possible to create an MPTCP socket, I suggest to simply retry with a plain TCP socket, then continue like before, e.g.

#ifdef IPPROTO_MPTCP
    if ((s = socket(p->ai_family, p->ai_socktype, IPPROTO_MPTCP)) == -1)
#endif
    {
        if ((s = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) continue;
    }

Note that I think it would still make sense to have an option to easily disable MPTCP support, just in case. Disabling MPTCP globally is not difficult, simply with sysctl net.mptcp.enabled=0, but it might require different permissions.

Hi @matttbe

If we change valkey as this style, valkey will listen on MPTCP by default on a higher Linux kernel implicitly, and client compiled by golang v1.24 (or higher) will get a few higher latency. Maybe lots of golang users need additional work for trouble shoot to find the root cause. So I prefer explicit mptcp config for the server side, the client side also uses explicit option. Valkey clients usually access server in a good networking environment, however replica may access it across data center for data backup, enabling/disabling global MPTCP kernel config is not the key idea of this PR.

+------+                     +-------+ 
|valkey| --MPTCP across DC-- |replica| 
+------+                     +-------+
   |
  TCP
   |
 client

@matttbe
Copy link

matttbe commented Mar 5, 2025

  1. Is there any harm in enabling MPTCP by default? I.e. can we skip the mptcp on/off config and just allow it to be used on supported platforms?
    As @matttbe mentioned, it falls back to plain TCP when the client doesn't explicitly request MPTCP.

We do recommend this, that's what GoLang is doing since v1.24 for example. But then, it is required to change the way the socket creation is handled in this PR: for the moment, there is an error if MPTCP is not supported by the kernel. Instead, if it is not possible to create an MPTCP socket, I suggest to simply retry with a plain TCP socket, then continue like before, e.g.

#ifdef IPPROTO_MPTCP
    if ((s = socket(p->ai_family, p->ai_socktype, IPPROTO_MPTCP)) == -1)
#endif
    {
        if ((s = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) continue;
    }

Note that I think it would still make sense to have an option to easily disable MPTCP support, just in case. Disabling MPTCP globally is not difficult, simply with sysctl net.mptcp.enabled=0, but it might require different permissions.

Hi @matttbe

If we change valkey as this style, valkey will listen on MPTCP by default on a higher Linux kernel implicitly, and client compiled by golang v1.24 (or higher) will get a few higher latency. Maybe lots of golang users need additional work for trouble shoot to find the root cause. So I prefer explicit mptcp config for the server side, the client side also uses explicit option. Valkey clients usually access server in a good networking environment, however replica may access it across data center for data backup, enabling/disabling global MPTCP kernel config is not the key idea of this PR.

With GoLang 1.24, only the server side is using MPTCP by default, not the client side. That's good to have servers supporting MPTCP by default, so the clients, the ones who usually benefits the most from MPTCP, can use it if they want to.

The latency is not supposed to increase with MPTCP, it should even decrease when it is possible to pick a path with a lower latency or less loaded. If the latency increases significantly with MPTCP single path compared to TCP, that's not normal. In this case, please report a new issue on our side.

@Dwyane-Yan
Copy link

Based on the test results, if MPTCP connection is used by default, it may incur some performance loss.
Below are some simple test results using valkey-benchmark:

                Origin                  MPTCP
             rps  |  avg_msec        rps  |  avg_msec
ping      132363.6|  1.494        110492.4|  1.793
set       11013.0 |  18.152       10747.1 |  18.592
get       11371.9 |  17.568       11090.4 |  18.021

The above test results were obtained from tests conducted in a virtual environment set up using a script.
Link: https://github.com/Dwyane-Yan/mptcp_net-next/blob/export/tools/testing/selftests/net/mptcp/mptcp_valkey.sh.
The configuration of this script is a little complex:

        1、Requires the Linux tools/testing/selftests environment.
        2、cd to .tools/testing/selftests/net/mptcp
        3、make
        4、cp [valkey-server/valkey-benchmark/valkey-cli(with mptcp)] ./
        5、sudo ./mptcp_valkey.sh

valkey-server 、valkey-benchmark、valkey-cli is compiled in https://github.com/pizhenwei/valkey/tree/mptcp-with-hiredis

This script has parameters:
-l represents a lossy network environment, and -m represents MPTCP subflows.

Usage example:
sudo ./mptcp-valkey.sh -m 2 # Represents an environment with 2 network interfaces.
The above test results were obtained using:

sudo ./mptcp-valkey.sh
sudo ./mptcp-valkey.sh -m 1,

which simulates a scenario where only one network interface exists using MPTCP.

By the way, there is a simpler testing method here: using 'mptcpize run' for testing.
The mptcpize is a tool that can force transfer the TCP socket into MPTCP.
This eliminates the need to set up additional environments. You can easily obtain test results by simply
running 'mptcpize run ./valkey-server' and 'mptcpize run ./valkey-benchmark -h XXX -d 1024 -c 200 -n 10000000 -t ping'.
Similarly, based on this testing in a real-world environment, the results also prove that there is a slight
performance decrease when using MPTCP with only one network interface.

Thanks the community for attention to MPTCP. Hope we can work together to make this happen.

@matttbe
Copy link

matttbe commented Mar 6, 2025

Similarly, based on this testing in a real-world environment, the results also prove that there is a slight performance decrease when using MPTCP with only one network interface.

@Dwyane-Yan Thank you for the tests and sharing the results! Do you have more details about the benchmark tool? It looks like it will create many "small" requests (1KB), but will it create one connection per request, or one (or a few) connections doing many requests?

Because you already contributed to MPTCP in the Linux kernel, maybe would like to analyse how the kernel is behaving with TCP and MPTCP? It would be really good to analyse the performances in this use-case -- e.g. with perf, maybe with some flamegraphes? -- and identify where are the major differences between TCP and MPTCP single path.

If that's OK, can we move this discussion to a new issue on MPTCP side?

Comment on lines +613 to +616
rv = anetTcpGetProtocol(err, p->ai_protocol, mptcp);
if (rv == ANET_ERR) goto error;

if ((s = socket(p->ai_family, p->ai_socktype, rv)) == -1) continue;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pizhenwei if Valkey maintainers are still OK with that, I think it would make sense to create an MPTCP socket by default, if available. The option can still be used to turn it off if really needed. Then, it will be up to the client to decide to use MPTCP or not: that will ease the deployments. If MPTCP is only needed for some specific clients, then they can use it, and the others can continue to use "plain" TCP like before.

So here, anetTcpGetProtocol can be dropped, and something like this can be done:

Suggested change
rv = anetTcpGetProtocol(err, p->ai_protocol, mptcp);
if (rv == ANET_ERR) goto error;
if ((s = socket(p->ai_family, p->ai_socktype, rv)) == -1) continue;
#ifdef IPPROTO_MPTCP
if (!mptcp || (s = socket(p->ai_family, p->ai_socktype, IPPROTO_MPTCP)) == -1)
#endif
{
if ((s = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) continue;
}

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @matttbe

  • explicit mptcp (not strong opinion): I still prefer explicit mptcp config for the server side, because TCP and MPTCP are different transmission protocol theoretically. The maintainers (who deploy valkey-server) of valkey services should know the transmission protocol of their service and necessary operations for trouble shooting.
  • default no (not strong opinion): coping the config from 8.0 for the new version, but it listens on different socket protocol, it will make user confused.
  • anetTcpGetProtocol is needed (strong opinion): if user specifies mptcp yes on a lower kernel or other OS, it should be an error, because the valkey-server could NOT satisfy the user's request.

Let's leave this to the maintainer to decide...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pizhenwei thank you for your reply.

explicit mptcp (not strong opinion): I still prefer explicit mptcp config for the server side, because TCP and MPTCP are different transmission protocol theoretically. The maintainers (who deploy valkey-server) of valkey services should know the transmission protocol of their service and necessary operations for trouble shooting.

MPTCP is an extension to TCP, it is not a radical change. On the wire, TCP is still used, but with extra options in the header. But indeed, sysadmins should be told about the behaviour change, but at the end, they should not worry about that. Please note that when the client doesn't request to use MPTCP, the server application will get a "plain" TCP socket after the accept(). So no modification on that side.

  • anetTcpGetProtocol is needed (strong opinion): if user specifies mptcp yes on a lower kernel or other OS, it should be an error, because the valkey-server could NOT satisfy the user's request.

If MPTCP is used by default, the application should not fail if it is not available. Or you need tree choices: auto (fallback to TCP if MPTCP is not available), on (error if not available), off (don't try to use MPTCP).

Let's leave this to the maintainer to decide...

Indeed, my proposal was just a suggestion to ease MPTCP deployments without extra steps on the server side, only on the client side.

@@ -3276,6 +3276,7 @@ standardConfig static_configs[] = {
createIntConfig("timeout", NULL, MODIFIABLE_CONFIG, 0, INT_MAX, server.maxidletime, 0, INTEGER_CONFIG, NULL, NULL), /* Default client timeout: infinite */
createIntConfig("replica-announce-port", "slave-announce-port", MODIFIABLE_CONFIG, 0, 65535, server.replica_announce_port, 0, INTEGER_CONFIG, NULL, NULL),
createIntConfig("tcp-backlog", NULL, IMMUTABLE_CONFIG, 0, INT_MAX, server.tcp_backlog, 511, INTEGER_CONFIG, NULL, NULL), /* TCP listen backlog. */
createBoolConfig("mptcp", NULL, IMMUTABLE_CONFIG, server.mptcp, 0, NULL, NULL), /* Multipath TCP. */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linked to my previous comment:

Suggested change
createBoolConfig("mptcp", NULL, IMMUTABLE_CONFIG, server.mptcp, 0, NULL, NULL), /* Multipath TCP. */
createBoolConfig("mptcp", NULL, IMMUTABLE_CONFIG, server.mptcp, 1, NULL, NULL), /* Multipath TCP. */

# failover, and more resilient connections.
# Note that MPTCP is supported in the official Linux kernel starting with version 5.6.
#
# mptcp yes
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also linked to my previous comment

Suggested change
# mptcp yes
# MPTCP will be used if available. If not, "plain" TCP connections will be used.
mptcp yes

@pizhenwei pizhenwei requested review from matttbe and removed request for matttbe March 10, 2025 08:45
Apply Matthieu's suggestion

Signed-off-by: zhenwei pi <[email protected]>
@pizhenwei
Copy link
Contributor Author

Force push the last commit to fix DCO error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants