Skip to content

connection acquisition timeout description broken by NettyNioAsyncHttpClient #6013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Thrillpool opened this issue Apr 7, 2025 · 1 comment
Labels
documentation This is a problem with documentation. needs-triage This issue or PR still needs to be triaged.

Comments

@Thrillpool
Copy link

Thrillpool commented Apr 7, 2025

Describe the issue

If we read https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/SdkHttpConfigurationOption.html it says CONNECTION_ACQUIRE_TIMEOUT is

Timeout for acquiring an already-established connection from a connection pool to a remote service.

The implication being that this time does not need to worry about the time to actually form a connection, it's just retrieving one from pool and the timeout is if pool is full and it takes a while to get free.

But this is broken by NettyNioAsyncHttpClient, as shown by following (self contained, just have netty-nio-client compile dependency) snippet

package com.thrillpool;

import org.reactivestreams.Publisher;
import software.amazon.awssdk.http.SdkHttpConfigurationOption;
import software.amazon.awssdk.http.SdkHttpFullRequest;
import software.amazon.awssdk.http.SdkHttpMethod;
import software.amazon.awssdk.http.SdkHttpResponse;
import software.amazon.awssdk.http.async.AsyncExecuteRequest;
import software.amazon.awssdk.http.async.SdkAsyncHttpResponseHandler;
import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
import software.amazon.awssdk.utils.AttributeMap;

import java.net.URI;
import java.net.URISyntaxException;
import java.nio.ByteBuffer;
import java.time.Duration;
import java.util.concurrent.ExecutionException;

public class Main {
    public static void main(String[] args) throws URISyntaxException, ExecutionException, InterruptedException {
        var c = NettyNioAsyncHttpClient.builder()
                .buildWithDefaults(AttributeMap.builder()
                        .put(SdkHttpConfigurationOption.CONNECTION_ACQUIRE_TIMEOUT, Duration.ofMillis(5000))
                        .put(SdkHttpConfigurationOption.CONNECTION_TIMEOUT, Duration.ofMillis(10000))
                        .build()
                );
        var req = SdkHttpFullRequest
                .builder()
                // artificial uri guaranteed to timeout
                .uri(new URI("http://10.255.255.1"))
                .method(SdkHttpMethod.GET)
                .build();

        var inflight = c.execute(AsyncExecuteRequest
                .builder()
                .request(req)
                .responseHandler(new SdkAsyncHttpResponseHandler() {
                    public void onHeaders(SdkHttpResponse sdkHttpResponse) {}
                    public void onStream(Publisher<ByteBuffer> publisher) {}
                    public void onError(Throwable throwable) {}
                })
                .build()
        );
        inflight.get();
    }
}

which throws exception Acquire operation took longer than 5000 milliseconds, but the pool I made initially has no connections, this first time I try to do something should be in the connect case, and should timeout after 10s.

Reason is simple enough, BetterFixedChannelPool does the right thing here, it only schedules timeoutTask if pool is full of connections, but HealthCheckedChannelPool which ultimately delegates to BetterFixedChannelPool, it unconditionally starts timer.

Essentially there is implicit assumption that actually acquireConnectionTimeout is at least size of connection timeout for this specific client, but this isn't enforced anywhere (and it's too late to change now probably), so it should be shouted out not to go setting this number to something small.

Will note that docs for https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/nio/netty/NettyNioAsyncHttpClient.Builder.html for corresponding method say

The amount of time to wait when acquiring a connection from the pool before giving up and timing out.

which is more nebulous, and could in principle mean they include the time to actually make a connection, so I don't have a gripe with that per se, but in tandem with the fact that this method is coupled to this config that explicitly says it's for picking up existing connections, I do think it should be made clearer.

Links

https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/nio/netty/NettyNioAsyncHttpClient.Builder.html and perhaps others

@Thrillpool Thrillpool added documentation This is a problem with documentation. needs-triage This issue or PR still needs to be triaged. labels Apr 7, 2025
@Thrillpool
Copy link
Author

There is second negative consequence of this, consider expanded snippet with two requests

package com.thrillpool;

import org.reactivestreams.Publisher;
import software.amazon.awssdk.http.SdkHttpConfigurationOption;
import software.amazon.awssdk.http.SdkHttpFullRequest;
import software.amazon.awssdk.http.SdkHttpMethod;
import software.amazon.awssdk.http.SdkHttpResponse;
import software.amazon.awssdk.http.async.AsyncExecuteRequest;
import software.amazon.awssdk.http.async.SdkAsyncHttpResponseHandler;
import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
import software.amazon.awssdk.utils.AttributeMap;

import java.net.URI;
import java.net.URISyntaxException;
import java.nio.ByteBuffer;
import java.time.Duration;
import java.util.concurrent.ExecutionException;

public class Main {
    public static void main(String[] args) throws URISyntaxException, ExecutionException, InterruptedException {
        var startTime = System.currentTimeMillis();
        var c = NettyNioAsyncHttpClient.builder()
                .buildWithDefaults(AttributeMap.builder()
                        .put(SdkHttpConfigurationOption.CONNECTION_ACQUIRE_TIMEOUT, Duration.ofMillis(10000))
                        .put(SdkHttpConfigurationOption.CONNECTION_TIMEOUT, Duration.ofMillis(11000))
                        .build()
                );
        var req = SdkHttpFullRequest
                .builder()
                // artificial uri guaranteed to timeout
                .uri(new URI("http://10.255.255.1"))
                .method(SdkHttpMethod.GET)
                .build();

        var inflight = c.execute(AsyncExecuteRequest
                .builder()
                .request(req)
                .responseHandler(new SdkAsyncHttpResponseHandler() {
                    public void onHeaders(SdkHttpResponse sdkHttpResponse) {}
                    public void onStream(Publisher<ByteBuffer> publisher) {}
                    public void onError(Throwable throwable) {}
                })
                .build()
        );
        try {
            inflight.get();
        } catch (Exception e) {
            System.out.println(System.currentTimeMillis() - startTime);
        }
        var inflight2 = c.execute(AsyncExecuteRequest
                .builder()
                .request(req)
                .responseHandler(new SdkAsyncHttpResponseHandler() {
                    public void onHeaders(SdkHttpResponse sdkHttpResponse) {}
                    public void onStream(Publisher<ByteBuffer> publisher) {}
                    public void onError(Throwable throwable) {}
                })
                .build()
        );
        try {
            inflight2.get();
        } catch (Exception e) {
            System.out.println(System.currentTimeMillis() - startTime);
        }

    }
}

Normally one would hope second request had it's full own chance at making a connection, but no, it latches onto the still attempting to connect connection, and then fails after 1 second! That's not fair.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation This is a problem with documentation. needs-triage This issue or PR still needs to be triaged.
Projects
None yet
Development

No branches or pull requests

1 participant