@@ -145,7 +145,7 @@ public class DatanodeConfiguration extends ReconfigurableConfig {
static final int BLOCK_DELETE_THREADS_DEFAULT = 5;

public static final String GRPC_SO_BACKLOG_KEY = "hdds.datanode.grpc.so.backlog";
- public static final int GRPC_SO_BACKLOG_DEFAULT = 4096;
+ public static final int GRPC_SO_BACKLOG_DEFAULT = 256;

public static final String BLOCK_DELETE_COMMAND_WORKER_INTERVAL =
"hdds.datanode.block.delete.command.worker.interval";
@@ -167,7 +167,7 @@ public class DatanodeConfiguration extends ReconfigurableConfig {
*/
@Config(key = "hdds.datanode.grpc.so.backlog",
type = ConfigType.INT,
- defaultValue = "4096",
+ defaultValue = "256",
tags = {DATANODE},
description = "The SO_BACKLOG value for the Datanode gRPC server socket. " +
"This limits the number of pending connections in the kernel's " +
@@ -139,6 +139,17 @@ public XceiverServerGrpc(DatanodeDetails datanodeDetails,
.channelType(channelType)
.withOption(ChannelOption.SO_BACKLOG, soBacklog)
.executor(readExecutors)
// If a client does not send an actual functional business RPC for 15 minutes,
// the server kicks them off with a GOAWAY frame.
.maxConnectionIdle(15, TimeUnit.MINUTES)
Contributor: Should we also set maxConnectionAge() to 1 hour?

Contributor Author: For a long-running application, an Ozone client can exist for hours, even days.

// If the server receives absolutely zero network traffic from a client for
// 5 minutes, the server proactively sends an HTTP/2 PING frame to verify
// if the network wire or client machine is still alive.
.keepAliveTime(5, TimeUnit.MINUTES)
Comment on lines +147 to +148
Contributor: 1 min?

Contributor Author: 1 min is a little aggressive.

// If the server fires a ping and the client fails to respond with a
// PING ACK within 30 seconds, the server assumes the socket is a dead
// "zombie connection" and immediately destroys the TCP socket.
.keepAliveTimeout(30, TimeUnit.SECONDS)
ChenSammi marked this conversation as resolved.
Contributor: 15 s?

Contributor Author: 15 s is a little aggressive.

.addService(ServerInterceptors.intercept(
xceiverService.bindServiceWithZeroCopy(),
new GrpcServerInterceptor()));
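
To make the trade-off in the review thread concrete, here is a rough sketch of the timing implied by the keepalive settings. It assumes a simplified model in which the server sends a PING after `keepAliveTime` with no inbound traffic and closes the socket if no ACK arrives within `keepAliveTimeout`, so a silently dead peer is detected after roughly the sum of the two. The class and method names (`KeepAliveTimeline`, `worstCaseDetection`) are hypothetical and not part of this PR; only the durations come from the diff and the reviewer's suggestions.

```java
import java.time.Duration;

// Hypothetical helper, not part of the PR: models the approximate time
// a server needs to notice a silently dead client connection.
public class KeepAliveTimeline {

    // Assumed model: PING is sent after keepAliveTime of silence, and the
    // connection is closed if no ACK arrives within keepAliveTimeout.
    static Duration worstCaseDetection(Duration keepAliveTime,
                                       Duration keepAliveTimeout) {
        return keepAliveTime.plus(keepAliveTimeout);
    }

    public static void main(String[] args) {
        // Values chosen in the PR: keepAliveTime = 5 min, keepAliveTimeout = 30 s.
        Duration chosen = worstCaseDetection(
            Duration.ofMinutes(5), Duration.ofSeconds(30));

        // Values suggested in review: 1 min and 15 s.
        Duration suggested = worstCaseDetection(
            Duration.ofMinutes(1), Duration.ofSeconds(15));

        System.out.println("PR values:       " + chosen.getSeconds() + " s");    // 330 s
        System.out.println("Reviewer values: " + suggested.getSeconds() + " s"); // 75 s
    }
}
```

Under this model the PR's values leave a dead connection open for about five and a half minutes, versus about 75 seconds with the reviewer's suggestion, at the cost of five times as many PING frames on idle but healthy connections.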