Merge pull request #1725 from redis/DOC-5283-jedis-reconnect-advice

andy-stark-redis · web-flow · commit 97794d202d24 · 2025-07-09T14:13:29.000+01:00
DOC-5283 DOC-5287 DOC-5288 DOC-5289 DOC-5290 DOC-5291 Update production usage advice
diff --git a/content/develop/clients/dotnet/produsage.md b/content/develop/clients/dotnet/produsage.md
@@ -28,6 +28,7 @@ progress in implementing the recommendations.
     {{< checklist-item "#event-handling" >}}Event handling{{< /checklist-item >}}
     {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
     {{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}}
+    {{< checklist-item "#retries" >}}Retries{{< /checklist-item >}}
 {{< /checklist >}}
 
 ## Recommendations
@@ -110,3 +111,68 @@ the most common Redis exceptions:
   (for example, trying to access a
   [stream entry]({{< relref "/develop/data-types/streams#entry-ids" >}})
   using an invalid ID).
+
+### Retries
+
+During the initial `ConnectionMultiplexer.Connect()` call, `NRedisStack` will
+keep trying to connect if the first attempt fails. By default, it will make
+three attempts, but you can configure the number of retries using the
+`ConnectRetry` configuration option:
+
+```cs
+var muxer = ConnectionMultiplexer.Connect(new ConfigurationOptions {
+    ConnectRetry = 5,  // Retry up to five times.
+        .
+        .
+});
+```
+
+After the initial `Connect()` call is successful, `NRedisStack` will
+automatically attempt to reconnect if the connection is lost. You can
+specify a reconnection strategy with the `ReconnectRetryPolicy` configuration
+option. `NRedisStack` provides two built-in classes that implement
+reconnection strategies:
+
+-   `ExponentialRetry`: (Default) Uses an
+    [exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff)
+    strategy, where you specify an increment to the delay between successive
+    attempts and, optionally, a maximum delay, both in milliseconds.
+-   `LinearRetry`: Uses a linear backoff strategy with a fixed delay between
+    attempts, in milliseconds.
+
+The example below shows how to use the `ExponentialRetry` class:
+
+```cs
+var muxer = ConnectionMultiplexer.Connect(new ConfigurationOptions {
+    // 500ms increment per attempt, max 2000ms.
+    ReconnectRetryPolicy = new ExponentialRetry(500, 2000),
+        .
+        .
+});
+```
+
+You can also implement your own custom retry policy by creating a class
+that implements the `IReconnectRetryPolicy` interface.
+
+`NRedisStack` doesn't provide an automated retry mechanism for commands, but
+you can implement your own retry logic in your application code. Use
+a loop with a `try`/`catch` block to catch `RedisConnectionException` and
+`RedisTimeoutException` exceptions and then retry the command after a
+suitable delay, as shown in the example below:
+
+```cs
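+const int MAX_RETRIES = 3;
+string? value = null;  // Declared outside the loop so it can be used afterwards.
+
+for (int i = 0; i < MAX_RETRIES; i++) {
+    try {
+        value = db.StringGet("foo");
+        break;
+    } catch (Exception ex) when (
+        ex is RedisConnectionException || ex is RedisTimeoutException
+    ) {
+        if (i == MAX_RETRIES - 1) {
+            throw;  // Out of attempts, so report the failure to the caller.
+        }
+        // Wait before retrying, with a longer delay each time.
+        Thread.Sleep(500 * (i + 1));
+    }
+}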
+```
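The same retry pattern can be sketched without any client library. The example below is an illustrative Python sketch, not NRedisStack code: `TransientError`, `retry_command`, and `flaky_get` are hypothetical names standing in for the client's exceptions and a Redis command. It retries with a linearly increasing delay and re-raises the error once the attempts run out, so a persistent failure still reaches the caller. The `sleep` parameter is injected only so the example runs without real waiting.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a connection or timeout exception."""


def retry_command(command, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Run `command`, retrying on TransientError with a linearly growing delay."""
    for attempt in range(1, max_retries + 1):
        try:
            return command()
        except TransientError:
            if attempt == max_retries:
                # Out of attempts: let the caller see the failure.
                raise
            # Wait before retrying: 0.5 s, then 1.0 s, with the defaults.
            sleep(base_delay * attempt)


# Usage: a fake command that fails twice and then succeeds.
calls = {"count": 0}


def flaky_get():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("simulated connection loss")
    return "bar"


delays = []  # Record the requested delays instead of sleeping.
result = retry_command(flaky_get, sleep=delays.append)
print(result, delays)
```

Recording the delays in a list keeps the example fast. In application code you would leave `sleep` at its default.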
diff --git a/content/develop/clients/go/produsage.md b/content/develop/clients/go/produsage.md
@@ -28,6 +28,8 @@ progress in implementing the recommendations.
     {{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}}
     {{< checklist-item "#error-handling" >}}Error handling{{< /checklist-item >}}
     {{< checklist-item "#monitor-performance-and-errors">}}Monitor performance and errors{{< /checklist-item >}}
+    {{< checklist-item "#retries" >}}Retries{{< /checklist-item >}}
+    {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
 {{< /checklist >}}
 
 ## Recommendations
@@ -68,3 +70,55 @@ you trace command execution and monitor your server's performance.
 You can use this information to detect problems before they are reported
 by users. See [Observability]({{< relref "/develop/clients/go#observability" >}})
 for more information.
+
+### Retries
+
+`go-redis` will automatically retry failed connections and commands. By
+default, the number of attempts is set to three, but you can change this
+using the `MaxRetries` field of `Options` when you connect. The retry
+strategy starts with a short delay between the first and second attempts,
+and increases the delay with each attempt. The initial delay is set
+with the `MinRetryBackoff` option (defaulting to 8 milliseconds) and the
+maximum delay is set with the `MaxRetryBackoff` option (defaulting to
+512 milliseconds):
+
+```go
+client := redis.NewClient(&redis.Options{
+    MinRetryBackoff: 10 * time.Millisecond,
+    MaxRetryBackoff: 100 * time.Millisecond,
+    MaxRetries: 5,
+})
+```
+
+You can use the observability features of `go-redis` to monitor the
+number of retries and the time taken for each attempt, as noted in the
+[Monitor performance and errors](#monitor-performance-and-errors) section
+above. Use this data to help you decide on the best retry settings
+for your application.
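To get a feel for how the minimum and maximum backoff settings interact, the following Python sketch computes a capped schedule of delays. It assumes the delay simply doubles on each retry until it reaches the cap. That doubling rule is an assumption made for illustration, `backoff_schedule` is a hypothetical helper, and `go-redis` itself may randomize the delays, so treat the numbers as indicative rather than as the library's exact behavior.

```python
def backoff_schedule(min_backoff_ms, max_backoff_ms, retries):
    """Return the delay before each retry, doubling from the minimum up to the cap."""
    return [min(min_backoff_ms * 2 ** n, max_backoff_ms) for n in range(retries)]


# Defaults quoted above: 8 ms minimum, 512 ms maximum.
default_delays = backoff_schedule(8, 512, 8)

# Values from the Go example above: 10 ms minimum, 100 ms maximum, 5 retries.
custom_delays = backoff_schedule(10, 100, 5)

print(default_delays)
print(custom_delays)
```

Under this assumption, a low maximum flattens the schedule quickly, so the later retries are all spaced by the cap.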
+
+### Timeouts
+
+`go-redis` supports timeouts for connections and commands to avoid
+stalling your app if the server does not respond within a reasonable time.
+The `DialTimeout` field of `Options` sets the timeout for connections,
+and the `ReadTimeout` and `WriteTimeout` fields set the timeouts for
+reading and writing data, respectively. The default timeout is five seconds
+for connections and three seconds for reading and writing data, but you can
+set your own timeouts when you connect:
+
+```go
+client := redis.NewClient(&redis.Options{
+    DialTimeout:  10 * time.Second,
+    ReadTimeout:  5 * time.Second,
+    WriteTimeout: 5 * time.Second,
+})
+```
+
+You can use the observability features of `go-redis` to monitor the
+frequency of timeouts, as noted in the
+[Monitor performance and errors](#monitor-performance-and-errors) section
+above. Use this data to help you decide on the best timeout settings
+for your application. If timeouts are set too short, then `go-redis`
+might retry commands that would have succeeded if given more time. However,
+if they are too long, your app might hang unnecessarily while waiting for a
+response that will never arrive.
diff --git a/content/develop/clients/jedis/connect.md b/content/develop/clients/jedis/connect.md
@@ -368,3 +368,53 @@ poolConfig.setTimeBetweenEvictionRuns(Duration.ofSeconds(1));
 // to prevent connection starvation
 JedisPooled jedis = new JedisPooled(poolConfig, "localhost", 6379);
 ```
+
+### Retrying a command after a connection failure
+
+If a connection is lost before a command is completed, the command will fail with a `JedisConnectionException`. Although the connection pool manages the connections
+for you, you must request a new connection from the pool to retry the command.
+You would typically do this in a loop that makes several attempts to reconnect
+before aborting and reporting that the error isn't transient. The example below
+shows a retry loop that uses a simple
+[exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff)
+strategy:
+
+```java
+JedisPooled jedis = new JedisPooled(
+    new HostAndPort("localhost", 6379),
+    clientConfig,
+    poolConfig
+);
+
+// Set max retry attempts
+final int MAX_RETRIES = 5;
+
+// Example of retrying a command
+String key = "retry-example";
+String value = "success";
+
+int attempts = 0;
+boolean success = false;
+
+while (!success && attempts < MAX_RETRIES) {
+    try {
+        attempts++;
+        String result = jedis.set(key, value);
+        System.out.println("Command succeeded on attempt " + attempts + ": " + result);
+        success = true;
+    } catch (JedisConnectionException e) {
+        System.out.println("Connection failed on attempt " + attempts + ": " + e.getMessage());
+        if (attempts >= MAX_RETRIES) {
+            System.out.println("Max retries reached. Giving up.");
+            throw e;
+        }
+
+        // Wait before retrying
+        try {
+            Thread.sleep(500L << (attempts - 1)); // Exponential backoff: 500 ms, 1 s, 2 s, ...
+        } catch (InterruptedException ie) {
+            Thread.currentThread().interrupt();
+        }
+    }
+}
+```
diff --git a/content/develop/clients/jedis/produsage.md b/content/develop/clients/jedis/produsage.md
@@ -26,6 +26,7 @@ progress in implementing the recommendations.
 
 {{< checklist "prodlist" >}}
     {{< checklist-item "#connection-pooling" >}}Connection pooling{{< /checklist-item >}}
+    {{< checklist-item "#connection-retries" >}}Connection retries{{< /checklist-item >}}
     {{< checklist-item "#client-side-caching" >}}Client-side caching{{< /checklist-item >}}
     {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
     {{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}}
@@ -51,6 +52,13 @@ write your own code to cache and reuse open connections. See
 [Connect with a connection pool]({{< relref "/develop/clients/jedis/connect#connect-with-a-connection-pool" >}})
 to learn how to use this technique with Jedis.
 
+### Connection retries
+
+If a connection is lost before a command is completed, the command will fail with a `JedisConnectionException`. However, a connection error is often transient, in which case the
+command will succeed after one or more reconnection attempts. See
+[Retrying a command after a connection failure]({{< relref "/develop/clients/jedis/connect#retrying-a-command-after-a-connection-failure" >}})
+for an example of a simple retry loop that can recover from a transient connection error.
+
 ### Client-side caching
 
 [Client-side caching]({{< relref "/develop/clients/client-side-caching" >}})
diff --git a/content/develop/clients/lettuce/produsage.md b/content/develop/clients/lettuce/produsage.md
@@ -29,6 +29,7 @@ progress in implementing the recommendations.
     {{< checklist-item "#cluster-topology-refresh">}}Cluster topology refresh{{< /checklist-item >}}
     {{< checklist-item "#dns-cache-and-redis" >}}DNS cache and Redis{{< /checklist-item >}}
     {{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}}
+    {{< checklist-item "#connection-and-execution-reliability" >}}Connection and execution reliability{{< /checklist-item >}}
 {{< /checklist >}}
 
 ## Recommendations
@@ -189,3 +190,51 @@ See the Error handling sections of the
 [Lettuce async](https://redis.github.io/lettuce/user-guide/async-api/#error-handling) and
 [Lettuce reactive](https://redis.github.io/lettuce/user-guide/reactive-api/#error-handling)
 API guides to learn more about handling exceptions.
+
+
+## Connection and execution reliability
+
+By default, Lettuce uses an *at-least-once* strategy for command execution.
+It will automatically reconnect after a disconnection and resume executing
+any commands that were queued when the connection was lost. If you
+switch to *at-most-once* execution, Lettuce will
+not reconnect after a disconnection and will discard commands
+instead of queuing them. You can enable at-most-once execution by setting
+`autoReconnect(false)` in the
+`ClientOptions` when you create the client, as shown in the example below:
+
+```java
+RedisURI uri = RedisURI.Builder
+                .redis("localhost", 6379)
+                .withAuthentication("default", "yourPassword")
+                .build();
+
+RedisClient client = RedisClient.create(uri);
+
+client.setOptions(ClientOptions.builder()
+    .autoReconnect(false)
+        .
+        .
+    .build());
+```
+
+If you need finer control over which commands you want to execute in which mode, you can
+configure a *replay filter* to choose the commands that should retry after a disconnection.
+The example below shows a filter that retries all commands except for
+[`DECR`]({{< relref "/commands/decr" >}})
+(this command is not [idempotent](https://en.wikipedia.org/wiki/Idempotence) and
+so you might need to avoid executing it more than once). Note that
+replay filters are only available in Lettuce v6.6 and above.
+
+```java
+// The predicate returns true for commands that should NOT be replayed.
+Predicate<RedisCommand<?, ?, ?>> filter =
+        cmd -> cmd.getType().toString().equalsIgnoreCase("DECR");
+
+client.setOptions(ClientOptions.builder()
+    .replayFilter(filter)
+    .build());
+```
+
+See
+[Command execution reliability](https://redis.github.io/lettuce/advanced-usage/#command-execution-reliability)
+in the Lettuce reference guide for more information.
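The effect of a replay filter can be pictured with a small toy model. The Python sketch below is not Lettuce code and does not reflect its internals. `replay_after_reconnect` and `is_decr` are hypothetical names, and the sketch assumes, as in the Java example above, that the predicate returns true for commands that should be discarded rather than replayed.

```python
def replay_after_reconnect(unacknowledged, skip_replay):
    """Split unacknowledged commands into those to resend and those to drop."""
    replayed = [cmd for cmd in unacknowledged if not skip_replay(cmd)]
    dropped = [cmd for cmd in unacknowledged if skip_replay(cmd)]
    return replayed, dropped


def is_decr(cmd):
    """Filter predicate: True means the command must not be replayed."""
    return cmd[0].upper() == "DECR"


# Commands that were waiting for a reply when the connection was lost.
pending = [("SET", "greeting", "hello"), ("DECR", "stock"), ("GET", "greeting")]

replayed, dropped = replay_after_reconnect(pending, is_decr)
print("replayed:", replayed)
print("dropped:", dropped)
```

In this model the application has to decide what to do about the dropped `DECR`, for example by checking the current value before issuing it again.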
diff --git a/content/develop/clients/nodejs/produsage.md b/content/develop/clients/nodejs/produsage.md
@@ -27,7 +27,8 @@ progress in implementing the recommendations.
 {{< checklist "nodeprodlist" >}}
     {{< checklist-item "#handling-errors" >}}Handling errors{{< /checklist-item >}}
     {{< checklist-item "#handling-reconnections" >}}Handling reconnections{{< /checklist-item >}}
-    {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
+    {{< checklist-item "#connection-timeouts" >}}Connection timeouts{{< /checklist-item >}}
+    {{< checklist-item "#command-execution-reliability" >}}Command execution reliability{{< /checklist-item >}}
 {{< /checklist >}}
 
 ## Recommendations
@@ -63,15 +64,44 @@ own custom strategy. See
 [Reconnect after disconnection]({{< relref "/develop/clients/nodejs/connect#reconnect-after-disconnection" >}})
 for more information.
 
-### Timeouts
+### Connection timeouts
 
-To set a timeout for a connection, use the `connectTimeout` option:
-```typescript
+To set a timeout for a connection, use the `connectTimeout` option
+(the default timeout is 5 seconds):
+
+```js
 const client = createClient({
   socket: {
     // setting a 10-second timeout  
     connectTimeout: 10000 // in milliseconds
   }
 });
 client.on('error', error => console.error('Redis client error:', error));
-```
+```
+
+### Command execution reliability
+
+By default, `node-redis` reconnects automatically when the connection is lost
+(see [Handling reconnections](#handling-reconnections) if you want to
+customize this behavior). While the connection is down, any commands that you
+execute will be queued and sent to the server when the connection is restored.
+This might occasionally cause problems if the connection fails while a
+[non-idempotent](https://en.wikipedia.org/wiki/Idempotence) command
+is being executed. In this case, the command could change the data on the server
+without the client removing it from the queue. When the connection is restored,
+the command will be sent again, resulting in incorrect data.
+
+If you need to avoid this situation, set the `disableOfflineQueue` option
+to `true` when you create the client. This will cause the client to discard
+unexecuted commands rather than queuing them:
+
+```js
+const client = createClient({
+  disableOfflineQueue: true,
+      .
+      .
+});
+```
+
+Use a separate connection with the queue disabled if you want to avoid queuing
+only for specific commands.
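The difference between idempotent and non-idempotent commands is easy to see in a simulation. The Python sketch below does not use `node-redis` or a real server. `apply` and `run_with_duplicate` are hypothetical helpers that mimic a command reaching the server once before the connection drops and then being sent a second time from the queue.

```python
def apply(store, command):
    """Apply a tiny subset of Redis-like commands to an in-memory dict."""
    name, key, *args = command
    if name == "SET":
        store[key] = args[0]
    elif name == "DECR":
        store[key] = store.get(key, 0) - 1
    else:
        raise ValueError(f"unsupported command: {name}")


def run_with_duplicate(command, initial):
    """Simulate a command that was executed once and then resent after a reconnect."""
    store = dict(initial)
    apply(store, command)  # Reached the server before the connection dropped.
    apply(store, command)  # Sent again from the queue after reconnecting.
    return store


after_set = run_with_duplicate(("SET", "stock", 5), {"stock": 10})
after_decr = run_with_duplicate(("DECR", "stock"), {"stock": 10})
print(after_set)   # SET is idempotent: the duplicate has no extra effect.
print(after_decr)  # DECR is not: the value drops by two instead of one.
```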
diff --git a/content/develop/clients/redis-py/connect.md b/content/develop/clients/redis-py/connect.md
@@ -242,3 +242,13 @@ r3.close()
 
 pool.close()
 ```
+
+## Retrying connections
+
+A connection will sometimes fail because of a transient problem, such as a
+network outage or a server that is temporarily unavailable. In these cases,
+retrying the connection after a short delay will usually succeed. `redis-py` uses
+a simple retry strategy by default, but there are various ways you can customize
+this behavior to suit your use case. See
+[Retries]({{< relref "/develop/clients/redis-py/produsage#retries" >}})
+for more information about custom retry strategies, with example code.
diff --git a/content/develop/clients/redis-py/produsage.md b/content/develop/clients/redis-py/produsage.md
@@ -29,6 +29,7 @@ progress in implementing the recommendations.
     {{< checklist-item "#retries" >}}Retries{{< /checklist-item >}}
     {{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}}
     {{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}}
+    {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
 {{< /checklist >}}
 
 ## Recommendations
@@ -170,3 +171,29 @@ module. The list below describes some of the most common exceptions.
 - `WatchError`: Thrown when a
   [watched key]({{< relref "/develop/clients/redis-py/transpipe#watch-keys-for-changes" >}}) is
   modified during a transaction.
+
+### Timeouts
+
+After you issue a command or a connection attempt, the client will wait
+for a response from the server. If the server doesn't respond within a
+certain time limit, the client will throw a `TimeoutError`. By default,
+the timeout happens after 10 seconds for both connections and commands, but you
+can set your own timeouts using the `socket_connect_timeout` and `socket_timeout` parameters
+when you connect:
+
+```py
+# Set a 15-second timeout for connections and a
+# 5-second timeout for commands.
+r = Redis(
+  socket_connect_timeout=15,
+  socket_timeout=5,
+    .
+    .
+)
+```
+
+Take care to set the timeouts to appropriate values for your use case.
+If you use timeouts that are too short, then `redis-py` might retry
+commands that would have succeeded if given more time. However, if the
+timeouts are too long, your app might hang unnecessarily while waiting for a
+response that will never arrive.
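To see what a read timeout means at the socket level, the sketch below uses only Python's standard `socket` module rather than `redis-py`. It creates a connected pair of local sockets, sends a request, and waits for a reply that the other side never sends. The 100-millisecond limit is an arbitrary value chosen to keep the example fast.

```python
import socket

# A connected pair of local sockets; `server` never sends a reply.
client, server = socket.socketpair()
client.settimeout(0.1)  # Give up waiting after 100 ms.

try:
    client.sendall(b"PING\r\n")
    client.recv(1024)  # Blocks until data arrives or the timeout expires.
    timed_out = False
except socket.timeout:
    # Raised when no data arrives within the time limit.
    timed_out = True
finally:
    client.close()
    server.close()

print("timed out:", timed_out)
```

With a longer limit the call would simply block for longer before failing, which is the trade-off described above.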