
Commit 97794d2

Merge pull request #1725 from redis/DOC-5283-jedis-reconnect-advice
DOC-5283 DOC-5287 DOC-5288 DOC-5289 DOC-5290 DOC-5291 Update production usage advice
2 parents f1be735 + 8451992 commit 97794d2

8 files changed: +299 -5 lines changed


content/develop/clients/dotnet/produsage.md

Lines changed: 66 additions & 0 deletions
@@ -28,6 +28,7 @@ progress in implementing the recommendations.
2828
{{< checklist-item "#event-handling" >}}Event handling{{< /checklist-item >}}
2929
{{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
3030
{{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}}
31+
{{< checklist-item "#retries" >}}Retries{{< /checklist-item >}}
3132
{{< /checklist >}}
3233

3334
## Recommendations
@@ -110,3 +111,68 @@ the most common Redis exceptions:
110111
(for example, trying to access a
111112
[stream entry]({{< relref "/develop/data-types/streams#entry-ids" >}})
112113
using an invalid ID).
114+
115+
### Retries
116+
117+
During the initial `ConnectionMultiplexer.Connect()` call, `NRedisStack` will
118+
keep trying to connect if the first attempt fails. By default, it will make
119+
three attempts, but you can configure the number of retries using the
120+
`ConnectRetry` configuration option:
121+
122+
```cs
123+
var muxer = ConnectionMultiplexer.Connect(new ConfigurationOptions {
124+
ConnectRetry = 5, // Retry up to five times.
125+
.
126+
.
127+
});
128+
```
129+
130+
After the initial `Connect()` call is successful, `NRedisStack` will
131+
automatically attempt to reconnect if the connection is lost. You can
132+
specify a reconnection strategy with the `ReconnectRetryPolicy` configuration
133+
option. `NRedisStack` provides two built-in classes that implement
134+
reconnection strategies:
135+
136+
- `ExponentialRetry`: (Default) Uses an
137+
[exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff)
138+
strategy, where you specify an increment to the delay between successive
139+
attempts and, optionally, a maximum delay, both in milliseconds.
140+
- `LinearRetry`: Waits for a fixed delay between
141+
attempts, in milliseconds.
142+
143+
The example below shows how to use the `ExponentialRetry` class:
144+
145+
```cs
146+
var muxer = ConnectionMultiplexer.Connect(new ConfigurationOptions {
147+
// 500ms increment per attempt, max 2000ms.
148+
ReconnectRetryPolicy = new ExponentialRetry(500, 2000),
149+
.
150+
.
151+
});
152+
```
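To make the difference between the two strategies concrete, the sketch below (in Python, for brevity) computes the delay before each reconnection attempt. It is an illustrative model only: the function names are hypothetical, and it assumes plain doubling for the exponential case, which may not match the exact calculation `NRedisStack` uses internally (for example, the library may add randomization).

```python
def linear_delays(delay_ms, attempts):
    """Fixed delay before every attempt."""
    return [delay_ms for _ in range(attempts)]


def exponential_delays(increment_ms, max_ms, attempts):
    """Delay doubles with each attempt, starting at increment_ms, capped at max_ms."""
    return [min(increment_ms * 2 ** n, max_ms) for n in range(attempts)]


print(linear_delays(500, 4))             # [500, 500, 500, 500]
print(exponential_delays(500, 2000, 4))  # [500, 1000, 2000, 2000]
```

With a cap in place, the exponential schedule backs off quickly at first but never waits longer than the maximum delay between attempts.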
153+
154+
You can also implement your own custom retry policy by creating a class
155+
that implements the `IReconnectRetryPolicy` interface.
156+
157+
`NRedisStack` doesn't provide an automated retry mechanism for commands, but
158+
you can implement your own retry logic in your application code. Use
159+
a loop with a `try`/`catch` block to catch `RedisConnectionException` and
160+
`RedisTimeoutException` exceptions and then retry the command after a
161+
suitable delay, as shown in the example below:
162+
163+
```cs
164+
const int MAX_RETRIES = 3;
165+
166+
for (int i = 0; i < MAX_RETRIES; i++) {
167+
try {
168+
string value = db.StringGet("foo");
169+
break;
170+
} catch (RedisConnectionException) {
171+
// Wait before retrying.
172+
Thread.Sleep(500 * (i + 1));
173+
} catch (RedisTimeoutException) {
174+
// Wait before retrying.
175+
Thread.Sleep(500 * (i + 1));
176+
}
177+
}
178+
```
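Note that the loop above ends silently if every attempt fails. One way to address this is to wrap the pattern in a helper that re-raises the last error. The sketch below shows the idea in Python, using the built-in `ConnectionError` and `TimeoutError` as stand-ins for the Redis exception types. The `call_with_retries` and `flaky_get` names are hypothetical and are not part of any client library.

```python
import time


def call_with_retries(operation, retryable=(ConnectionError, TimeoutError),
                      max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying when it raises one of the retryable errors.

    The delay grows with the attempt number (base_delay * attempt).
    After the final attempt, the error is re-raised to the caller.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_retries:
                raise
            sleep(base_delay * attempt)


# Hypothetical flaky operation: fails twice, then succeeds.
calls = {"count": 0}


def flaky_get():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated dropped connection")
    return "bar"


slept = []  # Record the delays instead of actually sleeping.
result = call_with_retries(flaky_get, sleep=slept.append)
print(result, slept)  # bar [0.5, 1.0]
```

Passing the `sleep` function as a parameter is a design choice that makes the helper easy to test without real delays.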

content/develop/clients/go/produsage.md

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,8 @@ progress in implementing the recommendations.
2828
{{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}}
2929
{{< checklist-item "#error-handling" >}}Error handling{{< /checklist-item >}}
3030
{{< checklist-item "#monitor-performance-and-errors">}}Monitor performance and errors{{< /checklist-item >}}
31+
{{< checklist-item "#retries" >}}Retries{{< /checklist-item >}}
32+
{{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
3133
{{< /checklist >}}
3234

3335
## Recommendations
@@ -68,3 +70,55 @@ you trace command execution and monitor your server's performance.
6870
You can use this information to detect problems before they are reported
6971
by users. See [Observability]({{< relref "/develop/clients/go#observability" >}})
7072
for more information.
73+
74+
### Retries
75+
76+
`go-redis` will automatically retry failed connections and commands. By
77+
default, the number of attempts is set to three, but you can change this
78+
using the `MaxRetries` field of `Options` when you connect. The retry
79+
strategy starts with a short delay between the first and second attempts,
80+
and increases the delay with each attempt. The initial delay is set
81+
with the `MinRetryBackoff` option (defaulting to 8 milliseconds) and the
82+
maximum delay is set with the `MaxRetryBackoff` option (defaulting to
83+
512 milliseconds):
84+
85+
```go
86+
client := redis.NewClient(&redis.Options{
87+
MinRetryBackoff: 10 * time.Millisecond,
88+
MaxRetryBackoff: 100 * time.Millisecond,
89+
MaxRetries: 5,
90+
})
91+
```
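When choosing these values, it can help to estimate how much extra latency the retries might add in the worst case. The Python sketch below is a simplified, deterministic model that assumes the delay doubles with each retry and is capped at the maximum. It is not the `go-redis` implementation, which may randomize the delays, and the `backoff_schedule` function is hypothetical.

```python
def backoff_schedule(min_ms, max_ms, max_retries):
    """Approximate delay before each retry: doubles from min_ms, capped at max_ms."""
    return [min(min_ms * 2 ** n, max_ms) for n in range(max_retries)]


defaults = backoff_schedule(8, 512, 3)   # default settings described above
custom = backoff_schedule(10, 100, 5)    # settings from the example above

print(defaults, sum(defaults))  # [8, 16, 32] 56
print(custom, sum(custom))      # [10, 20, 40, 80, 100] 250
```

Under this model, the custom settings add up to roughly 250 milliseconds of waiting, on top of the time spent on the failed attempts themselves.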
92+
93+
You can use the observability features of `go-redis` to monitor the
94+
number of retries and the time taken for each attempt, as noted in the
95+
[Monitor performance and errors](#monitor-performance-and-errors) section
96+
above. Use this data to help you decide on the best retry settings
97+
for your application.
98+
99+
### Timeouts
100+
101+
`go-redis` supports timeouts for connections and commands to avoid
102+
stalling your app if the server does not respond within a reasonable time.
103+
The `DialTimeout` field of `Options` sets the timeout for connections,
104+
and the `ReadTimeout` and `WriteTimeout` fields set the timeouts for
105+
reading and writing data, respectively. The default timeout is five seconds
106+
for connections and three seconds for reading and writing data, but you can
107+
set your own timeouts when you connect:
108+
109+
```go
110+
client := redis.NewClient(&redis.Options{
111+
DialTimeout: 10 * time.Second,
112+
ReadTimeout: 5 * time.Second,
113+
WriteTimeout: 5 * time.Second,
114+
})
115+
```
116+
117+
You can use the observability features of `go-redis` to monitor the
118+
frequency of timeouts, as noted in the
119+
[Monitor performance and errors](#monitor-performance-and-errors) section
120+
above. Use this data to help you decide on the best timeout settings
121+
for your application. If timeouts are set too short, then `go-redis`
122+
might retry commands that would have succeeded if given more time. However,
123+
if they are too long, your app might hang unnecessarily while waiting for a
124+
response that will never arrive.

content/develop/clients/jedis/connect.md

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -368,3 +368,53 @@ poolConfig.setTimeBetweenEvictionRuns(Duration.ofSeconds(1));
368368
// to prevent connection starvation
369369
JedisPooled jedis = new JedisPooled(poolConfig, "localhost", 6379);
370370
```
371+
372+
### Retrying a command after a connection failure
373+
374+
If a connection is lost before a command is completed, the command will fail with a `JedisConnectionException`. Although the connection pool manages the connections
375+
for you, you must request a new connection from the pool to retry the command.
376+
You would typically do this in a loop that makes several attempts to reconnect
377+
before aborting and reporting that the error isn't transient. The example below
378+
shows a retry loop that uses a simple
379+
[exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff)
380+
strategy:
381+
382+
```java
383+
JedisPooled jedis = new JedisPooled(
384+
new HostAndPort("localhost", 6379),
385+
clientConfig,
386+
poolConfig
387+
);
388+
389+
// Set max retry attempts
390+
final int MAX_RETRIES = 5;
391+
392+
// Example of retrying a command
393+
String key = "retry-example";
394+
String value = "success";
395+
396+
int attempts = 0;
397+
boolean success = false;
398+
399+
while (!success && attempts < MAX_RETRIES) {
400+
try {
401+
attempts++;
402+
String result = jedis.set(key, value);
403+
System.out.println("Command succeeded on attempt " + attempts + ": " + result);
404+
success = true;
405+
} catch (JedisConnectionException e) {
406+
System.out.println("Connection failed on attempt " + attempts + ": " + e.getMessage());
407+
if (attempts >= MAX_RETRIES) {
408+
System.out.println("Max retries reached. Giving up.");
409+
throw e;
410+
}
411+
412+
// Wait before retrying
413+
try {
414+
Thread.sleep(500L << (attempts - 1)); // Exponential backoff: 500 ms, 1 s, 2 s, 4 s
415+
} catch (InterruptedException ie) {
416+
Thread.currentThread().interrupt();
417+
}
418+
}
419+
}
420+
```

content/develop/clients/jedis/produsage.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@ progress in implementing the recommendations.
2626

2727
{{< checklist "prodlist" >}}
2828
{{< checklist-item "#connection-pooling" >}}Connection pooling{{< /checklist-item >}}
29+
{{< checklist-item "#connection-retries" >}}Connection retries{{< /checklist-item >}}
2930
{{< checklist-item "#client-side-caching" >}}Client-side caching{{< /checklist-item >}}
3031
{{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
3132
{{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}}
@@ -51,6 +52,13 @@ write your own code to cache and reuse open connections. See
5152
[Connect with a connection pool]({{< relref "/develop/clients/jedis/connect#connect-with-a-connection-pool" >}})
5253
to learn how to use this technique with Jedis.
5354

55+
### Connection retries
56+
57+
If a connection is lost before a command is completed, the command will fail with a `JedisConnectionException`. However, a connection error is often transient, in which case the
58+
command will succeed after one or more reconnection attempts. See
59+
[Retrying a command after a connection failure]({{< relref "/develop/clients/jedis/connect#retrying-a-command-after-a-connection-failure" >}})
60+
for an example of a simple retry loop that can recover from a transient connection error.
61+
5462
### Client-side caching
5563

5664
[Client-side caching]({{< relref "/develop/clients/client-side-caching" >}})

content/develop/clients/lettuce/produsage.md

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@ progress in implementing the recommendations.
2929
{{< checklist-item "#cluster-topology-refresh">}}Cluster topology refresh{{< /checklist-item >}}
3030
{{< checklist-item "#dns-cache-and-redis" >}}DNS cache and Redis{{< /checklist-item >}}
3131
{{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}}
32+
{{< checklist-item "#connection-and-execution-reliability" >}}Connection and execution reliability{{< /checklist-item >}}
3233
{{< /checklist >}}
3334

3435
## Recommendations
@@ -189,3 +190,51 @@ See the Error handling sections of the
189190
[Lettuce async](https://redis.github.io/lettuce/user-guide/async-api/#error-handling) and
190191
[Lettuce reactive](https://redis.github.io/lettuce/user-guide/reactive-api/#error-handling)
191192
API guides to learn more about handling exceptions.
193+
194+
195+
## Connection and execution reliability
196+
197+
By default, Lettuce uses an *at-least-once* strategy for command execution.
198+
It will automatically reconnect after a disconnection and resume executing
199+
any commands that were queued when the connection was lost. If you
200+
switch to *at-most-once* execution, Lettuce will
201+
not reconnect after a disconnection and will discard commands
202+
instead of queuing them. You can enable at-most-once execution by setting
203+
`autoReconnect(false)` in the
204+
`ClientOptions` when you create the client, as shown in the example below:
205+
206+
```java
207+
RedisURI uri = RedisURI.Builder
208+
.redis("localhost", 6379)
209+
.withAuthentication("default", "yourPassword")
210+
.build();
211+
212+
RedisClient client = RedisClient.create(uri);
213+
214+
client.setOptions(ClientOptions.builder()
215+
.autoReconnect(false)
216+
.
217+
.
218+
.build());
219+
```
220+
221+
If you need finer control over which commands you want to execute in which mode, you can
222+
configure a *replay filter* to choose the commands that should retry after a disconnection.
223+
The example below shows a filter that retries all commands except for
224+
[`DECR`]({{< relref "/commands/decr" >}})
225+
(this command is not [idempotent](https://en.wikipedia.org/wiki/Idempotence) and
226+
so you might need to avoid executing it more than once). Note that
227+
replay filters are only available in Lettuce v6.6 and above.
228+
229+
```java
230+
Predicate<RedisCommand<?, ?, ?> > filter =
231+
cmd -> cmd.getType().toString().equalsIgnoreCase("DECR");
232+
233+
client.setOptions(ClientOptions.builder()
234+
.replayFilter(filter)
235+
.build());
236+
```
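The effect of the filter can be pictured as a partition of the commands that were queued when the connection dropped. The Python sketch below models only the behavior described in the text, not Lettuce's internals. It assumes, as in the `DECR` example above, that the predicate returns `true` for commands that should *not* be replayed. The `replay_after_reconnect` function is hypothetical.

```python
def replay_after_reconnect(queued, is_filtered):
    """Split queued commands into those replayed and those dropped on reconnect.

    is_filtered(cmd) returning True means the command is NOT replayed.
    """
    replayed = [cmd for cmd in queued if not is_filtered(cmd)]
    dropped = [cmd for cmd in queued if is_filtered(cmd)]
    return replayed, dropped


queued = [("SET", "k", "v"), ("DECR", "counter"), ("GET", "k")]
replayed, dropped = replay_after_reconnect(
    queued, lambda cmd: cmd[0].upper() == "DECR"
)

print(replayed)  # [('SET', 'k', 'v'), ('GET', 'k')]
print(dropped)   # [('DECR', 'counter')]
```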
237+
238+
See
239+
[Command execution reliability](https://redis.github.io/lettuce/advanced-usage/#command-execution-reliability)
240+
in the Lettuce reference guide for more information.

content/develop/clients/nodejs/produsage.md

Lines changed: 35 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,8 @@ progress in implementing the recommendations.
2727
{{< checklist "nodeprodlist" >}}
2828
{{< checklist-item "#handling-errors" >}}Handling errors{{< /checklist-item >}}
2929
{{< checklist-item "#handling-reconnections" >}}Handling reconnections{{< /checklist-item >}}
30-
{{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
30+
{{< checklist-item "#connection-timeouts" >}}Connection timeouts{{< /checklist-item >}}
31+
{{< checklist-item "#command-execution-reliability" >}}Command execution reliability{{< /checklist-item >}}
3132
{{< /checklist >}}
3233

3334
## Recommendations
@@ -63,15 +64,44 @@ own custom strategy. See
6364
[Reconnect after disconnection]({{< relref "/develop/clients/nodejs/connect#reconnect-after-disconnection" >}})
6465
for more information.
6566

66-
### Timeouts
67+
### Connection timeouts
6768

68-
To set a timeout for a connection, use the `connectTimeout` option:
69-
```typescript
69+
To set a timeout for a connection, use the `connectTimeout` option
70+
(the default timeout is 5 seconds):
71+
72+
```js
7073
const client = createClient({
7174
socket: {
7275
// setting a 10-second timeout
7376
connectTimeout: 10000 // in milliseconds
7477
}
7578
});
7679
client.on('error', error => console.error('Redis client error:', error));
77-
```
80+
```
81+
82+
### Command execution reliability
83+
84+
By default, `node-redis` reconnects automatically when the connection is lost
85+
(but see [Handling reconnections](#handling-reconnections), if you want to
86+
customize this behavior). While the connection is down, any commands that you
87+
execute will be queued and sent to the server when the connection is restored.
88+
This might occasionally cause problems if the connection fails while a
89+
[non-idempotent](https://en.wikipedia.org/wiki/Idempotence) command
90+
is being executed. In this case, the command could change the data on the server
91+
without the client removing it from the queue. When the connection is restored,
92+
the command will be sent again, resulting in incorrect data.
93+
94+
If you need to avoid this situation, set the `disableOfflineQueue` option
95+
to `true` when you create the client. This will cause the client to discard
96+
unexecuted commands rather than queuing them:
97+
98+
```js
99+
const client = createClient({
100+
disableOfflineQueue: true,
101+
.
102+
.
103+
});
104+
```
105+
106+
Use a separate connection with the queue disabled if you want to avoid queuing
107+
only for specific commands.
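The duplicate-execution problem described above can be illustrated with a small simulation. The Python sketch below uses a hypothetical `FakeServer` class in place of a real Redis server and a deliberately simplified client model. It is not how `node-redis` is implemented, but it shows why replaying a non-idempotent command such as `INCR` can produce the wrong result.

```python
class FakeServer:
    """Hypothetical stand-in for a Redis server holding one counter."""

    def __init__(self):
        self.counter = 0

    def incr(self):
        self.counter += 1
        return self.counter


def send_with_lost_reply(server, offline_queue_enabled):
    """Send INCR once; the reply is lost because the connection drops.

    With the queue enabled, the client still treats the command as pending
    and sends it again after reconnecting.
    """
    server.incr()            # The server applies the command...
    reply_received = False   # ...but the reply never reaches the client.
    if not reply_received and offline_queue_enabled:
        server.incr()        # Replayed on reconnect: applied a second time.
    return server.counter


print(send_with_lost_reply(FakeServer(), offline_queue_enabled=True))   # 2
print(send_with_lost_reply(FakeServer(), offline_queue_enabled=False))  # 1
```

With the queue disabled, the command is applied at most once, at the cost of the application having to decide whether to reissue it.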

content/develop/clients/redis-py/connect.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -242,3 +242,13 @@ r3.close()
242242

243243
pool.close()
244244
```
245+
246+
## Retrying connections
247+
248+
A connection will sometimes fail because of a transient problem, such as a
249+
network outage or a server that is temporarily unavailable. In these cases,
250+
retrying the connection after a short delay will usually succeed. `redis-py` uses
251+
a simple retry strategy by default, but there are various ways you can customize
252+
this behavior to suit your use case. See
253+
[Retries]({{< relref "/develop/clients/redis-py/produsage#retries" >}})
254+
for more information about custom retry strategies, with example code.

content/develop/clients/redis-py/produsage.md

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@ progress in implementing the recommendations.
2929
{{< checklist-item "#retries" >}}Retries{{< /checklist-item >}}
3030
{{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}}
3131
{{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}}
32+
{{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}}
3233
{{< /checklist >}}
3334

3435
## Recommendations
@@ -170,3 +171,29 @@ module. The list below describes some of the most common exceptions.
170171
- `WatchError`: Thrown when a
171172
[watched key]({{< relref "/develop/clients/redis-py/transpipe#watch-keys-for-changes" >}}) is
172173
modified during a transaction.
174+
175+
### Timeouts
176+
177+
After you issue a command or a connection attempt, the client will wait
178+
for a response from the server. If the server doesn't respond within a
179+
certain time limit, the client will raise a `TimeoutError`. By default,
180+
the timeout happens after 10 seconds for both connections and commands, but you
181+
can set your own timeouts using the `socket_connect_timeout` and `socket_timeout` parameters
182+
when you connect:
183+
184+
```py
185+
# Set a 15-second timeout for connections and a
186+
# 5-second timeout for commands.
187+
r = Redis(
188+
socket_connect_timeout=15,
189+
socket_timeout=5,
190+
.
191+
.
192+
)
193+
```
194+
195+
Take care to set the timeouts to appropriate values for your use case.
196+
If you use timeouts that are too short, then `redis-py` might retry
197+
commands that would have succeeded if given more time. However, if the
198+
timeouts are too long, your app might hang unnecessarily while waiting for a
199+
response that will never arrive.
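The distinction between a connection timeout and a command (read) timeout can be seen with plain sockets from the Python standard library. The sketch below does not use `redis-py` and is not a description of its internals. It only illustrates the two kinds of limit, using a local listener that accepts TCP connections but never sends a reply.

```python
import socket

# Local listener that completes the TCP handshake but never replies.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # Port 0 asks the OS for any free port.
listener.listen(1)
host, port = listener.getsockname()

# Connection timeout: limits how long the connect step itself may take.
conn = socket.create_connection((host, port), timeout=2.0)

# Read timeout: limits how long each later recv() call may block.
conn.settimeout(0.2)
try:
    conn.recv(1024)
    timed_out = False
except socket.timeout:
    timed_out = True
finally:
    conn.close()
    listener.close()

print(timed_out)  # True
```

Here the connection succeeds well within its limit, but the read times out because no response ever arrives, which is the situation a command timeout is designed to catch.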
