Commit 58b1eb6

Add VictoriaMetrics switch guide for TiUP cluster (#20957)
1 parent 914d41d commit 58b1eb6

1 file changed (+152, -8 lines changed)

maintain-tidb-using-tiup.md

Lines changed: 152 additions & 8 deletions
@@ -6,14 +6,7 @@ aliases: ['/docs/dev/maintain-tidb-using-tiup/','/docs/dev/how-to/maintain/tiup-
66

77
# TiUP Common Operations
88

9-
This document describes the following common operations when you operate and maintain a TiDB cluster using TiUP.
10-
11-
- View the cluster list
12-
- Start the cluster
13-
- View the cluster status
14-
- Modify the configuration
15-
- Stop the cluster
16-
- Destroy the cluster
9+
This document describes the common operations when you operate and maintain a TiDB cluster using TiUP.
1710

1811
## View the cluster list
1912

@@ -290,3 +283,154 @@ The destroy operation stops the services and clears the data directory and deplo
290283
```bash
tiup cluster destroy ${cluster-name}
```
286+
287+
## Switch from Prometheus to VictoriaMetrics
288+
289+
In large-scale clusters, Prometheus might encounter performance bottlenecks when handling a large number of instances. Starting from v1.16.3, TiUP supports switching the monitoring component from Prometheus to VictoriaMetrics (VM), which provides better scalability, higher performance, and lower resource consumption.
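
Because the switch requires TiUP v1.16.3 or later, it can help to compare the installed version against that minimum before editing the topology. The following is a minimal sketch, not part of TiUP: `version_ge` is a hypothetical helper, and it assumes a `sort` that supports `-V` (as in GNU coreutils). How you obtain the installed version string (for example, from the output of `tiup --version`) is left to you.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of TiUP): returns 0 when version $1 >= version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  local current="${1#v}" minimum="${2#v}"   # tolerate a leading "v"
  # The smaller version sorts first, so $1 >= $2 exactly when $2 is the first line.
  [ "$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)" = "$minimum" ]
}
```

For example, `version_ge "1.16.4" "1.16.3" && echo "VictoriaMetrics switch supported"` prints the message, while passing `1.16.2` as the first argument does not.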
290+
291+
### Set up VictoriaMetrics for a new deployment
292+
293+
By default, TiUP uses Prometheus as the metrics monitoring component. To use VictoriaMetrics instead of Prometheus in a new deployment, configure the topology file as follows:
294+
295+
```yaml
# Monitoring server configuration
monitoring_servers:
  # IP address of the monitoring server
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true

# Grafana server configuration
grafana_servers:
  # IP address of the Grafana server
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
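
Before deploying, you might want to confirm that the topology file really sets all three flags shown above. The following is a minimal sketch under stated assumptions: `check_vm_flags` is a hypothetical helper (not a TiUP command), and it does a plain text match rather than parsing YAML, so it cannot tell whether a flag sits under the correct section.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of TiUP): confirm that a topology
# file sets all three VictoriaMetrics-related flags to true.
# This is a plain text match, not a YAML parser, so it does not verify that
# each flag is placed under the correct section.
check_vm_flags() {
  local topology="$1" flag missing=0
  for flag in prom_remote_write_to_vm enable_prom_agent_mode use_vm_as_datasource; do
    # Match an uncommented "flag: true" line, allowing leading indentation.
    if ! grep -Eq "^[[:space:]]*${flag}:[[:space:]]*true[[:space:]]*$" "$topology"; then
      echo "missing or not true: ${flag}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as `check_vm_flags topology.yaml`, where `topology.yaml` is a placeholder for your own file name. A commented-out flag, or a flag followed by a trailing comment, is reported as missing.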
311+
312+
### Migrate an existing deployment to VictoriaMetrics
313+
314+
You can perform the migration without affecting running instances. Existing metrics remain in Prometheus, while new metrics are written to VictoriaMetrics.
315+
316+
#### Enable VictoriaMetrics remote write
317+
318+
1. Edit the cluster configuration:
319+
320+
```bash
tiup cluster edit-config ${cluster-name}
```
323+
324+
2. Under `monitoring_servers`, set `prom_remote_write_to_vm` to `true`:
325+
326+
```yaml
monitoring_servers:
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
```
332+
333+
3. Reload the configuration to apply the changes:
334+
335+
```bash
tiup cluster reload ${cluster-name} -R prometheus
```
338+
339+
#### Switch the default data source to VictoriaMetrics
340+
341+
1. Edit the cluster configuration:
342+
343+
```bash
tiup cluster edit-config ${cluster-name}
```
346+
347+
2. Under `grafana_servers`, set `use_vm_as_datasource` to `true`:
348+
349+
```yaml
grafana_servers:
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
355+
356+
3. Reload the configuration to apply the changes:
357+
358+
```bash
tiup cluster reload ${cluster-name} -R grafana
```
361+
362+
#### View historical metrics generated before the switch (optional)
363+
364+
If you need to view historical metrics generated before the switch, switch the Grafana data source back to Prometheus as follows:
365+
366+
1. Edit the cluster configuration:
367+
368+
```bash
tiup cluster edit-config ${cluster-name}
```
371+
372+
2. Under `grafana_servers`, comment out `use_vm_as_datasource`:
373+
374+
```yaml
grafana_servers:
  - host: ip_address
    ...
    # use_vm_as_datasource: true
```
380+
381+
3. Reload the configuration to apply the changes:
382+
383+
```bash
tiup cluster reload ${cluster-name} -R grafana
```
386+
387+
4. To switch back to VictoriaMetrics, repeat the steps in [Switch the default data source to VictoriaMetrics](#switch-the-default-data-source-to-victoriametrics).
388+
389+
### Clean up old metrics and services
390+
391+
After confirming that the old metrics have expired, you can perform the following steps to remove redundant services and files. This does not affect the running cluster.
392+
393+
#### Set Prometheus to agent mode
394+
395+
1. Edit the cluster configuration:
396+
397+
```bash
tiup cluster edit-config ${cluster-name}
```
400+
401+
2. Under `monitoring_servers`, set `enable_prom_agent_mode` to `true`. Also make sure that `prom_remote_write_to_vm` (under `monitoring_servers`) and `use_vm_as_datasource` (under `grafana_servers`) are set to `true`:
402+
403+
```yaml
monitoring_servers:
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true
grafana_servers:
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
414+
415+
3. Reload the configuration to apply the changes:
416+
417+
```bash
tiup cluster reload ${cluster-name} -R prometheus
```
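
Across the migration and cleanup sections, the reload commands run in a fixed order. As a recap, the following dry-run sketch prints that order for a given cluster name without executing anything. `print_vm_reload_plan` is a hypothetical helper introduced here for illustration; each printed command is taken from the steps in this guide.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints the reload commands from this guide in order.
# It does not execute anything; run each printed command yourself after
# making the matching change with `tiup cluster edit-config`.
print_vm_reload_plan() {
  local cluster="$1"
  printf '%s\n' \
    "tiup cluster reload ${cluster} -R prometheus   # after prom_remote_write_to_vm: true" \
    "tiup cluster reload ${cluster} -R grafana      # after use_vm_as_datasource: true" \
    "tiup cluster reload ${cluster} -R prometheus   # after enable_prom_agent_mode: true"
}
```

Printing the plan first makes it easier to review the sequence before touching a production cluster.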
420+
421+
#### Remove expired data directories
422+
423+
1. In the configuration file, locate the `data_dir` path of the monitoring server:
424+
425+
```yaml
monitoring_servers:
  - host: ip_address
    ...
    data_dir: "/tidb-data/prometheus-8249"
```
431+
432+
2. Remove the data directory:
433+
434+
```bash
rm -rf /tidb-data/prometheus-8249
```
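
Because `rm -rf` is irreversible, you might prefer to wrap the removal in a small guard. The following is a minimal sketch, not part of TiUP: `remove_prom_data_dir` is a hypothetical helper that rejects a few obviously dangerous arguments (empty, relative, or root paths) and is not an exhaustive safety check.

```bash
#!/usr/bin/env bash
# Hypothetical guard (not part of TiUP): refuse obviously dangerous arguments
# before deleting the old Prometheus data directory.
remove_prom_data_dir() {
  local dir="$1"
  # Reject empty or slash-only paths (such as "/") and relative paths.
  if [ -z "${dir//\//}" ] || [ "${dir#/}" = "$dir" ]; then
    echo "refusing to remove '${dir}'" >&2
    return 1
  fi
  # Only act on an existing directory.
  if [ ! -d "$dir" ]; then
    echo "not a directory: ${dir}" >&2
    return 1
  fi
  rm -rf -- "$dir"
}
```

With the `data_dir` value from the example above, the call would be `remove_prom_data_dir /tidb-data/prometheus-8249`.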
