Commit c4f167a

Merge pull request #2598 from Creatone/creatone/disable-metrics-docs
Add "-disable_metrics" column to prometheus metrics table.
2 parents 6f30891 + dc0aa9b

File tree: 3 files changed, +83 −79 lines

cmd/cadvisor.go

Lines changed: 3 additions & 1 deletion

@@ -89,6 +89,7 @@ var (
 	container.HugetlbUsageMetrics:     struct{}{},
 	container.ReferencedMemoryMetrics: struct{}{},
 	container.CPUTopologyMetrics:      struct{}{},
+	container.ResctrlMetrics:          struct{}{},
 }}

@@ -106,6 +107,7 @@ var (
 	container.HugetlbUsageMetrics:     struct{}{},
 	container.ReferencedMemoryMetrics: struct{}{},
 	container.CPUTopologyMetrics:      struct{}{},
+	container.ResctrlMetrics:          struct{}{},
 }
)

@@ -137,7 +139,7 @@ func (ml *metricSetValue) Set(value string) error {
 }

 func init() {
-	flag.Var(&ignoreMetrics, "disable_metrics", "comma-separated list of `metrics` to be disabled. Options are 'accelerator', 'cpu_topology','disk', 'diskIO', 'network', 'tcp', 'udp', 'percpu', 'sched', 'process', 'hugetlb', 'referenced_memory'.")
+	flag.Var(&ignoreMetrics, "disable_metrics", "comma-separated list of `metrics` to be disabled. Options are 'accelerator', 'cpu_topology','disk', 'diskIO', 'network', 'tcp', 'udp', 'percpu', 'sched', 'process', 'hugetlb', 'referenced_memory', 'resctrl'.")

 	// Default logging verbosity to V(2)
 	flag.Set("v", "2")
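The flag updated above takes a comma-separated list of metric names. A simplified sketch of how such a metric-set flag can be registered with `flag.Var` and parsed — this is an illustrative stand-in, not cAdvisor's actual `metricSetValue` implementation:

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// metricSet is a simplified stand-in for cAdvisor's metricSetValue:
// it implements flag.Value so it can be registered via flag.Var.
type metricSet map[string]struct{}

func (s metricSet) String() string {
	names := make([]string, 0, len(s))
	for name := range s {
		names = append(names, name)
	}
	return strings.Join(names, ",")
}

// Set splits the comma-separated flag value into the set.
func (s metricSet) Set(value string) error {
	for _, name := range strings.Split(value, ",") {
		if name = strings.TrimSpace(name); name != "" {
			s[name] = struct{}{}
		}
	}
	return nil
}

func main() {
	ignored := metricSet{}
	fs := flag.NewFlagSet("cadvisor-sketch", flag.ContinueOnError)
	fs.Var(ignored, "disable_metrics", "comma-separated list of metrics to be disabled")
	_ = fs.Parse([]string{"-disable_metrics=disk,resctrl"})

	_, resctrlOff := ignored["resctrl"]
	fmt.Println(resctrlOff) // true
}
```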

docs/storage/prometheus.md

Lines changed: 72 additions & 72 deletions

@@ -1,6 +1,6 @@
 # Monitoring cAdvisor with Prometheus

-cAdvisor exposes container and hardware statistics as [Prometheus](https://prometheus.io) metrics out of the box. By default, these metrics are served under the `/metrics` HTTP endpoint. This endpoint may be customized by setting the `-prometheus_endpoint` command-line flag.
+cAdvisor exposes container and hardware statistics as [Prometheus](https://prometheus.io) metrics out of the box. By default, these metrics are served under the `/metrics` HTTP endpoint. This endpoint may be customized by setting the `-prometheus_endpoint` and `-disable_metrics` command-line flags.

 To monitor cAdvisor with Prometheus, simply configure one or more jobs in Prometheus which scrape the relevant cAdvisor processes at that metrics endpoint. For details, see Prometheus's [Configuration](https://prometheus.io/docs/operating/configuration/) documentation, as well as the [Getting started](https://prometheus.io/docs/introduction/getting_started/) guide.
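The scrape job described in the context above can be sketched as a minimal Prometheus configuration; the target address is an assumption based on cAdvisor's default port 8080:

```yaml
scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets: ['localhost:8080']  # assumed cAdvisor address; adjust to your deployment
```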
@@ -14,95 +14,95 @@ To monitor cAdvisor with Prometheus, simply configure one or more jobs in Promet

 The table below lists the Prometheus container metrics exposed by cAdvisor (in alphabetical order by metric name):

-Metric name | Type | Description | Unit (where applicable)
-:-----------|:-----|:------------|:-----------------------
-`container_accelerator_duty_cycle` | Gauge | Percent of time over the past sample period during which the accelerator was actively processing | percentage
-`container_accelerator_memory_total_bytes` | Gauge | Total accelerator memory | bytes
-`container_accelerator_memory_used_bytes` | Gauge | Total accelerator memory allocated | bytes
-`container_cpu_cfs_periods_total` | Counter | Number of elapsed enforcement period intervals |
-`container_cpu_cfs_throttled_periods_total` | Counter | Number of throttled period intervals |
-`container_cpu_cfs_throttled_seconds_total` | Counter | Total time duration the container has been throttled | seconds
-`container_cpu_load_average_10s` | Gauge | Value of container cpu load average over the last 10 seconds |
-`container_cpu_schedstat_run_periods_total` | Counter | Number of times processes of the cgroup have run on the cpu |
-`container_cpu_schedstat_run_seconds_total` | Counter | Time duration the processes of the container have run on the CPU | seconds
-`container_cpu_schedstat_runqueue_seconds_total` | Counter | Time duration processes of the container have been waiting on a runqueue | seconds
-`container_cpu_system_seconds_total` | Counter | Cumulative system cpu time consumed | seconds
-`container_cpu_usage_seconds_total` | Counter | Cumulative cpu time consumed | seconds
-`container_cpu_user_seconds_total` | Counter | Cumulative user cpu time consumed | seconds
-`container_file_descriptors` | Gauge | Number of open file descriptors for the container |
-`container_fs_inodes_free` | Gauge | Number of available Inodes |
-`container_fs_inodes_total` | Gauge | Total number of Inodes |
-`container_fs_io_current` | Gauge | Number of I/Os currently in progress |
-`container_fs_io_time_seconds_total` | Counter | Cumulative count of seconds spent doing I/Os | seconds
-`container_fs_io_time_weighted_seconds_total` | Counter | Cumulative weighted I/O time | seconds
-`container_fs_limit_bytes` | Gauge | Number of bytes that can be consumed by the container on this filesystem | bytes
-`container_fs_reads_bytes_total` | Counter | Cumulative count of bytes read | bytes
-`container_fs_reads_total` | Counter | Cumulative count of reads completed |
-`container_fs_read_seconds_total` | Counter | Cumulative count of seconds spent reading |
-`container_fs_reads_merged_total` | Counter | Cumulative count of reads merged
-`container_fs_sector_reads_total` | Counter | Cumulative count of sector reads completed
-`container_fs_sector_writes_total` | Counter | Cumulative count of sector writes completed
-`container_fs_usage_bytes` | Gauge | Number of bytes that are consumed by the container on this filesystem | bytes
-`container_fs_write_seconds_total` | Counter | Cumulative count of seconds spent writing | seconds
-`container_fs_writes_bytes_total` | Counter | Cumulative count of bytes written | bytes
-`container_fs_writes_merged_total` | Counter | Cumulative count of writes merged |
-`container_fs_writes_total` | Counter | Cumulative count of writes completed |
-`container_hugetlb_failcnt` | Counter | Number of hugepage usage hits limits |
-`container_hugetlb_max_usage_bytes` | Gauge | Maximum hugepage usages recorded | bytes
-`container_hugetlb_usage_bytes` | Gauge | Current hugepage usage | bytes
+Metric name | Type | Description | Unit (where applicable) | -disable_metrics parameter
+:-----------|:-----|:------------|:------------------------|:-------
+`container_accelerator_duty_cycle` | Gauge | Percent of time over the past sample period during which the accelerator was actively processing | percentage | accelerator
+`container_accelerator_memory_total_bytes` | Gauge | Total accelerator memory | bytes | accelerator
+`container_accelerator_memory_used_bytes` | Gauge | Total accelerator memory allocated | bytes | accelerator
+`container_cpu_cfs_periods_total` | Counter | Number of elapsed enforcement period intervals | |
+`container_cpu_cfs_throttled_periods_total` | Counter | Number of throttled period intervals | |
+`container_cpu_cfs_throttled_seconds_total` | Counter | Total time duration the container has been throttled | seconds |
+`container_cpu_load_average_10s` | Gauge | Value of container cpu load average over the last 10 seconds | |
+`container_cpu_schedstat_run_periods_total` | Counter | Number of times processes of the cgroup have run on the cpu | | sched
+`container_cpu_schedstat_run_seconds_total` | Counter | Time duration the processes of the container have run on the CPU | seconds | sched
+`container_cpu_schedstat_runqueue_seconds_total` | Counter | Time duration processes of the container have been waiting on a runqueue | seconds | sched
+`container_cpu_system_seconds_total` | Counter | Cumulative system cpu time consumed | seconds |
+`container_cpu_usage_seconds_total` | Counter | Cumulative cpu time consumed | seconds |
+`container_cpu_user_seconds_total` | Counter | Cumulative user cpu time consumed | seconds |
+`container_file_descriptors` | Gauge | Number of open file descriptors for the container | | process
+`container_fs_inodes_free` | Gauge | Number of available Inodes | | disk
+`container_fs_inodes_total` | Gauge | Total number of Inodes | | disk
+`container_fs_io_current` | Gauge | Number of I/Os currently in progress | | diskIO
+`container_fs_io_time_seconds_total` | Counter | Cumulative count of seconds spent doing I/Os | seconds | diskIO
+`container_fs_io_time_weighted_seconds_total` | Counter | Cumulative weighted I/O time | seconds | diskIO
+`container_fs_limit_bytes` | Gauge | Number of bytes that can be consumed by the container on this filesystem | bytes | disk
+`container_fs_reads_bytes_total` | Counter | Cumulative count of bytes read | bytes | diskIO
+`container_fs_reads_total` | Counter | Cumulative count of reads completed | | diskIO
+`container_fs_read_seconds_total` | Counter | Cumulative count of seconds spent reading | | diskIO
+`container_fs_reads_merged_total` | Counter | Cumulative count of reads merged | | diskIO
+`container_fs_sector_reads_total` | Counter | Cumulative count of sector reads completed | | diskIO
+`container_fs_sector_writes_total` | Counter | Cumulative count of sector writes completed | | diskIO
+`container_fs_usage_bytes` | Gauge | Number of bytes that are consumed by the container on this filesystem | bytes | disk
+`container_fs_write_seconds_total` | Counter | Cumulative count of seconds spent writing | seconds | diskIO
+`container_fs_writes_bytes_total` | Counter | Cumulative count of bytes written | bytes | diskIO
+`container_fs_writes_merged_total` | Counter | Cumulative count of writes merged | | diskIO
+`container_fs_writes_total` | Counter | Cumulative count of writes completed | | diskIO
+`container_hugetlb_failcnt` | Counter | Number of hugepage usage hits limits | | hugetlb
+`container_hugetlb_max_usage_bytes` | Gauge | Maximum hugepage usages recorded | bytes | hugetlb
+`container_hugetlb_usage_bytes` | Gauge | Current hugepage usage | bytes | hugetlb
 `container_last_seen` | Gauge | Last time a container was seen by the exporter | timestamp
-`container_llc_occupancy_bytes` | Gauge | Last level cache usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes
-`container_memory_bandwidth_bytes` | Gauge | Total memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes
-`container_memory_bandwidth_local_bytes` | Gauge | Local memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes
-`container_memory_cache` | Gauge | Total page cache memory | bytes
-`container_memory_failcnt` | Counter | Number of memory usage hits limits |
-`container_memory_failures_total` | Counter | Cumulative count of memory allocation failures |
-`container_memory_max_usage_bytes` | Gauge | Maximum memory usage recorded | bytes
-`container_memory_rss` | Gauge | Size of RSS | bytes
-`container_memory_swap` | Gauge | Container swap usage | bytes
-`container_memory_mapped_file` | Gauge | Size of memory mapped files | bytes
-`container_memory_usage_bytes` | Gauge | Current memory usage, including all memory regardless of when it was accessed | bytes
-`container_memory_working_set_bytes` | Gauge | Current working set | bytes
-`container_network_receive_bytes_total` | Counter | Cumulative count of bytes received | bytes
-`container_network_receive_packets_dropped_total` | Counter | Cumulative count of packets dropped while receiving |
-`container_network_receive_packets_total` | Counter | Cumulative count of packets received |
-`container_network_receive_errors_total` | Counter | Cumulative count of errors encountered while receiving |
-`container_network_transmit_bytes_total` | Counter | Cumulative count of bytes transmitted | bytes
-`container_network_transmit_packets_total` | Counter | Cumulative count of packets transmitted |
-`container_network_transmit_packets_dropped_total` | Counter | Cumulative count of packets dropped while transmitting |
-`container_network_transmit_errors_total` | Counter | Cumulative count of errors encountered while transmitting |
-`container_network_tcp_usage_total` | Gauge | tcp connection usage statistic for container |
-`container_network_tcp6_usage_total` | Gauge | tcp6 connection usage statistic for container |
-`container_network_udp_usage_total` | Gauge | udp connection usage statistic for container |
-`container_network_udp6_usage_total` | Gauge | udp6 connection usage statistic for container |
-`container_processes` | Gauge | Number of processes running inside the container |
-`container_referenced_bytes` | Gauge | Container referenced bytes during last measurements cycle based on Referenced field in /proc/smaps file, with /proc/PIDs/clear_refs set to 1 after defined number of cycles configured through `referenced_reset_interval` cAdvisor parameter.<br>Warning: this is intrusive collection because it can influence kernel page reclaim policy and add latency. Refer to https://github.com/brendangregg/wss#wsspl-referenced-page-flag for more details. | bytes
+`container_llc_occupancy_bytes` | Gauge | Last level cache usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl
+`container_memory_bandwidth_bytes` | Gauge | Total memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl
+`container_memory_bandwidth_local_bytes` | Gauge | Local memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl
+`container_memory_cache` | Gauge | Total page cache memory | bytes |
+`container_memory_failcnt` | Counter | Number of memory usage hits limits | |
+`container_memory_failures_total` | Counter | Cumulative count of memory allocation failures | |
+`container_memory_max_usage_bytes` | Gauge | Maximum memory usage recorded | bytes |
+`container_memory_rss` | Gauge | Size of RSS | bytes |
+`container_memory_swap` | Gauge | Container swap usage | bytes |
+`container_memory_mapped_file` | Gauge | Size of memory mapped files | bytes |
+`container_memory_usage_bytes` | Gauge | Current memory usage, including all memory regardless of when it was accessed | bytes |
+`container_memory_working_set_bytes` | Gauge | Current working set | bytes |
+`container_network_receive_bytes_total` | Counter | Cumulative count of bytes received | bytes | network
+`container_network_receive_packets_dropped_total` | Counter | Cumulative count of packets dropped while receiving | | network
+`container_network_receive_packets_total` | Counter | Cumulative count of packets received | | network
+`container_network_receive_errors_total` | Counter | Cumulative count of errors encountered while receiving | | network
+`container_network_transmit_bytes_total` | Counter | Cumulative count of bytes transmitted | bytes | network
+`container_network_transmit_packets_total` | Counter | Cumulative count of packets transmitted | | network
+`container_network_transmit_packets_dropped_total` | Counter | Cumulative count of packets dropped while transmitting | | network
+`container_network_transmit_errors_total` | Counter | Cumulative count of errors encountered while transmitting | | network
+`container_network_tcp_usage_total` | Gauge | tcp connection usage statistic for container | | tcp
+`container_network_tcp6_usage_total` | Gauge | tcp6 connection usage statistic for container | | tcp
+`container_network_udp_usage_total` | Gauge | udp connection usage statistic for container | | udp
+`container_network_udp6_usage_total` | Gauge | udp6 connection usage statistic for container | | udp
+`container_processes` | Gauge | Number of processes running inside the container | | process
+`container_referenced_bytes` | Gauge | Container referenced bytes during last measurements cycle based on Referenced field in /proc/smaps file, with /proc/PIDs/clear_refs set to 1 after defined number of cycles configured through `referenced_reset_interval` cAdvisor parameter.<br>Warning: this is intrusive collection because it can influence kernel page reclaim policy and add latency. Refer to https://github.com/brendangregg/wss#wsspl-referenced-page-flag for more details. | bytes | referenced_memory
 `container_spec_cpu_period` | Gauge | CPU period of the container |
 `container_spec_cpu_quota` | Gauge | CPU quota of the container |
 `container_spec_cpu_shares` | Gauge | CPU share of the container |
 `container_spec_memory_limit_bytes` | Gauge | Memory limit for the container | bytes
 `container_spec_memory_swap_limit_bytes` | Gauge | Memory swap limit for the container | bytes
 `container_spec_memory_reservation_limit_bytes` | Gauge | Memory reservation limit for the container | bytes
 `container_start_time_seconds` | Gauge | Start time of the container since unix epoch | seconds
-`container_tasks_state` | Gauge | Number of tasks in given state (`sleeping`, `running`, `stopped`, `uninterruptible`, or `ioawaiting`) |
-`container_perf_metric` | Counter | Scaled counter of perf event (event can be identified by `event` label and `cpu` indicates the core where event was measured). See [perf event configuration](docs/runtime_options.md#perf-events) |
-`container_perf_metric_scaling_ratio` | Gauge | Scaling ratio for perf event counter (event can be identified by `event` label and `cpu` indicates the core where event was measured). See [perf event configuration](docs/runtime_options.md#perf-events) |
+`container_tasks_state` | Gauge | Number of tasks in given state (`sleeping`, `running`, `stopped`, `uninterruptible`, or `ioawaiting`) | |
+`container_perf_metric` | Counter | Scaled counter of perf event (event can be identified by `event` label and `cpu` indicates the core where event was measured). See [perf event configuration](docs/runtime_options.md#perf-events) | |
+`container_perf_metric_scaling_ratio` | Gauge | Scaling ratio for perf event counter (event can be identified by `event` label and `cpu` indicates the core where event was measured). See [perf event configuration](docs/runtime_options.md#perf-events) | |

 ## Prometheus hardware metrics

 The table below lists the Prometheus hardware metrics exposed by cAdvisor (in alphabetical order by metric name):

-Metric name | Type | Description | Unit (where applicable)
-:-----------|:-----|:------------|:-----------------------
-`machine_cpu_cache_capacity_bytes` | Gauge | Cache size in bytes assigned to NUMA node and CPU core | bytes
+Metric name | Type | Description | Unit (where applicable) | -disable_metrics parameter
+:-----------|:-----|:------------|:------------------------|:--------------------------
+`machine_cpu_cache_capacity_bytes` | Gauge | Cache size in bytes assigned to NUMA node and CPU core | bytes | cpu_topology
 `machine_cpu_cores` | Gauge | Number of logical CPU cores |
 `machine_cpu_physical_cores` | Gauge | Number of physical CPU cores |
 `machine_cpu_sockets` | Gauge | Number of CPU sockets |
 `machine_dimm_capacity_bytes` | Gauge | Total RAM DIMM capacity (all types memory modules) value labeled by dimm type,<br>information is retrieved from sysfs edac per-DIMM API (/sys/devices/system/edac/mc/) introduced in kernel 3.6 | bytes
 `machine_dimm_count` | Gauge | Number of RAM DIMM (all types memory modules) value labeled by dimm type,<br>information is retrieved from sysfs edac per-DIMM API (/sys/devices/system/edac/mc/) introduced in kernel 3.6 |
 `machine_memory_bytes` | Gauge | Amount of memory installed on the machine | bytes
-`machine_node_hugepages_count` | Gauge | Number of hugepages assigned to NUMA node |
-`machine_node_memory_capacity_bytes` | Gauge | Amount of memory assigned to NUMA node | bytes
+`machine_node_hugepages_count` | Gauge | Number of hugepages assigned to NUMA node | | cpu_topology
+`machine_node_memory_capacity_bytes` | Gauge | Amount of memory assigned to NUMA node | bytes | cpu_topology
 `machine_nvm_avg_power_budget_watts` | Gauge | NVM power budget | watts
 `machine_nvm_capacity` | Gauge | NVM capacity value labeled by NVM mode (memory mode or app direct mode) | bytes
-`machine_thread_siblings_count` | Gauge | Number of CPU thread siblings |
+`machine_thread_siblings_count` | Gauge | Number of CPU thread siblings | | cpu_topology
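The new column can be read as a mapping from each metric to the `-disable_metrics` option that turns it off. A minimal Go sketch of that lookup for a few rows of the tables above (an illustrative subset drawn from the tables, not cAdvisor code):

```go
package main

import "fmt"

// disabledBy maps a few metric names from the tables above to the
// -disable_metrics option that disables them (illustrative subset).
var disabledBy = map[string]string{
	"container_llc_occupancy_bytes":     "resctrl",
	"container_memory_bandwidth_bytes":  "resctrl",
	"container_network_tcp_usage_total": "tcp",
	"container_fs_reads_total":          "diskIO",
	"machine_cpu_cache_capacity_bytes":  "cpu_topology",
}

func main() {
	for metric, opt := range disabledBy {
		fmt.Printf("%s is disabled by -disable_metrics=%s\n", metric, opt)
	}
}
```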
