The available metric sets are listed in the `-disable_metrics` flag registration:

```go
flag.Var(&ignoreMetrics, "disable_metrics", "comma-separated list of `metrics` to be disabled. Options are 'accelerator', 'cpu_topology', 'disk', 'diskIO', 'network', 'tcp', 'udp', 'percpu', 'sched', 'process', 'hugetlb', 'referenced_memory', 'resctrl'.")
```
cAdvisor exposes container and hardware statistics as [Prometheus](https://prometheus.io) metrics out of the box. By default, these metrics are served under the `/metrics` HTTP endpoint. The endpoint path may be customized with the `-prometheus_endpoint` command-line flag, and individual metric sets may be disabled with the `-disable_metrics` flag.
To monitor cAdvisor with Prometheus, simply configure one or more jobs in Prometheus which scrape the relevant cAdvisor processes at that metrics endpoint. For details, see Prometheus's [Configuration](https://prometheus.io/docs/operating/configuration/) documentation, as well as the [Getting started](https://prometheus.io/docs/introduction/getting_started/) guide.
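Once a job is configured, Prometheus scrapes the plain-text exposition format from the endpoint. As a rough illustration of what scraped lines look like, here is a minimal Python sketch that splits one line into metric name, labels, and value (the sample line and parser are illustrative only, not part of cAdvisor or the Prometheus client libraries):

```python
# Minimal sketch: split one line of the Prometheus text exposition format
# (the format cAdvisor serves at /metrics) into name, labels, and value.
# The sample line below is made up for illustration.
import re

SAMPLE = 'container_cpu_usage_seconds_total{id="/docker/abc",cpu="total"} 95.5'

def parse_line(line):
    # Shape: metric_name{label="value",...} sample_value
    m = re.match(r'([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)', line)
    name, raw_labels, value = m.group(1), m.group(2) or "", m.group(3)
    labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))
    return name, labels, float(value)

name, labels, value = parse_line(SAMPLE)
print(name, labels, value)
```

A real scraper should use Prometheus itself or an official client library; this sketch only shows the line structure.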
The table below lists the Prometheus container metrics exposed by cAdvisor (in alphabetical order by metric name):
Metric name | Type | Description | Unit (where applicable) | Option parameter
:-----------|:-----|:------------|:------------------------|:----------------
`container_accelerator_duty_cycle` | Gauge | Percent of time over the past sample period during which the accelerator was actively processing | percentage | accelerator
`container_accelerator_memory_total_bytes` | Gauge | Total accelerator memory | bytes | accelerator
`container_accelerator_memory_used_bytes` | Gauge | Total accelerator memory allocated | bytes | accelerator
`container_cpu_cfs_periods_total` | Counter | Number of elapsed enforcement period intervals | |
`container_cpu_cfs_throttled_periods_total` | Counter | Number of throttled period intervals | |
`container_cpu_cfs_throttled_seconds_total` | Counter | Total time duration the container has been throttled | seconds |
`container_cpu_load_average_10s` | Gauge | Value of container cpu load average over the last 10 seconds | |
`container_cpu_schedstat_run_periods_total` | Counter | Number of times processes of the cgroup have run on the cpu | | sched
`container_cpu_schedstat_run_seconds_total` | Counter | Time duration the processes of the container have run on the CPU | seconds | sched
`container_cpu_schedstat_runqueue_seconds_total` | Counter | Time duration processes of the container have been waiting on a runqueue | seconds | sched
`container_cpu_system_seconds_total` | Counter | Cumulative system cpu time consumed | seconds |
`container_cpu_usage_seconds_total` | Counter | Cumulative cpu time consumed | seconds |
`container_cpu_user_seconds_total` | Counter | Cumulative user cpu time consumed | seconds |
`container_file_descriptors` | Gauge | Number of open file descriptors for the container | | process
`container_fs_inodes_free` | Gauge | Number of available Inodes | | disk
`container_fs_inodes_total` | Gauge | Total number of Inodes | | disk
`container_fs_io_current` | Gauge | Number of I/Os currently in progress | | diskIO
`container_llc_occupancy_bytes` | Gauge | Last level cache usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl
`container_memory_bandwidth_bytes` | Gauge | Total memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl
`container_memory_bandwidth_local_bytes` | Gauge | Local memory bandwidth usage statistics for container counted with RDT Memory Bandwidth Monitoring (MBM). | bytes | resctrl
`container_processes` | Gauge | Number of processes running inside the container | | process
`container_referenced_bytes` | Gauge | Container referenced bytes during the last measurement cycle, based on the Referenced field in the /proc/smaps file, with /proc/PIDs/clear_refs set to 1 after a number of cycles configured through the `referenced_reset_interval` cAdvisor parameter.<br>Warning: this is an intrusive collection because it can influence kernel page reclaim policy and add latency. Refer to https://github.com/brendangregg/wss#wsspl-referenced-page-flag for more details. | bytes | referenced_memory
`container_spec_cpu_period` | Gauge | CPU period of the container |
`container_spec_cpu_quota` | Gauge | CPU quota of the container |
`container_spec_cpu_shares` | Gauge | CPU share of the container |
`container_spec_memory_limit_bytes` | Gauge | Memory limit for the container | bytes
`container_spec_memory_swap_limit_bytes` | Gauge | Memory swap limit for the container | bytes
`container_spec_memory_reservation_limit_bytes` | Gauge | Memory reservation limit for the container | bytes
`container_start_time_seconds` | Gauge | Start time of the container since unix epoch | seconds
`container_tasks_state` | Gauge | Number of tasks in given state (`sleeping`, `running`, `stopped`, `uninterruptible`, or `ioawaiting`) | |
`container_perf_metric` | Counter | Scaled counter of perf event (event can be identified by `event` label and `cpu` indicates the core where event was measured). See [perf event configuration](docs/runtime_options.md#perf-events) | |
`container_perf_metric_scaling_ratio` | Gauge | Scaling ratio for perf event counter (event can be identified by `event` label and `cpu` indicates the core where event was measured). See [perf event configuration](docs/runtime_options.md#perf-events) | |
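Note that the `*_total` metrics above are Prometheus counters, so dashboards and alerts typically take a per-second rate over a time window rather than reading the raw value. A small illustrative calculation with made-up sample numbers:

```python
# Illustrative only: two made-up samples of container_cpu_usage_seconds_total
# taken 30 seconds apart. The per-second rate of this counter approximates
# the number of CPU cores the container consumed over the window.
t0, v0 = 1000.0, 120.0   # (timestamp in seconds, counter value in seconds)
t1, v1 = 1030.0, 135.0

cores_used = (v1 - v0) / (t1 - t0)
print(cores_used)  # 0.5 -> the container averaged half a core
```

In PromQL this corresponds to an expression such as `rate(container_cpu_usage_seconds_total[1m])`.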
## Prometheus hardware metrics
The table below lists the Prometheus hardware metrics exposed by cAdvisor (in alphabetical order by metric name):
Metric name | Type | Description | Unit (where applicable) | Option parameter
:-----------|:-----|:------------|:------------------------|:----------------
`machine_cpu_cache_capacity_bytes` | Gauge | Cache size in bytes assigned to NUMA node and CPU core | bytes | cpu_topology
`machine_cpu_cores` | Gauge | Number of logical CPU cores |
`machine_cpu_physical_cores` | Gauge | Number of physical CPU cores |
`machine_cpu_sockets` | Gauge | Number of CPU sockets |
`machine_dimm_capacity_bytes` | Gauge | Total RAM DIMM capacity (all types memory modules) value labeled by dimm type,<br>information is retrieved from sysfs edac per-DIMM API (/sys/devices/system/edac/mc/) introduced in kernel 3.6 | bytes
`machine_dimm_count` | Gauge | Number of RAM DIMM (all types memory modules) value labeled by dimm type,<br>information is retrieved from sysfs edac per-DIMM API (/sys/devices/system/edac/mc/) introduced in kernel 3.6 |
`machine_memory_bytes` | Gauge | Amount of memory installed on the machine | bytes
`machine_node_hugepages_count` | Gauge | Number of hugepages assigned to NUMA node | | cpu_topology
`machine_node_memory_capacity_bytes` | Gauge | Amount of memory assigned to NUMA node | bytes | cpu_topology
`machine_nvm_avg_power_budget_watts` | Gauge | NVM power budget | watts
`machine_nvm_capacity` | Gauge | NVM capacity value labeled by NVM mode (memory mode or app direct mode) | bytes
`machine_thread_siblings_count` | Gauge | Number of CPU thread siblings | | cpu_topology