nvme_exporter for Prometheus

🌀 nvme_exporter exposes metrics for NVMe SSDs, such as lifetime, device status, and read/write operations, as defined in the NVMe specifications (https://nvmexpress.org/). It collects this information by issuing NVMe Admin Commands through the NVMe CLI tool.

nvme_exporter is built on the Python Prometheus client (https://github.com/prometheus/client_python) and the NVMe CLI tool (https://github.com/linux-nvme/nvme-cli).

Dependencies

  • Python 3.6
  • nvme-cli
  • Python prometheus-client

Installation

git clone git@github.com:yongseokoh/nvme_exporter.git
cd nvme_exporter

# Install prometheus-client
pip install -r requirements.txt

# Install nvme-cli tool
git submodule init
git submodule update
cd nvme-cli
make
make install

Usage

usage: nvme_exporter.py [-h] [-p PORT] [-u UPDATE] [-s SIMULATION]

NVME Export port number and update period time

optional arguments:
  -h, --help            show this help message and exit
  -p PORT, --port PORT  Port to listen
  -u UPDATE, --update UPDATE
                        export metric update period in seconds
  -s SIMULATION, --simulation SIMULATION
                        making use of NVMe simulation

Example

sudo python nvme_exporter.py -p 9900 -u 10
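
The usage text above can be reproduced with a small argparse parser. This is a sketch rather than the exporter's actual source: the option names and help strings come from the usage text, while the argument types and default values (9900, 10, 0) are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options in the usage text."""
    parser = argparse.ArgumentParser(
        prog="nvme_exporter.py",
        description="NVME Export port number and update period time",
    )
    # Types and defaults are assumptions; the README does not state them.
    parser.add_argument("-p", "--port", type=int, default=9900,
                        help="Port to listen")
    parser.add_argument("-u", "--update", type=int, default=10,
                        help="export metric update period in seconds")
    parser.add_argument("-s", "--simulation", type=int, default=0,
                        help="making use of NVMe simulation")
    return parser


# Parse the same arguments as the example command line.
args = build_parser().parse_args(["-p", "9900", "-u", "10"])
print(args.port, args.update, args.simulation)
```

Passing an explicit argument list to parse_args, as done here, makes the parser easy to exercise without touching sys.argv.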

Grafana Sample

NVMe Health & Monitoring Metrics

Command: nvme smart-log /dev/nvme0 -o json

Name                    Type    Implementation state
critical_warning        Gauge   implemented
temperature             Gauge   implemented
avail_spare             Gauge   implemented
spare_thresh            Gauge   implemented
percent_used            Gauge   implemented
data_units_read         Gauge   implemented
data_units_written      Gauge   implemented
host_read_commands      Gauge   implemented
host_write_commands     Gauge   implemented
controller_busy_time    Gauge   implemented
power_cycles            Gauge   implemented
power_on_hours          Gauge   implemented
unsafe_shutdowns        Gauge   implemented
media_errors            Gauge   implemented
num_err_log_entries     Gauge   implemented
warning_temp_time       Gauge   implemented
critical_comp_time      Gauge   implemented
thm_temp1_trans_count   Gauge   implemented
thm_temp2_trans_count   Gauge   implemented
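
One way the smart-log fields could be turned into Prometheus gauges is sketched below. It is not taken from the exporter's source: the helper names, the `nvme_` metric prefix, and the choice of fields are assumptions, and the sample JSON holds made-up values rather than real device output. Only the nvme-cli command line and the field names come from this section.

```python
import json
import subprocess

# A subset of the smart-log fields listed in the table above.
SMART_FIELDS = ("critical_warning", "temperature", "percent_used",
                "power_on_hours", "media_errors")


def read_smart_log(device="/dev/nvme0"):
    """Run the nvme-cli command from this section and return its stdout."""
    result = subprocess.run(
        ["nvme", "smart-log", device, "-o", "json"],
        check=True, stdout=subprocess.PIPE, universal_newlines=True,
    )
    return result.stdout


def smart_log_to_samples(json_text, fields=SMART_FIELDS):
    """Map smart-log JSON to {field_name: float}, skipping absent fields."""
    log = json.loads(json_text)
    return {name: float(log[name]) for name in fields if name in log}


def export_forever(device, port, interval):
    """Publish the samples as gauges; needs prometheus-client and nvme-cli."""
    # Imported here so the parsing helpers work without the client installed.
    import time
    from prometheus_client import Gauge, start_http_server

    # The "nvme_" prefix is a hypothetical naming choice.
    gauges = {name: Gauge("nvme_" + name, "NVMe SMART field " + name)
              for name in SMART_FIELDS}
    start_http_server(port)
    while True:
        samples = smart_log_to_samples(read_smart_log(device))
        for name, value in samples.items():
            gauges[name].set(value)
        time.sleep(interval)


# Made-up sample shaped like the JSON output; not real device data.
sample = '{"critical_warning": 0, "temperature": 310, "percent_used": 2}'
print(smart_log_to_samples(sample))
```

Keeping the JSON parsing separate from the nvme-cli call and the HTTP server means the mapping can be checked on a machine without an NVMe device. Calling export_forever("/dev/nvme0", 9900, 10) would correspond to the example command line, under the assumptions above.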

Command: nvme show-regs /dev/nvme0 -o json

Name                        Type    Implementation state
controller configuration    Gauge   implemented
controller status           Gauge   implemented
other metrics               Gauge   pending
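
The controller configuration (CC) and controller status (CSTS) registers are bit fields, so a raw gauge value may need decoding before it is useful on a dashboard. The sketch below shows one possible decoding, with bit positions following the register layout in the NVMe base specification. The function names are hypothetical, and this README does not say whether the exporter publishes the raw register value or individual fields.

```python
def decode_csts(value):
    """Split the Controller Status (CSTS) register into its low bit fields."""
    return {
        "rdy": value & 0x1,          # bit 0: controller ready
        "cfs": (value >> 1) & 0x1,   # bit 1: controller fatal status
        "shst": (value >> 2) & 0x3,  # bits 3:2: shutdown status
    }


def decode_cc(value):
    """Split the Controller Configuration (CC) register into selected fields."""
    return {
        "en": value & 0x1,           # bit 0: enable
        "shn": (value >> 14) & 0x3,  # bits 15:14: shutdown notification
    }


# Illustrative register values, not read from a real device.
print(decode_csts(0x1))
print(decode_cc(0x460001))
```

A nonzero cfs field is the kind of condition an alert rule would typically watch for, which is easier to express on a decoded field than on the raw register value.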