A multi-use script for assisting in troubleshooting Network Performance Monitoring installations. npmDiag.sh
has three modes for different use-cases:
-
--collect
: Creates an output file containing your configuration file and logs from Ktranslate. If Ktranslate is installed as a Linux service, the/etc/ktranslate/profiles
directory is gathered. If Ktranslate is running in a container, only files mounted to the container directly are included. Outputs a file callednpmDiag-output-<date>.zip
. -
--time
: Uses thesnmp-base.yaml
configuration file to list available devices. The selected device hassnmpwalk
run against it with all OIDs included in it's assigned (and extension) profiles. Outputs the time required to poll the device, as well as a file called<targetDevice>_timing_results-<date>.txt
in the current directory. Useful if the best timeout setting for a device is unknown. -
--walk
: Uses thesnmp-base.yaml
configuration file to list available devices. The selected device hassnmpwalk
run against it in order to return all of it's supported OIDs. Outputs a file called<targetDevice>_snmpwalk_results-<date>.txt
. This process can take a long time depending on how many OIDs the device supports.
The script requires different packages depending on the use-case. Required packages and install commands are below:
-
Ubuntu:
sudo apt install jq zip -y
RHEL/CentOS:
sudo yum install jq zip -y
-
Ubuntu:
sudo apt install jq snmp -y; \ sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq; \ sudo chmod +x /usr/bin/yq # OR # sudo apt install jq snmp -y; sudo snap install yq
RHEL/CentOS:
sudo yum install jq net-snmp-utils -y; \ sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq; \ sudo chmod +x /usr/bin/yq # OR # sudo yum install jq net-snmp-utils -y; sudo snap install yq
Note: The snap
package is required when installing yq
on Ubuntu & RHEL/CentOS if pulling from GitHub is not an option. If you're currently running the Docker container on RHEL or CentOS, the jq
and snap
packages are not available in the base image repositories. Documentation on the process of adding the Extra Packages for Enterprise Linux repository can be found (here)
- Download the script with
wget https://raw.githubusercontent.com/newrelic-experimental/newrelic-npm-diagnostics/main/npmDiag.sh
- Use
chmod +x ./npmDiag.sh
to make it executable - Run the script with
./npmDiag.sh --collect
,./npmDiag.sh --time
, or./npmDiag.sh --walk
depending on what you want to do-
--collect
: Collects diagnostic info from Ktranslate's container or service. OutputsnpmDiag-output-<date>.zip
-
--time
: Runsnmpwalk
against a device from the config using it's assigned profile. Outputs time to complete, as well as<targetDevice>_timing_results-<date>.txt
-
--walk
: Runsnmpwalk
against a device from the config using it's assigned profile. Outputs the complete list of OIDs supported by the device, as well as<targetDevice>_snmpwalk_results-<date>.txt
-
Note: --collect
mode must be run with sudo
if Ktranslate is installed in a Docker container or as a Linux service. Running this mode without sudo
for either installation will throw an error, and the script will exit.
Note: npmDiag.sh will do it's best to automatically determine what installation method you've used. If it's failing to do so, you can include
--installMethod [DOCKER|PODMAN|BAREMETAL]` in the run command to force a method.
By default the Ktranslate container runs with info-level logs being generated. Ktranslate isn't able to update the verbosity of the logs on the fly, so if you want to collect debug-level logs you will need to launch a new container. This can be achieved by doing the following:
- Find your existing Ktranslate container's short ID with the command
docker ps -a
- Stop the container with the command
docker stop <shortID>
- Once the container is stopped, you'll need to launch a new container with debug-level logs enabled. This can be done by modifying your container's run command to include
-log_level=debug
in the Ktranslate arguments. For example, you would change your run command from
docker run -d --name ktranslate-info-level-container --restart unless-stopped --pull=always -p 162:1620/udp \
-v `pwd`/snmp-base.yaml:/snmp-base.yaml \
-e NEW_RELIC_API_KEY=$LICENSE_KEY \
kentik/ktranslate:v2 \
-snmp /snmp-base.yaml \
-nr_account_id=$ACCOUNT_ID \
-metrics=jchf \
-tee_logs=true \
-service_name=debug-level-test \
-snmp_discovery_on_start=true \
-snmp_discovery_min=180 \
nr1.snmp
to
docker run -d --name ktranslate-debug-level-container --restart unless-stopped --pull=always -p 162:1620/udp \
-v `pwd`/snmp-base.yaml:/snmp-base.yaml \
-e NEW_RELIC_API_KEY=$LICENSE_KEY \
kentik/ktranslate:v2 \
-snmp /snmp-base.yaml \
-nr_account_id=$ACCOUNT_ID \
-metrics=jchf \
-tee_logs=true \
-service_name=debug-level-test \
-snmp_discovery_on_start=true \
-snmp_discovery_min=180 \
-log_level=debug \ # <- Debug-level logs are enabled here
nr1.snmp
- Once the new container is up and running with debug mode enabled, you can target it with the script to collect more-informative logs. This new container can be stopped and deleted once the script is finished running, and the old container can be started up again.
N/A
Requests for support can be filed as an Issue.
We encourage your contributions to improve npmDiag
! Keep in mind when you submit your pull request, you'll need to sign the CLA via the click-through using CLA-Assistant. You only have to sign the CLA one time per project.
If you have any questions, or to execute our corporate CLA, required if your contribution is on behalf of a company, please drop us an email at [email protected].
A note about vulnerabilities
As noted in our security policy, New Relic is committed to the privacy and security of our customers and their data. We believe that providing coordinated disclosure by security researchers and engaging with the security community are important means to achieve our security goals.
If you believe you have found a security vulnerability in this project or any of New Relic's products or websites, we welcome and greatly appreciate you reporting it to New Relic through HackerOne.
npmDiag
is licensed under the Apache 2.0 License.