Status of the connection #4
Comments
Sounds interesting; the normal pattern would be to dump this on STDOUT/STDERR, but a |
I'll work on a PR to add these error metrics. As you pointed out, it may be useful to report the number of failures, and even to add a label with the type of failure (connection_error, timeout, incorrect_function?, bad_address?). That way operators would have more information about the type of problem they are dealing with.
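To make the failure-type label concrete, here is a minimal sketch of how a failed query could be mapped to one of a few fixed label values. It is not the exporter's code: the function name `failure_type` and the label names are hypothetical (the names come from the suggestions above), and only Modbus exception codes 0x01 (illegal function) and 0x02 (illegal data address) are mapped explicitly.

```python
import socket

# Modbus exception codes mapped to label values.
# The label names are hypothetical, taken from the suggestions in this thread.
EXCEPTION_LABELS = {
    0x01: "incorrect_function",  # ILLEGAL FUNCTION
    0x02: "bad_address",         # ILLEGAL DATA ADDRESS
}


def failure_type(error=None, exception_code=None):
    """Map a failed query to a low-cardinality label value."""
    if exception_code is not None:
        # The device answered, but with a Modbus exception response.
        return EXCEPTION_LABELS.get(exception_code, "modbus_exception")
    # Check timeouts first: timeout errors are also OSError subclasses.
    if isinstance(error, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(error, OSError):
        return "connection_error"
    return "unknown"
```

Keeping the set of return values small and fixed means the label adds only a handful of series per target.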
Sounds good. If you touch that part, moving the exporter metrics from |
This seems similar to how the blackbox exporter works. I'll try to work on a PR over the next few days.
Yes, we made the blackbox and snmp exporters behave the same, so we already have a bit of a standard.
I'm already working on adding information about the error rate and error codes to the exporter. I think it would also be important to add the duration and number of requests, so that latency, traffic and error rate can be visualized in a dashboard, similar to what the blackbox exporter does but for the modbus queries. I'll include them in the exporter's metrics with a label for the target, so they can be viewed both in total and per target.
Query runtime should be part of the target metric data. That way, you can pin down specific PLCs becoming slower, etc. |
In PR #7 I added those metrics with the label "target", so error rates, latency, number of queries, etc. can be examined in total and per target. (I had to deal with specific PLCs with problems, which is why I added the target label.) The number of possible label values is bounded by the number of modbus TCP devices, and taking into account that modbus IP PLCs usually act as aggregators of modbus RTU devices, there shouldn't be a large number of different targets. This way, the high cardinality problem should be contained. For further information about the cause of a specific PLC failure or malfunction there are the logs, but at least an alert can be configured to signal that something is happening, which target has problems, and what kind of error it is.
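As a rough illustration of the "in total and per target" idea, the sketch below keeps query counts, summed durations and failures keyed by target and derives an error rate from them. It is not the code from the PR: the class `QueryStats`, its methods and the target strings in the usage note are made up for the example.

```python
from collections import defaultdict


class QueryStats:
    """Per-target counters; names and layout are illustrative only."""

    def __init__(self):
        self.queries = defaultdict(int)    # target -> number of queries
        self.seconds = defaultdict(float)  # target -> summed query duration
        self.errors = defaultdict(int)     # (target, error type) -> failures

    def observe(self, target, duration, error_type=None):
        """Record one query and, if it failed, the type of failure."""
        self.queries[target] += 1
        self.seconds[target] += duration
        if error_type is not None:
            self.errors[(target, error_type)] += 1

    def error_rate(self, target=None):
        """Fraction of failed queries for one target, or overall if None."""
        if target is None:
            total = sum(self.queries.values())
            failed = sum(self.errors.values())
        else:
            total = self.queries.get(target, 0)
            failed = sum(n for (t, _), n in self.errors.items() if t == target)
        return failed / total if total else 0.0
```

For example, after `observe("plc-a:502", 0.1)` and `observe("plc-a:502", 0.3, "timeout")`, `error_rate("plc-a:502")` is 0.5. The number of keys grows only with the number of targets times the number of error types, which is the bounded-cardinality argument made above.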
@daviddetorres We are having issues with some connections to the modbus server: some sessions are left in CLOSE_WAIT on the server side. We are querying around 600 devices in the same request; should we split the request?
Modbus has an interesting function code, "0x11 - Report Slave ID". You can send this request to a modbus device and, if the device is up and running, get its unit id back.
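For reference, a Report Slave ID probe over Modbus TCP could be framed roughly as in the sketch below. The helper names `build_request` and `parse_response` are hypothetical. The Modbus specification lists function 0x11 among the serial-line functions, so whether a given TCP device or bridge answers it is an assumption that has to be checked per device. The sketch only builds and decodes frames; it does not open a socket.

```python
import struct

REPORT_SLAVE_ID = 0x11


def build_request(transaction_id, unit_id):
    """Build a Modbus TCP frame carrying a Report Slave ID request."""
    # MBAP header: transaction id, protocol id (0 for Modbus), length of the
    # bytes that follow (unit id + function code = 2), unit id; then the PDU.
    return struct.pack(">HHHBB", transaction_id, 0, 2, unit_id, REPORT_SLAVE_ID)


def parse_response(frame):
    """Return ("ok", data) or ("exception", code) for a 0x11 response frame."""
    function = frame[7]  # first byte after the 7-byte MBAP header
    if function == REPORT_SLAVE_ID | 0x80:
        # Exception response: function code with the high bit set, then a code.
        return ("exception", frame[8])
    if function != REPORT_SLAVE_ID:
        raise ValueError("unexpected function code")
    byte_count = frame[8]
    return ("ok", frame[9:9 + byte_count])
```

An exception response still shows that something speaking Modbus is answering for that unit id, which is already more than a successful TCP connect tells you.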
In some cases the TCP connection with the device is OK, but the modbus server is not running, the client is connecting to another open port, or the configured modbus ID is not correct.
It can also happen, in the configuration with a modbus TCP/RTU bridge shown in the README, that the bridge is online (so the connection is established) but the connection with the RTU devices (usually an RS232 or RS485 bus) is not OK (bus disconnected, incorrect serial configuration, etc.).
It could be useful to add a metric reporting the connection status (connection_up?), indicating whether the socket is correctly established, independently of the result of querying the modbus registers. This can help to detect device failures or configuration issues.
If you think any of these ideas are worth working on, I could work on a PR.
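A minimal sketch of the connection status idea, kept separate from any register query, might look like this. The metric name `modbus_connection_up`, the function names and the `connect` parameter (there only so that the check can be exercised without a real device) are assumptions for illustration, not existing exporter code.

```python
import socket


def connection_up(host, port, timeout=2.0, connect=socket.create_connection):
    """Return 1 if a TCP connection can be opened, else 0."""
    try:
        sock = connect((host, port), timeout)
    except OSError:
        # Refused connections, unreachable hosts and timeouts all land here.
        return 0
    sock.close()
    return 1


def render(target, up):
    """Format the value as one line of Prometheus text exposition."""
    # The metric name is hypothetical.
    return 'modbus_connection_up{target="%s"} %d' % (target, up)
```

A value of 1 here only says that the socket was established. As described above, the modbus server or the RTU bus behind a bridge can still be failing, so this would sit next to the query result metrics rather than replace them.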