We have Telegraf running in our environment, and recently, our containers started crashing unexpectedly, entering a CrashLoopBackOff state with exit code 137.
Upon reviewing the logs, we observed that Telegraf initializes successfully but crashes immediately afterward. There are no error events available for analysis, including in the kubelet logs.
An unusual observation is that when we modify the container's entrypoint to ["tail", "-f", "/dev/null"] (essentially keeping it alive without executing Telegraf), and then manually start Telegraf using kubectl exec, it runs without any issues.
Troubleshooting Steps & Findings:
The issue persists across different versions of Telegraf, including the latest release.
Other clusters running the same setup are unaffected.
The underlying nodes have not changed since before the crashes began.
The issue is not related to insufficient resource allocation.
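For anyone reading the exit code above: by the usual shell and container-runtime convention, a status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9, SIGKILL. Below is a minimal Python sketch of that arithmetic. The helper name `decode_exit_code` is hypothetical and is not part of Telegraf or kubectl, and the signal name lookup assumes a POSIX system.

```python
import signal


def decode_exit_code(code: int) -> str:
    """Describe an exit status using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            # Map the number to a name such as SIGKILL (POSIX systems).
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"terminated by {name} (signal {signum})"
    return f"exited with status {code}"


print(decode_exit_code(137))
```

The code alone does not say who sent the signal. SIGKILL cannot be caught or logged by the process itself, which would be consistent with Telegraf printing nothing before it dies.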
We would appreciate any insights or suggestions on potential causes or further debugging approaches.
Hello! I recommend posting this question in our Community Slack or Community Forums. We have a lot of talented community members there who could help answer your question more quickly. You can also learn more about Telegraf by enrolling at InfluxDB University for free!
Heads up, this issue will be automatically closed after 7 days of inactivity. Thank you!