Description
We have our K8S development clusters set to hibernate every day at the end of regular work hours. The cluster becomes active again the next day. We have set up the opentelemetry-operator on our cluster and configured the OpenTelemetry Collector as a DaemonSet, with the corresponding annotations on the pods (Java/NodeJS):
For Java:
# format <namespace/otel instrumentation CR>
instrumentation.opentelemetry.io/inject-java=dev-opentelemetry/opentelemetry-instrumentation
For NodeJS:
# format <namespace/otel instrumentation CR>
instrumentation.opentelemetry.io/inject-nodejs=dev-opentelemetry/opentelemetry-instrumentation
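For context, here is a minimal sketch of where the Java annotation sits in a workload manifest, assuming the app is deployed as a Deployment. The Deployment name, labels, container name, and image are hypothetical placeholders, not taken from our cluster. Only the annotation key and value are the ones shown above. The annotation is placed on the pod template metadata, not on the Deployment's own metadata.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-java-app            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-java-app         # hypothetical label
  template:
    metadata:
      labels:
        app: my-java-app
      annotations:
        # format: <namespace>/<otel instrumentation CR>
        instrumentation.opentelemetry.io/inject-java: "dev-opentelemetry/opentelemetry-instrumentation"
    spec:
      containers:
        - name: app                                   # hypothetical container name
          image: registry.example.com/my-java-app:1.0 # hypothetical image
```

The NodeJS workloads would look the same, with the `instrumentation.opentelemetry.io/inject-nodejs` key in place of the Java one.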
With this setup, everything works fine. For example, for Java apps the Java agent is volume-mounted automatically. The agent instruments the application and ships the traces to an OpenTelemetry Collector pod (created by the operator from the OpenTelemetry Collector CR). Finally, the collector pod ships the traces to our observability backend service. However, when the workload resumes the next day after hibernation, the OpenTelemetry setup seems to be lost from the pod (see screenshot below). We are not sure why this happens. There is not much information in the application logs, the OpenTelemetry Collector daemon pod logs, or even the logs of the opentelemetry-operator-controller-manager pod in the opentelemetry-operator-system namespace.
Container spec before hibernation:
After resumption from hibernation: OpenTelemetry setup is lost
Thanks in advance!!!