Auto-instrumentation lost on resumption of cluster from hibernation #1329

@santoshkashyap

We have our K8S development clusters set to hibernate every day at the end of regular work hours; the cluster becomes active again the next day. We have set up opentelemetry-operator on our cluster and configured the OpenTelemetry Collector as a DaemonSet, with the corresponding annotations on the application pods (Java/NodeJS).
For Java:

# format <namespace/otel instrumentation CR>
instrumentation.opentelemetry.io/inject-java=dev-opentelemetry/opentelemetry-instrumentation

For NodeJS:

# format <namespace/otel instrumentation CR>
instrumentation.opentelemetry.io/inject-nodejs=dev-opentelemetry/opentelemetry-instrumentation
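
Because the annotations are read from the pod, in a Deployment they have to sit on the pod template rather than on the Deployment's own metadata. The following is a minimal sketch of a check for that, assuming the manifest has already been loaded into a Python dict; the helper name `inject_annotations` and the Deployment name `my-java-app` are hypothetical and used only for illustration.

```python
INJECT_PREFIX = "instrumentation.opentelemetry.io/inject-"


def inject_annotations(deployment: dict) -> dict:
    """Return the inject-* annotations found on the Deployment's pod template."""
    template_meta = (
        deployment.get("spec", {}).get("template", {}).get("metadata", {})
    )
    annotations = template_meta.get("annotations") or {}
    return {k: v for k, v in annotations.items() if k.startswith(INJECT_PREFIX)}


# Hypothetical, trimmed Deployment manifest using the annotation value from above.
deployment = {
    "metadata": {"name": "my-java-app"},
    "spec": {
        "template": {
            "metadata": {
                "annotations": {
                    "instrumentation.opentelemetry.io/inject-java":
                        "dev-opentelemetry/opentelemetry-instrumentation",
                }
            }
        }
    },
}

print(inject_annotations(deployment))
```

An empty result means the annotation is missing from the pod template, even if it is present elsewhere in the manifest.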

With this setup, everything works fine. For example, for Java apps the Java agent is volume-mounted automatically. The agent instruments the application and ships the traces to an OpenTelemetry Collector pod (created by the operator from the OTel CR). Finally, the collector pod ships the traces to our observability backend service. However, when the workload resumes the next day after hibernation, the auto-instrumentation seems to be lost (see the screenshots below). We are not sure why this happens. There is not much information in the application logs, the OpenTelemetry daemon pod logs, or even the logs of the opentelemetry-operator-controller-manager pod in the opentelemetry-operator-system namespace.

Container spec before hibernation:
[image: container spec before hibernation]

After resumption from hibernation: OpenTelemetry setup is lost
[image: container spec after resumption from hibernation]
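
The before/after comparison can also be done programmatically. The sketch below lists pods that carry the Java inject annotation but whose containers show no sign of the agent. It assumes that an injected pod has a `JAVA_TOOL_OPTIONS` environment variable containing `-javaagent`; that marker is an assumption about what the operator adds, not something confirmed in this issue, so adjust it to whatever the "before" container spec actually shows. The function names and the sample pods are hypothetical.

```python
INJECT_JAVA = "instrumentation.opentelemetry.io/inject-java"


def is_java_agent_injected(pod: dict) -> bool:
    """True if any container has JAVA_TOOL_OPTIONS containing -javaagent.

    Assumption: this env var is how an injected Java pod can be recognized.
    """
    for container in pod.get("spec", {}).get("containers", []):
        for env in container.get("env") or []:
            if env.get("name") == "JAVA_TOOL_OPTIONS" and "-javaagent" in (
                env.get("value") or ""
            ):
                return True
    return False


def pods_missing_injection(pod_list: dict) -> list:
    """Return 'namespace/name' for annotated pods that lack the agent."""
    missing = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        annotations = meta.get("annotations") or {}
        requested = annotations.get(INJECT_JAVA, "false").lower() != "false"
        if requested and not is_java_agent_injected(pod):
            missing.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return missing


# Hypothetical pod list: app-a was injected, app-b was annotated but not injected.
annotation = {INJECT_JAVA: "dev-opentelemetry/opentelemetry-instrumentation"}
sample = {
    "items": [
        {
            "metadata": {"namespace": "dev", "name": "app-a", "annotations": annotation},
            "spec": {"containers": [{"name": "app", "env": [
                {"name": "JAVA_TOOL_OPTIONS", "value": "-javaagent:/path/to/javaagent.jar"},
            ]}]},
        },
        {
            "metadata": {"namespace": "dev", "name": "app-b", "annotations": annotation},
            "spec": {"containers": [{"name": "app"}]},
        },
    ]
}

print(pods_missing_injection(sample))
```

To run this against a live cluster, the output of `kubectl get pods --all-namespaces -o json` can be parsed with `json.loads` and passed to `pods_missing_injection` in place of `sample`.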

Thanks in advance!!!

Labels: question (Further information is requested)
