
[BUG] pod_phase_status alerts do not clear when the pod is replaced by another #566

Open
adrienguyclaranet opened this issue Jul 18, 2024 · 0 comments
Labels
bug (Something isn't working), detectors (About new or existing detectors)


adrienguyclaranet commented Jul 18, 2024

What is the module?
otel-collector_kubernetes-common

What is the detector?
pod_phase_status

Describe the bug
pod_phase_status alerts remain active even after the pod no longer exists.

To Reproduce
Steps to reproduce the behavior:

  1. The detector is in its nominal "ok" state.
  2. The pod enters a failed state and the alert is raised.
  3. The pod is automatically recreated by k8s.
  4. The alert stays active until it is cleared manually.

Expected behavior
The alert should clear itself once the pod no longer exists. If a pod just pops up and dies quickly, the detector should not trigger at all.

Screenshots

Additional context
A local solution has been found:
Add .fill(2,duration='1s') to the signal definition, so the line becomes:
signal = data('k8s.pod.phase', filter=base_filtering and filter('env', 'prod') and filter('sfx_monitored', 'true')).fill(2,duration='1s').publish('signal')
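
To illustrate why filling missing datapoints might clear the alert, here is a toy Python model, not SignalFlow and not the detector's real logic. It assumes the phase encoding documented for k8s.pod.phase (2 = Running, 4 = Failed), assumes a hypothetical alert condition of "phase equals Failed", and assumes that an alert keeps its last state when the time series stops reporting, which matches the behavior described above. The duration argument of fill is ignored here; the helper names fill and alert_active are made up for this sketch.

```python
from typing import List, Optional, Sequence

RUNNING = 2  # assumed k8s.pod.phase encoding: 2 = Running
FAILED = 4   # assumed k8s.pod.phase encoding: 4 = Failed


def fill(series: Sequence[Optional[int]], value: int) -> List[int]:
    """Replace missing datapoints (None) with a constant value."""
    return [value if v is None else v for v in series]


def alert_active(series: Sequence[Optional[int]]) -> List[bool]:
    """Toy detector: the alert state is only re-evaluated when a datapoint
    arrives; a missing datapoint leaves the previous state unchanged."""
    active = False
    states = []
    for v in series:
        if v is not None:
            active = (v == FAILED)  # hypothetical alert condition
        states.append(active)
    return states


# The pod runs, fails, then is deleted and stops reporting (None).
pod = [RUNNING, RUNNING, FAILED, FAILED, None, None, None]

without_fill = alert_active(pod)
with_fill = alert_active(fill(pod, RUNNING))

print(without_fill)  # alert stays raised after the pod disappears
print(with_fill)     # filled "Running" datapoints clear the alert
```

In this model the filled value stands in for the vanished pod's series, so the condition is re-evaluated and the alert clears without a manual action.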

Pull request should come up soon.
