Hello HyperDX maintainers, sorry if I'm not following an issue formatting guide here; I didn't see one in CONTRIBUTING.md.
Anyway, I have what I think is a relatively straightforward feature request. In Grafana (and likely other tools), you can configure an alert so that it only fires after the offending log line, metric, etc. has exceeded its alerting threshold over multiple consecutive windows. For example, if I have a log line like `failed to send message to queue` and I only want to send an alert (or page someone) when we see at least one of those messages in each of five consecutive minutes, I could configure my alert that way. The goal is to mitigate flaky alerts and pages that auto-resolve quickly. Note that "at least one occurrence per minute for five consecutive minutes" is not necessarily equivalent to "at least five occurrences over five minutes" - this is the critical distinction.
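To illustrate that distinction, here is a minimal Python sketch. It is not HyperDX code: the function names `fires_on_total` and `fires_on_consecutive` and the per-minute count lists are hypothetical, made up only for this example, and the real alerting logic may be structured quite differently.

```python
from typing import Sequence


def fires_on_total(counts: Sequence[int], threshold: int) -> bool:
    """Single-window rule: fire if the total over the whole period meets the threshold."""
    return sum(counts) >= threshold


def fires_on_consecutive(counts: Sequence[int], threshold: int, windows: int) -> bool:
    """Consecutive-window rule: fire only if `windows` windows in a row each meet the threshold."""
    streak = 0
    for count in counts:
        # Extend the streak on a breaching window, reset it otherwise.
        streak = streak + 1 if count >= threshold else 0
        if streak >= windows:
            return True
    return False


# Hypothetical per-minute counts of "failed to send message to queue".
burst = [5, 0, 0, 0, 0]      # one short burst that resolves on its own
sustained = [1, 1, 1, 1, 1]  # a problem that persists for five minutes

print(fires_on_total(burst, threshold=5))                       # True
print(fires_on_consecutive(burst, threshold=1, windows=5))      # False
print(fires_on_consecutive(sustained, threshold=1, windows=5))  # True
```

The burst trips the "five occurrences over five minutes" rule but not the consecutive-window rule, which is the flaky-page case I'd like to be able to suppress.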
I suspect this might not be too large of a lift on top of the current alerting feature surface, and I'd be happy to take a stab at implementing it if you all think the idea makes sense.
Let me know, and thank you!