gagbo
Member

@gagbo gagbo commented Jul 24, 2023

Description

  • Rename function_calls_count to function_calls
  • Update autometrics-shared
  • Add unit to histogram duration
  • Replace caller with caller.(function|module)
  • Add service.name label
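To make the label changes above concrete, here is a minimal sketch of how an old label set could map onto the new one. It is not the code from this PR: `renameMetric` and `migrateLabels` are hypothetical helpers, and the `module.function` format assumed for the old `caller` value is an illustration only. The names `function_calls_count`, `function_calls`, `caller.function`, `caller.module` and `service.name` come from the list above.

```go
package main

import (
	"fmt"
	"strings"
)

// renameMetric applies the one metric rename named in the description.
// Any other metric name is returned unchanged.
func renameMetric(old string) string {
	if old == "function_calls_count" {
		return "function_calls"
	}
	return old
}

// migrateLabels (hypothetical helper) converts an old-style label set into
// the new one: the single "caller" label is replaced by "caller.module" and
// "caller.function", and a "service.name" label is added.
func migrateLabels(old map[string]string, serviceName string) map[string]string {
	out := make(map[string]string, len(old)+2)
	for k, v := range old {
		if k == "caller" {
			continue // replaced by the two labels set below
		}
		out[k] = v
	}
	if caller, ok := old["caller"]; ok {
		// Assumption: the old value looks like "module.function".
		// Split at the last dot; with no dot, the module stays empty.
		module, function := "", caller
		if i := strings.LastIndex(caller, "."); i >= 0 {
			module, function = caller[:i], caller[i+1:]
		}
		out["caller.module"] = module
		out["caller.function"] = function
	}
	out["service.name"] = serviceName
	return out
}

func main() {
	old := map[string]string{"function": "Handler", "caller": "main.run"}
	labels := migrateLabels(old, "api")

	fmt.Println(renameMetric("function_calls_count"))
	fmt.Println(labels["caller.module"], labels["caller.function"], labels["service.name"])
}
```

Splitting at the last dot keeps dotted module paths intact. A real implementation would more likely fill both labels directly from runtime caller information than parse a combined string.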

Fixes #54
Fixes #60
Fixes #62

Checklist

  • The CHANGELOG is updated.
  • The open-telemetry example in repository works fine:
    • the docker compose file is valid
    • the application compiles and runs
    • alerts are getting triggered in Prometheus
  • The prometheus example in repository works fine:
    • the docker compose file is valid
    • the application compiles and runs
    • alerts are getting triggered in Prometheus
    • exemplars are accessible on the graph

@gagbo gagbo force-pushed the misc_fixes branch 2 times, most recently from e0ae326 to 2250fab July 24, 2023 08:54
@gagbo
Member Author

gagbo commented Jul 24, 2023

The SLO-breach reporting is broken for multiple reasons:

  • the refactoring to use context.Context introduced a regression in latency detection (see ffafd90)
  • the rules file from autometrics-shared has not been updated to detect the error rate correctly with the new function calls and function duration metric names (see this comment)

So the PR is currently blocked.

@gagbo
Member Author

gagbo commented Jul 27, 2023

Update: the error rate alert has been fixed, but there is still a typo in the latency query. Once that is solved (and the bundled rules are regenerated), this PR will be merged.

@gagbo gagbo merged commit 0a62cf8 into main Sep 4, 2023
@gagbo gagbo deleted the misc_fixes branch September 4, 2023 08:40