[HealthChecks] Add TelemetryHealthCheckPublisher #11178

Open: 6 commits to merge into base branch dev

@jviau (Contributor) commented Jul 11, 2025

Issue describing the changes in this PR

Part of #11010
Resolves #11170

Pull request checklist

IMPORTANT: Currently, changes must be backported to the in-proc branch to be included in Core Tools and non-Flex deployments.

  • Backporting to the in-proc branch is not required
    • Otherwise: Link to backporting PR
  • My changes do not require documentation changes
    • Otherwise: Documentation issue linked to PR
  • My changes should not be added to the release notes for the next release
    • Otherwise: I've added my notes to release_notes.md
  • My changes do not need to be backported to a previous version
    • Otherwise: Backport tracked by issue/PR #issue_or_pr
  • My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • I have added all required tests (Unit tests, E2E tests)

Additional information

Adds an IHealthCheckPublisher which publishes health checks as telemetry.

  • Writes log statements for the health checks. Currently logs only when unhealthy.
  • Adds 2 metrics for health:
| Name | Instrument Type | Unit (UCUM) | Description |
| --- | --- | --- | --- |
| az.functions.health_check.reports | Histogram | {health}{0, 0.5, 1}[1] | Represents the overall health state of the instance |

[1]: Represents the health as a double, ranging from 0 to 1. 1 = healthy, 0.5 = degraded, 0 = unhealthy. By keeping this between 0 and 1, we open the door to a percentage-based health calculation.

| Attribute | Type | Description | Examples |
| --- | --- | --- | --- |
| az.functions.health_check.tag | String | The health check tag the metric is for | <empty_string> [1], az.functions.liveness, az.functions.readiness |

[1]: An empty string means all health checks are included in this metric.
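
The mapping described for az.functions.health_check.reports could be sketched as follows. This is a minimal illustration, not the code from this PR: the HealthReportMetrics class and its members are hypothetical names, and passing "{health}" as the instrument unit is an assumption about how the unit column should be read. It uses only System.Diagnostics.Metrics and the HealthStatus / HealthReport types from Microsoft.Extensions.Diagnostics.HealthChecks.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical helper; the PR's actual type names may differ.
public sealed class HealthReportMetrics
{
    private readonly Histogram<double> _reports;

    public HealthReportMetrics(Meter meter)
    {
        // Instrument name and description are copied from the table above.
        // The unit string is an assumption.
        _reports = meter.CreateHistogram<double>(
            "az.functions.health_check.reports",
            unit: "{health}",
            description: "Represents the overall health state of the instance");
    }

    // 1 = healthy, 0.5 = degraded, 0 = unhealthy, as described in footnote [1].
    public static double ToMetricValue(HealthStatus status) => status switch
    {
        HealthStatus.Healthy => 1.0,
        HealthStatus.Degraded => 0.5,
        _ => 0.0,
    };

    // An empty tag means the report covers all health checks.
    public void Record(HealthReport report, string tag = "")
    {
        _reports.Record(
            ToMetricValue(report.Status),
            new KeyValuePair<string, object?>("az.functions.health_check.tag", tag));
    }
}
```

Keeping the value in the 0 to 1 range means an average over a time window can be read as a rough health percentage in a dashboard.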

| Name | Instrument Type | Unit (UCUM) | Description |
| --- | --- | --- | --- |
| az.functions.health_check.unhealthy_checks | Histogram | {health}{0, 0.5, 1}[1] | Represents the individual health state of the health check components. Only emitted when not healthy. |

[1]: Represents the health as a double, ranging from 0 to 1. 1 = healthy, 0.5 = degraded, 0 = unhealthy. By keeping this between 0 and 1, we open the door to a percentage-based health calculation.

| Attribute | Type | Description | Examples |
| --- | --- | --- | --- |
| az.functions.health_check.tag | String | The health check tag the metric is for | <empty_string> [1], az.functions.liveness, az.functions.readiness |
| az.functions.health_check.name | String | The name of the health check [2] | az.functions.script_host.lifecycle, az.functions.web_host.lifecycle, az.functions.deployment |

[1]: An empty string indicates the measurement was part of the publish that covers all health checks. The goal is for this tag to match the one on az.functions.health_check.reports so the two metrics can be joined into a single view/query in dashboards.
[2]: The name of a health check should follow OTel attribute naming conventions itself.
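
A rough sketch of how the per-check metric might be emitted, under the assumption that each entry in a HealthReport maps to one measurement and that healthy entries are skipped. UnhealthyCheckMetrics, Select, and Record are hypothetical names used only for illustration; the histogram is assumed to have been created elsewhere with the name az.functions.health_check.unhealthy_checks.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical helper; not the PR's actual implementation.
public static class UnhealthyCheckMetrics
{
    // Yields (check name, metric value) for every entry that is not healthy.
    public static IEnumerable<(string Name, double Value)> Select(HealthReport report)
    {
        foreach (KeyValuePair<string, HealthReportEntry> entry in report.Entries)
        {
            if (entry.Value.Status == HealthStatus.Healthy)
            {
                continue; // Only emitted when not healthy.
            }

            double value = entry.Value.Status == HealthStatus.Degraded ? 0.5 : 0.0;
            yield return (entry.Key, value);
        }
    }

    // Records one measurement per non-healthy check, carrying both attributes.
    public static void Record(Histogram<double> histogram, HealthReport report, string tag = "")
    {
        foreach ((string name, double value) in Select(report))
        {
            histogram.Record(
                value,
                new KeyValuePair<string, object?>("az.functions.health_check.tag", tag),
                new KeyValuePair<string, object?>("az.functions.health_check.name", name));
        }
    }
}
```

Because the tag attribute uses the same key and values as on the reports metric, a dashboard query can join the two on az.functions.health_check.tag.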

Other Changes

  • Introduces ObjectPool<T> usage, with helpers for getting shared pools. The first shared pool introduced is for StringBuilder, which allows string builders to be reused efficiently where possible.
    • A later PR will shift the rest of the repo to the shared string builder pool.
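
As an illustration of the pooling pattern, here is a minimal sketch built on Microsoft.Extensions.ObjectPool. SharedPools and HealthMessageFormatter are hypothetical names and do not reflect the helpers this PR adds; only DefaultObjectPoolProvider, CreateStringBuilderPool, Get, and Return are real library calls.

```csharp
using System.Collections.Generic;
using System.Text;
using Microsoft.Extensions.ObjectPool;

// Hypothetical holder for a process-wide pool; the PR's helper API may differ.
public static class SharedPools
{
    public static ObjectPool<StringBuilder> StringBuilders { get; } =
        new DefaultObjectPoolProvider().CreateStringBuilderPool();
}

// Hypothetical consumer that borrows a builder and always returns it.
public static class HealthMessageFormatter
{
    public static string Join(IEnumerable<string> names)
    {
        StringBuilder builder = SharedPools.StringBuilders.Get();
        try
        {
            foreach (string name in names)
            {
                if (builder.Length > 0)
                {
                    builder.Append(", ");
                }

                builder.Append(name);
            }

            return builder.ToString();
        }
        finally
        {
            // The pool's policy clears the builder when it is returned.
            SharedPools.StringBuilders.Return(builder);
        }
    }
}
```

The try/finally ensures the builder goes back to the pool even if formatting throws, so later callers can reuse its buffer.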

Notes

We intend to use health checks to back liveness and readiness probes. As such, metrics scoped to specific health check tags will be important for directly showing the history of those probes. This is why TelemetryHealthCheckPublisher has the additionalTags parameter. This approach produces redundant metrics, because the metrics published for the "default" (no tag) health check overlap with those published for each tag-filtered health check.
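
To make the overlap concrete, the following sketch shows one way a report could be narrowed to a single tag before publishing. It assumes that tag filtering means keeping the entries whose Tags contain the requested tag; HealthReportFilter and ForTag are hypothetical names, and the publisher's real handling of additionalTags may differ.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical helper illustrating tag-scoped reports.
public static class HealthReportFilter
{
    // Builds a new report containing only the entries registered with the given tag.
    // The aggregate status is recomputed from the remaining entries.
    public static HealthReport ForTag(HealthReport report, string tag)
    {
        Dictionary<string, HealthReportEntry> entries = report.Entries
            .Where(entry => entry.Value.Tags.Contains(tag))
            .ToDictionary(entry => entry.Key, entry => entry.Value);

        return new HealthReport(entries, report.TotalDuration);
    }
}
```

Under this assumption, a check tagged az.functions.liveness contributes to both the default (empty tag) publish and the az.functions.liveness publish, which is the redundancy described above.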

@jviau jviau requested a review from a team as a code owner July 11, 2025 22:17
@jviau jviau changed the title Add TelemetryHealthCheckPublisher [HealthChecks] Add TelemetryHealthCheckPublisher Jul 11, 2025
Development

Successfully merging this pull request may close these issues.

[HealthChecks] Add health check telemetry publisher