feat(controller): prometheus metrics for git and SCM operations (#117) #255

Open: wants to merge 8 commits into main

crenshaw-dev
Contributor

Partially implements #117

@codecov-commenter

codecov-commenter commented Apr 12, 2025

Codecov Report

Attention: Patch coverage is 0% with 30 lines in your changes missing coverage. Please review.

Project coverage is 49.80%. Comparing base (835f2a1) to head (40a983f).

Files with missing lines                    Patch %   Lines
internal/scms/github/pullrequest.go         0.00%     20 Missing ⚠️
internal/scms/github/utils.go               0.00%     6 Missing ⚠️
internal/scms/github/commit_status.go       0.00%     4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #255      +/-   ##
==========================================
- Coverage   50.64%   49.80%   -0.84%     
==========================================
  Files          14       15       +1     
  Lines        1779     1809      +30     
==========================================
  Hits          901      901              
- Misses        759      789      +30     
  Partials      119      119              


Signed-off-by: Michael Crenshaw <[email protected]>
@crenshaw-dev
Contributor Author

crenshaw-dev commented Apr 12, 2025

Looking good:

# HELP git_operations_duration_seconds A histogram of the duration of git clone operations.
# TYPE git_operations_duration_seconds histogram
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.005"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.01"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.025"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.05"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.25"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="0.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="2.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="10"} 3
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab",le="+Inf"} 3
git_operations_duration_seconds_sum{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab"} 20.964030624
git_operations_duration_seconds_count{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab"} 3
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.005"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.01"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.025"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.05"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.25"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="0.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="2.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="5"} 34
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="10"} 34
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab",le="+Inf"} 34
git_operations_duration_seconds_sum{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab"} 90.721027418
git_operations_duration_seconds_count{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab"} 34
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.005"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.01"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.025"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.05"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.25"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="0.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="2.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="5"} 52
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="10"} 53
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab",le="+Inf"} 53
git_operations_duration_seconds_sum{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab"} 148.46533913000002
git_operations_duration_seconds_count{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab"} 53
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.005"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.01"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.025"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.05"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.25"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="0.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="1"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="2.5"} 0
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="5"} 78
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="10"} 78
git_operations_duration_seconds_bucket{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab",le="+Inf"} 78
git_operations_duration_seconds_sum{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab"} 209.87871878699997
git_operations_duration_seconds_count{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab"} 78
# HELP git_operations_total A counter of git clone operations.
# TYPE git_operations_total counter
git_operations_total{git_repository="gitlab",operation="clone",result="success",scm_provider="gitlab"} 3
git_operations_total{git_repository="gitlab",operation="fetch",result="success",scm_provider="gitlab"} 34
git_operations_total{git_repository="gitlab",operation="ls-remote",result="success",scm_provider="gitlab"} 53
git_operations_total{git_repository="gitlab",operation="pull",result="success",scm_provider="gitlab"} 78
# HELP scm_calls_duration_seconds A histogram of the duration of SCM API calls.
# TYPE scm_calls_duration_seconds histogram
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.005"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.01"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.025"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.05"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.1"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.25"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="0.5"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="1"} 0
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="2.5"} 5
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="5"} 5
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="10"} 5
scm_calls_duration_seconds_bucket{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab",le="+Inf"} 5
scm_calls_duration_seconds_sum{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab"} 8.670691792
scm_calls_duration_seconds_count{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab"} 5
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.005"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.01"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.025"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.05"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.1"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.25"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="0.5"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="1"} 0
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="2.5"} 1
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="5"} 2
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="10"} 2
scm_calls_duration_seconds_bucket{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab",le="+Inf"} 2
scm_calls_duration_seconds_sum{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab"} 4.482714084
scm_calls_duration_seconds_count{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab"} 2
# HELP scm_calls_total A counter of SCM API calls.
# TYPE scm_calls_total counter
scm_calls_total{api="CommitStatus",git_repository="gitlab",operation="create",response_code="201",scm_provider="gitlab"} 5
scm_calls_total{api="PullRequest",git_repository="gitlab",operation="list",response_code="200",scm_provider="gitlab"} 2
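A note on reading the histogram series above: each `_bucket` value is cumulative (it counts every observation at or below its `le` bound), and `_sum` divided by `_count` gives the mean duration. The following is a minimal sketch of that arithmetic using only the standard library. The `histogram` type and its methods are hypothetical helpers for illustration, not part of this PR.

```go
package main

import "fmt"

// histogram holds the values Prometheus exposes for one label set.
// It is a hypothetical illustration type, not a client library type.
type histogram struct {
	upperBounds []float64 // finite "le" bounds, ascending
	cumulative  []uint64  // cumulative count at each finite bound
	sum         float64   // value of the _sum series
	count       uint64    // value of the _count series (same as le="+Inf")
}

// mean returns the average observation, or 0 when nothing was observed.
func (h histogram) mean() float64 {
	if h.count == 0 {
		return 0
	}
	return h.sum / float64(h.count)
}

// perBucket converts cumulative counts into the number of observations
// that landed in each bucket. The final element is the overflow above
// the last finite bound.
func (h histogram) perBucket() []uint64 {
	out := make([]uint64, 0, len(h.cumulative)+1)
	var prev uint64
	for _, c := range h.cumulative {
		out = append(out, c-prev)
		prev = c
	}
	return append(out, h.count-prev)
}

func main() {
	// ls-remote series from the output above, trimmed to the last
	// three finite buckets (all earlier buckets are 0).
	lsRemote := histogram{
		upperBounds: []float64{2.5, 5, 10},
		cumulative:  []uint64{0, 52, 53},
		sum:         148.46533913000002,
		count:       53,
	}
	fmt.Printf("mean: %.3fs\n", lsRemote.mean())      // mean: 2.801s
	fmt.Println("per bucket:", lsRemote.perBucket()) // per bucket: [0 52 1 0]
}
```

Read this way, 52 of the 53 `ls-remote` calls took between 2.5 and 5 seconds, and one took between 5 and 10 seconds.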

And for rate limits:

# HELP scm_calls_rate_limit_limit A gauge for the rate limit of SCM API calls.
# TYPE scm_calls_rate_limit_limit gauge
scm_calls_rate_limit_limit{scm_provider="github"} 5000
# HELP scm_calls_rate_limit_remaining A gauge for the remaining rate limit of SCM API calls.
# TYPE scm_calls_rate_limit_remaining gauge
scm_calls_rate_limit_remaining{scm_provider="github"} 4886
# HELP scm_calls_rate_limit_reset_remaining_seconds A gauge for the remaining seconds until the SCM API rate limit resets.
# TYPE scm_calls_rate_limit_reset_remaining_seconds gauge
scm_calls_rate_limit_reset_remaining_seconds{scm_provider="github"} 3010.768046

Signed-off-by: Michael Crenshaw <[email protected]>
@robinlieb
Contributor

Nice work!
Since I have also been looking into metrics lately, I noticed that the PullRequest / CommitStatus functions are getting bloated with identical lines. Have you thought about putting the logging and metrics into a generic function?

Something like:

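// GithubResponseFunc is any go-github call that returns a value plus the raw response.
type GithubResponseFunc[T any] func() (T, *github.Response, error)

// LogAndMetricWrapper runs fn, then logs the rate limit and records an SCM call metric.
func LogAndMetricWrapper[T any](
	ctx context.Context,
	gitRepo *v1alpha1.GitRepository,
	fn GithubResponseFunc[T],
) (T, *github.Response, error) {
	start := time.Now()

	t, resp, err := fn()

	// resp can be nil when the request fails before a response arrives.
	if resp != nil {
		logger := log.FromContext(ctx)
		logger.Info("github rate limit",
			"limit", resp.Rate.Limit,
			"remaining", resp.Rate.Remaining,
			"reset", resp.Rate.Reset,
			"url", resp.Request.URL)
		logger.V(4).Info("github response status", "status", resp.Status)

		// NOTE: the API and operation are hardcoded in this sketch; a real
		// version would need to take them as parameters.
		metrics.RecordSCMCall(gitRepo, metrics.SCMAPIPullRequest, metrics.SCMOperationList, resp.StatusCode, time.Since(start))
	}

	return t, resp, err
}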

And then using it like:

githubPullRequest, _, err := LogAndMetricWrapper(ctx, gitRepo, func() (*github.PullRequest, *github.Response, error) {
	return pr.client.PullRequests.Create(ctx, gitRepo.Spec.GitHub.Owner, gitRepo.Spec.GitHub.Name, newPR)
})

I guess creating a RoundTripper could also be a viable approach. I could also offer to create a dedicated refactor PR for this after this one merges.

Signed-off-by: Michael Crenshaw <[email protected]>
@crenshaw-dev
Contributor Author

That could be a good refactor in the future... I think the duplication is probably kinda-alright for now.

A round tripper would be interesting. It could get us nicer metrics for, e.g., the token fetch call, which is currently hidden inside the Go client.
