[wip] [feedback wanted] Do not scrape pods when activator in path #16254

Alexander-Kita · 2025-11-21T03:43:03Z

Proposed Changes

Pause scraping pods when the activator is in the data path (excess burst capacity < 0)
Resume scraping when excess burst capacity >= 0
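
As a rough illustration of the pause/resume rule above, here is a minimal sketch. The `scrapeGate` type and its methods are hypothetical names invented for this example and are not taken from `pkg/autoscaler/metrics/collector.go`; the only thing borrowed from the description is the threshold (pause below zero, resume at zero or above).

```go
package main

import (
	"fmt"
	"sync"
)

// scrapeGate is a hypothetical helper, not a type from knative/serving.
// It records whether pod scraping is currently paused.
type scrapeGate struct {
	mu     sync.Mutex
	paused bool
}

// Update applies the rule from the PR description: pause while excess burst
// capacity is negative, resume once it is zero or positive.
// It reports whether the paused state changed.
func (g *scrapeGate) Update(excessBurstCapacity float64) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	next := excessBurstCapacity < 0
	changed := next != g.paused
	g.paused = next
	return changed
}

// Paused reports whether scraping is currently paused.
func (g *scrapeGate) Paused() bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.paused
}

func main() {
	g := &scrapeGate{}
	for _, ebc := range []float64{10, -1, -5, 0} {
		changed := g.Update(ebc)
		fmt.Printf("ebc=%v paused=%v changed=%v\n", ebc, g.Paused(), changed)
	}
}
```

Returning whether the state changed lets a caller log or act only on transitions, which may help with the log spam mentioned in the commit history.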

Feedback needed on the following items (more emphasis on the first):

In the current implementation, because of #13027 (Improve SKS handling for unavailable Activator), when excess burst capacity < 0 AND there are no activator endpoints, some metrics might be missed, since SKS forces "serve" mode in that case. Is there a way to surface the status ("proxy" or "serve") to the autoscaler? Or is there another way to account for this?
Writing unit tests for this situation
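
One possible shape for both feedback items, sketched under a loud assumption: that the autoscaler could learn how many activator endpoints are ready, which is exactly the open question above. `sksMode`, `effectiveMode`, and `shouldScrapePods` are hypothetical names for this example only, and the cases in `main` are laid out the way a table-driven unit test might be.

```go
package main

import "fmt"

// sksMode is a hypothetical stand-in for the "proxy"/"serve" status
// mentioned above; it is not the real SKS API type.
type sksMode string

const (
	modeProxy sksMode = "proxy"
	modeServe sksMode = "serve"
)

// effectiveMode assumes the activator is only in the path when excess burst
// capacity is negative AND at least one activator endpoint exists.
// With no activator endpoints, "serve" is forced (the #13027 behavior
// described in the PR text).
func effectiveMode(excessBurstCapacity float64, activatorEndpoints int) sksMode {
	if excessBurstCapacity < 0 && activatorEndpoints > 0 {
		return modeProxy
	}
	return modeServe
}

// shouldScrapePods keeps scraping whenever traffic reaches pods directly.
func shouldScrapePods(excessBurstCapacity float64, activatorEndpoints int) bool {
	return effectiveMode(excessBurstCapacity, activatorEndpoints) == modeServe
}

func main() {
	cases := []struct {
		name      string
		ebc       float64
		endpoints int
	}{
		{"activator in path", -1, 2},
		{"negative capacity, no activator endpoints", -1, 0},
		{"enough capacity", 5, 2},
	}
	for _, c := range cases {
		fmt.Printf("%s: mode=%s scrape=%v\n",
			c.name, effectiveMode(c.ebc, c.endpoints), shouldScrapePods(c.ebc, c.endpoints))
	}
}
```

The second case is the one the question is about: capacity is negative, yet pods still need to be scraped because nothing is proxying.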

Release Note

NONE

knative-prow · 2025-11-21T03:43:12Z

Hi @Alexander-Kita. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

knative-prow · 2025-11-21T03:43:14Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Alexander-Kita
Once this PR has been reviewed and has the lgtm label, please assign skonto for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

codecov · 2025-11-21T03:48:48Z

Codecov Report

❌ Patch coverage is 34.61538% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.96%. Comparing base (090b6ae) to head (40751ea).
⚠️ Report is 39 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/autoscaler/metrics/collector.go | 22.72% | 16 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #16254      +/-   ##
==========================================
- Coverage   80.05%   79.96%   -0.09%     
==========================================
  Files         215      215              
  Lines       13327    13354      +27     
==========================================
+ Hits        10669    10679      +10     
- Misses       2300     2315      +15     
- Partials      358      360       +2

dprotaso · 2025-12-04T15:11:45Z

/ok-to-test

dprotaso · 2025-12-04T15:17:05Z

The e2e failures seem legit

dprotaso · 2025-12-04T15:18:58Z

Generally after a quick look I like the abstractions used. Though I'm guessing there's something more nuanced that's causing this change to break the e2e tests

knative-prow · 2025-12-04T16:27:29Z

@Alexander-Kita: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| istio-latest-no-mesh_serving_main | `40751ea` | link | true | `/test istio-latest-no-mesh` |

Alexander-Kita · 2025-12-09T22:17:00Z

I believe I found what is causing these e2e failures. The activator (concurrency_reporter) appears to stop collecting metrics when it sees zero concurrency in the service:

From pkg/activator/handler/concurrency_reporter.go:

		// This is only 0 if we have seen no activity for the entire reporting
		// period at all.
		if report.AverageConcurrency == 0 {
			toDelete = append(toDelete, key)
		}

This check appears to trigger too early, so the activator stops sending metrics (since it sees no concurrency). Because pods are no longer scraped while the activator is in the path, the autoscaler then has no metrics source at all, which prevents the service from ever scaling to zero. To test this, I added a buffer (the reporter has to see zero 3 times before it stops), and the e2e test passed when I ran it. This behavior was probably hidden before because we were still scraping pod metrics while the activator was in the path. How do you recommend I approach a solution to this, if one is still wanted? @dprotaso
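
For reference, the buffer idea could look roughly like the sketch below. This is not the code from the PR: `zeroStreaks`, `Observe`, and `zeroReportsBeforeDelete` are hypothetical names, and the only detail taken from the comment above is the threshold of three zero reports. It assumes the three zero reports must be consecutive.

```go
package main

import "fmt"

// zeroReportsBeforeDelete mirrors the "see zero 3 times" buffer described
// in the comment; the constant name is made up for this sketch.
const zeroReportsBeforeDelete = 3

// zeroStreaks counts consecutive zero-concurrency reports per key.
type zeroStreaks map[string]int

// Observe records one reporting period for key and reports whether the key
// has now seen enough consecutive zero reports to be dropped.
func (z zeroStreaks) Observe(key string, averageConcurrency float64) bool {
	if averageConcurrency != 0 {
		// Any activity resets the streak.
		delete(z, key)
		return false
	}
	z[key]++
	if z[key] >= zeroReportsBeforeDelete {
		delete(z, key)
		return true
	}
	return false
}

func main() {
	z := zeroStreaks{}
	for _, c := range []float64{0, 0, 1, 0, 0, 0} {
		fmt.Printf("concurrency=%v drop=%v\n", c, z.Observe("rev-1", c))
	}
}
```

In the quoted loop, the `toDelete = append(toDelete, key)` line would then run only when a check like `Observe` returns true, so a single quiet reporting period no longer stops reporting.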

Alexander-Kita added 6 commits November 18, 2025 22:52

initial implementation

55e736b

use attribute instead of channel

bd83808

change log

440ea96

more logging changes

6e1b67c

change switch to if else statement

3e6b5ca

remove logs to prevent spam

ee86a28

knative-prow bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Nov 21, 2025

knative-prow bot added the needs-ok-to-test (requires an org member to verify it is safe to test) and size/M (changes 30-99 lines, ignoring generated files) labels Nov 21, 2025

knative-prow bot requested review from dprotaso, dsimansk and skonto November 21, 2025 03:43

Alexander-Kita added 2 commits December 2, 2025 20:02

change when pause and resume happens

afaeba9

add unit test for autoscaler

c6a5041

knative-prow bot added the size/L label (changes 100-499 lines, ignoring generated files) and removed the size/M label (changes 30-99 lines, ignoring generated files) Dec 4, 2025

whitespace linting fixes

40751ea

knative-prow bot added the ok-to-test label (a non-member PR verified by an org member as safe to test) and removed the needs-ok-to-test label (requires an org member to verify it is safe to test) Dec 4, 2025
