- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.6k
🐛 Fix flakes in e2e tests for metrics #5138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| Hi @mayuka-c. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with  Once the patch is verified, the new status will be reflected by the  I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. | 
| [APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mayuka-c The full list of commands accepted by this bot can be found here. 
Needs approval from an approver in each of these files:
 Approvers can indicate their approval by writing  | 
f130aac    to
    1c32309      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @mayuka-c 👋
That looks great! However, this change only makes sense for projects that actually have webhooks.
Not all projects do — we use markers to inject code (see Kubebuilder markers
).
For example:
+kubebuilder:scaffold:e2e-webhooks-checks
Then, if we really need that we will need to see if we can use the same; it seems not. Then we need a new marker. So, we need to be 1000% sure that is the best way. Ideally we should avoid create new markers and etc since we need keep the support for those, maker harder for end user customisartions, etc.
So, if we need to check whether the webhook is serving before the metrics, we should only add this logic when webhooks are present — meaning we’ll need to use a marker for it.
Also, please make sure the scaffolded code remains as generic as possible, so it still works even if users customise their configurations.
Lastly, since this change affects the scaffolded e2e tests in projects (which impact end users), we need to be careful with the commit type.
We should use the 🐛 emoji instead of 🌱 — the seed emoji is only used for changes that don’t affect users and can be ignored in the release notes.
| Hi @camilamacedo86 
 If I'm not wrong, I could add a new fragment to do the webhook readiness check here and replace with the marker here But the only problem that I see is, it will end up running other fragment checks too which is not required. Shall I go ahead to define a new marker? or if you have any better solution, please do suggest :) | 
| 
 Sorry my bad, I will update the title with proper emoji. Thank you :) | 
| Hi @mayuka-c Let's create a new marker | 
| 
 Sure thank you @camilamacedo86 . I will review the code and make sure it is generic. | 
b703a44    to
    d058ca9      
    Compare
  
    | Hi @camilamacedo86 👋 I have added a new marker  Please let me know if this is fine or if I'm missing | 
| g.Expect(err).NotTo(HaveOccurred(), "Webhook server not responding on port 443") | ||
| } | ||
| Eventually(verifyWebhookServiceReady, 2*time.Minute).Should(Succeed()) | ||
| // +kubebuilder:scaffold:e2e-webhooks-readiness | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may can simplify with
By("waiting for webhook service to be ready if webhooks are configured")
verifyWebhookServiceReady := func(g Gomega) {
	const webhookServiceName = "{{ProjectName}}-webhook-service"
	cmd := exec.Command("kubectl", "get", "service", webhookServiceName, "-n", namespace)
	_, err := utils.Run(cmd)
	if err != nil {
		// Project has no webhook service; nothing to wait on.
		return
	}
	cmd = exec.Command("kubectl", "get", "pods", "-l", "control-plane=controller-manager",
		"-n", namespace, "-o", "jsonpath={.items[0].status.conditions[?(@.type=='Ready')].status}")
	output, err := utils.Run(cmd)
	g.Expect(err).NotTo(HaveOccurred())
	g.Expect(output).To(Equal("True"),
		"Controller manager pod not ready (webhook server may not be accepting connections)")
	cmd = exec.Command("kubectl", "get", "endpoints", webhookServiceName,
		"-n", namespace, "-o", "jsonpath={.subsets[*].addresses[*].ip}")
	output, err = utils.Run(cmd)
	g.Expect(err).NotTo(HaveOccurred())
	g.Expect(output).NotTo(BeEmpty(), "Webhook service endpoints are not ready")
	cmd = exec.Command("kubectl", "get", "--raw",
		fmt.Sprintf("/api/v1/namespaces/%s/services/https:%[2]s:https/proxy/readyz", namespace, 		webhookServiceName))
	_, err = utils.Run(cmd)
	g.Expect(err).NotTo(HaveOccurred(), "Webhook server not responding on port 443")
}
Eventually(verifyWebhookServiceReady, 2*time.Minute).Should(Succeed())There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks you, this looks Perfect :)
Regarding this I have slightly modified. Since now we have the marker which runs exclusively for webhooks, instead of ignoring the error, will be failing if service is not found.
	const webhookServiceName = "{{ProjectName}}-webhook-service"
	cmd := exec.Command("kubectl", "get", "service", webhookServiceName, "-n", namespace)
	_, err := utils.Run(cmd)
	if err != nil {
		// Project has no webhook service; nothing to wait on.
		return
	}
to
      const webhookServiceName = "{{.ProjectName}}-webhook-service"
      
      // Webhook service should exist since webhooks are configured
      cmd := exec.Command("kubectl", "get", "service", webhookServiceName, "-n", namespace)
      _, err := utils.Run(cmd)
      g.Expect(err).NotTo(HaveOccurred(), "Webhook service should exist but was not found")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this
	cmd = exec.Command("kubectl", "get", "--raw",
		fmt.Sprintf("/api/v1/namespaces/%s/services/https:%[2]s:https/proxy/readyz", namespace, 		webhookServiceName))
	_, err = utils.Run(cmd)
	g.Expect(err).NotTo(HaveOccurred(), "Webhook server not responding on port 443")
Looks like it is failing after using the service proxy to check for readiness here: https://github.com/mayuka-c/kubebuilder/actions/runs/18962257859/job/54151953303.
I will see on what could be the reason for this. Please do let me know if the endpoint used is wrong. Thanks!
ec70142    to
    abd279e      
    Compare
  
    Fix flakes - 2 Run make generate RUn make generate Minor fix Minor fix Fix-2 Fix-3 Implement a new marker for webhook readiness Lint fix scaffold md fix Address comment-2 Minor fix Test-1 Fix-2
Fixes: #5137
By looking at the logs, looks like during the curl metrics pod creation, calls to the webhook is failing due to connection refused. So added precheck to ensure the webhook is ready before beginning with the pod creation.
Validation
Ran the e2e tests multiple times here and did not see the flakes: https://github.com/mayuka-c/kubebuilder/actions/runs/18582517449