Skip to content

🌱 Add feature-gate kustomize files, docs, and demo for webhook support feature #1996

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# kustomization file for secure OLMv1
# kustomization file for OLMv1 support for synthetic auth
# DO NOT ADD A NAMESPACE HERE
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# kustomization file for cert-manager backed OLMv1 support for installation of bundles with webhooks
# DO NOT ADD A NAMESPACE HERE
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../base/operator-controller
- ../../../base/common
components:
- ../../../components/tls/operator-controller

patches:
- target:
kind: Deployment
name: operator-controller-controller-manager
path: patches/enable-featuregate.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# enable cert-manager backed webhook support feature gate
- op: add
path: /spec/template/spec/containers/0/args/-
value: "--feature-gates=WebhookProviderCertManager=true"
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# kustomization file for openshift-serviceca backed OLMv1 support for installation of bundles with webhooks
# DO NOT ADD A NAMESPACE HERE
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../../base/operator-controller
- ../../../base/common
components:
- ../../../components/tls/operator-controller

patches:
- target:
kind: Deployment
name: operator-controller-controller-manager
path: patches/enable-featuregate.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# enable openshift-serviceca backed webhook support feature gate
- op: add
path: /spec/template/spec/containers/0/args/-
value: "--feature-gates=WebhookProviderOpenshiftServiceCA=true"
62 changes: 62 additions & 0 deletions docs/draft/howto/enable-webhook-support.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
## Installation of Bundles containing Webhooks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @michaelryanpeter, just for your awareness — since we're adding more +1 drafts here. No pressure at all, just wanted to let you know it's there in case you have bandwidth later. We can always do follow-ups to improve it.


!!! note
This feature is still in *alpha*. Either the `WebhookProviderCertManager`, or the `WebhookProviderOpenshiftServiceCA`, feature-gate
must be enabled to make use of it. See the instructions below on how to enable the feature-gate.

OLMv1 currently does not support the installation of bundles containing webhooks. The webhook support feature enables this capability.
Webhooks, or more concretely Admission Webhooks, are part of Kuberntes' [Dynamic Admission Control](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/)
feature. Webhooks run as services called by the kube-apiservice in due course of processing a resource related request. They can be used to validate resources, ensure reasonable default values,
are set, or aid in the migration to new CustomResourceDefinition schema. The communication with the webhook service is secured by TLS. In OLMv1, the TLS certificate is managed by a
certificate provider. Currently, two certificate providers are supported: CertManager and Openshift-ServiceCA. The certificate provider to use given by the feature-gate:

- `WebhookProviderCertManager` for [CertManager](https://cert-manager.io/)
- `WebhookProviderOpenshiftServiceCA` for [Openshift-ServiceCA](https://github.com/openshift/service-ca-operator)

As CertManager is already installed with OLMv1, we suggest using `WebhookProviderCertManager`.

### Update OLM to enable Feature

```terminal title=Enable WebhookProviderCertManager feature
kubectl kustomize config/overlays/featuregate/webhook-provider-certmanager | kubectl apply -f -
```

Or,

```terminal title=Enable WebhookProviderOpenshiftServiceCA feature
kubectl kustomize config/overlays/featuregate/webhook-provider-openshift-serviceca | kubectl apply -f -
```

Then,

```terminal title=Wait for rollout to complete
kubectl rollout status -n olmv1-system deployment/operator-controller-controller-manager
```

### Notes on the generated certificate

#### CertManager

The generated certificate maintains a high-level of parity with the certificate generated by OLMv0:
- Self-signed
- Two validity period, rotating 24h before expiry
- Valid for the webhook service's DNSNames:
- <service-name>.<namespace>
- <service-name>.<namespace>.svc
- <service-name>.<namespace>.svc.cluster.local

#### Openshift-ServiceCA

Generation and rotation are completely governed by [Openshift-ServiceCA](https://github.com/openshift/service-ca-operator)

### How does it work?

There's no change in the installation flow. Just install a bundle containing webhooks as you would any other.

### Demo

!!! note
As there is no difference in usage or experience between the CertManager and Openshift-ServiceCA variants, only
the cert-manager variant is demoed.

[![asciicast](https://asciinema.org/a/GyjsB129GkUadeuxFhNuG4FcS.svg)](https://asciinema.org/a/GyjsB129GkUadeuxFhNuG4FcS)
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
apiVersion: webhook.operators.coreos.io/v1
kind: webhooktest
metadata:
namespace: webhook-operator
name: mutating-webhook-test
spec:
valid: true
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
apiVersion: webhook.operators.coreos.io/v1
kind: webhooktest
metadata:
namespace: webhook-operator
name: validating-webhook-test
spec:
valid: false
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
apiVersion: olm.operatorframework.io/v1
kind: ClusterCatalog
metadata:
name: webhook-operator-catalog
spec:
source:
type: Image
image:
ref: quay.io/operator-framework/webhook-operator-index:0.0.3
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
apiVersion: olm.operatorframework.io/v1
kind: ClusterExtension
metadata:
name: webhook-operator
spec:
namespace: webhook-operator
serviceAccount:
name: webhook-operator-installer
source:
catalog:
packageName: webhook-operator
version: 0.0.1
selector: {}
upgradeConstraintPolicy: CatalogProvided
sourceType: Catalog
60 changes: 60 additions & 0 deletions hack/demo/webhook-provider-certmanager-demo.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
#!/usr/bin/env bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have : https://github.com/operator-framework/operator-controller/blob/main/.github/workflows/catalogd-demo.yaml

However, could we add a GitHub action to run all demos that we have been adding, so that we can avoid breaking any of them? It is nice because it seems like we end up covering the feature to prevent regressions.

Copy link
Contributor Author

@perdasilva perdasilva May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry about doing this for every PR as the number of demos will grow and this might get time consuming. I'm also not entirely convinced that breaking a demo should block a PR from merging to main. We may not want to release, but we shouldn't block progress because of something that can be solved separately, imo. So, maybe this is something we might want to bring up in the community meeting for discussion rather than solving in this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IHMO: If we broke the demo it is like an e2e test,
That means that something that was working stop to work
So, we need to start testing out.

But I either agree that it might be out of the scope of this one
We can create a follow up PR for that and discuss it 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, this is more of a "documentation is out of date" issue than a "we're not doing what we are supposed to do" issue. So, I'd be against adding this check on a PR basis:

  • e2es should already catch any regressions in behavior (the most important thing imo)
  • (as with e2es and unit tests) these will tend to grow in runtime with the number of features/demos added. I think it's worthwhile to have a PR pipeline that is as quick as possible.
  • I think it's unlikely that they will yield that many true positives because the failures will likely be around API surface changes (which should change infrequently, at least, after release). If we have false positives (issues with the demo script, dependencies, etc.) it's another set of problems that aren't related to the PR that can increase the lead time to merge.

So, I think from a cost/value perspective, it would be better to put this check on the critical path of the release job and run it once at the end, or have it as a periodic running once every week or two.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment – you made some good points! 😊
Just to share another view:

  • We tested one today and it worked fine, no flakes.
  • These tests check real features, like E2E, If something breaks, it’s probably the feature, not the docs.
    (However, +1 for a possible duplication since we can end up have e2e tests and demos doing the same)
  • Performance might be a problem later, but we can wait and see.

I think your opinion makes sense too, and it’s probably not something we need to fix now.
All good from my side!


#
# Welcome to the webhook support with CertManager demo
#
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT

# enable 'WebhookProviderCertManager' feature
kubectl kustomize config/overlays/featuregate/webhook-provider-certmanager | kubectl apply -f -

# wait for operator-controller to become available
kubectl rollout status -n olmv1-system deployment/operator-controller-controller-manager

# create webhook-operator catalog
cat ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/webhook-operator-catalog.yaml
kubectl apply -f ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/webhook-operator-catalog.yaml

# wait for catalog to be serving
kubectl wait --for=condition=Serving clustercatalog/webhook-operator-catalog --timeout="60s"

# create install namespace
kubectl create ns webhook-operator

# create installer service account
kubectl create serviceaccount -n webhook-operator webhook-operator-installer

# give installer service account admin privileges
kubectl create clusterrolebinding webhook-operator-installer-crb --clusterrole=cluster-admin --serviceaccount=webhook-operator:webhook-operator-installer

# install webhook operator clusterextension
cat ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/webhook-operator-extension.yaml

# apply cluster extension
kubectl apply -f ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/webhook-operator-extension.yaml

# wait for cluster extension installation to succeed
kubectl wait --for=condition=Installed clusterextension/webhook-operator --timeout="60s"

# wait for webhook-operator deployment to become available and back the webhook service
kubectl wait --for=condition=Available -n webhook-operator deployments/webhook-operator-webhook

# demonstrate working validating webhook
cat ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/validating-webhook-test.yaml

# resource creation should be rejected by the validating webhook due to bad attribute value
kubectl apply -f ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/validating-webhook-test.yaml

# demonstrate working mutating webhook
cat ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/mutating-webhook-test.yaml

# apply resource
kubectl apply -f ${DEMO_RESOURCE_DIR}/webhook-provider-certmanager/mutating-webhook-test.yaml

# get webhooktest resource in v1 schema - resource should have new .spec.mutate attribute
kubectl get webhooktest.v1.webhook.operators.coreos.io -n webhook-operator mutating-webhook-test -o yaml

# demonstrate working conversion webhook by getting webhook test resource in v2 schema - the .spec attributes should now be under the .spec.conversion stanza
kubectl get webhooktest.v2.webhook.operators.coreos.io -n webhook-operator mutating-webhook-test -o yaml

# this concludes the webhook support demo - Thank you!
Loading