
Commit ff8fcb3

mallardduck authored and alexandreLamarre committed
Squash updates 57444ba..5f12190
- Fix bug when finding project root
- Pick the last tag instead of the first one
- Allow setting GIT_TAG externally too and set on workflow based on tags
- Fix install-federator e2e CI to use proper method for current charts
- Rewrite a few e2e scripts to be timeout based loops
- Add local-e2e target script matching current GHA workflow
- Remove unnecessary wait for data steps w/ new script
- Correct dev branch tags
- Refactor another e2e script to use timeout loop
- more shellcheck
- Move timeout onto correct step
- Fix success condition of alerts script
- try e2e without branch tags script
- Seed image name for testing via versions script
- Add TAG env back to e2e workflow
- Refactor another script for wait loop
- Fix inconsistent case of strings
- Also make sure that Prom Targets are in the "up" state
- Adds missing parameters to the commands in */developing.md
- fix(ci): Correct k3s version used for e2e testing (#172)
  * Add min/max k3s version used for testing. I'm adding both so in the future we can test both extremes, but for now I'm going to PR just using 1 of them.
  * go generate
  * Add helper script to grab K3S versions as ENVs and use that in local-e2e
  * Make debug output actually create envs that should work in GHA instead of just human debug output
  * Source k3s version from `build.yaml` for GHA too
  * Fix job step ordering so yq will be installed when k3s version env is set
  * remove conflicting and unused ids
  * Fix helm-locker yq step
  * Export tag style of k3s versions
  * Use tags directly in GHA
- update libraries & dependencies to k8s 1.32 Signed-off-by: Alexandre Lamarre <[email protected]>
- add updated helm-locker Signed-off-by: Alexandre Lamarre <[email protected]>
- add updated helm-project-operator Signed-off-by: Alexandre Lamarre <[email protected]>
- Add node identification to internal package Signed-off-by: Alexandre Lamarre <[email protected]>
- Add internal helmcommon package to manage crds Signed-off-by: Alexandre Lamarre <[email protected]>
- update helm-locker to use shared crds abstraction Signed-off-by: Alexandre Lamarre <[email protected]>
- update helm-project-operator to use shared crds abstraction Signed-off-by: Alexandre Lamarre <[email protected]>
- update older sub-packages Signed-off-by: Alexandre Lamarre <[email protected]>
- update main functions
- slightly rework crd management Signed-off-by: Alexandre Lamarre <[email protected]>
- update go generate targets Signed-off-by: Alexandre Lamarre <[email protected]>
- update to go 1.23
- run go mod tidy Signed-off-by: Alexandre Lamarre <[email protected]>
- update golangci-lint Signed-off-by: Alexandre Lamarre <[email protected]>
- remove internal go sub modules Signed-off-by: Alexandre Lamarre <[email protected]>
- update controller & crd generation Signed-off-by: Alexandre Lamarre <[email protected]>
- Run go generate Signed-off-by: Alexandre Lamarre <[email protected]>
- update helm-project-operator with new generated code Signed-off-by: Alexandre Lamarre <[email protected]>
- fix crd generation path Signed-off-by: Alexandre Lamarre <[email protected]>
- tweak test labels Signed-off-by: Alexandre Lamarre <[email protected]>
- run go generate Signed-off-by: Alexandre Lamarre <[email protected]>
- update images to go 1.23 Signed-off-by: Alexandre Lamarre <[email protected]>
- simplify dev scripts Signed-off-by: Alexandre Lamarre <[email protected]>
- remove dapper from Makefile Signed-off-by: Alexandre Lamarre <[email protected]>
- [WIP] simplify CI Signed-off-by: Alexandre Lamarre <[email protected]>
- vendor chart data in public package Signed-off-by: Alexandre Lamarre <[email protected]>
- update integration workflow again Signed-off-by: Alexandre Lamarre <[email protected]>
- use upstream helm during image builds Signed-off-by: Alexandre Lamarre <[email protected]>
- [WIP] importing images for integration CI Signed-off-by: Alexandre Lamarre <[email protected]>
- fix integration testing Signed-off-by: Alexandre Lamarre <[email protected]>
- [temp] disable validate-ci Signed-off-by: Alexandre Lamarre <[email protected]>
- lint fixes Signed-off-by: Alexandre Lamarre <[email protected]>
- Add back deprecated flag logic Signed-off-by: Alexandre Lamarre <[email protected]>
- remove unecessary CI stuff Signed-off-by: Alexandre Lamarre <[email protected]>
- update helm-project-operator build-chart destination Signed-off-by: Alexandre Lamarre <[email protected]>
- make crd management consistent with existing flags and add tests Signed-off-by: Alexandre Lamarre <[email protected]>
- update unit test command to exclude integration tests Signed-off-by: Alexandre Lamarre <[email protected]>
- Add helm-project-operator chart to examples dir Signed-off-by: Alexandre Lamarre <[email protected]>
- update path for HPO chart in integration tests Signed-off-by: Alexandre Lamarre <[email protected]>
- revert public pkg/chart package Signed-off-by: Alexandre Lamarre <[email protected]>
- vendor correct helm-controller dependencies Signed-off-by: Alexandre Lamarre <[email protected]>
- update wrangler/go toolchain Signed-off-by: Alexandre Lamarre <[email protected]>
- remove helm-locker e2e CI Signed-off-by: Alexandre Lamarre <[email protected]>
- bump rancher/klipper-helm to v0.9.4-build20250113 Signed-off-by: Alexandre Lamarre <[email protected]>
- add KUBECONFIG to version script Signed-off-by: Alexandre Lamarre <[email protected]>
- add back embedded controller name logic Signed-off-by: Alexandre Lamarre <[email protected]>
- Empty-Commit
- rename internal/helm-locker/pkg -> internal/helm-locker Signed-off-by: Alexandre Lamarre <[email protected]>
- rename internal/helm-project-operator/pkg -> internal/helm-project-operator Signed-off-by: Alexandre Lamarre <[email protected]>
- centralize cmd version Signed-off-by: Alexandre Lamarre <[email protected]>
- update golangci-lint config Signed-off-by: Alexandre Lamarre <[email protected]>
- bump integration test k3s matrix [1.30.9, 1.32.1] Signed-off-by: Alexandre Lamarre <[email protected]>
- feat: Update Rancher Project Monitoring to use upstream 66.7.1 (#173)
  * Update Rancher Project Monitoring to use 0.5.0 chart version that matches Monitoring 66.7.1
  * go generate
  * bump RPM version for small version fix
  * go generate
  * fix local-e2e bug
  * fix ci artifact file name
  * (test) expand CI wait time
  * fix: consistently use KUBECTL_WAIT_TIMEOUT
  * Make install line more readable
  * Every supported k3s/rke2 versions should have it disabled
  * make all helm install steps uniform
  * Capture more details about Project Monitoring install task
  * expand timeouts for scripts
  * Update shell for kuberlr-kubectl
  * Adjust alerts validation script
- Add initial Renovate configuration (#149) Co-authored-by: renovate-rancher[bot] <119870437+renovate-rancher[bot]@users.noreply.github.com>
- Pin dependencies
- Migrate config .github/renovate.json
- remove dapper Signed-off-by: Alexandre Lamarre <[email protected]>
- Update Docker File Deps
- manually downgrade to go 1.23 Signed-off-by: Alexandre Lamarre <[email protected]>
- ci: Improve ability to use CI in forks & fix multiple tag bug (#189)
  * Add empty for debug
  * fix all runs-on to be dynamic
  * only read vault secrets when repo matches rancher/prom-fed
  * Install YQ only in GHA runners where YQ isn't available
  * Ensure package-helm uses the current tag for chart versioning
  * fix runs-on format to respect matrix.arch
- Adjust Alertmanager E2E verification to allow common alerts (#190)
  * Adjust Alertmanager E2E verification to allow InfoInhibitor & PrometheusOutOfOrderTimestamps
  * Make helm repo url configurable
  * Simplify checks to verify based on count vs (count - expected)
1 parent 57444ba commit ff8fcb3
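
Several commits above rewrite e2e scripts as timeout-based loops; the Alertmanager and Grafana validation diffs further down show the result. A minimal sketch of that pattern, assuming the same KUBECTL_WAIT_TIMEOUT and DEFAULT_SLEEP_TIMEOUT_SECONDS environment variables the scripts use; check_ready is a hypothetical probe added only for illustration:

#!/usr/bin/env bash
# Sketch of the timeout-based wait loop the rewritten e2e scripts use.
# Assumes KUBECTL_WAIT_TIMEOUT looks like "300s" and DEFAULT_SLEEP_TIMEOUT_SECONDS is set.
check_ready() { kubectl get ns cattle-monitoring-system >/dev/null 2>&1; }  # hypothetical probe

WAIT_TIMEOUT="${KUBECTL_WAIT_TIMEOUT%s}"   # strip trailing "s" -> plain seconds
START_TIME=$(date +%s)
while true; do
  ELAPSED_TIME=$(( $(date +%s) - START_TIME ))
  if [[ $ELAPSED_TIME -ge $WAIT_TIMEOUT ]]; then
    echo "ERROR: Timeout reached, condition not met."
    exit 1
  fi
  if check_ready; then
    break
  fi
  echo "Retrying in ${DEFAULT_SLEEP_TIMEOUT_SECONDS} seconds..."
  sleep "${DEFAULT_SLEEP_TIMEOUT_SECONDS}"
done
echo "PASS: condition met after ${ELAPSED_TIME}s"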

File tree

170 files changed: +5767 −4809 lines changed


.github/renovate.json

+1 −1

@@ -15,7 +15,7 @@
     "dockerfile",
     "github-actions",
     "helm-values",
-    "regex"
+    "custom.regex"
   ],
   "packageRules": [
     {
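
A quick way to sanity-check an edit like this locally is Renovate's bundled config validator; the invocation below is a sketch and assumes a Node.js toolchain is available on PATH:

# Validate the Renovate config without running a full Renovate job.
npx --yes --package renovate -- renovate-config-validator .github/renovate.json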

.github/scripts/branch-tags.sh

-55
This file was deleted.

.github/workflows/prom-fed-ci.yaml → .github/workflows/ci.yaml (renamed)

+15 −4

@@ -33,18 +33,29 @@ jobs:
       arch:
         - x64
         - arm64
-    runs-on : runs-on,image=ubuntu22-full-${{ matrix.arch }},runner=4cpu-linux-${{ matrix.arch }},run-id=${{ github.run_id }}
+    runs-on: ${{ github.repository == 'rancher/prometheus-federator' && format('runs-on,image=ubuntu22-full-{1},runner=4cpu-linux-{1},run-id={0}', github.run_id, matrix.arch) || 'ubuntu-latest' }}
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
       - name : Set up Go
-        uses: actions/setup-go@v2
+        uses: actions/setup-go@bfdd3570ce990073878bf10f6b2d79082de49492 # v2
         with:
           go-version: '1.22'
+      - name: Check if yq is installed
+        id: check_yq
+        run: |
+          if ! command -v yq &> /dev/null; then
+            echo "yq not found, installing..."
+            echo "::set-output name=install_yq::true"
+          else
+            echo "yq is already installed"
+            echo "::set-output name=install_yq::false"
+          fi
       - name : Install YQ
+        if: steps.check_yq.outputs.install_yq == 'true'
         run: |
           sudo wget https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_${{ matrix.arch == 'x64' && 'amd64' || matrix.arch }} -O /usr/bin/yq && sudo chmod +x /usr/bin/yq;
       - name : Install helm
-        uses: azure/setup-helm@v3
+        uses: azure/setup-helm@5119fcb9089d432beecbf79bb2c7915207344b78 # v3
         with:
           token: ${{ secrets.GITHUB_TOKEN }}
       - name: Run CI
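
The new "Check if yq is installed" step reports its result through ::set-output, which GitHub Actions has since deprecated in favor of appending to the $GITHUB_OUTPUT file. A sketch of the same shell check written against $GITHUB_OUTPUT; only the output mechanism differs, the step logic is unchanged:

# Same yq presence check, emitting to $GITHUB_OUTPUT instead of ::set-output.
# Assumes it runs inside a GitHub Actions `run:` step where GITHUB_OUTPUT is set.
if ! command -v yq &> /dev/null; then
  echo "yq not found, installing..."
  echo "install_yq=true" >> "$GITHUB_OUTPUT"
else
  echo "yq is already installed"
  echo "install_yq=false" >> "$GITHUB_OUTPUT"
fi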

.github/workflows/e2e/package/Dockerfile

+2 −2

@@ -1,9 +1,9 @@
-FROM registry.suse.com/bci/golang:1.20 AS helm
+FROM registry.suse.com/bci/golang:1.23 AS helm
 RUN zypper -n install git
 RUN git -C / clone --branch release-v3.9.0 --depth=1 https://github.com/rancher/helm
 RUN make -C /helm

-FROM registry.suse.com/bci/golang:1.20
+FROM registry.suse.com/bci/golang:1.23

 ARG ARCH=amd64
 ENV KUBECTL_VERSION v1.21.8

.github/workflows/e2e/scripts/cluster-args.sh

+2 −44

@@ -8,55 +8,13 @@ cd $(dirname $0)/../../../..

 case "${KUBERNETES_DISTRIBUTION_TYPE}" in
   "k3s")
-    cluster_args=""
-    kubernetes_version=$(kubectl version | grep "Server Version" | cut -d ' ' -f3)
-    case "${kubernetes_version}" in
-      v1.23.*)
-        embedded_helm_controller_fixed_version="v1.23.14"
-        if [[ $(echo ${kubernetes_version} ${embedded_helm_controller_fixed_version} | tr " " "\n" | sort -rV | head -n 1 ) == "${embedded_helm_controller_fixed_version}" ]]; then
-          cluster_args="--set helmProjectOperator.helmController.enabled=false"
-        fi
-        ;;
-      v1.24.*)
-        embedded_helm_controller_fixed_version="v1.24.8"
-        if [[ $(echo ${kubernetes_version} ${embedded_helm_controller_fixed_version} | tr " " "\n" | sort -rV | head -n 1 ) == "${embedded_helm_controller_fixed_version}" ]]; then
-          cluster_args="--set helmProjectOperator.helmController.enabled=false"
-        fi
-        ;;
-      v1.25.*)
-        embedded_helm_controller_fixed_version="v1.25.4"
-        if [[ $(echo ${kubernetes_version} ${embedded_helm_controller_fixed_version} | tr " " "\n" | sort -rV | head -n 1 ) == "${embedded_helm_controller_fixed_version}" ]]; then
-          cluster_args="--set helmProjectOperator.helmController.enabled=false"
-        fi
-        ;;
-    esac
+    cluster_args="--set helmProjectOperator.helmController.enabled=false"
     ;;
   "rke")
     cluster_args=""
     ;;
   "rke2")
-    cluster_args=""
-    kubernetes_version=$(kubectl version | grep "Server Version" | cut -d ' ' -f3)
-    case "${kubernetes_version}" in
-      v1.23.*)
-        embedded_helm_controller_fixed_version="v1.23.14"
-        if [[ $(echo ${kubernetes_version} ${embedded_helm_controller_fixed_version} | tr " " "\n" | sort -rV | head -n 1 ) == "${embedded_helm_controller_fixed_version}" ]]; then
-          cluster_args="--set helmProjectOperator.helmController.enabled=false"
-        fi
-        ;;
-      v1.24.*)
-        embedded_helm_controller_fixed_version="v1.24.8"
-        if [[ $(echo ${kubernetes_version} ${embedded_helm_controller_fixed_version} | tr " " "\n" | sort -rV | head -n 1 ) == "${embedded_helm_controller_fixed_version}" ]]; then
-          cluster_args="--set helmProjectOperator.helmController.enabled=false"
-        fi
-        ;;
-      v1.25.*)
-        embedded_helm_controller_fixed_version="v1.25.4"
-        if [[ $(echo ${kubernetes_version} ${embedded_helm_controller_fixed_version} | tr " " "\n" | sort -rV | head -n 1 ) == "${embedded_helm_controller_fixed_version}" ]]; then
-          cluster_args="--set helmProjectOperator.helmController.enabled=false"
-        fi
-        ;;
-    esac
+    cluster_args="--set helmProjectOperator.helmController.enabled=false"
     ;;
   *)
     echo "KUBERNETES_DISTRIBUTION_TYPE=${KUBERNETES_DISTRIBUTION_TYPE} is unknown"

.github/workflows/e2e/scripts/create-projecthelmchart.sh

+1 −1

@@ -11,7 +11,7 @@ if [[ "${E2E_CI}" == "true" ]]; then
 else
   kubectl apply -f ./examples/prometheus-federator/project-helm-chart.yaml
 fi
-sleep ${DEFAULT_SLEEP_TIMEOUT_SECONDS};
+sleep "${DEFAULT_SLEEP_TIMEOUT_SECONDS}";

 if ! kubectl get -n cattle-monitoring-system job/helm-install-cattle-project-p-example-monitoring; then
   echo "ERROR: Helm Install Job for Project Monitoring Stack was never created after ${DEFAULT_SLEEP_TIMEOUT_SECONDS} seconds"

.github/workflows/e2e/scripts/generate-artifacts.sh

+9

@@ -34,6 +34,7 @@ case "${KUBERNETES_DISTRIBUTION_TYPE}" in
 esac

 ARTIFACT_DIRECTORY=artifacts
+DESCRIBE_DIRECTORY=${ARTIFACT_DIRECTORY}/described
 MANIFEST_DIRECTORY=${ARTIFACT_DIRECTORY}/manifests
 LOG_DIRECTORY=${ARTIFACT_DIRECTORY}/logs

@@ -114,3 +115,11 @@ kubectl logs deployment/cattle-project-p-example-monitoring-grafana -n cattle-pr
 kubectl logs deployment/cattle-project-p-example-monitoring-grafana -n cattle-project-p-example -c grafana-sc-dashboard > ${LOG_DIRECTORY}/project-monitoring/grafana_sc_dashboard.log || true
 kubectl logs deployment/cattle-project-p-example-monitoring-grafana -n cattle-project-p-example -c grafana-sc-datasources > ${LOG_DIRECTORY}/project-monitoring/grafana_sc_datasources.log || true
 kubectl logs deployment/cattle-project-p-example-monitoring-grafana -n cattle-project-p-example -c grafana-init-sc-datasources > ${LOG_DIRECTORY}/project-monitoring/grafana_init_sc_datasources.log || true
+
+# Resource Descriptions
+
+mkdir -p ${DESCRIBE_DIRECTORY}
+
+## Additional Context
+kubectl describe jobs -n cattle-monitoring-system helm-install-cattle-project-p-example-monitoring > ${DESCRIBE_DIRECTORY}/project-monitoring-helm-install-job.log
+kubectl describe pods -n cattle-monitoring-system -l job-name=helm-install-cattle-project-p-example-monitoring > ${DESCRIBE_DIRECTORY}/project-monitoring-helm-install-pod.log

.github/workflows/e2e/scripts/install-federator.sh

+6 −5

@@ -8,11 +8,12 @@ source $(dirname $0)/cluster-args.sh
 cd $(dirname $0)/../../../..
 source "$(pwd)/scripts/util-team-charts"

-NEWEST_CHART_VERSION=$(newest-chart-version "prometheus-federator")
-fetch-team-chart "prometheus-federator" "$NEWEST_CHART_VERSION"
-LATEST_CHART_PATH="./build/charts/prometheus-federator-${NEWEST_CHART_VERSION}.tgz"
-tar -xvzf "$LATEST_CHART_PATH" -C ./build/charts/
+make package-helm

-helm upgrade --install --create-namespace -n cattle-monitoring-system prometheus-federator --set helmProjectOperator.image.repository=${REPO:-rancher}/prometheus-federator --set helmProjectOperator.image.tag=${TAG:-dev} ${cluster_args} ${RANCHER_HELM_ARGS} ./build/charts/prometheus-federator
+helm upgrade --install --create-namespace -n cattle-monitoring-system prometheus-federator \
+  --set helmProjectOperator.image.repository=${REPO:-rancher}/prometheus-federator \
+  --set helmProjectOperator.image.tag=${TAG:-dev} \
+  ${cluster_args} \
+  ${RANCHER_HELM_ARGS} ./build/charts/prometheus-federator

 echo "PASS: Prometheus Federator has been installed"

.github/workflows/e2e/scripts/install-monitoring.sh

+2 −1

@@ -5,12 +5,13 @@ set -x
 source $(dirname $0)/entry

 HELM_REPO="rancher-charts"
+HELM_REPO_URL="https://charts.rancher.io"

 cd $(dirname $0)/../../../..

 helm version

-helm repo add ${HELM_REPO} https://charts.rancher.io
+helm repo add ${HELM_REPO} $HELM_REPO_URL
 helm repo update

 echo "Create required \`cattle-fleet-system\` namespace"

.github/workflows/e2e/scripts/validate-project-alertmanager.sh

+53 −20

@@ -10,27 +10,60 @@ tmp_alerts_yaml=$(mktemp)
 trap 'cleanup' EXIT
 cleanup() {
   set +e
-  rm ${tmp_alerts_yaml}
+  rm "${tmp_alerts_yaml}"
 }

-if [[ -z "${RANCHER_TOKEN}" ]]; then
-  curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-m-alertmanager:9093/proxy/api/v2/alerts | yq -P - > ${tmp_alerts_yaml}
-else
-  curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-m-alertmanager:9093/proxy/api/v2/alerts -k -H "Authorization: Bearer ${RANCHER_TOKEN}" | yq -P - > ${tmp_alerts_yaml}
-fi
-
-if [[ $(yq '. | length' "${tmp_alerts_yaml}") != "1" ]]; then
-  echo "ERROR: Found the wrong number of alerts in Project Alertmanager, expected only 'Watchdog'"
-  cat ${tmp_alerts_yaml}
-  exit 1
-fi
-
-if [[ $(yq '.[0].labels.alertname' "${tmp_alerts_yaml}") != "Watchdog" ]]; then
-  echo "ERROR: Expected the only alert to be triggered on the Project Alertmanager to be 'Watchdog'"
-  cat ${tmp_alerts_yaml}
-  exit 1
-fi
-
-cat ${tmp_alerts_yaml}
+checkData() {
+  if [[ -z "${RANCHER_TOKEN}" ]]; then
+    curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-m-alertmanager:9093/proxy/api/v2/alerts | yq -P - > "${tmp_alerts_yaml}"
+  else
+    curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-m-alertmanager:9093/proxy/api/v2/alerts -k -H "Authorization: Bearer ${RANCHER_TOKEN}" | yq -P - > "${tmp_alerts_yaml}"
+  fi
+}
+
+WAIT_TIMEOUT="${KUBECTL_WAIT_TIMEOUT%s}"
+START_TIME=$(date +%s)
+while true; do
+  checkData
+  CHECKS_PASSED=0
+
+  # Check if timeout has been reached
+  CURRENT_TIME=$(date +%s)
+  ELAPSED_TIME=$((CURRENT_TIME - START_TIME))
+  if [[ $ELAPSED_TIME -ge $WAIT_TIMEOUT ]]; then
+    echo "ERROR: Timeout reached, condition not met."
+    exit 1
+  fi
+
+  ALERT_COUNT=$(yq '. | length' "${tmp_alerts_yaml}")
+  if [[ $ALERT_COUNT -gt 3 ]]; then
+    echo "ERROR: Found too many alerts in Project Alertmanager. Expected at most: 'Watchdog', 'InfoInhibitor' and/or 'PrometheusOutOfOrderTimestamps'."
+    cat "${tmp_alerts_yaml}"
+
+    echo "Retrying in $DEFAULT_SLEEP_TIMEOUT_SECONDS seconds..."
+    sleep "$DEFAULT_SLEEP_TIMEOUT_SECONDS"
+    continue
+  fi
+  CHECKS_PASSED=$((CHECKS_PASSED+1))
+
+  UNEXPECTED_COUNT=$(yq '[.[] | select(.labels.alertname != "Watchdog" and .labels.alertname != "InfoInhibitor" and .labels.alertname != "PrometheusOutOfOrderTimestamps")] | length' "${tmp_alerts_yaml}")
+  if [[ $UNEXPECTED_COUNT -gt 0 ]]; then
+    echo "ERROR: Unexpected alert(s) found in active alerts list."
+    cat "${tmp_alerts_yaml}"
+
+    echo "Retrying in $DEFAULT_SLEEP_TIMEOUT_SECONDS seconds..."
+    sleep "$DEFAULT_SLEEP_TIMEOUT_SECONDS"
+    continue
+  fi
+  CHECKS_PASSED=$((CHECKS_PASSED+1))
+
+  if [[ $CHECKS_PASSED -eq 2 ]];then
+    # Get final elapsed time
+    ELAPSED_TIME=$((CURRENT_TIME - START_TIME))
+    break
+  fi
+done
+
+cat "${tmp_alerts_yaml}"

 echo "PASS: Project Alertmanager is up and running"

.github/workflows/e2e/scripts/validate-project-grafana-dashboards.sh

+43 −17

@@ -13,11 +13,13 @@ cleanup() {
   rm ${tmp_dashboards_yaml}
 }

-if [[ -z "${RANCHER_TOKEN}" ]]; then
-  curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-monitoring-grafana:80/proxy/api/search | yq -P - > ${tmp_dashboards_yaml}
-else
-  curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-monitoring-grafana:80/proxy/api/search -k -H "Authorization: Bearer ${RANCHER_TOKEN}" | yq -P - > ${tmp_dashboards_yaml}
-fi
+checkData() {
+  if [[ -z "${RANCHER_TOKEN}" ]]; then
+    curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-monitoring-grafana:80/proxy/api/search | yq -P - > ${tmp_dashboards_yaml}
+  else
+    curl -s ${API_SERVER_URL}/api/v1/namespaces/cattle-project-p-example/services/http:cattle-project-p-example-monitoring-grafana:80/proxy/api/search -k -H "Authorization: Bearer ${RANCHER_TOKEN}" | yq -P - > ${tmp_dashboards_yaml}
+  fi
+}

 expected_dashboards=(
   db/alertmanager-overview
@@ -41,18 +43,42 @@ expected_dashboards=(
   db/rancher-workload-pods
 );

-if [[ $(yq '.[].uri' ${tmp_dashboards_yaml} | wc -l | xargs) != "${#expected_dashboards[@]}" ]]; then
-  echo "ERROR: Found the wrong number of dashboards in Project Grafana, expected only the following: ${expected_dashboards[@]}"
-  cat ${tmp_dashboards_yaml}
-  exit 1
-fi
-
-for dashboard in "${expected_dashboards[@]}"; do
-  if ! yq '.[].uri' ${tmp_dashboards_yaml} | grep "${dashboard}" 1>/dev/null; then
-    echo "ERROR: Expected '${dashboard}' to exist amongst ${#expected_dashboards[@]} dashboards in Project Grafana"
-    cat ${tmp_dashboards_yaml}
-    exit 1
-  fi
+WAIT_TIMEOUT="${KUBECTL_WAIT_TIMEOUT%s}"
+START_TIME=$(date +%s)
+while true; do
+  checkData
+
+  # Check if timeout has been reached
+  CURRENT_TIME=$(date +%s)
+  ELAPSED_TIME=$((CURRENT_TIME - START_TIME))
+  if [[ $ELAPSED_TIME -ge $WAIT_TIMEOUT ]]; then
+    echo "ERROR: Timeout reached, condition not met."
+    exit 1
+  fi
+
+  if [[ $(yq '.[].uri' ${tmp_dashboards_yaml} | wc -l | xargs) != "${#expected_dashboards[@]}" ]]; then
+    echo "Retrying in $DEFAULT_SLEEP_TIMEOUT_SECONDS seconds..."
+    sleep "$DEFAULT_SLEEP_TIMEOUT_SECONDS"
+    continue
+  fi
+
+  FOUND_DASHBOARDS=0
+  for dashboard in "${expected_dashboards[@]}"; do
+    if ! yq '.[].uri' ${tmp_dashboards_yaml} | grep "${dashboard}" 1>/dev/null; then
+      echo "ERROR: Expected '${dashboard}' to exist amongst ${#expected_dashboards[@]} dashboards in Project Grafana"
+      cat ${tmp_dashboards_yaml}
+      echo "Retrying in $DEFAULT_SLEEP_TIMEOUT_SECONDS seconds..."
+      sleep "$DEFAULT_SLEEP_TIMEOUT_SECONDS"
+      break
+    fi
+    FOUND_DASHBOARDS=$((FOUND_DASHBOARDS+1))
+  done
+
+  if [[ FOUND_DASHBOARDS -eq 19 ]];then
+    # Get final elapsed time
+    ELAPSED_TIME=$((CURRENT_TIME - START_TIME))
+    break
+  fi
 done

 cat ${tmp_dashboards_yaml}
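
The dashboard check follows the same count-then-grep shape. A short sketch against a fabricated /api/search response, again assuming yq v4; the file path and entries are sample data only:

# Fabricated Grafana search response for illustration.
cat > /tmp/sample-dashboards.yaml <<'EOF'
- uri: db/alertmanager-overview
- uri: db/rancher-workload-pods
EOF

yq '.[].uri' /tmp/sample-dashboards.yaml | wc -l | xargs
# -> 2  (compared against ${#expected_dashboards[@]} in the script)

yq '.[].uri' /tmp/sample-dashboards.yaml | grep "db/alertmanager-overview"
# -> db/alertmanager-overview  (a non-empty match means the dashboard is present)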
