OCPBUGS-34950: Fix OpenStack infrastructure bootstrap issues #10148
|
/assign @stephenfin Looks like there are some linter issues to address here? |
|
/retest |
|
/retest |
stephenfin
left a comment
This looks great. I have left a lot of comments, but most of them are nits: feel free to ignore them if you disagree. I think the main meaningful changes are the use of net/http consts and the handling of FIP/SG deletion when there is more than one and you get errors.
|
/retitle OCPBUGS-34950: Fix OpenStack infrastructure bootstrap issues |
|
@eshulman2: This pull request references Jira Issue OCPBUGS-34950, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
I believe I addressed all concerns and comments and also improved the error handling and logging :) |
c9ee29b to
7b9ece6
Compare
|
/jira refresh |
|
@eshulman2: This pull request references Jira Issue OCPBUGS-34950, which is invalid:
Comment In response to this:
|
/jira refresh |
|
@eshulman2: This pull request references Jira Issue OCPBUGS-34950, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request. In response to this:
|
/retest |
Fixes an issue where bootstrap machine logs cannot be collected when installation fails: the bootstrap VM uses the master security group, which doesn't allow SSH access from the installation source address. This prevents gathering the diagnostic information needed to troubleshoot failed installations.
- Created a dedicated bootstrap security group with SSH access from anywhere
- Tagged it with `openshiftRole=bootstrap` for lifecycle management
- Enables SSH access to the bootstrap VM for log collection on failure
- Implemented a PostDestroyer for the OpenStack bootstrap VM
Added cleanup of the bootstrap VM FIP to the installer as part of the PostDestroyer, so that FIP creation and deletion are both handled by the installer and no FIPs are orphaned.
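As an illustration of the dedicated security group described in this commit, here is a minimal sketch using a simplified local model rather than the real OpenStack client types. The `sgRule` and `securityGroup` types, the `-bootstrap` name suffix, and the use of `0.0.0.0/0` for "anywhere" (IPv4 only) are assumptions made for the example; only the two tags come from the PR text.

```go
package main

// sgRule is a hypothetical, simplified description of a security group rule.
type sgRule struct {
	Direction  string
	Protocol   string
	PortMin    int
	PortMax    int
	RemoteCIDR string
}

// securityGroup is a hypothetical model of a tagged security group.
type securityGroup struct {
	Name  string
	Tags  []string
	Rules []sgRule
}

// bootstrapSecurityGroup describes a group that only opens SSH (TCP 22)
// to the bootstrap VM and carries the tags used to find it again later.
func bootstrapSecurityGroup(infraID string) securityGroup {
	return securityGroup{
		Name: infraID + "-bootstrap", // assumed naming scheme
		Tags: []string{
			"openshiftClusterID=" + infraID,
			"openshiftRole=bootstrap",
		},
		Rules: []sgRule{{
			Direction:  "ingress",
			Protocol:   "tcp",
			PortMin:    22,
			PortMax:    22,
			RemoteCIDR: "0.0.0.0/0", // assumed meaning of "anywhere" (IPv4)
		}},
	}
}
```

Keeping the SSH rule in a separate, tagged group means it can be removed together with the bootstrap VM without touching the master security group.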
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: stephenfin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
mandre
left a comment
/lgtm
/label acknowledge-critical-fixes-only
/verified by CI
/hold cancel
|
/verified remove |
|
/verified later |
|
@mandre: This PR does not have In response to this:
|
@mandre: In response to this:
|
/verified later @imatza-rh |
|
@mandre: This PR has been marked to be verified later by In response to this:
|
@eshulman2: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@eshulman2: Jira Issue OCPBUGS-34950: All pull requests linked via external trackers have merged: This pull request has the In response to this:
This commit addresses two related issues in OpenStack IPI installations:
Problem 1: Bootstrap Log Collection
Bootstrap machine logs cannot be collected when installation fails because
the bootstrap VM uses the master security group, which doesn't allow SSH
access from the installation source address. This prevents gathering
diagnostic information needed to troubleshoot failed installations.
Problem 2: Orphaned Bootstrap Resources
During investigation of bootstrap resource lifecycle, discovered that CAPO
deletes floating IPs instead of disassociating them when removing control
plane machines. This contradicts the maintainer's design intent (see comment
at openstackmachine_controller.go:291) and causes issues for installer-managed
floating IPs.
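The difference between the two operations can be shown with a small in-memory model. This is only a sketch: `floatingIP`, `fipPool`, `disassociate` and `remove` are hypothetical names and do not correspond to CAPO or OpenStack client APIs.

```go
package main

import "fmt"

// floatingIP is a hypothetical, simplified floating IP record.
type floatingIP struct {
	ID     string
	PortID string // empty when the FIP is not attached to any port
}

// fipPool is an in-memory stand-in for the floating IPs of a project.
type fipPool map[string]*floatingIP

// disassociate detaches the FIP from its port but keeps the FIP itself,
// so whoever created it can still reuse or delete it later.
func (p fipPool) disassociate(id string) error {
	fip, ok := p[id]
	if !ok {
		return fmt.Errorf("floating IP %s not found", id)
	}
	fip.PortID = ""
	return nil
}

// remove deletes the FIP entirely; the address is released.
func (p fipPool) remove(id string) {
	delete(p, id)
}
```

After `disassociate` the record still exists with an empty `PortID`, whereas after `remove` it is gone, which is the behavior that causes trouble for floating IPs the installer expects to manage itself.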
Solution
1. Bootstrap Security Group
openshiftRole=bootstrap for lifecycle management
2. PostDestroyer Implementation
openshiftClusterID={infraID} AND openshiftRole=bootstrap
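The tag pair above suggests selecting only resources that carry both tags. A minimal sketch of that AND-match follows, assuming tags are stored as plain `key=value` strings; the `resource` type and both function names are hypothetical and not taken from the installer.

```go
package main

// resource is a hypothetical, simplified view of a tagged OpenStack resource.
type resource struct {
	ID   string
	Tags []string
}

// hasAllTags reports whether r carries every tag in want.
func hasAllTags(r resource, want ...string) bool {
	present := make(map[string]bool, len(r.Tags))
	for _, t := range r.Tags {
		present[t] = true
	}
	for _, w := range want {
		if !present[w] {
			return false
		}
	}
	return true
}

// bootstrapResources keeps only resources tagged with both the cluster ID
// and the bootstrap role, so masters and other clusters are left alone.
func bootstrapResources(all []resource, infraID string) []resource {
	var out []resource
	for _, r := range all {
		if hasAllTags(r, "openshiftClusterID="+infraID, "openshiftRole=bootstrap") {
			out = append(out, r)
		}
	}
	return out
}
```

Requiring both tags is what keeps the cleanup scoped: the cluster ID alone would also match master resources, and the role alone would match bootstrap resources of other clusters.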