WIP Minimise Baremetal footprint enhancement
Work on defining the ideas around 3-master deployment on baremetal where we wish to avoid the install-time dependency for a 4th host either in the rack or connected directly to it.
Steven Hardy committed Jun 5, 2020 · commit 79c83f2 (1 parent: f681a13)
Showing 1 changed file with 143 additions and 0 deletions.
---
title: minimise-baremetal-footprint
authors:
- "@hardys"
reviewers:
- "@avishayt"
- "@beekhof"
- "@crawford"
- "@deads2k"
- "@dhellmann"
- "@hexfusion"
- "@mhrivnak"
approvers:
- "@crawford"
creation-date: "2020-06-04"
last-updated: "2020-06-05"
status: implementable
see-also: compact-clusters
replaces:
superseded-by:
---

# Minimise Baremetal footprint

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
## Summary

Over recent releases OpenShift has improved support for small-footprint
deployments, in particular with the compact-clusters enhancement, which adds
full support for 3-node clusters where the masters are schedulable.
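
As a sketch of what such a compact deployment looks like from the installer's
perspective, an `install-config.yaml` requests three control-plane replicas and
zero workers (values below are illustrative, and the baremetal host entries are
elided):

```yaml
apiVersion: v1
baseDomain: example.com
metadata:
  name: poc-cluster
compute:
- name: worker
  replicas: 0        # no dedicated workers; masters are schedulable
controlPlane:
  name: master
  replicas: 3        # the compact-clusters 3-node topology
platform:
  baremetal:
    hosts: []        # the 3 master host entries go here (illustrative)
pullSecret: '...'
sshKey: '...'
```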

This is a particularly useful deployment option for baremetal PoC environments,
where the amount of physical hardware is often limited, but it leaves the
problem of where to run the installer and bootstrap VM.

The current solution for IPI baremetal is to require a 4th bootstrap host:
a machine physically connected to the 3 master nodes that runs the
installer and/or the bootstrap VM. This effectively makes the minimum
footprint 4 nodes, unless you can temporarily connect a provisioning host
to the cluster machines.

A similar constraint exists for UPI baremetal deployments, where although a
3-master cluster is possible, you need to run a 4th bootstrap node somewhere
for the duration of the initial installation.
## Motivation

This proposal outlines a potential approach to avoid the requirement for a
4th node, leveraging the recent etcd-operator improvements and the work to
enable a live-ISO replacement for the bootstrap VM.

### Goals

* Enable compact clusters to be deployed on baremetal with exactly 3 nodes
* Avoid the need for additional nodes to run install/bootstrap components

### Non-Goals

* Supporting any topology other than three masters for production clusters.
* Supporting deployment of a single master or scaling from such a deployment.
## Proposal

### User Stories

As a user of OpenShift, I should be able to install a 3-node cluster (no workers)
in baremetal environments, without the requirement to temporarily connect a 4th
node to host installer/bootstrap services. Instead I want to boot one of the
3 target nodes with an image which enables installation to proceed, and the
end state should be a fully supportable 3-node cluster.

### Risks and Mitigations

This proposal builds on work already completed, e.g. the etcd-operator
improvements, but we need to ensure any change in deployment topology is well
tested and fully supported, to avoid compact-baremetal deployments becoming an
unreliable corner case.
## Design Details

### Enabling three-node clusters on baremetal

OpenShift now provides a bootable RHCOS-based installer ISO image, which can
be booted on baremetal and adapted to install the components normally
deployed on the bootstrap VM [TODO reference to PoC @avishayt]

This means we can run the bootstrap services in-place on one of the target
hosts, which we can later reboot to become a master (referred to as master-0
below).

While master-0 is running the bootstrap services, the two additional hosts
are provisioned, either with a UPI-like boot-it-yourself method, or via a
variation on the current IPI flow where the provisioning components run on
master-0 alongside the bootstrap services (exactly as we do today on the
bootstrap VM).

Once the two masters have deployed, they form the initial OpenShift
controlplane, and master-0 then reboots to become a regular master. At this
point it joins the cluster, bootstrapping is complete, and the result is a
full-HA 3-master deployment without any dependency on a 4th provisioning host.

TODO - detailed flow derived from assisted-install planning docs?
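
The flow above could be sketched end-to-end roughly as follows; the exact
commands depend on the live-ISO bootstrap work referenced above, so treat this
as an illustrative sketch of the intended UX, not a committed interface:

```shell
# Illustrative flow only - the live-ISO bootstrap UX is still being defined.

# Generate the cluster ignition configs as usual
openshift-install create ignition-configs --dir=ocp

# Embed the bootstrap Ignition config into the RHCOS live ISO
coreos-installer iso ignition embed -i ocp/bootstrap.ign \
    -o bootstrap-live.iso rhcos-live-x86_64.iso

# Boot master-0 from bootstrap-live.iso: it runs the bootstrap services
# (and optionally the IPI provisioning components) in memory while
# master-1 and master-2 are provisioned and form the initial controlplane.

# Finally, master-0 writes RHCOS to disk with its master ignition config
# and reboots to join the cluster as the third master.
```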
### Test Plan

We should test in baremetal (or emulated baremetal) environments with 3-node
clusters on machines that represent our minimum target, and ensure our e2e
tests operate reliably with this new topology.

### Graduation Criteria

TODO

### Upgrade / Downgrade Strategy

This is an install-time variation, so there is no upgrade/downgrade impact.

## Implementation History

TODO links to existing PoC code/docs/demos

## Drawbacks

The main drawback of this approach is that it requires a deployment topology
and controlplane scaling which is unlikely to be adopted by any of the existing
cloud platforms; it thus moves away from the well-tested path and increases the
risk of regressions and corner cases not covered by existing platform testing.

## Alternatives

One possible alternative is to have master-0 deploy a single-node controlplane,
then provision the remaining two hosts. This idea has been rejected, as it is
likely riskier to scale from 1->3 masters than to establish initial quorum with
a 2-node controlplane, which should be similar to the degraded mode when any
master fails in an HA deployment, and thus a more supportable scenario.
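
For context, the quorum arithmetic behind this reasoning (standard Raft
majority behaviour in etcd, not anything specific to this proposal) can be
sketched as:

```python
def quorum(members: int) -> int:
    """Majority quorum size for an etcd cluster with `members` voting members."""
    return members // 2 + 1

def fault_tolerance(members: int) -> int:
    """How many members can fail while the cluster still retains quorum."""
    return members - quorum(members)

# A transient 2-member controlplane needs both members up (tolerance 0),
# which matches a 3-member cluster that has already lost one master,
# i.e. the familiar degraded-HA mode rather than a novel failure domain.
for n in (1, 2, 3):
    print(f"{n} members: quorum={quorum(n)}, tolerance={fault_tolerance(n)}")
```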