Metal³: Bare Metal Provisioning for Kubernetes
Originally posted at https://blog.russellbryant.net/post/2019/04/2019-04-30-metal-metal-kubed-bare-metal-provisioning-for-kubernetes/
Project Introduction
There are a number of great open-source tools for bare metal host provisioning, including Ironic. Metal³ aims to build on these technologies to provide a Kubernetes native API for managing bare metal hosts via a provisioning stack that is also running on Kubernetes. We believe that Kubernetes Native Infrastructure, or managing your infrastructure just like your applications, is a powerful next step in the evolution of infrastructure management.
The Metal³ project is also building integration with the Kubernetes cluster-api project, allowing Metal³ to be used as an infrastructure backend for Machine objects from the Cluster API.
Metal³ Repository Overview
There is a Metal³ overview and some more detailed design documents in the metal3-docs repository.
The baremetal-operator is the component that manages bare metal hosts. It exposes a new BareMetalHost custom resource in the Kubernetes API that lets you manage hosts in a declarative way.
Finally, the cluster-api-provider-baremetal repository provides the integration with the cluster-api project. This provider currently includes a Machine actuator that acts as a client of the BareMetalHost custom resources.
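If you have a cluster with these components deployed, a quick way to confirm the pieces are in place is to check for the custom resource definition registered by the baremetal-operator. This is just a sketch: the CRD name here is derived from the metalkube.org/v1alpha1 API group used in the examples below, so adjust it if your deployment uses a different group.
$ kubectl get crd baremetalhosts.metalkube.org
$ kubectl api-resources | grep -i baremetal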
Demo
The project has been going on for a few months now, and there is enough in place to show some working code.
For this demonstration, I’ve started with a 3-node Kubernetes cluster installed using OpenShift.
$ kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   24h   v1.13.4+d4ce02c1d
master-1   Ready    master   24h   v1.13.4+d4ce02c1d
master-2   Ready    master   24h   v1.13.4+d4ce02c1d
Machine objects were created to reflect these 3 masters, as well.
$ kubectl get machines
NAME              INSTANCE   STATE   TYPE   REGION   ZONE   AGE
ostest-master-0                                             24h
ostest-master-1                                             24h
ostest-master-2                                             24h
For this cluster-api provider, each Machine has a corresponding BareMetalHost object, which represents the piece of hardware being managed. There is a design document that covers the relationship between Nodes, Machines, and BareMetalHosts.
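The claim itself is recorded on the BareMetalHost. As a rough sketch, assuming the spec.consumerRef field used by the baremetal-operator (the field name may differ between versions), you can see which Machine owns a given host with:
$ kubectl get baremetalhost openshift-master-0 -o jsonpath='{.spec.consumerRef.name}'
which should print the owning Machine, ostest-master-0 in this environment. The MACHINE column in the host listings below surfaces the same relationship.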
Since these hosts were provisioned earlier, they are in a special externally provisioned state, indicating that we enrolled them in management while they were already running in a desired state. If changes are needed going forward, the baremetal-operator will be able to automate them.
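For reference, enrolling a host that is already running looks much like the BareMetalHost definition shown later in this post, with one extra flag telling the operator not to reprovision the OS. This is a minimal sketch, assuming the externallyProvisioned spec field supported by the baremetal-operator; the secret name here is illustrative.
apiVersion: metalkube.org/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-master-0
spec:
  online: true
  # assumed field name; marks the host as provisioned outside the operator
  externallyProvisioned: true
  bmc:
    address: ipmi://192.168.111.1:6230
    credentialsName: openshift-master-0-bmc-secret
The host listing shows all three masters in this state.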
$ kubectl get baremetalhosts
NAME                 STATUS   PROVISIONING STATUS      MACHINE                 BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ostest-master-0         ipmi://192.168.111.1:6230                      true
openshift-master-1   OK       externally provisioned   ostest-master-1         ipmi://192.168.111.1:6231                      true
openshift-master-2   OK       externally provisioned   ostest-master-2         ipmi://192.168.111.1:6232                      true
Now suppose we’d like to expand this cluster by adding another bare metal host to serve as a worker node. First, we need to create a new BareMetalHost object that adds this host to the inventory managed by the baremetal-operator. Here’s the YAML for the new BareMetalHost and its BMC credentials Secret:
---
apiVersion: v1
kind: Secret
metadata:
  name: openshift-worker-0-bmc-secret
type: Opaque
data:
  username: YWRtaW4=
  password: cGFzc3dvcmQ=
---
apiVersion: metalkube.org/v1alpha1
kind: BareMetalHost
metadata:
  name: openshift-worker-0
spec:
  online: true
  bmc:
    address: ipmi://192.168.111.1:6233
    credentialsName: openshift-worker-0-bmc-secret
  bootMACAddress: 00:ab:4f:d8:9e:fa
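The username and password in the Secret are base64 encoded, as Kubernetes requires for Secret data. The values above decode to admin and password and were generated like so:
$ echo -n 'admin' | base64
YWRtaW4=
$ echo -n 'password' | base64
cGFzc3dvcmQ=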
Now to add the BareMetalHost and its IPMI credentials Secret to the cluster:
$ kubectl create -f worker_crs.yaml
secret/openshift-worker-0-bmc-secret created
baremetalhost.metalkube.org/openshift-worker-0 created
The list of BareMetalHosts now reflects a new host in the inventory that is ready to be provisioned. It will remain in this ready state until it is claimed by a new Machine object.
$ kubectl get baremetalhosts
NAME                 STATUS   PROVISIONING STATUS      MACHINE                 BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ostest-master-0         ipmi://192.168.111.1:6230                      true
openshift-master-1   OK       externally provisioned   ostest-master-1         ipmi://192.168.111.1:6231                      true
openshift-master-2   OK       externally provisioned   ostest-master-2         ipmi://192.168.111.1:6232                      true
openshift-worker-0   OK       ready                                            ipmi://192.168.111.1:6233   unknown            true
We already have a MachineSet created for workers, but it is scaled down to 0.
$ kubectl get machinesets
NAME              DESIRED   CURRENT   READY   AVAILABLE   AGE
ostest-worker-0   0         0                             24h
We can scale this MachineSet to 1 to indicate that we’d like a worker provisioned. The baremetal cluster-api provider will then look for an available BareMetalHost, claim it, and trigger provisioning of that host.
$ kubectl scale machineset ostest-worker-0 --replicas=1
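While provisioning runs, you can watch progress from either side; kubectl’s -w flag streams updates as the objects change:
$ kubectl get machines
$ kubectl get baremetalhosts -w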
After the new Machine was created, our cluster-api provider claimed the available host and triggered it to be provisioned.
$ kubectl get baremetalhosts
NAME                 STATUS   PROVISIONING STATUS      MACHINE                 BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ostest-master-0         ipmi://192.168.111.1:6230                      true
openshift-master-1   OK       externally provisioned   ostest-master-1         ipmi://192.168.111.1:6231                      true
openshift-master-2   OK       externally provisioned   ostest-master-2         ipmi://192.168.111.1:6232                      true
openshift-worker-0   OK       provisioning             ostest-worker-0-jmhtc   ipmi://192.168.111.1:6233   unknown            true
This process takes some time. Under the hood, the baremetal-operator is driving Ironic through a provisioning process. This begins with wiping disks to ensure the host comes up in a clean state. It will eventually write the desired OS image to disk and then reboot into that OS. When complete, a new Kubernetes Node will register with the cluster.
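The OS image that gets written comes from the host’s spec, which the cluster-api provider fills in when it claims a host. A rough sketch of what that part of a BareMetalHost spec looks like, based on the baremetal-operator’s image field (the URLs are placeholders, not values from this demo):
spec:
  image:
    # placeholder URLs; point these at the OS image and its checksum file
    url: http://example.com/images/my-os-image.qcow2
    checksum: http://example.com/images/my-os-image.qcow2.md5sum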
$ kubectl get baremetalhosts
NAME                 STATUS   PROVISIONING STATUS      MACHINE                 BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ostest-master-0         ipmi://192.168.111.1:6230                      true
openshift-master-1   OK       externally provisioned   ostest-master-1         ipmi://192.168.111.1:6231                      true
openshift-master-2   OK       externally provisioned   ostest-master-2         ipmi://192.168.111.1:6232                      true
openshift-worker-0   OK       provisioned              ostest-worker-0-jmhtc   ipmi://192.168.111.1:6233   unknown            true
$ kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
master-0   Ready    master   24h   v1.13.4+d4ce02c1d
master-1   Ready    master   24h   v1.13.4+d4ce02c1d
master-2   Ready    master   24h   v1.13.4+d4ce02c1d
worker-0   Ready    worker   68s   v1.13.4+d4ce02c1d
A screen cast demonstrating this process is included with the original post linked above.
Removing a bare metal host from the cluster is very similar. We just have to scale this MachineSet back down to 0.
$ kubectl scale machineset ostest-worker-0 --replicas=0
Once the Machine has been deleted, the baremetal-operator will deprovision the bare metal host.
$ kubectl get baremetalhosts
NAME                 STATUS   PROVISIONING STATUS      MACHINE                 BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ostest-master-0         ipmi://192.168.111.1:6230                      true
openshift-master-1   OK       externally provisioned   ostest-master-1         ipmi://192.168.111.1:6231                      true
openshift-master-2   OK       externally provisioned   ostest-master-2         ipmi://192.168.111.1:6232                      true
openshift-worker-0   OK       deprovisioning                                   ipmi://192.168.111.1:6233   unknown            false
Once the deprovisioning process is complete, the bare metal host will be back to its ready state, available in the host inventory to be claimed by a future Machine object.
$ kubectl get baremetalhosts
NAME                 STATUS   PROVISIONING STATUS      MACHINE                 BMC                         HARDWARE PROFILE   ONLINE   ERROR
openshift-master-0   OK       externally provisioned   ostest-master-0         ipmi://192.168.111.1:6230                      true
openshift-master-1   OK       externally provisioned   ostest-master-1         ipmi://192.168.111.1:6231                      true
openshift-master-2   OK       externally provisioned   ostest-master-2         ipmi://192.168.111.1:6232                      true
openshift-worker-0   OK       ready                                            ipmi://192.168.111.1:6233   unknown            false
Getting Involved
All development is happening on GitHub. We have a metal3-dev mailing list and use #cluster-api-baremetal on Kubernetes Slack to chat. Occasional project updates are posted to @metal3_io on Twitter.