Ready to start taking steps towards your first experience with metal3.io? Follow these commands to get started!
Naming
For the v1alpha3 release, the Cluster API provider for Metal3 was renamed from Cluster API provider BareMetal (CAPBM) to Cluster API provider Metal3 (CAPM3). Hence, from v1alpha3 onwards it is Cluster API provider Metal3.
Information
If you need detailed information about creating a Metal³ emulated environment using metal3-dev-env, it is worth taking a look at the blog post “A detailed walkthrough of the Metal³ development environment”.
This is a high-level architecture of the Metal³-dev-env. Note that for an Ubuntu-based setup, either Kind or Minikube can be used to instantiate an ephemeral cluster, while for a CentOS-based setup only Minikube is currently supported. The ephemeral cluster creation tool can be selected with the EPHEMERAL_CLUSTER environment variable.
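For example, to use Kind instead of the Minikube default on an Ubuntu-based setup, you can export the variable before running make (the lowercase value shown here is an assumption about the expected format):
$ export EPHEMERAL_CLUSTER=kind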
tl;dr - Clone metal³-dev-env and run
$ make
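Spelled out, the tl;dr amounts to cloning the metal3-io/metal3-dev-env repository and running make from its root:
$ git clone https://github.com/metal3-io/metal3-dev-env.git
$ cd metal3-dev-env
$ make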
The Makefile runs a series of scripts, described here:
01_prepare_host.sh - Installs all needed packages.
02_configure_host.sh - Creates a set of VMs that will be managed as if they were bare metal hosts. It also downloads some images needed for Ironic.
03_launch_mgmt_cluster.sh - Launches a management cluster using minikube or kind and runs the baremetal-operator on that cluster.
04_verify.sh - Runs a set of tests that verify that the deployment completed successfully.
When the environment setup is complete, you should be able to see BareMetalHost (bmh) objects in the Ready state.
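For example, the same command used later in this guide serves as a quick check that the hosts have reached the Ready state:
$ kubectl get baremetalhosts -n metal3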
To tear down the environment, run
$ make clean
Note
When redeploying metal³-dev-env with a different release version of CAPM3, you must set the FORCE_REPO_UPDATE variable in config_${user}.sh to true.
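Assuming config_${user}.sh consists of plain shell exports, the override is a single line added to that file:
export FORCE_REPO_UPDATE=true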
If you want to run target cluster Nodes with your own image, you can override the following three variables: IMAGE_NAME, IMAGE_LOCATION, IMAGE_USERNAME. If an image with the name IMAGE_NAME does not exist in the IRONIC_IMAGE_DIR (/opt/metal3-dev-env/ironic/html/images) folder, it will be downloaded automatically from the configured IMAGE_LOCATION.
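A minimal sketch of such an override in config_${user}.sh, with a purely hypothetical image name, URL and username:
export IMAGE_NAME=my-target-image.qcow2
export IMAGE_LOCATION=https://example.com/images/my-target-image.qcow2
export IMAGE_USERNAME=metal3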
Warning
If you see this error during the installation:
error: failed to connect to the hypervisor \
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Permission denied
You may need to log out and log back in, then run make clean and make again.
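This error is typically caused by the current session not yet having libvirt socket access (for example, a freshly added libvirt group membership only takes effect on a new login); that interpretation is an assumption, but after logging back in you can verify the group membership with:
$ groups | grep libvirt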
This environment creates a set of VMs to manage as if they were bare metal hosts. You can see the VMs using virsh.
$ sudo virsh list
Id Name State
----------------------------------------------------
6 minikube running
9 node_0 running
10 node_1 running
Each of the VMs (aside from the minikube management cluster VM) is represented by a BareMetalHost object in our management cluster. The yaml definition file used to create these host objects is bmhosts_crs.yaml.
$ kubectl get baremetalhosts -n metal3
NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
node-0 OK ready ipmi://192.168.111.1:6230 unknown true
node-1 OK ready ipmi://192.168.111.1:6231 unknown true
You can also look at the details of a host, including the hardware information gathered during pre-deployment introspection.
$ kubectl get baremetalhost -n metal3 -o yaml node-0
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"metal3.io/v1alpha1","kind":"BareMetalHost","metadata":{"annotations":{},"name":"node-0","namespace":"metal3"},"spec":{"bmc":{"address":"ipmi://192.168.111.1:6230","credentialsName":"node-0-bmc-secret"},"bootMACAddress":"00:f8:16:dd:3b:9b","online":true}}
creationTimestamp: "2020-02-05T09:09:44Z"
finalizers:
- baremetalhost.metal3.io
generation: 1
name: node-0
namespace: metal3
resourceVersion: "16312"
selfLink: /apis/metal3.io/v1alpha1/namespaces/metal3/baremetalhosts/node-0
uid: 99f4c905-b850-45e0-bf1b-61b12f91182b
spec:
bmc:
address: ipmi://192.168.111.1:6230
credentialsName: node-0-bmc-secret
bootMACAddress: 00:f8:16:dd:3b:9b
online: true
status:
errorMessage: ""
goodCredentials:
credentials:
name: node-0-bmc-secret
namespace: metal3
credentialsVersion: "1242"
hardware:
cpu:
arch: x86_64
clockMegahertz: 2399.998
count: 4
model: Intel Xeon E3-12xx v2 (Ivy Bridge)
firmware:
bios:
date: 04/01/2014
vendor: SeaBIOS
version: 1.10.2-1ubuntu1
hostname: node-0
nics:
- ip: 192.168.111.20
mac: 00:f8:16:dd:3b:9d
model: 0x1af4 0x0001
name: eth1
pxe: false
speedGbps: 0
vlanId: 0
- ip: 172.22.0.47
mac: 00:f8:16:dd:3b:9b
model: 0x1af4 0x0001
name: eth0
pxe: true
speedGbps: 0
vlanId: 0
ramMebibytes: 8192
storage:
- hctl: "0:0:0:0"
model: QEMU HARDDISK
name: /dev/sda
rotational: true
serialNumber: drive-scsi0-0-0-0
sizeBytes: 53687091200
vendor: QEMU
systemVendor:
manufacturer: QEMU
productName: Standard PC (Q35 + ICH9, 2009)
serialNumber: ""
hardwareProfile: unknown
lastUpdated: "2020-02-05T10:10:49Z"
operationHistory:
deprovision:
end: null
start: null
inspect:
end: "2020-02-05T09:15:08Z"
start: "2020-02-05T09:11:33Z"
provision:
end: null
start: null
register:
end: "2020-02-05T09:11:33Z"
start: "2020-02-05T09:10:32Z"
operationalStatus: OK
poweredOn: true
provisioning:
ID: b605df1d-7674-44ad-9810-20ad3e3c558b
image:
checksum: ""
url: ""
state: ready
triedCredentials:
credentials:
name: node-0-bmc-secret
namespace: metal3
credentialsVersion: "1242"
This section describes how to trigger provisioning of a cluster and hosts via Machine objects as part of the Cluster API integration. This uses Cluster API v1alpha3 and assumes that metal3-dev-env is deployed with the environment variable CAPM3_VERSION set to v1alpha3. The v1alpha3 deployment can be done with Ubuntu 18.04 or CentOS 8 target host images. Please make sure to meet the resource requirements for a successful deployment.
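For example, the relevant environment variables could be set before running make as follows (the variable names come from this guide; the Ubuntu value mirrors the Centos value used in the CentOS section below and is an assumption):
export CAPM3_VERSION=v1alpha3
export IMAGE_OS=Ubuntu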
The following scripts can be used to provision a cluster, a control plane node and a worker node.
$ ./scripts/provision/cluster.sh
$ ./scripts/provision/controlplane.sh
$ ./scripts/provision/worker.sh
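While these scripts run, one convenient way to follow progress (not part of the scripts themselves) is to watch the BareMetalHost objects change state:
$ kubectl get baremetalhosts -n metal3 -w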
At this point, the Machine actuator will respond and try to claim a BareMetalHost for this Machine. You can check the logs of the actuator here:
$ kubectl logs -n capm3 pod/capm3-manager-7bbc6897c7-bp2pw -c manager
09:10:38.914458 controller-runtime/controller "msg"="Starting Controller" "controller"="metal3cluster"
09:10:38.926489 controller-runtime/controller "msg"="Starting workers" "controller"="metal3machine" "worker count"=1
10:54:16.943712 Host matched hostSelector for Metal3Machine
10:54:16.943772 2 hosts available while choosing host for bare metal machine
10:54:16.944087 Associating machine with host
10:54:17.516274 Finished creating machine
10:54:17.518718 Provisioning BaremetalHost
If you look at the yaml representation of the Machine object, you will see a new annotation that identifies which BareMetalHost was chosen to satisfy this Machine request.
$ kubectl get machine centos -n metal3 -o yaml
...
annotations:
metal3.io/BareMetalHost: metal3/node-1
...
You can also see in the list of BareMetalHosts that the hosts are now being provisioned and are each associated with a Machine.
$ kubectl get baremetalhosts -n metal3
NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
node-0 OK provisioning test1-md-0-m87bq ipmi://192.168.111.1:6230 unknown true
node-1 OK provisioning test1-controlplane-0 ipmi://192.168.111.1:6231 unknown true
You should be able to ssh into your host once provisioning is completed. The default username for both the CentOS and Ubuntu images is metal3. For the IP address, you can either use the API endpoint IP of the target cluster, which is 192.168.111.249 by default, or the predictable IP address of the first master node, 192.168.111.100.
$ ssh metal3@192.168.111.249
Deprovisioning of the target cluster is done simply by deleting the Cluster and Machine objects, or by executing the deprovisioning scripts in the reverse order of provisioning:
$ ./scripts/deprovision/worker.sh
$ ./scripts/deprovision/controlplane.sh
$ ./scripts/deprovision/cluster.sh
Note that you can easily deprovision worker Nodes by decreasing the number of replicas in the MachineDeployment object created when executing the provision_worker.sh script:
$ kubectl scale machinedeployment test1-md-0 -n metal3 --replicas=0
Warning
The control plane and the cluster are tightly coupled. This means that you cannot deprovision the control plane of a cluster and then provision a new one within the same cluster. Therefore, if you want to deprovision the control plane, you need to deprovision the cluster as well and provision both again.
Below is an example of how deprovisioning can be done in a more manual way, by simply deleting the relevant Custom Resources (CRs):
$ kubectl delete machine test1-md-0-m87bq -n metal3
machine.cluster.x-k8s.io "test1-md-0-m87bq" deleted
$ kubectl delete machine test1-controlplane-0 -n metal3
machine.cluster.x-k8s.io "test1-controlplane-0" deleted
$ kubectl delete cluster test1 -n metal3
cluster.cluster.x-k8s.io "test1" deleted
Once the deprovisioning is started, you can see that the BareMetalHost and Cluster objects are going through a deprovisioning process too.
$ kubectl get baremetalhosts -n metal3
NAME STATUS PROVISIONING STATUS CONSUMER BMC HARDWARE PROFILE ONLINE ERROR
node-0 OK deprovisioning test1-md-0-m87bq ipmi://192.168.111.1:6230 unknown false
node-1 OK deprovisioning test1-controlplane-0 ipmi://192.168.111.1:6231 unknown false
$ kubectl get cluster -n metal3
NAME PHASE
test1 deprovisioning
If you want to deploy Ubuntu hosts, please skip this section.
As shown in the prerequisites section, the preferred CentOS version is 8, both for the system where the metal3-dev-env environment is configured and for the target cluster nodes. If you still want to deploy CentOS 7 on the target hosts, the following variables need to be modified:
IMAGE_NAME_CENTOS="centos-updated.qcow2"
IMAGE_LOCATION_CENTOS="http://artifactory.nordix.org/artifactory/airship/images/centos.qcow2"
IMAGE_OS=Centos
Additionally, you can either let the Ansible-based provision_controlplane.sh and provision_worker.sh scripts download the image automatically based on the variables listed above, or download a properly configured CentOS 7 image from the following location into the IRONIC_IMAGE_DIR:
curl -LO http://artifactory.nordix.org/artifactory/airship/images/centos.qcow2
mv centos.qcow2 /opt/metal3-dev-env/ironic/html/images/centos-updated.qcow2
md5sum /opt/metal3-dev-env/ironic/html/images/centos-updated.qcow2 | \
awk '{print $1}' > \
/opt/metal3-dev-env/ironic/html/images/centos-updated.qcow2.md5sum
It’s also possible to provision via the BareMetalHost interface directly, without using the Cluster API integration.
There is a helper script available to trigger provisioning of one of these hosts. To provision a host with CentOS, run:
$ ./provision_host.sh node-0
The BareMetalHost will go through the provisioning process, and will eventually reboot into the operating system we wrote to disk.
$ kubectl get baremetalhost node-0 -n metal3
NAME STATUS PROVISIONING STATUS MACHINE BMC HARDWARE PROFILE ONLINE ERROR
node-0 OK provisioned ipmi://192.168.111.1:6230 unknown true
provision_host.sh will inject your SSH public key into the VM. To find the IP address, you can check the DHCP leases on the baremetal libvirt network.
$ sudo virsh net-dhcp-leases baremetal
Expiry Time MAC address Protocol IP address Hostname Client ID or DUID
-------------------------------------------------------------------------------------------------------------------
2019-05-06 19:03:46 00:1c:cc:c6:29:39 ipv4 192.168.111.20/24 node-0 -
2019-05-06 19:04:18 00:1c:cc:c6:29:3d ipv4 192.168.111.21/24 node-1 -
The default user for the CentOS image is metal3.
There is another helper script to deprovision a host.
$ ./deprovision_host.sh node-0
You will then see the host go into a deprovisioning status:
$ kubectl get baremetalhost node-0 -n metal3
NAME STATUS PROVISIONING STATUS MACHINE BMC HARDWARE PROFILE ONLINE ERROR
node-0 OK deprovisioning ipmi://192.168.111.1:6230 unknown true
The baremetal-operator comes up running in the cluster by default, using an image built from the metal3-io/baremetal-operator repository. If you’d like to test changes to the baremetal-operator, you can follow this process.
First, you must scale down the deployment of the baremetal-operator running in the cluster.
kubectl scale deployment metal3-baremetal-operator -n metal3 --replicas=0
To be able to run baremetal-operator locally, you need to install operator-sdk. After that, you can run the baremetal-operator including any custom changes.
cd ~/go/src/github.com/metal3-io/baremetal-operator
make run
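When you are done testing your local build, the in-cluster operator can be restored by scaling the deployment back up (mirroring the scale-down command above; a single replica is an assumption):
kubectl scale deployment metal3-baremetal-operator -n metal3 --replicas=1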
There are two Cluster API related managers running in the cluster. One includes a set of generic controllers, and the other includes a custom Machine controller for Metal3. If you want to try changes to cluster-api-provider-metal3, you should shut down the custom Machine controller manager first.
$ kubectl scale statefulset capm3-controller-manager -n capm3-system --replicas=0
Then you can run the custom Machine controller manager out of your local git tree.
cd ~/go/src/github.com/metal3-io/cluster-api-provider-metal3
make run
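Likewise, once you are finished with the local cluster-api-provider-metal3 build, the in-cluster manager can be restored (again assuming a single replica):
$ kubectl scale statefulset capm3-controller-manager -n capm3-system --replicas=1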
Sometimes you may want to look directly at Ironic to debug something. The metal3-dev-env repository contains a clouds.yaml file with connection settings for Ironic.
Metal3-dev-env will install the unified OpenStack and standalone OpenStack Ironic command-line clients on the provisioning host as part of setting up the cluster.
Note that currently you can use either the unified OpenStack client or the standalone Ironic client. In this example we are using the Ironic client to interact with the Ironic API.
Please make sure to export the CONTAINER_RUNTIME environment variable before you execute commands.
Example:
$ export CONTAINER_RUNTIME=docker
$ baremetal node list
+--------------------------------------+--------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------+---------------+-------------+--------------------+-------------+
| 882cf206-d688-43fa-bf4c-3282fcb00b12 | node-0 | None | None | enroll | False |
| ac257479-d6c6-47c1-a649-64a88e6ff312 | node-1 | None | None | enroll | False |
+--------------------------------------+--------+---------------+-------------+--------------------+-------------+
To view a particular node’s details, run the command below. The last_error, maintenance_reason, and provision_state fields are useful for troubleshooting why a node did not deploy.
$ baremetal node show 882cf206-d688-43fa-bf4c-3282fcb00b12
+------------------------+------------------------------------------------------------+
| Field | Value |
+------------------------+------------------------------------------------------------+
| allocation_uuid | None |
| automated_clean | None |
| bios_interface | no-bios |
| boot_interface | ipxe |
| chassis_uuid | None |
| clean_step | {} |
| conductor | localhost.localdomain |
| conductor_group | |
| console_enabled | False |
| console_interface | no-console |
| created_at | 2019-10-07T19:37:36+00:00 |
| deploy_interface | direct |
| deploy_step | {} |
| description | None |
| driver | ipmi |
| driver_info | {u'ipmi_port': u'6230', u'ipmi_username': u'admin', u'deploy_kernel': u'http://172.22.0.2/images/ironic-python-agent.kernel', u'ipmi_address': u'192.168.111.1', u'deploy_ramdisk': u'http://172.22.0.2/images/ironic-python-agent.initramfs', u'ipmi_password': u'******'} |
| driver_internal_info | {u'agent_enable_ata_secure_erase': True, u'agent_erase_devices_iterations': 1, u'agent_erase_devices_zeroize': True, u'disk_erasure_concurrency': 1, u'agent_continue_if_ata_erase_failed': False} |
| extra | {} |
| fault | clean failure |
| inspect_interface | inspector |
| inspection_finished_at | None |
| inspection_started_at | None |
| instance_info | {} |
| instance_uuid | None |
| last_error | None |
| maintenance | True |
| maintenance_reason | Timeout reached while cleaning the node. Please check if the ramdisk responsible for the cleaning is running on the node. Failed on step {}. |
| management_interface | ipmitool |
| name | master-0 |
| network_interface | noop |
| owner | None |
| power_interface | ipmitool |
| power_state | power on |
| properties | {u'cpu_arch': u'x86_64', u'root_device': {u'name': u'/dev/sda'}, u'local_gb': u'50'} |
| protected | False |
| protected_reason | None |
| provision_state | clean wait |
| provision_updated_at | 2019-10-07T20:09:13+00:00 |
| raid_config | {} |
| raid_interface | no-raid |
| rescue_interface | no-rescue |
| reservation | None |
| resource_class | baremetal |
| storage_interface | noop |
| target_power_state | None |
| target_provision_state | available |
| target_raid_config | {} |
| traits | [] |
| updated_at | 2019-10-07T20:09:13+00:00 |
| uuid | 882cf206-d688-43fa-bf4c-3282fcb00b12 |
| vendor_interface | ipmitool |
+------------------------+------------------------------------------------------------+