Provisioning bare metal servers using ManageIQ

March 26, 2019 - Words by Tadej Borovšak - 6 min read

It seems that today almost anyone knows how to provision a virtual machine in the cloud: simply select an operating system image of choice along with an instance type and you are one click away from a working virtual machine. But what about bare metal servers? Is it possible to provision them with the same ease? Let us find out.

Lucy, our new server has just arrived!

Say hello to Lucy, a system administrator, responsible for provisioning the server. Lucy is not a happy camper at the moment, since she needs to:

  1. Mount an installation media.
  2. Make sure that the server is set to boot from the installation media.
  3. Boot the server.
  4. Perform the operating system installation.

As you might expect, doing this manually is error-prone and gets old rather quickly. If Lucy is installing Gentoo and compiling her own kernel along the way, the installation is quite fun, with the added thrill of "will my kernel work properly when I reboot?". But most of the time she is not installing Gentoo, and she is not having fun.

But what can Lucy do to make this process less painful? Luckily, Lucy is an avid reader of our blog and has a few tricks up her sleeve. She will tackle the installation media mounting problem first.

Mythical creature to the rescue

When Lucy is preparing to install the operating system on the server, she usually logs into the vendor-supplied web interface to the server’s baseboard management controller (BMC) and mounts installation media from there. But thanks to one of our previous posts, she knows about the Preboot Execution Environment (PXE, usually pronounced pixie) specification.

Just like in folklore, where pixies brought blessings to those who treated them nicely, this specification allows us to boot into the installation program over the network (doing away with CDs, DVDs, USB dongles and whatnot) in exchange for setting up a PXE server.

The best part of this setup is that after Lucy has her PXE server up and running, she can start installing operating systems on any server that has a PXE-capable network interface controller. Which is just about any server in use today.

So, how does this enhancement affect Lucy’s provisioning process? Well, the steps that she needs to perform now are:

  1. Make sure that the server is set to boot from the network.
  2. Boot the server.
  3. Select the installation program to boot into from the on-screen menu.
  4. Perform the operating system installation.

Now Lucy is ready to automate the operating system installation.

Kickstarting the installation

Most installation programs come with a built-in ability to seed the installation options and perform an automatic installation. All Lucy needs is a place to store those configuration files and a method of delivering them to the installation programs. Because Lucy read our post about managing a PXE server using ManageIQ, she considers this problem solved.

The only vendor-specific things still left on her list are the boot order manipulation and booting. Lucy would normally use vendor-specific tools to perform those two steps, and while automating this is certainly possible, Lucy would need to prepare a specialized automation solution for each vendor. And we can all agree that this sounds neither fun nor particularly rewarding.

So let us help Lucy make them vendor-agnostic as well.

So long, and thanks for all the fish

So, yeah, for those of you who have been living under a rock and have not been able to read our blog posts: go do that now, and the title of this section will make perfect sense. We will wait.

Right, so now we all know that Lucy will be using the Redfish API to do the boot order manipulation. In order to do that, the server’s baseboard management controller needs to support the Redfish standard. Most of the servers that are on the market today have this support readily available, making this almost a non-issue.

The Redfish API has this nifty capability of overriding the boot order for just a single boot, which is just perfect for bare metal provisioning: Lucy can instruct the system to boot once from the network, and it will then revert to booting from the system’s disk as usual. To do that, she must send the following PATCH request to the Redfish service:

PATCH /redfish/v1/Systems/system-id
{
  "Boot": {
    "BootSourceOverrideEnabled": "Once",
    "BootSourceOverrideTarget": "Pxe"
  }
}
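
The same request could be assembled in Python roughly as follows. This is only a sketch using the standard library: the function name and the bmc.example.com address are made up for illustration, system-id is the same placeholder as above, and a real Redfish service will also expect authentication (and often TLS settings) that are left out here. The function only builds the request; passing the result to urllib.request.urlopen would actually send it.

```python
import json
import urllib.request


def build_boot_override_request(base_url: str, system_id: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request for a one-time PXE boot.

    base_url and system_id are placeholders supplied by the caller;
    authentication headers are intentionally omitted from this sketch.
    """
    payload = {
        "Boot": {
            "BootSourceOverrideEnabled": "Once",  # override applies to the next boot only
            "BootSourceOverrideTarget": "Pxe",    # boot from the network
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/redfish/v1/Systems/{system_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


if __name__ == "__main__":
    # Hypothetical BMC address, used only to show the resulting request.
    request = build_boot_override_request("https://bmc.example.com", "system-id")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```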

And to finish the thing off, she requests the server to reboot by POSTing data to the Redfish service:

POST /redfish/v1/Systems/system-id/Actions/ComputerSystem.Reset
{
  "ResetType": "GracefulRestart"
}
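
A matching Python sketch for the reset request might look like this. As before, the function name and the example address are hypothetical and authentication is omitted. Which ResetType values a given server accepts depends on its BMC, so treat the GracefulRestart default as an assumption taken from the request above.

```python
import json
import urllib.request


def build_reset_request(
    base_url: str, system_id: str, reset_type: str = "GracefulRestart"
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the ComputerSystem.Reset action.

    base_url and system_id are placeholders supplied by the caller;
    authentication headers are intentionally omitted from this sketch.
    """
    url = f"{base_url}/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    return urllib.request.Request(
        url=url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical BMC address, used only to show the resulting request.
    request = build_reset_request("https://bmc.example.com", "system-id")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```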

And there we have it: a fully vendor-agnostic provisioning procedure. We can instruct Lucy to write provisioning scripts and then kindly ask her to start searching for another job, since her services are no longer needed, right?

Well, no, not really. Why? Because there are still some nitty-gritty details that we did not take care of, like automatically selecting the proper installation program to boot into.

Letting ManageIQ do the dirty work

In order to make bare metal provisioning as painless as possible, Lucy would still need a way to link all of the above into a provisioning process. And ManageIQ fits the bill perfectly here.

It has a UI for displaying PXE images that are available on the selected PXE server. And to make things even better, it knows how to retrieve this list of images on its own.

List of available PXE images.

Lucy can also use ManageIQ to create new customization files, store them and edit them if needed. And again, ManageIQ has a few additional tricks up its sleeve, since these customization files are actually embedded Ruby (ERB) templates that can use some special variables to dynamically create customization file content.
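
To give a feel for what such templating does, here is a rough Python analogue. It is not ERB and not how ManageIQ renders its templates; it only illustrates the idea of filling per-server values into a shared customization file. The variable names (hostname, timezone) and the function name are invented for this sketch, and the rendered text is a fragment rather than a complete kickstart file.

```python
from string import Template

# A deliberately tiny kickstart-style fragment with two placeholders.
# The placeholder names are assumptions made up for this illustration.
KICKSTART_TEMPLATE = Template(
    "lang en_US.UTF-8\n"
    "network --bootproto=dhcp --hostname=$hostname\n"
    "timezone $timezone\n"
    "reboot\n"
)


def render_customization(hostname: str, timezone: str = "UTC") -> str:
    """Fill the per-server values into the shared template."""
    return KICKSTART_TEMPLATE.substitute(hostname=hostname, timezone=timezone)


if __name__ == "__main__":
    print(render_customization("lucy-01"))
```

The point of the design is that one stored template can serve many servers, with only the substituted values changing per provisioning request.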

Preview of customization template.

As for the physical server list, the Redfish provider has Lucy’s back here, since it manages the server inventory for her and allows her to interact with those servers.

List of available servers.

All Lucy needs now is some way of starting the provisioning process. Up until now, ManageIQ had no way of provisioning those bare metal servers on its own. The Lenovo physical infrastructure provider does support provisioning, but it delegates most of the work to its own XClarity service.

But thanks to World Wide Technology (WWT), things are changing and we are in the process of making this provisioning possible. In our proof-of-concept implementation, Lucy just needs to click the Lifecycle -> Provision Physical Server button on a server details page, select a PXE image, select a customization file and click Submit. And that is it. After a few minutes, the selected server will have its operating system installed, along with all other customizations that were made through the customization file.

Physical server provisioning process.

Please keep in mind that this is a work-in-progress feature and the final product might look a bit different, but the principles will stay the same.

Conclusion

Did we fire Lucy at the end? Of course not! Automating processes is not just about reducing expenses but more about creating an environment where people are not constantly occupied with trivial tasks, freeing them to focus on the things that cannot be automated.

Having all the complexity of the provisioning process hidden inside ManageIQ has another benefit: it allows us to pick the best tool for the job at hand. If the workload we are preparing resources for would benefit from the raw power of bare metal servers, we can harness that now by simply switching a tab in ManageIQ.

Also, we would like to say Thank you to WWT for giving us an opportunity to work with them on the Bare Metal Provisioning Solution (BMPS). And another Thank you to Red Hat for working with us.

Do you have anything to add? Talk to us on Twitter and/or Reddit. Or visit Red Hat Summit where Ian will give a talk about the BMPS.

Cheers!