Last week, after many months of development & testing, we finally did a new release of libvirt which includes secure remote management. Previously, use of libvirt was restricted to apps running on the machine being managed. When you are managing a large data center of machines, requiring that an admin ssh into each machine to manage its virtual machines is clearly sub-optimal. XenD has had the ability to talk to its HTTP service remotely for quite a while, but this used cleartext HTTP and had zero authentication until Xen 3.1.0. We could have worked on improving XenD, but it was more compelling to work on a solution that would apply to all virtualization platforms. Thus we designed & implemented a small daemon for libvirt to expose the API to remote machines. The communications can be run over a native SSL/TLS encrypted transport, or indirectly over an SSH tunnel. The former will offer a number of benefits in the long term, not least of which is the ability to delegate management permissions per-VM and thus avoid the need to provide root access to virtual machine administrators.
So how do you make use of this new remote management? Well, installing libvirt 0.3.0 is the first step. Out of the box, the SSL/TLS transport is not enabled since it requires x509 certificates to be created. There are docs about certificate creation/setup so I won't repeat them here, and don't be put off by any past experience setting up SSL with Apache: it's really not as complicated as it seems. The GNU TLS certtool is also a much more user-friendly tool than the horrific openssl command line.
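To give a rough flavour of what the docs walk you through (the file names and the contents of the template file here are illustrative, so treat the libvirt remote docs as authoritative), creating the CA certificate with certtool looks something like this:

$ certtool --generate-privkey > cakey.pem
$ cat > ca.info <<EOF
cn = Example Org
ca
cert_signing_key
EOF
$ certtool --generate-self-signed --load-privkey cakey.pem \
           --template ca.info --outfile cacert.pem

Server and client certificates are produced along the same lines with --generate-certificate, signed against that CA, and then installed in the locations the libvirt docs list.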
Once the daemon is running, the only thing that changes is the URI you use to connect to libvirt. This is best illustrated by a couple of examples:
- Connecting to Xen locally

$ ssh root@pumpkin.virt.boston.redhat.com
# virsh --connect xen:/// list --all
 Id Name                 State
----------------------------------
  0 Domain-0             running
  6 rhel5fv              blocked
  - hello                shut off
  - rhel4x86_64          shut off
  - rhel5pv              shut off

- Connecting to Xen remotely using SSL/TLS

$ virsh --connect xen://pumpkin.virt.boston.redhat.com/ list --all
 Id Name                 State
----------------------------------
  0 Domain-0             running
  6 rhel5fv              blocked
  - hello                shut off
  - rhel4x86_64          shut off
  - rhel5pv              shut off

- Connecting to Xen remotely using SSH

$ virsh --connect xen+ssh://root@pumpkin.virt.boston.redhat.com/ list --all
 Id Name                 State
----------------------------------
  0 Domain-0             running
  6 rhel5fv              blocked
  - hello                shut off
  - rhel4x86_64          shut off
  - rhel5pv              shut off

- Connecting to QEMU/KVM locally

$ ssh root@celery.virt.boston.redhat.com
# virsh --connect qemu:///system list --all
 Id Name                 State
----------------------------------
  1 kvm                  running
  - demo                 shut off
  - eek                  shut off
  - fc6qemu              shut off
  - rhel4                shut off
  - wizz                 shut off

- Connecting to QEMU/KVM remotely using SSL/TLS

$ virsh --connect qemu://celery.virt.boston.redhat.com/system list --all
 Id Name                 State
----------------------------------
  1 kvm                  running
  - demo                 shut off
  - eek                  shut off
  - fc6qemu              shut off
  - rhel4                shut off
  - wizz                 shut off

- Connecting to QEMU/KVM remotely using SSH

$ virsh --connect qemu+ssh://root@celery.virt.boston.redhat.com/system list --all
 Id Name                 State
----------------------------------
  1 kvm                  running
  - demo                 shut off
  - eek                  shut off
  - fc6qemu              shut off
  - rhel4                shut off
  - wizz                 shut off
Notice how the only thing that changes is the URI: the information returned is identical no matter how you connect to libvirt. So if you have an application using libvirt, all you need to do is adapt your connect URIs to support remote access. BTW, a quick tip: if you get tired of typing the --connect arg, you can set the VIRSH_DEFAULT_CONNECT_URI environment variable instead.
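For example, reusing one of the URIs from above:

$ export VIRSH_DEFAULT_CONNECT_URI=qemu+ssh://root@celery.virt.boston.redhat.com/system
$ virsh list --all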
What about virt-install and virt-manager, you might ask? Well, they are slightly more complicated. During the creation of new virtual machines, both of them need to create files on the local disk (to act as virtual disks for the guest), possibly download kernel+initrd images for booting the installer, and enumerate network devices to set up networking. So while virt-manager can run remotely now, it will be restricted to monitoring existing VMs and basic lifecycle management; it won't be possible to provision entirely new VMs remotely. Yet. Now that the basic remote management is working, we're looking at APIs to satisfy the storage management needs of virt-manager. For device enumeration we can add APIs which ask HAL questions and pass the info back to the client over our secure channel. Finally, kernel+initrd downloading can be avoided by PXE booting the guests.
There’s lots more to talk about, such as securing the VNC console with SSL/TLS, but I’ve not got time for that in this blog posting. Suffice to say, we’re well on our way to our Fedora 8 goals for secure remote management. Fedora 8 will be the best platform for virtualization management by miles.
In the latter half of last year I was mulling over the idea of writing SELinux policy for Test-AutoBuild. I played around a little bit, but never got the time to really make a serious attempt at it. Before I go on, a brief re-cap on the motivation…
Test-AutoBuild is a framework for providing continuous, unattended, automated software builds. Each run of the build engine checks the latest source out from CVS, calculates a build order based on module dependencies, builds each module, and then publishes the results. The build engine typically runs under a dedicated system user account, builder, to avoid any risk of the module build process compromising a host (either accidentally, or deliberately). This works reasonably well if you are only protecting against accidental damage from a module build, e.g. building apps maintained inside your organization. If building code from source repositories out on the internet, though, there is a real risk of deliberately hostile module build processes. A module may be trojaned so that its build process attempts to scan your internal network, or it may trash the state files of the build engine itself, since both the engine & the module being built run under the same user account. There is also the risk that the remote source control server has been trojaned to try and exploit flaws in the client.
And so enter SELinux… The build engine is highly modular in structure, with different tasks in the build workflow being pretty well isolated. So the theory was that it ought to be possible to write SELinux policy to guarantee separation of the build engine, from the SCM tools doing source code checkout, from the module build processes, and other commands being run. As an example, within a build root there are a handful of core directories:
root
 |
 +- source-root   - dir in which module source is checked out
 +- package-root  - dir in which RPMs/Debs & other packages are generated
 +- install-root  - virtual root dir for installing files in 'make install'
 +- build-archive - archive of previous successful module builds
 +- log-root      - dir for creating log files of the build process
 +- public_html   - dir in which results are published
All these dirs are owned by the builder user account. The build engine itself provides all the administrative tasks for the build workflow, so generally requires full access to all of these directories. The SCM tools, however, merely need to be able to check out files into the source-root and create logs in the log-root. The module build process needs to be able to read/write in the source-root, package-root and install-root, as well as creating logs in the log-root. So, given suitable SELinux policy it ought to be possible to lock down the access of the SCM tools and build process quite significantly.
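To give a flavour of how the pieces hang together, the file contexts for those directories get assigned with the standard SELinux tools. The path below is made up, and apart from absource_t so are the type names, so treat this purely as an illustration:

# semanage fcontext -a -t absource_t '/var/lib/builder/source-root(/.*)?'
# semanage fcontext -a -t ablog_t '/var/lib/builder/log-root(/.*)?'
# restorecon -R -v /var/lib/builder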
Now aside from writing the policy there are a couple of other small issues. The primary one is that the build engine has to run in a confined SELinux context, and has to be able to run SCM tools and build processes in a different context. For the former, I chose to create an 'auto-build-secure' command to augment the 'auto-build' command. This allows the user to easily run the build process in either SELinux enforced, or traditional unconfined, mode. For the latter, most SELinux policy relies on automated process context transitions based on the binary file labels. This isn't so useful for autobuild though, because the script we're running is being checked out directly from an SCM repo & thus not labelled. The solution for this is easy though: after fork()ing, but before exec()ing the SCM tools / build script, we simply write the desired target context into /proc/self/attr/exec.
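The shell-level equivalent of that trick looks roughly like this; the abbuild_t domain name and the build command are invented for the sake of the example, and the calling domain needs the 'setexec' permission for the kernel to honour the write:

$ echo -n system_u:system_r:abbuild_t:s0 > /proc/self/attr/exec
$ exec make

(The runcon utility from coreutils does essentially the same dance for you before exec'ing its command.)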
So with a couple of tiny modifications to the build engine, and many hours of writing suitable policy for Test-AutoBuild, it's now possible to run the build engine under a strictly confined policy. There is one horrible trouble spot though. Every application has its own build process & set of operations it wishes to perform. Writing a policy which confines the build process as much as possible, while still keeping it secure, is very hard indeed. In fact it is effectively unsolvable in the general case.
So what to do? SELinux booleans provide a way to toggle various capabilities on/off system-wide. If building multiple applications though, it may be desirable to run some under a more confined policy than others, and booleans are system-wide. The solution, I think, is to define a set of perhaps 4 or 5 different execution contexts with differing levels of privilege. As an example, some contexts may allow outgoing network access, while others may deny all network activity. So the build admin can use the most restrictive policy by default, and a less restrictive policy for applications which are more trusted.
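For reference, this is what the system-wide toggling looks like; the boolean name here is invented purely for illustration:

$ getsebool abbuild_use_network
abbuild_use_network --> off
# setsebool -P abbuild_use_network on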
This weekend was just the start of experimentation with SELinux policy in regards to Test-AutoBuild, but it was far, far more successful than I ever expected it to be. The level of control afforded by SELinux is awesome, and with the flexibility of modifying the application itself too, the possibilities for fine grained access control are enormous. One idea I'd like to investigate is whether it is possible to define new SELinux execution contexts on-the-fly. E.g., instead of all application sources being checked out under a single 'absource_t' file context, it would be desirable to create a new source file context per-application. I'm not sure whether SELinux supports this idea, but it is interesting to push the boundaries here nonetheless…