libvirt: split of the monolithic libvirtd daemon
Since the project’s creation about 14 years ago, libvirt has grown enormously. In that time there has been a lot of code refactoring, but these were always fairly evolutionary changes; there has been little revolutionary change of the overall system architecture or some core technical decisions made early on. This blog post is one of a series examining recent technical decisions that can be considered more revolutionary to libvirt. This was the topic of a talk given at KVM Forum 2019 in Lyon.
Monolithic daemon
Anyone who has used libvirt should be familiar with the libvirtd daemon, which runs most of the virtualization and secondary drivers that libvirt distributes. Only a few libvirt drivers are stateless and run purely in the library. Internally libvirt has always tried to maintain a fairly modular architecture, with each hypervisor driver being separated from other drivers. There are also secondary drivers providing storage, network and firewall functionality, which are notionally separate from all the virtualization drivers. Over time the separation has broken down, with hypervisor drivers directly invoking internal methods from the secondary drivers, but last year there was a major effort to reverse this and regain full separation between every driver.
There are various problems with having a monolithic daemon like libvirtd. From a security POV, it is hard to provide any meaningful protections to libvirtd. The range of functionality it exposes provides an access level that is more or less equivalent to having a root shell. So although libvirtd runs with a “virtd_t” SELinux context, this should be considered little better than running “unconfined_t”. As well as providing direct local access to the APIs, the libvirtd daemon also has the job of exposing remote access over TCP, most commonly needed when doing live migration. Exposing the drivers directly over TCP is somewhat undesirable given the size of the attack surface they have.
The biggest problems users have seen are around reliability of the daemon. A bug in any single driver in libvirt can impact on the functionality of all other drivers. As an example, if something goes wrong in the libvirt storage mgmt APIs, this can harm management of any QEMU VMs. Problems can be things like crashes of the daemon due to memory corruption, or more subtle things like main event loop starvation due to long running file handle event callbacks, or accidental resource cleanup such as closing a file descriptor belonging to another thread.
Libvirt drivers are shipped as loadable modules, and an installation of libvirt does not have to include all drivers. Thus a minimal installation of libvirt is a lot smaller than users typically imagine it is. The existence of the monolithic libvirtd daemon, however, and the fact that many apps pull in broader RPM dependencies than they truly need, results in a perception that libvirt is bloated / heavyweight.
Modular daemons
With all this in mind, libvirt has started a move over to a new modular daemon model. In this new world, each driver in libvirt (both hypervisor drivers & secondary drivers) will be serviced by its own dedicated daemon. So there will be a “virtqemud”, “virtxend”, “virtstoraged”, “virtnwfilterd”, etc. Each of these daemons will only support access via a dedicated local UNIX domain socket, /run/libvirt/$DAEMONNAME-sock, eg /run/libvirt/virtqemud-sock. The libvirt client library will be able to connect to either the old monolithic daemon socket path /run/libvirt/libvirt-sock, or the new per-daemon socket. The hypervisor daemons will be able to open connections to the secondary daemons when required by requested functionality, eg to configure a firewall for a QEMU guest NIC.
Remote off-host access to libvirt functionality will be handled via a new virtproxyd daemon which listens for TCP connections and forwards API calls over a local UNIX socket to whichever modular daemon needs to service it. This proxy daemon will also be responsible for handling the monolithic daemon UNIX domain socket path that old libvirt clients will be expecting to use.
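To make the forwarding step concrete, here is a minimal sketch of how a proxy could pick the target daemon from a connection URI. It assumes the daemon name is derived from the driver part of the URI scheme as "virt" + driver + "d"; the function name is hypothetical and this is not virtproxyd's actual implementation.

```python
from urllib.parse import urlsplit


def daemon_for_uri(uri: str) -> str:
    """Guess which modular daemon should service a libvirt URI.

    Assumption: the driver name is the URI scheme up to any "+transport"
    suffix, and the daemon is named "virt<driver>d". Stateless drivers
    that run purely in the client library are not handled here.
    """
    scheme = urlsplit(uri).scheme
    if not scheme:
        raise ValueError(f"URI has no scheme: {uri!r}")
    driver = scheme.split("+", 1)[0]  # "qemu+tcp" -> "qemu"
    return f"virt{driver}d"
```

For example, both "qemu:///system" and "qemu+tcp://host.example.com/system" would map to virtqemud under this assumption, while "xen:///system" would map to virtxend.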
Overall from an application developer POV, the change to modular daemons will be transparent at the API level. The main impact will be on deployment tools like Puppet / Ansible seeking to automate deployment of libvirt, which will need to be aware of these new daemons and their config files. The resulting architecture should be more reliable in operation and enable development of more restrictive security policies.
Both the existing libvirtd and the new modular daemons have been configured to make use of systemd socket activation and auto-shutdown after a timeout, so the daemons should only be launched when they actually need to do some work. Several daemons will still need to start up at boot to activate various resources (create the libvirt virbr0 bridge device, or auto-start VMs), but should stop quickly once this is done.
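The auto-shutdown behaviour can be pictured as a simple idle tracker. The sketch below is a simplification under stated assumptions: the class and method names are hypothetical, the clock is passed in explicitly to keep the logic easy to follow, and only client connections are considered, whereas a real daemon may have other reasons to stay alive (for example resources it is still managing).

```python
from typing import Optional


class IdleTracker:
    """Decide when an otherwise idle daemon may exit.

    Hypothetical model: the daemon exits once it has had no connected
    clients for at least `timeout` seconds.
    """

    def __init__(self, timeout: float, now: float) -> None:
        self.timeout = timeout
        self.clients = 0
        # Time at which the daemon last became idle; None while busy.
        self.idle_since: Optional[float] = now

    def client_connected(self) -> None:
        self.clients += 1
        self.idle_since = None

    def client_disconnected(self, now: float) -> None:
        self.clients -= 1
        if self.clients == 0:
            self.idle_since = now

    def should_exit(self, now: float) -> bool:
        if self.idle_since is None:
            return False
        return now - self.idle_since >= self.timeout
```

Because systemd keeps listening on the socket after the daemon exits, the next client connection simply starts the daemon again, which is what makes exiting when idle safe.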
Migration timeframe
At the time of writing the modular daemons exist in libvirt releases and are built and installed by default. The libvirt client library, however, still defaults to connecting to the monolithic libvirtd UNIX socket. To the best of my knowledge, all distros with systemd use presets which favour the monolithic daemon too. IOW, thus far, nothing has changed from most users’ POV. In the near future, however, we intend to flip the switch in the build system such that the libvirt client library favours connections to the modular daemons, and encourage distros to change their systemd presets to match.
The libvirtd daemon will remain around, but deprecated, for some period of time before it is finally deleted entirely. When this deletion will happen is still TBD, but it is not less than 1 year away, and possibly as much as 2 years. The decision will be made based on how easily & quickly applications adapt to the new modular daemon world.
Future benefits
The modular daemon model opens up a number of interesting possibilities for addressing long standing problems with libvirt. For example, the QEMU driver in libvirt can operate in “system mode” where it is running as root and can expose all features of QEMU. There is also the “session mode” where it runs as an unprivileged user but with features dramatically reduced. For example, no firewall integration, drastically reduced network connectivity options, no PCI device assignment and so on. With the modular daemon model, a new hybrid approach is possible. A “session mode” QEMU driver can be enhanced to know how to talk to a “system mode” host device driver to do PCI device assignment (with suitable authentication prompts of course), likewise for network connectivity. This will make the unprivileged “session mode” QEMU driver a much more compelling choice for applications such as virt-manager or GNOME Boxes which prefer to run fully unprivileged.