Using command line arg & monitor command passthrough with libvirt and KVM

Posted: December 19th, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , , , , | 1 Comment »

The general goal of the libvirt project is to provide an API definition and XML schema that is independent of any one hypervisor technology. This means that for every feature in KVM, we have to take a little time to carefully design a suitable API or XML schema to expose in libvirt, if none already exists. Sometimes we are lucky and features in KVM can be expressed in almost the same way in libvirt, but more often than not, this does not hold true – the design for libvirt may look somewhat different. For example, several KVM monitor commands at the KVM level may be exposed as a single API call in libvirt. Or conversely, several different libvirt APIs may all end up invoking the same underlying KVM monitor commands. The need to put careful thought into libvirt API & XML design means that there may be a short delay between a feature appearing in KVM, and it appearing in libvirt, though it can also be the other way around, where libvirt has a feature but KVM doesn’t provide a way to support it yet!

For many applications / developers this delay is a non-issue, since they don’t need to be on the absolute bleeding edge of development of KVM or libvirt, but there are always exceptions to the rule. For a long time, we did not want to enable direct use of hypervisor specific features at all via libvirt, because of the support implications of doing so. A little over a year ago though, we decide to change our position on this matter. The key to this change in policy was deciding how to clearly demarcate functionality which is long term supportable, vs that which is not.

KVM custom command line arguments

To support custom command line argument passthrough for KVM, we decided to introduce a new set of XML elements in a different XML namespace. This namespace is defined:

  xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'

Our policy is that any guest configuration that uses this QEMU XML namespace is not guaranteed to continue working if either libvirt or KVM are upgraded. A change to the way libvirt generates command line arguments may break the arguments the app has passed through. Alternatively KVM itself may drop or change the semantics of an existing command line argument. In other words applications should not rely on this capability long term, rather they should raise an RFE against libvirt to support it via the primary XML namespace, and just use the QEMU namespace until the RFE is complete.

The QEMU namespace defines syntax for passing arbitrary command line arguments, along with arbitrary environment variables:

<domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>QEMUGuest1</name>
  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
  ...
  <devices>
    <emulator>/usr/bin/qemu</emulator>
    <disk type='block' device='disk'>
      <source dev='/dev/HostVG/QEMUGuest1'/>
      <target dev='hda' bus='ide'/>
    </disk>
    ...
  </devices>
  <qemu:commandline>
    <qemu:arg value='-unknown'/>
    <qemu:arg value='parameter'/>
    <qemu:env name='NS' value='ns'/>
    <qemu:env name='BAR'/>
  </qemu:commandline>
</domain>

Remember that when adding the <qemu:commandline> element, you need to always declare the XML namespace ‘xmlns:qemu’ on the top level <domain> element.

KVM monitor command passthrough

To support custom monitor command passthrough for KVM, we decided to introduce a second ELF library, libvirt-qemu.so, and separate header file /usr/include/libvirt/libvirt-qemu.h. An application that uses APIs in libvirt-qemu.so is not guaranteed to continue working if either libvirt or KVM are upgraded. A change to the way libvirt manages guests may conflict with the monitor commands the app is trying to issue. Alternatively KVM itself may drop or change the semantics of an existing monitor commands. In other words applications should not rely on this capability long term, rather they should raise an RFE against libvirt to support it via the primary library API, and just use libvirt-qemu.so until the RFE is complete.

Currently the libvirt-qemu.h defines two custom APIs

typedef enum {
  VIR_DOMAIN_QEMU_MONITOR_COMMAND_DEFAULT = 0,
  VIR_DOMAIN_QEMU_MONITOR_COMMAND_HMP     = (1 << 0), /* cmd is in HMP */
} virDomainQemuMonitorCommandFlags;

int virDomainQemuMonitorCommand(virDomainPtr domain, const char *cmd,
                                char **result, unsigned int flags);

virDomainPtr virDomainQemuAttach(virConnectPtr domain,
                                 unsigned int pid,
                                 unsigned int flags);

The first allows passthrough of arbitrary monitor commands, while the latter allows attachment to an existing QEMU instance as discussed previously. The monitor command API is quite straighforward, it accepts a string command, and returns a string reply.  The data for the command/reply can be either in HMP or QMP syntax, depending on how QEMU was launched by libvirt. The VIR_DOMAIN_QEMU_MONITOR_COMMAND_HMP flag allows an application to force use of the HMP syntax at all times.

Using monitor command passthrough from virsh

Not all users will be writing directly to the libvirt API, so the monitor command passthrough is also wired up into virsh via the “qemu-monitor-command” API. First is an example using QMP (JSON syntax):

$ virsh qemu-monitor-command vm-vnc '{ "execute": "query-block"}'
{"return":[{"device":"drive-virtio-disk0","locked":false,"removable":false,"inserted":{"ro":false,"drv":"qcow2","encrypted":false,"file":"/home/berrange/VirtualMachines/plain.qcow"},"type":"unknown"}],"id":"libvirt-9"}

And second is an example demonstrating use of HMP with a guest that runs QMP (libvirt automagically redirects via the ‘human-monitor-command’ command)

$ virsh qemu-monitor-command --hmp vm-vnc  'info block'
drive-virtio-disk0: removable=0 file=/home/berrange/VirtualMachines/plain.qcow ro=0 drv=qcow2 encrypted=0

Tainting of guests

Anyone familiar with the kernel will know that it marks itself as tainted whenever the user does something that is outside the boundaries of normal support. We have borrowed this idea from the kernel and apply it to guests run by libvirt too. Any attempt to use either the command line argument passthrough via XML, or QEMU monitor command passthrough via libvirt-qemu.so will result in the guest domain being marked as tainted. This shows up in the libvirt log files. For example after that last example,  $HOME/.libvirt/qemu/log/vm-vnc.log shows the following

Domain id=2 is tainted: custom-monitor

This allows OS distro support staff to determine if something unusal has been done to a guest when they see support tickets raised. Depending on the OS distro’s support policy they may decline to support problem arising from tainted guests. In RHEL for example, any usage of QEMU monitor command passthrough, or command line argument passthrough is outside the bounds of libvirt support, and users would normally be asked to try to reproduce any problem without a tainted guest.