Guest CPU model configuration in libvirt with QEMU/KVM
Many of the management problems in virtualization are caused by the annoyingly popular & desirable host migration feature! I previously talked about PCI device addressing problems, but this time the topic to consider is that of CPU models. Every hypervisor has its own policy for what a guest will see for its CPUs by default. Xen just passes through the host CPU, while with QEMU/KVM the guest sees a generic model called “qemu32” or “qemu64”. VMware does something more advanced, classifying all physical CPUs into a handful of groups, with one baseline CPU model for each group that is exposed to the guest. VMware’s behaviour lets guests safely migrate between hosts provided they all have physical CPUs that classify into the same group.
libvirt does not like to enforce policy itself, preferring just to provide the mechanism on which the higher layers define their own desired policy. CPU models are a complex subject, so it has taken longer than desirable to support their configuration in libvirt. In the 0.7.5 release that will be in Fedora 13, there is finally a comprehensive mechanism for controlling guest CPUs.
Learning about the host CPU model
If you have been following earlier articles (or otherwise know a bit about libvirt) you’ll know that the “virsh capabilities” command displays an XML document describing the capabilities of the hypervisor connection & host. It should thus come as no surprise that this XML schema has been extended to provide information about the host CPU model.
One of the big challenges in describing a CPU model is that every architecture has a different approach to exposing its capabilities. On x86, a modern CPU’s capabilities are exposed via the CPUID instruction. Essentially this comes down to a set of 32-bit integers with each bit given a specific meaning. Fortunately AMD & Intel agree on common semantics for these bits. VMware and Xen both expose the notion of CPUID masks directly in their guest configuration format. Unfortunately (or fortunately depending on your POV) QEMU/KVM supports far more than just the x86 architecture, so CPUID is clearly not suitable as the canonical configuration format. QEMU ended up using a scheme which combines a CPU model name string with a set of named flags. On x86 the CPU model maps to a baseline CPUID mask, and the flags can then be used to toggle bits in the mask on or off. libvirt decided to follow this lead and use a combination of a model name and flags. Without further ado, here is an example of what libvirt reports as the capabilities of my laptop’s CPU:
# virsh capabilities
<capabilities>
  <host>
    <cpu>
      <arch>i686</arch>
      <model>pentium3</model>
      <topology sockets='1' cores='2' threads='1'/>
      <feature name='lahf_lm'/>
      <feature name='lm'/>
      <feature name='xtpr'/>
      <feature name='cx16'/>
      <feature name='ssse3'/>
      <feature name='tm2'/>
      <feature name='est'/>
      <feature name='vmx'/>
      <feature name='ds_cpl'/>
      <feature name='monitor'/>
      <feature name='pni'/>
      <feature name='pbe'/>
      <feature name='tm'/>
      <feature name='ht'/>
      <feature name='ss'/>
      <feature name='sse2'/>
      <feature name='acpi'/>
      <feature name='ds'/>
      <feature name='clflush'/>
      <feature name='apic'/>
    </cpu>
    ...snip...
It is not practical to have a database listing all known CPU models, so libvirt has a small list of baseline CPU model names. It picks the one that shares the greatest number of CPUID bits with the actual host CPU and then lists the remaining bits as named features. Notice that libvirt does not tell you what features the baseline CPU contains. This might seem like a flaw at first, but as will be shown next, it is not actually necessary to know this information.
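The selection step just described can be sketched in a few lines of Python. This is only a toy: the `BASELINE_MODELS` table, its model names and its feature sets are invented for illustration and are not libvirt's real CPU definitions, and the scoring rule (among models the host fully supports, take the one covering the most features) is an assumption about how "shares the greatest number of bits" might be implemented, not a copy of libvirt's code.

```python
# Hypothetical baseline models; the names and feature sets are made up for
# this example and are NOT libvirt's real definitions.
BASELINE_MODELS = {
    "model_small": {"fpu", "sse"},
    "model_medium": {"fpu", "sse", "sse2"},
    "model_large": {"fpu", "sse", "sse2", "sse4a"},
}


def describe_host(host_features, models=BASELINE_MODELS):
    """Return (model_name, extra_features) for a set of host feature names.

    Only models whose features are all present on the host are considered;
    among those, the one covering the most host features wins. Whatever the
    chosen model does not cover is reported as extra named features.
    """
    candidates = {
        name: feats for name, feats in models.items() if feats <= host_features
    }
    if not candidates:
        raise ValueError("no baseline model fits this host")
    # Iterate over sorted names so that ties are broken deterministically.
    best = max(sorted(candidates), key=lambda name: len(candidates[name]))
    extras = sorted(host_features - candidates[best])
    return best, extras


host = {"fpu", "sse", "sse2", "vmx", "ht"}
model, extras = describe_host(host)
print(model, extras)
```

The result has the same shape as the capabilities XML above: one model name plus a list of leftover features, with the model's own contents left implicit.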
Determining a compatible CPU model to suit a pool of hosts
Now that it is possible to find out what CPU capabilities a single host has, the next problem is to determine what CPU capabilities are best to expose to the guest. If it is known that the guest will never need to be migrated to another host, the host CPU model can be passed straight through unmodified. Some lucky people might have a virtualized data center where they can guarantee all servers will have 100% identical CPUs. Again the host CPU model can be passed straight through unmodified. The interesting case, though, is where there is variation in CPUs between hosts. In this case the lowest common denominator CPU must be determined. This is not entirely straightforward, so libvirt provides an API for exactly this task. Provide libvirt with a list of XML documents, each describing a host’s CPU model, and it will internally convert these to CPUID masks, calculate their intersection, and finally convert the resulting CPUID mask back into an XML CPU description. Taking the CPU description from a random server:
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>phenom</model>
      <topology sockets='2' cores='4' threads='1'/>
      <feature name='osvw'/>
      <feature name='3dnowprefetch'/>
      <feature name='misalignsse'/>
      <feature name='sse4a'/>
      <feature name='abm'/>
      <feature name='cr8legacy'/>
      <feature name='extapic'/>
      <feature name='cmp_legacy'/>
      <feature name='lahf_lm'/>
      <feature name='rdtscp'/>
      <feature name='pdpe1gb'/>
      <feature name='popcnt'/>
      <feature name='cx16'/>
      <feature name='ht'/>
      <feature name='vme'/>
    </cpu>
    ...snip...
As a quick check, it is possible to ask libvirt whether this CPU description is compatible with the previous laptop CPU description, using the “virsh cpu-compare” command:
$ ./tools/virsh cpu-compare cpu-server.xml
CPU described in cpu-server.xml is incompatible with host CPU
libvirt is correctly reporting that the CPUs are incompatible, because there are several features in the laptop CPU that are missing in the server CPU. To be able to migrate between the laptop and the server, it will be necessary to mask out some features, but which ones? Again libvirt provides an API for this, also exposed via the “virsh cpu-baseline” command:
# virsh cpu-baseline both-cpus.xml
<cpu match='exact'>
  <model>pentium3</model>
  <feature policy='require' name='lahf_lm'/>
  <feature policy='require' name='lm'/>
  <feature policy='require' name='cx16'/>
  <feature policy='require' name='monitor'/>
  <feature policy='require' name='pni'/>
  <feature policy='require' name='ht'/>
  <feature policy='require' name='sse2'/>
  <feature policy='require' name='clflush'/>
  <feature policy='require' name='apic'/>
</cpu>
libvirt has determined that in order to safely migrate a guest between the laptop and the server, it is necessary to mask out 11 features from the laptop’s XML description.
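For anyone who wants to see which 11 features those are, the following Python sketch diffs the feature names in the laptop's `<cpu>` element against the ones in the baseline result. Comparing names directly is only valid here because both descriptions happen to use the same pentium3 model. The same trick applied to the laptop and server XML would give the wrong answer, since the phenom model hides features inside the model name, which is why libvirt expands everything to CPUID masks first. The helper name `feature_names` is my own.

```python
import xml.etree.ElementTree as ET

# <cpu> element from the laptop's "virsh capabilities" output (topology omitted).
LAPTOP_CPU = """
<cpu>
  <arch>i686</arch>
  <model>pentium3</model>
  <feature name='lahf_lm'/> <feature name='lm'/> <feature name='xtpr'/>
  <feature name='cx16'/> <feature name='ssse3'/> <feature name='tm2'/>
  <feature name='est'/> <feature name='vmx'/> <feature name='ds_cpl'/>
  <feature name='monitor'/> <feature name='pni'/> <feature name='pbe'/>
  <feature name='tm'/> <feature name='ht'/> <feature name='ss'/>
  <feature name='sse2'/> <feature name='acpi'/> <feature name='ds'/>
  <feature name='clflush'/> <feature name='apic'/>
</cpu>
"""

# Output of "virsh cpu-baseline both-cpus.xml" as shown above.
BASELINE_CPU = """
<cpu match='exact'>
  <model>pentium3</model>
  <feature policy='require' name='lahf_lm'/>
  <feature policy='require' name='lm'/>
  <feature policy='require' name='cx16'/>
  <feature policy='require' name='monitor'/>
  <feature policy='require' name='pni'/>
  <feature policy='require' name='ht'/>
  <feature policy='require' name='sse2'/>
  <feature policy='require' name='clflush'/>
  <feature policy='require' name='apic'/>
</cpu>
"""


def feature_names(cpu_xml):
    """Collect the name attribute of every <feature> child of a <cpu> element."""
    root = ET.fromstring(cpu_xml)
    return {feature.get("name") for feature in root.findall("feature")}


# Features the laptop reports that the baseline no longer exposes.
masked = sorted(feature_names(LAPTOP_CPU) - feature_names(BASELINE_CPU))
print(len(masked), masked)
```

The script prints a count of 11, with vmx, ssse3 and the thermal and power management flags among the features that get dropped.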
Configuring the guest CPU model
To simplify life, the guest CPU configuration accepts the same basic XML representation as the host capabilities XML exposes. In other words, the XML from that “cpu-baseline” virsh command can now be copied directly into the guest XML at the top level under the <domain> element. As the observant reader will have noticed from that last XML snippet, there are a few extra attributes available when describing a CPU in the guest XML. These can mostly be ignored, but for the curious here’s a quick description of what they do. The top level <cpu> element gets an attribute called “match” with possible values:
- match="minimum" – the host CPU must have at least the CPU features described in the guest XML. If the host has additional features beyond the guest configuration, these will also be exposed to the guest
- match="exact" – the host CPU must have at least the CPU features described in the guest XML. If the host has additional features beyond the guest configuration, these will be masked out from the guest
- match="strict" – the host CPU must have exactly the same CPU features described in the guest XML. No more, no less.
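The three match modes can be modelled on plain sets of feature names. The Python sketch below is only an illustration of the rules in the list above, not libvirt's implementation; the function name `resolve_match` and the use of `RuntimeError` to signal "the guest cannot run on this host" are my own choices.

```python
def resolve_match(match, guest, host):
    """Return the features the guest ends up seeing, or raise RuntimeError.

    guest: set of feature names described in the guest XML
    host:  set of feature names the host CPU actually provides
    """
    missing = guest - host
    if missing:
        # All three modes need the host to cover the guest description.
        raise RuntimeError(f"host lacks: {sorted(missing)}")
    if match == "minimum":
        return set(host)   # extra host features are exposed too
    if match == "exact":
        return set(guest)  # extra host features are masked out
    if match == "strict":
        if host != guest:
            raise RuntimeError(f"host has extras: {sorted(host - guest)}")
        return set(guest)
    raise ValueError(f"unknown match mode {match!r}")


guest = {"sse2", "cx16"}
host = {"sse2", "cx16", "vmx"}
print(sorted(resolve_match("minimum", guest, host)))
print(sorted(resolve_match("exact", guest, host)))
```

With "minimum" the guest also sees vmx, while with "exact" it sees only sse2 and cx16. The same call with "strict" would raise, because the host has one feature more than the guest description.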
The next enhancement is that the <feature> elements can each have an extra “policy” attribute with possible values:
- policy="force" – expose the feature to the guest even if the host does not have it. This is kind of crazy, except in the case of software emulation.
- policy="require" – expose the feature to the guest and fail if the host does not have it. This is the sensible default.
- policy="optional" – expose the feature to the guest if the host happens to support it.
- policy="disable" – if the host has this feature, then hide it from the guest
- policy="forbid" – if the host has this feature, then fail and refuse to start the guest
The “forbid” policy is for a niche scenario where a badly behaved application will try to use a feature even if it is not in the CPUID mask, and you wish to prevent accidentally running the guest on a host with that feature. The “optional” policy has special behaviour wrt migration. When the guest is initially started the flag is optional, but when the guest is live migrated, this policy turns into “require”, since you can’t have features disappearing across migration.
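The five policies, plus the special handling of “optional” on migration, can be sketched in Python as follows. This is a simplified model and not libvirt's code: it only looks at the features listed explicitly and ignores whatever the CPU model itself contributes, and the function names are my own. The article only says that “optional” turns into “require” on migration, so mapping an optional feature that the source host lacked to “disable” is an assumption on my part.

```python
def visible_features(policies, host):
    """Return the listed features the guest will see on this host.

    policies maps feature name -> policy string; host is the set of feature
    names the host CPU provides. Raises RuntimeError when the guest must
    not be started on this host.
    """
    visible = set()
    for name, policy in policies.items():
        present = name in host
        if policy == "force":
            visible.add(name)  # exposed whether or not the host has it
        elif policy == "require":
            if not present:
                raise RuntimeError(f"host lacks required feature {name}")
            visible.add(name)
        elif policy == "optional":
            if present:
                visible.add(name)
        elif policy == "disable":
            pass  # hidden even if the host has it
        elif policy == "forbid":
            if present:
                raise RuntimeError(f"host has forbidden feature {name}")
        else:
            raise ValueError(f"unknown policy {policy!r}")
    return visible


def freeze_for_migration(policies, source_host):
    """Pin down 'optional' features before a live migration.

    An optional feature the guest already sees becomes 'require', as the
    article describes. Turning an optional feature the source host lacked
    into 'disable' is an assumption of this sketch.
    """
    frozen = {}
    for name, policy in policies.items():
        if policy == "optional":
            policy = "require" if name in source_host else "disable"
        frozen[name] = policy
    return frozen


policies = {"sse2": "require", "ssse3": "optional", "vmx": "disable"}
laptop = {"sse2", "ssse3", "vmx"}
server = {"sse2", "vmx"}

print(sorted(visible_features(policies, laptop)))
frozen = freeze_for_migration(policies, laptop)
print(frozen["ssse3"])
```

Started on the laptop, the guest sees sse2 and ssse3. Once frozen for migration, ssse3 is a hard requirement, so evaluating the frozen policies against the server (which lacks ssse3) raises an error instead of letting the feature silently vanish.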
All the stuff described in this posting is currently implemented for libvirt’s QEMU/KVM driver, with the basic code in the 0.7.5/6 releases, but the final ‘cpu-baseline’ stuff is arriving in 0.7.7. Needless to say this will all be available in Fedora 13 and future RHEL. This obviously also needs to be ported over to the Xen and VMware ESX drivers in libvirt, which isn’t as hard as it sounds, because libvirt has a very good internal API for processing CPUID masks now. Kudos to Jiri Denemark for doing all the really hard work on this CPU modelling system!
That article explained a lot to me about how CPU information is presented to guests, but it left me confused. I have 2 hosts, a core2duo and a core2quad, but they are presented as pentium3 and core2duo respectively.
I am trying to install SCO OpenServer 5.0.5 (yes, it is an ancient one) on both a Fedora 16 64-bit and a RHEL 6.2 Xeon-based system. As the virt stuff on F16 is I think more recent than RHEL6.2 (I could be wrong), I am wondering if there are any things I need to consider in the installation.
I did do a successful installation of the VM on a RHEL5.4 (http://harishpillay.livejournal.com/171146.html). With RHEL6.2 and F16, I am not succeeding. What should I be checking for?
The installation in both cases gets as far as loading the packages into the virtual disk, and when it reaches about 80% or so, F16 just stops. Nothing happens. On RHEL6.2, I get a kernel error spewing out on the VM.
Thanks for any suggestions.
Harish
Hi.
First of all thanks for this excellent article.
Is it true that if I expose as many of the host CPU capabilities as possible, the performance of the VMs, and thus of KVM/QEMU, will increase?
Before discovering this post, all my VMs were running without this additional CPU configuration.
Thanks
I have a box with an AMD Athlon II X4 620 and installed CentOS 6 on it. I have installed virt* (KVM), virt-host-validate is passing HW virt, and /proc/cpuinfo has the svm flag. But if I create a VM (CentOS 6), install KVM on it, and set the in the XML file, the svm flag does not exist in the guest CPU features. I have tried many methods, like selecting opteron_g3 and adding a feature to the XML file to expose svm to the guest CPU, but I get nowhere: modprobe kvm_amd gives an error (operation not supported). Nested KVM is enabled on the physical host:
#cat /sys/module/kvm_amd/parameters/nested
1
can you please guide me how to select the cpu config ?
Very useful article! It provides a very simple way for OpenStack users to examine CPU incompatibility problems in VM live migration.
Great information! Thanks…
I want to emulate nonexistent CPU capabilities. Where do I start?
According to the article I can set policy="force", but how exactly does software emulation take place? And what are the requirements for doing so?