Following the libvirt CVS repository using Mercurial, Tailor and patch queues

Posted: October 20th, 2008 | Filed under: Coding Tips, libvirt, Virt Tools | 2 Comments »

The libvirt project uses CVS as its primary SCM system. Our project code submission process requires that people submit patches for review & approval on the mailing list before committing them to CVS. This is great for catching stupid bugs and working through various design options, but CVS really isn’t conducive to this kind of workflow. It is not unusual to have a number of outstanding patches going through review, and you want to keep them all tracked separately during development. In other words you want a queue of pending patches.

There are many ways to implement such a workflow. A long time ago Andrew Morton had a set of scripts for managing patches, which evolved into Quilt. In these modern times, many projects (including the kernel of course) use GIT, and its ability to rebase branches to achieve this workflow. We even maintain a GIT mirror of libvirt which could be used for managing outstanding patches. While certainly a very capable tool, GIT has a somewhat steep learning curve, and even once learnt it violates the principle of least surprise on a daily basis with seemingly simple operations working in the most bizarre way. So with libvirt though I’ve been using a Mercurial mirror, and its ‘mq’ extension for patch management. This isn’t the only way of skinning the cat – you could also use branches, or the new ‘rebase’ command inspired by GIT, but my mind works in terms of patch queues, so I chose ‘mq’. For benefit of anyone else with an interest in Mecurial, here’s a quick overview of my workflow

The first task is to get a reliable process to synchronize the CVS repository to Mercurial. For this purpose I chose to use Tailor, though there are plenty of other tools which do the job fine too. For a repeatable conversion, you need to create a recipe describing what you want. I have the following script saved as $HOME/work/libvirt-hg.tailor, with execute permission enabled:

$ sudo yum install tailor
$ cd $HOME/work
$ cat > libvirt-hg.tailor <<EOF
#!/usr/bin/env tailor

"""
[DEFAULT]
verbose = True

[project]
target = hg:target
start-revision = INITIAL
root-directory = /home/berrange/work
state-file = libvirt-hg.state
source = cvs:source
subdir = libvirt-hg

[cvs:source]
module = libvirt
repository = :pserver:anoncvs@libvirt.org:2401/data/cvs

[hg:target]

"""
EOF
$ chmod +x libvirt-hg.tailor
$ ./libvirt-hg.tailor

When invoking this script for the first time it’ll take a seriously long time, checking out every revision of every file in CVS, figuring out the changesets, and populating a mercurial repository with them. When complete it will have left a directory libvirt-hg containing the mercurial repository mirroring the CVS repo, and a file libvirt-hg.state recording how much work it has done. On a daily basis (or more/less often as desired), re-running libvirt-hg.tailor will make it “catch up” with any new changes in CVS, updating the libirt-hg repository with the new changesets.

While you could work directly in the libvirt-hg repository, my preference is to work in a clone of it, and leave that as a pristine mirror of the CVS repository. So I create a repository called ‘libvirt-work’ for my day-to-day patch queue. If I want to try out someone else’s work – for example Ben Guthro’s libvirt events patchset, I create a patch queue just for that, by again cloning the pristine libvirt-hg repository.

 $ cd $HOME/work
 $ hg clone libvirt-hg libvirt-work

For some reason Mercurial extensions aren’t enabled by default, so if you’ve not used the ‘mq’ extension before, create a $HOME/.hgrc containing:

$ cat $HOME/.hgrc
[ui]
username="Daniel P. Berrange <berrange@redhat.com>"
[diff]
git = 1
showfunc = 1
[extensions]
hgext.hgk=
hgext.mq=
hgext.extdiff =

This does a number of things. The ‘showfunc’ setting makes it include function names in context diffs. The ‘hgk’ extension is a graphical repository browser, ‘mq’ is the patch queue extension, and ‘extdiff’ allows use of more interesting diff options. With that in place I can create a queue for my new bit of work in libvirt

$ cd $HOME/work
$ cd libvirt-work
$ hg qinit

The first patch I want to work on is refactoring mac address handling, so I’ll add a patch to track this work

$ hg qnew mac-address-refactor

Now off editing files as normal – ‘hg add’ / ‘hg remove; to create / delete files as needed, ‘hg status’ or ‘hg diff’ to see what’s changed, etc. The only difference comes when the work is complete – instead of using ‘hg commit’ to record a permanent changeset, I want to update the patch queue. This is done with the ‘hg qrefresh’ command, and the ‘hg qdiff’ command will show you the contents of the patch file

$ hg qrefresh
$ hg qdiff | diffstat
 capabilities.c  |   16 +++++++++++++++-
 capabilities.h  |   11 +++++++++++
 domain_conf.c   |   34 +++++++---------------------------
 domain_conf.h   |    8 ++------
 lxc_conf.c      |    3 +++
 lxc_driver.c    |    4 +++-
 openvz_conf.c   |    2 +-
 qemu_conf.c     |    3 +++
 qemu_driver.c   |    6 ++++--
 util.c          |   24 +++++++++++++++++++++++-
 util.h          |   12 +++++++++++-
 xen_internal.c  |    3 +++
 xend_internal.c |    8 ++++++--
 xm_internal.c   |   34 +++++++++++-----------------------
 14 files changed, 103 insertions(+), 65 deletions(-)

This bit of work was a pre-cursor to the real thing I wanted to work on, the OpenVZ networking code. With this refactoring out of the way, I want to add support for the ‘get version’ operation in OpenVZ driver, so I can start a new patch

$ hg qnew openvz-version

Patches stack up in a series, each building on the last

$ hg qseries
mac-address-refactor
openvz-version

Fast forward another hour, and I’ve got the version operation implemented, and ready for the final patch, enhancing the OpenVZ networking support.

$ hg qnew openvz-network
$ hg qseries
mac-address-refactor
openvz-version
openvz-network

In working through this 3rd patch though, I realize there was a problem in the first one, so I save my current work, and then ‘pop’ patches off the queue until I get back to the first.

$ hg qrefresh
$ hg qtop
openvz-network
$ hg qpop
Now at: openvz-version
$ hg qpop
Now at: mac-address-refactor

Once I’ve fixed the mistakes in this patch, I can then ‘push’ patches back onto the queue. If you’re lucky Mercurial can resolve / merge changes, but as with any rebase, sometimes there are conflicts which require fixing up by hand. If this occurrs, ‘.rej’ reject files are saved which must be manually addressed, and then the updated patch saved with ‘hg qrefresh’ before continuing.

$ hg qpush
applying openvz-version
Now at: openvz-version
$ hg qpush
applying openvz-network
Now at: openvz-network

When a piece of work is done the raw patch files all live in $REPO/.hg/patches/ and can be submitted for review to the mailing list.

If a bit of work goes on for a long time, it is often neccessary to re-sync with latest upstream CVS repository. First step in this re-base is to get Tailor to pull down all latest changes into the the local libvirt-hg repository

$ cd $HOME/work
$ ./libvirt-hg.tailor

Then in the work-ing repository to be rebased, first ‘pop’ off all local patches. Then pull in the new changesets from libvirt-hg, before re-applying the patches, fixing up any merge conflicts which arise

$ hg qpop --all
Patch queue now empty
$ hg pull
pulling from /home/berrange/work/libvirt-hg
searching for changes
adding changesets
adding manifests
adding file changes
added 15 changesets with 67 changes to 48 files
(run 'hg update' to get a working copy)
$ hg update
48 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ hg qpush
applying mac-address-refactor
Now at: mac-address-refactor
$ hg qpush
applying openvz-version
Now at: openvz-version
$ hg qpush
applying openvz-network
Now at: openvz-network

This time I was lucky and didn’t have any merge conflicts.

While I remmeber, another reason for keeping the libvirt-hg repository pristine, is that I typically do work on different features across different machines. On one machine I make be doing OpenVZ work, while another has a patch queue of KVM work, while yet another may be Xen work. Instead of running Tailor on every machine, I can just have one master mercurial mirror, and push that where needed. Then each local machine has its own independant libvirt-work cloned repository with relevant patchqueue.

There is soo much more I could describe here – I’ve not even mentioned most of the commands available in the ‘mq’ extension – ‘hg help’ will show them all. For further reading, turn to the Mercurial mq wiki page

kernel-xen is dead. Long live kernel + paravirt_ops

Posted: July 25th, 2008 | Filed under: Fedora | 4 Comments »

In Fedora 9 we discontinued our long standing forward-port of Xen’s 2.6.18 kernel tree, and switch to a generic LKML tree (which already had i386 Xen DomU pv_ops), and added a set of patches to support x86_64 Xen DomU pv_ops. While it lacks functionality compared to the previous Xen kernels, and was certainly less stable for a while, overall this was a great success in terms of maintainability. It was still a separate kernel RPM though…

Jeremy Fitzhardinge has meanwhile continued to improve the Xen i386 pv_ops tree stability & functionality in upstream LKML, and has also taken the hacky Fedora x86_64 pv_ops patches, majorly cleaned them up & worked them into a form that was acceptable for upstream. A couple of days ago Ingo sent Jeremy’s work onto Linus who promptly merged it for 2.6.27

Fedora 10 Rawhide of course is tracking 2.6.27 so yesterday Mark McLoughlin turned on Xen pv_ops in the main kernel RPM, and killed off ‘kernel-xen’.

So for Fedora 10 we’ll have one kernel RPM to rule them all. By the magic of pv_ops, auto-detecting whether its running bare metal, Xen, VMWare (VMI) or KVM at boot and optimizing itself for each platform!

There’s only one small wrinkle, that isn’t really Xen’s fault. Anaconda install images on 32-bit Fedora are a i586 non-PAE kernel. Xen 32-bit is i686, PAE only, so we still need to have a separate initrd and vmlinux for installation – but at least its derived from the general purpose ‘kernel-PAE’ binary, instead of ‘kernel-xen’. Of course 64-bit doesn’t have this complication. Someone just needs to fix 32-bit Linux so it can auto-switch between non-PAE and PAE at runtime. It was always said to be impossible to unify UP & SMP kernels…until someone actually did it. Now just need someone to do the impossible for PAE and all will be right with the world :-)

It’s taken a long time & alot of work by many many people to get Xen’s DomU kernel bits merged upstream, so congratulations to all involved on getting another architecture merged, enabling us to finally take full advantage of paravirt_ops in Fedora’s Xen kernels.

New Java bindings for libvirt

Posted: June 26th, 2008 | Filed under: libvirt, Virt Tools | No Comments »

DV has recently been looking at the issue of Java bindings for libvirt. A few months back a libvirt community member, Tóth István, contributed most of the code for Java bindings to libvirt. Daniel has now taken this codebase added a build system, and is hosting it in the libvirt CVS repository and done a formal release. This should be hitting Fedora 10 rawhide in the near future, meaning we now have bindings for C, Perl, Python, OCaml, Ruby and Java. Now who wants to do a PHP binding….that’s the only other language commonly requested

Red Hat Summit 2008

Posted: June 18th, 2008 | Filed under: Uncategorized | No Comments »

Just finished my talk at the Red Hat Summit on libvirt and virtualization tools. For those who are interested, the I’ve now posted the slides online.

Better living through API design: low level memory allocation

Posted: May 23rd, 2008 | Filed under: Coding Tips | No Comments »

The libvirt library provides a core C API upon which language specific bindings are also built. Being written in C of course means we have to do our memory management / allocation. Historically we’ve primarily used the standard malloc/realloc/calloc/free APIs that even knows and loves/hates, although we do have an internal variable length string API (more on that in a future post). Of course there are many other higher level memory management APIs (obstacks/memory pools, garbage collectors, etc) you can write C code with too, but in common with the majority of other other apps/code we happen to be using the general purpose malloc/free pattern everywhere. I’ve recently been interested in improving our code quality, reliability and testing, and found myself thinking about whether there was a way to make our use of these APIs less error prone, and verifiably correct at build time.

If you consider the standard C library malloc/realloc/calloc/free APIs, they in fact positively encourage application coding errors. Off the top of my head there are at least 7 common problems, probably more….

  1. malloc() – pass incorrect number of bytes
  2. malloc() – fail to check the return value
  3. malloc() – forget to fully initialize returned memory
  4. free() – double free by not setting pointer to NULL
  5. realloc() – fail to check the return value
  6. realloc() – fail to re-assign the pointer with new address
  7. realloc() – leaking memory upon failure to resize

A great many libraries will create wrapper functions around malloc/free/realloc but they generally only attempt to address one or two of these points. As an example, consider GLib, which has a wrapper around malloc() attempting to address point 2. It does this by making it call abort() on failure to allocate, but then re-introduces the risk by also adding a wrapper which doesn’t abort()

  gpointer g_malloc         (gulong        n_bytes) G_GNUC_MALLOC;
  gpointer g_try_malloc     (gulong        n_bytes) G_GNUC_MALLOC;

It also wraps realloc() for the same reason, and adds an annotation to make the compiler warn if you don’t use the return value

  gpointer g_realloc        (gpointer      mem,
                             gulong        n_bytes) G_GNUC_WARN_UNUSED_RESULT;
  gpointer g_try_realloc    (gpointer      mem,
                             gulong        n_bytes) G_GNUC_WARN_UNUSED_RESULT;

This at least addresses point 6, ensuring that you update your variable with the new pointer, but can’t protect against failure to check for NULL. And the free() wrapper doesn’t address the double-free issue at all.

You can debate which of “checking for NULL” vs “calling abort()” is the better approach – Havoc has some compelling points – but that’s not the point of this blog posting. I was interested in whether I could create wrappers for libvirt which kept the choice in the hands of the caller, while still protecting from the common risks.

  1. For any given pointer type the compiler knows how many bytes need to be allocated for a single instance of it – it sizeof(datatype) or sizeof(*mptr). Both options are a little tedious to write out, eg

    mytype *foo;
    foo = malloc(sizeof(*foo));
    foo = malloc(sizeof(mytype));
    

    Since C has no data type representing a data type, you cannot simply pass a data type to a function as you might like to:

    mytype *foo = malloc(mytype);
    

    You do, however, have access to the preprocessor and so you can achieve the same effect via a trivial macro.

    #define MALLOC(ptr) malloc(sizeof(*(ptr)))
    or
    #define MALLOC(mytype) malloc(sizeof(mytype))
    
  2. GCC has a number of ways to annotate APIs, one of which is __attribute__((__warn_unused_result__)). This causes emit a warning if return value is not used / checked. Add -Werror and you can get guarenteed compile time verification (the rest of your code was 100% warning free, wasn’t it – it ought to be :-). This doesn’t actually help with the NULL check for malloc, because you always do something with the return value – assign it to a variable. The core problem here is that the API encodes the error condition as a special case value in the returned “data”. The obvious answer to this is to separate the two cases, keeping the return value soley as a bi-state success of fail flag.
    The data can be passed by via a output parameter. This might look like:

    myptr *foo;
    mymalloc(&foo;, sizeof(*foo));
    

    And the compiler would be able to warn that you failed to check for failure. It looks rather ugly to write this way, but consider that a preprocessor macro is already being used for the previous issue, so we can merely extend its use

    #define MALLOC(ptr) mymalloc(&(foo), sizeof(*(foo))
    myptr *foo;
    if (MALLOC(foo) < 0)
       ... do error handling...
    
  3. The memory allocated by malloc() is not initialized to zero. For that a separate calloc() function is provided. This leaves open the possiblity of an app mistakenly using the wrong variant. Or consider an app using malloc() and explicitly initializing all struct members. Some time later a new member is added and now the developer needs to find all malloc() calls wrt to that struct and explicitly initilize the new member. It would be safer to always have malloc zero all memory - even though it has an obvious performance impact, the net win in safety is probably worth it for most applications. Dealing with this in realloc() is much harder though, because you don't know how many extra bytes (if any) you need to initialize without knowing the original size.

  4. Since free() takes the pointer to be freed directly it cannot reset the caller's original pointer variable back to NULL. So the application is at risk of mistakenly calling free() a second time, leading to corruption or a crash. free() will happily ignore a NULL pointer though. If free() were to accept a pointer to a pointer instead, it could automatically NULL-ify it. Again this has extra computational cost from the often redundant assignment to NULL, but it is negligable in the context of the work done to actually free the memory region, and easily offset by the safety benefits.

  5. As with point 2, this is caused by the fact that the error condition is special case of the return data value.

  6. This is a fun one which may escape detection for ages because most of the time the returned pointer address is the same as the original address. realloc() doesn't often need to relocate the data to another address. Given that solving the previous point required changing the return data to be an output-parameter, we've already basically solved this issue to. Pass in a pointer to the pointer to be reallocated and it can update the address directly.

  7. What todo on failure of a realloc() is an interesting question. The chances are that most applications will immediately free() the block and go into cleanup code - indeed I don't recall coming across code that would ever continue its work if realloc() failed. Thus one could simply define that if realloc fails it automatically free's the original data too.

Having considered this all its possible to define a set of criteria for the design of a low level memory allocation API that is considerably safer than the standard C one, while still retaining nearly all its flexibility and avoiding the imposition of policy such as calling abort() on failure.

  • Use return values only for a success/fail error condition flag
  • Annotate all the return values with __warn_unused_result__
  • Pass a pointer to the pointer into all functions
  • Define macros around all functions to automatically calculate datatype sizes

So the primary application usage would be via a set of macros:

    VIR_ALLOC(ptr)
    VIR_ALLOC_N(ptr, count)
    VIR_REALLOC_N(ptr, count)
    VIR_FREE(ptr)

These call through to the underlying APIs:

    int virAlloc(void *ptrptr, size_t bytes)
    int virAllocN(void *ptrptr, size_t bytes, size_t count)
    int virReallocN(void *ptrptr, size_t bytes, size_t count)
    int virFree(void *ptrptr)

The only annoying thing here is that although we're passing a pointer to a pointer into all these, the first param is still just 'void *' and not 'void **'. This works because 'void *' is defined to hold any type of pointer, and in addition using 'void **' causes the compiler to complain bitterly about strict aliasing violations. Internally the impls of these methods can still safely cast to 'void **' when deferencing the pointer.

All 3 of Alloc/Realloc functions will have __warn_unused_result__ annotation so the caller is forced to check the return value for failure, validated at compile time generating a warning (or fatal compile error with -Werror).

And finally to wire up the macros to the APIs:

  #define VIR_ALLOC(ptr)            virAlloc(&(ptr), sizeof(*(ptr)))
  #define VIR_ALLOC_N(ptr, count)   virAllocN(&(ptr), sizeof(*(ptr)), (count))
  #define VIR_REALLOC_N(ptr, count) virReallocN(&(ptr), sizeof(*(ptr)), (count))
  #define VIR_FREE(ptr)             virFree(&(ptr))

If this is all sounding fairly abstract, an illustration of usage should clear things up. These snippets are taken from libvirt, showing before and after

Allocating a new variable:

Before

      virDomain *domain;
      if ((domain = malloc(sizeof(*domain))) == NULL) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
      }

After

      virDomain *domain;

      if (VIR_ALLOC(domain) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
      }
Allocating an array of domains

Before

       virDomain *domains;
       int ndomains = 10;

       if ((domains = malloc(sizeof(*domains) * ndomains)) == NULL) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }

After

       virDomain *domains;
       int ndomains = 10;

       if (VIR_ALLOC_N(domains, ndomains) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
Allocating an array of domain pointers

Before

       virDomain **domains;
       int ndomains = 10;

       if ((domains = malloc(sizeof(*domains) * ndomains)) == NULL) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }

After

       virDomain **domains;
       int ndomains = 10;

       if (VIR_ALLOC_N(domains, ndomains) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
Re-allocate the array of domains to be longer

Before

       virDomain **tmp;
       ndomains = 20

       if ((tmp = realloc(domains, sizeof(*domains) * ndomains)) == NULL {
         free(domains);
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
       domains = tmp;

After

       ndomains = 20

       if (VIR_REALLOC_N(domains, ndomains) < 0) {
         __virRaiseError(VIR_ERROR_NO_MEMORY)
         return NULL;
       }
Free the domain

Before

       free(domain);
       domain = NULL;

After

       VIR_FREE(domain);

As these short examples show the number of lines of code hasn't changed much, but the clarity of them has - particularly the realloc() usage, and of course there is now compile time verification of usage. The main problem remaining is the classic memory leak, by forgetting to call free() at all. If you want to use a low level memory allocation API this problem is essentially unavoidable. Fixing it really requires a completely different type of API (eg, obstacks/pools) or a garbage collector. And there's always valgrind to identify leaks which works very nicely, particularly if you have extensive test suite coverage

The original mailing thread from which this post is derived is on the libvirt mailing list