Rambling about the pain of dealing with passwords for online services

Posted: January 23rd, 2012 | Filed under: Fedora, Security | 11 Comments »

Over the last 6 months or so, I’ve become increasingly paranoid about password usage for online web services. There have a been a number of high profile attacks against both the commercial world and open source project infrastructure, many of which have led to compromise of password databases. Indeed, it feels like my news feed has at least 1 article a week covering an attack & user account compromise against some online service or other. And this is only the attacks that are detected and/or reported. Plenty more places don’t even realized they have been attacked yet, and there are likely plenty of cover-ups too. This leads me inescapably to my first axiom of password management:

  • Axiom #1: Your password(s) will be compromised. There is no “if”, only “when”

It follows from this, that it is the epitome of foolishness to use the same password for more than one site. Even if you are diligent to watch for news reports of site compromises & quickly change your passwords across all other sites, you are wasting hours of time, and still vulnerable for the period between the attack taking place & being reported in the media (if at all). Out of curiosity, I made a list of every website I could remember where I had registered an account of some kind. I was worried when the list got to 50, I was shocked when it went over 100, and I stopped counting thereafter.

There is a barrage of often conflicting suggestions about how to create strong passwords for accounts. Most websites simply say things like “you must use a mixture of at least 8 letters, numbers and special symbols“. Google have been trying to educate people about how to make up more easily remembered passwords, but XKCD points out the flaws in these commonly suggested approaches. Even if you do decide upon some nice scheme for creating your passwords, you quickly come across many websites which will reject your carefully thought up & remembered password. Compound this with the fact that many websites (typically financial ones) also require you to enter “passwords hints” based on questions that are supposedly easy to remember, but in fact turn out to be anything but. Now multiply by the number of sites you need passwords for (x100). This leads me inescapably to my second axiom of password management:

  • Axiom #2: It is beyond the capabilities of the human brain to remember enough strong passwords

There have been a great many proposals for shared authentication services, whether owned & managed centrally by a corporation like Microsoft Passport, or completely decentralized & vendor independent like OpenID. Today out of all the 100+ sites I use, I can count the number that allow OpenID login on the fingers of one hand. More recently the big social networks have been having some success with positioning themselves as the managers of your identity & providers of authentication to other sites. I am not happy with the idea of any social network being the controller of not only my online identity, but also controller of access to every single website I register with. I don’t trust them with all this data, and they are an extremely high value target for any would be attackers if they control all your website logins. Letting them control all my logins, feels akin to just re-using the same password across every website. I know they have marginally stronger login procedures than most sites, by allowing you to authenticate individual clients used to login, but this isn’t enough to balance the downside. In fact I’m not really convinced that I want any online service to be the manager or all my login details for websites. It is just too big a single point of trust/failure.

A minority of online banking websites now provide you some form of hardware key token generator, or pin entry device to authenticate with. This is clearly not going to work for most websites, due to the cost & distribution problem. Even within the limited scope of financial websites the practicality is limited – if every financial institution I dealt with had key token generators, I’d have a huge pile of hardware devices to look after ! I do like hardware authentication devices and now use them for login to any personal SSH servers that I manage, but with a few exceptions like Fedora, they are not a solution for the online password problem today or the forseable future. I am depressingly lead to my third axiom of password management:

  • Axiom #3: Widespread password authentication is here to stay for many, many years to come

Hmm, perhaps the problem is better described by mapping to the 5 stages of grief

  1. Denial – only careless people have their details compromised, i’ll be fine using the same 4-5 passwords across all sites
  2. Anger – how could $WEBSITE have been so badly run / protected, to let themselves be compromised
  3. Bargaining – if I just let Facebook handle all my logins, they’ll solve all the hard problems for me
  4. Depression – the industry will never get its act together & solve authentication
  5. Acceptance – passwords are here to stay, what can I do to minimize my risks

Well I think I am at step 5 now. I have accepted that passwords are here to stay, that sooner or later one or more of the sites I am registered with will be compromised, and it is impossible for me to remember enough passwords. My goal is thus to minimise the pain and damage.

My conclusion is that the only viable way to manage passwords today is to do the one thing everyone tells you never to do

  • Write down all the passwords

Of course this shouldn’t be taken too literally. I am not suggesting to put a post-it on the monitor with the passwords on it, rather I mean store the passwords in some secure location, which is in turn protected a master password. ie use a password manager application.

Using KeePassX for managing passwords

After looking at a few options on Linux, I ended up choosing KeePassX as my password manager because it had a quite advanced set of features that appealed to me. Before anyone comments, I had discounted any usage of a password manager built into the web browser before even starting. The browser is a directly network facing process of great complexity and frequent security flaws – they just aren’t the right place to be storing all your valuable secrets. The features in KeepPassX that I liked were:

  • Passwords are stored encrypted in a structured database
  • It is possible to specify many different metadata attributes with each password, username, site URL, title, comment, and more.
  • It can copy the password to the clipboard, allowing paste into web browser forms, avoiding the need to manually type in long password sequences
  • It automatically purges passwords from the clipboard after 30 seconds to minimise the window when it is visible
  • The database can be set to automatically lock itself against after 30 seconds, requiring the master password to be entered again to access further password entries
  • The password database can be secured using a password, or a keyfile, or both. The keyfile is just a plain file with random bytes stored somewhere (like a USB key)
  • An advanced password generator with many tunable options

I have several laptops and I want the password database to be usable from either machine. At the same time though, the password database should not become a single point of failure / data loss, so there needs to be multiple copies of it.  Using a password database does have the downside that it becomes a nice single point of attack for the bad guys. It would thus be desirable to have separate password databases for websites used on a general day-to-day business vs security critical seldom used sites ie bank accounts, and other financial institutions. With this in mind the way I decided to use KeepPassX is as follows

  • I purchased 4x USB stick 4 GB capacity for < 5 GBP each, two coloured black and two coloured white
  • All 4 USB sticks were split into 2 partitions, each of 2 GB size.
  • The primary partition is formatted with a Fedora 16 LiveCD. This is to facilitate easy access to the passwords, should I find myself without one of my own Linux laptops close by
  • The second partition is setup with LUKS full disk encryption and formatted with ext4.
  • The partition with the encrypted filesystem is used to store the KeePassX database files and any other important files (GPG keys, SSH keys, etc)
  • The black coloured USB sticks are used to store a database for financial account details
  • The white coloured USB sticks are used to store a database for any other website logins
  • One USB stick of each colour is the designated backup. The backup sticks are kept in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’
  • A shell script which will synchronize files between sticks, to be run periodically to ensure recent-ish backups

With that all decided there merely followed the tedious task of logging into over 100 websites and changing my password on each one. I decided that my default policy would be to let KeePassX generate a new random password for each site made up of letters, numbers and special characters, with a length of 20 characters. Surprisingly the vast majority of sites coped just fine with these passwords. BugZilla turns out to be limited to 16 characters and a handful of ecommerce sites had even shorter limits, or refused to allow use of special characters!

Using E-Mail “plus addressing” for accounts

It is often said that when bad guys have compromised a website’s account database they will try to reuse the same email and password on a number of other high value sites. Since many people reuse passwords & many sites allow login based off an email address, the bad guys will trivially gain access to a significant number of accounts on other non-compromised sites. I am already generating unique passwords for each site, but to add just one more roadblock, I decided that while changing passwords, I would also set a unique email address for every single site.

My exim mail server supports what is known as “plus addressing”, whereby you can append an arbitrary tag to the local part of an email address. For example given an address “fred@example.com” you have an infinite number of unique email address “fred+TAG@example.com” where “TAG” is any reasonable string. Sadly when I tried using plus addressing, I immediately hit problems, because many (broken) form data validation checks think “+” is not a valid character to use in email addresses, or worse they would accept the address but all email they sent would end up in a black hole. Fortunately, it is a trivial matter to reconfigure Exim to allow use of ‘-‘ as the separator for plus addressing, ie to allow “fred-TAG@example.com“.

Out of > 100 websites I updated my account details on, only 1 rejected the use of ‘-‘ in my email address. So now more or less every account I am registered to has both a unique password and unique email address.

In the end, the main thing I an unhappy about is that using a password manager presents a single point of attack for a local computer virus/trojan. Given the frequency with which websites are being compromised these days & the number of sites I need to remember passwords for, I think overall this is clearly still a net win. I will remain on the lookout though for ways to improve the security of the password manager database itself.

Building application sandboxes with libvirt, LXC & KVM

Posted: January 17th, 2012 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , , | 12 Comments »

I have mentioned in passing every now & then over the past few months, that I have been working on a tool for creating application sandboxes using libvirt, LXC and KVM. Last Thursday, I finally got around to creating a first public release of a package that is now called libvirt-sandbox. Before continuing it is probably worth defining what I consider the term “application sandbox” to mean. My working definition is that an “application sandbox” is simply a way to confine the execution environment of an application, limiting the access it has to OS resources. To me one notable point is that there is no need for a separate / special installation of the application to be confined. An application sandbox ought to be able to run any existing application installed in the OS.

Background motivation & prototype

For a few Fedora releases, users have had the SELinux sandbox command which will execute a command with a strictly confined SELinux context applied. It is also able to make limited use of the kernel filesystem namespace feature, to allow changes to the mount table inside the sandbox. For example, the common case is to put in place a different $HOME. The SELinux sandbox has been quite effective, but there is a limit to what can be done with SELinux policy alone, as evidenced by the need to create a setuid helper to enable use of the kernel namespace feature. Architecturally this gets even more problematic as new feature requests need to be dealt with.

As most readers are no doubt aware, libvirt provides a virtualization management API, with support for a wide variety of virtualization technologies. The KVM driver is easily the most advanced and actively developed driver for libvirt with a very wide array of features for machine based virtualization. In terms of container based virtualization, the LXC driver is the most advanced driver in libvirt, often getting new features “for free” since it shares alot of code with the KVM driver, in particular anything cgroup based. The LXC driver has always had the ability to pass arbitrary host filesystems through to the container, and the KVM driver gained similar capabilities last year with the inclusion of support for virtio 9p filesystems. One of the well known security features in libvirt is sVirt, which leverages MAC technology like SELinux to strictly confine the execution environment of QEMU. This has also now been adapted to work for the LXC driver.

Looking at the architecture of the SELinux sandbox command last year, it occurred to me that the core concepts mapped very well to the host filesystem passthrough & sVirt features in libvirt’s KVM & LXC drivers. In other words, it ought to be possible to create application sandboxes using the libvirt API and suitably advanced drivers like KVM or LXC. A few weeks hacking resulted in a proof of concept tool virt-sandbox which can run simple commands in sandboxes built on LXC or KVM.

The libvirt-sandbox API

A command line tool for running applications inside a sandbox is great, but even more useful would be an API for creating application sandboxes that programmers can use directly. While libvirt provides an API that is portable across different virtualization technologies, it cannot magically hide the differences in feature set or architecture between the technologies. Thus the decision was taken to create a new library called libvirt-sandbox that provides a higher level API for managing application sandboxes, built on top of libvirt. The virt-sandbox command from the proof of concept would then be re-implemented using this library API.

The libvirt-sandbox library is built using GObject to enable it to be accessible to any programming language via GObject Introspection. The basic idea is that programmer simply defines the desired characteristics of the sandbox, such as the command to be executed, any arguments, filesystems to be exposed from host, any bind mounts, private networking configuration, etc. From this configuration description, libvirt-sandbox will decide upon & construct a libvirt guest XML configuration that can actually provided the requested characteristics. In other words, the libvirt-sandbox API is providing a layer of policy avoid libvirt, to isolate the application developer from the implementation details of the underlying hypervisor.

Building sandboxes using LXC is quite straightforward, since application confinement is a core competency of LXC. Thus I will move straight to the KVM implementation, which is where the real fun is. Booting up an entire virtual machine probably sounds like quite a slow process, but it really need not be particularly if you have a well constrained hardware definition which avoids any need for probing. People also generally assume that running a KVM guest, means having a guest operating system install. This is absolutely something that is not acceptable for application sandboxing, and indeed not actually necessary. In a nutshell, libvirt-sandbox creates a new initrd image containing a custom init binary. This init binary simply loads the virtio-9p kernel module and then mounts the host OS’ root filesystem as the guest’s root filesystem, readonly of course. It then hands off to a second boot strap process which runs the desired application binary and forwards I/O back to the host OS, until the sandboxed application exits. Finally the init process powers off the virtual machine. To get an idea of the overhead, the /bin/false binary can be executed inside a KVM sandbox with an overall execution time of 4 seconds. That is the total time for libvirt to start QEMU, QEMU to run its BIOS, the BIOS to load the kernel + initrd, the kenrel to boot up, /bin/false to run, and the kernel to shutdown & QEMU to exit. I think 3 seconds is pretty impressive todo all that. This is a constant overhead, so for a long running command like an MP3 encoder, it disappears into the background noise. With sufficient optimization, I’m fairly sure we could get the overhead down to approx 2 seconds.

Using the virt-sandbox command

The Fedora review of the libvirt-sandbox package was nice & straightforward, so the package is already available in rawhide for ready to test the VirtSandbox F17 feature. The virt-sandbox command is provided by the libvirt-sandbox RPM package

# yum install libvirt-sandbox

Assuming libvirt is already installed & able to run either LXC or KVM guests, everything is ready to use immediately.

A first example is to run the ‘/bin/date’ command inside a KVM sandbox:

$ virt-sandbox -c qemu:///session  /bin/date
Thu Jan 12 22:30:03 GMT 2012

You want proof that this really is running an entire KVM guest ? How about looking at the /proc/cpuinfo contents:

$ virt-sandbox -c qemu:///session /bin/cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 2
model name	: QEMU Virtual CPU version 1.0
stepping	: 3
cpu MHz		: 2793.084
cache size	: 4096 KB
fpu		: yes
fpu_exception	: yes
cpuid level	: 4
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm up rep_good nopl pni cx16 hypervisor lahf_lm
bogomips	: 5586.16
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

How about using LXC instead of KVM, and providing an interactive console instead of just a one-shot command ? Yes, we can do that too:

$ virt-sandbox -c lxc:/// /bin/sh
sh-4.2$ ps -axuwf
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0 165436  3756 pts/0    Ss+  22:31   0:00 libvirt-sandbox-init-lxc
berrange    24  0.0  0.1 167680  4688 pts/0    S+   22:31   0:00 libvirt-sandbox-init-common
berrange    47  0.0  0.0  13852  1608 pts/1    Ss   22:31   0:00  \_ /bin/sh
berrange    48  0.0  0.0  13124   996 pts/1    R+   22:31   0:00      \_ ps -axuwf

Notice how we only see the processes from our sandbox, none from the host OS. There are many more examples I’d like to illustrate, but this post is already far too long.

Future development

This blog post might give the impression that every is complete & operational, but that is far from the truth. This is only the bare minimum functionality to enable some real world usage.  Things that are yet to be dealt with include

  • Write suitable SELinux policy extensions to allow KVM to access host OS filesystems in readonly mode. Currently you need to run in permissive mode which is obviously something that needs solving before F17
  • Turn the virt-viewer command code for SPICE/VNC into a formal API and use that to provide a graphical sandbox running Xorg.
  • Integrate a tool that is able to automatically create sandbox instances for system services like apache to facilitate confined vhosting deployments
  • Correctly propagate exit status from the sandboxed command to the host OS
  • Unentangle stderr and stdout from the sandboxed command
  • Figure out how to make dhclient work nicely when / is readonly and resolv.conf must be updated in-place
  • Expose all the libvirt performance tuning controls to allow disk / net I/O controls, CPU scheduling, NUMA affinity, etc
  • Wire up libvirt’s firewall capability to allow detailed filtering of network traffic to/from sandboxes
  • Much more…

For those attending FOSDEM this year, I will be giving a presentation about libvirt-sandbox in the virt/cloud track.

Oh and as well as the released tar.gz mentioned in the first paragraph, or the Fedora RPM, the  code is all available in GIT

The libvirt & virtualization tools software development platform

In the five years since the libvirt project started, alot has changed. The size of the libvirt API has increased dramatically; the number of languages you can access the API from has likewise grown to cover most important targets; libvirt has been translated to fit into several other object models; plugins have been developed to bind libvirt to other tools. At the same time many other libraries have grown up alongside libvirt, not least libguestfs, gtk-vnc and more recently spice-gtk. Together all these pieces provide a rich software development platform for people building virtualization management applications. A picture is worth 1000 words, so to keep this blog post short, here is the way I visualize the pieces in the virtualization tools platform, and a selection of the applications built on it (click to enlarge the image)

The libvirt & virtualization tools software development platform

The base layer

  • libvirt: the core hypervisor agnostic management API, coring virtual machines, host devices, networking, storage, security and more
  • libvirt-qemu: a small set of QEMU specific APIs, such as the ability to talk to the QEMU monitor, or attach to externally launched QEMU guests. This library builds on top libvirt.
  • libguestfs: the library for manipulating and accessing the contents of guest filesystem images. This uses libvirt for some actions internally. libguestfs has its own huge set of language bindings which are not shown in the diagram, for the sake of clarity. It will also soon be gaining a mapping into the GObject type system, which will help it play nicely with other GObject based APIs here.

Language bindings

The language bindings for libvirt aim to be a 1-for-1 export of the libvirt C API into the corresponding language. They generally don’t attempt to change the way the libvirt API looks or is structured. There is generally completely interoperability between all language bindings, so you can trivially have part of your application written in Perl and another part written in Java and play nicely together.

  • libvirt-ocaml: a binding into the OCaml functional language
  • libvirt-php: a binding into the PHP scripting language
  • libvirt-perl: a binding into the Perl scripting language
  • libvirt-python: a binding into the Python scripting language, which comes as a standard part of the libvirt package
  • libvirt-java: a binding into the Java object language
  • libvirt-ruby: a binding into the Ruby scripting language
  • libvirt-csharp: a binding into the C# object language

Object mappings

The object mappings are distinct from language bindings, because they will often significantly change the structure of the libvirt API to fit in the requirement of the object system being targeted. Depending on the object systems involved, this translation might be lossless, thus an application generally has to pick one object system & stick with it. It is not a good idea to do a mixture of SNMP and QMF calls from the same application.

  • libvirt-snmp: an agent for SNMP that translates from an SNMP MIB to libvirt API calls.
  • libvirt-cim: an agent for CIM the translates from the DMTF virtualization schema to the libvirt API
  • libvirt-qmf: an agent for Matahari that translates from a QMF schema to the libvirt API

Infrastructure plugins

Many common infrastructure applications can be extended by adding plugins for new functionality.  This particularly common with network monitoring or performance collection applications. libvirt can of course be used to create plugins for such applications

  • libvirt-collectd: a plugin for collectd that reports statistics on virtual machines
  • libvirt-munin: a plugin for collectd that reports statistics on virtual machines
  • libvirt-nagios: a plugin for nagious that reports where virtual machines are running
  • fence-virt: a plugin for clustering software to allow virtual machines to be “fenced”

GObject layer

The development of a set of GObject based libraries came about after noticing that many users of the basic libvirt API were having to solve similar problems over & over. For example, every application wanted some programmatic way to extract info from XML documents. Many applications wanted libvirt translated into GObjects. Many applications needed a way to determine optimal hardware configuration for operating systems. The primary reasons for choosing to use GObject as the basis for these APIs was first to facilitate development of graphical desktop applications. With the advent of GObject Introspection, the even more compelling reason is that you get language bindings to all GObject libraries for free. Contrary to popular understanding, GObject is not solely for GTK based desktop applications. It is entirely independent of GTK and can be easily used from any conceivable application. If libvirt were to be started from scratch again today, it would probably go straight for GObject as  the basis for the primary C library. It is that compelling.

  • libosinfo: an API for managing metadata related to operating systems. It includes a database of operating systems with details such as common download URLs, magic byte sequences to identify ISO images, lists of supported hardware. In addition there is a database of hypervisors and their supported hardware. The API allows applications to determine the optimal virtual hardware configuration for deployment of an operating system on a particular hypervisor.
  • gvnc: an API providing a client for the RFB protocol, used for VNC servers. The API facilitates the creation of new VNC client applications.
  • spiceglib: an API providing a client for the SPICE protocol, used for SPICE servers. The API facilitates the creation of new SPICE client applications.
  • libvirt-glib: an API binding the libvirt event loop into the GLib main loop, and translating libvirt errors into GLib errors.
  • libvirt-gconfig: an API for generating and manipulating libvirt XML documents. It removes the need for application programmers to directly deal with raw XML themselves.
  • libvirt-gobject: an API which translates the libvirt object model, also integrating them with the lbivirt-gconfig APIs.
  • libvirt-sandbox: an API for building application sandboxes using virtualization technology.

GTK layer

  • gtk-vnc: an API building on gvnc providing a GTK widget which acts as a VNC client. This is used in both virt-manager & virt-viewer
  • spice-gtk: an API building on spice-glib providing a GTK widget which acts as a SPICE client. This is used in both virt-manager & virt-viewer

Applications

  • python-virtinst: provides the original python virt-install command line tool, as well as a python API which is leveraged by virt-manager. The python-virtinst internal API was the motivation behind the libosinfo library and libvirt-gconfig library
  • virt-manager: provides a general purpose desktop application for interacting with libvirt managed virtualization hosts. The virt-manager internal API was the motivation behind the libvirt-gobject library
  • oVirt: the umbrella project for building an open source virtualized data center management application. Its VDSM component uses the libvirt python language bindings for managing KVM hosts
  • OpenStack: the umbrella project for building an open source cloud management application. Its Nova component uses the libvirt python language bindings for managing KVM, Xen and LXC hosts.
  • GNOME Boxes: the new GNOME desktop application for running virtual machines and accessing remote desktops. It uses libirt-gobject, libosinfo, gtk-vnc & spice-gtk via automatically generated vala bindings.

The Future

  • Get oVirt, OpenStack, python-virtinst and virt-manager using the libosinfo library to centralize definitions of what hardware config to use for deploying operating systems
  • Get oVirt & OpenStack using the libvirt-gconfig library to generate configuration, instead of building XML documents up through string concatenation
  • Convert python-virtinst & virt-manager to use the libvirt-gconfig, libvirt-gobject libraries instead of their private internal equivalents
  • Create a remote-viewer library which pulls in both gtk-vnc and spice-gtk in a higher level framework. This is essentially pulling the commonality out of virt-viewer, virt-manager and GNOME boxes use of gtk-vnc and spice-gtk.
  • Create a libvirt-install library which provides APIs for provisioning operating systems. This would be pulling out commonality between the way python-virtinst, GNOME boxes and other applications deploy new operating systems. This would be a bridge layer between libosinfo and libvirt-gobject

There is undoubtably plenty of stuff I left out of this diagram & description. For example there are many other data center & cloud management projects that are based on libvirt, which I left out for clarity.  There are plenty more libvirt plugins for other applications too, many I will never have heard about. No doubt our future plans will change too, as we adapt to new information.  This should have given a good overview of how broad the open source virtualization tools software development ecosystem has become.

Using command line arg & monitor command passthrough with libvirt and KVM

Posted: December 19th, 2011 | Filed under: Fedora, libvirt, Virt Tools | Tags: , , , , , , , | 1 Comment »

The general goal of the libvirt project is to provide an API definition and XML schema that is independent of any one hypervisor technology. This means that for every feature in KVM, we have to take a little time to carefully design a suitable API or XML schema to expose in libvirt, if none already exists. Sometimes we are lucky and features in KVM can be expressed in almost the same way in libvirt, but more often than not, this does not hold true – the design for libvirt may look somewhat different. For example, several KVM monitor commands at the KVM level may be exposed as a single API call in libvirt. Or conversely, several different libvirt APIs may all end up invoking the same underlying KVM monitor commands. The need to put careful thought into libvirt API & XML design means that there may be a short delay between a feature appearing in KVM, and it appearing in libvirt, though it can also be the other way around, where libvirt has a feature but KVM doesn’t provide a way to support it yet!

For many applications / developers this delay is a non-issue, since they don’t need to be on the absolute bleeding edge of development of KVM or libvirt, but there are always exceptions to the rule. For a long time, we did not want to enable direct use of hypervisor specific features at all via libvirt, because of the support implications of doing so. A little over a year ago though, we decide to change our position on this matter. The key to this change in policy was deciding how to clearly demarcate functionality which is long term supportable, vs that which is not.

KVM custom command line arguments

To support custom command line argument passthrough for KVM, we decided to introduce a new set of XML elements in a different XML namespace. This namespace is defined:

  xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'

Our policy is that any guest configuration that uses this QEMU XML namespace is not guaranteed to continue working if either libvirt or KVM are upgraded. A change to the way libvirt generates command line arguments may break the arguments the app has passed through. Alternatively KVM itself may drop or change the semantics of an existing command line argument. In other words applications should not rely on this capability long term, rather they should raise an RFE against libvirt to support it via the primary XML namespace, and just use the QEMU namespace until the RFE is complete.

The QEMU namespace defines syntax for passing arbitrary command line arguments, along with arbitrary environment variables:

<domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>QEMUGuest1</name>
  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
  ...
  <devices>
    <emulator>/usr/bin/qemu</emulator>
    <disk type='block' device='disk'>
      <source dev='/dev/HostVG/QEMUGuest1'/>
      <target dev='hda' bus='ide'/>
    </disk>
    ...
  </devices>
  <qemu:commandline>
    <qemu:arg value='-unknown'/>
    <qemu:arg value='parameter'/>
    <qemu:env name='NS' value='ns'/>
    <qemu:env name='BAR'/>
  </qemu:commandline>
</domain>

Remember that when adding the <qemu:commandline> element, you need to always declare the XML namespace ‘xmlns:qemu’ on the top level <domain> element.

KVM monitor command passthrough

To support custom monitor command passthrough for KVM, we decided to introduce a second ELF library, libvirt-qemu.so, and separate header file /usr/include/libvirt/libvirt-qemu.h. An application that uses APIs in libvirt-qemu.so is not guaranteed to continue working if either libvirt or KVM are upgraded. A change to the way libvirt manages guests may conflict with the monitor commands the app is trying to issue. Alternatively KVM itself may drop or change the semantics of an existing monitor commands. In other words applications should not rely on this capability long term, rather they should raise an RFE against libvirt to support it via the primary library API, and just use libvirt-qemu.so until the RFE is complete.

Currently the libvirt-qemu.h defines two custom APIs

typedef enum {
  VIR_DOMAIN_QEMU_MONITOR_COMMAND_DEFAULT = 0,
  VIR_DOMAIN_QEMU_MONITOR_COMMAND_HMP     = (1 << 0), /* cmd is in HMP */
} virDomainQemuMonitorCommandFlags;

int virDomainQemuMonitorCommand(virDomainPtr domain, const char *cmd,
                                char **result, unsigned int flags);

virDomainPtr virDomainQemuAttach(virConnectPtr domain,
                                 unsigned int pid,
                                 unsigned int flags);

The first allows passthrough of arbitrary monitor commands, while the latter allows attachment to an existing QEMU instance as discussed previously. The monitor command API is quite straighforward, it accepts a string command, and returns a string reply.  The data for the command/reply can be either in HMP or QMP syntax, depending on how QEMU was launched by libvirt. The VIR_DOMAIN_QEMU_MONITOR_COMMAND_HMP flag allows an application to force use of the HMP syntax at all times.

Using monitor command passthrough from virsh

Not all users will be writing directly to the libvirt API, so the monitor command passthrough is also wired up into virsh via the “qemu-monitor-command” API. First is an example using QMP (JSON syntax):

$ virsh qemu-monitor-command vm-vnc '{ "execute": "query-block"}'
{"return":[{"device":"drive-virtio-disk0","locked":false,"removable":false,"inserted":{"ro":false,"drv":"qcow2","encrypted":false,"file":"/home/berrange/VirtualMachines/plain.qcow"},"type":"unknown"}],"id":"libvirt-9"}

And second is an example demonstrating use of HMP with a guest that runs QMP (libvirt automagically redirects via the ‘human-monitor-command’ command)

$ virsh qemu-monitor-command --hmp vm-vnc  'info block'
drive-virtio-disk0: removable=0 file=/home/berrange/VirtualMachines/plain.qcow ro=0 drv=qcow2 encrypted=0

Tainting of guests

Anyone familiar with the kernel will know that it marks itself as tainted whenever the user does something that is outside the boundaries of normal support. We have borrowed this idea from the kernel and apply it to guests run by libvirt too. Any attempt to use either the command line argument passthrough via XML, or QEMU monitor command passthrough via libvirt-qemu.so will result in the guest domain being marked as tainted. This shows up in the libvirt log files. For example after that last example,  $HOME/.libvirt/qemu/log/vm-vnc.log shows the following

Domain id=2 is tainted: custom-monitor

This allows OS distro support staff to determine if something unusal has been done to a guest when they see support tickets raised. Depending on the OS distro’s support policy they may decline to support problem arising from tainted guests. In RHEL for example, any usage of QEMU monitor command passthrough, or command line argument passthrough is outside the bounds of libvirt support, and users would normally be asked to try to reproduce any problem without a tainted guest.

Securing the WordPress admin interface using (Free!) SSL certificates

Posted: December 19th, 2011 | Filed under: Fedora | Tags: , , , , , , | 1 Comment »

Last year I migrated my website off Blogger to a WordPress installation hosted on my Debian server. Historically my website has only been exposed over plain old HTTP, which was fine since the Blogger publishing UI was running HTTPS. With the migration to WordPress install though, the publishing UI is now running on my own webserver and thus the lack of HTTPS on my server becomes a reasonably serious problem. Most people’s first approach to fixing this would be to just generate a self-signed certificate and deploy that for their server, but I rather wanted to have a x509 certificate that would be immediately trusted by any visiting browser.

Getting free x509 certificates from StartSSL

The problem is that the x509 certificate authority system is a bit of a protection racket with recurring fees that just cannot justify the level of integrity they provide. There is one exception to the norm though, StartSSL offer some basic x509 certificates at zero cost. In particular you can get Class1 web server certificates and personal client identity certificates. The web server certificates are restricted in that you can only include 2 domain names in them, your basic domain name & the same domain name with ‘www.’ prefixed. If you want wildcard domains, or multiple different domain names in a single certificate you’ll have to go for their pay-for offerings. For many people, including myself, this limitation will not be a problem.

StartSSL have a nice self-service web UI for generating the various certificates. The first step is to generate a personal client identity certificate, which the rest of their administrative control panel relies on for authentication. After generation is complete, it automatically gets installed into firefox’s certificate database. You are wisely reminded to export the database to a pkcs12 file and back it up somewhere securely. If you loose this personal client certificate, you will be unable to access their control panel for managing your web server certificates. The validation they do prior to issuing the client certificate is pretty minimal, but fully automated, in so much as they send a message to the email address you provide with a secret URL you need to click on. This “proves” that the email address is yours, so you can’t request certificates for someone else’s email address, unless you can hack their email accounts…

Generating certificates for web servers is not all that much more complicated. There are two ways to go about it though, either you can fill in their interactive web form & let their site generate the private key, or you can generate a private key offline and just provide them with a CSR (Certificate Signing Request). I tried todo the former first of all, but for some reason it didn’t work – it got stuck generating the private key, so I switched to generating a CSR instead. The validation they do prior to issuing a certificate for a web server is also automated. This time they do a whois lookup on the domain name you provide, and send a message with a secret URL to the admin, technical & owner email addresses in the whois record. This “proves” that the domain is yours, so you can’t requests certificates for someone else’s domain name, unless you can hack their whois data or admin/tech/owner email accounts…

Setting up Apache to enable SSL

The next step is to configure apache to enable SSL for the website as a whole. There are four files that need to be installed to provide the certificates to mod_ssl

  • ssl-cert-berrange.com.pem – this is the actual certificate StartSSL issued for my website, against StartSSL’s Class1 root certificate
  • ssl-cert-berrange.com.key – this is the private key I generated and used with my CSR
  • ssl-ca-start.com.pem – this is the master StartSSL CA certificate
  • ssl-ca-chain-start.com-class1-server.pem – this is the chain of trust between your website’s certificate and StartSSL’s master CA certificate, via their Class1 root certificate

On my Debian Lenny host, they were installed to the following locations

  • /etc/ssl/certs/ssl-cert-berrange.com.pem
  • /etc/ssl/private/ssl-cert-berrange.com.key
  • /etc/ssl/certs/ssl-ca-chain-start.com-class1-server.pem
  • /etc/ssl/certs/ssl-ca-start.com.pem

The only other bit I needed todo was to setup a new virtual host in the apache config file, listening on port 443

<VirtualHost *:443>
  ServerName www.berrange.com
  ServerAlias berrange.com

  DocumentRoot /var/www/berrange.com
  ErrorLog /var/log/apache2/berrange.com/error_log
  CustomLog /var/log/apache2/berrange.com/access_log combined

  SSLEngine on

  SSLCertificateFile    /etc/ssl/certs/ssl-cert-berrange.com.pem
  SSLCertificateKeyFile /etc/ssl/private/ssl-cert-berrange.com.key
  SSLCertificateChainFile /etc/ssl/certs/ssl-ca-chain-start.com-class1-server.pem
  SSLCACertificateFile /etc/ssl/certs/ssl-ca-start.com.pem
</VirtualHost>

After restarting Apache, I am now able to connect to https://berrange.com/ and that my browser trusts the site with no exceptions required.

Setting up Apache to require SSL client cert for WordPress admin pages

The next phase is to mandate use of a client certificate when accessing any of the WordPress administration pages. Should there be any future security flaws in the WordPress admin UI, this will block any would be attackers since they will not have the requisite client SSL certificate. Mnadating use of client certificates is done with the “SSLVerifyClient require” directive in Apache. This allows the client to present any client certificate that is signed by the CA configured earlier – that is potentially any user of StartSSL.  My intention is to restrict access exclusively to the certificate that I was issued. This requires specification of some match rules against various fields in the certificate. First lets see the Apache virtual host configuration additions:

<Location /wp-admin>
  SSLVerifyClient require
  SSLVerifyDepth  3
  SSLRequire %{SSL_CLIENT_I_DN_C} eq "IL" and \
             %{SSL_CLIENT_I_DN_O} eq "StartCom Ltd." and \
             %{SSL_CLIENT_I_DN_OU} eq "Secure Digital Certificate Signing" and \
             %{SSL_CLIENT_I_DN_CN} eq "StartCom Class 1 Primary Intermediate Client CA" and \
             %{SSL_CLIENT_S_DN_CN} eq "dan@berrange.com" and \
             %{SSL_CLIENT_S_DN_Email} eq "dan@berrange.com"
</Location>

The first 4 match rules here are saying that the client certificate must have been issued by the StartSSL Class1 client CA, while the last 2 matches are saying that the client certificate must contain my email address. The security thus relies on StartSSL not issuing anyone else a certificate using my email address. The whole lot appears inside a location match against ‘/wp-admin’ which is the URL prefix all the WordPress administration pages have. The entire block must also be duplicated using a location match against ‘/wp-login.php’ to protect the user login page too.

<Location /wp-login.php>
  SSLVerifyClient require
  SSLVerifyDepth  3
  SSLRequire %{SSL_CLIENT_I_DN_C} eq "IL" and \
             %{SSL_CLIENT_I_DN_O} eq "StartCom Ltd." and \
             %{SSL_CLIENT_I_DN_OU} eq "Secure Digital Certificate Signing" and \
             %{SSL_CLIENT_I_DN_CN} eq "StartCom Class 1 Primary Intermediate Client CA" and \
             %{SSL_CLIENT_S_DN_CN} eq "dan@berrange.com" and \
             %{SSL_CLIENT_S_DN_Email} eq "dan@berrange.com"
</Location>

Preventing access to the WordPress admin pages via non-HTTPS connections.

Finally, to ensure the login & admin pages cannot be accessed over plain HTTP, it is necessary to alter the virtual host config for port 80, to include

RewriteEngine On
RewriteRule ^(/wp-admin/.*) https://www.berrange.com$1 [L,R=permanent]
RewriteRule ^(/wp-login.php.*) https://www.berrange.com$1 [L,R=permanent]

To be honest, I should just put a redirect on ‘/’ to prevent any use of the plain HTTP site at all, but I want to test how well my tiny virtual server copes with the load before enabling HTTPs for everything.

Hopefully this blog post has demonstrated that setting up your personal webserver with certificates that any browser will trust, is both easy and cheap (free), so there is no reason to use self-signed certificates unless you need multiple domain names / wildcard addresses in your certificates and you’re unwilling to pay money for them.