Live yum upgrading from Fedora 15 to Fedora 16

Posted: November 11th, 2011 | Filed under: Fedora

With Fedora 16 out, I have to perform upgrades on my Fedora machines: two laptops, one Mac Mini and five servers. Officially the only supported way to upgrade Fedora is to use the regular Anaconda installer, or to use pre-upgrade. In the years I’ve been using Fedora (since Fedora Core 5), I have never used Anaconda for upgrades because I rarely have direct access to the machine to boot CDs/DVDs, and whenever I have tried pre-upgrade it has failed. This time it failed on one server because /boot was on software RAID, and on two other servers because /boot was too small. So I have always ended up doing live upgrades using yum directly. The first few times I did this there were some surprises, but I have since settled into a recipe that has a high success rate.

I’m NOT encouraging people to follow my approach, but if you have ended up in a situation where a live yum upgrade is your only option, these notes might help you avoid some pitfalls.

  • Tip 1: Only attempt to upgrade one release at a time; don’t try to skip over releases. This reduces the chances of problems with yum calculating a suitable upgrade path.
  • Tip 2: Avoid having any 3rd party repositories configured with yum, unless you know they already support both the current and target distro release. In practice this means you want to avoid anything except official Fedora, RPM Fusion and Livna repositories.
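
If you do have third party repos enabled, the simplest approach (the repo file name below is just a placeholder for whatever you actually have installed) is to move their definitions out of /etc/yum.repos.d for the duration of the upgrade, and restore them once the new release is working:

# mkdir /root/disabled-repos
# mv /etc/yum.repos.d/some-3rd-party.repo /root/disabled-repos/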

The actual upgrade process I follow is this:

  • Step 1: Ensure the current install is fully updated.
    # yum -y update
  • Step 2: Remove any orphaned packages, i.e. locally installed packages which are not present in any of your active YUM repositories. Orphaned packages are the most common cause of unresolvable RPM dependencies during upgrade. If you choose to skip this step, be aware that you will almost certainly need to revisit it if yum fails to resolve an upgrade path.
    # package-cleanup --orphans
    # rpm -e ...for each orphan you want to remove...
  • Step 3: Purge all YUM cached data about the current repos. This just frees up space by removing cached data files that won’t be needed anymore.
    # yum clean all
  • Step 4: Install the fedora-release, fedora-release-rawhide and fedora-release-notes RPMs for the new release.
    # wget http://mirror.bytemark.co.uk/fedora/linux/releases/16/Everything/x86_64/os/Packages/fedora-release-16-1.noarch.rpm
    # wget http://mirror.bytemark.co.uk/fedora/linux/releases/16/Everything/x86_64/os/Packages/fedora-release-notes-16.1.0-1.fc16.noarch.rpm
    # wget http://mirror.bytemark.co.uk/fedora/linux/releases/16/Everything/x86_64/os/Packages/fedora-release-rawhide-16-1.noarch.rpm
    # rpm -Uvh fedora-release*16*.rpm
  • Step 5: Perform the actual upgrade.
     # yum update

Sometimes, even after removing all orphaned packages and 3rd party packages, the last step will still fail with unresolvable dependencies. This can happen when an RPM has been dropped from the latest Fedora release and thus has no upgrade path. When this happens, it is usually sufficient to just remove the obsolete package and retry the upgrade.
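
As an illustration, suppose yum complains about a package called foo (a made-up name) that no longer exists in the Fedora 16 repos; the fix is simply:

# rpm -e foo
# yum update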

One significant change between Fedora 15 and 16 that is important for the upgrade process is the switch from Grub1 to Grub2. While the live yum upgrade will result in grub2 being installed, it does not update your boot sector. So there are two post-upgrade tasks that must be done before rebooting:

  • Generate the master grub2 config file
    # grub2-mkconfig > /etc/grub2.cfg
  • Install grub2 into the boot sector
    # grub2-install /dev/sda

    As mentioned earlier, on one of the machines I had software RAID configured, so I wanted grub2 installed on both disks

    # grub2-install /dev/sda
    # grub2-install /dev/sdb
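
Before rebooting I also like to sanity check that the freshly generated config really does contain boot entries for the new kernel (purely my own paranoia, not an official step):

# grep -c menuentry /etc/grub2.cfg
# grep vmlinuz /etc/grub2.cfg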

After rebooting into the new kernel, I once again run ‘package-cleanup --orphans’ to find any obsolete packages which can be killed off.

That’s all there was to it. Where I have followed these instructions I have a 100% success rate in live yum upgrades. The only upgrade which failed was the one where I forgot to generate the grub2 config file before rebooting :-)

A “Hello World” like example for GTK-VNC in Perl, Python and JavaScript

Posted: November 4th, 2011 | Filed under: Coding Tips, Fedora, Gtk-Vnc, Virt Tools

I have written before about what a great benefit GObject Introspection is, by removing the need to manually write dynamic language bindings for C libraries. When I ported GTK-VNC to optionally build with GTK3, I did not bother to update the previous manually created Python binding. Instead, application developers are now instructed to use GObject Introspection if they ever want to use GTK-VNC from non-C languages. As a nice demo of the capabilities, I have written the bare minimum “Hello World” style example for GTK-VNC in Perl, Python and JavaScript. The only significant difference between these examples is the syntax for importing a particular library. The Perl binding is the most verbose at importing libraries, which is a surprise, since Perl is normally a very concise language. Hopefully a more concise import syntax will come along soon.

Perl “hello world” VNC client

#!/usr/bin/perl

use Gtk3 -init;
use Glib::Object::Introspection;
Glib::Object::Introspection->setup(basename => 'GtkVnc', version => '2.0', package => 'GtkVnc');
Glib::Object::Introspection->setup(basename => 'GVnc', version => '1.0', package => 'GVnc');

GVnc::util_set_debug(1);

my $win = Gtk3::Window->new ('toplevel');
my $dpy = GtkVnc::Display->new();

$win->set_title("GTK-VNC with Perl");
$win->add($dpy);
$dpy->open_host("localhost", "5900");
$win->show_all; 
Gtk3::main;

Python “hello world” VNC client

#!/usr/bin/python

from gi.repository import Gtk
from gi.repository import GVnc
from gi.repository import GtkVnc

GVnc.util_set_debug(True)

win = Gtk.Window()
dpy = GtkVnc.Display()

win.set_title("GTK-VNC with Python")
win.add(dpy)
dpy.open_host("localhost", "5900")
win.show_all()
Gtk.main()

JavaScript “hello world” VNC client

#!/usr/bin/gjs

const Vnc = imports.gi.GtkVnc;
const GVnc = imports.gi.GVnc;
const Gtk = imports.gi.Gtk;

Gtk.init(0, null);
GVnc.util_set_debug(true);

var win = new Gtk.Window();
var dpy = new Vnc.Display();

win.set_title("GTK-VNC with JavaScript");
win.add(dpy);
dpy.open_host("localhost", "5900");
win.show_all();
Gtk.main(); 

Using URI aliases with libvirt 0.9.7 (forthcoming release)

Posted: October 28th, 2011 | Filed under: Fedora, libvirt, Virt Tools

In the next week or two we will be releasing libvirt 0.9.7, which comes with a new URI alias feature that will be particularly useful for users of tools like virsh and virt-install.

In the same way that SSH allows you to set up hostname aliases in $HOME/.ssh/config, libvirt will now allow you to set up URI aliases in $HOME/.libvirt/libvirt.conf (if you are running unprivileged) or /etc/libvirt/libvirt.conf (if you are running as root). NB: do not confuse this file with libvirtd.conf, which is the server-side libvirtd daemon configuration file.

To begin with, the new libvirt.conf configuration file supports a single parameter, ‘uri_aliases’, which takes a list of entries. Each entry is an alias mapping in the format “ALIAS=URI”. To avoid potential overlap with otherwise valid URIs, the ALIAS part is restricted to the characters a-z, A-Z, 0-9, -, _. The config file format is fairly self-explanatory if you look at an example:

$ cat $HOME/.libvirt/libvirt.conf
uri_aliases = [
  "hail=qemu+ssh://root@hail.cloud.example.com/system",
  "sleet=qemu+tls://sleet.cloud.example.com/system",
]

Once this config has been created, I can then replace any commands like

$ virsh -c qemu+ssh://root@hail.cloud.example.com/system list
$ virt-install -c qemu+tls://sleet.cloud.example.com/system ...some args...

with much simpler commands like

$ virsh -c hail list
$ virt-install -c sleet ...some args...

Applications that use libvirt do not have to do anything special to support URI aliases, as they are automatically handled by the virConnectOpen family of APIs. If, however, an application wishes to prevent the use of URI aliases for some reason, it can do so by passing the VIR_CONNECT_NO_ALIASES constant as a flag to the virConnectOpenAuth API.

For further documentation on libvirt URIs consult the URI reference and the remote driver reference pages on the libvirt website.

Debugging early startup of KVM with GDB, when launched by libvirtd

Posted: October 12th, 2011 | Filed under: Fedora, libvirt, Virt Tools

Earlier today I was asked how one would go about debugging early startup of KVM under GDB, when launched by libvirtd. It was not possible to simply attach to KVM after it had been launched by libvirtd, since that was too late. In addition, running the same KVM command outside libvirt did not exhibit the problem that was being investigated.

Fortunately, with a little cleverness, it is actually possible to debug a KVM guest launched by libvirtd, right from the start. The key is to combine a couple of breakpoints with use of follow-fork-mode. When libvirtd starts up a KVM guest, it runs QEMU a couple of times in order to detect which command line arguments are supported. This means the follow-fork-mode setting cannot be changed too early, otherwise GDB will end up following the wrong process.

I happen to know that there is only one place in the libvirt code which calls virCommandSetPreExecHook, and that is immediately before launching the real QEMU process. A nice thing about GDB is that when following forked/exec’d children, it will apply any existing breakpoints in the child, even if it is a new binary. So a breakpoint set on ‘main’ while still in libvirtd will happily catch ‘main’ in the QEMU process. The only remaining problem is that if QEMU does not set up and activate its monitor quickly enough, libvirtd will try to kill it off again. Fortunately GDB lets you ignore SIGTERM, and even SIGKILL :-)

The start of the trick is this:

# pgrep libvirtd
12345
# gdb
(gdb) attach 12345
(gdb) break virCommandSetPreExecHook
(gdb) cont

Now in a separate shell

# virsh start $GUESTNAME

Back in the GDB shell the breakpoint should have triggered, allowing the trick to be finished:

(gdb) break main
(gdb) handle SIGKILL nopass noprint nostop
Signal        Stop	Print	Pass to program	Description
SIGKILL       No	No	No		Killed
(gdb) handle SIGTERM nopass noprint nostop
Signal        Stop	Print	Pass to program	Description
SIGTERM       No	No	No		Terminated
(gdb) set follow-fork-mode child
(gdb) cont
process 3020 is executing new program: /usr/bin/qemu-kvm
[Thread debugging using libthread_db enabled]
[Switching to Thread 0x7f2a4064c700 (LWP 3020)]
Breakpoint 2, main (argc=38, argv=0x7fff71f85af8, envp=0x7fff71f85c30)
    at /usr/src/debug/qemu-kvm-0.14.0/vl.c:1968
1968	{
(gdb) 

Bingo, you can now debug QEMU startup at your leisure.
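
One extra tip of my own: when you are done poking around, detach rather than quit, so the now-running QEMU process is left alone (assuming it got far enough through startup to keep libvirtd happy):

(gdb) detach
(gdb) quit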

Setting up a Ceph cluster and exporting a RBD volume to a KVM guest

Posted: October 12th, 2011 | Filed under: Fedora, libvirt, Virt Tools

Yesterday I talked about setting up Sheepdog with KVM, so today it is time to discuss the use of Ceph and RBD with KVM.

Host Cluster Setup, the easy way

Fedora has included Ceph for a couple of releases, but since my hosts are on Fedora 14/15, I grabbed the latest ceph 0.31 SRPMs from Fedora 16 and rebuilt those to get something reasonably up2date. In the end I have the following packages installed, though to be honest I don’t really need anything except the base ‘ceph’ RPM:

# rpm -qa | grep ceph | sort
ceph-0.31-4.fc17.x86_64
ceph-debuginfo-0.31-4.fc17.x86_64
ceph-devel-0.31-4.fc17.x86_64
ceph-fuse-0.31-4.fc17.x86_64
ceph-gcephtool-0.31-4.fc17.x86_64
ceph-obsync-0.31-4.fc17.x86_64
ceph-radosgw-0.31-4.fc17.x86_64

Installing the software is the easy bit; configuring the cluster is where the fun begins. I had three hosts available for testing, all of which are virtualization hosts. Ceph has at least 3 daemons it needs to run, which should all be replicated across several hosts for redundancy. There’s no requirement to use the same hosts for each daemon, but for simplicity I decided to run every Ceph daemon on every virtualization host.

My hosts are called lettuce, avocado and mustard. Following the Ceph wiki instructions, I settled on a configuration file that looks like this:

[global]
    auth supported = cephx
    keyring = /etc/ceph/keyring.admin

[mds]
    keyring = /etc/ceph/keyring.$name
[mds.lettuce]
    host = lettuce
[mds.avocado]
    host = avocado
[mds.mustard]
    host = mustard

[osd]
    osd data = /srv/ceph/osd$id
    osd journal = /srv/ceph/osd$id/journal
    osd journal size = 512
    osd class dir = /usr/lib64/rados-classes
    keyring = /etc/ceph/keyring.$name
[osd.0]
    host = lettuce
[osd.1]
    host = avocado
[osd.2]
    host = mustard

[mon]
    mon data = /srv/ceph/mon$id
[mon.0]
    host = lettuce
    mon addr = 192.168.1.1:6789
[mon.1]
    host = avocado
    mon addr = 192.168.1.2:6789
[mon.2]
    host = mustard
    mon addr = 192.168.1.3:6789

The osd class dir bit should not actually be required, but the OSD code looks in the wrong place (/usr/lib instead of /usr/lib64) on x86_64 arches.
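
A quick way to double-check where the RADOS class plugins actually live on a given host (and hence what ‘osd class dir’ needs to point at) is:

# ls -d /usr/lib*/rados-classes
/usr/lib64/rados-classes

On my x86_64 hosts only the /usr/lib64 copy exists, hence the override above.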

With the configuration file written, it is time to actually initialize the cluster filesystem / object store. This is the really fun bit. The Ceph wiki has a very basic page which talks about the mkcephfs tool, along with a scary warning about how it will ‘rm -rf’ all the data on the filesystem it is initializing. It turns out this does not mean your entire host filesystem; AFAICT it only blows away the contents of the directories configured for ‘osd data’ and ‘mon data’, in my case both under /srv/ceph.

The recommended way is to let mkcephfs ssh into each of your hosts and run all the configuration tasks automatically. Having tried the non-recommended way and failed several times before finally getting it right, I can recommend following the recommended way :-P There are some caveats not mentioned in the wiki page though:

  • The configuration file above must be copied to /etc/ceph/ceph.conf on every node before attempting to run mkcephfs.
  • The configuration file on the host where you run mkcephfs must be in /etc/ceph/ceph.conf, or it will get rather confused about where it is on the other nodes.
  • The mkcephfs command must be run as root, since it doesn’t specify ‘-l root’ to ssh, leading to an inability to set up the nodes.
  • The directories /srv/ceph/osd$id must be pre-created, since it is unable to do that itself, despite being able to create the /srv/ceph/mon$id directories.
  • The Fedora RPMs have also forgotten to create /etc/ceph.

With that in mind, I ran the following commands from my laptop, as root

 # n=0
 # for host in lettuce avocado mustard ; \
   do \
       ssh root@$host mkdir -p /etc/ceph /srv/ceph/osd$n; \
       n=$(expr $n + 1); \
       scp /etc/ceph/ceph.conf root@$host:/etc/ceph/ceph.conf; \
   done
 # mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/keyring.bin

On the host where you ran mkcephfs there should now be a file /etc/ceph/keyring.admin. This will be needed for mounting filesystems. I copied it across to all my virtualization hosts

 # for host in lettuce avocado mustard ; \
   do \
       scp /etc/ceph/keyring.admin root@$host:/etc/ceph/keyring.admin; \
   done

Host Cluster Usage

Assuming the setup phase all went to plan, the cluster can now be started. A word of warning though: Ceph really wants your clocks VERY well synchronized. If your NTP server is a long way away, the synchronization might not be good enough to stop Ceph complaining. You really want an NTP server on your local LAN for the hosts to sync against. Sort this out before trying to start the cluster.
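
To get a rough idea of how far out the clocks are before starting anything, I check the NTP peer offsets on each host (just a sanity check; the offset column is in milliseconds):

 # for host in lettuce avocado mustard ; \
   do \
       ssh root@$host ntpq -p; \
   done

Once the reported offsets look sane, the cluster daemons can be started on every host: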

 # for host in lettuce avocado mustard ; \
   do \
       ssh root@$host service ceph start; \
   done

The ceph tool can show the status of everything. The ‘mon’, ‘osd’ and ‘mds’ lines in the status ought to show all 3 hosts present & correct:

# ceph -s
2011-10-12 14:49:39.085764    pg v235: 594 pgs: 594 active+clean; 24 KB data, 94212 MB used, 92036 MB / 191 GB avail
2011-10-12 14:49:39.086585   mds e6: 1/1/1 up {0=lettuce=up:active}, 2 up:standby
2011-10-12 14:49:39.086622   osd e5: 3 osds: 3 up, 3 in
2011-10-12 14:49:39.086908   log 2011-10-12 14:38:50.263058 osd1 192.168.1.1:6801/8637 197 : [INF] 2.1p1 scrub ok
2011-10-12 14:49:39.086977   mon e1: 3 mons at {0=192.168.1.1:6789/0,1=192.168.1.2:6789/0,2=192.168.1.3:6789/0}

The cluster configuration I chose has authentication enabled, so actually mounting the ceph filesystem requires a secret key. This key is stored in the /etc/ceph/keyring.admin file that was created earlier. To view the keyring contents, the cauthtool program must be used:

# cauthtool -l /etc/ceph/keyring.admin 
[client.admin]
	key = AQDLk5VOeHkHLxAAfGjcaUsOXOhJr7hZCNjXSQ==
	auid = 18446744073709551615

The base64 key there is what gets passed to the mount command, repeated on every host that needs the filesystem present:

# mount -t ceph 192.168.1.1:6789:/ /mnt/ -o name=admin,secret=AQDLk5VOeHkHLxAAfGjcaUsOXOhJr7hZCNjXSQ==
error adding secret to kernel, key name client.admin: No such device

For some reason that error message is always printed on my Fedora hosts, but despite it, the mount has actually succeeded:

# grep /mnt /proc/mounts 
192.168.1.1:6789:/ /mnt ceph rw,relatime,name=admin,secret= 0 0

Congratulations, /mnt is now a distributed filesystem. If you create a file on one host, it should appear on the other hosts & vice-versa.
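
A trivial way to prove this to yourself (using the same hosts as before) is to create a file via the mount on one host and list it via another:

# ssh root@lettuce touch /mnt/hello-ceph
# ssh root@avocado ls /mnt
hello-ceph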

RBD Volume setup

A shared filesystem is very nice, and can be used to hold regular virtual disk images in a variety of formats (raw, qcow2, etc). What I really wanted to try was the RBD virtual block device functionality in QEMU. Ceph includes a tool called rbd for manipulating RBD volumes. The syntax of this tool is pretty self-explanatory:

# rbd create --size 100 demo
# rbd ls
demo
# rbd info demo
rbd image 'demo':
	size 102400 KB in 25 objects
	order 22 (4096 KB objects)
	block_name_prefix: rb.0.0
	parent:  (pool -1)

Alternatively, RBD volume creation can be done using qemu-img … at least once the Fedora QEMU package is fixed to enable RBD support:

# qemu-img create -f rbd rbd:rbd/demo  100M
Formatting 'rbd:rbd/demo', fmt=rbd size=104857600 cluster_size=0 
# qemu-img info rbd:rbd/demo
image: rbd:rbd/demo
file format: raw
virtual size: 100M (104857600 bytes)
disk size: unavailable

KVM guest setup

The syntax for configuring an RBD block device in libvirt is very similar to that used for Sheepdog. With Sheepdog, every single virtualization node is also a storage node, so there is no hostname required. Not so for RBD: here it is necessary to specify one or more host names for the RBD servers.

<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='rbd/demo'>
    <host name='lettuce.example.org' port='6789'/>
    <host name='mustard.example.org' port='6789'/>
    <host name='avocado.example.org' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>

More observant people might be wondering how QEMU gets permission to connect to the RBD server, given that the configuration earlier enabled authentication. This is thanks to the magic of the /etc/ceph/keyring.admin file, which must exist on any virtualization host. Patches are currently being discussed which will allow authentication credentials to be set via libvirt, avoiding the need to store the credentials on the virtualization hosts permanently.
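
To round things off, one way to try out the disk XML above without editing the full guest configuration (the file name and guest name here are just placeholders) is to save the snippet to a file and hot-plug it with virsh:

# virsh attach-device $GUESTNAME rbd-disk.xml

The volume should then appear inside the guest as an extra virtio disk, typically /dev/vdb given the target configured above.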