Your open source civic duty (aka NetworkManager bug fixing)

Posted: November 19th, 2006 | Filed under: Uncategorized | 1 Comment »

I’ve been using NetworkManager and the VPNC plugin ever since Fedora Core 4. One thing that always irritated me about the VPN support, though, was that the applet’s list of VPN connections was displayed in ‘bogo sort’ order – aka not ordered in the slightest. I upgraded to Fedora Core 6 recently, found this still hadn’t changed, and was about to submit a bug report when I wondered – “how hard can a fix really be?”. So I decided that rather than just file another BZ ticket and wait for someone to have time to look at it, I should do my open source civic duty & write the patch myself. It turned out to be really quite trivial – all that was needed was replacing a call to g_slist_append with a call to g_slist_insert_sorted. From itch to patch in 30 minutes!
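
For the curious, the change boils down to something like the following minimal sketch (made-up variable names, not the actual NetworkManager applet code):

  /* Build the list of VPN connection names with g_slist_insert_sorted()
   * instead of g_slist_append(), so the menu comes out ordered. */
  #include <glib.h>
  #include <stdio.h>
  #include <string.h>

  static gint
  compare_names (gconstpointer a, gconstpointer b)
  {
    return strcmp ((const char *) a, (const char *) b);
  }

  int
  main (void)
  {
    const char *names[] = { "work-vpn", "colo", "amsterdam-lab" };
    GSList *connections = NULL, *iter;
    guint i;

    for (i = 0; i < G_N_ELEMENTS (names); i++)
      /* previously the equivalent of: g_slist_append (connections, ...) */
      connections = g_slist_insert_sorted (connections, (gpointer) names[i],
                                           compare_names);

    for (iter = connections; iter; iter = iter->next)
      printf ("%s\n", (const char *) iter->data);

    g_slist_free (connections);
    return 0;
  }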

Update: I noticed the VPN configuration tool had the same sorting problem as the drop-down menu of connections, so I knocked up an additional patch which registers a sorting function against the TreeModel for the VPN connection list (sketched below). Finally, in testing this I discovered that if you rename a VPN connection, the list entry with the old name is never removed – which of course calls for a 3rd patch and another BZ entry. A busy evening!
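
Roughly, the second patch’s approach looks like this (the column number and store layout are assumptions for illustration, not the real vpn-properties code):

  /* Register a compare function against the GtkTreeModel backing the
   * VPN connection list so rows stay sorted by name. */
  #include <gtk/gtk.h>
  #include <string.h>

  #define COL_NAME 0   /* assumed column holding the connection name */

  static gint
  sort_by_name (GtkTreeModel *model, GtkTreeIter *a, GtkTreeIter *b,
                gpointer user_data)
  {
    gchar *name_a = NULL, *name_b = NULL;
    gint ret;

    gtk_tree_model_get (model, a, COL_NAME, &name_a, -1);
    gtk_tree_model_get (model, b, COL_NAME, &name_b, -1);
    ret = strcmp (name_a ? name_a : "", name_b ? name_b : "");
    g_free (name_a);
    g_free (name_b);
    return ret;
  }

  static void
  setup_sorting (GtkListStore *store)
  {
    GtkTreeSortable *sortable = GTK_TREE_SORTABLE (store);

    gtk_tree_sortable_set_sort_func (sortable, COL_NAME,
                                     sort_by_name, NULL, NULL);
    gtk_tree_sortable_set_sort_column_id (sortable, COL_NAME,
                                          GTK_SORT_ASCENDING);
  }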

Who/what is doing all this file I/O? (aka stupid userspace)

Posted: November 5th, 2006 | Filed under: Uncategorized | No Comments »

So I had a little spare time the other day and decided to do some experimentation with SystemTap again. Previously I looked at which processes were doing I/O at boot time; that needs updating & re-visiting to see how things have changed since FC5, but I decided to leave it for later. Instead I decided to look at what’s going on in my ‘idle’ GNOME desktop session over a 5 minute period. The SystemTap script I wrote for bootprobes was trivially adapted to serve this new task. Most of the work was in writing a new perl routine to post-process / analyse the data.

The SystemTap script simply outputs a line every time a process closes a file, detailing how many bytes it read & how many it wrote. A very short sample is shown below; the first field is milliseconds since the start of the script, then the pid, then the command name, then the filename & the bytes read & written:

60 2224 (pcscd) access /dev/bus/usb/005/001 read: 0 write: 0
60 2224 (pcscd) access /dev/bus/usb/005/006 read: 0 write: 0
112 2588 (hald-addon-stor) access /dev/hda read: 0 write: 0
120 1888 (ipw3945d) access /sys/bus/pci/drivers/ipw3945/0000:03:00.0/cmd read: 3008 write: 0
223 1888 (ipw3945d) access /sys/bus/pci/drivers/ipw3945/0000:03:00.0/cmd read: 3008 write: 0
324 1888 (ipw3945d) access /sys/bus/pci/drivers/ipw3945/0000:03:00.0/cmd read: 3008 write: 0
428 2837 (python) access /proc/xen/privcmd read: 0 write: 0
428 1888 (ipw3945d) access /sys/bus/pci/drivers/ipw3945/0000:03:00.0/cmd read: 3008 write: 0
428 2831 (xenstored) access /var/lib/xenstored/tdb.0x9e1e9b8 read: 0 write: 73728

Processing this, I correlated parent-child processes to see who’s spawning whom. More importantly, for each unique (pid, filename) pair I tracked how many times the file was opened, the time of each open & the bytes read/written. For files opened more than once, a crude opening frequency & average I/O size were calculated. This produces output like:

 Pid 2060 irqbalance run 292374+ ms...
   reads /proc/interrupts 30 times, once every 9667 ms avg 8192 bytes
   writes /proc/irq/12/smp_affinity once only avg 1 bytes
   writes /proc/irq/23/smp_affinity 10 times, once every 27002 ms avg 1 bytes

Hall of Shame

Anyway, looking through the output, a few examples really stand out for awards in the hall of shame:

1. pcscd – smart card daemon

Top of the hall of shame is pcscd, a daemon for dealing with smart card readers. Once per second, it scans all USB devices on the system, checking to see if any of them are newly plugged-in smartcard readers. For the 99.99999% of people who will never see a smartcard reader this is top kwalitee crack. Perhaps someone might like to help pcscd learn to love HAL, which is perfectly capable of notifying it when a device is plugged in… (a rough sketch of that approach follows the listing below).

 Pid 2224 pcscd run 301599+ ms...
   reads /dev/bus/usb/001/001 301 times, once every 1000 ms avg 43 bytes
   reads /dev/bus/usb/002/001 301 times, once every 1000 ms avg 43 bytes
   reads /dev/bus/usb/003/001 301 times, once every 1000 ms avg 43 bytes
   reads /dev/bus/usb/004/001 301 times, once every 1000 ms avg 43 bytes
   reads /dev/bus/usb/004/002 301 times, once every 1000 ms avg 57 bytes
   reads /dev/bus/usb/005/001 301 times, once every 1000 ms avg 43 bytes
   reads /dev/bus/usb/005/002 301 times, once every 1000 ms avg 43 bytes
   reads /dev/bus/usb/005/005 301 times, once every 1000 ms avg 52 bytes
   reads /dev/bus/usb/005/006 301 times, once every 1000 ms avg 52 bytes
   reads /dev/bus/usb/005/008 301 times, once every 1000 ms avg 50 bytes
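
For illustration, here’s roughly what the HAL-based approach would look like. This is a minimal sketch, not pcscd code, and the actual “is this a smartcard reader?” check is omitted:

  /* Sit idle in a main loop and let HAL call back when a device is
   * plugged in, instead of polling every USB device once a second. */
  #include <stdio.h>
  #include <glib.h>
  #include <dbus/dbus.h>
  #include <dbus/dbus-glib-lowlevel.h>
  #include <libhal.h>

  static void
  device_added (LibHalContext *ctx, const char *udi)
  {
    /* Only now is there any reason to look at the device at all */
    printf ("HAL reports new device: %s\n", udi);
  }

  int
  main (void)
  {
    GMainLoop *loop = g_main_loop_new (NULL, FALSE);
    DBusConnection *conn;
    LibHalContext *ctx;
    DBusError error;

    dbus_error_init (&error);
    conn = dbus_bus_get (DBUS_BUS_SYSTEM, &error);
    if (!conn) {
      fprintf (stderr, "cannot connect to system bus: %s\n", error.message);
      return 1;
    }
    dbus_connection_setup_with_g_main (conn, NULL);

    ctx = libhal_ctx_new ();
    libhal_ctx_set_dbus_connection (ctx, conn);
    libhal_ctx_set_device_added (ctx, device_added);
    if (!libhal_ctx_init (ctx, &error)) {
      fprintf (stderr, "cannot initialise HAL context\n");
      return 1;
    }

    g_main_loop_run (loop);   /* zero I/O until HAL has something to say */
    return 0;
  }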

2. ipw3945d – binary regulatory daemon for ipw3945 wifi

Next up in the hall of shame is Intel’s closed source, binary regulatory daemon for the ipw3945 series of wifi cards. It reads data from a sysfs node corresponding to the wifi PCI device on average once every 100 ms:

 Pid 1888 ipw3945d run 301538+ ms...
   reads /sys/bus/pci/drivers/ipw3945/0000:03:00.0/cmd 2954 times, once every 102 ms avg 3008 bytes
   writes /sys/bus/pci/drivers/ipw3945/0000:03:00.0/cmd 9 times, once every 29654 ms avg 20.3333333333333 bytes

This activity doesn’t appear to be related to the level of network I/O at all; the rate of access is pretty much constant regardless of what the system is doing. Please, pretty please, just move this stuff back into the kernel module or firmware & kill this stupid daemon.

3. multiload-applet – GNOME system monitor applet

It is perhaps a little unfair to put this in the hall of shame, because I did make an explicit choice to turn this applet on, and its entire raison d’être is to sample system state every ‘n’ seconds. The reason it’s here, however, is that its default out-of-the-box sampling frequency is a mere 500 ms, and because of the sheer number of files it feels the need to poll:

 Pid 3290 multiload-apple run 301191+ ms...
   reads /etc/mtab 603 times, once every 499 ms avg 8192 bytes
   reads /proc/loadavg 603 times, once every 499 ms avg 8191 bytes
   reads /proc/meminfo 1206 times, once every 249 ms avg 8191 bytes
   reads /proc/mounts 3618 times, once every 83 ms avg 4096 bytes
   reads /proc/net/dev 603 times, once every 499 ms avg 8192 bytes
   reads /proc/partitions 3618 times, once every 83 ms avg 5461.33333333333 bytes
   reads /proc/stat 603 times, once every 499 ms avg 8191 bytes
   reads /proc/vmstat 603 times, once every 499 ms avg 8191 bytes
   reads /sys/block/dm-0/stat 603 times, once every 499 ms avg 8191 bytes
   reads /sys/block/dm-1/stat 603 times, once every 499 ms avg 8191 bytes
   reads /sys/block/dm-5/stat 603 times, once every 499 ms avg 8191 bytes
   reads /sys/block/sda/sda1/stat 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/collisions 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/rx_bytes 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/rx_errors 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/rx_packets 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/tx_bytes 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/tx_errors 603 times, once every 499 ms avg 8191 bytes
   reads /sys/class/net/eth0/statistics/tx_packets 603 times, once every 499 ms avg 8191 bytes
[snip another 100 files for other network devices vif0, vif1, etc]

This machine is running Xen so there are a number of virtual interfaces – most of them are not active, though that doesn’t stop the applet from reading their tx & rx statistics once every 500 ms. It is not entirely clear why it has to read the tx & rx data for each individual network device, since it only ever displays the aggregate total network traffic for the machine – which it has already obtained from /proc/net/dev.
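
To illustrate the point, a single read of /proc/net/dev is enough to compute the machine-wide totals. A quick sketch (not the applet’s code):

  /* Sum rx/tx byte counts for all interfaces from one file, with no
   * need to open any per-device sysfs statistics files. */
  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    FILE *fp = fopen ("/proc/net/dev", "r");
    char line[512];
    unsigned long long rx, tx, total_rx = 0, total_tx = 0;

    if (!fp) {
      perror ("/proc/net/dev");
      return 1;
    }

    while (fgets (line, sizeof line, fp)) {
      char *colon = strchr (line, ':');
      if (!colon)
        continue;   /* skip the two header lines */
      /* after "iface:" the 1st field is rx bytes, the 9th is tx bytes */
      if (sscanf (colon + 1, "%llu %*u %*u %*u %*u %*u %*u %*u %llu",
                  &rx, &tx) == 2) {
        total_rx += rx;
        total_tx += tx;
      }
    }
    fclose (fp);

    printf ("total rx: %llu bytes, total tx: %llu bytes\n",
            total_rx, total_tx);
    return 0;
  }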

4. xenstored – Xen metadata database

The xenstored process is a ‘special’ program – and I don’t mean that in a good way. A few weeks back I was wondering why merely monitoring the runtime state of a handful of domains was causing xenstored to consume so much CPU time. The long answer is archived, but the short story is that its implementation of transactions requires it to take a complete copy of the database backend file every time it starts & ends a transaction. With XenD performing about 16 transactions every time you ask for info about a domain, if you monitor 20 guest domains once a second, this translates into xenstored doing 80 MB of disk I/O every second.

 Pid 2831 xenstored run 301230+ ms...
   reads /var/lib/xenstored/tdb.0x9e1e9b8 404 times, once every 742 ms avg 196884 bytes
   reads /var/lib/xenstored/tdb.0x9e1f708 403 times, once every 744 ms avg 196884 bytes
   writes /var/lib/xenstored/tdb.0x9e1e9b8 404 times, once every 742 ms avg 73728 bytes
   writes /var/lib/xenstored/tdb.0x9e1f708 404 times, once every 742 ms avg 73728 bytes

Future…

The next task is to get this setup running at boot again, so we can see just what files are contributing to I/O during startup. It would also be useful to extend this to track ioctl() calls, because just tracking read/write is certainly missing some interesting stuff.