Debugging early startup of KVM with GDB, when launched by libvirtd
Earlier today I was asked how one would go about debugging early startup of KVM under GDB, when launched by libvirtd. It was not possible to simply attach to KVM after it had been launched by libvirtd, since by then it was too late. In addition, running the same KVM command outside libvirt did not exhibit the problem that was being investigated.
Fortunately, with a little cleverness, it is actually possible to debug a KVM guest launched by libvirtd, right from the start. The key is to combine a couple of breakpoints with use of follow-fork-mode. When libvirtd starts up a KVM guest, it runs QEMU a couple of times in order to detect which command line arguments are supported. This means the follow-fork-mode setting cannot be changed too early, otherwise GDB will end up following the wrong process.
I happen to know that there is only one place in the libvirt code which calls virCommandSetPreExecHook, and that is immediately before launching the real QEMU process. A nice thing about GDB is that when following forked/exec'd children, it will apply any existing breakpoints in the child, even if it is a new binary. So a breakpoint set on 'main' while still in libvirtd will happily catch 'main' in the QEMU process. The only remaining problem is that if QEMU does not set up and activate the monitor quickly enough, libvirtd will try to kill it off again. Fortunately GDB lets you ignore SIGTERM, and even SIGKILL :-)
The start of the trick is this:
# pgrep libvirtd
12345
# gdb
(gdb) attach 12345
(gdb) break virCommandSetPreExecHook
(gdb) cont
Now, in a separate shell:
# virsh start $GUESTNAME
Back in the GDB shell the breakpoint should have triggered, allowing the trick to be finished:
(gdb) break main
(gdb) handle SIGKILL nopass noprint nostop
Signal        Stop      Print   Pass to program Description
SIGKILL       No        No      No              Killed
(gdb) handle SIGTERM nopass noprint nostop
Signal        Stop      Print   Pass to program Description
SIGTERM       No        No      No              Terminated
(gdb) set follow-fork-mode child
(gdb) cont
process 3020 is executing new program: /usr/bin/qemu-kvm
[Thread debugging using libthread_db enabled]
[Switching to Thread 0x7f2a4064c700 (LWP 3020)]

Breakpoint 2, main (argc=38, argv=0x7fff71f85af8, envp=0x7fff71f85c30)
    at /usr/src/debug/qemu-kvm-0.14.0/vl.c:1968
1968    {
(gdb)
Bingo, you can now debug QEMU startup at your leisure.
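If you need to repeat this often, the two interactive steps can probably be folded into a GDB command file, using a breakpoint command list to run the second half automatically once virCommandSetPreExecHook is hit. What follows is only a rough sketch: the function name write_gdb_commands, the temporary file and the "run" argument are my own illustrative choices, and it assumes there is a single libvirtd process for pgrep to find. I have not tried it against every GDB version, so treat it as a starting point.

```shell
#!/bin/sh
# Sketch only: automates the manual GDB session with a command file.
# write_gdb_commands and the "run" argument are illustrative names,
# not part of libvirt or GDB.

write_gdb_commands() {
    # $1: path of the GDB command file to create
    cat > "$1" <<'EOF'
# Stop where libvirtd is about to launch the real QEMU process; the
# command list then breaks on main, ignores SIGKILL/SIGTERM, follows
# the forked child and resumes.
break virCommandSetPreExecHook
commands
  break main
  handle SIGKILL nopass noprint nostop
  handle SIGTERM nopass noprint nostop
  set follow-fork-mode child
  continue
end
continue
EOF
}

# Only attach when invoked with "run", so sourcing this file is harmless.
if [ "${1:-}" = "run" ]; then
    cmdfile=$(mktemp)
    write_gdb_commands "$cmdfile"
    # pgrep -o selects the oldest matching process (assumes one libvirtd daemon).
    gdb -p "$(pgrep -o libvirtd)" -x "$cmdfile"
fi
```

Run the script as root with the "run" argument, then use virsh start $GUESTNAME in a separate shell as before. GDB should end up at its prompt, stopped in QEMU's main.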