For all my development projects I try to make use of code coverage tools to ensure the test suites are reasonably comprehensive. With Test-AutoBuild, for example, I use the excellent Devel-Cover module. The nightly build runs the test suite and publishes a code coverage report giving a breakdown of test coverage for API documentation, functions, statements, and even conditional expressions. The colour coding of coverage makes it possible to quickly identify modules which are lacking coverage and, given knowledge about which modules contain the most complexity, limited resources for writing tests can be directed to the areas of the code where they will have the biggest impact in raising application quality.
When using code coverage, however, one must be careful not to fall into the trap of writing tests simply to increase coverage. There are many aspects of the code which just aren’t worthwhile testing – for example, areas so simple that the time involved in writing tests is not offset by a meaningful rise in code quality. More importantly, there is a limit to what source code coverage analysis can tell you about real world test coverage. It is perfectly feasible to have 100% coverage over a region of code and still have serious bugs. The root of the problem is that the system being tested is not operating in isolation. No matter how controlled your test environment is, there are always external variables which can affect your code.
I encountered just such an example last weekend. A few months back I added a comprehensive set of tests for validating the checkout of code modules from Perforce, Subversion, and Mercurial. The code coverage report said: 100% covered. Great, I thought, I can finally forget about this bit of code for a while. And then we passed the Daylight Saving Time shift and all the tests started failing. It turned out that the modules were not correctly handling timezone information when parsing dates while DST was in effect. There is no easy way to test for this other than to run the same test suite over & over under at least 4 different timezones – UTC (GMT), BST (GMT+1), EST (GMT-5), EDT (EST+1/GMT-4). Just setting $TZ isn’t really enough – to automate reliably I would really need to run the builds on four different geographically dispersed servers (or perhaps 4 Xen instances each running in a different timezone).
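Setting TZ inside a single process is, as noted, no substitute for running on separate machines, but it makes a cheap first check. The sketch below is in Python rather than the Perl used by Test-AutoBuild, and the function names, the timestamp format and the POSIX-style TZ strings are all assumptions made for illustration. It runs one date parser under several zones and shows that a parser relying on local time gives a different answer in each, while one that treats the input as UTC does not.

```python
import calendar
import os
import time

FORMAT = "%Y/%m/%d %H:%M:%S"  # assumed timestamp layout, for illustration only

# POSIX-style TZ strings (an assumption, chosen so no zoneinfo files are needed)
ZONES = ["UTC0", "GMT0BST,M3.5.0/1,M10.5.0", "EST5EDT,M3.2.0,M11.1.0"]


def parse_as_utc(text):
    """Treat the timestamp as UTC, independent of the local timezone."""
    return calendar.timegm(time.strptime(text, FORMAT))


def parse_as_local(text):
    """Buggy variant: silently interprets the timestamp in the local zone."""
    return int(time.mktime(time.strptime(text, FORMAT)))


def results_per_zone(parser, text, zones=ZONES):
    """Run parser(text) once under each TZ value and collect the results."""
    saved = os.environ.get("TZ")
    results = {}
    try:
        for zone in zones:
            os.environ["TZ"] = zone
            time.tzset()  # only available on Unix-like systems
            results[zone] = parser(text)
    finally:
        # Restore whatever timezone the process started with
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()
    return results
```

A test suite could then assert that every zone yields the same value for a timestamp that falls inside DST, which is exactly the case that full statement coverage never exercised.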
A second example: checking that no module has hardcoded the path separator is simply impossible within a single run of a test suite. Running the tests on UNIX may give a pass, and 100% coverage, but this merely tells me that no module has used ‘\’ or ‘:’ as a path separator. To validate that no module has used ‘/’ as a path separator the only option is to re-run the test suite on Windows. Fortunately virtualization can come to the rescue again, this time in the form of QEMU, which allows emulation of an x86 CPU.
Going back to the example of checking out code from an SCM server, another problem in Test-AutoBuild (which I must address soon) is ensuring that the different failure conditions in talking to the SCM server are handled. Some of the things which can go wrong include: an incorrect host name being specified, a network outage breaking a connection mid-operation, an incorrect path for the module to check out, and a missing installation of the local SCM client tools. 100% test coverage of the code for checking out a module can’t tell you that there is a large chunk of error handling code missing altogether.
In summary, no matter how comprehensive your test suite is, there is always room for improvement. Think about what code is not there – error handling code. Think about what external systems you interact with & the failure scenarios that can occur. Think about what environmental assumptions you might have made – OS path separators. Think about what environmental changes can occur – time zones. So while code coverage is an incredibly valuable tool for identifying which areas of *existing* code are not covered, only use it to help prioritise ongoing development of a test suite, not as an end goal. There really is no substitute for running the tests under as many different environments as you can lay your hands on. And not having access to a large server farm is no longer an excuse – virtualization (take your pick of Xen, QEMU, UML, and VMware) will allow a single server to simulate dozens of different environments. The only limit to testing is your imagination.
It has been just over a year and a half since I first blogged about DTrace, suggesting that a similar tool would be very valuable to the Linux community. Well, after a few long email threads, it turned out that a significant number of people within Red Hat agreed with this assessment, and so in partnership with IBM and Intel the SystemTAP project came into life at the start of 2005. Starting with the previously developed KProbes dynamic instrumentation capability, a huge amount of work has been done building out a high level language and runtime for safely, efficiently & reliably probing the kernel. It has seen a limited ‘technology preview’ in RHEL-4, and with its inclusion in the forthcoming Fedora Core 5 it will be exposed to a much wider community of users & potential developers.
On the very same day as Dave Jones was looking at the Fedora boot process via static kernel instrumentation, I was (completely coincidentally) playing around using SystemTAP to instrument the boot process. The probe I wrote looked at file opens and process fork/execve calls, to enable a hierarchical view of startup to be pieced together. A simplified version of the script looked like:
global indent
# Millisecond timestamp followed by the current process's indentation
function timestamp() {
  return string(gettimeofday_ms()) . indent[pid()] . " "
}
function proc() {
  return string(pid()) . " (" . execname() . ")"
}
# A child is indented one level deeper than its parent
function push(pid, ppid) {
  indent[pid] = indent[ppid] . " "
}
function pop(pid) {
  delete indent[pid]
}
probe kernel.function("sys_clone").return {
  print(timestamp() . proc() . " forks " . string(retval()) . "\n")
  push(retval(), pid())
}
probe kernel.function("do_execve") {
  print(timestamp() . proc() . " execs " . kernel_string($filename) . "\n")
}
probe kernel.function("sys_open") {
  # Bit 0 of the flags is O_WRONLY; everything else is reported as a read
  if ($flags & 1) {
    print(timestamp() . proc() . " writes " . user_string($filename) . "\n")
  } else {
    print(timestamp() . proc() . " reads " . user_string($filename) . "\n")
  }
}
probe kernel.function("do_exit") {
  print(timestamp() . proc() . " exit\n")
  pop(pid())
}
A few tricks later it was running during boot, and having analysed the results with Perl one can display a summary of how many files each init script opened:
1 init read 90 write 30 running...
251 init read 29 write 0 run 23.08s
252 rc.sysinit read 1035 write 45 run 22.91s
274 start_udev read 355 write 128 run 15.10s
286 start_udev read 91 write 0 run 1.90s
287 MAKEDEV read 91 write 0 run 1.88s
291 udevstart read 177 write 124 run 3.95s
614 usleep read 2 write 0 run 1.05s
649 udev-stw.modules read 84 write 5 run 1.23s
701 dmraid read 111 write 0 run 1.07s
748 rc read 235 write 13 run 14.57s
753 S10network read 111 write 16 run 2.85s
833 S12syslog read 44 write 3 run 0.43s
844 S25netfs read 87 write 1 run 1.51s
861 S55cups read 31 write 2 run 1.70s
878 S55sshd read 52 write 1 run 0.86s
892 S97messagebus read 31 write 2 run 0.44s
900 S98NetworkManager read 96 write 10 run 0.58s
910 S98NetworkManagerDispatcher read 92 write 3 run 0.67s
921 S98avahi-daemon read 29 write 0 run 0.41s
929 S98haldaemon read 31 write 2 run 4.20s
955 S99local read 17 write 1 run 0.16s
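The analysis step can be approximated in a few lines. What follows is a minimal Python sketch, not the Perl script actually used to produce the table above. It assumes each log line has the shape emitted by the probe script's print() calls (millisecond timestamp, indentation, pid, process name in parentheses, then the action), and the function names `summarise` and `report` are made up for this example.

```python
import re
from collections import OrderedDict

# Assumed line shape, derived from the probe script's print() calls:
#   <ms><indent> <pid> (<execname>) <forks|execs|reads|writes|exit> [detail]
LINE = re.compile(
    r"^(\d+)\s+(\d+) \((.*?)\) (forks|execs|reads|writes|exit)(?: (.*))?$")


def summarise(lines):
    """Return {pid: {"name", "read", "write", "run"}} from probe output."""
    stats = OrderedDict()
    for line in lines:
        match = LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip anything that is not probe output
        ms, pid = int(match.group(1)), int(match.group(2))
        entry = stats.setdefault(
            pid, {"name": "", "read": 0, "write": 0, "first": ms, "last": ms})
        entry["name"] = match.group(3)  # last name seen, i.e. after any exec
        entry["last"] = ms
        action = match.group(4)
        if action == "reads":
            entry["read"] += 1
        elif action == "writes":
            entry["write"] += 1
    for entry in stats.values():
        # Run time is the span between the first and last event, in seconds
        entry["run"] = (entry.pop("last") - entry.pop("first")) / 1000.0
    return stats


def report(stats):
    """Format one summary line per process, in order of first appearance."""
    return ["%d %s read %d write %d run %.2fs"
            % (pid, e["name"], e["read"], e["write"], e["run"])
            for pid, e in stats.items()]
```

The run time here is only the gap between a process's first and last logged event, so it is an approximation of the figures shown in the table.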
There are so many other interesting ways to analyse the data collected at boot which I don’t have space for in this blog, so I’ve put all the information (including how to run SystemTAP during boot) up on my Red Hat homepage.
Despite the impression given by endless flame wars about the merits of Debian packages vs RPMs, the biggest problems with RPM are not technical, but rather a matter of quality control. Thanks to Debian’s very long release cycles, extensive testing, and the higher than average ability of its maintainers, the Debian packages one encounters are very well produced indeed. RPM is somewhat a victim of its own success & popularity: packages are being produced by a large number of third parties, many of whom have only a passing familiarity with Linux, let alone RPM packaging guidelines. As an unfortunate result, there are a large number of poorly packaged RPMs floating around the net. The even more unfortunate thing is that if just a few simple guidelines were followed, the situation would improve dramatically. So, without further ado, here are some pointers.
- Group names
- Only use group names which are defined in /usr/share/doc/rpm-[VERSION]/GROUPS – don’t make up new groups
- Installation prefix
- Most software should install into the regular /usr hierarchy, and definitely not into /usr/local, or /usr/local/[APP], which are for *unpackaged* software. Since RPM manages all installed files, there is no need to worry about files from different apps being installed side-by-side. RPM knows what belongs to which app and will ensure clean uninstallation of all files, and prevent installation of two apps with conflicting files. If a private installation location is required, then use /opt/APP-NAME/VERSION, which both separates out the app into its own hierarchy & also allows multiple versions to live in harmony.
- Files list
- The %files section MUST list all files belonging to the application. RPM uses this files list for many purposes:
- to ensure that two applications don’t try to install the same file
- to ensure complete removal of all files upon uninstallation
- to verify installed files for changes, from modification time, and ownership right down to md5 sums.
Having a %post section which, for example, extracts a ZIP / tar.gz archive of files for the application completely bypasses these important features of RPM, dramatically lowering the value of having the application packaged in an RPM at all.
- Init scripts
- If the application includes init scripts, make sure they are registered with chkconfig, but don’t turn on the service by default – policy decisions such as whether a daemon should start on boot are the realm of the system administrator, not the software distributor. This is especially important if the application listens on any network ports.
- Patch releases
- After an initial release of software it may be necessary to distribute a patch update. The approach for this is to take the original source RPM, add one or more patches in the spec file, increment the release number & then rebuild the binary RPM.
The Perl DBI module provides a uniform API to access relational databases. Thanks to Perl’s data type model, using it is considerably easier than, say, JDBC in the Java world. In common with most database access APIs, the code is split into two bits: the generic infrastructure is in the DBI module, while backends for each database are in the various DBD modules.
The first task in any program is to connect to the server. This is done with the DBI->connect method. To simplify error handling, it’s good practice to turn on the ‘RaiseError’ option, and turn off ‘PrintError’ and ‘AutoCommit’. Then by wrapping the entire unit of work in an eval we get safe transaction commit/rollback without the need to check the method return status on each DBI call.
use DBI;
my $db;
eval {
    $db = DBI->connect("DBI:Pg:dbname=mydb;host=myhost",
                       $username, $password, {
                           RaiseError => 1,
                           PrintError => 0,
                           AutoCommit => 0
                       });
    # ...do some work with the db...
};
if ($@) {
    # Any DBI failure raised inside the eval ends up here
    if ($db) {
        $db->rollback;
    }
    die $@;
}
$db->commit;
$db->disconnect;
The next task is to issue statements to the DB. Following common practice, DBI allows placeholders to be used in SQL, which are substituted with real values at execution time. If the underlying DB doesn’t support placeholders, DBI will emulate them. For maximum performance it’s advisable to use prepared statement handles, and again DBI will emulate this feature if the underlying driver does not support it.
my $sth1 = $db->prepare_cached("INSERT INTO foo (bar) VALUES (?)");
$sth1->execute($bar);
my $sth2 = $db->prepare_cached("UPDATE foo SET bar = ? WHERE wizz = ?");
$sth2->execute($bar, $wizz);
my $sth3 = $db->prepare_cached("DELETE FROM foo WHERE bar LIKE ?");
$sth3->execute($bar);
The final common task is retrieving data from the DB. There are a number of ways to get data back, but the simplest to code is to bind variables to each return column and then call ‘fetch’, which returns a true value for as long as there is another row.
my $sth4 = $db->prepare_cached("SELECT bar, wizz FROM foo WHERE bar > ?");
$sth4->execute(20);
my ($bar, $wizz);
$sth4->bind_columns(\$bar, \$wizz);
# fetch returns an array ref per row, so the loop does not end early
# when a column happens to hold 0 or an empty string
while ($sth4->fetch) {
    print "Got $bar $wizz\n";
}
Putting this together, a complete example looks like:
use DBI;
my $db;
eval {
    $db = DBI->connect("DBI:Pg:dbname=mydb;host=myhost",
                       $username, $password, {
                           RaiseError => 1,
                           PrintError => 0,
                           AutoCommit => 0
                       });
    my $sth1 = $db->prepare_cached("INSERT INTO foo (bar) VALUES (?)");
    $sth1->execute($bar);
    my $sth2 = $db->prepare_cached("UPDATE foo SET bar = ? WHERE wizz = ?");
    $sth2->execute($bar, $wizz);
    my $sth3 = $db->prepare_cached("DELETE FROM foo WHERE bar LIKE ?");
    $sth3->execute($bar);
    my $sth4 = $db->prepare_cached("SELECT bar, wizz FROM foo WHERE bar > ?");
    $sth4->execute(20);
    my ($bar, $wizz);
    $sth4->bind_columns(\$bar, \$wizz);
    while ($sth4->fetch) {
        print "Got $bar $wizz\n";
    }
};
if ($@) {
    if ($db) {
        $db->rollback;
    }
    die $@;
}
$db->commit;
$db->disconnect;
Scenario
Filtering HTML tags in user entered data is an important aspect of all
web based systems. It serves both to avoid security vulnerabilities & to allow
the site administrator control over what is displayed in the site. In
all WAF based applications I’ve worked on we’ve relied on the fact that
XSL transformers will automatically escape HTML tags unless you specifically
set the ‘disable-output-escaping’ attribute on the <xsl:value-of> tag.
While this has the virtue of being simple & very safe by default, its
crude on / off action is increasingly becoming a source of problems,
particularly with CMS content items.
For an idea of how it’s hurting, consider the following situation:
The combination of these two points creates a problem, because we only
want to allow HTML in certain fields, but we need to enter tags
in any field to set the text direction.
The only way out of this is to change the XSLT so that all fields
allow any HTML tag to be rendered. Which in turn implies we need to
filter HTML tags in user entered data.
Use cases
Before considering how to filter HTML, let’s enumerate a few use cases:
- Allow tag with ‘rtl’ attribute
- Allow any block or inline tag with ‘rtl’ attribute
- Allow any tag, but no onXXX event attributes
- Allow any tag in the HTML-4.0 Strict DTD.
- Disallow any <font> and <blink> tags.
- Allow any inline markup.
- Disallow tables.
There is also the question of what you do to the text when encountering
a forbidden tag. There are two plausible actions:
- Strip out the tag, leaving the content
- Strip out the tag, including the content
The former is applicable for situations where you know the content of
the tag is safe, eg, when stripping <font>…some text…</font> tags, you
ought to let ‘…some text…’ pass through. The latter is applicable
when stripping something like an <applet> tag.
Algorithm design
So, a reasonable approach to filtering HTML would go something like this:
1) Build up a rule set:
   - Set a default ‘tag’ policy – one of:
     - allow – don’t touch tag
     - deny – remove tag & content
     - filter – remove tag, leaving content
   - Set a default ‘attribute’ policy – one of:
     - allow
     - deny
   - Create a hash mapping ‘tag’ to ‘policy’ for all
     tags with a non-default policy
   - For each tag, create a hash mapping ‘attribute’ to
     ‘policy’ for all attributes with a non-default
     policy
2) Tokenise the HTML document, building up a parse tree,
   matching opening & closing tags. Also fill in any closing
   tags that were omitted, eg typically </p>, </li>
3) Traverse the parse tree. When encountering a tag, apply
   the rules
   - if the tag is allowed
     - filter out any attributes which are denied
     - output the opening tag
     - process sub-tags (if any)
     - output the closing tag (if any)
   - if the tag is denied
     - skip the opening tag
     - skip sub-tags (if any)
     - skip the closing tag
   - if the tag is filtered
     - skip the opening tag
     - process sub-tags (if any)
     - skip the closing tag (if any)
The only potentially difficult bit is 2), tokenizing the HTML
and building a syntax tree. Crucial features for such a parser
are:
- Thread safe (we can be serving many requests at once)
- Efficient (ie fast at parsing large amounts of data)
- Character set aware (at least UTF-8)
For Java a suitable candidate is HTMLParser,
while in Perl there is HTML::Tree.
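The rule set and traversal described above can be sketched with Python's standard html.parser module. This is an illustration under stated assumptions, not a hardened sanitiser: it uses a streaming parser with a stack of open tags instead of an explicit parse tree, it closes unclosed tags only when an enclosing tag closes or the input ends (it does not implement HTML's full implied end tag rules), and the names `TagFilter`, `filter_html` and the policy constants are invented for this example.

```python
from html import escape
from html.parser import HTMLParser

ALLOW, DENY, FILTER = "allow", "deny", "filter"
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # no closing tag


class TagFilter(HTMLParser):
    def __init__(self, default_tag=FILTER, default_attr=DENY,
                 tag_policy=None, attr_policy=None):
        super().__init__(convert_charrefs=True)
        self.default_tag = default_tag
        self.default_attr = default_attr
        self.tag_policy = tag_policy or {}    # tag -> policy
        self.attr_policy = attr_policy or {}  # tag -> {attribute -> policy}
        self.out = []
        self.open_tags = []    # stack of (tag, policy) for open elements
        self.denied_depth = 0  # > 0 while inside a denied element

    def handle_starttag(self, tag, attrs):
        policy = self.tag_policy.get(tag, self.default_tag)
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, policy))
            if policy == DENY:
                self.denied_depth += 1
        if self.denied_depth or policy != ALLOW:
            return  # skip the opening tag
        rules = self.attr_policy.get(tag, {})
        kept = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
            if rules.get(name, self.default_attr) == ALLOW)
        self.out.append("<%s%s>" % (tag, kept))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or tag not in [t for t, _ in self.open_tags]:
            return  # stray closing tag, nothing to match it against
        while True:  # also closes any inner tags left open
            open_tag, policy = self.open_tags.pop()
            self.close_element(open_tag, policy)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if not self.denied_depth:
            self.out.append(escape(data, quote=False))

    def close_element(self, tag, policy):
        if policy == DENY:
            self.denied_depth -= 1
        elif policy == ALLOW and not self.denied_depth:
            self.out.append("</%s>" % tag)


def filter_html(text, **rules):
    parser = TagFilter(**rules)
    parser.feed(text)
    parser.close()
    while parser.open_tags:  # close anything the input left open
        parser.close_element(*parser.open_tags.pop())
    return "".join(parser.out)
```

With a default tag policy of ‘filter’ and a default attribute policy of ‘deny’, a call such as `filter_html(text, tag_policy={"p": ALLOW, "applet": DENY})` keeps paragraphs, drops applets together with their content, and lets the text inside any other tag pass through with the tag itself removed.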
Integrating with applications
Now that we have the basics of the HTML filter worked out, there
is a question of integrating it with applications. There are three
possibilities:
a) Filter the data in the form submission
b) Throw a validation error in the form submission if
   forbidden markup is found.
c) Filter the data when generating the output (ie XML or HTML in a JSP/CGI)
In most cases, a) and/or b) are the best approaches since they
catch the problem at the earliest stage. Indeed it may be best
to use a combination of both:
- Default action is to just throw a validation error,
giving the user a chance to fix up their data (this is
nice for letting the user deal with typos).
- If they click the ‘cleanup HTML’ checkbox, then automatically
strip all remaining invalid tags.
The final thought is how to decide on the filtering rule sets.
Again, a one size fits all approach is probably too restrictive.
For example, when using the Article content type in CMS, it is
conceivable that role A (normal authors) should be allowed a
limited set of HTML, but role B (the organization web team) be
allowed arbitrary HTML. Thus there is a case for providing the
site administrator with the means to specify different filtering
rules per role.
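As a rough sketch of the validation approach combined with per-role rule sets, the Python below reports which forbidden tags appear in a submitted field. The role names and allowed tag sets in `ROLE_RULES` are invented for illustration, as are the function names; a real deployment would load the rules from whatever the site administrator configures.

```python
from html.parser import HTMLParser

# Hypothetical per-role rules: a set of permitted tags, or None for "anything"
ROLE_RULES = {
    "author": {"b", "i", "em", "strong", "span", "p"},
    "webteam": None,
}


class TagCollector(HTMLParser):
    """Record the name of every opening tag seen in the input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def forbidden_tags(text, allowed):
    """Return the sorted, de-duplicated tags in text that are not allowed."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return sorted({tag for tag in collector.tags if tag not in allowed})


def validate_field(text, role):
    """Return a list of validation error messages (empty if the field is OK)."""
    allowed = ROLE_RULES[role]
    if allowed is None:
        return []  # this role may use arbitrary HTML
    return ["Tag <%s> is not allowed" % tag
            for tag in forbidden_tags(text, allowed)]
```

A form handler could show these messages to the user by default, and only fall back to stripping the offending tags automatically when the user asks for the cleanup.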