Sunday 16 September 2012

Striving for better code quality.

Software is complex and is never bug free, but fortunately there are many different tools and techniques available to help to identify and catch a large class of common and obscure bugs.

Compilers provide build options that can help drive up code quality by being particularly strict to detect questionable code constructions, for example gcc's -Wall and -pedantic flags.  The gcc -Werror flag is useful during code development to ensure compilation halts with an error on warning messages, this ensures the developer will stop and fix code.

Static analysis during compilation is also a very useful technique, tools such as smatch and Concinelle can identify bugs such as deferencing of NULL pointers, checks for return values and ranges,  incorrect use of && and ||, bad use of unsigned or signed values and many more beside.  These tools were aimed for use on the Linux kernel source code, but can be used on C application source too.  Let's take a moment to see how to use smatch when building an application.

Download the dependencies:
 sudo apt-get install libxml2-dev llvm-dev libsqlite3-dev

Download and build smatch:
 mkdir ~/src  
 cd ~/src  
 git clone git://repo.or.cz/smatch  
 cd smatch  
 make  

Now build your application using smatch:
 cd ~/your_source_code  
 make clean  
 make CHECK="~/src/smatch/smatch --full-path" \
   CC=~/src/smatch/cgcc | tee warnings.log  

..and inspect the warnings and errors in the file warnings.log.  Smatch will produce false-positives, so not every warning or error is necessarily buggy code.

Of course, run time profiling of programs also can catch errors.  Valgrind is an excellent run time profiler that I regularly use when developing applications to catch bugs such as memory leaks and incorrect memory read/writes. I recommend starting off using the following valgrind options:
 --leak-check=full --show-possibly-lost=yes --show-reachable=yes --malloc-fill=  

For example:
 valgrind --leak-check=full --show-possibly-lost=yes --show-reachable=yes \
  --malloc-fill=ff your-program  

Since the application is being run on a synthetic software CPU execution can be slow, however it is amazingly thorough and produces detailed output that is extremely helpful in cornering buggy code.

The gcc compiler also provides mechanism to instrument code for run-time analysis.  The -fmudflap family of options instruments risky pointer and array dereferencing operations, some standard library string and heap functions as well as some other range + validity tests.   For threaded applications use -fmudflapth instead of -fmudflap.   The application also needs to be linked with libmudflap.

Here is a simple example:
 int main(int argc, char **argv)  
 {  
     static int x[100];  
     return x[100];  
 }  

Compile with:
 gcc example.c -o example -fmudflap -lmudflap  

..and mudflap detects the error:
 ./example   
 *******  
 mudflap violation 1 (check/read): time=1347817180.586313 ptr=0x701080 size=404  
 pc=0x7f98d3d17f01 location=`example.c:5:2 (main)'  
    /usr/lib/x86_64-linux-gnu/libmudflap.so.0(__mf_check+0x41) [0x7f98d3d17f01]  
    ./example(main+0x7a) [0x4009c6]  
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f98d397276d]  
 Nearby object 1: checked region begins 0B into and ends 4B after  
 mudflap object 0x190a370: name=`example.c:3:13 x'  
 bounds=[0x701080,0x70120f] size=400 area=static check=3r/0w liveness=3  
 alloc time=1347817180.586261 pc=0x7f98d3d175f1  
 number of nearby objects: 1  

These are just a few examples, however there are many other options too. Electric Fence is a useful malloc debugger, and gcc's -fstack-protector produces extra code to check for buffer overflows, for example in stack smashing. Tools like bfbtester allow us to brute force check command line overflows - this is useful as I don't know many developers who try to thoroughly validate all the options in their command line utilities.

No doubt there are many more tools and techniques available.  If we use these wisely and regularly we can reduce bugs and drive up code quality.