Macro Programming, Unit Testing

Macro programming is hell. Abuse the preprocessor and soon you’ll be joining the ranks of the fallen in the Turing tarpit.

The problem is compounded by the infamously unhelpful error messages of GCC. Today, I’ve spent a good part of an hour trying to decipher and debug this particularly mystifying one :

In file included from test_bitvector.c:14:0:
checkhelper.h:79:34: error: '#' is not followed by a macro parameter
 #define checkAssertXXInt(X, OP, Y) do { \
                                  ^

Check the source and see if you can find the problem. I’ll make your life really easy and promise you that the error is in this small snippet :

#define checkAssertXXInt(X, OP, Y) do { \
  unsigned __int128 _ck_x = (unsigned __int128) (X); \
  uintmax_t _ck_x_hi = (_ck_x >> 64); \
  uintmax_t _ck_x_lo = (_ck_x & 0xFFFFFFFFFFFFFFFF); \
  unsigned __int128 _ck_y = (unsigned __int128) (Y); \
  uintmax_t _ck_y_hi = (_ck_y >> 64); \
  uintmax_t _ck_y_lo = (_ck_y & 0xFFFFFFFFFFFFFFFF); \
  ck_assert_msg(_ck_x OP _ck_y, \
    "Assertion '%s' failed: '%s'==%jX.%016jX, '%s'==%jX.%016jX"#msg, #X#OP#Y, \
    #X, _ck_x_hi, _ck_x_lo, #Y, _ck_y_hi, _ck_y_lo); \
} while (0)

Found out ? No ? First hint : although GCC is technically right and there is a problem in the line shown, it would be much more helpful to show another (symmetrical) error.

Still nothing ? There is no shame : the macro parameter msg used with the # operator in line 87 was not declared in line 79.

Now imagine ferreting the little troll, among a flurry of cascading errors, from a 150-line source that includes 5 other headers. Maddening enough for a Pirsig treatise.

* * *

Ironically, the error occurred whilst I prepared a common header for my unit testing modules.

As a scientist who mostly writes hundred-line scripts, this is the very first time I am designing unit tests. I was worried that the learning curve for a testing library would be steep, and so I was immediately sold by the no-frills proposition of Check, a unit test framework for C. It comes complete with a tutorial that — frankly — if you are an academic programer like me, tells you everything you need to know. (Although I find the diff notation of the examples less friendly than it could be).

It took me half an hour to understand Check, and then I was joyously writing and running my tests. Of course, just as soon, I was already wondering how it could be perfected :

  1. Have the assertion helper macros (e.g., ck_assert_int_eq) allow the printing of extra info, like ck_assert_msg does ;
  2. Solve a small bug in the helper macros : they show the inspected expressions by embedding them directly in the printfed string. The problem : what happens when the expression has a modulo operator (%) ? Exactly ! Printf takes it as an escape character, and catastrophe ensues ;
  3. Have helper macros for the 128-bit integers that GCC offers as an extension to the standard integers of C ;
  4. Solve an slight aesthetical problem, since I much prefer camelCase for the helper functions than snake_case (But I reserve UGLY_UPPER_SNAKE_CASE for preprocessor macros with UGLY_SECONDARY_EFFECTS, which you MUST_REMEMBER, because THEY_BITE !) ;
  5. Include automatically the main() function — which is very much the same for every test. I took the opportunity to make it parse the command-line to set the verbosity of the output.

Below you’ll find the fruits of those amendments. Or if you prefer, here’s a link for the source repo. It is released under the same LGPL 2.1 license as Check. No warranties : if you choose to use it, and your computer turns into an Antikytherean device,  or your code opens a portal to Baator, I am not liable.

/*
 Name        : checkhelper.h -- Helper header for Check unit test library
 Contributor : Eduardo A. do Valle Jr., 2014-01-27
 License     : LGPL 2.1 --- see http://check.sourceforge.net/COPYING.LESSER
 */

#ifndef CHECK_HELPER_H_
#define CHECK_HELPER_H_

#include <assert.h>
#include <check.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


/* --- Alternative macros for unit testing --- */

#define checkAbort ck_abort

#define checkAssert ck_assert

#define checkAbortInfo ck_abort_msg

#define checkAssertInfo ck_assert_msg

#define checkAssertInt(X, OP, Y) do { \
  intmax_t _ck_x = (X); \
  intmax_t _ck_y = (Y); \
  ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%jd, '%s'==%jd", #X#OP#Y, #X, _ck_x, #Y, _ck_y); \
} while (0)

#define checkAssertUInt(X, OP, Y) do { \
  uintmax_t _ck_x = (X); \
  uintmax_t _ck_y = (Y); \
  ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%ju, '%s'==%ju", #X#OP#Y, #X, _ck_x, #Y, _ck_y); \
} while (0)

#define checkAssertXInt(X, OP, Y) do { \
  uintmax_t _ck_x = (X); \
  uintmax_t _ck_y = (Y); \
  ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%jX, '%s'==%jX", #X#OP#Y, #X, _ck_x, #Y, _ck_y); \
} while (0)

#define checkAssertString(X, OP, Y) do { \
  const char* _ck_x = (X); \
  const char* _ck_y = (Y); \
  ck_assert_msg(0 OP strcmp(_ck_y, _ck_x), "Assertion '%s' failed: '%s'==\"%s\", '%s'==\"%s\"", #X#OP#Y, #X, _ck_x, #Y, _ck_y); \
} while (0)

#define checkAssertIntInfo(X, OP, Y, msg, ...) do { \
  intmax_t _ck_x = (X); \
  intmax_t _ck_y = (Y); \
  ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%jd, '%s'==%jd. "#msg, #X#OP#Y, #X, _ck_x, #Y, _ck_y, ## __VA_ARGS__); \
} while (0)

#define checkAssertUIntInfo(X, OP, Y, msg, ...) do { \
  uintmax_t _ck_x = (X); \
  uintmax_t _ck_y = (Y); \
  ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%ju, '%s'==%ju. "#msg, #X#OP#Y, #X, _ck_x, #Y, _ck_y, ## __VA_ARGS__); \
} while (0)

#define checkAssertXIntInfo(X, OP, Y, msg, ...) do { \
  uintmax_t _ck_x = (X); \
  uintmax_t _ck_y = (Y); \
  ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%jX, '%s'==%jX. "#msg, #X#OP#Y, #X, _ck_x, #Y, _ck_y, ## __VA_ARGS__); \
} while (0)

#define checkAssertStringInfo(X, OP, Y, msg, ...) do { \
  const char* _ck_x = (X); \
  const char* _ck_y = (Y); \
  ck_assert_msg(0 OP strcmp(_ck_y, _ck_x), \
    "Assertion '%s' failed: '%s'==\"%s\", '%s'==\"%s\". "#msg, #X#OP#Y, #X, _ck_x, #Y, _ck_y, ## __VA_ARGS__); \
} while (0)


#ifdef __SIZEOF_INT128__

    #define checkAssertXXInt(X, OP, Y) do { \
      unsigned __int128 _ck_x = (unsigned __int128) (X); \
      uintmax_t _ck_x_hi = (_ck_x >> 64); \
      uintmax_t _ck_x_lo = (_ck_x & 0xFFFFFFFFFFFFFFFF); \
      unsigned __int128 _ck_y = (unsigned __int128) (Y); \
      uintmax_t _ck_y_hi = (_ck_y >> 64); \
      uintmax_t _ck_y_lo = (_ck_y & 0xFFFFFFFFFFFFFFFF); \
      ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%jX.%016jX, '%s'==%jX.%016jX", #X#OP#Y, #X, _ck_x_hi, _ck_x_lo, #Y, _ck_y_hi, _ck_y_lo); \
    } while (0)

    #define checkAssertXXIntInfo(X, OP, Y, msg, ...) do { \
      unsigned __int128 _ck_x = (unsigned __int128) (X); \
      uintmax_t _ck_x_hi = (_ck_x >> 64); \
      uintmax_t _ck_x_lo = (_ck_x & 0xFFFFFFFFFFFFFFFF); \
      unsigned __int128 _ck_y = (unsigned __int128) (Y); \
      uintmax_t _ck_y_hi = (_ck_y >> 64); \
      uintmax_t _ck_y_lo = (_ck_y & 0xFFFFFFFFFFFFFFFF); \
      ck_assert_msg(_ck_x OP _ck_y, "Assertion '%s' failed: '%s'==%jX.%016jX, '%s'==%jX.%016jX. "#msg, #X#OP#Y, #X, _ck_x_hi, _ck_x_lo, #Y, _ck_y_hi, _ck_y_lo, ## __VA_ARGS__); \
    } while (0)

#endif


/* --- Unless CHECK_HELPER_TEST_SUITE_NAME is defined with the name to use, testSuite() will be called to build the suite to run --- */

#ifndef CHECK_HELPER_TEST_SUITE_NAME
    #define CHECK_HELPER_TEST_SUITE_NAME testSuite
#endif


/* --- Unless CHECK_HELPER_MAIN_NAME is defined with the name to use, main() will be used --- */

#ifndef CHECK_HELPER_MAIN_NAME
    #define CHECK_HELPER_MAIN_NAME main
#endif


/* --- Provide a main() function by default, unless CHECK_HELPER_NO_MAIN is defined --- */

#ifndef CHECK_HELPER_NO_MAIN

    Suite * CHECK_HELPER_TEST_SUITE_NAME(void);  // declaration

    int CHECK_HELPER_MAIN_NAME(int argc, char **argv) {

        enum print_output mode = CK_NORMAL;

        if      (argc>1 && (strcmp(argv[1], "-v")==0 || strcmp(argv[1], "--verbose")==0)) {
            mode = CK_VERBOSE;
        }
        else if (argc>1 && (strcmp(argv[1], "-q")==0 || strcmp(argv[1], "--quiet")==0)) {
            mode = CK_SILENT;
        }
        else if (argc>1 && (strcmp(argv[1], "-m")==0 || strcmp(argv[1], "--minimal")==0)) {
            mode = CK_MINIMAL;
        }
        else if (argc>1 && (strcmp(argv[1], "-n")==0 || strcmp(argv[1], "--normal")==0)) {
            mode = CK_NORMAL;
        }
        else if (argc>1 && (strcmp(argv[1], "-e")==0 || strcmp(argv[1], "--environment")==0)) {
            mode = CK_ENV;
        }
        else if (argc>1) {
            #ifndef CHECK_HELPER_USAGE_STRING
            printf("usage: [test_executable] [verbosity flags: -(-q)uiet | -(-m)inimal | -(-n)ormal (default) | -(-v)erbose | -(-e)nvironment ]\n"
                   "       if -e or --environment get verbosity from environment variable CK_VERBOSITY (values: silent minimal normal verbose)\n");
            #else
            printf(CHECK_HELPER_USAGE_STRING);
            #endif
            return 1;
        }

        Suite *suite = CHECK_HELPER_TEST_SUITE_NAME();

        SRunner *suiteRunner = srunner_create(suite);
        srunner_run_all(suiteRunner, mode);
        int numberFailed = srunner_ntests_failed(suiteRunner);
        srunner_free(suiteRunner);

        return (numberFailed == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
    }

#endif

#endif

Language Mishaps: Looking for Perfection

I moved through the thesis using a mixture of 90% Java, 7% Python, 2% BASH scripting and 1% C programming (not C++, but to my defence, I’ve always used the C99 standard – so I am not that old-fashioned).

When I’ve first mentioned that I intended to do all the thesis prototyping in Java, reactions were overwhelmingly negative, with gloomy cautionary tales of insufferable slowness, and unbearable memory footprint. I’ve kept those criticisms in mind, but imbued of scientific scepticism, I started prudently to perform some of the experiments in Java.

Python from XKCD, by Randall MunroeSoon I noticed that my prototypes were slower indeed, but not that slower (certainly less than 10× slower, usually less than 2× slower) and that the memory footprint was higher indeed, but not so high that experiments would become unfeasible. But mainly I’ve noticed that my prototypes were much more reliable than the ones written by the C/C++ programmers (everyone else), thus having less tendency to dump the core in front of the Ph.D. supervisor at the meetings. (It’s not that I program better than my colleagues, it’s just the wonders of having a garbage collector).

I was happy with my choice, and I’ve even gently mocked my colleagues when they defended that real programmers do pointer arithmetic and manual memory deallocation (but they know that I was kidding when I said that C++ is a language whose only purpose is to print at random one of the messages “Hello, world !” and “Segmentation Fault: Core Dumped”).

* * *

But despite the success of that “scientific prototyping experience”, Java has never conquered my heart. All languages have their small annoyances, but my gripe with Java comes from quite an essential design choice: the separation between primitive and reference types. This is one of those essential character flaws, and would only be forgivable if Java provided both autoboxing and a very clever optimising compiler. So far, only half of the solution is available: you can pretend at design time that your Integers are ints (and vice-versa), but try doing it in a critical inner loop, and at runtime you’ll become the laughing stock of all C++ programmers of the lab.

I’ve recently flirted with C# (which apparently handles this problem in a better way), but the lack of a general multiplatform implementation has made me shy of adopting it. One of the wonderful things of Java is that I can develop in Windows, my desktop system, and then run in whatever system is available at the University infrastructure: Windows, Linux, Sun’s UNIX – the integration is absolutely seamless.

I have also considered taking the “XKCD Approach” and do everything in Python, but though I really like Python for scripting, I shudder at the idea of relying exclusively on dynamic typing for anything larger than 500 lines of code.

* * *

In a way, this quest for perfection is all the fault of Bertrand Meyer, and of having read his books in my young naïve years. I am still waiting for The Definitive Programming Language. The one with a static type system which make theoreticians drool, and yet is uncannily intuitive to use. The one with a huge Standard Library, yet with a gentle learning curve. The one whose spoonfuls of syntactic sugar helps the medicinal exception handling code go down in a most elegant way.

I suspect I shall be waiting for a good while.