Thursday, August 24, 2006

Re-engineering: Assertions versus Unit Tests

I'm working on re-engineering a whole lot of badly designed code at work. The code has been mostly working for 4 to 5 years (some of it seems to have been working since 2000, if I can go by the author:date comment at the top). Part of the problem is that many things have changed in the business since any particular piece of code was written. Another part is that the database design was optimized for uploading (it's easy to identify all the new records for the day to be uploaded), but it is incredibly anti-optimized for querying. There are around 24 basic transactional tables (that could probably be whittled down to around 12-15, since related things should be together in one table, but that's another issue), and every day an additional 24 are created with the date tacked on (in mmddyyyy format, not even yyyymmdd).

So creating a multi-day report involves either running the same query over multiple tables and processing the data on the client side, or a union over many tables.
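To make the "union over many tables" option concrete, here is a minimal sketch of how such a query could be assembled. The table name `sales`, the columns, and the underscore before the date suffix are all made up for illustration (the real tables only share the mmddyyyy suffix idea), and it assumes a PHP version with type declarations.

```php
<?php
// Hypothetical helper: list the daily table names for a date range.
// Assumes names of the form <base>_<mmddyyyy>; the underscore is a guess.
function dailyTableNames(string $base, string $from, string $to): array
{
    $names = [];
    $day = new DateTimeImmutable($from);
    $end = new DateTimeImmutable($to);
    while ($day <= $end) {
        // 'mdY' formats as mmddyyyy, e.g. 08242006.
        $names[] = $base . '_' . $day->format('mdY');
        $day = $day->modify('+1 day');
    }
    return $names;
}

// Build one SELECT per daily table and join them with UNION ALL.
function unionQuery(string $base, string $columns, string $from, string $to): string
{
    $parts = [];
    foreach (dailyTableNames($base, $from, $to) as $table) {
        $parts[] = "SELECT $columns FROM $table";
    }
    return implode("\nUNION ALL\n", $parts);
}

echo unionQuery('sales', 'id, amount', '2006-08-22', '2006-08-24'), "\n";
```

The table and column names are interpolated straight into the SQL, so this only makes sense when they come from trusted code, never from user input.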

The first part of the re-engineering involves creating suitable abstractions over this mess (and no, it's not fixing the table structures to be more sane; that comes later, since there's too much other code that depends on the insanity). The abstractions will allow us to provide a more reasonable view of the *data* (hiding the database details) to the higher (reporting) levels. This comes at some cost in CPU processing, though. We use PHP, and it's just not very efficient at shoveling a lot of data around while transforming it.

I'm working on basic classes that will interact with the database (everything else interacts with these classes; no higher-level code will touch the database directly). I'm finding a problem, though. The code aggressively checks parameters and, for anything complex, invariants inside the methods. Checking is done through assertions. At the same time, I developed a UnitTest class (a simpler replacement for PHPUnit and similar). When I develop unit tests, I find myself wanting to write tests that make sure the assertions are working. But the assertions can't be tested if assert.bail=1, since the first failed assertion terminates the script and the test run along with it. And the assertions would get much longer if I wanted them to return failure codes to the unit testing framework, when assertions should be simple and just make the program fail immediately.

I would also like to know *which* assertion failed. My parameter checking is pretty paranoid, so there are many checks upon method entry even if there's only one parameter, and there might be checks for object validity even if there are no parameters.
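One way this might be approached (a sketch, not something I've settled on) is to replace assert() with a small helper that throws an exception carrying a label. An uncaught exception still stops the program, while a test can catch it and see which check fired. The names `CheckFailed`, `check`, `dailyTotal`, and `failedCheck` are all hypothetical, and the code assumes a PHP version with exceptions and type declarations.

```php
<?php
// Hypothetical exception type; its message is the label of the failed check.
class CheckFailed extends LogicException {}

// Stand-in for assert(): still one short line per check at the call site.
function check(bool $condition, string $label): void
{
    if (!$condition) {
        throw new CheckFailed($label);
    }
}

// Example method with paranoid entry checks (illustrative only).
function dailyTotal(array $amounts, string $date): float
{
    check(preg_match('/^\d{8}$/', $date) === 1, 'date-format');
    check(count($amounts) > 0, 'amounts-nonempty');
    return (float) array_sum($amounts);
}

// Test-side helper: returns the label of the check that failed,
// or null if the call completed without tripping any check.
function failedCheck(callable $call): ?string
{
    try {
        $call();
    } catch (CheckFailed $e) {
        return $e->getMessage();
    }
    return null;
}

// An empty amounts array should trip the 'amounts-nonempty' check.
var_dump(failedCheck(function () {
    return dailyTotal([], '08242006');
}));
```

The trade-off is that these checks are ordinary function calls, so unlike assert() they can't be switched off by configuration.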

ifdef-ing (well, if (defined(...))-ing) the assertions when unit testing doesn't seem to be the right thing to do either. But I haven't really thought that through yet. Maybe I'll try that and see what falls out.
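For what it's worth, the if (defined(...)) idea might look something like the sketch below. The constant `UNIT_TESTING` and the function `requireThat` are hypothetical names; the assumption is that the test runner defines the constant before loading the classes, and that normal runs never define it.

```php
<?php
// Hypothetical flag; a test runner would define it before loading the classes.
define('UNIT_TESTING', true);

function requireThat(bool $condition, string $label): void
{
    if ($condition) {
        return;
    }
    if (defined('UNIT_TESTING')) {
        // Under test: report the failure in a form the test can catch.
        throw new LogicException("check failed: $label");
    }
    // Normal runs: log and stop immediately, much like assert.bail=1.
    error_log("check failed: $label");
    exit(1);
}

// With UNIT_TESTING defined, a failed check becomes a catchable exception.
try {
    requireThat(1 + 1 === 3, 'arithmetic');
    echo "no failure\n";
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";
}
```

The obvious worry is that the code under test now behaves differently from the code in production, which may be exactly why this doesn't feel like the right thing to do.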

I wonder how others have resolved assert versus unit test (actually, assert versus the goal of complete test coverage, or close to it). Hmmm, I'll google that over the weekend, maybe.
