Tag Archives: unit test

When you can’t (and shouldn’t) unit test

I’m a unit test aficionado and, as such, have attempted to unit test what really shouldn’t be. It’s common to get excited by a new hammer and then see nails everywhere, and unit testing can get out of hand (cough! mocks! cough!).

I still believe that the best tests are free from side-effects, deterministic and fast. What’s important to me isn’t whether or not this fits someone’s definition of what a unit test is, but that these attributes spare me from slow and/or flaky tests. There is, however, another class of tests that is the bane of my existence: brittle tests. These are the ones that break when you change the production code despite your app/library still working as intended. Sometimes, insisting on unit tests means they break for no good reason.

Let’s say we’re writing a new build system. Let’s also say that said build system works like CMake does and spits out build files for other build systems such as ninja or make. Our unit test fan comes along and writes a test like this:

assert(make_output == "all: foo\nfoo: foo.c\n\tgcc -o foo foo.c");

I believe this to be a bad test, and the reason is that it checks the implementation instead of the behaviour of the production code. Consider what happens when the implementation is changed without affecting behaviour:

all: foo\nfoo: foo.c\n\tgcc -o $@ $<

The behaviour is the same as before: any time `foo.c` is changed, `foo` will get recompiled. The implementation not only isn’t the same, it’s arguably better now, and yet the assertion in the test would fail. I think we can all agree that the ROI for this test is negative if this is all it takes to break it.

The orthodox unit test approach to situations like these is to mock the service in question, except most people don’t get the memo that you should only mock code you own. We don’t control GNU make, so we shouldn’t be mocking it. It’s impossible to replicate make exactly in a mock/stub/fake, and it’s foolish to even try. We (mostly) don’t care about the string our code outputs; we care that make interprets that string with the correct semantics.

My conclusion is that I shouldn’t even try to write unit tests for code like this. Integration tests exist for a reason.
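What does have a positive ROI is checking the semantics by handing whatever we emit to the real make. Here’s a minimal sketch in D, assuming a hypothetical `generateMakefile` function standing in for the build system under test (and a machine with make and gcc installed):

```d
string generateMakefile() {
    // Hypothetical stand-in for the build system under test
    return "all: foo\nfoo: foo.c\n\tgcc -o foo foo.c\n";
}

unittest {
    import std.file : exists, mkdirRecurse, tempDir, write;
    import std.path : buildPath;
    import std.process : execute;

    const dir = buildPath(tempDir, "make-behaviour-test");
    mkdirRecurse(dir);

    // No asserting on strings: hand the output to the real make
    write(buildPath(dir, "Makefile"), generateMakefile());
    write(buildPath(dir, "foo.c"), "int main() { return 0; }");

    const firstRun = execute(["make", "-C", dir]);
    assert(firstRun.status == 0, firstRun.output);
    assert(buildPath(dir, "foo").exists);

    // The behaviour we care about: nothing to rebuild if foo.c is unchanged
    const upToDate = execute(["make", "-C", dir, "--question"]);
    assert(upToDate.status == 0, upToDate.output);
}
```

Slower than a pure unit test, certainly, but it only breaks when the behaviour does.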


Unit Testing? Do As I Say, Don’t Do As I Do

I’m a firm believer in unit testing. I’ve done more tech talks on the subject than I’d care to count, and I always tell audiences the same thing: prefer unit tests; here’s a picture of the testing pyramid; keep unit tests pure (no side-effects); avoid end-to-end tests (they’re flaky, and people will stop paying attention to red builds once every build is red). I tell them about ports and adapters, a.k.a. hexagonal architecture. But when it comes to using libclang to parse and translate C and C++ headers, I end up punting and writing a lot of integration tests instead. Hmm.

I know why people write tests with side-effects, and why they end up writing integration and end-to-end tests instead of reaching the nice pure unit test happy place I advocate: it’s easier. There’s less thinking involved. A lot less. However, taking the easy path has always come back to bite me. Those kinds of tests take longer to run, and the higher up the test pyramid you go, the flakier they get. TCP ports stay open longer than the test would like, for instance. The network goes down. All sorts of things.

I understand why I wrote integration tests instead of unit tests when interfacing with libclang, too. As for everyone else, it was just easier. I failed to come up with a plan to unit test what I was doing. It didn’t help that I’d never used libclang and had no idea what the API looked like or what it allowed me to do. It also doesn’t help that libclang doesn’t have an option to take a string containing the code to parse; it takes a file name instead, but I can work around that.
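The workaround is mundane: write the string out and give libclang a file name. A sketch, with the actual parsing call elided since it depends on which libclang bindings you use:

```d
import std.file : tempDir, write;
import std.path : buildPath;
import std.uuid : randomUUID;

// libclang wants a file name, so dump the source string to a uniquely
// named file and hand that name to whatever wraps clang_parseTranslationUnit
string writeToTempFile(string code) {
    const fileName = buildPath(tempDir, randomUUID.toString ~ ".h");
    write(fileName, code);
    return fileName;
}

unittest {
    const fileName = writeToTempFile("struct Point { int x, y; };");
    // parse(fileName) with the libclang bindings of your choice goes here
}
```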

Because of this, the dpp codebase currently suffers from a lack of separation of concerns. Code that translates C/C++ to D is intimately tied to libclang and its quirks. If I ever try to use something other than libclang, I won’t be able to. All of the bad things I caution everybody else about? I guaranteed they’d happen in one of my newest projects.

Before the code collapses under its own complexity, I’ve decided to do what I should’ve done all along and am rewriting dpp to use layers that get it away from the libclang mess. I’m still figuring it all out, but the main idea is to have a transformation layer between libclang and my code: it takes libclang’s data types and converts them to a new set of AST types that are my own. From then on, it should be trivial to unit test the translation of those AST types, which represent C or C++ code, into D. Funnily enough, the fact that I wrote so many integration tests will keep me honest: all of those old tests will still have to pass. I’m not sure how I feel about that.
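To make the idea concrete, here’s a sketch of the shape I’m aiming for. Every name in it is hypothetical rather than dpp’s actual code:

```d
// Layer 1, integration-tested: libclang cursors -> our own AST types.
// StructDecl toAst(CXCursor cursor);  // lives at the libclang boundary

struct Field { string type; string name; }
struct StructDecl { string name; Field[] fields; }

// Layer 2, unit-tested: our own AST types -> D source code.
// Pure, deterministic and fast: no libclang, no files, no side-effects.
string translate(in StructDecl decl) {
    import std.algorithm : map;
    import std.array : join;
    import std.format : format;

    const fields = decl.fields
        .map!(f => "    %s %s;".format(f.type, f.name))
        .join("\n");
    return "struct %s {\n%s\n}".format(decl.name, fields);
}

unittest {
    const decl = StructDecl("Point", [Field("int", "x"), Field("int", "y")]);
    assert(translate(decl) == "struct Point {\n    int x;\n    int y;\n}");
}
```

Only the thin layer at the top ever touches libclang; everything below it can be tested with plain values.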

I might do another blog post covering how I ended up porting a codebase with pretty much only integration tests to the unit variety. It might be of interest to anyone maintaining a legacy codebase (i.e. all of us).


unit-threaded: now an executable library

It’s one of those ideas that seem obvious in retrospect, but somehow only occurred to me last week. Let me explain.

I wrote a unit testing library in D called unit-threaded. It uses D’s compile-time reflection capabilities so that no test registration is required: you write your tests, they get found automatically, and everything is good and nice. Except… you have to explicitly list the files you want to reflect on. D’s compiler can’t go reading the filesystem for you while it compiles, so a pre-build step to generate the file list is needed. I wrote a program to do it, but for several reasons it wasn’t ideal.
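The generated file itself boils down to very little. It looks roughly like this, with made-up module names (the exact shape has varied between unit-threaded versions):

```d
// Generated by the pre-build step; not written by hand
import unit_threaded.runner;

int main(string[] args) {
    // One entry per module to reflect on for tests
    return args.runTests!(
        "mylib.parser",
        "mylib.lexer"
    );
}
```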

Now, as someone who actually wants people to use my library (and also wants to make things easier for myself), I had to find a way to make opting in to unit-threaded easy. This is especially important since D has built-in unit tests, so the barrier to entry is low (which is a good thing!). While working on a far crazier idea to make using unit-threaded a no-brainer, I stumbled across my current solution: run the library as an executable binary.

The secret sauce that makes this work is dub, D’s package manager. It can download dependencies, compile them, and even run them with “dub run”; that way, a user doesn’t even have to download unit-threaded themselves. The other dub feature that makes this feasible is “configurations”, which let the same package be built in different ways. Using those, I can have a regular library configuration and an alternative executable one. Since “dub run” can take a configuration as an argument, unit-threaded can now be run as a program with “dub run unit-threaded -c gen_ut_main”. And when it is, it generates the file that’s needed to make it all work.
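In unit-threaded’s own dub.json, the two configurations look roughly like this (abridged; the mainSourceFile path is illustrative):

```json
{
    "name": "unit-threaded",
    "configurations": [
        {
            "name": "library",
            "targetType": "library"
        },
        {
            "name": "gen_ut_main",
            "targetType": "executable",
            "mainSourceFile": "gen/gen_ut_main.d"
        }
    ]
}
```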

So now, all a user needs to do is add a declaration to their project’s dub.json file, and “dub test” works as intended, using unit-threaded underneath, with named unit tests and all of them running in threads by default. Happy days.
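That declaration is along these lines; the version spec and the bin/ut.d path are illustrative, so check unit-threaded’s README for the current incantation:

```json
{
    "dependencies": { "unit-threaded": "*" },
    "configurations": [
        {
            "name": "unittest",
            "targetType": "executable",
            "preBuildCommands": [
                "dub run unit-threaded -c gen_ut_main -- -f bin/ut.d"
            ],
            "mainSourceFile": "bin/ut.d"
        }
    ]
}
```

dub picks the “unittest” configuration automatically when running “dub test”, which is what ties the whole thing together.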
