Tag Archives: coding

When you can’t (and shouldn’t) unit test

I’m a unit test aficionado, and, as such, have attempted to unit test what really shouldn’t be. It’s common to get excited by a new hammer and then seeing nails everywhere, and unit testing can get out of hand (cough! mocks! cough!).

I still believe that the best tests are free from side-effects, deterministic and fast. What’s important to me isn’t whether or not this fits someone’s definition of what a unit test is, but that these attributes enable the absence of slow and/or flaky tests. There is however another class of tests that are the bane of my existence: brittle tests. These are the ones that break when you change the production code despite your app/library still working as intended. Sometimes, insisting on unit tests means they break for no reason.

Let’s say we’re writing a new build system. Let’s also say that said build system works like CMake does and spits out build files for other build systems such as ninja or make. Our unit test fan comes along and writes a test like this:

assert make_output == "all: foo\nfoo: foo.c\n\tgcc -o foo foo.c"

I believe this to be a bad test, and the reason why is that it’s checking the implementation instead of the behaviour of the production code. Consider what happens when the implementation is changed without affecting behaviour:

all: foo\nfoo: foo.c\n\tgcc -o $@ $<

The behaviour is the same as before: any time `foo.c` is changed, `foo` will get recompiled. The implementation not only isn’t the same, it’s arguably better now, and yet the assertion in the test would fail. I think we can all agree that the ROI for this test is negative if this is all it takes to break it.

The orthodox unit test approach to situations like these is to mock the service in question, except most people don’t get the memo that you should only mock code you own. We don’t control GNU make, so we shouldn’t be doing that. It’s impossible to copy make exactly in a mock/stub/etc. and it’s foolish to even try. We (mostly) don’t care about the string that our code outputs, we care that make interprets that string with the correct semantics.

My conclusion is that I shouldn’t even try to write unit tests for code like this. Integration tests exist for a reason.

Tagged , , , , ,

The joys of translating C++’s std::function to D

I wrote a program to translate C headers to D. Translating C was actually more challenging than I thought; I even got to learn things I didn’t know about the language even though I’ve known it for 24 years. The problems that I encountered were all minor though, and to the extent of my knowledge have all been resolved (modulo bugs).

C++ is a much larger language, so the effort should be considerably more. I didn’t expect it to be as hard as it’s been however, and in this blog I want to talk about how “interesting” it was to translate C++11’s std::function by hand.

The first issue for most languages would be that it relies on template specialisation:

template<typename>
class function;  // doesn't have a definition anywhere

template<typename R, typename... Args>
class function<R(Args...)> { /* ... */ }

This is a way of constraining the std::function template to only accept function types. Perhaps surprisingly to some, the C++ syntax for the type of a function that takes two ints and returns a double is double(int, int). I doubt most people see this outside of C++ standard library templates. If it’s still confusing, think of double(int, int) as the type that is obtained by deferencing a pointer of type double(*)(int, int).

D is, as far as I know, the only other language other than C++ to support partial template specialisation. There are however two immediate problems:

  • function is a keyword in D
  • There is no D syntax for a function type

I can mitigate the name issue by calling the symbol function_ instead; however, this will affect name mangling, meaning nothing will actually link. D does have pragma(mangle) to tell the compiler how to mangle symbols, but std::function is a template; it doesn’t have any mangling until it’s instantiated. Let’s worry about that later and call the template function_ for now.

The second issue can be worked around:

// C++: `using funPtr = double(*)(int, int);`
alias funPtr = double function(int, int);
// C++: `using funType = double(int, int);`
alias funType = typeof(*funPtr.init);

As in C++, the function type is the type one gets from deferencing a function pointer. Unlike C++, currently there’s no syntax to write it directly. First attempt:

// helper to automate getting an alias to a function type
template FunctionType(R, Args...) {
    alias ptr = R function(Args);
    alias FunctionType = typeof(*ptr.init);
}

struct function_(T);
struct function_(T: FunctionType!(R, Args), R, Args...) { }

This doesn’t work, probably due to this bug preventing the helper template FunctionType from working as intended. Let’s forget the template constraint:

extern(C++, "std") {
    struct function_(T) {
        import std.traits: ReturnType, Parameters;
        alias R = ReturnType!T;
        alias Args = Parameters!T;
        // In C++: `R operator()(Args) const`;
        R opCall(Args) const;
    }
}

void main() {
    alias funPtr = double function(double);
    alias funType = typeof(*funPtr.init);
    function_!funType f;
    double result = f(3.3);
}

This compiles but it doesn’t link: there’s an undefined reference to std::function_::operator()(double) const. Looking at the symbols in the object files using nm, we see that g++ emitted _ZNKSt8functionIFddEEclEd but dmd is trying to link to _ZNKSt9function_IFddEEclEd. As expected, name mangling issues related to renaming the symbol.

We could manually add a pragma(mangle) to tell D how to mangle the operator for the double(double) template instantiation, but that solution doesn’t scale. CTFE (constexpr if you speak C++ but not D) to the rescue!

// snip - as before
pragma(mangle, opCall.mangleof.fixMangling)
R opCall(Args) const;

// (elsewhere at file scope)
string fixMangling(string str) {
    import std.array: replace;
    return str.replace("9function_", "8function");
}

What’s going on here is an abuse of D’s compile-time power. The .mangleof property is a compile-time string that tells us how a symbol is going to be mangled. We pass this string to the fixMangling function which is evaluated at compile-time and fed back to the compiler telling it what symbol name to actually use. Notice that function_ is still a template, meaning .mangleof has a different value for each instantiation. It’s… almost magical. Hacky, but magical.

The final code compiles and links. Actually creating a valid std::function<double(double)> from D code is left as an exercise to the reader.

 

Tagged , , , , , , ,

Operator overloading is a good thing (TM)

Brains are weird things. I used to be a private maths tutor, and I always found it amazing how a little change in notation could sometimes manage to completely confuse a student. Notation itself seems to me to be a major impediment for the majority of people to like or be good at maths. I had fun sometimes replacing the x in an equation with a drawing of an apple to try and get the point across that the actual name (or shape!) of a variable didn’t matter, that it was just standing in for something else.

Programmers are more often than not mathematically inclined, and yet a similar phenomenon seems to occur with the “shape” of certain functions, i.e. operators. For reasons that make us much sense to me as x confusing maths students, the fact that a function has a name that has non-alphanumeric characters in them make them particularly weird. So weird that programmers shouldn’t be allowed to defined functions with those names, only the language designers. That’s always a problem for me – languages that don’t give you the same power as the designers are Blub as far as I’m concerned. But every now and again I see a blost post touting the advantages of some language or other, listing the lack of operator overloading as a bonus.

I don’t even understand the common arguments against operator overloading. One is that somehow “a + b” is now confusing, because it’s not clear what the code does. How is that different from having to read the documentation/implementation of “a.add(b)”? If it’s C++ and “a + b” shows up, anyone who doesn’t read it as “a.operator+(b)” or “operator+(a, b)” with built-in implementations of operator+ for integers and floating point numbers needs to brush up on their C++. And then there’s the fact that that particular operator is overloaded anyway, even in C – the compiler emits different instructions for floats and integers, and its behaviour even depends on the signedness of ints.

Then there’s the complaint that one could make operator+ do something stupid like subtract. Because, you know, this is totally impossible:

int add(int i, int j) {
    return i - j;}

Some would say that operator overloading is limited in applicability since only numerical objects and matrices really need them. But used with care, it might just make sense:

auto path = "foo" / "bar" / "baz";

Or in the C++ ranges by Eric Niebler:

using namespace ranges;
int sum = accumulate(view::ints(1)
                   | view::transform([](int i){return i*i;})
                   | view::take(10), 0);

I’d say both of those previous examples are not only readable, but more readable due to use of operator overloading. As I’ve learned however, readability is in the eye of the beholder.

All in all, it confuses me when I hear/read that lacking operator overloading makes a language simpler. It’s just allowing functions to have “special” names and special syntax to call them (or in Haskell, not even that). Why would the names of functions make code so hard to read for some people? I guess you’d have to ask my old maths students.

Tagged , , , ,

The main function should be shunned

The main function (in languages that have it) is…. special. It’s the entry point of the program by convention, there can only be one of them in all the object files being linked, and you can’t run a program without it. And it’s inflexible.

Its presence means that the final output has to be an executable. It’s likely however, that the executable in question might have code that others might rather reuse than rewrite, but they won’t be able to use it in their own executables. There’s already a main function in there. Before clang nobody seemed to stumble on the idea that a compiler as a library would be a great idea. And yet…

This is why I’m now advocating for always putting the main function of an executable in its own file, all by itself. And also that it do the least amount of work possible for maximum flexibility. This way, any executable project is one excluded file away in the build system from being used as a library. This is how I’d start a, say, C++ executable project from scratch today:

#include "runtime.hpp"
#include <iostream>
#include <stdexcept>

int main(int argc, const char* argv[]) {
    try {
        run(argc, argv); // "real" main
        return 0;
    } catch(const std::exception& ex) {
        std::cout << "Oops: " << ex.what() << std::endl;
        return 1;
    }
}

In fact, I think I’ll go write an Emacs snippet for that right now.

Tagged ,

API clarity with types

API design is hard. Really hard. It’s one of the reasons I like TDD – it forces you to use the API as a regular client and it usually comes out all the better for it. At a previous job we’d design APIs as C headers, review them without implementation and call it done. Not one of those didn’t have to change as soon as we tried implementing them.

The Win32 API is rife with examples of what not to do: functions with 12 parameters aren’t uncommon. Another API no-no is several parameters of the same type – which means which? This is ok:

auto p = Point(2, 3);

It’s obvious that 2 is the x coordinate and 3 is y. But:

foo("foo", "bar", "baz", "quux", true);

Sure, the actual strings passed don’t help – but what does true mean in this context? Languages like Python get around this by naming arguments at the call site, but that’s not a feature of most curly brace/semicolon languages.

I semi-recently forked and extended the D wrapper for nanomsg. The original C API copies the Berkely sockets API, for reasons I don’t quite understand. That means that a socket must be created, then bound or connect to another socket. In an OOP-ish language we’d like to just have a contructor deal with that for us. Unfortunately, there’s no way to disambiguate if we want to connect to an address or bind to it – in both cases a string is passed. My first attempt was to follow in Java’s footsteps and use static methods for creation (simplified for the blog post):

struct NanoSocket {
    static NanoSocket createBound(string uri) { /* ... */ }
    static NanoSocket createConnected(string uri) { /* ... */ }
    private this() { /* ... */ } // constructor
}

I never did feel comfortable: object creation shouldn’t look *weird*. But I think Haskell has forever changed by brain, so types to the rescue:

struct NanoSocket {
    this(ConnectTo connectTo) { /* ... */ }
    this(BindTo bindTo) { /* ... */ }
}

struct ConnectTo {
    string uri;
}

struct BindTo {
    string uri;
}

I encountered something similar when I implemented a method on NanoSocket called trySend. It takes two durations: a total time to try for, and an interval to wait to try again. Most people would write it like so:

void trySend(ubyte[] data, 
             Duration totalDuration, 
             Duration retryDuration);

At the call site clients might get confused about which order the durations are in. I think this is much better, since there’s no way to get it wrong:

void trySend(ubyte[] data, 
             TotalDuration totalDuration, 
             RetryDuration retryDuration);

struct TotalDuration {
    Duration duration;
}

struct RetryDuration {
    Duration duration;
}

What do you think?

Tagged , , , , , , , ,

The importance of making the test fail

TDD isn’t for everyone. People work in different ways and their brains even more so, and I think I agree with Bertrand Meyer in that whether you write the test first or last, the important thing is that the test gets written. Even for those of us for whom TDD works, it’s not always applicable. It takes experience to know when or not to do it. For me, whenever I’m not sure of exactly I want to do and am doing exploratory work, I reach for a REPL when I can and don’t even think of writing tests. Even then, by the time I’ve figured out what to do I usually write tests straight afterwards. But that’s me.

However, when fixing bugs I think “TDD” (there’s not any design going on, really) should be almost mandatory. Almost, because I thought of a way that works that doesn’t need the test to be written first, but it’s more cumbersome. More on that later.

Tests are code. Code is buggy. Ergo… tests will contain bugs. So can we trust our tests? Yes, and especially so if we’re careful. First of all, tests are usually a lot smaller than the code they test (they should be!). Less code means fewer bugs on average. If that doesn’t give you a sense of security, it shouldn’t. The important thing is making sure that it’s very difficult to introduce simultaneous bugs in the test and production code that cancel each other out. Unless the tests are tightly coupled with the production code, that comes essentially for free.

Writing the test to reproduce a bug is important because we get to see it go from red to green, which is what gives us confidence. I’ve lost count of how many fake greens I’ve had due to tests that weren’t part of the suite, code that wasn’t even compiled, bugs in the test code, and many other reasons. Making it fail is important. Making changes in a different codebase (the production code) and then the test passing means we’ve actually done something. If at any point things don’t work as they should (red -> green) then we’ve made a mistake somewhere. The fix is wrong, the test is buggy, or our understanding of the problem and what causes it might be completely off.

Reproducing the bug accurately also means that we don’t start with the wrong foot. You might think you know what’s causing the bug, but what better way than to write a failing test? Now, it’s true that one can fix the bug first, write the test later and use the VCS to go back in time and do the red/green dance. But to me that sounds like a lot more work.

Whether you write tests before of after the production code, make sure that at least one test fails without the bugfix. Even if by just changing a number in the production code. I get very suspicious when the test suite is green for a while. Nobody writes correct code that often. I know I don’t.

Tagged , , ,