dlang | Átila on Code

2016/07/18

C is not magically fast, part 2

I wrote a blog post before about how C is not magically fast, but the sentiment that C has properties lacking in other languages that make it so is still widespread. It was with no surprise at all then that a colleague mentioned something resembling that recently at lunch break, and I attempted to tell him why it wasn’t (at least always) true.

He asked for an example where C++ would be faster, and I resorted to the old sort classic: C++ sort is faster than C’s qsort because of templates and inlining. He then asked me if I’d ever measured it myself, and since I hadn’t, I did just that after lunch. I included D as well because, well, it’s my favourite language. Taking the minimum time after ten runs each to sort a random array of 10M simple structs on my laptop yielded the results below:

D: 1.147s
C++: 1.723s
C: 1.789s

I expected C++ to be faster than C, I didn’t expect the difference to be so small. I expected D to be the same speed as C++, but for some reason it’s faster. I haven’t investigated the reason why for lack of interest, but maybe because of how strings are handled?

I used the same compiler backend for all 3 so that wouldn’t be an influence: LLVM. I also seeded all of them with the same number and used the same random number generator: the awful srand from C’s standard library. It’s terrible, but it’s the only easy way to do it in standard C and the same function is available from the other two languages. I also only timed the sort, not counting init code.

The code for all 3 implementations:

// sort.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

typedef struct {
    int i;
    char* s;
} Foo;

double get_time() {
    struct timeval t;
    struct timezone tzp;
    gettimeofday(&t, &tzp);
    return t.tv_sec + t.tv_usec*1e-6;
}

int comp(const void* lhs_, const void* rhs_) {
    const Foo *lhs = (const Foo*)lhs_;
    const Foo *rhs = (const Foo*)rhs_;
    if(lhs->i < rhs->i) return -1;
    if(lhs->i > rhs->i) return 1;
    return strcmp(lhs->s, rhs->s);
}

int main(int argc, char* argv[]) {
    if(argc < 2) {
        fprintf(stderr, "Must pass in number of elements\n");
        return 1;
    }

    srand(1337);
    const int size = atoi(argv[1]);
    Foo* foos = malloc(size * sizeof(Foo));
    for(int i = 0; i < size; ++i) {
        foos[i].i = rand() % size;
        foos[i].s = malloc(100);
        sprintf(foos[i].s, "foo%dfoo", foos[i].i);
    }

    const double start = get_time();
    qsort(foos, size, sizeof(Foo), comp);
    const double end = get_time();
    printf("Sort time: %lf\n", end - start);

    free(foos);
    return 0;
}


// sort.cpp
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <chrono>
#include <cstring>

using namespace std;
using namespace chrono;

struct Foo {
    int i;
    string s;

    bool operator<(const Foo& other) const noexcept {
        if(i < other.i) return true;
        if(i > other.i) return false;
        return s < other.s;
    }

};


template<typename CLOCK, typename START>
static double getElapsedSeconds(CLOCK clock, const START start) {
    //cast to ms first to get fractional amount of seconds
    return duration_cast<milliseconds>(clock.now() - start).count() / 1000.0;
}

#include <type_traits>
int main(int argc, char* argv[]) {
    if(argc < 2) {
        cerr << "Must pass in number of elements" << endl;
        return 1;
    }

    srand(1337);
    const int size = stoi(argv[1]);
    vector<Foo> foos(size);
    for(auto& foo: foos) {
        foo.i = rand() % size;
        foo.s = "foo"s + to_string(foo.i) + "foo"s;
    }

    high_resolution_clock clock;
    const auto start = clock.now();
    sort(foos.begin(), foos.end());
    cout << "Sort time: " << getElapsedSeconds(clock, start) << endl;
}


// sort.d
import std.stdio;
import std.exception;
import std.datetime;
import std.algorithm;
import std.conv;
import core.stdc.stdlib;


struct Foo {
    int i;
    string s;

    int opCmp(ref Foo other) const @safe pure nothrow {
        if(i < other.i) return -1;
        if(i > other.i) return 1;
        return s < other.s
            ? -1
            : (s > other.s ? 1 : 0);
    }
}

void main(string[] args) {
    enforce(args.length > 1, "Must pass in number of elements");
    srand(1337);
    immutable size = args[1].to!int;
    auto foos = new Foo[size];
    foreach(ref foo; foos) {
        foo.i = rand % size;
        foo.s = "foo" ~ foo.i.to!string ~ "foo";
    }

    auto sw = StopWatch();
    sw.start;
    sort(foos);
    sw.stop;
    writeln("Elapsed: ", cast(Duration)sw.peek);
}

Tagged c++, D, dlang

2016/05/24

Leave a comment

Programming

Write custom assertions whenever possible

I’ve been very interested in readable tests with great error messages recently. Mostly because they kept failing and I wanted the most information possible in order to quickly identify the cause. This is another reason why I like TDD: you see the test failing first, so if the error message isn’t great you’ll know straight away instead of months later.

The good testing frameworks provide a way of writing your own custom assertions. I’d never really looked into them that much before, but now I realize the error of my ways. Recently I wrote a test that contained this line:

fileName.exists.shouldBeTrue;

Readable, right? The problem is when it fails:

foo.d:42 - Expected: true
foo.d:42 -      Got: false

And now you have to go read the test and figure out what went wrong. It’s a lot better to get the information that a file was supposed to exist instead right away. So I wrote a custom assertion and was then ready to write this:

fileName.shouldExist;

With the corresponding failure message:

foo.d:42 - Expected /tmp/foo.txt to exist but it didn't

Now it’s a lot easier to pinpoint where the problem is. For starters, you would probably want to start checking the contents of the surrounding directory, having saved the time you would have had to spend figuring out what exactly was false.

Tagged dlang, testing

2016/05/02

1 Comment

Programming

main is just another function

Last week I talked about code that isn’t unit-testable, at least not by my definition of what a unit test is. In keeping with that, this blog post will talk about testing code that has side-effects.

Recently I’d come to accept a defeatist attitude where I couldn’t think of any other way to test that passing certain command-line options to a console binary had a certain effect. I mean, the whole point is to test that running the app differently will have different consequences. As a result I ended up only ever doing end-to-end testing. And… that’s simply not where I want to be.

Then it dawned on me: main is just another function. Granted, it has a special status that makes it so you can’t call it directly from a test, but nearly all my main functions lately have looked like this:

int main(string[] args) {
    try {
        doStuff(args);
        return 0;
    } catch(Exception ex) {
        stderr.writeln(ex.msg);
        return 1;
    }
}

It should be easy enough to translate this to the equivalent C++ in your head. With main so conveniently delegating to a function that does real work, I can now easily write integration tests. After all, is there really any difference between:

doStuff(["myapp", "--option", "arg1", "arg2"]);
// assert stuff happened

And (in, say, a shell script):

./myapp --option arg1 arg2
# assert stuff happened

I’d say no. This way I have one end-to-end test for sanity’s sake, and everything else being tested from the same binary by calling the “real” main function directly.

If your main doesn’t look like the one above, and you happen to be writing C or C++, there’s another technique: use the preprocessor to rename main to something else and call it from your integration/component test. And then, as they say, Bob’s your uncle.

Happy testing!

Tagged D, dlang, integration testing, testing practices

2016/02/29

Leave a comment

Programming

unit-threaded: now an executable library

It’s one of those ideas that seem obvious in retrospect, but somehow only ocurred to me last week. Let me explain.

I wrote a unit testing library in D called unit-threaded. It uses D’s compile-time reflection capabilities so that no test registration is required. You write your tests, they get found automatically and everything is good and nice. Except… you have to list the files you want to reflect on, explicitly. D’s compiler can’t go reading the filesystem for you while it compiles, so a pre-build step of generating the file list was needed. I wrote a program to do it, but for several reasons it wasn’t ideal.

Now, as someone who actually wants people to use my library (and also to make it easier for myself), I had to find a way so that it would be easy to opt-in to unit-threaded. This is especially important since D has built-in unit tests, so the barrier for entry is low (which is a good thing!). While working on a far crazier idea to make it a no-brainer to use unit-threaded, I stumbled across my current solution: run the library as an executable binary.

The secret sauce that makes this work is dub, D’s package manager. It can download dependencies to compile and even run them with “dub run”. That way, a user need not even have to download it. The other dub feature that makes this feasible is that it supports “configurations” in which a package is built differently. And using those, I can have a regular library configuration and an alternative executable one. Since dub run can take a configuration as an argument, unit-threaded can now be run as a program with “dub run unit-threaded -c gen_ut_main”. And when it is, it generates the file that’s needed to make it all work.

So now all a user need to is add a declaration to their project’s dub.json file and “dub test” works as intended, using unit-threaded underneath, with named unit tests and all of them running in threads by default. Happy days.

Tagged D, dlang, dub, Programming, testing, unit test

2015/11/02

Leave a comment

Uncategorized

My first D Improvement Proposal

After over a year of thinking about this (I remember bringing this up at DConf 2014), I finally wrote a DIP about introducing what I call “static inheritance” to D.

The principle is similar to C++ concepts: D already has a way of requiring a certain compile-time interface for templates to be instantiated, the most common being isInputRange. What is lacking right now in my opinion is helpful compiler error messages for when a type was intended to be, e.g. an input range but isn’t due to programmer error.

I tried a library solution first, assuming this would be easier to get accepted than a language change. Since there was nearly 0 interest, I had to write a DIP. Here’s hoping it’s more succesful.

Tagged D, dlang

2015/02/20

3 Comments

Uncategorized

Haskell actually does change the way you think

Last year I started trying to learn Haskell. There have been many ups and downs, but my only Haskell project so far is on hold while I work on other things. I’m not sure yet if I’d choose to use Haskell in production. The problems I had (and the time it’s taken so far) writing a simple server make me think twice, but that’s a story for another blog post.

The thing is, the whole reason I decided to learn Haskell were the many reports that it made me you think differently. As much as I like D, learning it was easy and essentially I’m using it as a better C++. There are things I routinely do in D that I wouldn’t have thought of or bother in C++ because they’re easier. But it’s not really changed my brain.

I didn’t think Haskell had either, until I started thinking of solutions to problems I was having in D in Haskell ways. I’m currently working on a build system, and since the configuration language is D, it has to be compiled. So I have interesting problems to solve with regards to what runs when: compile-time or run-time. Next thing I know I’m thinking of lazy evaluation, thunks, and the IO monad. Some things aren’t possible to be evaluated at compile-time in D. So I replaced a value with a function that when run (i.e. at run-time) would produce that value. And (modulo current CTFE limitations)… it works! I’m even thinking of making a wrapper type that composes nicely… (sound familiar?)

So, thanks Haskell. You made my head hurt more than anything I’ve tried learning since Physics, but apparently you’ve made me a better programmer.

Tagged build systems, c++, dlang, haskell, metaprogramming, monad, Programming

2015/02/11

Leave a comment

Uncategorized

The craziest code I ever wrote

A few years ago at work my buddy Jeff was as usual trying to do something in Go. I can’t remember why, but he wanted to arrange text strings in memory so that they were all contiguous. I said something about C++ and he remarked that the only thing C++11 could do that Go couldn’t would be perhaps to do this work at compile-time. I hadn’t learned D yet (which would have made the task trivial), so I spent the rest of the day writing the monstrosity below for “teh lulz”. It ended up causing my first ever question on Stackoverflow. “Enjoy” the code:

//Arrange strings contiguously in memory at compile-time from string literals.
//All free functions prefixed with "my" to faciliate grepping the symbol tree
//(none of them should show up).

#include <iostream>

using std::size_t;

//wrapper for const char* to "allocate" space for it at compile-time
template<size_t N>
struct String {
    //C arrays can only be initialised with a comma-delimited list
    //of values in curly braces. Good thing the compiler expands
    //parameter packs into comma-delimited lists. Now we just have
    //to get a parameter pack of char into the constructor.
    template<typename... Args>
    constexpr String(Args... args):_str{ args... } { }
    const char _str[N];
};

//takes variadic number of chars, creates String object from it.
//i.e. myMakeStringFromChars('f', 'o', 'o', '') -> String<4>::_str = "foo"
template<typename... Args>
constexpr auto myMakeStringFromChars(Args... args) -> String<sizeof...(Args)> {
    return String<sizeof...(args)>(args...);
}

//This struct is here just because the iteration is going up instead of
//down. The solution was to mix traditional template metaprogramming
//with constexpr to be able to terminate the recursion since the template
//parameter N is needed in order to return the right-sized String<N>.
//This class exists only to dispatch on the recursion being finished or not.
//The default below continues recursion.
template<bool TERMINATE>
struct RecurseOrStop {
    template<size_t N, size_t I, typename... Args>
    static constexpr String<N> recurseOrStop(const char* str, Args... args);
};

//Specialisation to terminate recursion when all characters have been
//stripped from the string and converted to a variadic template parameter pack.
template<>
struct RecurseOrStop<true> {
    template<size_t N, size_t I, typename... Args>
    static constexpr String<N> recurseOrStop(const char* str, Args... args);
};

//Actual function to recurse over the string and turn it into a variadic
//parameter list of characters.
//Named differently to avoid infinite recursion.
template<size_t N, size_t I = 0, typename... Args>
constexpr String<N> myRecurseOrStop(const char* str, Args... args) {
    //template needed after :: since the compiler needs to distinguish
    //between recurseOrStop being a function template with 2 paramaters
    //or an enum being compared to N (recurseOrStop < N)
    return RecurseOrStop<I == N>::template recurseOrStop<N, I>(str, args...);
}

//implementation of the declaration above
//add a character to the end of the parameter pack and recurse to next character.
template<bool TERMINATE>
template<size_t N, size_t I, typename... Args>
constexpr String<N> RecurseOrStop<TERMINATE>::recurseOrStop(const char* str,
                                                            Args... args) {
    return myRecurseOrStop<N, I + 1>(str, args..., str[I]);
}

//implementation of the declaration above
//terminate recursion and construct string from full list of characters.
template<size_t N, size_t I, typename... Args>
constexpr String<N> RecurseOrStop<true>::recurseOrStop(const char* str,
                                                       Args... args) {
    return myMakeStringFromChars(args...);
}

//takes a compile-time static string literal and returns String<N> from it
//this happens by transforming the string literal into a variadic paramater
//pack of char.
//i.e. myMakeString("foo") -> calls myMakeStringFromChars('f', 'o', 'o', '');
template<size_t N>
constexpr String<N> myMakeString(const char (&str)[N]) {
    return myRecurseOrStop<N>(str);
}

//Simple tuple implementation. The only reason std::tuple isn't being used
//is because its only constexpr constructor is the default constructor.
//We need a constexpr constructor to be able to do compile-time shenanigans,
//and it's easier to roll our own tuple than to edit the standard library code.

//use MyTupleLeaf to construct MyTuple and make sure the order in memory
//is the same as the order of the variadic parameter pack passed to MyTuple.
template<typename T>
struct MyTupleLeaf {
    constexpr MyTupleLeaf(T value):_value(value) { }
    T _value;
};

//Use MyTupleLeaf implementation to define MyTuple.
//Won't work if used with 2 String<> objects of the same size but this
//is just a toy implementation anyway. Multiple inheritance guarantees
//data in the same order in memory as the variadic parameters.
template<typename... Args>
struct MyTuple: public MyTupleLeaf<Args>... {
    constexpr MyTuple(Args... args):MyTupleLeaf<Args>(args)... { }
};

//Helper function akin to std::make_tuple. Needed since functions can deduce
//types from parameter values, but classes can't.
template<typename... Args>
constexpr MyTuple<Args...> myMakeTuple(Args... args) {
    return MyTuple<Args...>(args...);
}

//Takes a variadic list of string literals and returns a tuple of String<> objects.
//These will be contiguous in memory. Trailing '' adds 1 to the size of each string.
//i.e. ("foo", "foobar") -> (const char (&arg1)[4], const char (&arg2)[7]) params ->
//                       ->  MyTuple<String<4>, String<7>> return value
template<size_t... Sizes>
constexpr auto myMakeStrings(const char (&...args)[Sizes]) -> MyTuple<String<Sizes>...> {
    //expands into myMakeTuple(myMakeString(arg1), myMakeString(arg2), ...)
    return myMakeTuple(myMakeString(args)...);
}

//Prints tuple of strings
template<typename T> //just to avoid typing the tuple type of the strings param
void printStrings(const T& strings) {
    //No std::get or any other helpers for MyTuple, so intead just cast it to
    //const char* to explore its layout in memory. We could add iterators to
    //myTuple and do "for(auto data: strings)" for ease of use, but the whole
    //point of this exercise is the memory layout and nothing makes that clearer
    //than the ugly cast below.
    const char* const chars = reinterpret_cast<const char*>(&strings);
    std::cout << "Printing strings of total size " << sizeof(strings);
    std::cout << " bytes:\n";
    std::cout << "-------------------------------\n";

    for(size_t i = 0; i < sizeof(strings); ++i) {
        chars[i] == '' ? std::cout << "\n" : std::cout << chars[i];
    }

    std::cout << "-------------------------------\n";
    std::cout << "\n\n";
}

int main() {
    {
        constexpr auto strings = myMakeStrings("foo", "foobar",
                                               "strings at compile time");
        printStrings(strings);
    }

    {
        constexpr auto strings = myMakeStrings("Some more strings",
                                               "just to show Jeff to not try",
                                               "to challenge C++11 again :P",
                                               "with more",
                                               "to show this is variadic");
        printStrings(strings);
    }

    std::cout << "Running 'objdump -t |grep my' should show that none of the\n";
    std::cout << "functions defined in this file (except printStrings()) are in\n";
    std::cout << "the executable. All computations are done by the compiler at\n";
    std::cout << "compile-time. printStrings() executes at run-time.\n";
}

Tagged c++, c++11, constexpr, D, dlang, Go, golang, metaprogramming

2014/10/16

4 Comments

Programming, Uncategorized

Computer languages: ordering my favourites

This isn’t even remotely supposed to be based on facts, evidence, benchmarks or anything like that. You could even disagree with what are “scripting languages” or not. All of the below just reflect my personal preferences. In any case, here’s my list of favourite computer languages, divided into two categories: scripting and, err… I guess “not scripting”.

My favourite scripting languages, in order:

Python
Ruby
Emacs Lisp
Lua
Powershell
Perl
bash/zsh
m4
Microsoft batch files
Tcl

I haven’t written enough Ruby yet to really know. I suspect I’d like it more than Python but at the moment I just don’t have enough experience with it to know its warts. Even I’m surprised there’s something below Perl here but Tcl really is that bad. If you’re wondering where PHP is, well I don’t know because I’ve never written any but from what I’ve seen and heard I’d expect it to be (in my opinion of course) better than Tcl and worse than Perl. I’m surprised how high Perl is given my extreme dislike for it. When I started thinking about it I realised there’s far far worse.

My favourite non-scripting languages, in order:

D
C++
Haskell
Common Lisp
Rust
Java
Go
Objective C
C
Pascal
Fortran
Basic / Visual Basic

I’ve never used Scheme, if that explains where Common Lisp is. I’m still learning Haskell so not too sure there. As for Rust, I’ve never written a line of code in it and yet I think I can confidently place it in the list, especially with respect to Go. It might place higher than C++ but I don’t know yet.

Tagged c++, D, dlang, Fortran, golang, haskell, Java, Lisp, Pascal, Perl, Python, Ruby, rust, rustlang

2014/05/30

Leave a comment

Uncategorized

To learn BDD with Cucumber, you must first learn BDD with Cucumber.

So I read about Cucumber a while back and was intrigued, but never had time to properly play with it. While writing my MQTT broker, however, I kept getting annoyed at breaking functionality that wasn’t caught by unit tests. The reason being that the internals were fine, the problems I was creating had to do with the actual business of sending packets. But I was busy so I just dealt with it.

A few weeks ago I read a book about BDD with Cucumber and RSpec but for me it was a bit confusing. The reason being that since the step definitions, unit tests and implementation were all written in Ruby, it was hard for me to distinguish which part was what in the whole BDD/TDD concentric cycles. Even then, I went back to that MQTT project and wrote two Cucumber features (it needs a lot more but since it works I stopped there). These were easy enough to get going: essentially the step definitions run the broker in another process, connect to it over TCP and send packets to it, evaluating if the response was the expected one or not. Pretty cool stuff, and it works! It’s what I should have been doing all along.

So then I started thinking about learning BDD (after all, I wrote the features for MQTT afterwards) by using it on a D project. So I investigated how I could call D code from my step definitions. After spending the better part of an afternoon playing with Thrift and binding Ruby to D, I decided that the best way to go about this was to implement the Cucumber wire protocol. That way a server would listen to JSON requests from Cucumber, call D functions and everything would work. Brilliant.

I was in for a surprise though, me who’s used to implementing protocols after reading an RFC or two. Instead of a usual protocol definition all I had to go on was… Cucumber features! How meta. So I’d use Cucumber to know how to implement my Cucumber server. A word to anyone wanting to do this in another language: there’s hardly any documentation on how to implement the wire protocol. Whenever I got lost and/or confused I just looked at the C++ implementation for guidance. It was there that I found a git submodule with all of Cucumber’s features. Basically, you need to implement all of the “core” features first (therefore ensuring that step definitions actually work), and only then do you get to implement the protocol server itself.

So I wanted to be able to write Cucumber step definitions in D so I could learn and apply BDD to my next project. As it turned out, I learned BDD implementing the wire protocol itself. It took a while to get the hang of transitioning from writing a step definition to unit testing but I think I’m there now. There might be a lot more Cucumber in my future. I might also implement the entirety of Cucumber’s features in D as well, I’m not sure yet.

My implementation is here.

Tagged bdd, cucumber, D, dlang, MQTT, Programming, rspec, Ruby, TCP, tdd, testing

2014/03/31

Leave a comment

Uncategorized

Increasing performance with static polymorphism (and other neat tricks)

So I wrote a serialisation library in D. I initially wrote it to try and understand and see the benefits of the compile-time reflection available in the language. I based it off of the library I wrote in C++11, and as such my brain already had the design in its head. This design was in turn inspired by similar code a colleague had written at work. So when I wrote the code, I used dynamic polymorphism, which in D means classes.

That usage of the new operator bugged me though. In real code (such as here) the objects doing the serialisation were always actually short-lived. Enter function, do the job, exit. It seemed like a waste to allocate memory on the garbage-collected heap and I started thinking about transforming them into structs, which can live on the stack. Then it dawned on me: dynamic polymorphism wasn’t ever actually needed. It’s never the case that the code doesn’t know whether it wants to marshall or unmarshall a value. That decision is always made at compile-time so the the cost of the virtual functions and garbage collection was being paid for nothing. The other place that was using GC allocation was the serialiser itself, which was appending to a dynamic array. It was the simplest thing that would work, so that’s what I started out with. Of course, the same realisation could’ve happened whilst maintaing the C++ version and there also it would’ve been possible to convert, but it’s so much pain do it in C++ and so easy in D that it just happens naturally.

I converted the codebase to use structs and template functions instead, breaking backwards compatibility with the old (V0.4.x) version of Cerealed. In the process, I ended up discovering the Appender struct in std.array. This happened as a result of trying to do policy-based design so I could separate the algorithm (in this case, how to marshall) from the process of actually writing to an OutputRange (see std.range). I had also recently read about warp and the new ScopeBuffer in Phobos (the D standard library) and wanted to see how these new additions would affect performance. I wrote a small, not particularly well-written test program, which can be seen in this gist. I used both gdc and dmd. I left ldc out because its frontend (at least the package currently available on Arch Linux) is older and can’t compile the code, and I didn’t feel like making alterations just to see how well it would do.

The results are presented in the tables below. I left out standard deviations because they were too small for nearly every measurement, so the values are just averages of a few different runs. “Classes” is the original V0.4.1 Cerealed OOP code, “Structs” is the code with structs instead of classes but still using a dynamic array, “Appender” and “ScopeBuffer” use the Phobos structs mentioned above. Since ScopeBuffer isn’t part of my distribution of Phobos, I copied it to the project instead. That way it can be compiled by other people who, like me, don’t compile their own versions of the compiler and standard library. I did 25M loops for serialisation and 75M loops for deserialisation. I compiled using (g)dmd with options -O -release -inline -noboundscheck.

Cerealiser (seconds, lower is better)	dmd	gdc
Classes	22.8	16.1
Structs (dynamic array)	19.9	14.0
Appender	16.1	9.5
ScopeBuffer	4.6	4.5

Decerealiser (seconds, lower is better)	dmd	gdc
Classes	21.7	17.3
Structs	8.9	9.9

For unmarshalling, I only compared classes vs. structs. The reason is that unmarshalling didn’t allocate memory (it uses whatever slice is passed to it so allocation is the responsibility of the client code), so there wasn’t much I could do to improve performance. Even then, that slight alteration causes a dramatic reduction in the time spent deserialising, with dmd making it more than twice as fast. Inlining is great!

For marshalling, the results between the slowest and fastest version are nearly a factor of 4-5! ScopeBuffer made such a difference, despite me using, on purpose, a static array that was too small to hold the struct so it had to allocate. I tried with a larger array and there was no difference in performance. Most of the results shows gdc generating more efficient code for most cases. The real lesson here is that using the right algorithm for the job (in this case, ScopeBuffer), makes a much larger difference than everything else.

I’m really happy with the results. None of the new V0.5.0 and newer Cerealed code needs to use the garbage collector anymore and I even added a convenience function called cerealise to use ScopeBuffer. It works by passing in a lambda to act on the resulting byte array and is templated on the size of the static array, which has a default of 32 bytes.

I could go back and do something similar for the C++ version but… that would be a lot more work (moving everything into the headers alone would take quite some time), I don’t really have any projects that require super fast serialisation and these days I just want to hack on D anyway. I still want to finish the networking part of my game (in progress and won’t compile on Windows, but there are binaries for the old version on sourceforge), which means some more C++, but after that I’ll avoid it when I can. Unless the alternative is C, of course. I really dislike C.

All in all, it’s unlikely the bottleneck of any app using Cerealed will be the serialisation, but if it is… it just got a whole lot faster.

Tagged dlang, dynamic array, serialisation, structs

	Tabitha Levine on The main function should be…
	Project Highlight: D… on The joys of translating C++…
	ricardo on Seriously, just use D to call…
	Daniel Stracaboško on Type inference debate: a C++ c…
	Martyn S on Boden, Flutter, React Nat…

Átila on Code

Code, testing and systems programming

Tag Archives: dlang

C is not magically fast, part 2

Write custom assertions whenever possible

main is just another function

unit-threaded: now an executable library

My first D Improvement Proposal

Haskell actually does change the way you think

The craziest code I ever wrote

Computer languages: ordering my favourites

To learn BDD with Cucumber, you must first learn BDD with Cucumber.

Increasing performance with static polymorphism (and other neat tricks)