I guess I am a GNU Make zealot now


We’ve been working on a new project at my day job for several months now and we’ve spent part of that time writing a new build system for our software. Our old system was actually very old, based on imake (with sugar on top) and too complicated for our tastes. It also had several important technical limitations, the biggest one being that parallel builds worked poorly or not at all, and that was hard to fix.

We evaluated different possibilities like SCons or CMake but finally went with good ol' GNU Make, and I’ve come to appreciate it more and more as I dig into it. I believe GNU Make sits at an intermediate level between plain POSIX Make and a more complex system like autotools or the previously mentioned SCons or CMake. It has several features that make it a bit more complex, powerful and useful than plain POSIX Make, and I think these features could be leveraged to create a higher level build system if needed.

Specifically, I perceive there are two major details in GNU Make that are game changers among many other minor details. The first one is multiple passes. In GNU Make, a Makefile can contain “include” directives that tell make to read the contents of another Makefile. But here comes the shift: if the included file does not exist or is out of date, but the available Makefile rules explain how to create it, it will be created and the whole process will start again. It may not look important but, together with the “wildcard” function and the -M family of options in C and C++ compilers, it’s very powerful. Let me give you an example. Imagine a simple C or C++ program in which every source file sits in a single directory together with the Makefile. In GNU Make it’s perfectly possible to write a short, generic Makefile that builds the program and never needs to change. That Makefile can obtain the list of source files dynamically with the “wildcard” function and create a dependency file for each one of them using the compiler’s -M family of options (these dependency files are essentially mini-Makefiles), so dependencies and the right build order are calculated properly. GNU Make will then compile the objects and link them into an executable file for you. You can add or remove source files and reorganize code as needed, and the Makefile would not change. Such a Makefile would look very similar to the one I rewrote for my darts miniproject.
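To make the idea concrete, here is a minimal sketch of such a generic Makefile. It is not the darts Makefile itself. It assumes a C program whose sources all sit next to the Makefile and a GCC-compatible compiler; the executable name “prog” and the default flags are placeholders.

```make
# Hypothetical executable name; every *.c file in this directory is a source.
# Recipe lines must start with a tab character.
PROG := prog
SRCS := $(wildcard *.c)
OBJS := $(SRCS:.c=.o)
DEPS := $(SRCS:.c=.d)

CFLAGS ?= -O2 -Wall

# First explicit target, so it is the default goal.
$(PROG): $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)

# Each .d file is a mini-Makefile listing the headers a source file includes.
# -MM skips system headers, -MF names the output file, and -MT makes both the
# object and the .d file itself targets of the generated rule, so the .d file
# is refreshed whenever a header changes.
%.d: %.c
	$(CC) $(CPPFLAGS) -MM -MT '$*.o $@' -MF $@ $<

%.o: %.c
	$(CC) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<

# If a .d file is missing or stale, make builds it with the rule above and
# then starts over, re-reading all the makefiles. Skipped for "make clean".
ifneq ($(MAKECMDGOALS),clean)
include $(DEPS)
endif

.PHONY: clean
clean:
	$(RM) $(PROG) $(OBJS) $(DEPS)
```

Running “make -j8” in that directory should generate the dependency files first, restart, and then compile and link in parallel. Dropping a new .c file into the directory requires no Makefile change.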

The second important detail is the “eval” function. GNU Make allows creating macros that receive arguments and, in their crudest form, generate text. The “eval” function allows this text to be evaluated as part of the Makefile, which essentially means rules can be generated on the fly depending on runtime conditions. This can be very powerful and is yet another trick in a bag that already contained the multipass inclusion explained above, a number of functions to manipulate text, word lists and strings, conditional evaluation of parts of the Makefile, and pattern rules.
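As an illustration, the following sketch uses a macro plus “eval” to generate one link rule per program. The program names and source lists are made up for the example, and the pattern is close to the one shown in the GNU Make manual rather than anything taken from our build system.

```make
# Hypothetical programs and source lists.
PROGRAMS := server client
server_SRCS := server.c net.c
client_SRCS := client.c net.c

# $(1) is the program name. A doubled $$ survives the expansion done by
# "call" and is only expanded when "eval" parses the generated text.
define PROGRAM_template
$(1)_OBJS := $$($(1)_SRCS:.c=.o)
$(1): $$($(1)_OBJS)
	$$(CC) $$(LDFLAGS) -o $$@ $$^ $$(LDLIBS)
ALL_OBJS += $$($(1)_OBJS)
endef

.PHONY: all clean
all: $(PROGRAMS)

# Generate the variables and the link rule for every program in the list.
$(foreach prog,$(PROGRAMS),$(eval $(call PROGRAM_template,$(prog))))

clean:
	$(RM) $(PROGRAMS) $(sort $(ALL_OBJS))
```

Adding a third program would only take a new entry in PROGRAMS and a matching “_SRCS” variable; the object files themselves are built by make’s built-in rule for C sources.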

Our old build system could build the whole set of libraries and executable files in about 15 minutes. After we changed the build system to something much simpler using GNU Make, with parallel build support always in mind, our build time was down to 4 minutes on the same 8-core machine.

But we didn’t stop there. The first version of the new build system used recursive Makefiles. I sincerely believe that, from the human point of view, recursive Makefiles are easier to reason about than non-recursive Makefiles when you have a complex directory hierarchy in a non-toy, real-world project, and that’s why so many people and projects use them, despite most GNU Make users having at least heard of the Recursive Make Considered Harmful paper (from 1998!). Obviously, when we were naive new GNU Make users and started creating the new build system, we did it recursively. And while GNU Make gave us a lot of flexibility and power, and we were very satisfied with the move and the results, we could see how the recursive approach was giving us a couple of minor headaches with parallel builds and needlessly increasing build times by a small fraction.

At that moment, and for the first time, I read the paper instead of just skimming through it, and I could only feel like I was rediscovering an old treasure. Every problem we were having with our recursive Makefiles in our real-world complex project was reflected there, and the paper offered at least hints at possible solutions and explained the basics of how to create a non-recursive build system using GNU Make.
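For readers who have not seen the non-recursive approach, here is a very rough sketch of its basic shape: a single top-level Makefile includes a small fragment from each directory instead of invoking make there. The directory names, the “module.mk” fragment name and the final executable are assumptions made for the example, and this is far simpler than a real system.

```make
# Top-level Makefile. Hypothetical layout: directories "libfoo" and "app",
# each holding a module.mk fragment that only appends to SRCS, for example:
#     SRCS += libfoo/foo.c libfoo/bar.c
MODULES := libfoo app

SRCS :=
include $(patsubst %,%/module.mk,$(MODULES))

OBJS := $(SRCS:.c=.o)

# Let sources in one module find headers in the others.
CPPFLAGS += $(patsubst %,-I%,$(MODULES))

# A single make process sees the whole dependency graph, so -j can schedule
# work across directories instead of one directory at a time.
app/app: $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)
```

The trade-off is that every path is written relative to the top-level directory, which is part of why this style can be harder to reason about than one Makefile per directory.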

Our non-recursive solution would have to be a bit more complex but it could work, so we started a second journey writing another new build system based on GNU Make, this time non-recursive. The new new build system (not a typo) cut our build time down to less than 3 minutes on the same machine (the actual time is around 2:40 but it varies a bit up and down as we modify the codebase) and, while more complex in logic, it’s actually shorter in the total number of build system lines.

In fact, I believe the total build time could be brought down to around 2 minutes if “gnatmake” worked better. “gnatmake” is part of the GCC suite and it’s the only reasonable way to compile Ada code on Linux. Our code is a mix of C++ and Ada. I could write a whole post criticizing “gnatmake” but that would be a bit off-topic here and would make this post even longer. Let’s just say the compiler normally needs to collaborate with the build system, like “gcc” and “g++” do by being able to analyze dependencies, or be integrated with it, and “gnatmake” is not very good at that. With both being GNU projects, it’s a bit surprising to see they don’t work very well together. “gnatmake” cannot analyze source code dependencies for Ada without actually compiling code, which is a big drawback, and it can’t effectively communicate with GNU Make about the desired level of build parallelism (i.e. the -j option). The first of those points is crucial. Calculating and generating the dependency files for C++ takes around 10 seconds and, after that, “top” reveals the build system fully using all 8 cores until the C++ part is done, which takes another 50 seconds. Overlapping with the end of that, the Ada part kicks in but “gnatmake”, because it analyzes dependencies on the fly while it compiles code, rarely uses more than 3 or 4 processor cores.
