GCC surprisingly slow with #pragma once


Just yesterday I made some tweaks to the header file in my small bcrypt project to use “#pragma once” instead of the classic include guards. I was about to post a short note recommending it because, even though it is not standard, most compilers accept it according to Wikipedia, it is simpler and much easier to type, and it does not pollute the global preprocessor symbol namespace. However, a small benchmark showed surprising results.

Another advantage of “#pragma once” mentioned in Wikipedia and some other sources is its supposed performance. With the pragma, the compiler should not need to reopen the file and scan its contents to process the include guards. It could remember that the file has already been included, perhaps by noting its inode number or something similar, and skip it. I tried to measure this improvement.
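To make the inode idea concrete, here is a minimal sketch of how a tool could recognize a file it has already seen. It assumes a POSIX system, and every name in it (mark_included, struct file_id, MAX_SEEN) is made up for illustration; it is not how GCC or Clang actually implement the pragma.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical names throughout; not taken from GCC or Clang. */
#define MAX_SEEN 1024

struct file_id {
    dev_t dev;  /* device that holds the file */
    ino_t ino;  /* inode number on that device */
};

static struct file_id seen[MAX_SEEN];
static size_t seen_count;

/*
 * Returns 1 if the file at `path` was already recorded, 0 if this call
 * recorded it for the first time, and -1 if stat() failed.
 */
int mark_included(const char *path)
{
    struct stat st;
    size_t i;

    if (stat(path, &st) != 0)
        return -1;

    /* Linear search keeps the sketch short. */
    for (i = 0; i < seen_count; i++)
        if (seen[i].dev == st.st_dev && seen[i].ino == st.st_ino)
            return 1;

    assert(seen_count < MAX_SEEN);  /* fixed-size table, sketch only */
    seen[seen_count].dev = st.st_dev;
    seen[seen_count].ino = st.st_ino;
    seen_count++;
    return 0;
}
```

The device and inode pair identifies the file no matter how the path is spelled, so "common.h" and "./common.h" would match. Note that this version still pays for one stat() call per include; it only saves opening and scanning the file.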

Benchmark conditions

The test consisted of creating a common header that would be included 10,000 times. This common header is called common.h and has two versions, one with guards and one with the pragma.

Version with include guards:

#ifndef COMMON_HEADER
#define COMMON_HEADER
#define VALUE 7
#endif

Version with the pragma:

#pragma once
#define VALUE 7

Then, there are 10,000 other headers that include this “common.h” file, named header0000.h to header9999.h, all like this:

#include "common.h"

There’s an "all.h" header that includes all these numbered headers:

#include "header0000.h"
...
#include "header9999.h"

Finally, there’s a main.c file like this:

#include "all.h"
#include "common.h"

int main(void)
{
        return VALUE;
}
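For anyone who wants to reproduce the layout, the numbered headers and all.h can be generated with a small program. The following is only a sketch of one way to do it, not the script I used; the names generate_headers and open_in_dir are invented for this example, and common.h and main.c still have to be written by hand.

```c
#include <assert.h>
#include <stdio.h>

/* Opens dir/name for writing; returns NULL if the path is too long. */
static FILE *open_in_dir(const char *dir, const char *name)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof path)
        return NULL;
    return fopen(path, "w");
}

/*
 * Writes header0000.h .. header<count-1>.h, each including common.h,
 * plus an all.h that includes every numbered header.
 * Returns 0 on success and -1 on any I/O error.
 */
int generate_headers(const char *dir, int count)
{
    char name[32];
    FILE *all, *h;
    int i;

    assert(count >= 0 && count <= 10000);  /* names use four digits */

    all = open_in_dir(dir, "all.h");
    if (all == NULL)
        return -1;

    for (i = 0; i < count; i++) {
        snprintf(name, sizeof name, "header%04d.h", i);
        h = open_in_dir(dir, name);
        if (h == NULL) {
            fclose(all);
            return -1;
        }
        fputs("#include \"common.h\"\n", h);
        fclose(h);
        fprintf(all, "#include \"%s\"\n", name);
    }
    return fclose(all) == 0 ? 0 : -1;
}
```

Calling generate_headers(".", 10000) from a small main() in the benchmark directory would produce the same file names described above.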

The files are all stored in the same directory in a tmpfs file system, so they don’t touch the disk. The compilers tested were GCC version 4.8.1 and Clang version 3.3.

I ran each of the following commands 20 times in a row, taking the "user time" in each case, discarding the lowest and highest, and calculating the average and standard deviation, with both versions of the header file.

time gcc -o main main.c
... 20 times ...
time clang -o main main.c
... 20 times ...
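The arithmetic behind each row of results can be sketched as follows: drop the lowest and the highest sample, then compute the average and the standard deviation of what remains. This is an illustration rather than the exact tool I used. It assumes the sample standard deviation (dividing by n - 1), the names struct summary, trimmed_summary and newton_sqrt are made up, and the square root uses Newton's method only so that the example builds without linking the math library.

```c
#include <assert.h>
#include <stddef.h>

struct summary {
    double mean;
    double stddev;
};

/* Square root by Newton's method, to avoid depending on -lm. */
static double newton_sqrt(double x)
{
    double r = x > 1.0 ? x : 1.0;  /* start at or above the root */
    int i;

    if (x <= 0.0)
        return 0.0;
    for (i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Discards one lowest and one highest sample, then summarizes the rest. */
struct summary trimmed_summary(const double *t, size_t n)
{
    struct summary s = {0.0, 0.0};
    size_t lo = 0, hi = 0, i, kept;
    double sum = 0.0, sq = 0.0;

    assert(n >= 4);  /* need at least two samples after trimming */
    kept = n - 2;

    /* First minimum and last maximum, so lo != hi even if all are equal. */
    for (i = 1; i < n; i++) {
        if (t[i] < t[lo])
            lo = i;
        if (t[i] >= t[hi])
            hi = i;
    }

    for (i = 0; i < n; i++)
        if (i != lo && i != hi)
            sum += t[i];
    s.mean = sum / (double)kept;

    for (i = 0; i < n; i++)
        if (i != lo && i != hi)
            sq += (t[i] - s.mean) * (t[i] - s.mean);

    /* Sample standard deviation: divide by kept - 1 (an assumption). */
    s.stddev = newton_sqrt(sq / (double)(kept - 1));
    return s;
}
```

With 20 runs per command, this leaves 18 samples per compiler and header version.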

Results

The results show a clear performance hit with GCC when using the pragma: an average of 1.460 seconds of user time against 0.074 with guards, roughly 20 times slower. Clang doesn’t seem to care much about one or the other.

Header version   Compiler   Average (s)   Std. deviation (s)
#pragma once     GCC        1.460         0.029
#pragma once     Clang      0.073         0.006
Include guards   GCC        0.074         0.006
Include guards   Clang      0.071         0.004

Comments? Is the benchmark flawed somehow? Should I file a bug report in the GCC bug tracker?

The performance hit is not avoided when combining both guards and the pragma, and moving the code to a real filesystem makes no difference.
