Include Guards and their Optimizations
This article discusses the purpose and importance of include guards in C/C++ projects. It also explores the optimizations that compilers apply to include guards to improve build times, and how easy it is to unintentionally disable these optimizations!
The Preprocessor
One of the initial phases of C/C++ compilation is preprocessing. This phase involves running the preprocessor against a source file, typically identified by the file extension .c or .cpp. The preprocessor handles all preprocessing directives, which start with the # symbol, such as #include and #define. At this stage, the preprocessor recursively replaces each #include directive with the contents of the file it names. The output is a single file called a translation unit (TU) that is then passed to the C/C++ compiler to be compiled into an object file.
The preprocessor only looks at preprocessing directives and ignores all other code, as it has no understanding of the C/C++ language. Therefore, preprocessing directives can be added to any file, and the preprocessor will still process them without any issue.
For an example of what the preprocessor does, we will look at the two files below:
// zero.h
#define ZERO 0
int zero() { return ZERO; }
// main.cpp
#include "zero.h"
int main() { return zero(); }
If we run "main.cpp" through the preprocessor (which can be done with gcc -E main.cpp) we end up with the following file with all preprocessing directives applied:
int zero() { return 0; }
int main() { return zero(); }
If you include the header file "zero.h", it is not going to add much text to your translation unit since it is small. However, if you add #include <vector>, it will add around 1MB of text even if you never use std::vector in your code.
Normally, the preprocessor does not remember what has been included before and will add the content of a file each time it is included. Although this may be helpful in some cases, it is often unwanted.
For example, suppose we add a new file "one.h" and include it in "main.cpp":
// one.h
#include "zero.h"
int one() { return zero() + 1; }
// main.cpp
#include "zero.h"
#include "one.h"
int main(int argc, const char** argv) {
return argc == 0 ? zero() : one();
}
After preprocessing we would have the zero function defined twice: once from the #include "zero.h" in "main.cpp", and once from the #include "zero.h" inside "one.h":
int zero() { return 0; }
int zero() { return 0; }
int one() { return zero() + 1; }
int main(int argc, const char** argv) {
return argc == 0 ? zero() : one();
}
The One Definition Rule (ODR) prevents us from defining the same function more than once in a translation unit. Therefore, the code above would not compile.
Marking these functions inline would not help here: inline permits identical definitions in different translation units, but still only one definition per translation unit. What we can do is avoid defining functions within header files and separate "zero.h" into "zero.h" and "zero.cpp".
// zero.h
int zero();
// zero.cpp
#include "zero.h"
int zero() { return 0; }
This would give us the following translation unit for "main.cpp":
int zero();
int zero();
int one() { return zero() + 1; }
int main(int argc, const char** argv) {
return argc == 0 ? zero() : one();
}
It is allowed to declare (but not define) the same function multiple times within a translation unit. This means that the translation unit would compile without any problems and function correctly.
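As a minimal sketch of this rule, the single file below repeats the declaration of zero() the way the preprocessed "main.cpp" does and then supplies exactly one definition of each function. Everything is placed in one file purely for illustration; in the layout described above, the definition of zero() would live in "zero.cpp".

```cpp
// Repeated declarations of the same function are allowed in one translation unit.
int zero();  // as if from the first #include "zero.h"
int zero();  // as if from "one.h" including "zero.h" again: a harmless redeclaration

// Exactly one definition of each function is provided.
int zero() { return 0; }
int one() { return zero() + 1; }

// Adding a second definition, for example
//     int zero() { return 0; }
// would be rejected by the compiler as a redefinition.
```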
Include Guards
This method becomes too limiting because the One Definition Rule also prohibits multiple definitions of classes and structs within a translation unit. We could only forward declare types and use them through opaque pointers, which unnecessarily restricts us to a smaller set of C/C++ features and costs performance.
To solve this issue, a common approach is to introduce state into the preprocessor, using preprocessing directives to prevent the contents of a file from being included multiple times:
// zero.h
#ifndef INCLUDED_ZERO_H
#define INCLUDED_ZERO_H
int zero();
#endif
When the preprocessor processes a source file and encounters the first instance of #include "zero.h", the macro INCLUDED_ZERO_H is not yet defined, so the #ifndef condition passes. The preprocessor then defines the macro and adds the rest of the file up to the closing #endif. If "zero.h" is included again, INCLUDED_ZERO_H is already defined, so the preprocessor will skip the contents of the file until it reaches the #endif at the end.
This is called the include guard idiom and is commonly used to prevent multiple inclusions of header files. To avoid macro collisions with other projects, it is recommended to include your project name along with the file name. Alternatively, you can generate a new GUID for your macro, such as INCLUDED_60B80A74_3952_4DAE_BB89_36D93CBDC5C6, which is unlikely to collide with other macros and won't require modification if the header file is renamed.
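The effect of the idiom can be sketched in a single file by pasting a guarded header body twice, which is what the preprocessor would produce for two #include directives. The header name "myproject/point.h", the Point struct, the sum function, and the macro INCLUDED_MYPROJECT_POINT_H are all hypothetical names chosen for this illustration.

```cpp
// Single-file simulation: each guarded block below stands in for the contents
// of a hypothetical header "myproject/point.h".

// ---- first inclusion: the macro is not yet defined, so the body is kept ----
#ifndef INCLUDED_MYPROJECT_POINT_H
#define INCLUDED_MYPROJECT_POINT_H
struct Point {
    int x;
    int y;
};
constexpr int sum(Point p) { return p.x + p.y; }
#endif

// ---- second inclusion: the macro is already defined, so the body is skipped ----
#ifndef INCLUDED_MYPROJECT_POINT_H
#define INCLUDED_MYPROJECT_POINT_H
struct Point {
    int x;
    int y;
};
constexpr int sum(Point p) { return p.x + p.y; }
#endif

// Without the guard, the second copy of Point would be a redefinition error.
static_assert(sum(Point{2, 3}) == 5, "Point and sum are defined exactly once");
```

Note that the guard lets the header contain a full struct definition, not just declarations, which is what the declaration-only approach could not offer.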
External Include Guards
The use of include guards can cause a performance issue because, for every include directive, the preprocessor needs to open the file and scan its entire content to locate the closing #endif. Modern preprocessors skip over text at approximately 100-300MB/s while looking for a matching #endif, which is relatively efficient.
To improve performance, one suggestion is to wrap each include directive itself in an additional #ifndef guard, on top of the standard include guard inside the header. This lets the preprocessor skip previously included headers without opening them, improving the overall compilation time.
// main.cpp
#ifndef INCLUDED_ZERO_H
#include "zero.h"
#endif
#ifndef INCLUDED_ONE_H
#include "one.h"
#endif
#ifndef INCLUDED_STD_IOSTREAM
#define INCLUDED_STD_IOSTREAM
#include <iostream>
#endif
int main(int argc, const char** argv) {
return argc == 0 ? zero() : one();
}
External include guards save preprocessing time by skipping an include directive entirely when its header has already been included. However, this solution is verbose and requires keeping the guarding macro name in sync with every dependency. Additionally, standard library headers do not agree on guard macro names, so they need either a properly guarded wrapper header or a macro defined before the include directive, as shown above for <iostream>.
#pragma once
A widely-supported, but non-standard, alternative to include guards is #pragma once (equivalently _Pragma("once")). If #pragma once appears in a file, the compiler will flag the file and avoid preprocessing it for all subsequent includes in that translation unit. This saves time compared to include guards, which require finding a matching #endif.
Multiple-Inclusion Optimization
Most major compilers now implement the multiple-inclusion optimization, which avoids opening a guarded file after the first time it's encountered, regardless of whether it's guarded with #pragma once or include guards. This brings both techniques to the same level of performance.
However, what is considered a valid include guard differs between compilers. Ignoring these hidden rules may negatively impact compilation time.
The depth and quality of documentation on each compiler's specific rules varies.
To guarantee the multiple-inclusion optimization on Clang, GCC, and MSVC, ensure your headers follow the format:
- Comments and whitespace only
- #ifndef MACRO_NAME
- Your code
- #endif
- Comments and whitespace only
and that MACRO_NAME is defined when the file is included again.
Note that the position of your #define MACRO_NAME within the #ifndef/#endif pair doesn't matter: it can appear anywhere or not at all. However, to avoid compilation errors, it's recommended to define the macro immediately after the #ifndef check. If you have a circular include dependency, where a file #includes itself, make sure the #define appears before any includes to prevent an infinite cycle.
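A header that follows this layout might look like the sketch below. The file name "myproject/answer.h", the macro INCLUDED_MYPROJECT_ANSWER_H, and the answer function are hypothetical; the point is only the shape of the file, with nothing but comments and whitespace outside the #ifndef/#endif pair.

```cpp
// Hypothetical header "myproject/answer.h".
// Only comments and whitespace appear before the guard.

#ifndef INCLUDED_MYPROJECT_ANSWER_H
#define INCLUDED_MYPROJECT_ANSWER_H  // defined straight away, before any #include

// All code, including any other #include directives, lives inside the guard.
constexpr int answer() { return 42; }

#endif  // INCLUDED_MYPROJECT_ANSWER_H

// Only comments and whitespace appear after the guard.
```

By the rules above, moving a declaration or another directive in front of the #ifndef, or behind the #endif, would take the file out of the guaranteed format.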
Alternatively, make sure your header contains #pragma once or _Pragma("once") anywhere in the file. The preprocessor must actually process the #pragma once directive: if the pragma is inside #if FOO/#endif, the file will only be marked for this optimization after it is preprocessed while FOO is defined.
You can see the results of these experiments at multiple-inclusion-optimization-tests.
Real-world Problems
These rules are simple, but it's easy to make mistakes. Even in Boost, many libraries take longer to compile than they should.
Boost Preprocessor
MSVC and GCC permit the null directive to be placed outside of the #ifndef/#endif pair without disrupting the multiple-inclusion optimization. The null directive is a single # symbol with optional comments on the same line, and it has no impact on the preprocessor output. However, Clang does not flag a file for the multiple-inclusion optimization if it detects a null directive outside the guard. This prevents Clang from applying the optimization to any Boost Preprocessor header, as all of these header files use the null directive liberally for alignment:
// example.hpp
# /* Copyright (C) 2023
# * FakeCompany
# * https://www.example.com
# */
#
# /* See https://www.example.com/docs for documentation. */
#
# ifndef INCLUDED_ONE_H
# define INCLUDED_ONE_H
#
# /* code goes here */
#
# endif
Luckily this is fixed by D147928 for future versions of Clang.
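For Clang versions without that fix, one possible workaround is to keep the decorative # lines inside the #ifndef/#endif pair and use ordinary comments outside it. The sketch below assumes, based on the behaviour described above, that Clang only objects to null directives outside the guard. The header name, the INCLUDED_EXAMPLE_ALIGNED_H macro, the EXAMPLE_ANSWER macro, and the example_answer function are hypothetical.

```cpp
// Hypothetical header "example_aligned.hpp".
// Copyright notice and documentation links go in ordinary comments here,
// with no null directives before the guard.
#ifndef INCLUDED_EXAMPLE_ALIGNED_H
#define INCLUDED_EXAMPLE_ALIGNED_H
#
# /* Null directives used for alignment sit inside the guard. */
#
# define EXAMPLE_ANSWER 42
#
constexpr int example_answer() { return EXAMPLE_ANSWER; }
#
#endif
// Nothing but comments after the guard either.
```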
Boost Fusion
The Clang and GCC compilers can optimize include guards that use the syntax #if !defined(MACRO_NAME), but this is not optimized by MSVC. So, if you're using MSVC, it's better to use #ifndef or #pragma once instead of #if !defined. Even though the documentation for MSVC might say that #if !defined HEADER_H_ is equivalent to #ifndef HEADER_H_, the two forms are not treated the same way by the multiple-inclusion optimization.
To improve compilation speed with MSVC, a pull request has been made to change the include guards in Fusion from using #if !defined to #ifndef.
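The two spellings can be compared side by side in the sketch below, which uses hypothetical macro and function names. Both guards behave identically as far as the preprocessed output is concerned; the difference reported above is only in whether MSVC recognises the file for the multiple-inclusion optimization.

```cpp
// Guard spelled with #ifndef: the form recognised by Clang, GCC, and MSVC.
#ifndef INCLUDED_MYPROJECT_FIRST_H
#define INCLUDED_MYPROJECT_FIRST_H
constexpr int first() { return 1; }
#endif

// Guard spelled with #if !defined: same meaning for the preprocessor output.
#if !defined(INCLUDED_MYPROJECT_SECOND_H)
#define INCLUDED_MYPROJECT_SECOND_H
constexpr int second() { return 2; }
#endif

// Testing either macro again skips the block, whichever spelling is used.
#ifndef INCLUDED_MYPROJECT_SECOND_H
#error "not reached: INCLUDED_MYPROJECT_SECOND_H is already defined"
#endif
#if !defined(INCLUDED_MYPROJECT_FIRST_H)
#error "not reached: INCLUDED_MYPROJECT_FIRST_H is already defined"
#endif
```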
BOOST_PP_IS_ITERATING
Boost Preprocessor has a tool that allows for code generation through self-including a header file, which is explained in the documentation's Self-Iteration section. However, this method requires an include guard that is nested inside a preprocessor conditional (#ifndef BOOST_PP_IS_ITERATING/#else/#endif), so the guard is no longer the outermost conditional in the file. This is demonstrated by the example below, which creates specializations of IsSmallInt<N> for values 1 to 5.
// is_small_int.h
#if !BOOST_PP_IS_ITERATING
#ifndef INCLUDED_IS_SMALL_INT
#define INCLUDED_IS_SMALL_INT
#include <boost/preprocessor/iteration/iterate.hpp>
template<int N>
struct IsSmallInt {
static const bool value = false;
};
#define BOOST_PP_ITERATION_LIMITS (1, 5)
#define BOOST_PP_FILENAME_1 "is_small_int.h"
#include BOOST_PP_ITERATE()
#endif // INCLUDED_IS_SMALL_INT
#else
template<>
struct IsSmallInt<BOOST_PP_ITERATION()> {
static const bool value = true;
};
#endif
BOOST_PP_IS_ITERATING
is used around 170 times in Boost
libraries. Most of these uses are in private header files and are unlikely
to be included more than once. However, some public header files, like mpl/bind.hpp,
also use it.
A pull request
has been made to improve the documentation for Boost Preprocessor, but there
are still many Boost headers that use BOOST_PP_IS_ITERATING
and
don't benefit from the multiple-include optimization.
Avoiding Mistakes
The issues above were found using the IncludeGuardian tool on Boost Graph and looking through the unguarded files
section in the results. To prevent these mistakes in your own codebase, you can download IncludeGuardian for free and keep your C/C++ builds fast!
If you find or fix any include guard issues in your own or other projects, you can let us know on Twitter by tagging them with @includeguardian.