HOME

TheInfoList



OR:

In
computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...
, a precompiled header (PCH) is a ( C or C++)
header file An include directive instructs a text file processor to replace the directive text with the content of a specified file. The act of including may be logical in nature. The processor may simply process the include file content at the location of ...
that is compiled into an intermediate form that is faster to process for the
compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
. Usage of precompiled headers may significantly reduce compilation time, especially when applied to large header files, header files that include many other header files, or header files that are included in many translation units.


Rationale

In the C and C++
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
s, a
header file An include directive instructs a text file processor to replace the directive text with the content of a specified file. The act of including may be logical in nature. The processor may simply process the include file content at the location of ...
is a file whose text may be automatically included in another
source file In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, onl ...
by the
C preprocessor The C preprocessor (CPP) is a text file processor that is used with C, C++ and other programming tools. The preprocessor provides for file inclusion (often header files), macro expansion, conditional compilation, and line control. Although ...
by the use of a
preprocessor directive In computer programming, a directive or pragma (from "pragmatic") is a language construct that specifies how a compiler (or other translator) should process its input. Depending on the programming language, directives may or may not be part of the ...
in the source file. Header files can sometimes contain very large amounts of source code (for instance, the header files windows.h and Cocoa/Cocoa.h on
Microsoft Windows Windows is a Product lining, product line of Proprietary software, proprietary graphical user interface, graphical operating systems developed and marketed by Microsoft. It is grouped into families and subfamilies that cater to particular sec ...
and
OS X macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...
, respectively). This is especially true with the advent of large "header" libraries that make extensive use of
template Template may refer to: Tools * Die (manufacturing), used to cut or shape material * Mold, in a molding process * Stencil, a pattern or overlay used in graphic arts (drawing, painting, etc.) and sewing to replicate letters, shapes or designs C ...
s, like the Eigen math library and
Boost C++ libraries Boost, boosted or boosting may refer to: Science, technology and mathematics * Boost, positive manifold pressure in turbocharged engines * Boost (C++ libraries), a set of free peer-reviewed portable C++ libraries * Boost (material), a material b ...
. They are written almost entirely as header files that the user #includes, rather than being linked at runtime. Thus, each time the user compiles their program, the user is essentially recompiling numerous header libraries as well. (These would be precompiled into shared objects or dynamic link libraries in non "header" libraries.) To reduce compilation times, some compilers allow header files to be compiled into a form that is faster for the compiler to process. This intermediate form is known as a ''precompiled header'', and is commonly held in a file named with the extension .pch or similar, such as .gch under the
GNU Compiler Collection The GNU Compiler Collection (GCC) is a collection of compilers from the GNU Project that support various programming languages, Computer architecture, hardware architectures, and operating systems. The Free Software Foundation (FSF) distributes ...
.


Usage

For example, given a C++ file source.cpp that includes header.hpp: //header.hpp ... //source.cpp #include "header.hpp" ... When compiling source.cpp for the first time with the precompiled header feature turned on, the compiler will generate a precompiled header, header.pch. The next time, if the timestamp of this header did not change, the compiler can skip the compilation phase relating to header.hpp and instead use header.pch directly.


Common implementations


Microsoft Visual C and C++

Microsoft
Visual C++ Microsoft Visual C++ (MSVC) is a compiler for the C, C++, C++/CLI and C++/CX programming languages by Microsoft. MSVC is proprietary software; it was originally a standalone product but later became a part of Visual Studio and made available ...
(version 6.0 and newer) can precompile any code, not just headers. It can do this in two ways: either precompiling all code up to a file whose name matches the /Ycfilename option or (when /Yc is specified without any filename) precompiling all code up to the first occurrence of #pragma hdrstop in the code The precompiled output is saved in a file named after the filename given to the /Yc option, with a .pch extension, or in a file named according to the name supplied by the /Fpfilename option. The /Yu option, subordinate to the /Yc option if used together, causes the compiler to make use of already precompiled code from such a file. pch.h (named stdafx.h before Visual Studio 2017) is a file generated by the
Microsoft Visual Studio Visual Studio is an integrated development environment (IDE) developed by Microsoft. It is used to develop computer programs including websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platforms ...
IDE wizard, that describes both standard system and project specific include files that are used frequently but hardly ever change. The ''afx'' in ''stdafx.h'' stands for ''application framework extensions''. AFX was the original abbreviation for the
Microsoft Foundation Classes Microsoft Foundation Class Library (MFC) is a C++ Object-oriented programming, object-oriented Library (computer science), library for developing desktop applications for Windows. MFC was introduced by Microsoft in 1992 and quickly gained wides ...
(MFC). While the name stdafx.h was used by default in MSVC projects prior to version 2017, any alternative name may be manually specified. Compatible compilers will precompile this file to reduce overall compile times. Visual C++ will not compile anything before the #include "pch.h" in the source file, unless the compile option /Yu'pch.h' is unchecked (by default); it assumes all code in the source up to and including that line is already compiled.


GCC

Precompiled headers are supported in GCC (3.4 and newer). GCC's approach is similar to these of VC and compatible compilers. GCC saves precompiled versions of header files using a ".gch" suffix. When compiling a source file, the compiler checks whether this file is present in the same directory and uses it if possible. GCC can only use the precompiled version if the same compiler switches are set as when the header was compiled and it may use at most one. Further, only preprocessor instructions may be placed before the precompiled header (because it must be directly or indirectly included through another normal header, before any compilable code). GCC automatically identifies most header files by their extension. However, if this fails (e.g. because of non-standard header extensions), the -x switch can be used to ensure that GCC treats the file as a header.


clang

The
clang Clang () is a compiler front end for the programming languages C, C++, Objective-C, Objective-C++, and the software frameworks OpenMP, OpenCL, RenderScript, CUDA, SYCL, and HIP. It acts as a drop-in replacement for the GNU Compiler ...
compiler added support for PCH in Clang 2.5 / LLVM 2.5 of 2009. The compiler both tokenizes the input source code and performs syntactic and semantic analyses of headers, writing out the compiler's internal generated
abstract syntax tree An abstract syntax tree (AST) is a data structure used in computer science to represent the structure of a program or code snippet. It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal ...
(AST) and
symbol table In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier, symbol, constant, procedure and function in a program's source code is associated with information ...
to a precompiled header file. clang's precompiled header scheme, with some improvements such as the ability for one precompiled header to reference another, internally used, precompiled header, also forms the basis for its modules mechanism. It uses the same ''bitcode'' file format that is employed by
LLVM LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a Compiler#Front end, frontend for any programming language and a Compiler#Back end, backend for any instruction set architecture. LLVM i ...
, encapsulated in clang-specific sections within Common Object File Format or Extensible Linking Format files.


C++Builder

In the default project configuration, the
C++Builder C++Builder is a rapid application development (RAD) environment for developing software in the C++ programming language. Originally developed by Borland, it is owned by Embarcadero Technologies, a subsidiary of Idera. C++Builder can compile ...
compiler implicitly generates precompiled headers for all headers included by a source module until the line #pragma hdrstop is found. Precompiled headers are shared for all modules of the project if possible. For example, when working with the
Visual Component Library The Visual Component Library (VCL) is a visual component-based object-oriented framework for developing the user interface of Microsoft Windows applications. It is written in Object Pascal. History The VCL was developed by Borland for use i ...
, it is common to include the vcl.h header first which contains most of the commonly used VCL header files. Thus, the precompiled header can be shared across all project modules, which dramatically reduces the build times. In addition, C++Builder can be instrumented to use a specific header file as precompiled header, similar to the mechanism provided by Visual C++. C++Builder 2009 introduces a "Precompiled Header Wizard" which parses all source modules of the project for included header files, classifies them (i.e. excludes header files if they are part of the project or do not have an
Include guard In the C and C++ programming languages, an #include guard, sometimes called a macro guard, header guard or file guard, is a way to avoid the problem of ''double inclusion'' when dealing with the include directive. The C preprocessor processe ...
) and generates and tests a precompiled header for the specified files automatically.


Pretokenized headers

A pretokenized header (PTH) is a header file stored in a form that has been run through
lexical analysis Lexical tokenization is conversion of a text into (semantically or syntactically) meaningful ''lexical tokens'' belonging to categories defined by a "lexer" program. In case of a natural language, those categories include nouns, verbs, adjectives ...
, but no semantic operations have been done on it. PTH is present in Clang before it supported PCH, and has also been tried in a branch of GCC. Compared to a full PCH mechanism, PTH has the advantages of language (and dialect) independence, as lexical analysis is similar for the C-family languages, and architecture independence, as the same stream of tokens can be used when compiling for different target architectures. It however has the disadvantage of not going any ''further'' than simple lexical analysis, requiring that
syntactic In linguistics, syntax ( ) is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituency ...
and semantic analysis of the token stream be performed with every compilation. In addition, the time to compile scaling linearly with the size, in lexical tokens, of the pretokenized file, which is not necessarily the case for a fully-fledged precompilation mechanism (PCH in clang allows random access). Clang's pretokenization mechanism includes several minor mechanisms for assisting the pre-processor: caching of file existence and datestamp information, and recording inclusion guards so that guarded code can be quickly skipped over.


Modules

Since
C++20 C20 or C-20 may refer to: Science and technology * Carbon-20 (C-20 or 20C), an isotope of carbon * C20, the smallest possible fullerene (a carbon molecule) * C20 (engineering), a mix of concrete that has a compressive strength of 20 newtons per squ ...
, C++ has offered modules as a modern alternative to precompiled headers, however they differ from precompiled headers in that they do not require the preprocessor directive #include, but rather are accessed using the word import. A module must be declared using the word module to indicate that a file is a module. A module, once compiled, is stored as a (precompiled module) file which acts very similar to a (precompiled header) file. Modules provide the benefits of precompiled headers in that they compile much faster than traditional headers which are #included and are processed much faster during the linking phase, but also greatly reduce boilerplate code, allowing code to be implemented in a single file, rather than being separated across an
header file An include directive instructs a text file processor to replace the directive text with the content of a specified file. The act of including may be logical in nature. The processor may simply process the include file content at the location of ...
and source implementation file which was typical prior to the introduction of modules (however, this separation of "interface file" and "implementation file" is still possible with modules, but less common due to the increased boilerplate). Furthermore, modules eliminate the necessity to use guards or , as modules do not directly modify the source code, unlike #includes, which during the preprocessing step must include source code from the specified header. Thus, importing a module is not handled by the
preprocessor In computer science, a preprocessor (or precompiler) is a Computer program, program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which i ...
, but is rather handled during the compilation phase. Modules, unlike headers, do not have to be processed multiple times during compilation. However, similar to headers, any change in a module necessitates the recompilation of not only the module itself but also all its dependencies — and the dependencies of those dependencies, et cetera. This being said, much like what was the case with headers, C++ modules do not permit circular dependencies, and a circular/cyclic dependency will fail to compile. C++ modules most commonly have the extension (primarily common within Clang and GCC toolchains), though some alternative extensions include and (more common in Microsoft/MSVC toolchains). All symbols within a module that the programmer wishes to be accessible outside of the module must be marked export. Modules do not allow for granular imports of specific namespaces, classes, or symbols within a module, unlike
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
or
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
which do allow for the aforementioned. Importing a module imports all symbols marked with export, making it akin to a wildcard import in Java or Rust. Importing links the file and makes all exported symbols accessible to the importing translation unit, and thus if a module is never imported, it will never be linked. Since
C++23 C++23, formally ISO/IEC 14882:2024, is the current open standard for the C++ programming language that follows C++20. The final draft of this version is N4950. In February 2020, at the final meeting for C++20 in Prague, an overall plan for C++ ...
, the
C++ standard library The C standard library, sometimes referred to as libc, is the standard library for the C programming language, as specified in the ISO C standard.ISO/ IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the origina ...
has been exported as a module as well, though as of currently it must be imported in its entirety (using import std;). The C++ standards offer two standard library modules: However, this may change in the future, with proposals to separate the standard library into more modules such as std.core, std.math, and std.io.C++ Standards Committee. (2018). ''P0581R1 - Modules for C++''. Retrieved fro
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0581r1.pdf
/ref>C++ Standards Committee. (2021). ''P2412R0 - Further refinements to the C++ Modules Design''. Retrieved fro
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2412r0.pdf
/ref> The module names std and std.* are reserved by the C++ standard, however most compilers allow a flag to override this. Modules may not export or leak macros, and because of this the order of modules does not matter (however convention is typically to begin with standard library imports, then all project imports, then external dependency imports in alphabetical order). If a module must re-export an imported module, it can do so using export import, meaning that the module is first imported and then exported out of the importing module, transitively importing modules which the imported module imports itself. Also, because using statements will not be included into importing files (unless explicitly marked export), it is much less likely that using a using statement to bring symbols into the global namespace will cause name clashes within module translation units. C++ modules do not enforce any notion of namespaces, but it is not uncommon to see projects manually associate modules to namespaces. Though standard C has not included modules into the standard, dialects of C allow for modules, for example
Clang Clang () is a compiler front end for the programming languages C, C++, Objective-C, Objective-C++, and the software frameworks OpenMP, OpenCL, RenderScript, CUDA, SYCL, and HIP. It acts as a drop-in replacement for the GNU Compiler ...
supports modules for the C language, though the syntax and semantics of Clang C modules differ from C++ modules significantly. Currently, only GCC,
Clang Clang () is a compiler front end for the programming languages C, C++, Objective-C, Objective-C++, and the software frameworks OpenMP, OpenCL, RenderScript, CUDA, SYCL, and HIP. It acts as a drop-in replacement for the GNU Compiler ...
, and
MSVC Microsoft Visual C++ (MSVC) is a compiler for the C, C++, C++/CLI and C++/CX programming languages by Microsoft. MSVC is proprietary software; it was originally a standalone product but later became a part of Visual Studio and made available ...
offer support for C++ modules. It is the same set of compilers that also currently support standard library modules.


Module partitions and hierarchy

Modules may have partitions, which separate the implementation of the module across several files. Module partitions are declared using the syntax A:B, meaning the module A has the partition B. Module partitions cannot individually be imported outside of the module that owns the partition itself, meaning that anyone who desires to access code that is part of a module partition must import the entire module that owns the partition. To link the module partition B back to the owning module A, write import :B; inside the file containing the declaration of module A or any other module partition of A (say A:C). These import statements may themselves be exported by the owning module, even if the partition itself cannot be imported directly – thus, to import code belonging to partition B that is re-exported by A, one simply has to write import A;. Other than partitions, C++ modules do not have a hierarchical system, but typically use a hierarchical naming convention, like Java's packages. In other words, C++ does not have "submodules", meaning the . symbol which may be included in a module name bears no syntactic meaning and is used only to suggest the association of a module. Meanwhile, only alphanumeric characters plus the period can be used in module names, and so the character * cannot be used in a module name (which otherwise would have been used to denote a wildcard import, like in Java). In C++, the name of a module is not tied to the name of its file or the module's location, unlike Java in which the name of a file must match the name of the public class it declares if any, and the package it belongs to must match the path it is located in. For example, the modules A and A.B in theory are disjoint modules and need not necessarily have any relation, however such a naming scheme is often employed to suggest that the module A.B is somehow related or otherwise associated with the module A. The naming scheme of a C++ module is inherently hierarchical, and the C++ standard recommends re-exporting "sub-modules" belonging to the same public API (i.e. module alpha.beta.gamma should be re-exported by alpha.beta, etc.), even though dots in module names do not enforce any hierarchy. The C++ standard recommends lower-case ASCII module names (without hyphens or underscores), even though there is technically no restriction in such names. Also, because modules cannot be re-aliased or renamed (short of re-exporting all symbols in another module), the C++ standards recommends organisations/projects to prefix module names with organisation/project names for both clarity and to prevent naming clashes (i.e. google.abseil instead of abseil). Also, unlike Java, whose packages may typically include a TLD to avoid namespace clashes, C++ modules do not have this convention. Thus, it may be more common to see example.myfunctionality.MyModule than com.example.myfunctionality.MyModule, though both names are valid. Similar to Java, some organisations of code will split modules into exporting one class/struct/namespace each, and name the final word in PascalCase to match the name of the exported class, while modules re-exporting multiple "sub-modules" may be in all-lowercase.


Example

A simple example of using C++ modules is as follows: export module myproject.MyClass; import std; using String = std::string; export namespace myproject import std; import myproject.MyClass; using myproject::MyClass; int main()


Module purview and global module fragment

Everything above the line export module myproject.MyClass; in the file is referred to as what is "outside the module purview", meaning what is outside of the scope of the module. Typically, if headers must be included, all #includes are placed outside the module purview between a line containing only the statement module; and the declaration of export module, like so: module; // Optional, marks the beginning of the global module fragment (mandatory if an include directive is invoked above the export module declaration) #include #include "MyHeader.h" export module myproject.MyModule; // Mandatory, marks the beginning of the module preamble module: private; // Optional, marks the beginning of the private module fragment The file containing main() may declare a module, but typically it does not (as it is unusual to export main() as it is typically only used as an entry point to the program, and thus the file is usually a .cpp file and not a .cppm file). A program is ill-formed if it exports main() (causes
undefined behaviour In computer programming, a program exhibits undefined behavior (UB) when it contains, or is executing code for which its programming language specification does not mandate any specific requirements. This is different from unspecified behavior, ...
), but will not necessarily be rejected by the compiler. All code which does not belong to any module belongs to the so-called "unnamed module" (also known as the global module fragment), and thus cannot be imported by any module.


Private module fragment

A module may declare a "private module fragment" by writing module: private;, in which all declarations or definitions after the line are visible only from within the file, and thus inaccessible from importers of the module. Any module unit that contains a private module fragment must be the only module unit of its module.


Header units

Headers may also be imported using import, even if they are not declared as modules – these are called "header units", and they are designed to allow existing codebases to migrate from headers to modules more gradually. The syntax is similar to including a header, with the difference being that #include is replaced with import and a semicolon is placed at the end of the statement. Header units automatically export all symbols, and differ from proper modules in that they allow the emittance of macros, meaning all who import the header unit will obtain its contained macros. This offers minimal breakage between migration to modules. The semantics of searching for the file depending on whether quotation marks or angle brackets are used apply here as well. For instance, one may write import ; to import the header, or import "MyHeader.h"; to import the file "MyHeader.h" as a header unit. Most build systems, such as CMake, do not support this feature yet.


See also

* Prefix header * Single compilation unit *
Java package A Java package organizes Java classes into namespaces, providing a unique namespace for each type it contains. Classes in the same package can access each other's package-private and protected members. In general, a package can contain the fo ...
* Java Platform Module System


Notes


References


External links


The Care and Feeding of Pre-Compiled Headers




{{C++ programming language Source code C (programming language) headers C++ Modularity