In C and C++ programming language terminology, a translation unit (or more casually a compilation unit) is the ultimate input to a C or C++ compiler from which an object file is generated (ISO/IEC 9899:TC3, Committee Draft of the C99 Standard, Section 5.1.1.1). A translation unit roughly consists of a source file after it has been processed by the C preprocessor, meaning that header files listed in #include directives are literally included, sections of code within #ifndef may be included, and macros have been expanded.
Context
A C program consists of ''units'' called ''source files'' (or ''preprocessing files''), which, in addition to source code, include directives for the C preprocessor. A translation unit is the output of the C preprocessor: a source file after it has been preprocessed.
Preprocessing notably consists of expanding a source file to recursively replace all #include directives with the literal contents of the file named in the directive (usually header files, but possibly other source files); the result of this step is a ''preprocessing translation unit''. Further steps include expansion of macros defined by #define directives, and conditional compilation controlled by #ifdef directives, among others; this translates the preprocessing translation unit into a ''translation unit''. From a translation unit, the compiler generates an object file, which can be further processed and linked (possibly with other object files) to form an ''executable program''.
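As a rough illustration of macro expansion and conditional compilation, the sketch below shows a small source file as the programmer might write it. The macro names BUFFER_SIZE, SQUARE, ENABLE_LOGGING and LOG_LEVEL and the three functions are hypothetical, chosen only for this example. Once preprocessing has run, the translation unit contains no directives, only the expanded code noted in the comments.

```c
/* Hypothetical names throughout; a sketch, not taken from any real project. */
#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))

/* Conditional compilation: exactly one branch survives preprocessing. */
#ifdef ENABLE_LOGGING
#define LOG_LEVEL 1
#else
#define LOG_LEVEL 0
#endif

/* After preprocessing, the body reads: return 16; */
int buffer_capacity(void) { return BUFFER_SIZE; }

/* After preprocessing, the body reads: return ((n) * (n)); */
int square_of(int n) { return SQUARE(n); }

/* After preprocessing, the body reads: return 0; (ENABLE_LOGGING is not defined here) */
int log_level(void) { return LOG_LEVEL; }
```

With GCC or Clang, passing the -E option stops after preprocessing and prints the resulting text, which is one way to inspect what the compiler proper will receive.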
Note that the preprocessor is in principle language agnostic, and is a lexical preprocessor, working at the lexical analysis level; it does not do parsing, and thus is unable to do any processing specific to C syntax. The input to the compiler is the translation unit, and thus the compiler does not see any preprocessor directives, which have all been processed before compiling starts. While a given translation unit is fundamentally based on a file, the actual source code fed into the compiler may appear substantially different from the source file that the programmer views, particularly due to the recursive inclusion of headers.
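One consequence of working on tokens rather than parsed C expressions can be sketched with a function-like macro. The macro and function names below are hypothetical. Because the preprocessor substitutes tokens without regard to operator precedence, the unparenthesized macro produces a different expression from the one a reader might expect.

```c
/* Hypothetical macros illustrating token-level substitution. */
#define DOUBLE_UNSAFE(x) x + x
#define DOUBLE_SAFE(x) ((x) + (x))

/* Expands to: 3 * 2 + 2, which C parses as (3 * 2) + 2. */
int unsafe_result(void) { return 3 * DOUBLE_UNSAFE(2); }

/* Expands to: 3 * ((2) + (2)), so the addition happens first. */
int safe_result(void) { return 3 * DOUBLE_SAFE(2); }
```

The compiler only ever sees the expanded expressions, so it has no way of knowing that the first one came from a macro meant to behave as a single operand.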
Scope
Translation units define a scope, roughly file scope, and functioning similarly to module scope; in C terminology this is referred to as internal linkage, which is one of the two forms of linkage in C. Names (functions and variables) declared outside of a function block may be visible either only within a given translation unit, in which case they are said to have internal linkage and are not visible to the linker, or may be visible to other object files, in which case they are said to have external linkage and are visible to the linker.
C does not have a notion of modules. However, separate object files (and hence also the translation units used to produce object files) function similarly to separate modules, and if a source file does not include other source files, internal linkage (translation unit scope) may be thought of as "file scope, including all header files".
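The distinction between internal and external linkage can be sketched within a single source file. All names below are hypothetical. In C, the static keyword at file scope gives a name internal linkage, while a function declared without it has external linkage by default.

```c
/* Internal linkage: usable only inside this translation unit. */
static int call_count = 0;

/* Internal linkage: a helper that other object files cannot refer to. */
static int next_id(void) { return ++call_count; }

/* External linkage: other translation units may declare and call this. */
int allocate_id(void) { return next_id(); }
```

Another source file could declare `int allocate_id(void);` and call it after linking, but an attempt to refer to next_id or call_count from that file would not resolve to the definitions above.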
Code organization
The bulk of a project's code is typically held in files with a .c suffix (or .cpp, .cxx or .cc for C++, of which .cpp is the most conventional). Files intended to be included typically have a .h suffix (.hpp or .hh are also used for C++, but .h is the most common even for C++), and generally do not contain function or variable definitions, to avoid conflicting duplicate definitions when headers are included in multiple source files, as is often the case. Header files can be, and often are, included in other header files. It is standard practice for all .c files in a project to include at least one .h file.
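A minimal sketch of this layout follows, with a hypothetical header counter.h and source file counter.c shown together in one listing, as they would appear once the preprocessor has pasted the header in. The header is shown twice to imitate being reached through two different headers; the #ifndef guard, a common convention, causes the second copy to be skipped, so the structure type is defined only once.

```c
/* ---- counter.h (hypothetical), as pasted in by #include "counter.h" ---- */
#ifndef COUNTER_H
#define COUNTER_H
struct counter { int value; };          /* type definition */
int counter_next(struct counter *c);    /* declaration only, no function body */
#endif /* COUNTER_H */

/* ---- the same header reached a second time through another header ---- */
#ifndef COUNTER_H                       /* COUNTER_H is already defined: skipped */
#define COUNTER_H
struct counter { int value; };
int counter_next(struct counter *c);
#endif /* COUNTER_H */

/* ---- counter.c (hypothetical): the single definition of the function ---- */
int counter_next(struct counter *c)
{
    c->value += 1;
    return c->value;
}
```

Keeping only the declaration in the header means every source file that includes it agrees on the function's signature, while the definition is compiled into exactly one object file.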
See also
* Single compilation unit