HOME

TheInfoList



OR:

A multi-pass compiler is a type of
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
that processes the
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the wo ...
or
abstract syntax tree In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurring ...
of a program several times. This is in contrast to a
one-pass compiler In computer programming, a one-pass compiler is a compiler that passes through the parts of each compilation unit only once, immediately translating each part into its final machine code. This is in contrast to a multi-pass compiler which convert ...
, which traverses the program only once. Each pass takes the result of the previous pass as the input, and creates an intermediate output. In this way, the (intermediate) code is improved pass by pass, until the final pass produces the final code. Multi-pass compilers are sometimes called wide compilers, referring to the greater scope of the passes: they can "see" the entire program being compiled, instead of just a small portion of it. The wider scope thus available to these compilers allows better code generation (e.g. smaller code size, faster code) compared to the output of one-pass compilers, at the cost of higher compiler time and memory consumption. In addition, some
languages Language is a structured system of communication. The structure of a language is its grammar and the free components are its vocabulary. Languages are the primary means by which humans communicate, and may be conveyed through a variety of met ...
cannot be compiled in a single pass, as a result of their design.


Typical multi-pass compiler


Lexical analysis

This stage of a multi-pass compiler is to remove irrelevant information from the source program that syntax analysis will not be able to use or interpret. Irrelevant information could include things like comments and white space. In addition to removing the irrelevant information, the lexical analysis determines the lexical tokens of the language. This step means that
forward declaration In computer programming, a forward declaration is a declaration of an identifier (denoting an entity such as a type, a variable, a constant, or a function) for which the programmer has not yet given a complete definition. It is required for a com ...
is generally not necessary if a multi-pass compiler is used. This phase is focused on breaking a sequence of characters into tokens with attributes such as kind, type, value, and potentially others, as well.


Syntax analysis

Syntax analysis is responsible for looking at syntax rules of the language (often as a
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be empt ...
), and building some intermediate representation of the language. An example of this intermediate representation could be something like an
Abstract Syntax Tree In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurring ...
or a
Directed Acyclic Graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ve ...
.


Semantic analysis

Semantic analysis takes the representation made from syntax analysis and applies semantic rules to the representation to make sure that the program meets the semantic rules requirements of the language. For example, in the example below at the stage of semantic analysis if the language required that conditions on ''if'' statements were boolean expressions the ''cond'' would be type-checked to make sure it would be a valid boolean expression. if(cond) else In addition to performing semantic analysis at this stage of compilation, often
symbol tables In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier (or symbols), constants, procedures and functions in a program's source code is associated with i ...
are created in order to assist in code generation.


Code generation

This final stage of a typical compiler converts the intermediate representation of program into an executable set of instructions (often
assembly Assembly may refer to: Organisations and meetings * Deliberative assembly, a gathering of members who use parliamentary procedure for making decisions * General assembly, an official meeting of the members of an organization or of their representa ...
). This last stage is the only stage in compilation that is machine dependent. There can also be
optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfi ...
done at this stage of compilation that make the program more efficient. Other passes of compiler include intermediate code generation phase which takes place before code generation and code optimization phase which can take place when the source program is written, or after intermediate code generation phase, or after code generation phase.


Advantages of multi-pass compilers

Machine Independent: Since the multiple passes include a modular structure, and the code generation decoupled from the other steps of the compiler, the passes can be reused for different hardware/machines. More Expressive Languages: Multiple passes obviate the need for forward declarations, allowing mutual recursion to be implemented elegantly. The prime examples of languages requiring forward declarations due to the requirement of being compilable in a single pass include C and
Pascal Pascal, Pascal's or PASCAL may refer to: People and fictional characters * Pascal (given name), including a list of people with the name * Pascal (surname), including a list of people and fictional characters with the name ** Blaise Pascal, Fren ...
, whereas
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
does not have forward declarations.


References

* Bornat, Richard
''Understanding and Writing Compilers: A Do It Yourself Guide''
Macmillan Publishing, 1979. * Bent Thomsen, ''Languages and Compilers SProg og Overseattere'', Department of Computer Science, Aalborg University {{refend Compilers