HOME

TheInfoList



OR:

In computer science, a preprocessor (or precompiler) is a program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which is often used by some subsequent programs like compilers. The amount and kind of processing done depends on the nature of the preprocessor; some preprocessors are only capable of performing relatively simple textual substitutions and
macro Macro (or MACRO) may refer to: Science and technology * Macroscopic, subjects visible to the eye * Macro photography, a type of close-up photography * Image macro, a picture with text superimposed * Monopole, Astrophysics and Cosmic Ray Observat ...
expansions, while others have the power of full-fledged programming languages. A common example from computer programming is the processing performed on source code before the next step of compilation. In some computer languages (e.g., C and PL/I) there is a phase of translation known as ''preprocessing''. It can also include macro processing, file inclusion and language extensions.


Lexical preprocessors

Lexical preprocessors are the lowest-level of preprocessors as they only require lexical analysis, that is, they operate on the source text, prior to any parsing, by performing simple substitution of tokenized character sequences for other tokenized character sequences, according to user-defined rules. They typically perform macro substitution, textual inclusion of other files, and conditional compilation or inclusion.


C preprocessor

The most common example of this is the C preprocessor, which takes lines beginning with '#' as directives. The C preprocessor does not expect its input to use the syntax of the C language. Some languages take a different approach and use built-in language features to achieve similar things. For example: * Instead of macros, some languages use aggressive inlining and templates. * Instead of includes, some languages use compile-time imports that rely on type information in the object code. * Some languages use if-then-else and dead code elimination to achieve conditional compilation.


Other lexical preprocessors

Other lexical preprocessors include the general-purpose m4, most commonly used in cross-platform build systems such as autoconf, and GEMA, an open source macro processor which operates on patterns of context.


Syntactic preprocessors

Syntactic preprocessors were introduced with the
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
family of languages. Their role is to transform syntax trees according to a number of user-defined rules. For some programming languages, the rules are written in the same language as the program (compile-time reflection). This is the case with
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
and
OCaml OCaml ( , formerly Objective Caml) is a general-purpose programming language, general-purpose, multi-paradigm programming language which extends the Caml dialect of ML (programming language), ML with object-oriented programming, object-oriented ...
. Some other languages rely on a fully external language to define the transformations, such as the
XSLT XSLT (Extensible Stylesheet Language Transformations) is a language originally designed for transforming XML documents into other XML documents, or other formats such as HTML for web pages, plain text or XSL Formatting Objects, which may subseque ...
preprocessor for XML, or its statically typed counterpart
CDuce {{notability, date=August 2018 CDuce is an XML-oriented functional language, which extends XDuce in a few directions. It features XML regular expression types, XML regular expression patterns, XML iterators. CDuce is not strictly speaking an XML t ...
. Syntactic preprocessors are typically used to customize the syntax of a language, extend a language by adding new primitives, or embed a domain-specific programming language (DSL) inside a general purpose language.


Customizing syntax

A good example of syntax customization is the existence of two different syntaxes in the
Objective Caml OCaml ( , formerly Objective Caml) is a general-purpose, multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, D ...
programming language. Programs may be written indifferently using the "normal syntax" or the "revised syntax", and may be pretty-printed with either syntax on demand. Similarly, a number of programs written in
OCaml OCaml ( , formerly Objective Caml) is a general-purpose programming language, general-purpose, multi-paradigm programming language which extends the Caml dialect of ML (programming language), ML with object-oriented programming, object-oriented ...
customize the syntax of the language by the addition of new operators.


Extending a language

The best examples of language extension through macros are found in the
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
family of languages. While the languages, by themselves, are simple dynamically typed functional cores, the standard distributions of
Scheme A scheme is a systematic plan for the implementation of a certain idea. Scheme or schemer may refer to: Arts and entertainment * ''The Scheme'' (TV series), a BBC Scotland documentary series * The Scheme (band), an English pop band * ''The Schem ...
or
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in ANSI standard document ''ANSI INCITS 226-1994 (S20018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived fro ...
permit imperative or object-oriented programming, as well as static typing. Almost all of these features are implemented by syntactic preprocessing, although it bears noting that the "macro expansion" phase of compilation is handled by the compiler in Lisp. This can still be considered a form of preprocessing, since it takes place before other phases of compilation.


Specializing a language

One of the unusual features of the
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
family of languages is the possibility of using macros to create an internal DSL. Typically, in a large
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
-based project, a module may be written in a variety of such minilanguages, one perhaps using a SQL-based dialect of
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
, another written in a dialect specialized for GUIs or pretty-printing, etc.
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in ANSI standard document ''ANSI INCITS 226-1994 (S20018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived fro ...
's standard library contains an example of this level of syntactic abstraction in the form of the LOOP macro, which implements an Algol-like minilanguage to describe complex iteration, while still enabling the use of standard Lisp operators. The MetaOCaml preprocessor/language provides similar features for external DSLs. This preprocessor takes the description of the semantics of a language (i.e. an interpreter) and, by combining compile-time interpretation and code generation, turns that definition into a compiler to the
OCaml OCaml ( , formerly Objective Caml) is a general-purpose programming language, general-purpose, multi-paradigm programming language which extends the Caml dialect of ML (programming language), ML with object-oriented programming, object-oriented ...
programming language—and from that language, either to bytecode or to native code.


General purpose preprocessor

Most preprocessors are specific to a particular data processing task (e.g., compiling the C language). A preprocessor may be promoted as being ''general purpose'', meaning that it is not aimed at a specific usage or programming language, and is intended to be used for a wide variety of text processing tasks. M4 is probably the most well known example of such a general purpose preprocessor, although the C preprocessor is sometimes used in a non-C specific role. Examples: * using C preprocessor for JavaScript preprocessing. * using C preprocessor for
devicetree In computing, a devicetree (also written device tree) is a data structure describing the hardware components of a particular computer so that the operating system's kernel can use and manage those components, including the CPU or CPUs, the memory, ...
processing within the
Linux kernel The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel. It was originally authored in 1991 by Linus Torvalds for his i386-based PC, and it was soon adopted as the kernel for the GNU ope ...
. * using M4 (see on-article example) or C preprocessorShow how to use C-preprocessor as template engine
"Using a C preprocessor as an HTML authoring tool"
''by J. Korpela'', 2000.
as a template engine, to HTML generation. * imake, a make interface using the C preprocessor, written for the X Window System but now deprecated in favour of automake. * grompp, a preprocessor for simulation input files for GROMACS (a fast, free, open-source code for some problems in
computational chemistry Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of m ...
) which calls the system C preprocessor (or other preprocessor as determined by the simulation input file) to parse the topology, using mostly the #define and #include mechanisms to determine the effective topology at grompp run time.


See also

* * * * * * * * * * The * The * The * The *


References


External links

{{Wiktionary, preprocessor
DSL Design in Lisp


* Th
Generic PreProcessor
* Gema, th
General Purpose Macro Processor
* The PIKTbr>piktc

pyexpander, a python based general purpose macro processor

minimac, a minimalist macro processor

Java Comment Preprocessor
Programming language implementation