Camlp4

Camlp4 is a software system for writing extensible parsers for programming languages. It provides a set of OCaml libraries that are used to define grammars as well as loadable syntax extensions of such grammars. Camlp4 stands for Caml Preprocessor and Pretty-Printer, and one of its most important applications was the definition of domain-specific extensions of the syntax of OCaml.

Camlp4 was part of the official OCaml distribution, which is developed at INRIA. Its original author is Daniel de Rauglaudre. OCaml version 3.10.0, released in May 2007, introduced a significantly modified and backward-incompatible version of Camlp4. De Rauglaudre maintains a separate backward-compatible version, which has been renamed Camlp5. All of the examples below are for Camlp5 or the previous version of Camlp4 (versions 3.09 and prior).

Version 4.08, released in the summer of 2019, was the last official version of this library. It is currently deprecated; instead, it is recommended to use the PPX (PreProcessor eXtensions) libraries.


Concrete and abstract syntax

A Camlp4 preprocessor operates by loading a collection of compiled modules which define a parser as well as a pretty-printer: the parser converts an input program into an internal representation. This internal representation constitutes the abstract syntax tree (AST). It can be output in a binary form, e.g. it can be passed directly to one of the OCaml compilers, or it can be converted back into a clear text program.

The notion of concrete syntax refers to the format in which the abstract syntax is represented. For instance, the OCaml expression (1 + 2) can also be written ((+) 1 2) or (((+) 1) 2). The difference is only at the level of the concrete syntax, since these three versions are equivalent representations of the same abstract syntax tree. As demonstrated by the definition of a revised syntax for OCaml, the same programming language can use different concrete syntaxes. They would all converge to an abstract syntax tree in a unique format that a compiler can handle.

The abstract syntax tree is at the center of the syntax extensions, which are in fact OCaml programs. Although the definition of grammars must be done in OCaml, the parser that is being defined or extended is not necessarily related to OCaml, in which case the syntax tree that is being manipulated is not the one of OCaml. Several libraries are provided which facilitate the specific manipulation of OCaml syntax trees.
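
The distinction between concrete and abstract syntax can be illustrated without Camlp4 at all. The following is a minimal sketch in plain OCaml: the type expr and the functions to_infix, to_prefix and eval are hypothetical names introduced for this illustration and are not part of the Camlp4 libraries or of the real OCaml AST. One tree is printed in two concrete syntaxes, while its value depends only on the tree.

```ocaml
(* A toy abstract syntax tree for integer addition.
   Illustrative only: this is NOT the OCaml AST that Camlp4 manipulates. *)
type expr =
  | Int of int
  | Add of expr * expr

(* Concrete syntax 1: infix notation, e.g. (1 + 2). *)
let rec to_infix = function
  | Int n -> string_of_int n
  | Add (a, b) -> Printf.sprintf "(%s + %s)" (to_infix a) (to_infix b)

(* Concrete syntax 2: prefix notation, e.g. ((+) 1 2). *)
let rec to_prefix = function
  | Int n -> string_of_int n
  | Add (a, b) -> Printf.sprintf "((+) %s %s)" (to_prefix a) (to_prefix b)

(* Evaluation looks only at the tree, never at a printed form. *)
let rec eval = function
  | Int n -> n
  | Add (a, b) -> eval a + eval b

(* The single tree shared by both printed forms. *)
let tree = Add (Int 1, Int 2)

let () =
  print_endline (to_infix tree);
  print_endline (to_prefix tree);
  Printf.printf "%d\n" (eval tree);
  (* In OCaml itself, the three spellings from the text denote the same value. *)
  assert ((1 + 2) = ((+) 1 2) && ((+) 1 2) = (((+) 1) 2))
```

A pretty-printer in the Camlp4 sense plays the role of to_infix or to_prefix here, and a parser performs the inverse step from text to tree.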


Fields of application

Domain-specific languages are a major application of Camlp4. Since OCaml is a multi-paradigm language, with an interactive toplevel and a native code compiler, it can be used as a backend for any kind of original language. The only thing that the developer has to do is write a Camlp4 grammar which converts the domain-specific language in question into a regular OCaml program. Other target languages can also be used, such as C.

If the target language is OCaml, simple syntax add-ons or syntactic sugar can be defined, in order to provide an expressivity which is not easy to achieve using the standard features of the OCaml language. A syntax extension is defined by a compiled OCaml module, which is passed to the camlp4o executable along with the program to process.

Camlp4 includes a domain-specific language as it provides syntax extensions which ease the development of syntax extensions. These extensions allow a compact definition of grammars (EXTEND statements) and quotations such as <:expr< 1 + 1 >>, i.e. deconstructing and constructing abstract syntax trees in concrete syntax.
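
The idea of using OCaml as a backend for a small language can be sketched in plain OCaml, without any Camlp4 grammar. The example below is only an illustration under stated assumptions: the postfix mini-language and the function name ocaml_of_rpn are invented here, and a real Camlp4 translator would build an abstract syntax tree with EXTEND rules and quotations rather than produce a string.

```ocaml
(* Translate a space-separated postfix expression over integers into the text
   of an equivalent OCaml expression.
   Hypothetical helper for illustration; it is not part of Camlp4. *)
let ocaml_of_rpn (src : string) : string =
  (* Process one token against a stack of already translated operands. *)
  let step stack tok =
    match tok, stack with
    | ("+" | "-" | "*"), b :: a :: rest ->
        (* Pop two operands and push the parenthesised infix form. *)
        Printf.sprintf "(%s %s %s)" a tok b :: rest
    | ("+" | "-" | "*"), _ ->
        invalid_arg "ocaml_of_rpn: missing operand"
    | _, _ ->
        (* Any other token must be an integer literal. *)
        (match int_of_string_opt tok with
         | Some _ -> tok :: stack
         | None -> invalid_arg ("ocaml_of_rpn: bad token " ^ tok))
  in
  let tokens =
    List.filter (fun t -> t <> "") (String.split_on_char ' ' src)
  in
  match List.fold_left step [] tokens with
  | [e] -> e
  | _ -> invalid_arg "ocaml_of_rpn: malformed expression"

let () = print_endline (ocaml_of_rpn "1 2 + 3 *")
```

The string produced is ordinary OCaml source that could be handed to the compiler or the toplevel, which is the role the text assigns to OCaml as a target language.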


Example

The following example defines a syntax extension of OCaml. It provides a new keyword, memo, which can be used as a replacement for function and provides automatic memoization of functions with pattern matching. Memoization consists in storing the results of previous computations in a table so that the actual computation of the function for each possible argument occurs at most once.

This is pa_memo.ml, the file which defines the syntax extension:

    let unique =
      let n = ref 0 in
      fun () -> incr n; "__pa_memo" ^ string_of_int !n

    EXTEND
      GLOBAL: Pcaml.expr;

      Pcaml.expr: LEVEL "expr1" [
        [ "memo"; OPT "|"; pel = LIST1 match_case SEP "|" ->
          let tbl = unique () in
          let x = unique () in
          let result = unique () in
          <:expr<
            let $lid:tbl$ = Hashtbl.create 100 in
            fun $lid:x$ ->
              try Hashtbl.find $lid:tbl$ $lid:x$
              with [ Not_found ->
                      let $lid:result$ = match $lid:x$ with [ $list:pel$ ] in
                      do { Hashtbl.replace $lid:tbl$ $lid:x$ $lid:result$;
                           $lid:result$ } ]
          >> ]
      ];

      match_case: [
        [ p = Pcaml.patt; w = OPT [ "when"; e = Pcaml.expr -> e ];
          "->"; e = Pcaml.expr ->
            (p, w, e) ]
      ];
    END

Example of program using this syntax extension:

    let counter = ref 0 (* global counter of multiplications *)

    (* factorial with memoization *)
    let rec fac = memo
        0 -> 1
      | n when n > 0 ->
          (incr counter;
           n * fac (n - 1))
      | _ -> invalid_arg "fac"

    let run n =
      let result = fac n in
      let count = !counter in
      Printf.printf "%i! = %i number of multiplications so far = %i\n"
        n result count

    let _ =
      List.iter run [5; 4; 6]

The output of the program is as follows, showing that the fac function (factorial) only computes products that were not computed previously:

    5! = 120 number of multiplications so far = 5
    4! = 24 number of multiplications so far = 5
    6! = 720 number of multiplications so far = 6
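
Reading the quotation in pa_memo.ml, a definition written with memo should expand to roughly the following standard OCaml. This is a hand-written sketch in the normal (not revised) syntax, not actual Camlp4 or Camlp5 output; the names tbl, x and result are placeholders for the fresh identifiers that unique () would generate. It runs with a stock OCaml compiler and no preprocessor.

```ocaml
(* Global counter of multiplications, as in the example program. *)
let counter = ref 0

(* Hand-expanded equivalent of `let rec fac = memo ...`.
   tbl, x and result are placeholder names (assumptions), standing in for
   the generated identifiers "__pa_memo1", "__pa_memo2", ... *)
let rec fac =
  (* The table is created once, when fac is defined. *)
  let tbl = Hashtbl.create 100 in
  fun x ->
    try Hashtbl.find tbl x
    with Not_found ->
      (* First call with this argument: compute, store, return. *)
      let result =
        match x with
        | 0 -> 1
        | n when n > 0 ->
            (incr counter;
             n * fac (n - 1))
        | _ -> invalid_arg "fac"
      in
      Hashtbl.replace tbl x result;
      result

(* Print one line per call, in the format used by the example program. *)
let run n =
  let result = fac n in
  let count = !counter in
  Printf.printf "%i! = %i number of multiplications so far = %i\n"
    n result count

let () = List.iter run [5; 4; 6]
```

Because the table is allocated outside the fun, it is shared by every call, including the recursive ones, which is why computing 4! after 5! costs no further multiplications.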


External links


Camlp4 Wiki - covers version 3.10
Camlp5 website



- cover version 3.07

- covers versions up to 3.09

- covers versions up to 3.10
ttp://www.venge.net/graydon/talks/mkc/html/index.html One-Day Compilers or How I learned to stop worrying and love metaprogramming
Tutorial on building extensible parsers with Camlp4