PackCC
   HOME

TheInfoList



OR:

PackCC is a
parser generator In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine. The most common type of compiler- ...
for C. Its main features are as follows: * Generates a parser in written C from a grammar described in a PEG, * Gives a parser great efficiency by packrat parsing, * Supports direct and indirect left-recursive grammar rules, * Generates a
thread-safe Thread safety is a computer programming concept applicable to multi-threaded code. Thread-safe code only manipulates shared data structures in a manner that ensures that all threads behave properly and fulfill their design specifications without uni ...
and reentrant parser, * Consists of just a single compact source file. The grammar of an output parser can be described in a PEG (Parsing Expression Grammar). The PEG is a
top-down parsing language Top-Down Parsing Language (TDPL) is a type of analytic grammar, analytic formal grammar developed by Alexander Birman in the early 1970s in order to study formally the behavior of a common class of practical top-down parsing, top-down parsers that s ...
, and is similar to the
regular expression A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or ...
grammar. Compared with a bottom-up parsing language, like
Yacc Yacc (Yet Another Compiler-Compiler) is a computer program for the Unix operating system developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right Rightmost Derivation (LALR) parser generator, generating a LALR parser (the part of a com ...
's one, the PEG is much more intuitive and cannot be ambiguous. The PEG does not require
tokenization Tokenization may refer to: * Tokenization (lexical analysis) in language processing * Tokenization (data security) in the field of data security * Word segmentation * Tokenism Tokenism is the practice of making only a perfunctory or symbolic ...
to be a separate step, and tokenization rules can be written in the same way as any other grammar rules. The generated parser can parse inputs very efficiently by packrat parsing. The packrat parsing is the recursive descent parsing algorithm that is accelerated using
memoization In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again. Memoization ...
. By using packrat parsing, any input can be parsed in linear time. Without it, however, the resulting parser could exhibit exponential time performance in the worst case due to the unlimited look-ahead capability. Unlike common packrat parsers, PackCC can support direct and indirect left-recursive grammar rules.''"Packrat Parsers Can Support Left Recursion"''
authored by A. Warth, J. R. Douglass, and T. Millstein This makes grammar rules much more intuitive. The generated code is beautified and as ease-of-understanding as possible. Actually, it uses many goto statements, but the control flows are much more traceable than goto spaghetti storms generated by some other parser generators. PackCC itself is under MIT license, but the generated code can be distributed under any license or can be used in proprietary software.


Input file example

A desktop calculator. Note that left-recursive grammar rules are included. %prefix "calc" statement <- _ e:expression _ EOL / ( !EOL . )* EOL expression <- e:term term <- l:term _ '+' _ r:factor / l:term _ '-' _ r:factor / e:factor factor <- l:factor _ '*' _ r:unary / l:factor _ '/' _ r:unary / e:unary unary <- '+' _ e:unary / '-' _ e:unary / e:primary primary <- < -9 > / '(' _ e:expression _ ')' _ <- \t EOL <- '\n' / '\r\n' / '\r' / ';' %% int main()


Notes

{{Reflist


External links


PackCC

"''Packrat Parsers Can Support Left Recursion''"

peg/leg
Parser generators