Ragel is a finite-state machine compiler and a parser generator. Initially Ragel supported output for C, C++ and assembly source code, and was expanded to support several other languages including Objective-C, D, Go, Ruby, and Java.
Additional language support is also in development. It supports the generation of table-driven or control-flow-driven state machines from regular expressions and/or state charts, and can also build lexical analysers via the longest-match method. Ragel specifically targets text parsing and input validation.
[Omar Badreddin (2010). "Umple: a model-oriented programming language." Software Engineering, 2010 ACM/IEEE 32nd International Conference on. Vol. 2. IEEE, 2010.]
Overview
Ragel supports the generation of table-driven or control-flow-driven state machines from regular expressions and/or state charts, and can also build lexical analysers via the longest-match method.
A unique feature of Ragel is that user actions can be associated with arbitrary state machine transitions using operators that are integrated into the regular expressions. Ragel also supports visualization of the generated machine via Graphviz.
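To make the visualization step concrete, the following Python sketch writes a small machine out as Graphviz DOT text. It is an illustration under assumptions, not Ragel's own output format: the state numbers, the `TRANSITIONS` list and the `to_dot` helper are hypothetical. The machine is a floating-point number recognizer whose transitions are labelled with decimal byte values; the trailing line feed is left out for brevity.

```python
# Transitions of a number machine as (source, low byte, high byte, target).
# State numbers are illustrative and need not match Ragel's own numbering.
TRANSITIONS = [
    (0, 48, 57, 1),                                   # first digit
    (1, 48, 57, 1), (1, 46, 46, 2),                   # more digits, or '.'
    (1, 69, 69, 4), (1, 101, 101, 4),                 # 'E' or 'e'
    (2, 48, 57, 3),                                   # first fraction digit
    (3, 48, 57, 3), (3, 69, 69, 4), (3, 101, 101, 4),
    (4, 43, 43, 5), (4, 45, 45, 5), (4, 48, 57, 6),   # sign or exponent digit
    (5, 48, 57, 6),
    (6, 48, 57, 6),
]
ACCEPTING = [1, 3, 6]


def to_dot(transitions, accepting, name="number"):
    """Render the machine as Graphviz DOT text."""
    lines = [f"digraph {name} {{", "    rankdir=LR;"]
    finals = " ".join(f"{state};" for state in accepting)
    # Accepting states are drawn with a double circle.
    lines.append(f"    node [shape=doublecircle]; {finals}")
    lines.append("    node [shape=circle];")
    for src, lo, hi, dst in transitions:
        label = str(lo) if lo == hi else f"{lo}..{hi}"
        lines.append(f'    {src} -> {dst} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


print(to_dot(TRANSITIONS, ACCEPTING))
```

Saved to a file such as `number.dot`, the output can be rendered with Graphviz, for example with `dot -Tpng number.dot -o number.png`.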
The example graph shows a state machine that takes user input as a series of bytes representing ASCII characters and control codes, with transitions labelled by decimal byte values. 48..57 is equivalent to the regular expression [0-9] (i.e. any digit), so only sequences beginning with a digit can be recognised. If 10 (line feed) is encountered, the number is complete. 46 is the decimal point ('.'), 43 and 45 are the positive and negative signs ('+', '-'), and 69/101 is uppercase/lowercase 'e' (to indicate a number in scientific format). As such, it will recognize the following properly:
2
45
055
78.1
2e5
78.3e12
69.0e-3
3e+3
but not:
.3
46.
-5
3.e2
2e5.1
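The accepted and rejected inputs listed above can be checked with a small table-driven recognizer. The following Python sketch is hand-written for illustration and is not Ragel output: the state numbering and the names `TABLE`, `ACCEPTING` and `recognizes` are assumptions, and the trailing line feed (10) is left out so that the strings can be tested exactly as listed.

```python
# (source state, low byte, high byte, target state); numbering is illustrative.
TRANSITIONS = [
    (0, 48, 57, 1),                                   # a number must start with a digit
    (1, 48, 57, 1), (1, 46, 46, 2),                   # integer digits, then optional '.'
    (1, 69, 69, 4), (1, 101, 101, 4),                 # 'E' / 'e' after the integer part
    (2, 48, 57, 3),                                   # '.' must be followed by a digit
    (3, 48, 57, 3), (3, 69, 69, 4), (3, 101, 101, 4),
    (4, 43, 43, 5), (4, 45, 45, 5), (4, 48, 57, 6),   # optional sign, then digits
    (5, 48, 57, 6),
    (6, 48, 57, 6),
]
ACCEPTING = {1, 3, 6}
NUM_STATES = 7

# Dense lookup table: TABLE[state][byte] is the next state, or None if
# the machine has no transition for that byte.
TABLE = [[None] * 256 for _ in range(NUM_STATES)]
for src, lo, hi, dst in TRANSITIONS:
    for byte in range(lo, hi + 1):
        TABLE[src][byte] = dst


def recognizes(data: bytes) -> bool:
    """Return True if the whole input is a number accepted by the machine."""
    state = 0
    for byte in data:
        state = TABLE[state][byte]
        if state is None:
            return False
    return state in ACCEPTING


for sample in (b"78.3e12", b"69.0e-3", b"46.", b"3.e2"):
    print(sample.decode("ascii"), recognizes(sample))
```

The inner loop does nothing but index the table, which is the essence of the table-driven style: the machine's behaviour lives entirely in data, and the driver code stays the same for any machine.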
Syntax
Ragel's input is a regular expression only in the sense that it describes a regular language; it is usually not written as a single concise regular expression, but split into multiple named parts, as in Extended Backus–Naur form. For example, instead of supporting POSIX character classes in regex syntax, Ragel implements them as built-in production rules. As with other parser generators, Ragel allows the handling code for productions to be written alongside the syntax.
The code yielding the above example from the official website is:
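action dgt      { /* user code, run on every digit of the integer and fraction parts */ }
action dec      { /* user code, run on the decimal point */ }
action exp      { /* user code, run on every exponent digit */ }
action exp_sign { /* user code, run on the exponent sign */ }
action number   { /* user code, run when a number is complete */ }

# A floating-point number literal.
number = (
    [0-9]+ $dgt ( '.' @dec [0-9]+ $dgt )?
    ( [eE] ( [+\-] $exp_sign )? [0-9]+ $exp )?
) %number;

main := ( number '\n' )*;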
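To show how actions attached to transitions behave at run time, here is a hand-written Python sketch of the same machine in a control-flow-driven style. It is an approximation for illustration, not code generated by Ragel: the state names, the `scan` function and the event tuples are assumptions. Each action from the grammar is modelled as an event recorded when the corresponding transition is taken, and the `number` action is recorded when the line feed leaves the number.

```python
def is_digit(byte: int) -> bool:
    # 48..57 are the ASCII codes for '0'..'9'.
    return 48 <= byte <= 57


def scan(data: bytes):
    """Return the (action, character) events in order, or None on a parse error."""
    events = []
    state = "start"
    for byte in data:
        ch = chr(byte)
        if state in ("start", "int") and is_digit(byte):
            events.append(("dgt", ch))        # $dgt on integer digits
            state = "int"
        elif state == "int" and ch == ".":
            events.append(("dec", ch))        # @dec on the decimal point
            state = "dot"
        elif state in ("dot", "frac") and is_digit(byte):
            events.append(("dgt", ch))        # $dgt on fraction digits
            state = "frac"
        elif state in ("int", "frac") and ch in "eE":
            state = "e"                       # no action is attached to [eE]
        elif state == "e" and ch in "+-":
            events.append(("exp_sign", ch))   # $exp_sign on the sign
            state = "sign"
        elif state in ("e", "sign", "exp") and is_digit(byte):
            events.append(("exp", ch))        # $exp on exponent digits
            state = "exp"
        elif state in ("int", "frac", "exp") and ch == "\n":
            events.append(("number", ch))     # %number when leaving the number
            state = "start"
        else:
            return None                       # no transition for this byte
    # main := ( number '\n' )* only accepts at the end of a complete line.
    return events if state == "start" else None


print(scan(b"69.0e-3\n"))
print(scan(b"3.e2\n"))
```

Here the current state is encoded in the branch structure rather than in a lookup table, which is the trade-off between the two generation styles: more code, but no table indexing per input byte.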
See also
* Comparison of parser generators
* Executable UML
* Finite-state machine
* Regular expression
* Thompson's construction - the algorithm used by Ragel
* Umple
* Helsinki Finite-State Technology (HFST)
References
External links
* Official website: https://www.colm.net/open-source/ragel/