OMeta is a specialized object-oriented programming language for pattern matching, developed by Alessandro Warth and Ian Piumarta in 2007 under the Viewpoints Research Institute. The language is based on Parsing Expression Grammars (PEGs) rather than Context-Free Grammars, with the intent of providing "a natural and convenient way for programmers to implement tokenizers, parsers, visitors, and tree-transformers".
[Warth, Alessandro, and Ian Piumarta. "OMeta: An Object-Oriented Language for Pattern Matching." ACM SIGPLAN 2007 Dynamic Languages Symposium (DLS '07). 03rd ed. Vol. TR-2007. Glendale, CA: Viewpoints Research Institute, 2007. VPRI Technical Report. Web. 30 Sept. 2013.]
OMeta's main goal is to allow a broader audience to use techniques generally available only to language programmers, such as parsing.
It is also known for its use in quickly creating prototypes, though programs written in OMeta are noted to be generally less efficient than those written in vanilla (base language) implementations, such as JavaScript.[Klint, Paul, Tijs Van Der Storm, and Jurgen Vinju. "On the Impact of DSL Tools on the Maintainability of Language Implementations." LDTA '10 Proceedings of the Tenth Workshop on Language Descriptions, Tools and Applications. New York, NY. N.p., 2010. Web. 30 Sept. 2013.][Heirbaut, Nickolas. "Two Implementation Techniques for Domain Specific Languages Compared: OMeta/JS vs. Javascript." Thesis. University of Amsterdam, 2009. Web. 30 Sept. 2013.]
OMeta is noted for its use in creating domain-specific languages, and especially for the maintainability of its implementations (Newcome). OMeta, like other meta languages, requires a host language; it was originally created as a COLA implementation.
Description
OMeta is a meta-language used in the prototyping and creation of domain-specific languages. It was introduced as "an object-oriented language for pattern matching". It uses parsing expression grammars (descriptions of languages "based on recognizing strings instead of generating them"[Mascarenhas, Fabio, Sergio Medeiros, and Roberto Ierusalimschy. Parsing Expression Grammars for Structured Data. N.p.: n.p., n.d. Web.]) designed "to handle arbitrary kinds of data", such as characters, numbers, strings, atoms, and lists. This increases its versatility, enabling it to work on both structured and unstructured data.
The language's main advantage over similar languages is its ability to use the same code for all steps of compiling (e.g. lexing and parsing). OMeta also supports defining production rules that take arguments; this can be used to add such rules to OMeta itself, as well as to the host language that OMeta is running in. Additionally, these rules can take each other as arguments, creating "higher-order rules", and inherit from each other to gain production rules from existing code. OMeta can use host-language boolean expressions (True/False) while pattern matching; these are referred to as "semantic predicates". OMeta uses generalized pattern matching to allow programmers to more easily implement and extend phases of compilation with a single tool.
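The ideas in this paragraph can be illustrated outside OMeta. The following is a minimal Python sketch, not OMeta syntax or any OMeta implementation's API; all function names (item, choice, many1, where, number) are hypothetical. It shows prioritized choice, a higher-order rule that takes another rule as its argument, and a semantic predicate written as an ordinary host-language boolean function.

```python
# Minimal PEG-style combinators (hypothetical names, not OMeta's API).
# A rule takes (items, pos) and returns (value, new_pos), or None on failure.

def item(pred):
    """Match one input item that satisfies a host-language predicate."""
    def rule(items, pos):
        if pos < len(items) and pred(items[pos]):
            return items[pos], pos + 1
        return None
    return rule

def choice(*rules):
    """Prioritized choice: try alternatives in order, keep the first match."""
    def rule(items, pos):
        for r in rules:
            result = r(items, pos)
            if result is not None:
                return result
        return None
    return rule

def many1(r):
    """Higher-order rule: one or more repetitions of the rule passed in."""
    def rule(items, pos):
        values = []
        result = r(items, pos)
        while result is not None:
            value, pos = result
            values.append(value)
            result = r(items, pos)
        return (values, pos) if values else None
    return rule

def where(r, pred):
    """Semantic predicate: accept r's result only if pred(value) is true."""
    def rule(items, pos):
        result = r(items, pos)
        if result is not None and pred(result[0]):
            return result
        return None
    return rule

def number(items, pos):
    """number = digit+, converted to a host-language int."""
    result = many1(item(str.isdigit))(items, pos)
    if result is None:
        return None
    digits, new_pos = result
    return int("".join(digits)), new_pos

# A rule that only matches even numbers, using a semantic predicate.
even_number = where(number, lambda n: n % 2 == 0)
```

Because item accepts any predicate, the same combinators work on a list of tokens or other objects as well as on a string of characters.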
OMeta uses grammars to determine the rules by which it operates. The grammars are able to hold an indefinite number of variables due to the use of an __init__ function called when a grammar is created. Grammars can inherit from as well as call each other (using the "foreign production invocation mechanism", which enables grammars to "borrow" each other's input streams), much like classes in full programming languages. Unlike most meta-languages, OMeta also prioritizes the options within a given grammar in order to remove ambiguity. After pattern-matching an input against a given grammar, OMeta assigns each component of the pattern to a variable, which it then feeds into the host language.[Moser, Jeff. "Moserware." OMeta#: Who? What? When? Where? Why? Blogger, 24 June 2008. Web. 30 Sept. 2013.]
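The difference between inheriting a grammar and invoking a foreign one can be sketched with ordinary classes. The Python below is an illustration under assumed names (Grammar, Digits, SignedDigits, Pair, foreign), not the API of any OMeta implementation: a subclass overrides a rule and reuses the parent's version, while an unrelated grammar borrows the input stream to run another grammar's rule.

```python
class Grammar:
    """Hypothetical base class: holds the input stream and a position."""
    def __init__(self, text, pos=0):
        self.text = text
        self.pos = pos

    def char(self, pred):
        """Consume one character if it satisfies pred, else return None."""
        if self.pos < len(self.text) and pred(self.text[self.pos]):
            self.pos += 1
            return self.text[self.pos - 1]
        return None

    def foreign(self, grammar_cls, rule_name):
        """Run a rule of another grammar on this grammar's input stream."""
        other = grammar_cls(self.text, self.pos)
        value = getattr(other, rule_name)()
        if value is not None:
            self.pos = other.pos  # adopt the position the other grammar reached
        return value

class Digits(Grammar):
    def number(self):
        start = self.pos
        while self.char(str.isdigit) is not None:
            pass
        return int(self.text[start:self.pos]) if self.pos > start else None

class SignedDigits(Digits):
    # Inheritance: override `number`, reusing the parent's rule via super().
    def number(self):
        start = self.pos
        negative = self.char(lambda c: c == "-") is not None
        value = super().number()
        if value is None:
            self.pos = start  # backtrack past a consumed '-'
            return None
        return -value if negative else value

class Pair(Grammar):
    # Foreign invocation: borrow SignedDigits.number without inheriting it.
    def pair(self):
        start = self.pos
        left = self.foreign(SignedDigits, "number")
        comma = self.char(lambda c: c == ",")
        right = self.foreign(SignedDigits, "number") if comma else None
        if left is None or right is None:
            self.pos = start  # backtrack on failure
            return None
        return (left, right)
```

For example, Pair("-3,14").pair() yields the tuple (-3, 14), with both numbers recognized by the borrowed grammar.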
OMeta uses pattern matching in order to accomplish all of the steps of traditional compiling by itself. It first finds patterns in characters to create tokens, then it matches those tokens to its grammar to make syntax trees. Typecheckers then match patterns on the syntax trees to make annotated trees, and visitors do the same to produce other trees. A code generator then pattern-matches the trees to produce the code. In OMeta, it is easy to "traverse through the parse tree since such functionality is natively supported".
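The phases described above can be sketched as three small pattern-matching stages. This Python example is only an illustration with an assumed toy grammar (sums and products of integers) and a made-up stack-machine output format; it performs no error handling and is not how OMeta itself is implemented.

```python
import re

def tokenize(text):
    """Stage 1: match patterns in characters to produce tokens."""
    return re.findall(r"\d+|[+*()]", text)

def parse(tokens):
    """Stage 2: match patterns in tokens to build a syntax tree.

    Assumed grammar for this sketch:
      expr = term ('+' term)*
      term = atom ('*' atom)*
      atom = NUMBER | '(' expr ')'
    """
    def atom(i):
        if tokens[i] == "(":
            tree, i = expr(i + 1)
            return tree, i + 1  # skip the closing ')'
        return ("num", int(tokens[i])), i + 1

    def term(i):
        tree, i = atom(i)
        while i < len(tokens) and tokens[i] == "*":
            right, i = atom(i + 1)
            tree = ("mul", tree, right)
        return tree, i

    def expr(i):
        tree, i = term(i)
        while i < len(tokens) and tokens[i] == "+":
            right, i = term(i + 1)
            tree = ("add", tree, right)
        return tree, i

    tree, _ = expr(0)
    return tree

def generate(tree):
    """Stage 3: match patterns in the tree to emit stack-machine code."""
    if tree[0] == "num":
        return [f"PUSH {tree[1]}"]
    op, left, right = tree
    return generate(left) + generate(right) + [op.upper()]
```

Each stage consumes the previous stage's output: generate(parse(tokenize("1+2*3"))) walks from characters to tokens to a tree to instructions.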
The meta-language is noted for its usability in most programming languages, though it is most commonly used in its language of implementation; OMeta/JS, for example, is used in JavaScript. Because it requires a host language, the creators of OMeta refer to it as a "parasitic language".[Warth, Alessandro. "metaOn OMeta's Syntax." metaOn OMeta's Syntax. N.p., 4 July 2008. Web. 16 Oct. 2013.]
Development
Alessandro Warth and Ian Piumarta developed OMeta at the Viewpoints Research Institute, an organization intended to improve research systems and personal computing, in 2007. They first used a Combined Object Lambda Architecture, or COLA (a self-describing language investigated at Viewpoints Research Institute) as OMeta's host language, and later, assisted by Yoshiki Ohshima, ported it to Squeak Smalltalk to verify its usability with multiple host languages. OMeta was also used "to implement a nearly complete subset of…Javascript" as a case study in its introductory paper.
Usage
OMeta, like other meta languages, is primarily used to create domain-specific languages (DSLs); specifically, it is used to quickly prototype DSLs, because OMeta's slow running speed and unclear error reports limit its usefulness as a full programming language (Heirbaut 73–74). OMeta is useful thanks to its ability to use one syntax for every phase of compiling, allowing it to replace several separate tools in the creation of a compiler. Additionally, OMeta is valued both for the speed at which it can be used to create DSLs and for the significantly smaller amount of code it requires for such a task compared with vanilla implementations, with reports showing around 26% as many lines of functional code as vanilla.
Examples
The following is an example of a basic calculator language in C# using OMeta:
ometa BasicCalc <: Parser
It is also possible to create subclasses of previously written languages:
ometa ExponentCalc <: BasicCalc
Previously written languages can also be called rather than inherited:
ometa ScientificCalc <: Parser
Versions
OMeta can theoretically be implemented in any host language, but it is used most often as OMeta/JS, a JavaScript implementation. Warth has stated that patterns in "OMeta/X---where X is some host language" are better left to be influenced by "X" than standardized within OMeta, because different host languages recognize different types of objects.
MetaCOLA
MetaCOLA was the first implementation of OMeta, used in the language's introductory paper. MetaCOLA implemented OMeta's first test codes, and was one of the three forms (the others being OMeta/Squeak and a nearly-finished OMeta/JS) of the language made prior to its release.
OMeta/Squeak
OMeta/Squeak was a port of OMeta used during the initial demonstration of the system. OMeta/Squeak is used "to experiment with alternative syntaxes for the Squeak EToys system". OMeta/Squeak requires square brackets and "pointy brackets" (angle brackets) in rule operations, unlike OMeta/JS, which requires only square brackets. OMeta/Squeak 2, however, features syntax more similar to that of OMeta/JS.[Warth, Alessandro. "OMeta/Squeak 2." OMeta/Squeak 2. N.p., n.d. Web. 16 Oct. 2013.] Unlike the COLA implementation of OMeta, the Squeak version does not memoize intermediate results (store results that have already been computed).
OMeta/JS
OMeta/JS is OMeta in the form of a JavaScript implementation. Language implementations using OMeta/JS are noted to be easier to use and more space-efficient than those written using only vanilla JavaScript, but the former have been shown to perform much more slowly. Because of this, OMeta/JS is seen as a highly useful tool for prototyping, but is not preferred for production language implementations.
Vs. JavaScript
Implementations built with DSL development tools such as OMeta are considered much more maintainable than "vanilla implementations" (i.e. JavaScript) due to their low NCLOC (Non-Comment Lines of Code) count. This is due in part to the "semantic action code which creates the AST objects or performs limited string operations". OMeta's lack of "context-free syntax" allows it to be used in both parser and lexer creation at the cost of extra lines of code. Additional factors indicating OMeta's maintainability include a high maintainability index, "while Halstead Effort indicate that the vanilla parser requires three times more development effort compared to the OMeta parser". Like JavaScript, OMeta/JS supports "the complete syntax notation of Waebric".
One of the major advantages of OMeta responsible for the difference in NCLOC is the reuse of its "tree walking mechanism": the typechecker inherits the mechanism from the parser, so the typechecker adapts to changes in the OMeta parser, whereas JavaScript's tree walking mechanism contains more code and must be manually adapted to changes in the parser. Another is the fact that OMeta's grammars have a "higher abstraction level...than the program code". The difference can also be considered "the result of the semantic action code which creates the AST objects or performs limited string operations", though the grammar needs relatively many lines of code per function because whitespace must be defined explicitly, a mechanism implemented to allow OMeta to act as a single tool for DSL creation.
In terms of performance, OMeta is found to run at slow speeds in comparison to vanilla implementations. The use of backtracking techniques by OMeta is a potential major cause for this (OMeta's parser "includes seven look-ahead operators...These operators are necessary to distinguish certain rules from each other and cannot be left out of the grammar"); however, it is more likely that this performance drop is due to OMeta's method of memoization:
"The storage of intermediate parsing steps causes the size of the parsing table
to be proportional with the number of terminals and non-terminals (operands)
used in the grammar. Since the grammar of the OMeta parser contains 446
operands, it is believed that performance is affected negatively."
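The trade-off in the quoted passage can be illustrated with a small memoizing recognizer. The Python sketch below uses assumed names (MemoParser, apply) and a three-rule toy grammar; it is not the OMeta/JS implementation. The memo table gains one entry per (rule, position) pair that is tried, so its size grows with both the number of rules and the length of the input.

```python
class MemoParser:
    """Toy recognizer that memoizes every (rule, position) result."""

    def __init__(self, text):
        self.text = text
        self.memo = {}   # (rule name, position) -> end position or None
        self.calls = 0   # number of rule bodies actually executed

    def apply(self, name, pos):
        key = (name, pos)
        if key not in self.memo:          # failures (None) are cached too
            self.calls += 1
            self.memo[key] = getattr(self, name)(pos)
        return self.memo[key]

    def digit(self, pos):
        if pos < len(self.text) and self.text[pos].isdigit():
            return pos + 1
        return None

    def number(self, pos):
        # number = digit number | digit
        nxt = self.apply("digit", pos)
        if nxt is None:
            return None
        rest = self.apply("number", nxt)
        return rest if rest is not None else nxt

    def expr(self, pos):
        # expr = number '+' expr | number
        end = self.apply("number", pos)
        if end is not None and self.text[end:end + 1] == "+":
            rest = self.apply("expr", end + 1)
            if rest is not None:
                return rest
        # Second alternative: the repeated request is served from the table.
        return self.apply("number", pos)
```

After parser = MemoParser("12+3") and parser.apply("expr", 0), inspecting len(parser.memo) shows how many intermediate results were stored for even a four-character input.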
Where OMeta gains time on the vanilla implementation, however, is in lexing. JavaScript's vanilla lexer slows down significantly due to a method by which the implementation converts the entire program into a string through Java before the lexer starts. Despite this, the OMeta implementation runs significantly slower overall.
OMeta also falls behind in terms of error reporting. While vanilla implementations return the correct error message in about "92% of the test cases" in terms of error location, OMeta simply returns "Match failed!" for any error. Finding the source of an error through OMeta requires "manually...counting the newline characters in the semantic action code in order to output at least the line number at which parsing fails".
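The newline-counting workaround mentioned in the quotation can be sketched as follows. This Python example is a hypothetical illustration: the recognizer match_number_lines and its toy input format are assumptions, not part of OMeta/JS or of the cited study.

```python
def line_and_column(text, pos):
    """Convert a character offset into a 1-based (line, column) pair
    by counting the newline characters that precede it."""
    line = text.count("\n", 0, pos) + 1
    last_newline = text.rfind("\n", 0, pos)  # -1 when pos is on line 1
    return line, pos - last_newline

def match_number_lines(text):
    """Recognize newline-separated runs of digits.

    Returns None on success, or the offset of the first character
    that could not be matched."""
    pos = 0
    while pos < len(text):
        start = pos
        while pos < len(text) and text[pos].isdigit():
            pos += 1
        if pos == start:              # expected at least one digit here
            return pos
        if pos < len(text):
            if text[pos] != "\n":     # expected a newline after the digits
                return pos
            pos += 1
    return None

def describe_failure(text):
    """Report a failure location instead of a bare 'Match failed!'."""
    pos = match_number_lines(text)
    if pos is None:
        return "ok"
    line, column = line_and_column(text, pos)
    return f"Match failed at line {line}, column {column}"
```

The location is derived purely from the failure offset, so the same helper could be attached to any matcher that records where it stopped.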
OMeta#
OMeta# is a project by Jeff Moser meant to port OMeta/JS to C#; as such, the design of OMeta# is based on Alessandro Warth's OMeta/JS design. The goal of the project is to give users the ability to make working languages with high simplicity. Specifically, OMeta# is intended to work as a single tool for .NET language development, reduce the steep learning curve of language development, become a useful teaching resource, and be practical for use in real applications. OMeta# currently uses C# 3.0 as OMeta's host language rather than 4.0; because C# 3.0 is a statically typed language rather than a dynamically typed one, recognition of the host language within OMeta# is "two to three times uglier and larger than it might have been" in a dynamically typed language.[Moser, Jeff. "Moserware." Meta-FizzBuzz Blogger, 25 August 2008. Web. 30 Sept. 2013.]
OMeta# uses .NET classes, or Types, as grammars, and methods as the grammars' internal "rules". OMeta# uses braces ({ }) to recognize its host language in grammars. The language has a focus on strong, clean, static typing much like that of its host language, though this adds complexity to the creation of the language. New implementations in C# must also be compatible with the .NET meta-language, making the creation even more complex. Additionally, to prevent users from accidentally misusing the metarules in OMeta#, Moser has opted to implement them as "an explicit interface exposed via a property (e.g. instead of '_apply', I have 'MetaRules.Apply')." Later parts of OMeta# are written in the language itself, though the functionality of the language remains fairly tied to C#.[Moser, Jeff. "Moserware." : Building an Object-Oriented Parasitic Metalanguage Blogger, 31 July 2008. Web. 30 Sept. 2013.] The OMeta# source code is posted on CodePlex and is intended to remain an open-source project. However, updates have been on indefinite hiatus since shortly after the project's beginnings, with recommits by the server on October 1, 2012.
IronMeta
Gordon Tisher created IronMeta for .NET in 2009. While similar to OMeta#, it is a better supported and more robust implementation, distributed under the BSD license on GitHub.
Ohm
Ohm is a successor to OMeta that aims to improve on it by, among other things, separating the grammar from the semantic actions.
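The idea of keeping semantic actions apart from the grammar can be sketched in a few lines. The Python below is an illustration of the general design only, with hypothetical names (parse_sum, apply_actions, evaluate, to_lisp); it does not use or reproduce Ohm's API. The grammar builds a plain parse tree, and independent sets of actions are applied to that tree afterwards.

```python
def parse_sum(text):
    """Grammar only: sum = NUMBER ('+' NUMBER)*; returns a plain parse tree."""
    parts = [p.strip() for p in text.split("+")]
    if not all(p.isdigit() for p in parts):
        return None
    tree = ("num", parts[0])
    for p in parts[1:]:
        tree = ("add", tree, ("num", p))
    return tree

def apply_actions(tree, actions):
    """Walk the tree, calling the action registered for each node kind."""
    kind, *children = tree
    args = [
        apply_actions(c, actions) if isinstance(c, tuple) else c
        for c in children
    ]
    return actions[kind](*args)

# Two independent sets of semantic actions for the same grammar.
evaluate = {"num": int, "add": lambda a, b: a + b}
to_lisp = {"num": str, "add": lambda a, b: f"(+ {a} {b})"}
```

Since the grammar contains no host-language code, new operations such as to_lisp can be added without touching parse_sum.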
See also
* ANTLR (ANother Tool for Language Recognition), a similar meta language
* META II, an early compiler-compiler, influential in OMeta's implementation
OMeta/JS github repository