Extensible Programming

Extensible programming is a term used in computer science to describe a style of computer programming that focuses on mechanisms to extend the programming language, compiler, and runtime environment. Extensible programming languages, supporting this style of programming, were an active area of work in the 1960s, but the movement was marginalized in the 1970s. Extensible programming has become a topic of renewed interest in the 21st century. [Gregory V. Wilson, "Extensible Programming for the 21st Century", ''ACM Queue'' 2 no. 9 (Dec/Jan 2004–2005).]


Historical movement

The first paper usually associated with the extensible programming language movement is M. Douglas McIlroy's 1960 paper on macros for higher-level programming languages. [Standish, Thomas A., "Extensibility in Programming Language Design", ''SIGPLAN Notices'' 10 no. 7 (July 1975), pp. 18–21. Sammet, Jean E., ''Programming Languages: History and Fundamentals'', Prentice-Hall, 1969, section III.7.2. McIlroy, M.D., "Macro Instruction Extensions of Compiler Languages", ''Communications of the ACM'' 3 no. 4 (April 1960), pp. 214–220.] Another early description of the principle of extensibility occurs in Brooker and Morris's 1960 paper on the Compiler-Compiler. [Brooker, R.A. and Morris, D., "A General Translation Program for Phrase Structure Languages", ''Journal of the ACM'' 9 no. 1 (January 1962), pp. 1–10. The paper was received in 1960.] The peak of the movement was marked by two academic symposia, in 1969 and 1971. [Christensen, C. and Shaw, C.J., eds., Proceedings of the Extensible Languages Symposium, ''SIGPLAN Notices'' 4 no. 8 (August 1969). Schuman, S.A., ed., Proceedings of the International Symposium on Extensible Languages, ''SIGPLAN Notices'' 6 no. 12 (December 1971).] By 1975, a survey article on the movement by Thomas A. Standish was essentially a post mortem. The Forth programming language was an exception, but it went essentially unnoticed.


Character of the historical movement

As typically envisioned, an extensible programming language consisted of a base language providing elementary computing facilities, and a meta-language capable of modifying the base language. A program then consisted of meta-language modifications and code in the modified base language. The most prominent language-extension technique used in the movement was macro definition. Grammar modification was also closely associated with the movement, resulting in the eventual development of adaptive grammar formalisms. The Lisp language community remained separate from the extensible language community, apparently because, as one researcher observed,

"any programming language in which programs and data are essentially interchangeable can be regarded as an extendible [sic] language. ... this can be seen very easily from the fact that Lisp has been used as an extendible language for years." [Harrison, M.C., in "Panel on the Concept of Extensibility", pp. 53–54 of the 1969 symposium.]

At the 1969 conference, Simula was presented as an extensible programming language.

Standish described three classes of language extension, which he called ''paraphrase'', ''orthophrase'', and ''metaphrase'' (paraphrase and metaphrase are otherwise terms from translation).
* Paraphrase defines a facility by showing how to exchange it for something previously defined (or to be defined). As examples, he mentions macro definitions, ordinary procedure definitions, grammatical extensions, data definitions, operator definitions, and control structure extensions.
* Orthophrase adds features to a language that could not be achieved using the base language, such as adding an I/O system to a base language that previously had no I/O primitives. Extensions must be understood as orthophrase ''relative'' to some given base language, since a feature not defined in terms of the base language must be defined in terms of some other language. Orthophrase corresponds to the modern notion of plug-ins.
* Metaphrase modifies the interpretation rules used for pre-existing expressions. It corresponds to the modern notion of reflection.
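
Standish's three classes can be illustrated with a toy evaluator. The following is only a sketch under stated assumptions: the tuple-based expression format and the names `PRIMITIVES`, `MACROS`, and `evaluate` are hypothetical and do not come from Standish's paper.

```python
import math

# Base language: a tiny evaluator whose only primitive is addition.
PRIMITIVES = {"add": lambda a, b: a + b}
MACROS = {}


def evaluate(expr, primitives=PRIMITIVES):
    """Evaluate a number or a nested tuple such as ("add", 1, 2)."""
    if not isinstance(expr, tuple):
        return expr
    head, *args = expr
    if head in MACROS:
        # Paraphrase: exchange the new form for previously defined forms.
        return evaluate(MACROS[head](*args), primitives)
    return primitives[head](*(evaluate(a, primitives) for a in args))


# Paraphrase: "double" is defined entirely in terms of the existing "add".
MACROS["double"] = lambda x: ("add", x, x)

# Orthophrase: "sqrt" cannot be built from "add"; it is supplied from
# outside the base language (here, by the host language's math module).
PRIMITIVES["sqrt"] = math.sqrt

# Metaphrase: change how a pre-existing expression is interpreted, here by
# making "add" saturate at 100 without touching any program text.
saturating = dict(PRIMITIVES, add=lambda a, b: min(a + b, 100))
```

With these definitions, `evaluate(("double", 21))` yields 42, while `evaluate(("double", 60), saturating)` yields 100 because the same program is run under modified interpretation rules.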


Death of the historical movement

Standish attributed the failure of the extensibility movement to the difficulty of programming successive extensions. An ordinary programmer might build a single shell of macros around a base language, but if a second shell of macros was to be built around that, the programmer would have to be intimately familiar with both the base language and the first shell; a third shell would require familiarity with the base and both the first and second shells; and so on. (Note that shielding the programmer from lower-level details is the intent of the abstraction movement that supplanted the extensibility movement.) Despite the earlier presentation of Simula as extensible, Standish's 1975 survey does not seem in practice to have included the newer abstraction-based technologies (though he used a very general definition of extensibility that technically could have included them). A 1978 history of programming abstraction from the invention of the computer to the (then) present day made no mention of macros, and gave no hint that the extensible languages movement had ever occurred. [Guarino, L.R., "The Evolution of Abstraction in Programming Languages", ''CMU-CS-78-120'', Department of Computer Science, Carnegie-Mellon University, Pennsylvania, 22 May 1978.] Macros were tentatively admitted into the abstraction movement by the late 1980s (perhaps due to the advent of hygienic macros), by being granted the pseudonym ''syntactic abstractions''. [Gabriel, Richard P., ed., "Draft Report on Requirements for a Common Prototyping System", ''SIGPLAN Notices'' 24 no. 3 (March 1989), pp. 93ff.]


Modern movement

In the modern sense, a system that supports extensible programming will provide ''all'' of the features described below.


Extensible syntax

This simply means that the source language(s) to be compiled must not be closed, fixed, or static. It must be possible to add new keywords, concepts, and structures to the source language(s). Languages that allow the addition of constructs with user-defined syntax include Racket, Camlp4, OpenC++, Seed7 [Zingaro, Daniel, "Modern Extensible Languages", SQRL Report 47, McMaster University (October 2007), page 16], Red, Rebol, and Felix. While it is acceptable for some fundamental and intrinsic language features to be immutable, the system must not rely solely on those language features. It must be possible to add new ones.
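
As a rough illustration of an open syntax, the sketch below implements a small infix expression language whose operators live in a table that programs may extend. It is not modelled on any of the languages listed above; `OPERATORS`, `define_operator`, and `evaluate` are hypothetical names, and only integer operands and left-associative operators are handled.

```python
import re

# Operator table: symbol -> (precedence, implementation). The table is the
# extension point; nothing in the parser below names a specific operator.
OPERATORS = {
    "+": (1, lambda a, b: a + b),
    "*": (2, lambda a, b: a * b),
}


def define_operator(symbol, precedence, func):
    """Add a new left-associative infix operator or keyword to the language."""
    OPERATORS[symbol] = (precedence, func)


def tokenize(text):
    # Integers, or runs of non-space, non-digit characters (operator symbols).
    return re.findall(r"\d+|[^\s\d]+", text)


def evaluate(text):
    """Parse and evaluate an expression by precedence climbing."""
    tokens = tokenize(text)
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = int(tokens[pos])
        pos += 1
        while pos < len(tokens):
            symbol = tokens[pos]
            if symbol not in OPERATORS:
                raise SyntaxError(f"unknown operator {symbol!r}")
            prec, func = OPERATORS[symbol]
            if prec < min_prec:
                break
            pos += 1
            right = parse(prec + 1)
            left = func(left, right)
        return left

    return parse(0)
```

Before any extension, `evaluate("3 max 7")` raises `SyntaxError`. After `define_operator("max", 1, max)`, the same text is a valid program, and the parser itself has not been edited.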


Extensible compiler

In extensible programming, a compiler is not a monolithic program that converts source code input into binary executable output. The compiler itself must be extensible to the point that it is really a collection of plugins that assist with the translation of source language input into ''anything''. For example, an extensible compiler will support the generation of object code, code documentation, re-formatted source code, or any other desired output. The architecture of the compiler must permit its users to "get inside" the compilation process and provide alternative processing tasks at every reasonable step in the compilation process.

For just the task of translating source code into something that can be executed on a computer, an extensible compiler should:
* use a plug-in or component architecture for nearly every aspect of its function
* determine which language or language variant is being compiled and locate the appropriate plug-in to recognize and validate that language
* use formal language specifications to syntactically and structurally validate arbitrary source languages
* assist with the semantic validation of arbitrary source languages by invoking an appropriate validation plug-in
* allow users to select from different kinds of code generators so that the resulting executable can be targeted for different processors, operating systems, virtual machines, or other execution environments
* provide facilities for error generation and for extensions to it
* allow new kinds of nodes in the abstract syntax tree (AST)
* allow new values in nodes of the AST
* allow new kinds of edges between nodes
* support the transformation of the input AST, or portions thereof, by some external "pass"
* support the translation of the input AST, or portions thereof, into another form by some external "pass"
* assist with the flow of information between internal and external passes as they both transform and translate the AST into new ASTs or other representations
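
A few of the requirements above (external transformation passes, external translation passes, and selectable output generators) can be sketched with Python's standard `ast` module standing in for the compiler's AST. This is an assumption-laden miniature, not a description of any real extensible compiler: `register_pass`, `register_backend`, and `compile_source` are hypothetical names, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

PASSES = []    # external transformation passes: AST -> AST
BACKENDS = {}  # external translation passes: name -> (AST -> any output)


def register_pass(func):
    PASSES.append(func)
    return func


def register_backend(name):
    def decorator(func):
        BACKENDS[name] = func
        return func
    return decorator


def compile_source(source, target):
    """Parse, run every registered pass, then hand the AST to one backend."""
    tree = ast.parse(source)
    for run_pass in PASSES:
        tree = ast.fix_missing_locations(run_pass(tree))
    return BACKENDS[target](tree)


def is_int(node):
    return isinstance(node, ast.Constant) and type(node.value) is int


@register_pass
def fold_constants(tree):
    """Transformation pass: replace integer additions with their value."""
    class Folder(ast.NodeTransformer):
        def visit_BinOp(self, node):
            self.generic_visit(node)  # fold the operands first
            if isinstance(node.op, ast.Add) and is_int(node.left) and is_int(node.right):
                folded = ast.Constant(value=node.left.value + node.right.value)
                return ast.copy_location(folded, node)
            return node
    return Folder().visit(tree)


@register_backend("source")
def emit_source(tree):
    """Re-formatted source code as output."""
    return ast.unparse(tree)


@register_backend("names")
def emit_names(tree):
    """Documentation-style output: the functions a module defines."""
    return [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]


@register_backend("bytecode")
def emit_bytecode(tree):
    """Executable output for the host virtual machine."""
    return compile(tree, "<extensible>", "exec")
```

For example, `compile_source("x = 1 + 2", "source")` returns `"x = 3"`, and the same input with target `"bytecode"` returns a code object that can be passed to `exec`. New passes and backends are added by registration alone, without changing `compile_source`.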


Extensible runtime

At runtime, extensible programming systems must permit languages to extend the set of operations that the system supports. For example, if the system uses a byte-code interpreter, it must allow new byte-code values to be defined. As with extensible syntax, it is acceptable for there to be some (smallish) set of fundamental or intrinsic operations that are immutable. However, it must be possible to overload or augment those intrinsic operations so that new or additional behavior can be supported.
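
One way to picture an extensible byte-code interpreter is a stack machine whose instruction set is an open table. The sketch below is illustrative only; the `VM` class, the opcode numbers, and the handler signature are all assumptions made for this example.

```python
class VM:
    """A stack machine whose instruction set is an open table."""

    def __init__(self):
        self.stack = []
        self.opcodes = {}

    def define_opcode(self, code, handler):
        """Bind a byte-code value to a handler, replacing any earlier binding."""
        self.opcodes[code] = handler

    def run(self, program):
        """Execute (opcode, operand) pairs and return the top of the stack."""
        for code, operand in program:
            self.opcodes[code](self, operand)
        return self.stack[-1]


def op_push(vm, operand):
    vm.stack.append(operand)


def op_add(vm, operand):
    right, left = vm.stack.pop(), vm.stack.pop()
    vm.stack.append(left + right)


PUSH, ADD = 0x01, 0x02


def new_vm():
    """Create a VM preloaded with the small intrinsic instruction set."""
    vm = VM()
    vm.define_opcode(PUSH, op_push)
    vm.define_opcode(ADD, op_add)
    return vm


# Extension: a language adds a DUP instruction the base VM never had.
DUP = 0x10


def op_dup(vm, operand):
    vm.stack.append(vm.stack[-1])


vm = new_vm()
vm.define_opcode(DUP, op_dup)
result = vm.run([(PUSH, 7), (DUP, None), (ADD, None)])  # computes 7 + 7
```

Because `define_opcode` also accepts an already bound value, the same call can overload an intrinsic such as `ADD`. A system that wants some intrinsics to stay immutable would reject rebinding for those values instead.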


Content separated from form

Extensible programming systems should regard programs as data to be processed. Those programs should be completely devoid of any kind of formatting information. The visual display and editing of programs to users should be a translation function, supported by the extensible compiler, that translates the program data into forms more suitable for viewing or editing. Naturally, this should be a two-way translation. This is important because it must be possible to easily process extensible programs in a ''variety'' of ways. It is unacceptable for the only uses of source language input to be editing, viewing and translation to machine code. The arbitrary processing of programs is facilitated by de-coupling the source input from specifications of how it should be processed (formatted, stored, displayed, edited, etc.).
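
The idea of programs as formatting-free data with views layered on top can be approximated with Python's `ast` module (Python 3.9 or later for `ast.unparse`). This is only a stand-in under stated assumptions: Python's AST discards comments, which a real system of this kind would have to keep as data, and the function names here are hypothetical.

```python
import ast


def to_program_data(text):
    """Editing direction: turn text into program data with no layout in it."""
    return ast.parse(text)


def canonical_view(tree):
    """Viewing direction: one rendering of the program data as text."""
    return ast.unparse(tree)


def outline_view(tree):
    """A different view of the same data: function signatures only."""
    return [
        f"def {node.name}({', '.join(arg.arg for arg in node.args.args)})"
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    ]


# Two differently formatted texts that denote the same program.
cramped = "def area(w,h):   return w*h"
spacious = "def area(w, h):\n    return w * h\n"

same_data = ast.dump(to_program_data(cramped)) == ast.dump(to_program_data(spacious))
```

Here `same_data` is `True`: the layout differences exist only in the text, not in the stored program, and each view is a separate translation of that data.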


Source language debugging support

Extensible programming systems must support the debugging of programs using the constructs of the original source language regardless of the extensions or transformation the program has undergone in order to make it executable. Most notably, it cannot be assumed that the only way to display runtime data is in ''structures'' or ''arrays''. The debugger, or more correctly 'program inspector', must permit the display of runtime data in forms suitable to the source language. For example, if the language supports a data structure for a business process or workflow, it must be possible for the debugger to display that data structure as a fishbone chart or other form provided by a plugin.
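
A program inspector with display plugins might look roughly like the sketch below. Everything in it is hypothetical: the `Workflow` type, the `renderer` registry, and the arrow-separated text rendering (a simple stand-in for a chart) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical source-language construct: named steps run in order."""
    name: str
    steps: list = field(default_factory=list)


RENDERERS = {}  # type -> display plugin


def renderer(cls):
    """Register a display plugin for values of the given type."""
    def decorator(func):
        RENDERERS[cls] = func
        return func
    return decorator


def inspect_value(value):
    """Display runtime data using the most specific registered plugin."""
    for cls in type(value).__mro__:
        if cls in RENDERERS:
            return RENDERERS[cls](value)
    return repr(value)  # fall back to the generic structural display


@renderer(Workflow)
def render_workflow(workflow):
    return f"{workflow.name}: " + " -> ".join(workflow.steps)
```

With the plugin registered, `inspect_value(Workflow("ship", ["pick", "pack", "post"]))` returns `"ship: pick -> pack -> post"`, while values without a plugin fall back to their ordinary `repr`.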


Examples

* Camlp4
* Felix
* Nemerle
* Seed7
* Rebol
** Red
* Ruby (metaprogramming)
* IMP
* OpenC++
* XL
* XML
* Forth
* Lisp
** Racket
** Scheme
* Lua
* PL/I
* Smalltalk


See also

* Adaptive grammar
* Concept programming
* Dialecting
* Grammar-oriented programming
* Language-oriented programming
* Homoiconicity


External links


General


Greg Wilson's Article in ACM Queue

Slashdot Discussion

Modern Extensible Languages - A paper from Daniel Zingaro


Tools


MetaL

XPS - eXtensible Programming System (in development)

MPS - JetBrains Metaprogramming system


Programming languages with extensible syntax


OpenZz

xtc - eXTensible C

English-script

Nemerle Macros

Boo Syntactic Macros

Stanford University Intermediate Format compiler

Seed7 - The extensible programming language

Katahdin - a programming language with syntax and semantics that are mutable at runtime

- another programming language with extensible syntax, implemented using an Earley parser