C-- (pronounced ''C minus minus'') is a C-like programming language. Its creators, functional programming researchers Simon Peyton Jones and Norman Ramsey, designed it to be generated mainly by compilers for very high-level languages rather than written by human programmers. Unlike many other intermediate languages, its representation is plain ASCII text, not bytecode or another binary format.
There are two main branches:
* C--, the original branch, with the final version 2.0 released in May 2005
* Cmm, the fork actively used as the intermediate representation (IR) in the Glasgow Haskell Compiler (GHC)
Design
C-- is a "portable assembly language", designed to ease the implementation of compilers that produce high-quality machine code. This is done by delegating low-level code generation and program optimization to a C-- compiler. The language's syntax borrows heavily from C while omitting or changing standard C features such as variadic functions, pointer syntax, and aspects of C's type system, because they hamper essential features of C-- and ease of code generation.
The name of the language is an in-joke, indicating that C-- is a reduced form of C, in the same way that C++ is basically an expanded form of C. (In C, -- and ++ mean "decrement" and "increment", respectively.)
Work on C-- began in the late 1990s. Since writing a custom code generator is a challenge in itself, and the compiler backends available to researchers at that time were complex and poorly documented, several projects had written compilers which generated C code (for instance, the original Modula-3 compiler). However, C is a poor choice for functional languages: it neither guarantees tail-call optimization nor supports accurate garbage collection or efficient exception handling. C-- is a tightly defined, simpler alternative to C which supports all of these. Its most innovative feature is a run-time interface which allows writing of portable garbage collectors, exception handling systems and other run-time features which work with any C-- compiler.
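The tail-call limitation mentioned above can be illustrated with a small C sketch. Because standard C does not promise that a call in tail position reuses the current stack frame, a compiler that targets C may fall back on a workaround such as a trampoline, in which each function returns a description of the next call and a driver loop performs it. The technique and all names here (`step`, `step_fn`, `is_even`, `is_odd`, `run_is_even`) are illustrative assumptions, not something taken from the C-- papers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a sketch of a trampoline. */
typedef struct step step;
typedef step (*step_fn)(unsigned long);

/* One pending "tail call": the function to run next and its argument.
 * next == NULL means the computation has finished and result is valid. */
struct step {
    step_fn next;
    unsigned long arg;
    bool result;
};

static step is_even(unsigned long n);
static step is_odd(unsigned long n);

/* Instead of calling is_odd(n - 1) directly, describe that call. */
static step is_even(unsigned long n) {
    if (n == 0) return (step){ NULL, 0, true };
    return (step){ is_odd, n - 1, false };
}

static step is_odd(unsigned long n) {
    if (n == 0) return (step){ NULL, 0, false };
    return (step){ is_even, n - 1, false };
}

/* Driver loop: runs one step at a time, so stack depth stays constant
 * no matter how many mutually recursive "calls" are made. */
bool run_is_even(unsigned long n) {
    step s = is_even(n);
    while (s.next != NULL) {
        s = s.next(s.arg);
    }
    return s.result;
}
```

Calling `run_is_even(1000000)` performs a million logical calls without deepening the C stack, at the cost of an indirect call and a struct copy per step. A language that guarantees tail calls makes this kind of workaround unnecessary.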
The first version of C-- was released in April 1998 as an MSRA paper, accompanied by a January 1999 paper on garbage collection. A revised manual was posted in HTML form in May 1999. Two sets of major changes proposed in 2000 by Norman Ramsey ("Proposed Changes") and Christian Lindig ("A New Grammar") led to C-- version 2, which was finalized around 2004 and officially released in 2005.
Type system
The C-- type system is designed to reflect constraints imposed by hardware rather than conventions imposed by higher-level languages. A value stored in a register or memory may have only one type: bit-vector. However, bit-vector is a polymorphic type which comes in several widths, e.g. bits8, bits32, or bits64. A separate family of 32-bit or 64-bit floating-point types is supported. In addition to the bit-vector type, C-- provides a Boolean type bool, which can be computed by expressions and used for control flow but cannot be stored in a register or memory. As in an assembly language, any higher type discipline, such as distinctions between signed, unsigned, float, and pointer, is imposed by the C-- operators or other syntactic constructs. C-- is not type-checked, nor does it enforce or check the calling convention.
C-- version 2 removes the distinction between bit-vector and floating-point types. These types can be annotated with a string "kind" tag to distinguish, among other things, a variable's integer vs float typing and its storage behavior (global or local). The former is useful on targets that have separate registers for integer and floating-point values. Special types for pointers and the native word were introduced, although they are mapped to a bit-vector with a target-dependent length.
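The idea that the operator, not the value, carries the interpretation can be mimicked in C. The following is a minimal sketch rather than C-- code: `bits32` is modelled as a plain 32-bit unsigned integer, and the helper names (`lt_unsigned`, `lt_signed`, `add_float`) are hypothetical. It assumes that `float` is a 32-bit IEEE 754 type on the target. Unlike C--, C allows the Boolean results to be stored.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 32-bit bit-vector: 32 bits with no inherent signedness. */
typedef uint32_t bits32;

/* Unsigned comparison: the operation reads both operands as naturals. */
bool lt_unsigned(bits32 a, bits32 b) {
    return a < b;
}

/* Signed comparison: the same bits read as two's-complement integers. */
bool lt_signed(bits32 a, bits32 b) {
    int32_t sa, sb;
    memcpy(&sa, &a, sizeof sa);
    memcpy(&sb, &b, sizeof sb);
    return sa < sb;
}

/* Floating-point addition: the same bits read as single-precision floats.
 * Assumes float is 32-bit IEEE 754. */
bits32 add_float(bits32 a, bits32 b) {
    float fa, fb, sum;
    bits32 out;
    memcpy(&fa, &a, sizeof fa);
    memcpy(&fb, &b, sizeof fb);
    sum = fa + fb;
    memcpy(&out, &sum, sizeof out);
    return out;
}
```

The bit pattern `0xFFFFFFFF` is larger than `1` under `lt_unsigned` but smaller under `lt_signed`, where it reads as -1. Nothing in the value itself records which reading is intended, which is the property the type system described above shares with assembly language.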
Implementations
The specification page of C-- lists a few implementations of C--. The "most actively developed" compiler, Quick C--, was abandoned in 2013.
Haskell
Some developers of C--, including Simon Peyton Jones, João Dias, and Norman Ramsey, work or have worked on GHC, whose development has led to extensions in the C-- language, forming the ''Cmm'' dialect which uses the C preprocessor for ergonomics.
GHC backends are responsible for further transforming C-- into executable code, via LLVM IR, slow C, or directly through the built-in native backend. Despite the original intention, GHC does perform many of its generic optimizations on C--. As with other compiler IRs, the C-- representation can be dumped for debugging. Target-specific optimizations are performed later by the backend.
See also
* BCPL
* C++
* LLVM
References
* Nordin, Thomas; Jones, Simon Peyton; Iglesias, Pablo Nogueira; Oliva, Dino (1998-04-23). "The C-- Language Reference Manual". https://www.microsoft.com/en-us/research/publication/the-c-language-reference-manual/
* Reig, Fermin; Ramsey, Norman; Jones, Simon Peyton (1999-01-01). "C--: a portable assembly language that supports garbage collection". pp. 1–28. https://www.microsoft.com/en-us/research/publication/portable-assembly-language-supports-garbage-collection/
* Ramsey, Norman; Jones, Simon Peyton. "The C-- Language Specification, Version 2.0". https://www.cs.tufts.edu/~nr/c--/extern/man2.pdf. Accessed 11 December 2019.
* "GHC Commentary: What the hell is a .cmm file?"
* "An improved LLVM backend". https://ghc.haskell.org/trac/ghc/wiki/ImprovedLLVMBackend
* "Debugging compilers with optimization fuel".
* "C-- Downloads". www.cs.tufts.edu. https://www.cs.tufts.edu/~nr/c--/code.html. Accessed 11 December 2019.
* Nordin, Thomas; Jones, Simon Peyton; Iglesias, Pablo Nogueira; Oliva, Dino (1999-05-23). "The C-- Language Reference Manual". https://www.cs.tufts.edu/~nr/c--/extern/manual.html
External links
* Archive of old official website (cminusminus.org)
* Quick C-- code archive (the reference implementation)