The Cyclone
programming language
was intended to be a safe dialect of the
C language
. It avoids
buffer overflows and other vulnerabilities that are possible in C programs by design, without losing the power and convenience of C as a tool for
system programming
. Cyclone is no longer supported by its original developers, and the reference tooling does not support
64-bit platforms. The
Rust
language is cited by the original developers as having integrated many of the same ideas as Cyclone.
Cyclone development was started as a joint project of Trevor Jim from
AT&T Labs
Research and
Greg Morrisett's group at
Cornell University
in 2001. Version 1.0 was released on May 8, 2006.
Language features
Cyclone attempts to avoid some of the common pitfalls of
C, while still maintaining its look and performance. To this end, Cyclone places the following limits on programs:
* NULL checks are inserted to prevent segmentation faults
* Pointer arithmetic is limited
* Pointers must be initialized before use (this is enforced by definite assignment analysis)
* Dangling pointers are prevented through region analysis and limits on free()
* Only "safe" casts and unions are allowed
*
goto
into scopes is disallowed
*
switch
labels in different scopes are disallowed
* Pointer-returning functions must execute
return
*
setjmp
and longjmp
are not supported
To maintain the tool set that C programmers are used to, Cyclone provides the following extensions:
* Never-
NULL
pointers do not require
NULL
checks
* "Fat" pointers support pointer arithmetic with run-time bounds checking
* Growable regions support a form of safe manual memory management
* Garbage collection for heap-allocated values
* Tagged unions support type-varying arguments
* Injections help automate the use of tagged unions for programmers
*
Polymorphism replaces some uses of
void *
* varargs are implemented as fat pointers
*
Exceptions replace some uses of
setjmp
and
longjmp
For a better high-level introduction to Cyclone, the reasoning behind the language, and the source of these lists, see
this paper
Cyclone looks, in general, much like C, but it should be viewed as a C-like language.
Pointer types
Cyclone implements three kinds of
pointer:
* * (the normal type)
* @ (the never-NULL pointer), and
* ? (the only type with pointer arithmetic allowed, "fat" pointers).
The purpose of introducing these new pointer types is to avoid common problems when using pointers. Take for instance a function, called
foo
that takes a pointer to an int:
int foo(int *);
Although the person who wrote the function
foo
could have inserted
NULL
checks, let us assume that for performance reasons they did not. Calling
foo(NULL);
will result in
undefined behavior
(typically, although not necessarily, a
SIGSEGV
signal
being sent to the application). To avoid such problems, Cyclone introduces the
@
pointer type, which can never be
NULL
. Thus, the "safe" version of
foo
would be:
int foo(int @);
This tells the Cyclone compiler that the argument to
foo
should never be
NULL
, avoiding the aforementioned undefined behavior. The simple change of
*
to
@
saves the programmer from having to write
NULL
checks and the operating system from having to trap
NULL
pointer dereferences. This extra limit, however, can be a rather large stumbling block for most C programmers, who are used to manipulating their pointers directly with arithmetic. Although such arithmetic is convenient, it can lead to
buffer overflows and other "off-by-one"-style mistakes. To avoid this, the
?
pointer type is delimited by a known bound, the size of the array. Although this adds overhead due to the extra information stored about the pointer, it improves safety and security. Take for instance a simple (and naïve)
strlen
function, written in C:
int strlen(const char *s)
This function assumes that the string being passed in is terminated by
'\0'
. However, what would happen if a character array lacking that terminator were passed to this function? This is perfectly legal in C, yet would cause
strlen
to iterate through memory not necessarily associated with the string
s
. There are functions, such as
strnlen
which can be used to avoid such problems, but these functions are not standard with every implementation of
ANSI C
. The Cyclone version of
strlen
is not so different from the C version:
int strlen(const char ? s)
Here,
strlen
is bounded by the length of the array passed to it, and so never reads past the end of that array. Each kind of pointer can be safely cast to each of the others, and arrays and strings are automatically cast to
?
by the compiler. (Casting from
?
to
*
invokes a
bounds check, and casting from
?
to
@
invokes both a
NULL
check and a bounds check. Casting from
*
to
?
results in no checks whatsoever; the resulting
?
pointer has a size of 1.)
Dangling pointers and region analysis
Consider the following code, in C:
char *itoa(int i)
The function
itoa
allocates an array of chars
buf
on the stack and returns a pointer to the start of
buf
. However, the memory used on the stack for
buf
is deallocated when the function returns, so the returned value cannot be used safely outside of the function. While
GNU Compiler Collection
and other compilers will warn about such code, the following will typically compile without warnings:
char *itoa(int i)
GCC can produce warnings for such code as a side effect of certain compiler options, but there is no guarantee that all such errors will be detected.
Cyclone performs region analysis on each segment of code, preventing dangling pointers, such as the one returned from this version of
itoa
. All of the local variables in a given scope are considered to be part of the same region, separate from the heap or any other local region. Thus, when analyzing
itoa
, the Cyclone compiler would see that
z
is a pointer into the local stack, and would report an error.
See also
*
C
*
ML
*
Rust
External links
* Cyclone homepage
* Cyclone - source code repositories
* Cyclone - FAQ
* Cyclone for C programmers
* Cyclone user manual
* Cyclone: a Type-safe Dialect of C by Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett - published January 2005
Presentations:
* Cyclone: A Type-Safe Dialect of C
* Cyclone: A Memory-Safe C-Level Programming Language