
Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers. Standard ML is a modern dialect of ML, the language used in the Logic for Computable Functions (LCF) theorem-proving project. It is distinctive among widely used languages in that it has a formal specification, given as typing rules and operational semantics in ''The Definition of Standard ML''.


Language

Standard ML is a functional programming language with some impure features. Programs written in Standard ML consist of expressions as opposed to statements or commands, although some expressions of type unit are only evaluated for their side-effects.


Functions

As in all functional languages, a key feature of Standard ML is the function, which is used for abstraction. The factorial function can be expressed as follows:

    fun factorial n =
        if n = 0 then 1 else n * factorial (n - 1)


Type inference

An SML compiler must infer the static type of factorial, int -> int, without user-supplied type annotations. It has to deduce that n is only used with integer expressions, and must therefore itself be an integer, and that all terminal expressions are integer expressions.
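As a small illustration of what inference amounts to, the sketch below writes the inferred type out by hand and adds a second function whose type stays polymorphic because nothing constrains it. The names factorialAnnotated and twice are illustrative and do not come from the text above.

```sml
(* The annotations state explicitly what the compiler would otherwise
   infer for the unannotated definition. *)
fun factorialAnnotated (n : int) : int =
  if n = 0 then 1 else n * factorialAnnotated (n - 1)

(* Nothing constrains the argument type here, so the inferred type is
   polymorphic: ('a -> 'a) -> 'a -> 'a *)
fun twice f x = f (f x)

val six  = twice (fn k => k + 2) 2        (* 'a instantiated to int *)
val word = twice (fn s => s ^ "!") "hi"   (* 'a instantiated to string *)
```

Removing the annotations from factorialAnnotated should not change its type; they only document it.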


Declarative definitions

The same function can be expressed with clausal function definitions, where the ''if''-''then''-''else'' conditional is replaced with templates of the factorial function evaluated for specific values:

    fun factorial 0 = 1
      | factorial n = n * factorial (n - 1)


Imperative definitions

The function can also be defined iteratively, using mutable references and a while loop:

    fun factorial n =
        let
            val i = ref n and acc = ref 1
        in
            while !i > 0 do (acc := !acc * !i; i := !i - 1);
            !acc
        end


Lambda functions

It can also be written as a lambda function:

    val rec factorial = fn 0 => 1 | n => n * factorial (n - 1)

Here, the keyword val introduces a binding of an identifier to a value, fn introduces an anonymous function, and rec allows the definition to be self-referential.


Local definitions

The encapsulation of an invariant-preserving tail-recursive tight loop with one or more accumulator parameters within an invariant-free outer function is a common idiom in Standard ML. Using a local function, factorial can be rewritten in this more efficient tail-recursive style:

    local
        fun loop (0, acc) = acc
          | loop (m, acc) = loop (m - 1, m * acc)
    in
        fun factorial n = loop (n, 1)
    end
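The same accumulator idiom can also be written with a let expression, which scopes the helper inside a single function body instead of across declarations. The sketch below applies it to summing a list; the names sum and loop are illustrative.

```sml
(* Tail-recursive sum: the inner loop carries the running total in acc,
   while the outer function exposes a simple one-argument interface. *)
fun sum xs =
  let
    fun loop ([], acc) = acc
      | loop (x :: rest, acc) = loop (rest, x + acc)
  in
    loop (xs, 0)
  end

val ten = sum [1, 2, 3, 4]
```

The local form is the better fit when several top-level functions share one helper; let is enough when only one function needs it.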


Type synonyms

A type synonym is defined with the keyword type. Here is a type synonym for points on a plane, and functions computing the distance between two points and the area of a triangle with the given corners as per Heron's formula. (These definitions will be used in subsequent examples.)

    type loc = real * real

    fun square (x : real) = x * x

    fun dist (x, y) (x', y') =
        Math.sqrt (square (x' - x) + square (y' - y))

    fun heron (a, b, c) =
        let
            val x = dist a b
            val y = dist b c
            val z = dist a c
            val s = (x + y + z) / 2.0
        in
            Math.sqrt (s * (s - x) * (s - y) * (s - z))
        end
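Because a synonym is only another name for an existing type, values of the synonym and of the underlying type are interchangeable. A minimal sketch of this, with the illustrative names origin and shift:

```sml
type loc = real * real

val origin : loc = (0.0, 0.0)

(* Accepts and returns a loc ... *)
fun shift (dx, dy) ((x, y) : loc) : loc = (x + dx, y + dy)

(* ... yet the result can be bound at type real * real without any
   conversion, since loc and real * real are the same type. *)
val p : real * real = shift (1.0, 2.0) origin
```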


Algebraic datatypes

Standard ML provides strong support for algebraic datatypes (ADTs). A datatype can be thought of as a disjoint union of tuples (or a "sum of products"). Datatypes are easy to define and easy to use, largely because of pattern matching and the pattern-exhaustiveness and pattern-redundancy checking that most Standard ML implementations perform.

In object-oriented programming languages, a disjoint union can be expressed as a class hierarchy. However, as opposed to class hierarchies, ADTs are closed. Thus the extensibility of ADTs is orthogonal to the extensibility of class hierarchies: class hierarchies can be extended with new subclasses which implement the same interface, while the functionality of ADTs can be extended for the fixed set of constructors. See the expression problem.

A datatype is defined with the keyword datatype, as in:

    datatype shape
        = Circle   of loc * real      (* center and radius *)
        | Square   of loc * real      (* upper-left corner and side length; axis-aligned *)
        | Triangle of loc * loc * loc (* corners *)

Note that a type synonym cannot be recursive; datatypes are necessary to define recursive constructors. (This is not at issue in this example.)


Pattern matching

Patterns are matched in the order in which they are defined. C programmers can use tagged unions, dispatching on tag values, to accomplish what ML accomplishes with datatypes and pattern matching. Nevertheless, while a C program decorated with appropriate checks will, in a sense, be as robust as the corresponding ML program, those checks will of necessity be dynamic; ML's static checks provide strong guarantees about the correctness of the program at compile time.

Function arguments can be defined as patterns as follows:

    fun area (Circle (_, r)) = Math.pi * square r
      | area (Square (_, s)) = square s
      | area (Triangle p) = heron p (* see above *)

The so-called "clausal form" of function definition, where arguments are defined as patterns, is merely syntactic sugar for a case expression:

    fun area shape = case shape of
        Circle (_, r) => Math.pi * square r
      | Square (_, s) => square s
      | Triangle p => heron p


Exhaustiveness checking

Pattern-exhaustiveness checking makes sure that each constructor of the datatype is matched by at least one pattern. The following function definition is not exhaustive:

    fun center (Circle (c, _)) = c
      | center (Square ((x, y), s)) = (x + s / 2.0, y + s / 2.0)

There is no pattern for the Triangle case in the center function. The compiler will issue a warning that the case expression is not exhaustive, and if a Triangle is passed to this function at runtime, the exception Match will be raised.
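One way to make such a function exhaustive is to add a clause for the missing constructor. The sketch below repeats the loc and shape definitions so that it stands alone, and assumes that the "center" of a triangle means its centroid; that choice is an illustration, not something the text specifies.

```sml
type loc = real * real

datatype shape
  = Circle of loc * real
  | Square of loc * real
  | Triangle of loc * loc * loc

(* Every constructor now has a clause, so the compiler has nothing to
   warn about and Match cannot be raised by this function. *)
fun center (Circle (c, _)) = c
  | center (Square ((x, y), s)) = (x + s / 2.0, y + s / 2.0)
  | center (Triangle ((x1, y1), (x2, y2), (x3, y3))) =
      ((x1 + x2 + x3) / 3.0, (y1 + y2 + y3) / 3.0)   (* assumed: centroid *)
```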


Redundancy checking

The pattern in the second clause of the following (meaningless) function is redundant:

    fun f (Circle ((x, y), r)) = x + y
      | f (Circle _) = 1.0
      | f _ = 0.0

Any value that would match the pattern in the second clause would also match the pattern in the first clause, so the second clause is unreachable. Therefore, this definition as a whole exhibits redundancy, and causes a compile-time warning.

The following function definition is exhaustive and not redundant:

    val hasCorners = fn (Circle _) => false | _ => true

If control gets past the first pattern (Circle), we know the shape must be either a Square or a Triangle. In either of those cases, we know the shape has corners, so we can return true without discerning the actual shape.


Higher-order functions

Functions can consume functions as arguments:

    fun map f (x, y) = (f x, f y)

Functions can produce functions as return values:

    fun constant k = (fn _ => k)

Functions can also both consume and produce functions:

    fun compose (f, g) = (fn x => f (g x))

The function List.map from the basis library is one of the most commonly used higher-order functions in Standard ML:

    fun map _ [] = []
      | map f (x :: xs) = f x :: map f xs

A more efficient implementation with tail-recursive List.foldl:

    fun map f = List.rev o List.foldl (fn (x, acc) => f x :: acc) []
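A short usage sketch may help show how these pieces combine. It redefines compose and constant so that it stands alone and uses List.map, List.foldl and String.size from the Basis Library; the bound names (addOneThenDouble, lengths, zeros, total) are illustrative.

```sml
fun compose (f, g) = (fn x => f (g x))
fun constant k = (fn _ => k)

(* compose applies g first, then f. *)
val addOneThenDouble = compose (fn x => x * 2, fn x => x + 1)
val ten = addOneThenDouble 4                       (* (4 + 1) * 2 *)

(* Passing existing functions to library higher-order functions. *)
val lengths = List.map String.size ["a", "bc", "def"]
val zeros   = List.map (constant 0) [true, false]
val total   = List.foldl (op +) 0 [1, 2, 3, 4]
```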


Exceptions

Exceptions are raised with the keyword raise and handled with the pattern-matching handle construct. The exception system can implement non-local exit; this optimization technique is suitable for functions like the following.

    local
        exception Zero;
        val p = fn (0, _) => raise Zero | (a, b) => a * b
    in
        fun prod xs = List.foldl p 1 xs handle Zero => 0
    end

When Zero is raised, control leaves the function List.foldl altogether. Consider the alternative: the value 0 would be returned, it would be multiplied by the next integer in the list, the resulting value (inevitably 0) would be returned, and so on. The raising of the exception allows control to skip over the entire chain of frames and avoid the associated computation. Note the use of the underscore (_) as a wildcard pattern.

The same optimization can be obtained with a tail call.

    local
        fun p a (0 :: _) = 0
          | p a (x :: xs) = p (a * x) xs
          | p a [] = a
    in
        val prod = p 1
    end
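Beyond non-local exit, a common use of handle is to turn an exceptional case into an ordinary value. The sketch below is a hypothetical example of that pattern; the exception NegativeInput and the functions isqrt and safeIsqrt are invented for illustration.

```sml
(* An exception carrying the offending argument. *)
exception NegativeInput of int

(* Integer square root by linear search; raises on negative input. *)
fun isqrt n =
  if n < 0 then raise NegativeInput n
  else
    let
      fun go k = if (k + 1) * (k + 1) > n then k else go (k + 1)
    in
      go 0
    end

(* handle pattern-matches on the exception and yields an option instead. *)
fun safeIsqrt n = SOME (isqrt n) handle NegativeInput _ => NONE
```

Callers of safeIsqrt can then use ordinary pattern matching on SOME and NONE rather than installing their own handlers.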


Module system

Standard ML's advanced module system allows programs to be decomposed into hierarchically organized ''structures'' of logically related type and value definitions. Modules provide not only namespace control but also abstraction, in the sense that they allow the definition of abstract data types. Three main syntactic constructs comprise the module system: signatures, structures and functors.


Signatures

A ''signature'' is an interface, usually thought of as a type for a structure; it specifies the names of all entities provided by the structure as well as the arity of each type component, the type of each value component, and the signature of each substructure. The definitions of type components are optional; type components whose definitions are hidden are ''abstract types''.

For example, the signature for a queue may be:

    signature QUEUE = sig
        type 'a queue
        exception QueueError;
        val empty     : 'a queue
        val isEmpty   : 'a queue -> bool
        val singleton : 'a -> 'a queue
        val fromList  : 'a list -> 'a queue
        val insert    : 'a * 'a queue -> 'a queue
        val peek      : 'a queue -> 'a
        val remove    : 'a queue -> 'a * 'a queue
    end

This signature describes a module that provides a polymorphic type 'a queue, an exception QueueError, and values that define basic operations on queues.


Structures

A ''structure'' is a module; it consists of a collection of types, exceptions, values and structures (called ''substructures'') packaged together into a logical unit.

A queue structure can be implemented as follows:

    structure TwoListQueue :> QUEUE = struct
        type 'a queue = 'a list * 'a list

        exception QueueError;

        val empty = ([], [])

        fun isEmpty ([], []) = true
          | isEmpty _ = false

        fun singleton a = ([], [a])

        fun fromList a = ([], a)

        fun insert (a, ([], [])) = singleton a
          | insert (a, (ins, outs)) = (a :: ins, outs)

        fun peek (_, []) = raise QueueError
          | peek (ins, outs) = List.hd outs

        fun remove (_, []) = raise QueueError
          | remove (ins, [a]) = (a, ([], List.rev ins))
          | remove (ins, a :: outs) = (a, (ins, outs))
    end

This definition declares that structure TwoListQueue implements signature QUEUE. Furthermore, the ''opaque ascription'' denoted by :> states that any types which are not defined in the signature (i.e. type 'a queue) should be abstract, meaning that the definition of a queue as a pair of lists is not visible outside the module. The structure implements all of the definitions in the signature.

The types and values in a structure can be accessed with "dot notation":

    val q : string TwoListQueue.queue = TwoListQueue.empty
    val q' = TwoListQueue.insert (Real.toString Math.pi, q)
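The effect of opaque ascription is easiest to see next to transparent ascription (written with a plain colon). The following sketch uses a deliberately tiny, hypothetical COUNTER signature rather than the queue; all names in it are illustrative.

```sml
signature COUNTER =
sig
  type t
  val zero  : t
  val next  : t -> t
  val toInt : t -> int
end

(* Transparent ascription (:) leaves the representation t = int visible. *)
structure OpenCounter : COUNTER =
struct
  type t = int
  val zero = 0
  fun next n = n + 1
  fun toInt n = n
end

(* Opaque ascription (:>) makes t abstract outside the structure. *)
structure SealedCounter :> COUNTER =
struct
  type t = int
  val zero = 0
  fun next n = n + 1
  fun toInt n = n
end

(* Accepted: clients can see that OpenCounter.t is int. *)
val a = OpenCounter.next OpenCounter.zero + 1

(* With the sealed version, only the signature's operations are usable.
   An expression such as SealedCounter.zero + 1 would be rejected by the
   type checker, because SealedCounter.t is not known to be int. *)
val b = SealedCounter.toInt (SealedCounter.next SealedCounter.zero)
```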


Functors

A ''functor'' is a function from structures to structures; that is, a functor accepts one or more arguments, which are usually structures of a given signature, and produces a structure as its result. Functors are used to implement generic data structures and algorithms.

One popular algorithm for breadth-first search of trees makes use of queues. Here we present a version of that algorithm parameterized over an abstract queue structure:

    (* after Okasaki, ICFP, 2000 *)
    functor BFS (Q: QUEUE) = struct
        datatype 'a tree = E | T of 'a * 'a tree * 'a tree

        local
            fun bfsQ q = if Q.isEmpty q then [] else search (Q.remove q)
            and search (E, q) = bfsQ q
              | search (T (x, l, r), q) = x :: bfsQ (insert (insert q l) r)
            and insert q a = Q.insert (a, q)
        in
            fun bfs t = bfsQ (Q.singleton t)
        end
    end

    structure QueueBFS = BFS (TwoListQueue)

Within functor BFS, the representation of the queue is not visible. More concretely, there is no way to select the first list in the two-list queue, if that is indeed the representation being used. This data abstraction mechanism makes the breadth-first search truly agnostic to the queue's implementation. This is in general desirable; in this case, the queue structure can safely maintain any logical invariants on which its correctness depends behind the bulletproof wall of abstraction.
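A smaller, self-contained functor can make the mechanics easier to follow. The sketch below is hypothetical and unrelated to the queue code: ORDERED, MaxOf, IntOrder and StringOrder are invented names, and the functor simply builds a "largest element" function from any structure that supplies an ordering.

```sml
signature ORDERED =
sig
  type t
  val leq : t * t -> bool
end

(* Given any ordered type, produce a function returning the largest
   element of a list, or NONE for the empty list. *)
functor MaxOf (O : ORDERED) =
struct
  fun max [] = NONE
    | max (x :: xs) =
        SOME (List.foldl (fn (y, best) => if O.leq (best, y) then y else best) x xs)
end

structure IntOrder =
struct
  type t = int
  fun leq (a : int, b) = a <= b
end

structure StringOrder =
struct
  type t = string
  fun leq (a : string, b) = a <= b
end

(* One functor, two instantiations. *)
structure IntMax = MaxOf (IntOrder)
structure StringMax = MaxOf (StringOrder)

val m1 = IntMax.max [3, 9, 4]
val m2 = StringMax.max ["pear", "apple"]
```

The body of MaxOf can only use what ORDERED exposes, which is the same discipline the breadth-first search relies on.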


Code examples

Snippets of SML code are most easily studied by entering them into an interactive top-level.


Hello world

The following is a Hello, world! program:
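A minimal version, assuming only the Basis Library's top-level print function (the file name hello.sml and the name greeting are illustrative), could read:

```sml
(* hello.sml *)
val greeting = "Hello, world!\n"

(* print writes the string to standard output. *)
val () = print greeting
```

The file can be loaded into an interactive top-level or compiled; the exact command depends on the implementation in use.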


Algorithms


Insertion sort

Insertion sort for int list (ascending) can be expressed concisely as follows:

    fun insert (x, []) = [x]
      | insert (x, h :: t) = sort x (h, t)
    and sort x (h, t) = if x < h then [x, h] @ t else h :: insert (x, t)

    val insertionsort = List.foldl insert []


Mergesort

Here, the classic mergesort algorithm is implemented in three functions: split, merge and mergesort. Also note the absence of types, with the exception of the syntax :: and [], which signify lists. This code will sort lists of any type, so long as a consistent ordering function cmp is defined. Using Hindley–Milner type inference, the types of all variables can be inferred, even complicated types such as that of the function cmp.


Split

The function split is implemented with a stateful closure which alternates between true and false, ignoring the input:

    fun alternator {} = let val state = ref true
        in fn a => !state before state := not (!state) end

    (* Split a list into near-halves which will either be the same length,
     * or the first will have one more element than the other.
     * Runs in O(n) time, where n = |xs|.
     *)
    fun split xs = List.partition (alternator {}) xs


Merge

Merge uses a local function loop for efficiency. The inner loop is defined in terms of cases: when both lists are non-empty (x :: xs) and when one list is empty ([]).

This function merges two sorted lists into one sorted list. Note how the accumulator acc is built backwards, then reversed before being returned. This is a common technique, since 'a list is represented as a linked list; this technique requires more clock time, but the asymptotics are not worse.

    (* Merge two ordered lists using the order cmp.
     * Pre: each list must already be ordered per cmp.
     * Runs in O(n) time, where n = |xs| + |ys|.
     *)
    fun merge cmp (xs, []) = xs
      | merge cmp (xs, y :: ys) = let
            fun loop (a, acc) (xs, []) = List.revAppend (a :: acc, xs)
              | loop (a, acc) (xs, y :: ys) =
                    if cmp (a, y)
                    then loop (y, a :: acc) (ys, xs)
                    else loop (a, y :: acc) (xs, ys)
        in
            loop (y, []) (ys, xs)
        end


Mergesort

The main function:

    fun ap f (x, y) = (f x, f y)

    (* Sort a list according to the given ordering operation cmp.
     * Runs in O(n log n) time, where n = |xs|.
     *)
    fun mergesort cmp [] = []
      | mergesort cmp [x] = [x]
      | mergesort cmp xs = (merge cmp o ap (mergesort cmp) o split) xs


Quicksort

Quicksort can be expressed as follows. The function part is a closure that consumes an order operator op <<.

    infix <<

    fun quicksort (op <<) = let
            fun part p = List.partition (fn x => x << p)
            fun sort [] = []
              | sort (p :: xs) = join p (part p xs)
            and join p (l, r) = sort l @ p :: sort r
        in
            sort
        end
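Because the ordering is passed in as an argument, the same definition can sort in different orders or by a derived key. The usage sketch below repeats the definition so that it stands alone; the bound names ascending, descending and byLength are illustrative.

```sml
infix <<

fun quicksort (op <<) =
  let
    fun part p = List.partition (fn x => x << p)
    fun sort [] = []
      | sort (p :: xs) = join p (part p xs)
    and join p (l, r) = sort l @ p :: sort r
  in
    sort
  end

(* The built-in comparisons can be supplied directly with op. *)
val ascending  = quicksort (op <) [3, 1, 4, 1, 5, 9, 2, 6]
val descending = quicksort (op >) [3, 1, 4, 1, 5, 9, 2, 6]

(* Any strict ordering on the element type will do, here string length. *)
val byLength =
  quicksort (fn (a, b) => String.size a < String.size b) ["ccc", "a", "bb"]
```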


Expression interpreter

Note the relative ease with which a small expression language can be defined and processed:

    exception TyErr;

    datatype ty = IntTy | BoolTy

    fun unify (IntTy, IntTy) = IntTy
      | unify (BoolTy, BoolTy) = BoolTy
      | unify (_, _) = raise TyErr

    datatype exp
        = True
        | False
        | Int of int
        | Not of exp
        | Add of exp * exp
        | If  of exp * exp * exp

    fun infer True = BoolTy
      | infer False = BoolTy
      | infer (Int _) = IntTy
      | infer (Not e) = (assert e BoolTy; BoolTy)
      | infer (Add (a, b)) = (assert a IntTy; assert b IntTy; IntTy)
      | infer (If (e, t, f)) = (assert e BoolTy; unify (infer t, infer f))
    and assert e t = unify (infer e, t)

    fun eval True = True
      | eval False = False
      | eval (Int n) = Int n
      | eval (Not e) = if eval e = True then False else True
      | eval (Add (a, b)) = (case (eval a, eval b) of (Int x, Int y) => Int (x + y))
      | eval (If (e, t, f)) = eval (if eval e = True then t else f)

    fun run e = (infer e; SOME (eval e)) handle TyErr => NONE

Example usage on well-typed and ill-typed expressions:

    val SOME (Int 3) = run (Add (Int 1, Int 2)) (* well-typed *)
    val NONE = run (If (Not (Int 1), True, False)) (* ill-typed *)


Arbitrary-precision integers

The module IntInf provides arbitrary-precision integer arithmetic. Moreover, integer literals may be used as arbitrary-precision integers without the programmer having to do anything. The following program implements an arbitrary-precision factorial function:
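A minimal sketch of such a program, assuming an implementation that provides the optional IntInf structure of the Basis Library (the name fact and the argument 30 are illustrative):

```sml
(* Assumes IntInf is available; it is an optional part of the Basis Library.
   The literals 0 and 1 are resolved to IntInf.int by the annotations. *)
fun fact (n : IntInf.int) : IntInf.int =
  if n = 0 then 1 else n * fact (n - 1)

(* 30! does not fit in a 64-bit integer, so this exercises the
   arbitrary-precision arithmetic. *)
val () = print (IntInf.toString (fact 30) ^ "\n")
```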


Partial application

Curried functions have many applications, such as eliminating redundant code. For example, a module may require functions of type a -> b, but it is more convenient to write functions of type a * c -> b where there is a fixed relationship between the objects of type a and c. A function of type c -> (a * c -> b) -> a -> b can factor out this commonality. This is an example of the adapter pattern.

In this example, the function d computes the numerical derivative of a given function f at point x:

    - fun d delta f x = (f (x + delta) - f (x - delta)) / (2.0 * delta)
    val d = fn : real -> (real -> real) -> real -> real

The type of d indicates that it maps a "float" onto a function with the type (real -> real) -> real -> real. Because d is curried, its arguments can be applied one at a time (partial application). In this case, function d can be specialised by partially applying it with the argument delta. A good choice for delta when using this algorithm is the cube root of the machine epsilon.

    - val d' = d 1E~8;
    val d' = fn : (real -> real) -> real -> real

Note that the inferred type indicates that d' expects a function with the type real -> real as its first argument. We can compute an approximation to the derivative of f(x) = x^3-x-1 at x=3. The correct answer is f'(3) = 27-1 = 26.

    - d' (fn x => x * x * x - x - 1.0) 3.0;
    val it = 25.9999996644 : real
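The adapter idea mentioned at the start of this section can be sketched in a few lines. The names adapt, logBase and log2 are hypothetical; the point is only that fixing one component of a pair-taking function yields the one-argument function a caller expects.

```sml
(* adapt : 'c -> ('a * 'c -> 'b) -> 'a -> 'b  (up to renaming of type variables)
   Fixes the second component of a pair-taking function. *)
fun adapt c f a = f (a, c)

(* Convenient to write with an explicit base ... *)
fun logBase (x, base) = Math.ln x / Math.ln base

(* ... and specialised, by partial application, to real -> real. *)
val log2 = adapt 2.0 logBase
val three = log2 8.0   (* approximately 3.0, subject to floating-point rounding *)
```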


Libraries


Standard

The Basis Library has been standardized and ships with most implementations. It provides modules for trees, arrays and other data structures as well as input/output and system interfaces.


Third party

For numerical computing, a Matrix module exists (but is currently broken), https://www.cs.cmu.edu/afs/cs/project/pscico/pscico/src/matrix/README.html. For graphics, cairo-sml is an open source interface to the Cairo graphics library. For machine learning, a library for graphical models exists.


Implementations

Implementations of Standard ML include the following:

Standard
* HaMLet: a Standard ML interpreter that aims to be an accurate and accessible reference implementation of the standard
* MLton (mlton.org): a whole-program optimizing compiler which strictly conforms to the Definition and produces very fast code compared to other ML implementations, including backends for LLVM and C
* Moscow ML: a light-weight implementation, based on the CAML Light runtime engine, which implements the full Standard ML language, including modules and much of the basis library
* Poly/ML: a full implementation of Standard ML that produces fast code and supports multicore hardware (via Posix threads); its runtime system performs parallel garbage collection and online sharing of immutable substructures
* Standard ML of New Jersey (smlnj.org): a full compiler, with associated libraries, tools, an interactive shell, and documentation, with support for Concurrent ML
* SML.NET: a Standard ML compiler for the common language runtime with extensions for linking with other .NET code
* ML Kit: an implementation based very closely on the Definition, integrating a garbage collector (which can be disabled) and region-based memory management with automatic inference of regions, aiming to support real-time applications

Derivative
* Alice: an interpreter for Standard ML by Saarland University with support for parallel programming using futures, lazy evaluation, distributed computing via remote procedure calls and constraint programming
* SML#: an extension of SML providing record polymorphism and C language interoperability. It is a conventional native compiler and its name is ''not'' an allusion to running on the .NET framework
* SOSML: an implementation written in TypeScript, supporting most of the SML language and select parts of the basis library

Research
* CakeML is a REPL version of ML with formally verified runtime and translation to assembler.
* Isabelle (Isabelle/ML) integrates parallel Poly/ML into an interactive theorem prover, with a sophisticated IDE (based on jEdit) for official Standard ML (SML'97), the Isabelle/ML dialect, and the proof language. Starting with Isabelle2016, there is also a source-level debugger for ML.
* Poplog implements a version of Standard ML, along with Common Lisp and Prolog, allowing mixed language programming; all are implemented in POP-11, which is compiled incrementally.
* TILT is a full certifying compiler for Standard ML which uses typed intermediate languages to optimize code and ensure correctness, and can compile to typed assembly language.

All of these implementations are open-source and freely available. Most are implemented themselves in Standard ML. There are no longer any commercial implementations; Harlequin, now defunct, once produced a commercial IDE and compiler called MLWorks which passed on to Xanalys and was later open-sourced after it was acquired by Ravenbrook Limited on April 26, 2013.


Major projects using SML

The IT University of Copenhagen's entire enterprise architecture is implemented in around 100,000 lines of SML, including staff records, payroll, course administration and feedback, student project management, and web-based self-service interfaces.

The proof assistants HOL4, Isabelle, LEGO, and Twelf are written in Standard ML. It is also used by compiler writers and integrated circuit designers such as ARM.


See also

* Declarative programming



External links

About Standard ML
* Revised definition
* Standard ML Family GitHub Project

About successor ML
* successor ML (sML): evolution of ML using Standard ML as a starting point
* HaMLet on GitHub: reference implementation for successor ML

Practical
* Basic introductory tutorial
* Examples in Rosetta Code

Academic
* Programming in Standard ML
