Iteratee
   HOME

TheInfoList



OR:

In
functional programming In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that ...
, an iteratee is a composable abstraction for incrementally processing sequentially presented chunks of input data in a purely functional fashion. With iteratees, it is possible to lazily transform how a resource will emit data, for example, by converting each chunk of the input to uppercase as they are retrieved or by limiting the data to only the five first chunks without loading the whole input data into memory. Iteratees are also responsible for opening and closing resources, providing predictable resource management. On each step, an iteratee is presented with one of three possible types of values: the next chunk of data, a value to indicate no data is available, or a value to indicate the iteration process has finished. It may return one of three possible types of values, to indicate to the caller what should be done next: one that means "stop" (and contains the final return value), one that means "continue" (and specifies how to continue), and one that means "signal an error". The latter types of values in effect represent the possible "states" of an iteratee. An iteratee would typically start in the "continue" state. Iteratees are used in
Haskell Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
and Scala (in the
Play Framework Play Framework is an open-source web application framework which follows the model–view–controller (MVC) architectural pattern. It is written in Scala and usable from other programming languages that are compiled to JVM bytecode, e.g. Java. ...
and in Scalaz), and are also available for F#. Various slightly different implementations of iteratees exist. For example, in the Play framework, they involve
Futures Futures may mean: Finance *Futures contract, a tradable financial derivatives contract *Futures exchange, a financial market where futures contracts are traded * ''Futures'' (magazine), an American finance magazine Music * ''Futures'' (album), a ...
so that asynchronous processing can be performed. Because iteratees are called by other code which feeds them with data, they are an example of
inversion of control In software engineering, inversion of control (IoC) is a design pattern in which custom-written portions of a computer program receive the flow of control from a generic framework. A software architecture with this design inverts control as co ...
. However, unlike many other examples of inversion of control such as SAX XML parsing, the iteratee retains a limited amount of control over the process. It cannot reverse back and look at previous data (unless it stores that data internally), but it can stop the process cleanly without throwing an exception (using exceptions as a means of
control flow In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an ''im ...
, rather than to signal an exceptional event, is often frowned upon by programmers).


Commonly associated abstractions

The following abstractions are not strictly speaking necessary to work with iteratees, but they do make it more convenient.


Enumerators

An Enumerator is a convenient abstraction for feeding data into an iteratee from an arbitrary data source. Typically the enumerator will take care of any necessary resource cleanup associated with the data source. Because the enumerator knows exactly when the iteratee has finished reading data, it will do the resource cleanup (such as closing a file) at exactly the right time – neither too early nor too late. However, it can do this without needing to know about, or being co-located to, the implementation of the iteratee – so enumerators and iteratees form an example of
separation of concerns In computer science, separation of concerns is a design principle for separating a computer program into distinct sections. Each section addresses a separate '' concern'', a set of information that affects the code of a computer program. A concern ...
.


Enumeratees

An Enumeratee is a convenient abstraction for transforming the output of either an enumerator or iteratee, and feeding that output to an iteratee. For example, a "map" enumeratee would
map A map is a symbolic depiction emphasizing relationships between elements of some space, such as objects, regions, or themes. Many maps are static, fixed to paper or some other durable medium, while others are dynamic or interactive. Although ...
a function over each input chunk.


Motivations

Iteratees were created due to problems with existing purely functional solutions to the problem of making
input/output In computing, input/output (I/O, or informally io or IO) is the communication between an information processing system, such as a computer, and the outside world, possibly a human or another information processing system. Inputs are the signals ...
composable yet correct. Lazy I/O in Haskell allowed pure functions to operate on data on disk as if it were in memory, without explicitly doing I/O at all after opening the file - a kind of
memory-mapped file A memory-mapped file is a segment of virtual memory that has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource. This resource is typically a file that is physically present on disk, but can also b ...
feature - but because it was impossible in general (due to the
Halting problem In computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run forever. Alan Turing proved in 1936 that a ...
) for the runtime to know whether the file or other resource was still needed, excessive numbers of files could be left open unnecessarily, resulting in file descriptor exhaustion at the
operating system An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs. Time-sharing operating systems schedule tasks for efficient use of the system and may also i ...
level. Traditional C-style I/O, on the other hand, was too low-level and required the developer to be concerned with low-level details such as the current position in the file, which hindered composability. Iteratees and enumerators combine the high-level functional programming benefits of lazy I/O, with the ability to control resources and low-level details where necessary afforded by C-style I/O.


Examples


Uses

Iteratees are used in the Play framework to push data out to long-running
Comet A comet is an icy, small Solar System body that, when passing close to the Sun, warms and begins to release gases, a process that is called outgassing. This produces a visible atmosphere or coma, and sometimes also a tail. These phenomena ...
and
WebSocket WebSocket is a computer communications protocol, providing full-duplex communication channels over a single TCP connection. The WebSocket protocol was standardized by the IETF as in 2011. The current API specification allowing web applications ...
connections to
web browsers A web browser is application software for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's screen. Browsers are used on ...
. Iteratees may also be used to perform incremental
parsing Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from L ...
(that is, parsing that does not read all the data into memory at once), for example of JSON. It is important to note, however, that iteratees are a very general abstraction and can be used for arbitrary kinds of
sequential In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is called th ...
information processing (or mixed sequential/random-access processing) - and need not involve any I/O at all. This makes it easy to repurpose an iteratee to work on an in-memory dataset instead of data flowing in from the network.


History

In a sense, a distant predecessor of the notion of an enumerator pushing data into a chain of one or more iteratees, was the
pipeline Pipeline may refer to: Electronics, computers and computing * Pipeline (computing), a chain of data-processing stages or a CPU optimization found on ** Instruction pipelining, a technique for implementing instruction-level parallelism within a s ...
concept in operating systems. However, unlike a typical pipeline, iteratees are not separate processes (and hence do not have the overhead of IPC) - or even separate threads, although they can perform work in a similar manner to a chain of worker threads sending messages to each other. This means that iteratees are more lightweight than processes or threads - unlike the situations with separate processes or threads, no extra stacks are needed. Iteratees and enumerators were invented by
Oleg Kiselyov Oleg Viktorovich Kiselyov (russian: Олег Викторович Киселёв; born 11 January 1967 in Yaroslavl) is a Russian handball player. He won a gold medal with the Unified Team at the 1992 Summer Olympics in Barcelona. H ...
for use in Haskell. Later, they were introduced into Scalaz (in version 5.0; enumeratees were absent and were introduced in Scalaz 7) and into Play Framework 2.0.


Formal semantics

Iteratees have been formally modelled as free monads, allowing equational laws to be validated, and employed to optimise programs using iteratees.


Alternatives

* Iterators may be used instead of iteratees in Scala, but they are imperative, so are not a purely functional solution. * In Haskell, two alternative abstractions known as Conduits and Pipes have been developed. (These Pipes are not operating system level pipes, so like iteratees they do not require the use of system calls). Conduits in particular are associated with substantially richer libraries of primitives and combinators than iteratees; conduit adapters for incremental functionalities such as parsing HTML, XML, generalised parsing, making HTTP requests and processing the responses, exist, making conduits more suitable than iteratees for industrial software development in Haskell, out of the box. * There is also a high-level abstraction name
machines
In Scala there is a package calle
FS2: Functional Streams for Scala
whose ancestry can be traced back to machines via several ports, renames and refactors. * In Haskell, the packag
safe-lazy-io
exists. It provides a simpler solution to some of the same problems, which essentially involves being "strict enough" to pull all data that is required, or might be required, through a pipeline which takes care of cleaning up the resources on completion.


References


Further reading

* {{cite web, url=http://themonadreader.wordpress.com/2010/05/12/issue-16/, title=Iteratee: Teaching an Old Fold New Tricks, author=John W. Lato, work=Issue #16 of The Monad Reader, access-date=29 June 2013, date=12 May 2010 This relates to Haskell.


External links

* Scala tutorials ** Play 2.0 **
Understanding Play 2 iteratees for normal humans
**

** Scalaz **
Scalaz tutorial: Enumeration-based I/O with iteratees
* Haskell tutorials *

* Further information *

Functional programming Iteration in programming