HOME

TheInfoList



OR:

Automata-based programming is a
programming paradigm Programming paradigms are a way to classify programming languages based on their features. Languages can be classified into multiple paradigms. Some paradigms are concerned mainly with implications for the execution model of the language, suc ...
in which the program or part of it is thought of as a model of a
finite-state machine A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number o ...
(FSM) or any other (often more complicated) formal automaton (see
automata theory Automata theory is the study of abstract machines and automata, as well as the computational problems that can be solved using them. It is a theory in theoretical computer science. The word ''automata'' comes from the Greek word αὐτόματο ...
). Sometimes a potentially infinite set of possible states is introduced, and such a set can have a complicated structure, not just an enumeration. Finite-state machine-based programming is generally the same, but, formally speaking, does not cover all possible variants, as FSM stands for finite-state machine, and automata-based programming does not necessarily employ FSMs in the strict sense. The following properties are key indicators for automata-based programming: * The time period of the program's execution is clearly separated down to the ''automaton steps''. Each step is effectively an execution of a code section (same for all the steps) which has a single entry point. That section might be divided down to subsections to be executed depending on different states, although this is not necessary. * Any communication between the automaton steps is only possible via the explicitly noted set of variables named the ''automaton state''. Between any two steps, the program cannot have implicit components of its state, such as local variables' values, return addresses, the current instruction pointer, etc. That is, the state of the whole program, taken at any two moments of entering an automaton step, can only differ in the values of the variables being considered as the automaton state. The whole execution of the automata-based code is a ''cycle'' of the automaton steps. Another reason for using the notion of automata-based programming is that the programmer's style of thinking about the program in this technique is very similar to the style of thinking used to solve mathematical tasks using
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
s,
Markov algorithm In theoretical computer science, a Markov algorithm is a string rewriting system that uses grammar-like rules to operate on strings of symbols. Markov algorithms have been shown to be Turing-complete, which means that they are suitable as a gener ...
s, etc.


Example


Task

Consider the task of reading a text from
standard input In computer programming, standard streams are interconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin ...
line-by-line and writing the first word of each line to
standard output In computer programming, standard streams are interconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin ...
. First we skip all leading whitespace characters, if any. Then we print all the characters of the first word. Finally we skip all the trailing characters until a
newline Newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc. This character, or a ...
character is encountered. Whenever a sequence of newline characters is encountered not at the beginning of the stream, we print only the first one and skip the remaining ones; else, we skip all. Next we restart the process at the following line. Upon encountering the
end-of-file In computing, end-of-file (EOF) is a condition in a computer operating system where no more data can be read from a data source. The data source is usually called a file or stream. Details In the C standard library, the character reading functio ...
condition (regardless of the stage), we stop.


Traditional program

A traditional program in C which performs the above task could look like this: #include #include int main(void) For instance, compiling and running the above program on this input: $ clang program.c && (printf "\t\v\f\r \n\n\t\v\f\r foo bar baz\n\n\t\v\f\r qux quux corge" , ./a.out) yields: foo qux


Automata-based program


Procedural

The same task can be solved by thinking in terms of
finite-state machine A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number o ...
s. Note that the parsing of a line has three stages: skipping the leading whitespace characters, printing the characters of the first word and skipping the trailing characters. Let's call these automaton states BEFORE, INSIDE and AFTER. An automata-based version of the program could look like this: #include #include enum State ; int main(void) Although the program now looks longer, it has at least one significant advantage: there is only ''one reading'' (that is, call to the getchar function) instruction. Besides that, there is only one loop instead of the four the traditional version had. The body of the while loop is the ''automaton step'' and the loop itself is the ''cycle'' of the automaton step. The program implements the work of a finite-state machine shown in the state diagram. The most important property of the program is that the automaton step code section is clearly localized. With an explicit function step for the automation step, the program better demonstrates this property: #include #include enum State ; void step(enum State* const s, int const c) int main(void) The program now clearly demonstrates the basic properties of automata-based code: * time periods of automaton step executions may not overlap; * the only information passed from the previous step to the next is the explicitly specified ''automaton state''. A finite automaton can be defined by a state-transition table whose rows stand for the current states, columns stand for the inputs, and cells stand for the next states and actions to perform. Generally speaking, an automata-based program can naturally use this approach. With an explicit two-dimensional array transitions for the state-transition table, the program uses this approach: #include #include enum State ; void nop(int const c) void print(int const c) struct Branch ; struct Branch const transitions 3] = ; void step(enum State* const s, int const c) int main(void)


Object-oriented

If the implementation language supports
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of pr ...
, a simple refactoring of the program is to encapsulate the automaton into an object, thus hiding its implementation details. The program in
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
using object-oriented style could look like this: #include #include enum State ; struct Branch ; class StateMachine ; StateMachine::StateMachine(): _state(BEFORE) void StateMachine::feedChar(int const c) void StateMachine::nop(int const c) void StateMachine::print(int const c) struct Branch const StateMachine::_transitions 3] = ; int main() ''Note.'' — To minimize changes not directly related to the subject of the article, the
input/output In computing, input/output (I/O, or informally io or IO) is the communication between an information processing system, such as a computer, and the outside world, possibly a human or another information processing system. Inputs are the signals ...
getchar and putchar functions from the standard library of C are being used. The state design pattern is a way for an object to change its behavior at runtime according to its internal state ''without resorting to large conditional statements or table lookups'' thanks to virtual function calls. Its main advantage over code using large conditional statements is that state-specific code is distributed across different objects rather than localized in a monolithic block, which improves maintainability. Its main advantages over code using state-transition tables are that virtual function calls are often more efficient than table lookups, that state-transition criteria are more explicit than in tabular format, and that it is easier to add actions accompanying state transitions. However it introduces a new problem: the number of classes makes the code less compact than the other approaches. The program using the state design pattern could look like this: #include #include class StateMachine; class State ; class Before: public State ; class Inside: public State ; class After: public State ; class StateMachine ; State const* Before::instantiate() void Before::feedChar(StateMachine* const m, int const c) const State const* Before::_instance = nullptr; State const* Inside::instantiate() void Inside::feedChar(StateMachine* const m, int const c) const State const* Inside::_instance = nullptr; State const* After::instantiate() void After::feedChar(StateMachine* const m, int const c) const State const* After::_instance = nullptr; StateMachine::StateMachine(): _state(Before::instantiate()) void StateMachine::feedChar(int const c) void StateMachine::setState(State const* const s) int main()


Automation and automata

Automata-based programming indeed closely matches the programming needs found in the field of
automation Automation describes a wide range of technologies that reduce human intervention in processes, namely by predetermining decision criteria, subprocess relationships, and related actions, as well as embodying those predeterminations in machines ...
. A production cycle is commonly modelled as: * a sequence of stages stepping according to input data (from captors); * a set of actions performed depending on the current stage. Various dedicated programming languages allow expressing such a model in more or less sophisticated ways.


Automation program

The example presented above could be expressed according to this view like in the following
pseudo-code In computer science, pseudocode is a plain language description of the steps in an algorithm or another system. Pseudocode often uses structural conventions of a normal programming language, but is intended for human reading rather than machine re ...
('set' activates a logic variable, 'reset' inactivates a logic variable, ':' assigns a variable, and '=' tests for equality): newline: '\n' whitespace: ('\t', '\n', '\v', '\f', '\r', ' ') states: (before, inside, after) setState(c) doAction(c) cycle The separation of routines expressing cycle progression on one side, and actual action on the other (matching input and output) allows clearer and simpler code.


Events

In the field of automation, stepping from step to step depends on input data coming from the machine itself. This is represented in the program by reading characters from a text. In reality, those data inform about position, speed, temperature, etc. of critical elements of a machine. Like in
GUI The GUI ( "UI" by itself is still usually pronounced . or ), graphical user interface, is a form of user interface that allows users to interact with electronic devices through graphical icons and audio indicator such as primary notation, inste ...
programming, ''changes'' in the machine state can thus be considered as events causing the passage from a state to another, until the final one is reached. The combination of possible states can generate a wide variety of events, thus defining a more complex production cycle. As a consequence, cycles are usually far to be simple linear sequences. There are commonly parallel branches running together and alternatives selected according to different events, schematically represented below: s:stage c:condition s1 , , -c2 , s2 , ---------- , , , -c31 , -c32 , , s31 s32 , , , -c41 , -c42 , , ---------- , s4


Applications

Automata-based programming is widely used in
lexical Lexical may refer to: Linguistics * Lexical corpus or lexis, a complete set of all words in a language * Lexical item, a basic unit of lexicographical classification * Lexicon, the vocabulary of a person, language, or branch of knowledge * Lex ...
and syntactic analyses. Besides that, thinking in terms of automata (that is, breaking the execution process down to ''automaton steps'' and passing information from step to step through the explicit ''automaton state'') is necessary for
event-driven programming In computer programming, event-driven programming is a programming paradigm in which the flow of the program is determined by events such as user actions ( mouse clicks, key presses), sensor outputs, or message passing from other programs or t ...
as the only alternative to using parallel processes or threads. The notions of states and state machines are often used in the field of
formal specification In computer science, formal specifications are mathematically based techniques whose purpose are to help with the implementation of systems and software. They are used to describe a system, to analyze its behavior, and to aid in its design by verif ...
. For instance,
UML The Unified Modeling Language (UML) is a general-purpose, developmental modeling language in the field of software engineering that is intended to provide a standard way to visualize the design of a system. The creation of UML was originally m ...
-based software architecture development uses state diagrams to specify the behaviour of the program. Also various
communication protocol A communication protocol is a system of rules that allows two or more entities of a communications system to transmit information via any kind of variation of a physical quantity. The protocol defines the rules, syntax, semantics (computer scien ...
s are often specified using the explicit notion of state (e.g., ). Thinking in terms of automata (steps and states) can also be used to describe semantics of some
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s. For example, the execution of a program written in the
Refal Refal ("Recursive functions algorithmic language"; russian: РЕФАЛ) "is a functional programming language oriented toward symbolic computations", including " string processing, language translation, ndartificial intelligence". It is one of th ...
language is described as a sequence of ''steps'' of a so-called abstract Refal machine; the state of the machine is a ''view'' (an arbitrary Refal expression without variables).
Continuations In computer science, a continuation is an abstract representation of the control state of a computer program. A continuation implements ( reifies) the program control state, i.e. the continuation is a data structure that represents the computati ...
in the
Scheme A scheme is a systematic plan for the implementation of a certain idea. Scheme or schemer may refer to: Arts and entertainment * ''The Scheme'' (TV series), a BBC Scotland documentary series * The Scheme (band), an English pop band * ''The Schem ...
language require thinking in terms of steps and states, although Scheme itself is in no way automata-related (it is recursive). To make it possible for the
call/cc In the Scheme computer programming language, the procedure call-with-current-continuation, abbreviated call/cc, is used as a control flow operator. It has been adopted by several other programming languages. Taking a function f as its only argum ...
feature to work, implementation needs to be able to catch a whole state of the executing program, which is only possible when there is no implicit part in the state. Such a caught state is the very thing called ''continuation'', and it can be considered as the state of a (relatively complicated) automaton. The automaton step is deducing the next continuation from the previous one, and the execution process is the cycle of such steps. Alexander Ollongren in his book explains the so-called ''Vienna method'' of programming languages semantics description which is fully based on formal automata. The STAT syste

is a good example of using the automata-based approach; this system, besides other features, includes an embedded language called ''STATL'' which is purely automata-oriented.


History

Automata-based techniques were used widely in the domains where there are algorithms based on automata theory, such as formal language analyses. One of the early papers on this is by Johnson et al., 1968. One of the earliest mentions of automata-based programming as a general technique is found in the paper by
Peter Naur Peter Naur (25 October 1928 – 3 January 2016) was a Danish computer science pioneer and Turing award winner. He is best remembered as a contributor, with John Backus, to the Backus–Naur form (BNF) notation used in describing the syntax for m ...
, 1963. The author calls the technique ''Turing machine approach'', however no real
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
is given in the paper; instead, the technique based on steps and states is described.


Comparison with imperative and procedural programming

The notion of
state State may refer to: Arts, entertainment, and media Literature * ''State Magazine'', a monthly magazine published by the U.S. Department of State * ''The State'' (newspaper), a daily newspaper in Columbia, South Carolina, United States * ''Our S ...
is not exclusive property of automata-based programming. Generally speaking, state (or
program state In information technology and computer science, a system is described as stateful if it is designed to remember preceding events or user interactions; the remembered information is called the state of the system. The set of states a system can oc ...
) appears during execution of any
computer program A computer program is a sequence or set of instructions in a programming language for a computer to execute. Computer programs are one component of software, which also includes documentation and other intangible components. A computer program ...
, as a combination of all information that can change during the execution. For instance, a state of a traditional imperative program consists of * values of all variables and the information stored within dynamic memory; * values stored in registers; * stack contents (including
local variable In computer science, a local variable is a Variable (programming), variable that is given ''local scope (programming), scope''. A local variable reference in the subroutine, function or block (programming), block in which it is declared overrides ...
s' values and return addresses); * current value of the
instruction pointer The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, is ...
. These can be divided to the ''explicit'' part (such as values stored in variables) and the ''implicit'' part (return addresses and the instruction pointer). Having said this, an automata-based program can be considered as a special case of an imperative program, in which implicit part of the state is minimized. The state of the whole program taken at the two distinct moments of entering the ''step'' code section can differ in the automaton state only. This simplifies the analysis of the program.


Object-oriented programming relationship

In the theory of
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of pr ...
, an ''object'' is said to have an internal ''state'' and is capable of ''receiving messages'', ''responding'' to them, ''sending'' messages to other objects and changing its internal state during message handling. In more practical terminology, ''to call an object's method'' is considered the same as ''to send a message to the object''. Thus, on the one hand, objects from object-oriented programming can be considered as automata (or models of automata) whose ''state'' is the combination of private fields, and one or more methods are considered to be the ''step''. Such methods must not call each other nor themselves, neither directly nor indirectly, otherwise the object can not be considered to be implemented in an automata-based manner. On the other hand, object is good for implementing a model of an automaton. When the automata-based approach is used within an object-oriented language, an automaton model is usually implemented by a class, the state is represented with private fields of the class, and the step is implemented as a method; such a method is usually the only non-constant public method of the class (besides constructors and destructors). Other public methods could query the state but don't change it. All the secondary methods (such as particular state handlers) are usually hidden within the private part of the class.


See also

*
Cellular automaton A cellular automaton (pl. cellular automata, abbrev. CA) is a discrete model of computation studied in automata theory. Cellular automata are also called cellular spaces, tessellation automata, homogeneous structures, cellular structures, tessel ...
*
Nondeterministic programming A nondeterministic programming language is a language which can specify, at certain points in the program (called "choice points"), various alternatives for program flow. Unlike an if-then statement, the method of choice between these alternativ ...
*
State pattern The state pattern is a behavioral software design pattern that allows an object to alter its behavior when its internal state changes. This pattern is close to the concept of finite-state machines. The state pattern can be interpreted as a strategy ...
*
Esterel Esterel is a synchronous programming language for the development of complex reactive systems. The imperative programming style of Esterel allows the simple expression of parallelism and preemption. As a consequence, it is well suited for contr ...
, an automata-based language *
Umple Umple is a language for both object-oriented programming and modelling with class diagrams and state diagrams. The name Umple is a portmanteau of "UML", "ample" and "Simple", indicating that it is designed to provide ample features to extend pro ...
, a tool to add automata to Java and C++


References


External links


J. V. Noble. «Finite State Machines in Forth»
— automata-based programming in
Forth Forth or FORTH may refer to: Arts and entertainment * ''forth'' magazine, an Internet magazine * ''Forth'' (album), by The Verve, 2008 * ''Forth'', a 2011 album by Proto-Kaw * Radio Forth, a group of independent local radio stations in Scotla ...
* * {{cite journal , last = Harel , first = David , author2 = Drusinsky, D. , s2cid = 8754800 , year = 1989 , title = Using Statecharts for Hardware Description and Synthesis , journal = IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems , volume = 8 , issue = 7 , pages = 798–807 , doi = 10.1109/43.31537 * Polikarpova N. I., Shalyto A. A
Automata-based programming
SPb.: Piter. 2009 (rus)

ITMO University, "Programming Technology" department
Programming paradigms