Partition Refinement
   HOME

TheInfoList



OR:

In the design of
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
s, partition refinement is a technique for representing a
partition of a set In mathematics, a partition of a set is a grouping of its elements into non-empty subsets, in such a way that every element is included in exactly one subset. Every equivalence relation on a set defines a partition of this set, and every parti ...
as a
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
that allows the partition to be refined by splitting its sets into a larger number of smaller sets. In that sense it is dual to the
union-find data structure In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that stores a collection of disjoint (non-overlapping) sets. Equivalently, it stores a partition of a se ...
, which also maintains a partition into
disjoint set In mathematics, two sets are said to be disjoint sets if they have no element in common. Equivalently, two disjoint sets are sets whose intersection is the empty set.. For example, and are ''disjoint sets,'' while and are not disjoint. A c ...
s but in which the operations merge pairs of sets. In some applications of partition refinement, such as
lexicographic breadth-first search In computer science, lexicographic breadth-first search or Lex-BFS is a linear time algorithm for ordering the vertices of a graph. The algorithm is different from a breadth-first search, but it produces an ordering that is consistent with breadt ...
, the data structure maintains as well an ordering on the sets in the partition. Partition refinement forms a key component of several efficient algorithms on
graphs Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...
and
finite automata A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number o ...
, including
DFA minimization In automata theory (a branch of theoretical computer science), DFA minimization is the task of transforming a given deterministic finite automaton (DFA) into an equivalent DFA that has a minimum number of states. Here, two DFAs are called equival ...
, the
Coffman–Graham algorithm In job shop scheduling and graph drawing, the Coffman–Graham algorithm is an algorithm, named after Edward G. Coffman, Jr. and Ronald Graham, for arranging the elements of a partially ordered set into a sequence of levels. The algorithm chooses ...
for parallel scheduling, and
lexicographic breadth-first search In computer science, lexicographic breadth-first search or Lex-BFS is a linear time algorithm for ordering the vertices of a graph. The algorithm is different from a breadth-first search, but it produces an ordering that is consistent with breadt ...
of graphs.


Data structure

A partition refinement algorithm maintains a family of disjoint sets . At the start of the algorithm, this family contains a single set of all the elements in the data structure. At each step of the algorithm, a set is presented to the algorithm, and each set in the family that contains members of is split into two sets, the intersection and the
difference Difference, The Difference, Differences or Differently may refer to: Music * ''Difference'' (album), by Dreamtale, 2005 * ''Differently'' (album), by Cassie Davis, 2009 ** "Differently" (song), by Cassie Davis, 2009 * ''The Difference'' (al ...
. Such an algorithm may be implemented efficiently by maintaining
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
s representing the following information: *The ordered sequence of the sets in the family, in a form such as a
doubly linked list In computer science, a doubly linked list is a linked data structure that consists of a set of sequentially linked record (computer science), records called node (computer science), nodes. Each node contains three field (computer science), fields: ...
that allows new sets to be inserted into the middle of the sequence *Associated with each set , a
collection Collection or Collections may refer to: * Cash collection, the function of an accounts receivable department * Collection (church), money donated by the congregation during a church service * Collection agency, agency to collect cash * Collectio ...
of its elements of , in a form such as a
doubly linked list In computer science, a doubly linked list is a linked data structure that consists of a set of sequentially linked record (computer science), records called node (computer science), nodes. Each node contains three field (computer science), fields: ...
or
array data structure In computer science, an array is a data structure consisting of a collection of ''elements'' (values or variables), each identified by at least one ''array index'' or ''key''. An array is stored such that the position of each element can be co ...
that allows for rapid deletion of individual elements from the collection. Alternatively, this component of the data structure may be represented by storing all of the elements of all of the sets in a single array, sorted by the identity of the set they belong to, and by representing the collection of elements in any set by its starting and ending positions in this array. *Associated with each element, the set it belongs to. To perform a refinement operation, the algorithm loops through the elements of the given set . For each such element , it finds the set that contains , and checks whether a second set for has already been started. If not, it creates the second set and adds to a list of the sets that are split by the operation. Then, regardless of whether a new set was formed, the algorithm removes from and adds it to . In the representation in which all elements are stored in a single array, moving from one set to another may be performed by swapping with the final element of and then decrementing the end index of and the start index of the new set. Finally, after all elements of have been processed in this way, the algorithm loops through , separating each current set from the second set that has been split from it, and reports both of these sets as being newly formed by the refinement operation. The time to perform a single refinement operations in this way is , independent of the number of elements in the family of sets and also independent of the total number of sets in the data structure. Thus, the time for a sequence of refinements is proportional to the total size of the sets given to the algorithm in each refinement step.


Applications

An early application of partition refinement was in an algorithm by for
DFA minimization In automata theory (a branch of theoretical computer science), DFA minimization is the task of transforming a given deterministic finite automaton (DFA) into an equivalent DFA that has a minimum number of states. Here, two DFAs are called equival ...
. In this problem, one is given as input a
deterministic finite automaton In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state autom ...
, and must find an equivalent automaton with as few states as possible. Hopcroft's algorithm maintains a partition of the states of the input automaton into subsets, with the property that any two states in different subsets must be mapped to different states of the output automaton. Initially, there are two subsets, one containing all the accepting states of the automaton and one containing the remaining states. At each step one of the subsets and one of the input symbols of the automaton are chosen, and the subsets of states are refined into states for which a transition labeled would lead to , and states for which an -transition would lead somewhere else. When a set that has already been chosen is split by a refinement, only one of the two resulting sets (the smaller of the two) needs to be chosen again; in this way, each state participates in the sets for refinement steps and the overall algorithm takes time , where is the number of initial states and is the size of the alphabet. Partition refinement was applied by in an efficient implementation of the
Coffman–Graham algorithm In job shop scheduling and graph drawing, the Coffman–Graham algorithm is an algorithm, named after Edward G. Coffman, Jr. and Ronald Graham, for arranging the elements of a partially ordered set into a sequence of levels. The algorithm chooses ...
for parallel scheduling. Sethi showed that it could be used to construct a lexicographically ordered topological sort of a given
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ve ...
in linear time; this lexicographic topological ordering is one of the key steps of the Coffman–Graham algorithm. In this application, the elements of the disjoint sets are vertices of the input graph and the sets used to refine the partition are sets of neighbors of vertices. Since the total number of neighbors of all vertices is just the number of edges in the graph, the algorithm takes time linear in the number of edges, its input size. Partition refinement also forms a key step in
lexicographic breadth-first search In computer science, lexicographic breadth-first search or Lex-BFS is a linear time algorithm for ordering the vertices of a graph. The algorithm is different from a breadth-first search, but it produces an ordering that is consistent with breadt ...
, a graph search algorithm with applications in the recognition of
chordal graph In the mathematical area of graph theory, a chordal graph is one in which all cycles of four or more vertices have a ''chord'', which is an edge that is not part of the cycle but connects two vertices of the cycle. Equivalently, every induced cy ...
s and several other important classes of graphs. Again, the disjoint set elements are vertices and the set represent sets of neighbors, so the algorithm takes linear time..


See also

*
Refinement (sigma algebra) Refinement may refer to: Mathematics * Equilibrium refinement, the identification of actualized equilibria in game theory * Refinement of an equivalence relation, in mathematics ** Refinement (topology), the refinement of an open cover in mathe ...


References

{{reflist Data structures