In graph theory, a maximal independent set (MIS) or maximal stable set is an independent set that is not a subset of any other independent set. In other words, there is no vertex outside the independent set that may join it, because the set is maximal with respect to the independent-set property. For example, in a path graph with three vertices ''a'', ''b'', and ''c'', and two edges ''ab'' and ''bc'', the sets {''b''} and {''a'', ''c''} are both maximally independent. The set {''a''} is independent, but is not a maximal independent set, because it is a subset of the larger independent set {''a'', ''c''}. In this same graph, the maximal cliques are the sets {''a'', ''b''} and {''b'', ''c''}.

A MIS is also a dominating set in the graph, and every dominating set that is independent must be maximal independent, so MISs are also called independent dominating sets.

A graph may have many MISs of widely varying sizes; the largest, or possibly several equally large, MISs of a graph is called a maximum independent set. The graphs in which all maximal independent sets have the same size are called well-covered graphs.

The phrase "maximal independent set" is also used to describe maximal subsets of independent elements in mathematical structures other than graphs, and in particular in vector spaces and matroids.

Two algorithmic problems are associated with MISs: finding a ''single'' MIS in a given graph and listing ''all'' MISs in a given graph.


Definition

For a graph G = (V, E), an independent set S is a maximal independent set if, for every vertex v \in V, one of the following is true:
# v \in S, or
# N(v) \cap S \neq \emptyset, where N(v) denotes the neighbours of v.
The above can be restated as: every vertex either belongs to the independent set or has at least one neighbour that belongs to the independent set. As a result, every edge of the graph has at least one endpoint not in S. However, it is not true that every edge of the graph has at least one endpoint in S; that is, S need not be a vertex cover. Notice that any neighbour of a vertex in the independent set S cannot itself be in S, since two adjacent vertices would violate the independence of S.
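The two conditions translate directly into a membership test. The following is a minimal sketch in Python (the adjacency-dictionary representation and the function names are illustrative assumptions, not part of the definition):

```python
def is_independent(adj, S):
    """True if no two vertices of S are adjacent. adj maps each vertex to its set of neighbours."""
    return all(w not in S for v in S for w in adj[v])

def is_maximal_independent(adj, S):
    """True if S is independent and every vertex is in S or has a neighbour in S."""
    if not is_independent(adj, S):
        return False
    return all(v in S or adj[v] & S for v in adj)

# Path a-b-c: {b} and {a, c} are maximal, {a} is independent but not maximal.
adj = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
print(is_maximal_independent(adj, {'b'}))        # True
print(is_maximal_independent(adj, {'a', 'c'}))   # True
print(is_maximal_independent(adj, {'a'}))        # False
```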


Related vertex sets

If ''S'' is a maximal independent set in some graph, it is a maximal clique or maximal complete subgraph in the complementary graph. A maximal clique is a set of vertices that induces a complete subgraph, and that is not a subset of the vertices of any larger complete subgraph. That is, it is a set ''S'' such that every pair of vertices in ''S'' is connected by an edge and every vertex not in ''S'' is missing an edge to at least one vertex in ''S''. A graph may have many maximal cliques, of varying sizes; finding the largest of these is the maximum clique problem. Some authors include maximality as part of the definition of a clique, and refer to maximal cliques simply as cliques.

The complement of a maximal independent set, that is, the set of vertices not belonging to the independent set, forms a minimal vertex cover. That is, the complement is a vertex cover, a set of vertices that includes at least one endpoint of each edge, and is minimal in the sense that none of its vertices can be removed while preserving the property that it is a cover. Minimal vertex covers have been studied in statistical mechanics in connection with the hard-sphere lattice gas model, a mathematical abstraction of fluid-solid state transitions.

Every maximal independent set is a dominating set, a set of vertices such that every vertex in the graph either belongs to the set or is adjacent to the set. A set of vertices is a maximal independent set if and only if it is an independent dominating set.
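These relationships can be checked mechanically on small examples. The sketch below, under the same illustrative adjacency-list assumptions as before, verifies for the three-vertex path that a maximal independent set is a clique in the complement graph and that its complement is a minimal vertex cover:

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """Edges of the complement graph, as frozensets of endpoints."""
    return {frozenset(p) for p in combinations(vertices, 2)} - {frozenset(e) for e in edges}

def is_vertex_cover(edges, C):
    return all(set(e) & C for e in edges)

def is_minimal_vertex_cover(edges, C):
    # Minimal: a cover from which no single vertex can be removed.
    return is_vertex_cover(edges, C) and not any(is_vertex_cover(edges, C - {v}) for v in C)

def is_clique(edges, S):
    E = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in E for u, v in combinations(S, 2))

# Path a-b-c with maximal independent set S = {a, c}:
V = {'a', 'b', 'c'}
E = [('a', 'b'), ('b', 'c')]
S = {'a', 'c'}
print(is_clique(complement_edges(V, E), S))   # True: S is a (maximal) clique in the complement
print(is_minimal_vertex_cover(E, V - S))      # True: V \ S = {b} is a minimal vertex cover
```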


Graph family characterizations

Certain graph families have also been characterized in terms of their maximal cliques or maximal independent sets. Examples include the maximal-clique irreducible and hereditary maximal-clique irreducible graphs. A graph is said to be ''maximal-clique irreducible'' if every maximal clique has an edge that belongs to no other maximal clique, and ''hereditary maximal-clique irreducible'' if the same property is true for every induced subgraph. Hereditary maximal-clique irreducible graphs include triangle-free graphs, bipartite graphs, and interval graphs.

Cographs can be characterized as the graphs in which every maximal clique intersects every maximal independent set, and in which the same property is true in all induced subgraphs.


Bounding the number of sets

Moon and Moser (1965) showed that any graph with ''n'' vertices has at most 3^{n/3} maximal cliques. Complementarily, any graph with ''n'' vertices also has at most 3^{n/3} maximal independent sets. A graph with exactly 3^{n/3} maximal independent sets is easy to construct: simply take the disjoint union of ''n''/3 triangle graphs. Any maximal independent set in this graph is formed by choosing one vertex from each triangle. The complementary graph, with exactly 3^{n/3} maximal cliques, is a special type of Turán graph; because of their connection with Moon and Moser's bound, these graphs are also sometimes called Moon–Moser graphs. Tighter bounds are possible if one limits the size of the maximal independent sets: the number of maximal independent sets of size ''k'' in any ''n''-vertex graph is at most
:\lfloor n/k \rfloor^{k - (n \bmod k)} \left(\lfloor n/k \rfloor + 1\right)^{n \bmod k}.
The graphs achieving this bound are again Turán graphs.

Certain families of graphs may, however, have much more restrictive bounds on the numbers of maximal independent sets or maximal cliques. If all ''n''-vertex graphs in a family of graphs have O(''n'') edges, and if every subgraph of a graph in the family also belongs to the family, then each graph in the family can have at most O(''n'') maximal cliques, all of which have size O(1). For instance, these conditions are true for the planar graphs: every ''n''-vertex planar graph has at most 3''n'' − 6 edges, and a subgraph of a planar graph is always planar, from which it follows that each planar graph has O(''n'') maximal cliques (of size at most four). Interval graphs and chordal graphs also have at most ''n'' maximal cliques, even though they are not always sparse graphs.

The number of maximal independent sets in ''n''-vertex cycle graphs is given by the Perrin numbers, and the number of maximal independent sets in ''n''-vertex path graphs is given by the Padovan sequence. Therefore, both numbers are proportional to powers of 1.324718..., the plastic number.
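The Perrin and Padovan claims can be spot-checked by brute force on small graphs. The following sketch (plain Python, exponential-time counting, suitable only for very small ''n'') verifies that the maximal-independent-set counts of paths and cycles obey the shared recurrence a(n) = a(n−2) + a(n−3) of the Padovan and Perrin sequences:

```python
from itertools import combinations

def count_mis(n, edges):
    """Brute-force count of maximal independent sets of a small graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    def maximal(S):
        return (all(w not in S for v in S for w in adj[v])   # independent
                and all(v in S or adj[v] & S for v in adj))  # no vertex can be added
    return sum(maximal(set(S)) for k in range(n + 1) for S in combinations(range(n), k))

path  = lambda n: [(i, i + 1) for i in range(n - 1)]
cycle = lambda n: path(n) + [(n - 1, 0)]

p = [count_mis(n, path(n))  for n in range(3, 12)]   # 2, 3, 4, 5, 7, 9, 12, 16, 21
c = [count_mis(n, cycle(n)) for n in range(3, 12)]   # 3, 2, 5, 5, 7, 10, 12, 17, 22 (Perrin numbers)
# Both sequences satisfy the Padovan/Perrin recurrence a(n) = a(n-2) + a(n-3):
print(all(p[i] == p[i - 2] + p[i - 3] for i in range(3, len(p))))
print(all(c[i] == c[i - 2] + c[i - 3] for i in range(3, len(c))))
```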


Finding a single maximal independent set


Sequential algorithm

Given a graph G = (V, E), it is easy to find a single MIS using the following algorithm:
# Initialize I to an empty set.
# While V is not empty:
#* Choose a node v ∈ V;
#* Add v to the set I;
#* Remove from V the node v and all its neighbours.
# Return I.
The runtime is O(''n'' + ''m''), since in the worst case we have to visit every vertex and check every edge. This is essentially the best possible runtime for a serial algorithm, but a parallel algorithm can solve the problem much faster.
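A possible rendering of this greedy procedure in Python is sketched below (the adjacency-dictionary input format is an assumption for illustration):

```python
def greedy_mis(adj):
    """Greedy maximal independent set. adj maps each vertex to the set of its neighbours."""
    remaining = set(adj)
    mis = set()
    while remaining:
        v = remaining.pop()        # choose an arbitrary remaining vertex and remove it
        mis.add(v)
        remaining -= adj[v]        # remove all of v's neighbours as well
    return mis

# Path a-b-c: depending on which vertex is chosen first, this returns {'b'} or {'a', 'c'}.
print(greedy_mis({'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}))
```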


Random-selection parallel algorithm (Luby's Algorithm)

The following algorithm finds a MIS in expected time O(log ''n'') (Luby's Algorithm, in: Lecture Notes for Randomized Algorithms, last updated by Eric Vigoda, February 2, 2006):
# Initialize I to an empty set.
# While V is not empty:
#* Choose a random set of vertices S ⊆ V, by selecting each vertex v independently with probability 1/(2d(v)), where d(v) is the degree of v (the number of neighbours of v).
#* For every edge in E, if both its endpoints are in the random set S, then remove from S the endpoint whose degree is lower (i.e. has fewer neighbours). Break ties arbitrarily, e.g. using a lexicographic order on the vertex names.
#* Add the set S to I.
#* Remove from V the set S and all the neighbours of nodes in S.
# Return I.

ANALYSIS: For each node v, divide its neighbours into ''lower neighbours'' (whose degree is lower than the degree of v) and ''higher neighbours'' (whose degree is higher than the degree of v), breaking ties as in the algorithm. Call a node v ''bad'' if more than 2/3 of its neighbours are higher neighbours. Call an edge ''bad'' if both its endpoints are bad; otherwise the edge is ''good''.
* At least 1/2 of all edges are always good. PROOF: Build a directed version of G by directing each edge towards the endpoint with the higher degree (breaking ties as in the algorithm). For every bad node, the number of outgoing edges is more than 2 times the number of incoming edges. So every bad edge that enters a node v can be matched to a distinct pair of edges that exit v. Hence the total number of edges is at least 2 times the number of bad edges.
* For every good node u, the probability that a neighbour of u is selected to S is at least a certain positive constant. PROOF: The probability that no neighbour of u is selected to S is at most the probability that none of u's ''lower neighbours'' is selected. For each lower neighbour v, the probability that it is not selected is 1 − 1/(2d(v)), which is at most 1 − 1/(2d(u)) (since d(v) ≤ d(u)). The number of such neighbours is at least d(u)/3, since u is good. Hence the probability that no lower neighbour is selected is at most (1 − 1/(2d(u)))^{d(u)/3} ≤ exp(−1/6), so some neighbour of u is selected with probability at least 1 − exp(−1/6).
* For every node u that is selected to S, the probability that u will be removed from S is at most 1/2. PROOF: This probability is at most the probability that a higher neighbour of u is selected to S. For each higher neighbour v, the probability that it is selected is at most 1/(2d(v)), which is at most 1/(2d(u)) (since d(v) ≥ d(u)). By the union bound, the probability that some higher neighbour is selected is at most d(u) · 1/(2d(u)) = 1/2.
* Hence, for every good node u, the probability that a neighbour of u is selected to S and remains in S is a certain positive constant. Hence, the probability that u is removed in each step is at least a positive constant.
* Hence, for every good edge e, the probability that e is removed in each step is at least a positive constant. So the number of good edges drops by at least a constant factor in each step.
* Since at least half the edges are good, the total number of edges also drops by a constant factor in each step.
* Hence, the number of steps is O(log ''m''), where ''m'' is the number of edges. This is bounded by O(\log(n)).

A worst-case graph, in which the average number of steps is \Theta(\log(n)), is a graph made of ''n''/2 connected components, each with 2 nodes. The degree of every node is 1, so each node is selected with probability 1/2, and with probability 1/4 neither node in a component is chosen. Hence, the number of nodes drops by a factor of 4 (in expectation) in each step, and the expected number of steps is \log_4(n).
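The rounds of this algorithm can be simulated sequentially. The sketch below is one possible Python rendering, not Luby's original formulation: degrees are recomputed on the shrinking graph each round, ties are broken by vertex name, and isolated vertices are added outright (the text leaves degree-0 vertices implicit, so that handling is an assumption):

```python
import random

def luby_mis(adj):
    """Round-by-round simulation of the random-selection algorithm (illustrative sketch).
    adj: dict mapping each vertex to the set of its neighbours."""
    adj = {v: set(ns) for v, ns in adj.items()}           # mutable working copy
    mis = set()
    while adj:
        deg = {v: len(ns) for v, ns in adj.items()}
        # Select each vertex independently with probability 1/(2 d(v));
        # isolated vertices (degree 0) are taken outright -- an assumption.
        S = {v for v in adj if deg[v] == 0 or random.random() < 1 / (2 * deg[v])}
        # For every edge with both endpoints selected, drop the lower-degree endpoint
        # (ties broken by comparing vertex labels).
        dropped = {v for v in S
                   if any(w in S and (deg[v], v) < (deg[w], w) for w in adj[v])}
        S -= dropped
        mis |= S
        removed = S | {w for v in S for w in adj[v]}       # S and all neighbours of S
        adj = {v: ns - removed for v, ns in adj.items() if v not in removed}
    return mis

print(luby_mis({0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}))    # a MIS of the path 0-1-2-3
```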


Random-priority parallel algorithm

The following algorithm is better than the previous one in that at least one new node is always added in each connected component:
# Initialize I to an empty set.
# While V is not empty, each node v does the following:
#* Selects a random number r(v) in [0, 1] and sends it to its neighbours;
#* If r(v) is smaller than the numbers of all neighbours of v, then v inserts itself into I, removes itself from V and tells its neighbours about this;
#* If v heard that one of its neighbours got into I, then v removes itself from V.
# Return I.
Note that in every step, the node with the smallest number in each connected component always enters I, so there is always some progress. In particular, in the worst case of the previous algorithm (''n''/2 connected components with 2 nodes each), a MIS will be found in a single step.

ANALYSIS:
* A node v with a neighbour w has probability at least 1/(d(v) + d(w)) of being removed. PROOF: For each edge connecting a pair of nodes (v, w), replace it with two directed edges, one from (v, w) and the other (w, v). Notice that |E| is now twice as large. For every pair of directed edges (v, w), define two events: (v → w) and (w → v), meaning that v pre-emptively removes w, and that w pre-emptively removes v, respectively. The event (v → w) occurs when r(v) < r(w) and r(v) < r(x), where w is a neighbour of v and x ranges over the neighbours of w. Recall that each node is given a random number in the same [0, 1] range. In a simple example with two disjoint nodes, each has probability 1/2 of being smallest. If there are three disjoint nodes, each has probability 1/3 of being smallest. In the case of v, it has probability at least 1/(d(v) + d(w)) of being smallest, because it is possible that a neighbour of v is also a neighbour of w, so such a node would otherwise be double counted. Using the same logic, the event (w → v) also has probability at least 1/(d(v) + d(w)) of occurring.
* When the events (v → w) and (w → v) occur, they remove d(w) and d(v) directed outgoing edges, respectively. PROOF: In the event (v → w), the node w is removed, and with it all d(w) directed edges going out of w. With the same logic, (w → v) removes the d(v) directed edges going out of v.
* In each iteration of step 2, in expectation, half the edges are removed. PROOF: If the event (v → w) happens then all neighbours of w are removed; hence the expected number of edges removed due to this event is at least d(w)/(d(v) + d(w)). The same is true for the reverse event (w → v), i.e. the expected number of edges removed is at least d(v)/(d(v) + d(w)). Hence, for every undirected edge (w, v), the expected number of edges removed due to one of these nodes having the smallest value is (d(w) + d(v))/(d(v) + d(w)) = 1. Summing over all edges, \sum_{(v,w) \in E} 1 = |E|, gives an expected number of |E| edges removed every step; but each edge is counted twice (once per direction), giving |E|/2 edges removed in expectation every step.
* Hence, the expected run time of the algorithm is 3 \log_{4/3}(m) + 1, which is O(\log(n)).
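A sequential simulation of these rounds might look as follows (again an illustrative sketch: fresh random priorities are drawn each round, and the message-passing of the distributed formulation is collapsed into direct set operations):

```python
import random

def random_priority_mis(adj):
    """Simulation of the random-priority rounds: a vertex enters the MIS when its random
    draw is smaller than those of all remaining neighbours. adj: vertex -> set of neighbours."""
    adj = {v: set(ns) for v, ns in adj.items()}
    mis = set()
    while adj:
        r = {v: random.random() for v in adj}              # fresh priorities each round
        winners = {v for v in adj if all(r[v] < r[w] for w in adj[v])}
        mis |= winners
        removed = winners | {w for v in winners for w in adj[v]}
        adj = {v: ns - removed for v, ns in adj.items() if v not in removed}
    return mis

# Triangle 0-1-2 plus an isolated vertex 3: returns one triangle vertex together with 3.
print(random_priority_mis({0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: set()}))
```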


Random-permutation parallel algorithm (Blelloch's Algorithm)

Instead of randomizing in each step, it is possible to randomize once, at the beginning of the algorithm, by fixing a random ordering on the nodes. Given this fixed ordering, the following parallel algorithm achieves exactly the same MIS as the sequential algorithm above (i.e. the result is deterministic); a sketch of this totally parallel version is given after the analysis below:
# Initialize I to an empty set.
# While V is not empty:
#* Let W be the set of vertices in V with no earlier neighbours (based on the fixed ordering);
#* Add W to I;
#* Remove from V the nodes in the set W and all their neighbours.
# Return I.
Between the totally sequential and the totally parallel algorithms, there is a continuum of algorithms that are partly sequential and partly parallel. Given a fixed ordering on the nodes and a factor δ ∈ (0,1], the following algorithm returns the same MIS:
# Initialize I to an empty set.
# While V is not empty:
#* Select a factor δ ∈ (0,1].
#* Let P be the set of δ''n'' nodes that are first in the fixed ordering.
#* Let W be a MIS on P, found using the totally parallel algorithm.
#* Add W to I;
#* Remove from V all the nodes in the prefix P, and all the neighbours of the nodes in the set W.
# Return I.
Setting δ = 1/''n'' gives the totally sequential algorithm; setting δ = 1 gives the totally parallel algorithm.

ANALYSIS: With a proper selection of the parameter δ in the partially parallel algorithm, it is possible to guarantee that it finishes after at most log(''n'') calls to the fully parallel algorithm, and that the number of steps in each call is at most log(''n''). Hence the total run-time of the partially parallel algorithm is O(\log^2(n)), and therefore the run-time of the fully parallel algorithm is also at most O(\log^2(n)). The main proof steps are:
* If, in step ''i'', we select \delta = 2^i \log(n) / D, where ''D'' is the maximum degree of a node in the graph, then with high probability (WHP) all nodes remaining after step ''i'' have degree at most D / 2^i. Thus, after log(''D'') steps, all remaining nodes have degree 0 (since ''D'' < ''n''), and they can be removed in a single step.
* If, in any step, the degree of each node is at most ''d'', and we select \delta = C \log(n) / d (for any constant ''C''), then WHP the longest path in the directed graph determined by the fixed ordering has length O(\log(n)). Hence the fully parallel algorithm takes at most O(\log(n)) steps in that call (since the longest path is a worst-case bound on its number of steps).
* Combining these two facts shows that, if we select \delta = 2^i \log(n) / D, then WHP the run-time of the partially parallel algorithm is O(\log(D)\log(n)) = O(\log^2(n)).
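The following sketch simulates the totally parallel (δ = 1) version; the ordering parameter is exposed so one can check that a fixed ordering reproduces the sequential algorithm's result (the function signature and input format are illustrative assumptions):

```python
import random

def random_permutation_mis(adj, order=None):
    """Simulate the random-permutation algorithm: fix one ordering, then repeatedly take
    every remaining vertex none of whose remaining neighbours precedes it in that ordering.
    adj: vertex -> set of neighbours; order: a permutation of the vertices (random if None)."""
    if order is None:
        order = list(adj)
        random.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}
    adj = {v: set(ns) for v, ns in adj.items()}
    mis = set()
    while adj:
        W = {v for v in adj if all(rank[v] < rank[w] for w in adj[v])}
        mis |= W
        removed = W | {w for v in W for w in adj[v]}
        adj = {v: ns - removed for v, ns in adj.items() if v not in removed}
    return mis

# With the ordering a < b < c on the path a-b-c this returns {'a', 'c'},
# exactly what the sequential greedy algorithm produces when scanning vertices in that order.
print(random_permutation_mis({'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}, order=['a', 'b', 'c']))
```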


Listing all maximal independent sets

An algorithm for listing all maximal independent sets or maximal cliques in a graph can be used as a subroutine for solving many NP-complete graph problems. Most obviously, the solutions to the maximum independent set problem, the maximum clique problem, and the minimum independent dominating set problem must all be maximal independent sets or maximal cliques, and can be found by an algorithm that lists all maximal independent sets or maximal cliques and retains the ones with the largest or smallest size. Similarly, the minimum vertex cover can be found as the complement of one of the maximal independent sets. It has also been observed that listing maximal independent sets can be used to find 3-colorings of graphs: a graph can be 3-colored if and only if the complement of one of its maximal independent sets is bipartite. The same author used this approach not only for 3-coloring but as part of a more general graph coloring algorithm, and similar approaches to graph coloring have been refined by other authors since. Other more complex problems can also be modeled as finding a clique or independent set of a specific type. This motivates the algorithmic problem of listing all maximal independent sets (or, equivalently, all maximal cliques) efficiently.

It is straightforward to turn a proof of Moon and Moser's 3^{n/3} bound on the number of maximal independent sets into an algorithm that lists all such sets in time O(3^{n/3}). For graphs that have the largest possible number of maximal independent sets, this algorithm takes constant time per output set. However, an algorithm with this time bound can be highly inefficient for graphs with more limited numbers of independent sets. For this reason, many researchers have studied algorithms that list all maximal independent sets in polynomial time per output set. The time per maximal independent set is proportional to that for matrix multiplication in dense graphs, or faster in various classes of sparse graphs.
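Because the maximal independent sets of a graph are exactly the maximal cliques of its complement, a simple (non-polynomial-delay) way to list them is to run the Bron–Kerbosch procedure on the complement graph. The sketch below does this without pivoting and is meant only as an illustration, not as one of the output-sensitive algorithms discussed above:

```python
def all_maximal_independent_sets(adj):
    """Enumerate all maximal independent sets via Bron-Kerbosch on the complement graph
    (illustrative sketch, no pivoting). adj: dict vertex -> set of neighbours."""
    vertices = set(adj)
    non_adj = {v: vertices - adj[v] - {v} for v in adj}   # neighbours in the complement

    def expand(R, P, X):
        if not P and not X:
            yield set(R)                                  # R is maximal: nothing left to add
            return
        for v in list(P):
            yield from expand(R | {v}, P & non_adj[v], X & non_adj[v])
            P = P - {v}
            X = X | {v}

    yield from expand(set(), vertices, set())

# The path a-b-c has exactly two maximal independent sets, {b} and {a, c}:
for s in all_maximal_independent_sets({'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}):
    print(sorted(s))
```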


Parallelization of finding maximal independent sets


History

The maximal independent set problem was originally thought to be non-trivial to parallelize, because computing the lexicographically first maximal independent set was proved to be P-complete; however, it has been shown that a deterministic parallel solution can be given by an NC^1 reduction from either the maximum set packing or the maximal matching problem, or by an NC^2 reduction from the 2-satisfiability problem. Typically, the structure of the algorithm given follows other parallel graph algorithms: the graph is subdivided into smaller local problems that are solvable in parallel by running an identical algorithm. Initial research into the maximal independent set problem started on the PRAM model and has since expanded to produce results for distributed algorithms on computer clusters. The many challenges of designing distributed parallel algorithms apply equally to the maximal independent set problem; in particular, finding an algorithm that exhibits efficient runtime and is optimal in data communication for subdividing the graph and merging the independent set.


Complexity class

It was shown in 1984 by Karp et al. that a deterministic parallel solution on a PRAM to the maximal independent set problem belongs to Nick's Class, specifically NC^4. That is to say, their algorithm finds a maximal independent set in O(\log^4 n) time using O((n/\log n)^3) processors, where n is the number of vertices. In the same paper, a randomized parallel solution was also provided, with a runtime of O(\log^4 n) using O(n^2) processors. Shortly after, Luby and Alon et al. independently improved on this result, bringing the maximal independent set problem into the realm of NC^2, with an O(\log^2 n) runtime using O(mn^2) processors, where m is the number of edges in the graph. In order to show that their algorithm is in NC^2, they initially presented a randomized algorithm that uses O(m) processors but can be derandomized with an additional O(n^2) processors. Today, it remains an open question whether the maximal independent set problem is in NC^1.


Communication and data exchange

Distributed maximal independent set algorithms are strongly influenced by algorithms on the PRAM model. The original work by Luby and Alon et al. has led to several distributed algorithms. In terms of exchange of bits, these algorithms required messages of O(\log n) bits per round and additional knowledge of the graph: for example, the size of the graph would need to be known, or it would need to be possible to query the maximum degree among the neighbours of a given vertex. In 2010, Métivier et al. reduced the required message size per round to O(1), which is optimal, and removed the need for any additional graph knowledge.

