HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, a property testing algorithm for a
decision problem In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm whethe ...
is an
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
whose
query complexity In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of ''queries'' or ''tests'' that are done adaptively, so the outcome of the pre ...
to its input is much smaller than the instance size of the problem. Typically property testing algorithms are used to distinguish if some combinatorial structure ''S'' (such as a
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...
or a
boolean function In mathematics, a Boolean function is a function whose arguments and result assume values from a two-element set (usually , or ). Alternative names are switching function, used especially in older computer science literature, and truth function ( ...
) satisfies some property ''P'', or is "far" from having this property (meaning an ε-fraction of the representation of ''S'' need be modified in order to make ''S'' satisfy ''P''), using only a small number of "local" queries to the object. For example, the following
promise problem In computational complexity theory, a promise problem is a generalization of a decision problem where the input is promised to belong to a particular subset of all possible inputs. Unlike decision problems, the ''yes'' instances (the inputs for whi ...
admits an algorithm whose query complexity is independent of the instance size (for an arbitrary constant ε > 0): :"Given a graph ''G'' on ''n'' vertices, decide if ''G'' is bipartite, or ''G'' cannot be made bipartite even after removing an arbitrary subset of at most \epsilon\tbinom n2 edges of ''G''." Property testing algorithms are central to the definition of
probabilistically checkable proof In computational complexity theory, a probabilistically checkable proof (PCP) is a type of proof that can be checked by a randomized algorithm using a bounded amount of randomness and reading a bounded number of bits of the proof. The algorithm is ...
s, as a probabilistically checkable proof is essentially a proof that can be verified by a property testing algorithm.


Definition and variants

Formally, a property testing algorithm with query complexity ''q''(''n'') and ''proximity parameter'' ε for a decision problem ''L'' is a
randomized algorithm A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performan ...
that, on input ''x'' (an instance of ''L'') makes at most ''q''(, ''x'', ) queries to ''x'' and behaves as follows: * If ''x'' is in ''L'', the algorithm accepts ''x'' with probability at least ⅔. * If ''x'' is ε-far from ''L'', the algorithm rejects ''x'' with probability at least ⅔. Here, "''x'' is ε-far from ''L''" means that the Hamming distance between ''x'' and any string in ''L'' is at least ε, ''x'', . A property testing algorithm is said to have ''one-sided error'' if it satisfies the stronger condition that the accepting probability for instances ''x ∈ L'' is 1 instead of ⅔. A property testing algorithm is said be ''non-adaptive'' if it performs all its queries before it "observes" any answers to previous queries. Such an algorithm can be viewed as operating in the following manner. First the algorithm receives its input. Before looking at the input, using its internal randomness, the algorithm decides which symbols of the input are to be queried. Next, the algorithm observes these symbols. Finally, without making any additional queries (but possibly using its randomness), the algorithm decides whether to accept or reject the input.


Features and limitations

The main efficiency parameter of a property testing algorithm is its query complexity, which is the maximum number of input symbols inspected over all inputs of a given length (and all random choices made by the algorithm). One is interested in designing algorithms whose query complexity is as small as possible. In many cases the running time of property testing algorithms is
sublinear In linear algebra, a sublinear function (or functional as is more often used in functional analysis), also called a quasi-seminorm or a Banach functional, on a vector space X is a real-valued function with only some of the properties of a seminorm. ...
in the instance length. Typically, the goal is first to make the query complexity as small as possible as a function of the instance size ''n'', and then study the dependency on the proximity parameter ε. Unlike other complexity-theoretic settings, the asymptotic query complexity of property testing algorithms is affected dramatically by the representation of instances. For example, when ε = 0.01, the problem of testing bipartiteness of ''dense graphs'' (which are represented by their
adjacency matrix In graph theory and computer science, an adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph. In the special case of a finite simp ...
) admits an algorithm of constant query complexity. In contrast, sparse graphs on ''n'' vertices (which are represented by their adjacency list) require property testing algorithms of query complexity \Omega(\sqrt). The query complexity of property testing algorithms grows as the proximity parameter ε becomes smaller for all non-trivial properties. This dependence on ε is necessary as a change of fewer than ε symbols in the input cannot be detected with constant probability using fewer than O(1/ε) queries. Many interesting properties of dense graphs can be tested using query complexity that depends only on ε and not on the graph size ''n''. However, the query complexity can grow enormously fast as a function of ε. For example, for a long time the best known algorithm for testing if a graph does not contain any triangle had a query complexity which is a tower function of ''poly''(1/ε), and only in 2010 this has been improved to a tower function of ''log''(1/ε). One of the reasons for this enormous growth in bounds is that many of the positive results for property testing of graphs are established using the
Szemerédi regularity lemma Szemerédi's regularity lemma is one of the most powerful tools in extremal graph theory, particularly in the study of large dense graphs. It states that the vertices of every large enough graph can be partitioned into a bounded number of parts so ...
, which also has tower-type bounds in its conclusions. The connection of property testing to the Szemerédi regularity lemma and related graph removal lemmas is elaborated on below.


Testing graph properties

For a graph ''G'' with ''n'' vertices, the notion of distance we will use is the ''
edit distance In computational linguistics and computer science, edit distance is a string metric, i.e. a way of quantifying how dissimilar two strings (e.g., words) are to one another, that is measured by counting the minimum number of operations required to tr ...
''. That is, we say that the distance between two graphs is the smallest ε such that one can add and/or delete \varepsilon n^2 edges and get from the first graph to the second. Under a reasonable representation of graphs, this is equivalent to the earlier ''
Hamming distance In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to chan ...
'' definition (up to possibly a change of constants). To make precise the general notions of property testing in the context of graphs, we say a tester for graph property ''P'' should distinguish with at least ⅔ probability between the cases of ''G'' satisfying ''P'' and the cases where ''G'' is ε-far in edit distance from satisfying ''P''. The tester can access some
oracle An oracle is a person or agency considered to provide wise and insightful counsel or prophetic predictions, most notably including precognition of the future, inspired by deities. As such, it is a form of divination. Description The word '' ...
to query whether a pair of vertices has an edge between them in ''G'' or not. The ''query complexity'' is the number of such oracle queries. Say the tester has ''one-sided error'' if it has false positives and not false negatives, i.e. if ''G'' satisfies ''P'', the tester always outputs the correct answer. We can only differentiate between graphs that satisfy ''P'' versus those far from ''P'' as opposed to satisfy versus not satisfy ''P''. In the latter case, consider two graphs: ''G'' satisfying ''P'' and ''H'' not satisfying ''P'' by changing only a few edges. One example is testing triangle-freeness with ''H'' a graph with exactly one triangle and ''G'' having one of these edges removed. Then, the tester cannot tell them apart unless it queries every edge, which it cannot do.


Short history

The field of graph property testing was first introduced by Goldreich, Goldwasser, and Ron. In their seminal paper published in 1998, an abstract graph partition problem is analyzed and some testers provided. These include as special cases several important graph properties like bipartiteness, k-colorability, having a large
clique A clique ( AusE, CanE, or ), in the social sciences, is a group of individuals who interact with one another and share similar interests. Interacting with cliques is part of normative social development regardless of gender, ethnicity, or popular ...
, and having a large
cut Cut may refer to: Common uses * The act of cutting, the separation of an object into two through acutely-directed force ** A type of wound ** Cut (archaeology), a hole dug in the past ** Cut (clothing), the style or shape of a garment ** Cut (ea ...
. In particular, the ''natural'' algorithms that sample a subgraph and check whether it satisfy the property are all correct albeit with perhaps suboptimal query complexities. Since then, several related discoveries have been made * In 1992, Alon, Duke, Lefmann, Rödl, and Yuster showed that for every graph ''H'' the property of not contain ''H'' as a subgraph is testable. * In 1999, Alon, Fischer, Krivelevich, and Szegedy showed that for every graph ''H'' the property of not contain ''H'' as an induced subgraph subgraph is testable. * In 2005, Alon and Shapira showed that any ''monotone graph property'' (one that is preserved under vertex and edge deletion) is testable with one-sided error. * In 2008, Alon and Shapira exhibited testers with one-sided error for all ''
hereditary Heredity, also called inheritance or biological inheritance, is the passing on of traits from parents to their offspring; either through asexual reproduction or sexual reproduction, the offspring cells or organisms acquire the genetic inform ...
'' graph properties. They also characterized properties are easy to test. Namely, these natural properties are ''semi-hereditary''. These statements will be made clearer below.


Testing hereditary graph properties

A graph property is ''
hereditary Heredity, also called inheritance or biological inheritance, is the passing on of traits from parents to their offspring; either through asexual reproduction or sexual reproduction, the offspring cells or organisms acquire the genetic inform ...
'' if it is preserved under deletion of vertices, or equivalently, if it is preserved under taking
induced subgraph In the mathematical field of graph theory, an induced subgraph of a graph is another graph, formed from a subset of the vertices of the graph and ''all'' of the edges (from the original graph) connecting pairs of vertices in that subset. Defini ...
s, hence the name hereditary. A few important hereditary properties are ''H''-freeness (for some graph ''H''), ''k''-colorability, and
planarity Planarity is a puzzle computer game by John Tantalo, based on a concept by Mary Radcliffe at Western Michigan University. The name comes from the concept of planar graphs in graph theory; these are graphs that can be embedded in the Euclidean pla ...
. All hereditary properties are testable. :Theorem (Alon & Shapira 2008). Every hereditary graph property is testable with one-sided error. The proof relies on a version of the
graph removal lemma In graph theory, the graph removal lemma states that when a graph contains few copies of a given subgraph, then all of the copies can be eliminated by removing a small number of edges. The special case in which the subgraph is a triangle is known a ...
for infinite families of induced subgraphs. We reproduce the theorem here without proof. In particular, note that constant h_ come up naturally as the size of the samples. Also, the query complexity using this regularity approach is large due to the tower function bound in
Szemerédi regularity lemma Szemerédi's regularity lemma is one of the most powerful tools in extremal graph theory, particularly in the study of large dense graphs. It states that the vertices of every large enough graph can be partitioned into a bounded number of parts so ...
. :Theorem (Infinite graph removal lemma). For each (possibly infinite) set of graphs \mathcal and \varepsilon >0 , there exist h_ and \delta >0 so that if G is an n-vertex graph with fewer than \delta n^ copies of H for every H\in \mathcal with at most h_ vertices, then G can be made induced \mathcal -free by adding/removing fewer than \varepsilon n^ edges.


Oblivious testers

Informally, an ''oblivious tester'' is oblivious to the size of the input. For a graph property ''P'', it is an algorithm that takes as input a parameter ε and graph ''G'', and then runs as a property testing algorithm on ''G'' for the property ''P'' with proximity parameter ε that makes exactly ''q''(ε) queries to ''G''. :Definition. An oblivious tester is an algorithm that takes as input a parameter ε. It computes an integer ''q''(ε) and then asks an oracle for an induced subgraph ''H'' on exactly ''q''(ε) vertices from ''G'' chosen uniformly at random. It then accepts or rejects (possibly randomly) according to ε and ''H''. We say it tests for the property ''P'' if it accepts with probability at least ⅔ for ''G'' that has property ''P'', and rejects with probability at least ⅔ or ''G'' that is ε-far from having property ''P''. Crucially, the number of queries an oblivious tester makes is a constant only dependent on ε and not the size of the input graph ''G''. In complete analogy with property testing algorithms, we can talk about oblivious testers with one-sided error.


Testing semi-hereditary graph properties

We can certainly contrive some graph properties where a tester for it must access the number of vertices. Here is one example. : Example. A graph ''G'' satisfies property ''P'' if it is bipartite with an even number of vertices or perfect with an odd number of vertices. In this case, the tester cannot even differentiate which property (bipartiteness or perfectness) to test unless it knows the number of vertices. There are many examples of such unnatural properties. In fact, the characterization of graph properties testable by an oblivious tester with one-sided error leads to a class of natural properties. :Definition. A graph property ''H'' is semi-hereditary if there exists a hereditary graph property ''H'' such that any graph satisfying ''P'' satisfies ''H'', and for every \varepsilon >0 there is an M(\varepsilon) such that every graph of size at least M(\varepsilon) that is ε-far from satisfying ''P'' contains an induced subgraph that does not satisfy ''H''. Trivially, hereditary properties are also semi-hereditary. This characterization partially answers the converse to the other Alon & Shapira theorem above: the properties that are easy to test properties (having oblivious testers with one-sided error) are almost hereditary. In the same paper, they showed :Theorem (Alon & Shapira 2008). A graph property ''P'' has an oblivious one-sided error tester if and only if ''P'' is semi-hereditary.


Examples: testing some graph properties

In this section, we will give some ''natural'' oblivious testing algorithms with one-sided error for triangle-freeness, bipartiteness, and ''k''-colorability. They are natural in the sense that we follow the natural idea of randomly sampling some subset ''X'' of vertices of ''G'' and checking whether the graph property holds on the subgraph spanned by ''X'' by
brute-force search In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
. We have one-sided error since these properties are actually hereditary: if ''G'' satisfy the property, so must the induced subgraph spanned by ''X'', so our tester always accepts. For triangle-freeness, the tester is an application of the triangle removal lemma. In particular, it tells us that if graph ''G'' is ε-far from being triangle-free, there is a (computable) constant \delta=\delta(\varepsilon) so that ''G'' has at least \delta n^3 triangles.
Example (Triangle-freeness Testing Algorithm). # Given graph ''G'', choose a random set ''X'' of q(\varepsilon)=1/\delta triples of vertices independently at random, where ''δ'' is as above. # For every triple of vertices in ''X'', query if all three pairs of vertices are adjacent in ''G''. # The algorithm ''accepts'' if no triple of vertices induces a triangle, and ''rejects'' otherwise.
For bipartiteness and ''k''-colorability, let ''δ'' be the desired upper bound on error probability for the following testers. Note, query complexity should not be confused with running time. The latter is often exponential (as is the case of both) due to a lack of polynomial time decision algorithm to test the peroperty on the induced subgraph. We instead check by
brute-force search In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
.
Example (Bipartite Testing Algorithm). # Given graph ''G'', choose a random set ''X'' of q(\varepsilon)=O(\log (1/(\varepsilon \delta))/\varepsilon^) vertices. # For every pair of vertices in ''X'', query if they are adjacent in ''G''. # It ''accepts'' if the induced subgraph of ''G'' on ''X'' is bipartite and ''reject'' otherwise.
Example (k-colorability Testing Algorithm). # Given graph ''G'', choose a random set ''X'' of q(\varepsilon)=O(k^\log^ (k/\delta)/\varepsilon ^) vertices. # For every pair of vertices in ''X'', query if they are adjacent in ''G''. # It ''accepts'' if the induced subgraph of ''G'' on ''X'' is k-colorable and ''reject'' otherwise.


References

{{reflist, refs= {{cite arXiv , last=Fox , first=Jacob , eprint=1006.1300 , title= A new proof of the graph removal lemma , date=2010 , class=math.CO {{cite journal , title=A characterization of the (natural) graph properties testable with one-sided error., last1=Alon, first1=Noga, authorlink=Noga Alon , last2=Shapira , first2=Asaf, journal=SIAM Journal on Computing , volume=37 , pages=1703–1727 , year=2008 , issue=6, doi=10.1137/06064888X, url=https://www.tau.ac.il/~nogaa/PDFS/heredit2.pdf {{cite journal , last1=Goldreich , first1=Oded , last2=Goldwasser , first2=Shari , last3=Ron , first3=Dana , title=Property testing and its connection to learning and approximation , journal=Journal of the ACM , date=1 July 1998 , volume=45 , issue=4 , pages=653–750 , doi=10.1145/285055.285060 {{cite book , first=Oded , last=Goldreich , authorlink=Oded Goldreich , title= Introduction to Property Testing , year=2017 , publisher=Cambridge University Press , isbn=9781107194052 {{cite techreport , first=Dana , last=Ron , authorlink=Dana Ron , title=Property Testing , year=2000 {{cite journal , last1=Rubinfeld , first1=Ronitt , last2=Shapira , first2=Asaf , title=Sublinear Time Algorithms , journal=SIAM Journal on Discrete Mathematics , volume=25 , number=4 , pages=1562–1588 , year=2011 , doi=10.1137/100791075 , citeseerx=10.1.1.221.1797 , s2cid=1319122 {{cite journal , title=Combinatorial Property Testing (a survey) , last=Goldreich , first=Oded , authorlink=Oded Goldreich , journal=Randomization Methods in Algorithm Design , series=DIMACS Series in Discrete Mathematics and Theoretical Computer Science , volume=43 , pages=45–59 , year=1999 , doi=10.1090/dimacs/043/04 , isbn=0821870874 , url=http://www.wisdom.weizmann.ac.il/~oded/PS/testSU.ps {{cite journal , last1=Alon , first1=N. , last2=Duke , first2=R. A. , last3=Lefmann , first3=H. , last4=Rodl , first4=V. , last5=Yuster , first5=R. , title=The Algorithmic Aspects of the Regularity Lemma , journal=Journal of Algorithms , date=1 January 1994 , volume=16 , issue=1 , pages=80–109 , doi=10.1006/jagm.1994.1005 {{cite journal , last1=Alon , first1=Noga , last2=Fischer , first2=Eldar , last3=Krivelevich , first3=Michael , last4=Szegedy , first4=Mario , title=Efficient Testing of Large Graphs , journal=Combinatorica , date=1 April 2000 , volume=20 , issue=4 , pages=451–476 , doi=10.1007/s004930070001 {{cite journal , last1=Alon , first1=Noga , last2=Shapira , first2=Asaf , title=Every monotone graph property is testable , journal=Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing , date=22 May 2005 , pages=128–137 , doi=10.1145/1060590.1060611, isbn=1581139608 , s2cid=14096855 Approximation algorithms Randomized algorithms Theoretical computer science