HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, the cell-probe model is a model of computation similar to the
random-access machine In computer science, random-access machine (RAM) is an abstract machine in the general class of register machines. The RAM is very similar to the counter machine but with the added capability of 'indirect addressing' of its registers. Like the cou ...
, except that all operations are free except memory access. This model is useful for proving lower bounds of algorithms for data structure problems.


Overview

The cell-probe model is a minor modification of the
random-access machine In computer science, random-access machine (RAM) is an abstract machine in the general class of register machines. The RAM is very similar to the counter machine but with the added capability of 'indirect addressing' of its registers. Like the cou ...
model, itself a minor modification of the
counter machine A counter machine is an abstract machine used in a formal logic and theoretical computer science to model computation. It is the most primitive of the four types of register machines. A counter machine comprises a set of one or more unbounded ''re ...
model A model is an informative representation of an object, person or system. The term originally denoted the Plan_(drawing), plans of a building in late 16th-century English, and derived via French and Italian ultimately from Latin ''modulus'', a mea ...
, in which computational cost is only assigned to accessing units of memory called cells. In this model, computation is framed as a problem of querying a set of memory cells. The problem has two phases: the preprocessing phase and the query phase. The input to the first phase, the preprocessing phase, is a set of data from which to build some structure from memory cells. The input to the second phase, the query phase, is a query datum. The problem is to determine if the query datum was included in the original input data set. Operations are free except to access memory cells. This model is useful in the analysis of data structures. In particular, the model clearly shows a minimum number of memory accesses to solve a problem in which there is stored data on which we would like to run some query. An example of such a problem is the dynamic
partial sum In mathematics, a series is, roughly speaking, a description of the operation of adding infinitely many quantities, one after the other, to a given starting quantity. The study of series is a major part of calculus and its generalization, math ...
problem.


History

In
Andrew Yao Andrew Chi-Chih Yao (; born December 24, 1946) is a Chinese computer scientist and computational theorist. He is currently a professor and the dean of Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University. Yao use ...
's 1981 paper "Should Tables Be Sorted?", Yao described the cell-probe model and used it to give a minimum number of memory cell "probes" or accesses necessary to determine whether a given query datum exists within a table stored in memory.


Formal definition

Given a set of data S construct a structure consisting of c memory cells, each able to store w bits. Then when given a query element s determine whether s \in S with correctness 1 - \varepsilon by accessing t memory cells. Such an algorithm is called an \varepsilon-error t-probe algorithm using c cells with word size w.


Notable results


Dynamic Partial Sums

The dynamic partial sum problem defines two operations which conceptually operation sets the value in an array at index to be , and which returns the sum of the values in at indices through . Such an implementation would take O(1) time for and O(n) time for . Instead, if the values are stored as leaves in a tree whose inner nodes store the values of the subtree rooted at that node. In this structure requires O(\log n) time to update each node in the leaf to root path, and similarly requires O(\log n) time to traverse the tree from leaf to root summing the values of all subtrees left of the query index. Mihai Pătraşcu used the cell-probe model and an information transfer argument to show that the partial sums problem requires \Omega\left(\log n\right) time per operation.


Approximate Nearest Neighbor Searching

The exact
nearest neighbor search Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point. Closeness is typically expressed in terms of a dissimilarity function ...
problem is to determine the closest in a set of input points to a given query point. An approximate version of this problem is often considered since many applications of this problem are in very high dimension spaces and solving the problem in high dimensions requires exponential time or space with respect to the dimension. Chakrabarti and Regev proved that the approximate nearest neighbor search problem on the
Hamming cube In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to ch ...
using polynomial storage and d^ word size requires a worst-case query time of \Omega\left(\frac\right). This proof used the cell-probe model and information theoretic techniques for communication complexity.


External links


NIST's Dictionary of Algorithms and Data Structures entry on the cell-probe model


References

{{reflist Register machines Models of computation