Linear Probing
Linear probing is a scheme in computer programming for resolving hash collisions in hash tables, data structures for maintaining a collection of key–value pairs and looking up the value associated with a given key. It was invented in 1954 by Gene Amdahl, Elaine M. McGraw, and Arthur Samuel and first analyzed in 1963 by Donald Knuth. Along with quadratic probing and double hashing, linear probing is a form of open addressing. In these schemes, each cell of a hash table stores a single key–value pair. When the hash function causes a collision by mapping a new key to a cell of the hash table that is already occupied by another key, linear probing searches the table for the closest following free location and inserts the new key there. Lookups are performed in the same way, by searching the table sequentially starting at the position given by the hash function, until finding a cell with a matching key or an empty cell, which indicates that the key is not in the table ...
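To make the probe sequence concrete, here is a minimal Python sketch of insertion and lookup under linear probing. It is an illustrative sketch only: the class name LinearProbingTable is ours, and resizing and deletion are omitted.

    class LinearProbingTable:
        """Minimal linear-probing hash table (illustrative sketch)."""

        def __init__(self, capacity=8):
            self.capacity = capacity
            self.slots = [None] * capacity  # each slot holds (key, value) or None

        def _probe(self, key):
            """Yield slot indices, starting at the key's hash position."""
            start = hash(key) % self.capacity
            for i in range(self.capacity):
                yield (start + i) % self.capacity  # wrap around the table end

        def insert(self, key, value):
            for idx in self._probe(key):
                if self.slots[idx] is None or self.slots[idx][0] == key:
                    self.slots[idx] = (key, value)  # closest following free slot
                    return
            raise RuntimeError("table is full")  # a real table would resize

        def lookup(self, key):
            for idx in self._probe(key):
                if self.slots[idx] is None:
                    return None  # empty cell: the key is not in the table
                if self.slots[idx][0] == key:
                    return self.slots[idx][1]
            return None

For example, after t = LinearProbingTable(); t.insert("a", 1), the call t.lookup("a") returns 1 even if "a" collided with an earlier key.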


MurmurHash
MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup. It was created by Austin Appleby in 2008 and, as of 8 January 2016, is hosted on GitHub along with its test suite named SMHasher. It also exists in a number of variants, all of which have been released into the public domain. The name comes from two basic operations, multiply (MU) and rotate (R), used in its inner loop. Unlike cryptographic hash functions, it is not specifically designed to be difficult to reverse by an adversary, making it unsuitable for cryptographic purposes.

Variants

MurmurHash1
The original MurmurHash was created as an attempt to make a faster function than Lookup3. Although successful, it had not been tested thoroughly and was not capable of providing 64-bit hashes as in Lookup3. Its design would later be built upon in MurmurHash2, combining a multiplicative hash (similar to the Fowler–Noll–Vo hash function) with an Xorshift.

MurmurHash2
MurmurHash2 yields a 32- or 64-bit value ...
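The multiply-and-rotate pattern that gives MurmurHash its name can be sketched in Python. This is a toy illustration in the MurmurHash spirit, not a reproduction of any published variant; the constants are placeholders, and tail bytes beyond a multiple of four are simply ignored.

    MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned overflow in Python

    def rotl32(x, r):
        """Rotate a 32-bit value left by r bits."""
        return ((x << r) | (x >> (32 - r))) & MASK32

    def toy_murmur_style_mix(data: bytes, seed: int = 0) -> int:
        """Toy hash built from the two eponymous steps: multiply (MU), rotate (R)."""
        h = seed & MASK32
        for i in range(0, len(data) - len(data) % 4, 4):
            k = int.from_bytes(data[i:i + 4], "little")  # one 4-byte block
            k = (k * 0x9E3779B1) & MASK32                # multiply step
            k = rotl32(k, 15)                            # rotate step
            h = (h ^ k) & MASK32
            h = (rotl32(h, 13) * 5 + 0x6B64E653) & MASK32
        h ^= len(data)
        h ^= h >> 16                                     # final avalanche:
        h = (h * 0x85EBCA6B) & MASK32                    # xor-shift and multiply
        h ^= h >> 13
        return h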



Stirling's Approximation
In mathematics, Stirling's approximation (or Stirling's formula) is an asymptotic approximation for factorials. It is a good approximation, leading to accurate results even for small values of n. It is named after James Stirling, though a related but less precise result was first stated by Abraham de Moivre. One way of stating the approximation involves the logarithm of the factorial: \ln(n!) = n\ln n - n + O(\ln n), where the big O notation means that, for all sufficiently large values of n, the difference between \ln(n!) and n\ln n - n will be at most proportional to the logarithm of n. In computer science applications such as the worst-case lower bound for comparison sorting, it is convenient to instead use the binary logarithm, giving the equivalent form \log_2(n!) = n\log_2 n - n\log_2 e + O(\log_2 n). The error term in either base can be expressed more precisely as \tfrac12\log(2\pi n) + O(\tfrac1n), corresponding to an approximate formula for the factorial itself, n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n ...
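A quick numerical check of the refined formula is easy in Python (a sketch; the helper name is ours):

    import math

    def stirling_ln_factorial(n):
        """ln(n!) approximated by n*ln(n) - n + 0.5*ln(2*pi*n)."""
        return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

    for n in (5, 10, 100):
        exact = math.lgamma(n + 1)  # lgamma(n + 1) equals ln(n!)
        print(n, exact - stirling_ln_factorial(n))  # error shrinks like 1/(12n)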



Chernoff Bound
In probability theory, a Chernoff bound is an exponentially decreasing upper bound on the tail of a random variable based on its moment generating function. The minimum of all such exponential bounds forms ''the'' Chernoff or Chernoff–Cramér bound, which may decay faster than exponential (e.g. sub-Gaussian). It is especially useful for sums of independent random variables, such as sums of Bernoulli random variables. The bound is commonly named after Herman Chernoff who described the method in a 1952 paper, though Chernoff himself attributed it to Herman Rubin. In 1938 Harald Cramér had published an almost identical concept now known as Cramér's theorem. It is a sharper bound than the first- or second-moment-based tail bounds such as Markov's inequality or Chebyshev's inequality, which only yield power-law bounds on tail decay. However, when applied to sums the Chernoff bound requires the random variables to be independent, a condition that is not required by either Markov's inequality or Chebyshev's inequality ...
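The generic argument behind the bound is short (a standard derivation, stated here for convenience): for any t > 0, applying Markov's inequality to the nonnegative random variable e^{tX} gives \Pr[X \ge a] = \Pr[e^{tX} \ge e^{ta}] \le e^{-ta}\,\mathrm{E}[e^{tX}], and minimizing the right-hand side over t > 0 yields \Pr[X \ge a] \le \inf_{t > 0} e^{-ta}\,\mathrm{E}[e^{tX}], the Chernoff bound in terms of the moment generating function \mathrm{E}[e^{tX}].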



Load Factor (computer science)
In computer science, a hash table is a data structure that implements an associative array, also called a dictionary or simply map; an associative array is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', also called a ''hash code'', into an array of ''buckets'' or ''slots'', from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored. A map implemented by a hash table is called a hash map. Most hash table designs employ an imperfect hash function. Hash collisions, where the hash function generates the same index for more than one key, therefore typically must be accommodated in some way. In a well-dimensioned hash table, the average time complexity for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key–value pairs, at amortized constant average cost per operation ...
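A minimal illustration of the index computation and of the load factor n/m (the function names here are ours):

    def bucket_index(key, num_buckets):
        """Map a key to a bucket via its hash code."""
        return hash(key) % num_buckets

    def load_factor(num_entries, num_buckets):
        """Fraction of buckets in use: n / m."""
        return num_entries / num_buckets

    keys = ["alpha", "beta", "gamma"]
    m = 8
    print([bucket_index(k, m) for k in keys])  # varies per run: Python salts str hashes
    print(load_factor(len(keys), m))           # 0.375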



Big O Notation
Big ''O'' notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by German mathematicians Paul Bachmann, Edmund Landau, and others, collectively called Bachmann–Landau notation or asymptotic notation. The letter O was chosen by Bachmann to stand for ''Ordnung'', meaning the order of approximation. In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. In analytic number theory, big O notation is often used to express a bound on the difference between an arithmetical function and a better understood approximation; one well-known example is the remainder term in the prime number theorem ...
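For reference, the formal definition can be stated in one line: f(n) = O(g(n)) as n \to \infty means that there exist constants C > 0 and n_0 such that |f(n)| \le C\,|g(n)| for all n \ge n_0.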


Expected Time
In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case complexity, which considers the maximal complexity of the algorithm over all possible inputs. There are three primary motivations for studying average-case complexity. First, although some problems may be intractable in the worst case, the inputs which elicit this behavior may rarely occur in practice, so the average-case complexity may be a more accurate measure of an algorithm's performance. Second, average-case complexity analysis provides tools and techniques to generate hard instances of problems which can be utilized in areas such as cryptography and derandomization. Third, average-case complexity allows discriminating the most efficient algorithm in practice among algorithms of equivalent best-case complexity (for instance Quicksort). ...
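A standard worked example (ours, for illustration): in a successful linear search where the sought key is equally likely to be in any of n positions, position i is reached after i comparisons with probability 1/n, so the expected number of comparisons is \sum_{i=1}^{n} i/n = (n+1)/2, i.e. \Theta(n), even though the best case is a single comparison.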


Primary Clustering
In computer programming, primary clustering is a phenomenon that causes performance degradation in linear-probing hash tables: as elements are added to a linear-probing hash table, they tend to cluster together into long runs (i.e., long contiguous regions of the hash table that contain no free slots). If the hash table is at a load factor of 1 - 1/x for some parameter x \ge 2, then the expected length of the run containing a given element u is \Theta(x^2). This causes insertions and negative queries to take expected time \Theta(x^2) in a linear-probing hash table; a simulation illustrating this growth appears after the list below.

Causes of primary clustering
Primary clustering has two causes:
* Winner keeps winning: The longer a run becomes, the more likely it is to accrue additional elements. This creates a positive feedback loop that contributes to the clustering effect. However, this alone would not cause the quadratic blowup.
* Joining of runs: A single insertion may not only increase the length ...
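The following Python simulation sketches the quadratic growth claimed above; the function name is ours, and random table positions stand in for a hash function:

    import random

    def expected_run_length(table_size, x):
        """Fill a linear-probing table to load factor 1 - 1/x, then return the
        average length of the run containing a randomly chosen element."""
        occupied = [False] * table_size
        n = int(table_size * (1 - 1 / x))
        for _ in range(n):
            i = random.randrange(table_size)  # hash position of a new key
            while occupied[i]:
                i = (i + 1) % table_size      # linear probing
            occupied[i] = True
        # Collect maximal runs, scanning from an empty slot so a run that
        # wraps around the end of the table is not split in two.
        start = occupied.index(False)
        runs, run = [], 0
        for j in range(table_size):
            if occupied[(start + j) % table_size]:
                run += 1
            elif run:
                runs.append(run)
                run = 0
        if run:
            runs.append(run)
        # A random *element* lands in a run with probability proportional to
        # the run's length, so the expectation is the size-biased mean.
        return sum(L * L for L in runs) / n

    for x in (2, 4, 8):
        print(x, expected_run_length(200_000, x))  # grows roughly like x**2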




Sentinel Value
In computer programming, a sentinel value (also referred to as a flag value, trip value, rogue value, signal value, or dummy data) is a special value in the context of an algorithm which uses its presence as a condition of termination, typically in a loop or recursive algorithm. The sentinel value is a form of in-band data that makes it possible to detect the end of the data when no out-of-band data (such as an explicit size indication) is provided. The value should be selected in such a way that it is guaranteed to be distinct from all legal data values, since otherwise the presence of such values would prematurely signal the end of the data (the semipredicate problem). A sentinel value is sometimes known as an "Elephant in Cairo", due to a joke where this is used as a physical sentinel. In safe languages, most sentinel values could be replaced with option types, which enforce explicit handling of the exceptional case.

Examples
Some examples of common sentinel values and their uses ...
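A classic Python illustration (a sketch; -1 is an arbitrary sentinel, assumed here to lie outside the legal data range):

    SENTINEL = -1  # assumes all legal readings are non-negative

    def sum_until_sentinel(readings):
        """Accumulate values until the in-band end-of-data marker appears."""
        total = 0
        for value in readings:
            if value == SENTINEL:
                break  # sentinel reached: stop without consuming further data
            total += value
        return total

    print(sum_until_sentinel([3, 7, 2, -1, 99]))  # -> 12; the 99 is never read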


Lazy Deletion
In computer science, lazy deletion refers to a method of deleting elements from a hash table that uses open addressing (also called closed hashing), in which a collision is resolved by probing through alternative locations in the array. In this method, deletions are done by marking an element as deleted, rather than erasing it entirely. Deleted locations are treated as empty when inserting and as occupied during a search. The deleted locations are sometimes referred to as ''tombstones''. The problem with this scheme is that as the number of delete/insert operations increases, the cost of a successful search increases. To improve this, when an element is searched and found in the table, the element is relocated to the first location marked for deletion that was probed during the search. Instead of finding an element to relocate when the deletion occ ...
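A sketch of the basic tombstone discipline in Python (the names are ours, and the relocation optimization described above is omitted):

    EMPTY, DELETED = object(), object()  # distinct markers for slot states

    class LazyDeletionTable:
        """Open-addressing table with tombstone-based (lazy) deletion."""

        def __init__(self, capacity=8):
            self.capacity = capacity
            self.slots = [EMPTY] * capacity

        def _probe(self, key):
            start = hash(key) % self.capacity
            for i in range(self.capacity):
                yield (start + i) % self.capacity

        def insert(self, key):
            first_tombstone = None
            for idx in self._probe(key):
                slot = self.slots[idx]
                if slot is DELETED:
                    if first_tombstone is None:
                        first_tombstone = idx  # reusable, but keep probing
                elif slot is EMPTY:
                    # key is absent; reuse the earliest tombstone if one was seen
                    target = idx if first_tombstone is None else first_tombstone
                    self.slots[target] = key
                    return
                elif slot == key:
                    return  # already present
            if first_tombstone is not None:
                self.slots[first_tombstone] = key
                return
            raise RuntimeError("table is full")

        def contains(self, key):
            for idx in self._probe(key):
                slot = self.slots[idx]
                if slot is EMPTY:
                    return False  # a truly empty slot ends the search
                if slot is not DELETED and slot == key:
                    return True   # tombstones are stepped over, not stopped at
            return False

        def delete(self, key):
            for idx in self._probe(key):
                slot = self.slots[idx]
                if slot is EMPTY:
                    return  # key is not in the table
                if slot is not DELETED and slot == key:
                    self.slots[idx] = DELETED  # mark as a tombstone, don't erase
                    return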



Linear Probing Deletion
In mathematics, the term ''linear'' is used in two distinct senses for two different properties:
* linearity of a ''function'' (or ''mapping'');
* linearity of a ''polynomial''.
An example of a linear function is the function defined by f(x)=(ax,bx) that maps the real line to a line in the Euclidean plane R^2 that passes through the origin. An example of a linear polynomial in the variables X, Y and Z is aX+bY+cZ+d. Linearity of a mapping is closely related to ''proportionality''. Examples in physics include the linear relationship of voltage and current in an electrical conductor (Ohm's law), and the relationship of mass and weight. By contrast, more complicated relationships, such as between velocity and kinetic energy, are ''nonlinear''. Generalized for functions in more than one dimension, linearity means the property of a function of being compatible with addition and scaling, also known as the superposition principle; the defining identities are written out below. Linearity of a polynomial means that its degree is ...
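In symbols, a map f is linear exactly when it satisfies the superposition principle: f(x + y) = f(x) + f(y) and f(cx) = c\,f(x) for all inputs x, y and all scalars c.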



Dynamic Array
In computer science, a dynamic array, growable array, resizable array, dynamic table, mutable array, or array list is a random-access, variable-size list data structure that allows elements to be added or removed. It is supplied with standard libraries in many modern mainstream programming languages. Dynamic arrays overcome a limit of static arrays, which have a fixed capacity that needs to be specified at allocation. A dynamic array is not the same thing as a dynamically allocated array or variable-length array, either of which is an array whose size is fixed when the array is allocated, although a dynamic array may use such a fixed-size array as a back end (see, for example, the source code of the java.util.ArrayList class from OpenJDK 6).

Bounded-size dynamic arrays and capacity
A simple dynamic array can be constructed by allocating an array of fixed size, typically larger than the number of elements immediately required. The elements of the dynamic array are stored contiguously at the start of the underlying array ...
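A minimal Python sketch of this construction with geometric (doubling) growth; the class name is ours:

    class DynamicArray:
        """Growable array backed by a fixed-size buffer (illustrative sketch)."""

        def __init__(self):
            self.capacity = 4                # size of the backing buffer
            self.size = 0                    # number of elements actually stored
            self.buffer = [None] * self.capacity

        def append(self, item):
            if self.size == self.capacity:
                self._grow()                 # out of room: allocate a larger buffer
            self.buffer[self.size] = item
            self.size += 1

        def _grow(self):
            # Doubling keeps the amortized cost of append constant.
            self.capacity *= 2
            new_buffer = [None] * self.capacity
            new_buffer[:self.size] = self.buffer[:self.size]
            self.buffer = new_buffer

        def __getitem__(self, i):
            if not 0 <= i < self.size:
                raise IndexError(i)
            return self.buffer[i]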