Streaming Algorithm
   HOME
*





Streaming Algorithm
In computer science, streaming algorithms are algorithms for processing data streams in which the input is presented as a sequence of items and can be examined in only a few passes (typically just one). In most models, these algorithms have access to limited memory (generally logarithmic in the size of and/or the maximum value in the stream). They may also have limited processing time per item. These constraints may mean that an algorithm produces an approximate answer based on a summary or "sketch" of the data stream. History Though streaming algorithms had already been studied by Munro and Paterson as early as 1978, as well as Philippe Flajolet and G. Nigel Martin in 1982/83, the field of streaming algorithms was first formalized and popularized in a 1996 paper by Noga Alon, Yossi Matias, and Mario Szegedy. For this paper, the authors later won the Gödel Prize in 2005 "for their foundational contribution to streaming algorithms." There has since been a large body of work cen ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Computer Science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical disciplines (including the design and implementation of Computer architecture, hardware and Computer programming, software). Computer science is generally considered an area of research, academic research and distinct from computer programming. Algorithms and data structures are central to computer science. The theory of computation concerns abstract models of computation and general classes of computational problem, problems that can be solved using them. The fields of cryptography and computer security involve studying the means for secure communication and for preventing Vulnerability (computing), security vulnerabilities. Computer graphics (computer science), Computer graphics and computational geometry address the generation of images. Progr ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Elephant Flow
In computer networking, an elephant flow is an extremely large (in total bytes) continuous flow set up by a TCP (or other protocol) flow measured over a network link. Elephant flows, though not numerous, can occupy a disproportionate share of the total bandwidth over a period of time. It is not clear who coined "elephant flow", but the term began occurring in published Internet network research in 2001 when the observations were made that a small number of flows carry the majority of Internet traffic and the remainder consists of a large number of flows that carry very little Internet traffic ( mice flows). For example, researchers Mori et al. studied the traffic flows on several Japanese universities and research networks. At the WIDE network they found elephant flows were only 4.7% of all flows but occupied 41.3% of all data transmitted during the time period. The actual impact of elephant flows on Internet traffic is still an area of research and debate. Some research shows th ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Jelani Nelson
Jelani Osei Nelson ( am, ጄላኒ ኔልሰን; born June 28, 1984) is an Ethiopian-American Professor of Electrical Engineering and Computer Sciences at the University of California, Berkeley. He won the 2014 Presidential Early Career Award for Scientists and Engineers. Nelson is the creator of ''AddisCoder'', a computer science summer program for Ethiopian high school students in Addis Ababa. Early life and education Nelson was born to an Ethiopian mother and an African-American father in Los Angeles, then grew up in St. Thomas, U.S. Virgin Islands. He studied mathematics and computer science at the Massachusetts Institute of Technology and remained there to complete his doctoral studies in computer science. His Master's dissertation, ''External-Memory Search Trees with Fast Insertions'', was supervised by Bradley C. Kuszmaul and Charles E. Leiserson. He was a member of the theory of computation group, working on efficient algorithms for massive datasets. His doctoral dissert ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Daniel Kane (mathematician)
Daniel Mertz Kane (born 1986) is an American mathematician. He is an associate professor with a joint position in the Mathematics Department and the Computer Science and Engineering Department at the University of California, San Diego.. Early life and education Kane was born in Madison, Wisconsin, to Janet E. Mertz and Jonathan M. Kane, professors of oncology and of mathematics and computer science, respectively... The article is primarily about a study jointly authored by Kane's parents, but also mentions Kane's IMO results. He attended Wingra School, a small alternative K-8 school in Madison that focuses on self-guided education. By 3rd grade, he had mastered K through 9th-grade mathematics. Starting at age 13, he took honors math courses at the University of Wisconsin–Madison and did research under the mentorship of Ken Ono while dual enrolled at Madison West High School. He earned gold medals in the 2002 and 2003 International Mathematical Olympiads. Prior to his 17th b ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Moving Average
In statistics, a moving average (rolling average or running average) is a calculation to analyze data points by creating a series of averages of different subsets of the full data set. It is also called a moving mean (MM) or rolling mean and is a type of finite impulse response filter. Variations include: simple, cumulative, or weighted forms (described below). Given a series of numbers and a fixed subset size, the first element of the moving average is obtained by taking the average of the initial fixed subset of the number series. Then the subset is modified by "shifting forward"; that is, excluding the first number of the series and including the next value in the subset. A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. It is also used in economics ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Misra–Gries Summary
In the field of streaming algorithms, Misra–Gries summaries are used to solve the frequent elements problem in the data stream model. That is, given a long stream of input that can only be examined once (and in some arbitrary order), the Misra-Gries algorithm can be used to compute which (if any) value makes up a majority of the stream, or more generally, the set of items that constitute some fixed fraction of the stream. The term "summary" is due to Graham Cormode. The algorithm is also called the Misra–Gries heavy hitters algorithm. The Misra–Gries summary As for all algorithms in the data stream model, the input is a finite sequence of integers from a finite domain. The algorithm outputs an associative array which has values from the stream as keys, and estimates of their frequency as the corresponding values. It takes a parameter which determines the size of the array, which impacts both the quality of the estimates and the amount of memory used. algorithm misra ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Misra–Gries Heavy Hitters Algorithm
Misra and Gries defined the ''heavy-hitters problem'' (though they did not introduce the term ''heavy-hitters'') and described the first algorithm for it in the paper ''Finding repeated elements''. Their algorithm extends the Boyer-Moore majority finding algorithm in a significant way. One version of the heavy-hitters problem is as follows: Given is a bag of elements and an integer . Find the values that occur more than times in . The Misra-Gries algorithm solves the problem by making two passes over the values in , while storing at most values from and their number of occurrences during the course of the algorithm. Misra-Gries is one of the earliest streaming algorithms, and it is described below in those terms in section #Summaries. Misra–Gries algorithm A bag is a like a set in which the same value may occur multiple times. Assume that a bag is available as an array of elements. In the abstract description of the algorithm, we treat and its segments also as bags ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Bloom Filter
A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with the counting Bloom filter variant); the more items added, the larger the probability of false positives. Bloom proposed the technique for applications where the amount of source data would require an impractically large amount of memory if "conventional" error-free hashing techniques were applied. He gave the example of a hyphenation algorithm for a dictionary of 500,000 words, out of which 90% follow simple hyphenation rules, but the remaining 10% require expensive disk accesses to retrieve specific hyphenation patterns. With sufficient core memory, an error-free hash coul ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Lossy Count Algorithm
The lossy count algorithm is an algorithm to identify elements in a data stream whose frequency exceeds a user-given threshold. The algorithm works by dividing the data stream into buckets for frequent items, but fill as many buckets as possible in main memory one time. The frequency computed by this algorithm is not always accurate, but has an error threshold that can be specified by the user. The run time and space required by the algorithm is inversely proportional to the specified error threshold; hence the larger the error, the smaller the footprint. The algorithm was created by computer scientists Rajeev Motwani and Gurmeet Singh Manku. It finds applications in computations where data takes the form of a continuous data stream instead of a finite data set, such as network traffic measurements, web server logs, and clickstream A click path or clickstream is the sequence of hyperlinks one or more website visitors follows on a given site, presented in the order viewed. A visitor' ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Boyer–Moore Majority Vote Algorithm
The Boyer–Moore majority vote algorithm is an algorithm for finding the majority of a sequence of elements using linear time and constant space. It is named after Robert S. Boyer and J Strother Moore, who published it in 1981,. Originally published as a technical report in 1981. and is a prototypical example of a streaming algorithm. In its simplest form, the algorithm finds a majority element, if there is one: that is, an element that occurs repeatedly for more than half of the elements of the input. A version of the algorithm that makes a second pass through the data can be used to verify that the element found in the first pass really is a majority. If a second pass is not performed and there is no majority the algorithm will not detect that no majority exists. In the case that no strict majority exists, the returned element can be arbitrary; it is not guaranteed to be the element that occurs most often (the mode of the sequence). It is not possible for a streaming algorithm ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Set (mathematics)
A set is the mathematical model for a collection of different things; a set contains '' elements'' or ''members'', which can be mathematical objects of any kind: numbers, symbols, points in space, lines, other geometrical shapes, variables, or even other sets. The set with no element is the empty set; a set with a single element is a singleton. A set may have a finite number of elements or be an infinite set. Two sets are equal if they have precisely the same elements. Sets are ubiquitous in modern mathematics. Indeed, set theory, more specifically Zermelo–Fraenkel set theory, has been the standard way to provide rigorous foundations for all branches of mathematics since the first half of the 20th century. History The concept of a set emerged in mathematics at the end of the 19th century. The German word for set, ''Menge'', was coined by Bernard Bolzano in his work ''Paradoxes of the Infinite''. Georg Cantor, one of the founders of set theory, gave the following defin ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]