Lecture 1: Introduction
Pass-Efficient Model
Definition (Pass-Efficient Model):
Pass: In the Pass-Efficient Model, the only access an algorithm has to the data is via a pass, where a pass over the data is a sequential read of the entire input data set.
Pass-Efficient: Besides the external storage that holds the data and a small number of passes over it, an algorithm in the Pass-Efficient Model is permitted to use additional RAM space and additional computation time. An algorithm is considered pass-efficient if it requires a small constant number of passes, and additional space and time that are sublinear in the length of the data stream, in order to compute the solution (or a “description” of the solution).
Randomized Selection Algorithm
Input: $\{a_1, \dots, a_n\}$, $a_i \ge 0$, read in one pass, i.e., one sequential read, over the data.
Output: $i^*$, $a_{i^*}$
D = 0
for i = 1 to n do
    D = D + a_i
    With probability a_i/D, let i* = i and a_{i*} = a_i
end for
return i*, a_{i*}
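The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `select`, the optional `rng` parameter, and the convention that a zero weight is never chosen (which also avoids evaluating 0/0 while D is still 0) are assumptions made for the example.

```python
import random

def select(stream, rng=random):
    """One-pass weighted selection over an iterable of non-negative weights.

    Returns (i_star, a_i_star) with 1-based index, or (None, None) if no
    positive weight was seen.
    """
    total = 0.0          # D: running sum of the weights read so far
    chosen_index = None  # i*
    chosen_value = None  # a_{i*}
    for i, a in enumerate(stream, start=1):
        if a < 0:
            raise ValueError("weights must be non-negative")
        total += a
        # Assumption: a zero weight is selected with probability 0,
        # so the ratio a / total is only evaluated when total > 0.
        if a > 0 and rng.random() < a / total:
            chosen_index, chosen_value = i, a
    return chosen_index, chosen_value
```

The function reads each element once and keeps only the running sum and the current choice, so it also works on a generator that cannot be rewound, for example `select(x * x for x in range(1, 1000))`.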
Lemma 1: Suppose that $\{a_1, \dots, a_n\}$, $a_i \ge 0$, are read in one pass, i.e., one sequential read over the data, by the Select algorithm. Then the Select algorithm requires $O(1)$ additional storage space and returns $i^*$ such that $\Pr[i^* = i] = a_i / \sum_{j=1}^{n} a_j$.
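As a sanity check on the lemma (not a proof), the selection probabilities can be tracked exactly with rational arithmetic. The sketch below follows the algorithm's update rule: when an element of weight a is read and the running sum becomes D, it replaces the stored choice with probability a/D, and every earlier choice survives with probability 1 - a/D. The function name `selection_distribution` and the sample weights are hypothetical, and the ratio is treated as 0 while D is still 0.

```python
from fractions import Fraction

def selection_distribution(weights):
    """Exact Pr[i* = i] after one pass, computed step by step with rationals."""
    probs = []           # probs[j] = Pr[i* = j + 1] after the elements read so far
    total = Fraction(0)  # D: running sum of the weights read so far
    for a in weights:
        a = Fraction(a)
        total += a
        # Probability that the current element replaces the stored choice
        # (assumed to be 0 while the running sum is still 0).
        p_replace = a / total if total > 0 else Fraction(0)
        # Each earlier choice survives this step with probability 1 - a/D.
        probs = [p * (1 - p_replace) for p in probs]
        probs.append(p_replace)
    return probs

weights = [2, 0, 5, 3]
dist = selection_distribution(weights)
# Matches the lemma: each probability equals a_i divided by the total weight.
assert dist == [Fraction(w, sum(weights)) for w in weights]
```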
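Proof: By induction on the number of elements read.
Base case: after the first element is read, $D = a_1$, so (taking $a_1 > 0$) the algorithm sets $i^* = 1$ with probability $a_1/a_1 = 1$.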
Let Dl=∑li′=