Pre-Grant Publication Number: 20070244891
Filing Date: April 18, 2006
Inventors: Parikshit Gopalan, Robert Krauthgamer, Jayram Thathachar
Assignee: International Business Machines Corporation
Current U.S. Classification: 707, 707/007000
Claim 18:
A program storage device readable by computer and tangibly embodying a program of instructions executable by said computer to perform a method of evaluating elements in a data stream, said method comprising:
scanning said elements in one pass;
as said elements are scanned, randomly selecting scanned elements for storage in multiple data buckets such that selection of said scanned elements for storage in each one of said multiple data buckets is independent of selection of said scanned elements for storage in any other of said multiple data buckets;
storing said scanned elements in said multiple data buckets such that each of said data buckets comprises a predetermined number of said scanned elements; and
at selected times during said scanning, selecting a sample of said scanned elements from said multiple data buckets such that multiple samples are obtained and such that said multiple samples comprise uniform samples of said scanned elements for specified intervals immediately prior to said selected times.
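
The steps recited above can be illustrated with a minimal Python sketch. It is not the procedure disclosed in the application: it swaps in ordinary reservoir sampling (Algorithm R) with periodic resets, which is one simple way to keep a fixed-size, uniformly random sample of the elements that arrived in an interval ending at the current position. The names `Bucket` and `scan`, and the parameters `capacity`, `periods`, `sample_every`, and `seed`, are hypothetical and chosen only for illustration.

```python
import random


class Bucket:
    """Fixed-capacity reservoir over the elements seen since the last reset.

    Assumption for illustration: the bucket restarts every `period` elements,
    so it always describes an interval that ends at the current position.
    """

    def __init__(self, capacity, period, rng):
        self.capacity = capacity  # predetermined number of stored elements
        self.period = period      # longest interval this bucket covers
        self.rng = rng            # private generator, independent of other buckets
        self.items = []
        self.seen = 0             # elements offered since the last reset

    def offer(self, element):
        # Start a new interval once the current one has reached its full length.
        if self.seen == self.period:
            self.items = []
            self.seen = 0
        self.seen += 1
        if len(self.items) < self.capacity:
            # Reservoir not yet full: always keep the element.
            self.items.append(element)
        else:
            # Algorithm R: the new element replaces a random slot with
            # probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = element


def scan(stream, capacity=4, periods=(8, 16, 32), sample_every=10, seed=0):
    """Scan the stream once and take a snapshot of every bucket at selected times."""
    # One generator per bucket, so each bucket's selections do not depend
    # on the selections made for any other bucket.
    buckets = [
        Bucket(capacity, period, random.Random(seed + i))
        for i, period in enumerate(periods)
    ]
    snapshots = []
    for t, element in enumerate(stream, start=1):
        for bucket in buckets:
            bucket.offer(element)
        if t % sample_every == 0:
            # Each entry is (interval length, sample of that interval).
            snapshots.append((t, [(b.seen, list(b.items)) for b in buckets]))
    return snapshots


if __name__ == "__main__":
    for t, per_bucket in scan(range(1, 101)):
        print(t, per_bucket)
```

In this sketch each snapshot pairs a sample with the length of the interval it covers, so a caller can tell which recent stretch of the stream the sample describes. A bucket holds fewer than `capacity` elements only for the first few elements after a reset; an implementation that must guarantee a full bucket at all times would need a different scheme than the periodic reset assumed here.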