Full Citation
Title: A Bayesian Method for Guessing the Extreme Values in a Data Set
Citation Type: Miscellaneous
Publication Year: 2007
ISBN: 9781595936493
Abstract: For a large number of data management problems, it would be very useful to be able to obtain a few samples from a data set, and to use the samples to guess the largest (or smallest) value in the entire data set. Min/max online aggregation, top-k query processing, outlier detection, and distance join are just a few possible applications. This paper details a statistically rigorous, Bayesian approach to attacking this problem. Just as importantly, we demonstrate the utility of our approach by showing how it can be applied to two specific problems that arise in the context of data management.
URL: http://vldb.org/conf/2007/papers/research/p471-wu.pdf
User Submitted?: No
Authors: Wu, Mingxi; Jermaine, Christopher
Publisher: University of Florida
Data Collections: IPUMS USA
Topics: Population Data Science
Countries: United States