BIBLIOGRAPHY

Publications, working papers, and other research using data resources from IPUMS.

Full Citation

Title: Near Linear Time Detection of Distance-Based Outliers and Applications to Security

Citation Type: Conference Paper

Publication Year: 2003

Abstract: Many automated systems for detecting threats are based on matching a new database record to known attack types. However, this approach can only spot known threats, and thus researchers have also begun to use unsupervised approaches based on detecting outliers or anomalous examples. A popular method of finding these outliers is to use the distance to an example's k nearest neighbors as a measure of unusualness. However, existing algorithms for finding distance-based outliers have poor scaling properties, making it difficult to apply them to the large datasets typically available in security domains. In this paper, we propose modifications to a simple, but quadratic, algorithm for finding distance-based outliers, and show that it achieves near linear time scaling, allowing it to be applied to real data sets with millions of examples and many features.
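
Illustrative sketch (not part of the record): the abstract scores each example by the distance to its k nearest neighbors and starts from a simple quadratic nested loop. The abstract does not spell out the modifications, so the version below is only a minimal sketch under stated assumptions: it scans candidates in random order and abandons an example as soon as its score can no longer reach the current top n. The function names, the Euclidean distance, and the parameters `k`, `n`, and `seed` are hypothetical choices made for illustration.

```python
import heapq
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_outliers(data, k=2, n=1, seed=0):
    """Return up to n (score, index) pairs, largest score first.

    The score of an example is the distance to its k-th nearest neighbor.
    Assumption: random scan order plus a pruning cutoff; the abstract
    does not state which modifications the authors actually use.
    """
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)  # random order tends to trigger pruning early

    cutoff = 0.0  # score of the weakest outlier kept so far
    best = []     # min-heap of (score, index), at most n entries

    for i in range(len(data)):
        nearest = []  # negated distances: a max-heap of the k smallest
        pruned = False
        for j in order:
            if j == i:
                continue
            d = euclidean(data[i], data[j])
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # With k neighbors seen, -nearest[0] bounds the final score
            # from above; below the cutoff, example i cannot be a top outlier.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(best) < n:
            heapq.heappush(best, (score, i))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, i))
        if len(best) == n:
            cutoff = best[0][0]
    return sorted(best, reverse=True)


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_outliers(points, k=2, n=1))
```

Without the early exit, the loop is the plain quadratic algorithm. With it, most non-outliers are dropped after only a few distance computations, which is one plausible way to get the kind of scaling the abstract reports.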

User Submitted?: No

Authors: Schwabacher, Mark; Bay, Stephen D.

Conference Name: Workshop on Data Mining for Counter Terrorism and Security

Publisher Location: San Francisco, CA

Data Collections: IPUMS USA

Topics: Methodology and Data Collection, Other

Countries:
