An effective and efficient algorithm for high-dimensional outlier detection

Authors:	Charu C. Aggarwal, Philip S. Yu

Affiliation:	IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532, USA

Abstract:	The outlier detection problem has important applications in fraud detection, network robustness analysis, and intrusion detection. Such applications typically arise in high-dimensional domains in which the data can contain hundreds of dimensions. Many recently proposed outlier detection algorithms use concepts of proximity to find outliers based on their relationship to the other points in the data. However, in high-dimensional space, the data are sparse and concepts based on proximity fail to retain their effectiveness. In fact, the sparsity of high-dimensional data can be understood in a different way so as to imply that every point is an equally good outlier from the perspective of distance-based definitions. Consequently, for high-dimensional data, the notion of finding meaningful outliers becomes substantially more complex and nonobvious. In this paper, we discuss new techniques for outlier detection that find the outliers by studying the behavior of projections from the data set.

Received: 19 November 2002, Accepted: 6 February 2004, Published online: 19 August 2004. Edited by: R. Ng.
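The sparsity argument in the abstract can be illustrated with a small experiment. The sketch below is not the authors' projection-based method. It only measures how the gap between the nearest and the farthest neighbor of a query point shrinks, relative to the nearest distance, as the dimensionality grows. The uniform random data, the function name `relative_contrast`, and the contrast measure (d_max - d_min) / d_min are all assumptions chosen for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^dim.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for increasing dimensionality.
    for dim in (2, 10, 100, 500):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions, the contrast in a few dimensions is large, while in hundreds of dimensions all points sit at nearly the same distance from the query, so a distance threshold no longer separates an outlier from the rest.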

Keywords:	Data mining; High-dimensional spaces; Outlier detection
This article is indexed in SpringerLink and other databases.
