Record-level peculiarity-based data analysis and classifications
Authors:Jian Yang  Ning Zhong  Yiyu Yao  Jue Wang
Affiliation:1. International WIC Institute, Beijing University of Technology, Beijing, China; 2. The Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 3. Department of Life Science and Informatics, Maebashi Institute of Technology, Maebashi, Japan; 4. Department of Computer Science, University of Regina, Regina, Canada
Abstract:Peculiarity-oriented mining is a data mining method consisting of two steps: peculiar data identification and peculiar data analysis. The peculiarity factor (PF) and the local peculiarity factor (LPF) are the key concepts used in the identification step to describe how peculiar a data point is. Both notions can be studied at the attribute level and at the record level. In this paper, a new record-level LPF called the distance-based record LPF (D-record LPF) is proposed, defined as the sum of the distances between a point and its nearest neighbors. The authors prove that the D-record LPF accurately characterizes the probability density of a continuous m-dimensional distribution. This provides a theoretical basis for some existing distance-based anomaly detection techniques. More importantly, it also provides an effective way to describe the class-conditional probabilities in a Bayesian classifier, which makes it possible to apply the D-record LPF to classification problems. A novel algorithm called the LPF-Bayes classifier, which is related to the Bayesian classifier, is proposed together with its kernelized implementation. Experimental results on several benchmark datasets demonstrate that the proposed classifiers are competitive with strong classifiers such as AdaBoost, support vector machines and kernel Fisher discriminant.
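As a rough illustration of the definition in the abstract, the D-record LPF of a point can be sketched as the sum of its distances to its nearest neighbors. The abstract does not specify the distance metric, the number of neighbors, or the LPF-Bayes decision rule, so the Euclidean metric, the parameter `k`, and the helper names `d_record_lpf` and `nearest_class_by_lpf` below are all assumptions made for this sketch. The second helper is only a simple stand-in that picks the class in which the point looks least peculiar; it is not the paper's LPF-Bayes classifier.

```python
import numpy as np

def d_record_lpf(x, data, k=3):
    """Sum of Euclidean distances from x to its k nearest points in data.

    Assumptions (not stated in the abstract): Euclidean metric, fixed k,
    and x is a query point that is not itself a row of data.
    """
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    dists = np.linalg.norm(data - x, axis=1)  # distance from x to every row
    k = min(k, len(dists))                    # guard against tiny datasets
    return float(np.sort(dists)[:k].sum())

def nearest_class_by_lpf(x, class_data, k=3):
    """Hypothetical stand-in rule: pick the class with the smallest LPF.

    A small LPF suggests x lies in a dense region of that class. Class
    priors are ignored here, so this is not a full Bayesian decision rule.
    """
    scores = {label: d_record_lpf(x, pts, k) for label, pts in class_data.items()}
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    classes = {
        "a": [[0, 0], [0, 1], [1, 0]],
        "b": [[5, 5], [5, 6], [6, 5]],
    }
    label, scores = nearest_class_by_lpf([0.2, 0.2], classes, k=2)
    print(label, scores)
```

A point far from every training record receives a large score under this sketch, which is the intuition behind using such a sum as an anomaly measure.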
Keywords:
This article is indexed in SpringerLink and other databases.