Feature selection using correlation fractal dimension: Issues and applications in binary classification problems
Affiliation:1. School of Accounting, Finance and Economics, Edith Cowan University, WA, Australia;2. Institute of Mathematical Sciences, University of Malaya, Malaysia;3. School of Mathematics and Statistics F07, University of Sydney NSW, Australia;1. German Research Center for Geosciences (GFZ), Telegrafenberg, 14473 Potsdam, Germany;2. University of Potsdam, Am Neuen Palais, 14469 Potsdam, Germany;3. Institute of Geodesy and Geophysics, Chinese Academy of Sciences, 430077 Wuhan, China;1. Leibniz Center for Tropical Marine Ecology (ZMT), Fahrenheitstrasse 6, 28359 Bremen, Germany;2. Centre of Excellence in Marine Sciences (CEMarin), Cra. 2 No. 11-68, 47004 Santa Marta, Colombia;3. Grupo de Investigación en Ecología de Estuarios y Manglares, Departmento de Biología, Universidad del Valle, A. A. 25360 Cali, Colombia;4. Thünen Institute of Baltic Sea Fisheries (TI-OF), Alter Hafen Süd 2, 18069 Rostock, Germany;1. Institute for Choice, University of South Australia, 140 Arthur St, North Sydney, NSW, Australia;2. Institute of Transport and Logistics Studies, University of Sydney, St James Campus (C13), Sydney, NSW 2006, Australia
Abstract: Feature selection methods can be classified broadly into filter and wrapper approaches. Filter-based methods discard features that are irrelevant to the target concept by ranking each feature according to some discrimination measure and then selecting the highest-ranked features. In this paper, we propose a filter-based feature selection method that uses the correlation fractal dimension (CFD) as its discrimination measure. One of the subgoals of this paper is to outline some issues that arise when calculating the fractal dimension of higher dimensional ‘empirical’ data sets. It is well known that the fractal dimension of an empirical data set is meaningful only over an appropriate range of scales. We demonstrate through experiments on data sets of various sizes that fractal dimension-based algorithms cannot be applied routinely to higher dimensional data sets, because the calculation of the fractal dimension is inherently sensitive to parameters such as the range of scales and the size of the data set. Based on this empirical analysis, we propose a new feature selection technique using CFD that avoids these issues. We successfully applied the proposed algorithm to a challenging classification problem in bioinformatics, namely promoter recognition.
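To make the idea concrete, the following is a minimal sketch of how a correlation fractal dimension can be estimated and used to score features. It is not the algorithm proposed in the paper, which the abstract does not spell out. It assumes the standard correlation-sum estimate (the slope of log C(r) against log r) and a simple leave-one-feature-out score; the function names `correlation_sum`, `correlation_dimension` and `cfd_drop_per_feature`, as well as the choice of radii, are hypothetical and chosen only for illustration.

```python
import numpy as np


def correlation_sum(points, radii):
    """Return, for each radius, the fraction of distinct point pairs closer than it."""
    n = len(points)
    # Pairwise Euclidean distances (O(n^2) memory, fine for small samples).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    pair_d = dists[np.triu_indices(n, k=1)]  # each unordered pair once
    return np.array([(pair_d < r).mean() for r in radii])


def correlation_dimension(points, radii):
    """Estimate the CFD as the least-squares slope of log C(r) versus log r.

    The result depends on the chosen range of radii, which is the
    sensitivity to scale that the abstract warns about.
    """
    radii = np.asarray(radii, dtype=float)
    c = correlation_sum(points, radii)
    mask = c > 0  # log is undefined where no pair falls within the radius
    if mask.sum() < 2:
        raise ValueError("too few usable radii for a slope estimate")
    slope, _intercept = np.polyfit(np.log(radii[mask]), np.log(c[mask]), 1)
    return slope


def cfd_drop_per_feature(points, radii):
    """Score each feature by how much the CFD falls when that feature is removed.

    Assumption for illustration: a feature whose removal barely changes
    the dimension is treated as redundant.
    """
    full = correlation_dimension(points, radii)
    drops = []
    for j in range(points.shape[1]):
        reduced = np.delete(points, j, axis=1)
        drops.append(full - correlation_dimension(reduced, radii))
    return np.array(drops)
```

As a usage example, calling `correlation_dimension(data, np.logspace(-2, -1, 8))` on an array `data` of shape `(n_samples, n_features)` returns a single slope estimate. Rerunning it with a different range of radii or a different sample size shows how strongly the estimate can move, which is the practical issue the abstract raises.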
This article is indexed in ScienceDirect and other databases.