MIFS-ND: A mutual information-based feature selection method |
| |
Affiliation: | 1. Department of Computer Science & Engineering, Tezpur University, Napaam, Tezpur 784028, Assam, India; 2. Department of Computer Science, University of Colorado at Colorado Springs, CO 80933-7150, USA |
| |
Abstract: | Feature selection is used to choose a subset of relevant features for effective classification of data. In high-dimensional data classification, the performance of a classifier often depends on the feature subset used. In this paper, we introduce a greedy feature selection method based on mutual information. The method combines feature–feature mutual information with feature–class mutual information to find an optimal subset of features that minimizes redundancy among the selected features and maximizes their relevance to the class. The effectiveness of the selected feature subset is evaluated using multiple classifiers on multiple datasets. Compared with several competing feature selection techniques, our method performs significantly well in terms of both classification accuracy and execution time on twelve real-life datasets of varied dimensionality and number of instances. |
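The greedy idea described in the abstract can be illustrated with a minimal sketch. It assumes discrete-valued features and uses a simple "relevance minus mean redundancy" score in the style of mRMR; this score is an assumption made for illustration and is not the selection criterion of MIFS-ND itself. The names `mutual_information` and `greedy_select` are hypothetical helpers, not part of any library.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def greedy_select(features, labels, k):
    """Greedily pick up to k feature names.

    features: dict mapping a feature name to its list of discrete values.
    Score (an illustrative assumption, not the paper's criterion):
    feature-class MI minus the mean feature-feature MI with the
    features already selected.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Under this score, a feature that duplicates an already selected one is penalized by its full feature–feature mutual information, so a less redundant feature with similar class relevance tends to be chosen instead.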
| |
Keywords: | Features; Mutual information; Relevance; Classification |
This article is indexed in ScienceDirect and other databases. |
|