Improved multiclass feature selection via list combination
Affiliation:1. Universidad Nacional de Colombia, Carrera 30 No 45-03, Bogotá, Colombia 111321;2. Centro Nacional de Investigaciones de Café (Cenicafé), Kilómetro 4 Vía antigua Chinchiná - Manizales, Colombia 170009
Abstract:Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, it not only improves model performance but also facilitates interpretability. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational effort, in particular for wide datasets. When SVM-RFE is applied to a multiclass classification problem, the usual strategy is to decompose it into a series of binary problems and to compute an importance statistic for each feature on each binary problem. These importances are then averaged over the set of binary problems to synthesize a single value for feature ranking. In some cases, however, this procedure can lead to poor selections. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One–Vs–One (OVO) and One–Vs–All (OVA) decompositions. Our results suggest that the OVO decomposition is the most effective for feature selection on multiclass problems. We also find that, in most situations, the new K-First strategy finds better subsets of features than the traditional weight-average approach.
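For reference, the following is a minimal sketch (not the authors' code) of the traditional weight-average baseline that the paper improves on: under an OVO decomposition, a linear SVM is fitted for every pair of classes, the absolute weights |w| are taken as per-feature importances, and these are averaged over the binary problems to produce a single ranking. The dataset, scikit-learn usage, and parameter choices below are illustrative assumptions.

from itertools import combinations

import numpy as np
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Illustrative multiclass dataset (3 classes, 13 features).
X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)
classes = np.unique(y)

importances = []
for a, b in combinations(classes, 2):               # One-Vs-One decomposition
    mask = np.isin(y, [a, b])                       # keep samples of the two classes
    clf = LinearSVC(C=1.0, max_iter=10000).fit(X[mask], y[mask])
    importances.append(np.abs(clf.coef_).ravel())   # |w_i| as feature importance

avg_importance = np.mean(importances, axis=0)       # average over binary problems
ranking = np.argsort(avg_importance)[::-1]          # best-ranked features first
print("Feature ranking (best first):", ranking)

In a full SVM-RFE loop this ranking would be recomputed after removing the lowest-ranked features at each iteration; the list-combination strategies studied in the paper replace the averaging step with other ways of merging the per-problem rankings.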
Keywords:
This article is indexed in ScienceDirect and other databases.