Feature clustering and ranking for selecting stable features from high dimensional remotely sensed data |
| |
Authors: | Dugal Harris; Adriaan Van Niekerk |
| |
Affiliation: | 1. Department of Geography and Environmental Studies, Stellenbosch University, Stellenbosch, South Africa; dugalh@gmail.com; https://orcid.org/0000-0003-3480-5707; 3. Centre for Geographical Analysis, Stellenbosch University, Stellenbosch, South Africa; https://orcid.org/0000-0002-5631-0206 |
| |
Abstract: | High-dimensional remote sensing data sets typically contain redundancy amongst the features. In these circumstances, traditional approaches to feature selection are prone to instability and to selecting sub-optimal features. They can also be computationally expensive, especially when dealing with very large remote sensing data sets. This article presents an efficient, deterministic feature ranking method that is robust to redundancy. Affinity propagation is used to group correlated features into clusters. A relevance criterion is evaluated for each feature. Clusters are then ranked by the median of the relevance values of their constituent features. The most relevant individual features can then be selected automatically from the best clusters. Other criteria, such as computation time or measurement cost, can optionally be considered interactively when making this selection. The proposed feature selection method is compared to competing filter methods on a number of remote sensing data sets containing feature redundancy. Mutual information and naive Bayes relevance criteria were evaluated in conjunction with the feature selection methods. With the proposed method, the stability of the selected features under different data samplings improved, while classification accuracies were similar to or better than those of the competing methods. |
| |
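The pipeline summarised in the abstract (cluster correlated features, score each feature, rank clusters by median relevance, pick the best feature from each top cluster) can be illustrated with a minimal sketch. This is not the authors' implementation, and everything below is an assumption made for illustration: the similarity measure (absolute Pearson correlation), the affinity propagation preference (median off-diagonal similarity), the damping factor and iteration count, the histogram-based mutual information estimate, the function names, and the synthetic data. The affinity propagation routine is a compact textbook version written with NumPy only.

```python
import numpy as np


def affinity_propagation(S, damping=0.9, n_iter=300):
    """Textbook affinity propagation; S is a similarity matrix with preferences on its diagonal."""
    n = S.shape[0]
    rows = np.arange(n)
    R = np.zeros((n, n))  # responsibilities
    A = np.zeros((n, n))  # availabilities
    for _ in range(n_iter):
        # Responsibility update: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # Availability update: a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero((A + R)[rows, rows] > 0)
    if exemplars.size == 0:  # no exemplar found: fall back to one cluster per feature
        return rows.copy()
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return labels


def mutual_information(x, y, bins=10):
    """Histogram estimate (in nats) of the mutual information between feature x and class labels y."""
    edges = np.histogram_bin_edges(x, bins=bins)
    joint = np.array([np.histogram(x[y == c], bins=edges)[0] for c in np.unique(y)], dtype=float)
    joint /= joint.sum()
    px = joint.sum(axis=0, keepdims=True)
    py = joint.sum(axis=1, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (py @ px)[nz])).sum())


def cluster_rank_select(X, y, n_select=2):
    """Cluster correlated features, rank clusters by median relevance, pick the best feature per cluster."""
    S = np.abs(np.corrcoef(X, rowvar=False))       # assumed similarity: |Pearson correlation|
    off_diag = ~np.eye(S.shape[0], dtype=bool)
    S[~off_diag] = np.median(S[off_diag])          # assumed preference: median similarity
    labels = affinity_propagation(S)
    relevance = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
    cluster_ids = np.unique(labels)
    scores = np.array([np.median(relevance[labels == c]) for c in cluster_ids])
    ranked = cluster_ids[np.argsort(-scores)]      # best cluster first
    selected = []
    for c in ranked[:n_select]:
        members = np.flatnonzero(labels == c)
        selected.append(int(members[relevance[members].argmax()]))
    return selected, labels, relevance


# Synthetic example: columns 0-2 are redundant copies of a class-dependent signal,
# columns 3-5 and 6-8 are redundant copies of two signals unrelated to the class.
rng = np.random.default_rng(0)
n = 300
y = rng.integers(0, 2, size=n)
bases = [y + 0.3 * rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)]
X = np.column_stack([b + 0.05 * rng.standard_normal(n) for b in bases for _ in range(3)])

selected, labels, relevance = cluster_rank_select(X, y, n_select=2)
print("selected features:", selected)
print("cluster labels:", labels)
```

Because each cluster contributes at most one feature, redundant copies of the same signal cannot crowd out other clusters. The median is used as the cluster score, as in the abstract, so a single unusually high or low relevance value has limited effect on the ranking.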