An incremental feature selection approach based on scatter matrices for classification of cancer microarray data |
| |
Authors: | Manju Sardana, R.K. Agrawal, Baljeet Kaur |
| |
Affiliation: | 1. School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi 110067, India; 2. Hansraj College, University of Delhi, Delhi 110007, India |
| |
Abstract: | Microarray data are often characterized by high dimensionality and small sample size. Their dimensionality needs to be reduced to improve both the classification performance and the computational efficiency of the learning model. The minimum redundancy maximum relevance (mRMR) method, which is widely used for dimensionality reduction, requires discretization of the data and the setting of external parameters. We propose an incremental formulation of the trace of the ratio of the scatter matrices to determine a relevant set of genes; it involves neither discretization nor external parameter setting. We show analytically that the proposed incremental formulation is computationally more efficient than its batch formulation. Extensive experiments on 14 well-known available microarray cancer datasets demonstrate that the proposed method performs better than the well-known mRMR method. Statistical tests also show that the improvement over mRMR is significant. |
| |
Keywords: | microarrays; feature selection; minimum redundancy maximum relevance; ratio of scatter matrices; cancer classification |
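The abstract does not give the incremental formulation itself, so the following is only a minimal illustrative sketch of the underlying idea: greedy forward selection of genes that maximizes a scatter-matrix criterion. It assumes the "trace of the ratio of the scatter matrices" means trace(Sw^-1 Sb), with Sb the between-class and Sw the within-class scatter matrix, and it recomputes the criterion from scratch for each candidate (a batch-style evaluation, not the authors' incremental update). The function names, the `ridge` regularization term, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return between-class (Sb) and within-class (Sw) scatter matrices."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        diff = (class_mean - overall_mean)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
        centered = Xc - class_mean
        Sw += centered.T @ centered
    return Sb, Sw


def greedy_trace_ratio_selection(X, y, k, ridge=1e-6):
    """Greedily pick k features maximizing trace(Sw_S^-1 Sb_S).

    Assumption: `ridge` is a small diagonal term added only to keep the
    selected block of Sw invertible; it is not part of the source method.
    """
    Sb, Sw = scatter_matrices(X, y)
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        best_score, best_feature = -np.inf, None
        for j in remaining:
            idx = selected + [j]
            Sb_sub = Sb[np.ix_(idx, idx)]
            Sw_sub = Sw[np.ix_(idx, idx)] + ridge * np.eye(len(idx))
            # trace(Sw_sub^-1 Sb_sub) without forming the inverse explicitly
            score = np.trace(np.linalg.solve(Sw_sub, Sb_sub))
            if score > best_score:
                best_score, best_feature = score, j
        selected.append(best_feature)
        remaining.remove(best_feature)
    return selected


# Synthetic example: 40 samples, 5 "genes", only gene 2 separates the classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = np.repeat([0, 1], 20)
X[y == 1, 2] += 5.0
selected = greedy_trace_ratio_selection(X, y, k=2)
print(selected)
```

Each greedy step in this sketch solves a fresh linear system for every candidate gene, which is the kind of repeated batch computation an incremental formulation would aim to avoid. For real microarray data with thousands of genes, building the full scatter matrices up front as done here may also be too memory-intensive.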