Mapping classifiers and datasets

Authors:	Olcay Taner Yıldız

Affiliation:	1. College of Electrical and Information Engineering, Hunan University, Changsha 410082, China;2. INESC Porto, Universidade do Porto, Porto, Portugal;1. School of Computer Science and Engineering, Nanyang Technological University, Singapore;2. Imperial College London, UK;3. Microsoft Research Asia, China;4. Stanford University, USA;1. Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong;2. Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA;1. Northwestern Polytechnical University, Xi’an, China;2. Google Inc., Cambridge, USA;3. The Chinese University of Hong Kong, Hong Kong;4. University of Massachusetts, Amherst, USA;5. University of Minnesota, Minneapolis, USA

Abstract:	Given the posterior probability estimates of 14 classifiers on 38 datasets, we plot two-dimensional maps of classifiers and datasets using principal component analysis (PCA) and Isomap. The similarity between classifiers indicates the correlation (or diversity) between them and can be used in deciding whether to include both in an ensemble. Similarly, datasets which are too similar need not both be used in a general comparison experiment. The results show that (i) most of the datasets we used (approximately two thirds) are similar to each other, (ii) multilayer perceptrons and k-nearest neighbor variants are more similar to each other than support vector machine and decision tree variants, and (iii) the number of classes and the sample size have an effect on similarity.
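
The idea that similar classifiers are redundant in an ensemble can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not state which similarity measure is used, so Pearson correlation between posterior estimates is assumed here, and the classifier names, posterior values, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical posterior estimates P(class = 1 | x) from three classifiers
# on the same six test instances (made-up numbers, not taken from the paper).
posteriors = {
    "mlp": [0.9, 0.8, 0.2, 0.1, 0.7, 0.3],
    "knn": [0.8, 0.9, 0.3, 0.2, 0.6, 0.4],
    "tree": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
}


def similar_pairs(posteriors, threshold=0.9):
    """Return (name_a, name_b, r) for classifier pairs with correlation >= threshold.

    The threshold is an assumed cut-off for illustration only.
    """
    pairs = []
    for a, b in combinations(sorted(posteriors), 2):
        r = pearson(posteriors[a], posteriors[b])
        if r >= threshold:
            pairs.append((a, b, r))
    return pairs


# Pairs flagged as too similar to both add diversity to an ensemble.
redundant = similar_pairs(posteriors)
```

With these toy values only the `knn` and `mlp` pair is flagged. The same pairwise comparison could in principle be applied to datasets instead of classifiers by swapping the roles of rows and columns.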
