Case-based reasoning for classification in the mixed data sets employing the compound distance methods

Authors:	Mohammad Taghi Rezvan, Ali Zeinal Hamadani, Ali Shalbafzadeh

Affiliation:	1. Department of Industrial Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran; 2. Department of Electrical and Computer Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran

Abstract:	Developing classification methods based on case-based reasoning (CBR) systems is an active area of research. This paper proposes two new CBR systems, each built on a similarity measure that supports both mixed categorical and numerical data and purely categorical data. The principal difference between the two measures lies in how the distance between categorical values is calculated. The first, named distance in unsupervised learning (DUL), is derived from the co-occurrence of values; the second, named distance in supervised learning (DSL), calculates the distance between two values of the same feature with respect to every other feature for a given class. In both measures, the distance between numerical values is computed using the Euclidean distance. Furthermore, when the similarity between a new case and an old one is calculated, the importance of numeric features is determined by linear discriminant analysis (LDA), and the weights assigned to categorical features depend on the co-occurrence of feature values. The performance of the proposed systems was evaluated on University of California, Irvine (UCI) data sets using 5-fold cross-validation. The results indicate that these case-based reasoning systems perform well in terms of predictive accuracy and interpretability.

Keywords:	Classification; Case-based reasoning; Compound distance; Mixed data sets
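
The compound distance described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact DUL formula, so the categorical part below assumes one common co-occurrence formulation (the total variation distance between the conditional distributions of the other categorical features). The function names, the uniform default weights (standing in for the LDA-based and co-occurrence-based weights), and the assumption that numeric features are already scaled are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def conditional_dist(rows, i, j):
    """Estimate P(feature j = v | feature i = x) from the case base."""
    counts = defaultdict(Counter)
    for r in rows:
        counts[r[i]][r[j]] += 1
    return {
        x: {v: c / sum(cnt.values()) for v, c in cnt.items()}
        for x, cnt in counts.items()
    }


def cooccurrence_distance(rows, i, x, y, cat_idx):
    """Distance between values x and y of categorical feature i.

    Assumed formulation: total variation distance between the conditional
    distributions of every other categorical feature, averaged over them.
    """
    if x == y:
        return 0.0
    others = [j for j in cat_idx if j != i]
    if not others:
        return 1.0  # no co-occurrence information: fall back to 0/1 mismatch
    total = 0.0
    for j in others:
        p = conditional_dist(rows, i, j)
        if x not in p or y not in p:
            total += 1.0  # unseen value: treat as maximally distant
            continue
        values = set(p[x]) | set(p[y])
        total += 0.5 * sum(abs(p[x].get(v, 0.0) - p[y].get(v, 0.0)) for v in values)
    return total / len(others)


def compound_distance(rows, a, b, num_idx, cat_idx, num_w=None, cat_w=None):
    """Weighted Euclidean part for numeric features plus co-occurrence part
    for categorical features. Uniform weights are a placeholder assumption."""
    num_w = num_w or {k: 1.0 for k in num_idx}
    cat_w = cat_w or {k: 1.0 for k in cat_idx}
    d2 = sum(num_w[k] * (a[k] - b[k]) ** 2 for k in num_idx)
    d2 += sum(
        cat_w[k] * cooccurrence_distance(rows, k, a[k], b[k], cat_idx) ** 2
        for k in cat_idx
    )
    return math.sqrt(d2)


def classify(rows, labels, query, num_idx, cat_idx):
    """Retrieve the nearest stored case and reuse its class label."""
    nearest = min(
        range(len(rows)),
        key=lambda n: compound_distance(rows, rows[n], query, num_idx, cat_idx),
    )
    return labels[nearest]


# Toy case base: one numeric feature (index 0), two categorical (1 and 2).
cases = [
    (1.0, "red", "round"),
    (1.2, "red", "round"),
    (5.0, "blue", "square"),
    (5.3, "blue", "square"),
]
case_labels = ["A", "A", "B", "B"]
print(classify(cases, case_labels, (1.1, "red", "round"), [0], [1, 2]))
```

A supervised variant in the spirit of DSL would estimate the same conditional distributions separately within each class; that extension is omitted here because the abstract does not specify how the per-class distances are combined.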
This article is indexed in ScienceDirect and other databases.
