

Case-based reasoning for classification in the mixed data sets employing the compound distance methods
Authors: Mohammad Taghi Rezvan, Ali Zeinal Hamadani, Ali Shalbafzadeh
Affiliation: 1. Department of Industrial Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran; 2. Department of Electrical and Computer Engineering, Isfahan University of Technology, 84156-83111 Isfahan, Iran
Abstract: Developing classification methods based on case-based reasoning (CBR) systems is an active area of research. This paper proposes two new CBR systems, each built on a similarity measure that supports mixed categorical and numerical data as well as purely categorical data. The principal difference between the two measures lies in how the distance between categorical values is calculated. The first, named distance in unsupervised learning (DUL), is derived from the co-occurrence of values. The second, named distance in supervised learning (DSL), calculates the distance between two values of the same feature with respect to every other feature for a given class. In both measures, the distance between numerical values is computed using the Euclidean distance. When the similarity between a new case and an old one is calculated, the importance of numerical features is determined by linear discriminant analysis (LDA), and the weights assigned to categorical features depend on the co-occurrence of feature values. The performance of the proposed CBR systems was evaluated on data sets from the University of California, Irvine (UCI) repository using 5-fold cross-validation. The results indicate that the proposed systems perform well in terms of predictive accuracy and interpretability.
Keywords: Classification; Case-based reasoning; Compound distance; Mixed data sets
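
The co-occurrence idea behind the categorical distance described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' DUL or DSL formula, which the abstract does not give. As an assumption, the distance between two values of a categorical feature is taken to be the total variation distance between the distributions of the other categorical features conditioned on each value, averaged over those features. The function names, the `weights` parameter (a stand-in for LDA-derived or co-occurrence-derived feature weights), and the way the numeric and categorical terms are combined are all hypothetical.

```python
from collections import Counter
from math import sqrt


def conditional_dist(rows, i, j, value):
    """Empirical distribution of feature j over rows where feature i == value."""
    counts = Counter(r[j] for r in rows if r[i] == value)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else {}


def cooccurrence_distance(rows, i, a, b, cat_idx):
    """Distance between values a and b of categorical feature i.

    Assumption: total variation distance between the conditional
    distributions of every other categorical feature, averaged.
    """
    if a == b:
        return 0.0
    others = [j for j in cat_idx if j != i]
    if not others:
        return 1.0  # no other feature to compare against: simple mismatch
    total = 0.0
    for j in others:
        p = conditional_dist(rows, i, j, a)
        q = conditional_dist(rows, i, j, b)
        support = set(p) | set(q)
        total += 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)
    return total / len(others)


def compound_distance(x, y, rows, num_idx, cat_idx, weights=None):
    """Weighted Euclidean-style combination of numeric and categorical terms.

    `weights` maps a feature index to a non-negative weight (default 1.0).
    """
    w = weights or {}
    squared = 0.0
    for k in num_idx:
        squared += w.get(k, 1.0) * (x[k] - y[k]) ** 2
    for k in cat_idx:
        d = cooccurrence_distance(rows, k, x[k], y[k], cat_idx)
        squared += w.get(k, 1.0) * d ** 2
    return sqrt(squared)
```

In a case-based classifier, a new case would be compared with every stored case using `compound_distance`, and the class of the nearest case or cases returned. In practice the numeric features would usually be normalized first so that they do not dominate the categorical terms, which are bounded by 1 in this sketch.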
This article is indexed in ScienceDirect and other databases.