
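Space Structure Based Affinity Propagation Algorithm for Categorical Data*
Cite this article: WANG Qi, QIAN Yuhua, LI Feijiang. Space Structure Based Affinity Propagation Algorithm for Categorical Data*[J]. Pattern Recognition and Artificial Intelligence, 2016, 29(12): 1132-1139.
Authors: WANG Qi  QIAN Yuhua  LI Feijiang
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006
Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
Funding: Supported by the National Natural Science Foundation of China (No. 61432011, U1435212, 61322211), the Program for New Century Excellent Talents in University of the Ministry of Education (No. NCET-12-1031), the Specialized Research Fund for the Doctoral Program of Higher Education (doctoral supervisor category) (No. 20121401110013), and the Program for Outstanding Young Academic Leaders of Higher Education Institutions of Shanxi Province (No. 20120301)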
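Abstract: Because categorical data lack a clear space structure, it is difficult to construct a reasonable similarity measure, which makes many numerical clustering algorithms hard to extend to categorical data clustering. To address this, the paper introduces a space structure representation method that transforms categorical data into numerical data and reconstructs the similarity between samples while preserving the structural features of the original categorical data. Based on this method, the affinity propagation (AP) clustering algorithm is transferred to categorical data clustering, and a space structure based AP algorithm for categorical data (SBAP) is proposed. Experiments on several categorical datasets from the UCI repository show that SBAP enables the AP algorithm to handle categorical data clustering effectively and can improve the algorithm's performance.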

Keywords: Clustering  Categorical Data  Affinity Propagation (AP)  Space Structure  Similarity
Received: 2016-05-13

Space Structure Based Affinity Propagation Algorithm for Categorical Data
WANG Qi, QIAN Yuhua, LI Feijiang. Space Structure Based Affinity Propagation Algorithm for Categorical Data[J]. Pattern Recognition and Artificial Intelligence, 2016, 29(12): 1132-1139.
Authors:WANG Qi  QIAN Yuhua  LI Feijiang
Affiliation:School of Computer and Information Technology, Shanxi University, Taiyuan 030006
Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006
Abstract: Categorical data lack a clear space structure, which makes it difficult to construct a reasonable similarity measure. As a result, numerical clustering algorithms can hardly be extended to categorical data clustering. In this paper, a representation method that transforms categorical data into numerical data is introduced. The similarity between samples is reconstructed while the structural features of the original categorical data are preserved. Based on this representation, the affinity propagation (AP) clustering algorithm is transferred to categorical data clustering, and a space structure based AP algorithm for categorical data (SBAP) is proposed. Experimental results on several categorical datasets from the UCI repository show that the proposed method enables the AP algorithm to handle categorical data clustering effectively, with a significant improvement in performance.
Keywords:Clustering  Categorical Data  Affinity Propagation (AP)  Space Structure  Similarity  
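
The abstract does not spell out the representation itself, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes (hypothetically) that each categorical record is mapped to the vector of its simple-matching similarities to every record in the dataset, and that the input similarity for AP is the negative squared Euclidean distance between those vectors, which is the customary choice for AP on numerical data. The function names and the toy records are illustrative and do not come from the paper.

```python
def matching_similarity(a, b):
    """Fraction of attributes on which two categorical records agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def space_structure_representation(records):
    """Assumed mapping: each record becomes its vector of
    similarities to every record in the dataset."""
    return [[matching_similarity(r, other) for other in records]
            for r in records]


def negative_squared_euclidean(u, v):
    """Customary AP input similarity for numerical vectors."""
    return -sum((x - y) ** 2 for x, y in zip(u, v))


def ap_similarity_matrix(vectors):
    """Pairwise similarity matrix that an AP implementation
    could take as precomputed input."""
    return [[negative_squared_euclidean(u, v) for v in vectors]
            for u in vectors]


# Toy categorical records (hypothetical, for illustration only).
records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
]

vectors = space_structure_representation(records)
S = ap_similarity_matrix(vectors)

for row in S:
    print([round(value, 3) for value in row])
```

A matrix such as `S` could then be handed to any AP implementation that accepts precomputed similarities. The diagonal entries (the AP "preferences") would usually be overwritten first, for example with the median of the off-diagonal similarities.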