
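A Data Classification Method Based on Concept Similarity
Cite this article: PENG Jing, TANG Chang-Jie, YUAN Chang-An, LI Chuan, HU Jian-Jun. A Data Classification Method Based on Concept Similarity[J]. Journal of Software, 2007, 18(2): 311-322.
Authors: PENG Jing, TANG Chang-Jie, YUAN Chang-An, LI Chuan, HU Jian-Jun
Affiliations: 1. School of Computer Science, Sichuan University, Chengdu, Sichuan 610065; Science and Technology Department, Chengdu Public Security Bureau, Chengdu, Sichuan 610017
2. School of Computer Science, Sichuan University, Chengdu, Sichuan 610065
Funding: National Natural Science Foundation of China; China Postdoctoral Science Foundation; Sichuan Province Key Science and Technology Program; Sichuan Province Youth Science and Technology Foundation
Abstract: A classification method is proposed that exploits similarity information among data properties. The method vectorizes the properties: each property is treated as a basis vector of an m-dimensional space, and each data record is the sum of its property vectors. Using prior concept-similarity information between properties, an algorithm is given for computing the similar distance of any pair of property vectors. The correlation between data records is then reduced to a formula over property vectors and their mutual projections, which yields the correlation of any two records. A classification algorithm built on this correlation is presented, and extensive experiments demonstrate its effectiveness.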

Keywords: data mining; concept similarity; similar distance; property vector; classification
Received: 9/8/2004
Revised: 2006-04-26

A Data Classification Method Based on Concept Similarity
PENG Jing, TANG Chang-Jie, YUAN Chang-An, LI Chuan, HU Jian-Jun. A Data Classification Method Based on Concept Similarity[J]. Journal of Software, 2007, 18(2): 311-322.
Authors: PENG Jing, TANG Chang-Jie, YUAN Chang-An, LI Chuan, HU Jian-Jun
Affiliation: 1. School of Computer Science, Sichuan University, Chengdu 610065, China; 2. Science and Technology Department, Chengdu Public Security Bureau, Chengdu 610017, China
Abstract: This paper proposes a classification method based on similarity information among data properties. The method treats the data properties as basis vectors of an m-dimensional space and views each data record as the sum of its property vectors. Using similarity information about the basic property vectors, it gives a distance algorithm that computes the distance between every pair of properties. It also presents a data classification algorithm built on a correlation formula composed of the property vectors and their projections onto each other. Extensive experiments demonstrate the effectiveness of the new method.
Keywords: data mining; concept similarity; similar distance; property vector; classification
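
The abstract does not give the paper's formulas, so the following is only a rough sketch of the general idea it describes, not the authors' algorithm. It assumes the prior concept similarity between properties is supplied as a matrix of cosines between non-orthogonal basis vectors (the hypothetical `SIM`), so that the inner product of two records accounts for how much each property vector projects onto the others. The function names, the example properties, and the nearest-neighbour decision rule are all illustrative assumptions.

```python
import math

# Hypothetical prior similarities (cosines) between three properties:
# index 0 = "car", 1 = "automobile", 2 = "banana".
# The matrix must be symmetric positive semi-definite for the norms to be valid.
SIM = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

def inner(x, y, sim):
    """Inner product of two records over non-orthogonal property vectors.

    x and y hold one coefficient per property; sim[i][j] is the assumed
    cosine between property vectors i and j.
    """
    n = len(sim)
    return sum(x[i] * sim[i][j] * y[j] for i in range(n) for j in range(n))

def correlation(x, y, sim):
    """Normalized correlation of two records (0.0 if either has zero norm)."""
    norm = math.sqrt(inner(x, x, sim) * inner(y, y, sim))
    return inner(x, y, sim) / norm if norm > 0 else 0.0

def classify(record, labeled, sim):
    """Return the label of the most correlated labeled record.

    This 1-nearest-neighbour rule is an assumption made for illustration.
    """
    best = max(labeled, key=lambda item: correlation(record, item[0], sim))
    return best[1]

if __name__ == "__main__":
    training = [([0, 1, 0], "vehicle"), ([0, 0, 1], "fruit")]
    print(correlation([1, 0, 0], [0, 1, 0], SIM))
    print(classify([1, 0, 0], training, SIM))
```

With an identity matrix in place of `SIM`, `correlation` reduces to the ordinary cosine measure, under which a record containing only "car" and one containing only "automobile" would be unrelated. The off-diagonal similarity is what lets records that share no property still be matched.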
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.