Medical Webpage Clustering Based on Latent Semantic Difference

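Cite this article:	MI Xiao-fang, QIN Yang, WANG Li-hong, SONG Yi-bin. Medical Webpage Clustering Based on Latent Semantic Difference[J]. Computer Engineering, 2008, 34(19): 64-66.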

Authors:	MI Xiao-fang, QIN Yang, WANG Li-hong, SONG Yi-bin

Affiliation:	College of Computer, Yantai University, Yantai 264005

Funding:	National Natural Science Foundation of China; Natural Science Foundation of Shandong Province

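Abstract:	When medical webpages are represented with the global or local model of latent semantic indexing (LSI), the fuzzy clusters produced have a large inter-class inclusion degree. This paper proposes a new latent semantic difference model. Text is extracted from medical webpages and represented with the global, local, and difference models in turn; the FCM algorithm then clusters the representations, and the inter-class inclusion degree is computed. Experiments on five given categories of medical webpages show that the inter-class inclusion degree under the difference model averages about 85% of that under the global model and about 80% of that under the local model.
Keywords:	latent semantic indexing; difference model; text mining; FCM clustering; inclusion degree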
Medical Webpage Clustering Based on Latent Semantic Difference

MI Xiao-fang, QIN Yang, WANG Li-hong, SONG Yi-bin. Medical Webpage Clustering Based on Latent Semantic Difference[J]. Computer Engineering, 2008, 34(19): 64-66.

Authors:	MI Xiao-fang, QIN Yang, WANG Li-hong, SONG Yi-bin

Affiliation:	(College of Computer, Yantai University, Yantai 264005)

Abstract:	Under fuzzy clustering, two categories of medical webpages represented by global LSI or local LSI yield two fuzzy sets with a large inclusion degree. A new latent semantic difference model is proposed. The text of each medical webpage is extracted and represented by global LSI, local LSI, and difference LSI respectively. The FCM algorithm is employed to cluster the feature vectors, and the inclusion degree between the two resulting fuzzy sets is calculated. Experiments on five given categories of medical webpages show that, on average, the inclusion degree under difference LSI is about 85% of that under global LSI and about 80% of that under local LSI.

Keywords:	latent semantic indexing; difference model; text mining; FCM clustering; inclusion degree
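
The clustering and evaluation steps named in the abstract (FCM on feature vectors, then an inclusion degree between the resulting fuzzy sets) can be illustrated with a minimal sketch. Several things here are assumptions, not taken from the paper: the inclusion degree is computed with one common definition, sum(min(a, b)) / sum(a), which may differ from the authors' formula; toy 2-D points stand in for LSI feature vectors; the fuzzifier is set to m = 2; and the function names `fcm` and `inclusion_degree` are hypothetical. The difference model itself is not sketched, because the abstract does not define it.

```python
import math
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means. Returns (centers, U), where U[i][k] is the
    membership of point k in cluster i. Illustrative only."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalized so each point's memberships sum to 1.
    U = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total for d in range(dim)]
            )
        # Membership update from distance ratios (floored to avoid division by zero).
        shift = 0.0
        for k in range(n):
            dists = [max(math.dist(points[k], centers[i]), 1e-12) for i in range(c)]
            for i in range(c):
                new = 1.0 / sum(
                    (dists[i] / dists[j]) ** (2.0 / (m - 1.0)) for j in range(c)
                )
                shift = max(shift, abs(new - U[i][k]))
                U[i][k] = new
        if shift < tol:
            break
    return centers, U


def inclusion_degree(a, b):
    """Degree to which fuzzy set a is included in fuzzy set b,
    using the assumed definition sum(min(a, b)) / sum(a)."""
    return sum(min(x, y) for x, y in zip(a, b)) / sum(a)


# Toy stand-in for feature vectors: two well-separated groups of 2-D points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, U = fcm(points, c=2)
print("inclusion degree of cluster 0 in cluster 1:", inclusion_degree(U[0], U[1]))
```

Under this definition, a lower inclusion degree means the two fuzzy clusters overlap less, which is the direction of improvement the abstract reports for the difference model.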
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.