Medical Webpage Clustering Based on Latent Semantic Difference
Cite this article: MI Xiao-fang, QIN Yang, WANG Li-hong, SONG Yi-bin. Medical Webpage Clustering Based on Latent Semantic Difference[J]. Computer Engineering (计算机工程), 2008, 34(19): 64-66.
Authors: MI Xiao-fang  QIN Yang  WANG Li-hong  SONG Yi-bin
Affiliation: College of Computer, Yantai University, Yantai 264005
Funding: National Natural Science Foundation of China; Natural Science Foundation of Shandong Province
Abstract: When medical Web pages are represented with the global or local model of latent semantic indexing (LSI), fuzzy clustering produces clusters with a large inter-class inclusion degree. This paper proposes a new latent semantic difference model. Text is extracted from the medical Web pages and represented with the global, local, and difference models respectively; the FCM algorithm then clusters the resulting feature vectors, and the inclusion degree between the resulting fuzzy sets is computed. Experiments on five given categories of medical Web pages show that, on average, the inclusion degree under the difference model is about 85% of that under the global model and about 80% of that under the local model.

Keywords: latent semantic indexing  difference model  text mining  FCM clustering  inclusion degree
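
The pipeline summarized in the abstract (LSI representation, FCM clustering, inclusion degree between fuzzy clusters) might be sketched roughly as follows in Python with NumPy. This is only an illustration under stated assumptions: it covers the global LSI model alone, because the abstract does not define the difference model; the inclusion degree uses a common fuzzy subsethood formula, which may differ from the paper's definition; and the function names (`lsi_embed`, `fcm`, `inclusion_degree`) and the toy term-document matrix are hypothetical.

```python
import numpy as np


def lsi_embed(term_doc, k):
    """Map each document (column of term_doc) to k latent dimensions via truncated SVD."""
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Row i holds document i's coordinates, scaled by the top-k singular values.
    return vt[:k].T * s[:k]


def fcm(x, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns the membership matrix (n_docs x c) and the centers."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)  # each document's memberships sum to 1
    for _ in range(iters):
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return u, centers


def inclusion_degree(a, b):
    """Assumed definition: degree to which fuzzy set a is contained in fuzzy set b."""
    return np.minimum(a, b).sum() / a.sum()


# Hypothetical toy term-document matrix: 6 terms (rows) x 6 documents (columns).
term_doc = np.array([
    [2, 3, 1, 0, 0, 0],
    [1, 2, 2, 0, 0, 1],
    [3, 1, 2, 0, 1, 0],
    [0, 0, 1, 2, 3, 1],
    [0, 1, 0, 3, 1, 2],
    [0, 0, 0, 1, 2, 3],
], dtype=float)

docs = lsi_embed(term_doc, k=2)
u, _ = fcm(docs, c=2)
print("inclusion of cluster 0 in cluster 1:", inclusion_degree(u[:, 0], u[:, 1]))
```

A lower value from `inclusion_degree` indicates that the two fuzzy clusters overlap less, which is the property the abstract reports as improved by the difference model.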