
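Web Page Classification Research Based on Ontology and EM
Cite this article: DING Yan, CAO Qian, WANG Chao, PAN Jin-Gui. Web Page Classification Research Based on Ontology and EM [J]. Computer Science, 2003, 30(11): 112-115.
Authors: DING Yan, CAO Qian, WANG Chao, PAN Jin-Gui
Affiliation: State Key Laboratory for Novel Software Technology, Nanjing University; Institute of Multimedia Technology, Nanjing University, Nanjing 210093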
Abstract: Extracting semantic information from content pages on the Web and using it in search engines can enable intelligent retrieval and other personalized services. This paper focuses on the analysis of Web page classification information. With an ontology as the basis, a classifier that uses TFIDF word weights and the Rocchio algorithm is combined with EM to improve classification accuracy. The results indicate that the EM procedure improves accuracy by exploiting unlabeled pages when labeled samples are limited.

Keywords: Web page classification  TFIDF  EM  Research  Methods

Keywords: Ontology  VSM  Classifier  Feature vector  Document vector
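
The pipeline the abstract outlines (TFIDF word weights, a Rocchio classifier, and EM over unlabeled pages) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the ontology step is left out, Rocchio is reduced to per-class centroids with no negative-example term, and EM is approximated by hard label assignments. All function names and the toy pages are hypothetical.

```python
import math
from collections import Counter, defaultdict

def tfidf_vectors(docs):
    """Return one L2-normalised TFIDF vector (dict: term -> weight) per tokenised document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def centroid(vectors):
    """Mean of a non-empty list of sparse vectors (the simplified Rocchio prototype)."""
    total = defaultdict(float)
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w / len(vectors)
    return dict(total)

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(vec, prototypes):
    """Pick the class whose prototype is most similar; ties go to the alphabetically first label."""
    return max(sorted(prototypes), key=lambda label: cosine(vec, prototypes[label]))

def rocchio_em(labeled, unlabeled, max_iter=10):
    """labeled: list of (vector, label); unlabeled: list of vectors.

    Assumption: a hard-assignment approximation of EM, not the paper's exact procedure.
    """
    by_class = defaultdict(list)
    for vec, label in labeled:
        by_class[label].append(vec)
    prototypes = {c: centroid(vs) for c, vs in by_class.items()}
    guesses = None
    for _ in range(max_iter):
        # E-step: guess a label for every unlabeled page.
        new_guesses = [classify(v, prototypes) for v in unlabeled]
        if new_guesses == guesses:
            break  # assignments stopped changing
        guesses = new_guesses
        # M-step: rebuild prototypes from labeled pages plus the guessed ones.
        merged = {c: list(vs) for c, vs in by_class.items()}
        for v, g in zip(unlabeled, guesses):
            merged[g].append(v)
        prototypes = {c: centroid(vs) for c, vs in merged.items()}
    return prototypes, guesses

# Toy pages (hypothetical): two labeled, two unlabeled, one held out for testing.
docs = [
    "football match goal team".split(),     # labeled: sports
    "stock market bank profit".split(),     # labeled: finance
    "team goal league striker".split(),     # unlabeled
    "bank profit shares investor".split(),  # unlabeled
    "striker league transfer".split(),      # held out; shares no term with a labeled page
]
vecs = tfidf_vectors(docs)
labeled = [(vecs[0], "sports"), (vecs[1], "finance")]
prototypes, guesses = rocchio_em(labeled, vecs[2:4])
print(guesses)                        # ['sports', 'finance']
print(classify(vecs[4], prototypes))  # sports
```

The held-out page shares no terms with either labeled page, so prototypes built from the labeled pages alone give it zero similarity to both classes. After the unlabeled pages are folded in, the sports prototype also carries "league" and "striker", which is the kind of gain from unlabeled data that the abstract describes.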