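Automatic Search Engine Performance Evaluation Based on Clustering Analysis (基于聚类分析的搜索引擎自动性能评价)
Cite this article: WU Shiyong, WANG Mingwen. Automatic Search Engine Performance Evaluation Based on Clustering Analysis [J]. Journal of Chinese Information Processing (中文信息学报), 2010, 24(5): 62-70.
Authors: WU Shiyong (吴世勇), WANG Mingwen (王明文)
Affiliation: School of Computer Information and Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022
Funding: National Natural Science Foundation of China; Natural Science Foundation of Jiangxi Province; Jiangxi Province Science and Technology Research Project; Science and Technology Project of the Jiangxi Provincial Department of Education
Abstract: Traditional search engine performance evaluation requires manually annotated reference answer sets. This costs considerable labor and resources, makes the evaluation results depend on the accuracy of the manual annotation, and is inefficient. Following a clustering-analysis approach, this paper proposes a search engine performance evaluation metric and a method for evaluating search engine performance automatically. The method automatically computes the coverage of informational queries, clusters the retrieval results according to that coverage, and evaluates retrieval performance using measures such as inter-cluster distance and intra-cluster distance. Experimental results show that evaluation based on the clustering metrics is consistent with evaluation based on manual annotation.

Keywords: information retrieval; performance evaluation; clustering analysis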

Automatic Search Engine Performance Evaluation Based on Clustering Analysis
WU Shiyong, WANG Mingwen. Automatic Search Engine Performance Evaluation Based on Clustering Analysis [J]. Journal of Chinese Information Processing, 2010, 24(5): 62-70.
Authors: WU Shiyong, WANG Mingwen
Affiliation: School of Computer Information and Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Abstract: Traditional search engine evaluation methods need manual annotation of correct answers for a set of queries, which is costly and time-consuming. In this paper, we present an automatic search engine performance evaluation method based on clustering analysis. The method has three steps: first, computing the coverage score of each informational query; second, clustering the search results by the coverage score; last, evaluating the retrieval performance using intra-cluster cohesion and inter-cluster separation. Experimental results show that the automatic method yields evaluation results consistent with those of traditional assessor-based methods.
Keywords: information retrieval; performance evaluation; clustering analysis
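
The abstract names intra-cluster cohesion and inter-cluster separation as the evaluation signals but does not give their formulas. The following is a minimal sketch of one common way to compute such quantities, not the authors' definition. It assumes that search results are already represented as numeric vectors and already assigned to clusters; the function name `cluster_scores`, the Euclidean distance, and the centroid-based averages are all illustrative assumptions.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_scores(vectors, labels):
    """Return (intra, inter) for clustered result vectors.

    intra: mean distance from each vector to its own cluster centroid
           (smaller means tighter clusters).
    inter: mean pairwise distance between cluster centroids
           (larger means better separated clusters).
    Both definitions are assumptions made for illustration only.
    """
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        groups[label].append(vec)
    centroids = {label: centroid(vs) for label, vs in groups.items()}

    intra = sum(
        euclidean(vec, centroids[label]) for vec, label in zip(vectors, labels)
    ) / len(vectors)

    keys = sorted(centroids)
    pairs = [(a, b) for i, a in enumerate(keys) for b in keys[i + 1:]]
    if pairs:
        inter = sum(euclidean(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)
    else:
        inter = 0.0  # a single cluster has no separation to measure
    return intra, inter


# Toy data: two tight groups of two results each, far apart on the first axis.
vectors = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
intra, inter = cluster_scores(vectors, labels)
print(intra, inter)  # 1.0 10.0
```

Under these assumptions, a result list whose clusters are tight and well separated gets a small `intra` and a large `inter`. How the paper combines the two into a single score, and how it derives clusters from the query coverage score, is not stated in the abstract and is left out here.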
This article is indexed in Wanfang Data (万方数据) and other databases.