Similar literature
 Found 19 similar documents (search time: 156 ms)
1.
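Application of Fuzzy Set Methods in Retrieval Evaluation Systems   Total citations: 1, self-citations: 0, citations by others: 1
An information retrieval system is judged by how well it satisfies its users. Starting from the formulas for recall and precision, the two traditional criteria for evaluating information retrieval systems, and considering that in practice the relevance of a retrieval result is a fuzzy concept, this paper applies fuzzy set methods to modify the traditional recall and precision formulas in two ways so that this fuzzy concept can be measured objectively. The first modification assumes that every document contributes equally to overall recall and precision; the second assumes that every class of documents contributes equally. Both modifications are effective extensions of traditional retrieval evaluation based on binary relevance.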
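The per-document variant described in this abstract can be illustrated with a small sketch. It assumes each document carries a relevance membership degree in [0, 1] and replaces the binary counts in the classical formulas with sums of those degrees (a sigma-count). The function name, the data layout, and the exact formulas are illustrative assumptions, not the formulas given in the paper.

```python
def fuzzy_precision_recall(membership, retrieved):
    """Precision and recall with graded (fuzzy) relevance.

    membership: dict mapping doc_id -> relevance degree in [0, 1]
                (hypothetical input format; unlisted docs count as 0).
    retrieved:  iterable of doc_ids returned by the system.
    """
    retrieved = set(retrieved)
    # Fuzzy cardinality of the relevant documents that were retrieved.
    retrieved_mass = sum(membership.get(doc, 0.0) for doc in retrieved)
    # Fuzzy cardinality of all relevant documents in the collection.
    total_mass = sum(membership.values())
    precision = retrieved_mass / len(retrieved) if retrieved else 0.0
    recall = retrieved_mass / total_mass if total_mass else 0.0
    return precision, recall


if __name__ == "__main__":
    degrees = {"a": 1.0, "b": 0.5, "c": 0.0, "d": 0.5}
    print(fuzzy_precision_recall(degrees, ["a", "b", "c"]))
```

When every membership degree is 0 or 1, both values reduce to the classical binary precision and recall.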

2.
周瑛  张铃 《微机发展》2007,17(1):111-113
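An information retrieval system is judged by how well it satisfies its users. Starting from the formulas for recall and precision, the two traditional criteria for evaluating information retrieval systems, and considering that in practice the relevance of a retrieval result is a fuzzy concept, this paper applies fuzzy set methods to modify the traditional recall and precision formulas in two ways so that this fuzzy concept can be measured objectively. The first modification assumes that every document contributes equally to overall recall and precision; the second assumes that every class of documents contributes equally. Both modifications are effective extensions of traditional retrieval evaluation based on binary relevance.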

3.
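Design of an Ontology-Based Semantic Information Retrieval Framework   Total citations: 4, self-citations: 2, citations by others: 4
Keyword-matching retrieval cannot capture the real-world semantics of the keywords being searched, so it inevitably suffers from low precision and recall, while the thesaurus used in concept retrieval has limited capacity to express domain knowledge. This paper proposes an ontology-based multi-agent system capable of semantic information retrieval. Its main parts are the collection and storage of descriptive information, semantic matching, and semantic relevance expansion, and it can substantially improve the precision and recall of retrieval results.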

4.
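Semantics-Based Concept Query Expansion   Total citations: 2, self-citations: 1, citations by others: 1
To address the low precision and low recall of current information retrieval systems, and after analyzing the methods commonly used in such systems, this paper proposes a semantics-based concept query expansion method. The method uses a concept semantic space to expand the concepts in a user's query, with the aim of improving precision and recall. Experimental results show that, compared with traditional methods, it substantially improves the precision and recall of user searches.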

5.
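Traditional information retrieval models consider only how well the keywords themselves match, so retrieval results in the forestry domain are incomplete or inaccurate. To improve retrieval quality, this paper proposes an ontology-based semantic query expansion model for the forestry domain. The model uses the ontology's semantic reasoning capability and semantic structure to expand keywords semantically, ultimately improving recall and precision; it is a semantic complement to traditional keyword-matching retrieval models. The results show that the model improves the precision and recall of forestry-domain retrieval results to some extent.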

6.
白田恬  邢永康 《计算机科学》2006,33(B12):245-248
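This paper introduces in turn the three classes of mathematical models for information retrieval (set-theoretic, algebraic, and probabilistic models) and analyzes their retrieval effectiveness. On this basis it proposes a practical retrieval method, called the secondary retrieval method. The method builds on the Boolean model and the vector space model and combines the characteristics of both, effectively improving retrieval performance. Finally, experiments compare the recall and precision of the secondary retrieval method, the Boolean model, and the vector space model, confirming the advantages of secondary retrieval.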
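One plausible reading of combining the Boolean model with the vector space model is a two-pass search: a Boolean AND filter selects candidate documents, and cosine similarity over term-count vectors then ranks them. The sketch below follows that reading. The function names, the tokenized input format, and the choice of raw term counts as weights are assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def secondary_retrieval(docs, query_terms):
    """docs: dict doc_id -> list of tokens (hypothetical input format)."""
    query_vector = Counter(query_terms)
    required = set(query_terms)
    # Pass 1 (Boolean model): keep documents containing every query term.
    candidates = {doc_id: tokens for doc_id, tokens in docs.items()
                  if required <= set(tokens)}
    # Pass 2 (vector space model): rank the candidates by cosine similarity.
    scored = [(cosine(query_vector, Counter(tokens)), doc_id)
              for doc_id, tokens in candidates.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


if __name__ == "__main__":
    corpus = {
        "d1": ["fuzzy", "retrieval", "model"],
        "d2": ["retrieval", "fuzzy", "fuzzy", "index", "index", "index"],
        "d3": ["boolean", "model"],
    }
    print(secondary_retrieval(corpus, ["fuzzy", "retrieval"]))
```

The Boolean pass keeps the candidate set small and exact, while the ranking pass restores the ordering that a pure Boolean match cannot provide.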

7.
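With the rapid growth of the Web, the volume of information has surged, and obtaining precise, valuable data with traditional retrieval techniques is increasingly difficult. The concept lattice, the core data structure of formal concept analysis, is a powerful tool for data analysis; introducing it into an information retrieval system can improve recall and precision, while intelligent agents can make traditional retrieval systems intelligent and personalized. This paper combines the two techniques, proposes a concept-lattice-based multi-agent information retrieval system model, and presents its framework and functional design.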

8.
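To improve the recall and precision of web information retrieval systems, the vector space model is introduced into the design of such a system. First, web documents are collected and preprocessed based on the basic structural framework of the web information retrieval system. Second, the vector space model is used to compute the similarity between text segments and the query. Third, similarity thresholds are set for different web documents according to the similarity formula. Finally, retrieval results are filtered by the similarity threshold, and the filtered information is shown to users as the search result. Comparative experiments show that the new system returns results with higher recall and precision for the user's input.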
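The similarity and threshold steps in this abstract can be sketched as follows. The abstract does not name a term weighting or a similarity formula, so the smoothed TF-IDF weights, the cosine measure, the single global threshold, and all function names here are assumptions chosen for illustration.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed inverse document frequency per term (assumed weighting)."""
    n = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    return {term: math.log((1 + n) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def weigh(tokens, idf):
    """TF-IDF vector as a sparse dict; terms unseen in the corpus get weight 0."""
    return {term: tf * idf.get(term, 0.0) for term, tf in Counter(tokens).items()}


def cosine(a, b):
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def threshold_search(docs, query_tokens, threshold):
    """Return (doc_id, score) pairs whose similarity reaches the threshold."""
    idf = build_idf(docs)
    query_vector = weigh(query_tokens, idf)
    results = []
    for doc_id, tokens in docs.items():
        score = cosine(query_vector, weigh(tokens, idf))
        if score >= threshold:  # filter step: drop documents below the threshold
            results.append((doc_id, score))
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    corpus = {
        "d1": ["web", "retrieval", "system"],
        "d2": ["web", "page", "design"],
        "d3": ["cooking", "recipes"],
    }
    for doc_id, score in threshold_search(corpus, ["web", "retrieval"], 0.2):
        print(doc_id, round(score, 3))
```

Raising the threshold trades recall for precision: fewer documents pass the filter, but those that do are closer to the query.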

9.
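This paper proposes an ontology-driven legal information retrieval model to address problems in current Web information retrieval. It applies association rules from data mining and draws on the "seven-step method" to build the retrieval model; the construction steps include document preprocessing, building the domain ontology, filtering, and building the human-computer interface. Users are offered retrieval options such as concept queries based on the legal ontology, semantically expanded queries, and category browsing. The model can improve precision and recall for users and enables intelligent retrieval of resources in this domain.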

10.
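Research on Text Information Retrieval Based on an Improved VSM   Total citations: 1, self-citations: 1, citations by others: 0
The surge and diversification of online information have made effective retrieval difficult. Current retrieval tools ignore much of the semantic information implicit in text, which leads to inefficient retrieval that rarely meets users' query needs. This paper proposes a text retrieval method based on an improved vector space model. It introduces ontology technology into a traditional text retrieval system and uses concept similarity computed over a domain ontology to improve the vector space model, yielding an efficient text retrieval system; the system model is also outlined. Examples show that the method substantially improves the recall and precision of text retrieval.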

11.
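Traditional information retrieval methods ignore the effect of document structure on the importance of terms. To address this, an improved vector space retrieval model is proposed and used for similarity computation. Experiments show that the model mitigates the low precision and recall of information retrieval.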

12.
吴代文  詹海生 《微机发展》2011,(10):121-124
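A first-pass full-text search of PDF documents is implemented with the Lucene API. To locate search keywords more precisely, a new secondary indexing algorithm is designed and implemented; the secondary index stores each keyword's page number, coordinates, and surrounding context. With this index, a search result can be located to a specific page of a PDF document and the keyword's exact position marked on the page, so that secondary retrieval over PDF documents achieves a book search effect similar to GoogleBook. System tests show good retrieval performance with high recall and precision, meeting users' need for fast retrieval. The system has been in use for two years as the full-text search platform for Xi'an's digital local chronicles and has achieved good results.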

13.
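Relationship Between Precision and Recall as Web Prefetching Performance Metrics   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies the possible relationship between two important metrics for evaluating Web prefetching performance, precision and recall. Theoretical derivation shows that the two can vary in the same direction or in opposite directions. Performance experiments are run on logs from real Web servers and proxy servers. Simulation results show that recall depends on precision: improving precision helps improve recall.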
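One way to see why the two metrics can move together or apart is to write them with a common numerator. The definitions below are the ones commonly used for prefetching and are an assumption here, since the abstract does not state them; the symbols H, P and R are introduced only for illustration.

```latex
% Assumed notation (not from the abstract):
%   H = number of prefetched objects later requested by the user (prefetch hits)
%   P = number of objects prefetched
%   R = number of objects requested by the user
\[
  \text{precision} = \frac{H}{P}, \qquad
  \text{recall} = \frac{H}{R}, \qquad
  \text{recall} = \text{precision} \cdot \frac{P}{R}.
\]
```

Under these definitions, with P/R held fixed a higher precision gives a proportionally higher recall, whereas a policy that raises precision by prefetching fewer objects can lower recall.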

14.
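An Adaptive Information Recommendation Algorithm Based on Support Vector Machines   Total citations: 1, self-citations: 0, citations by others: 1
A new adaptive proactive recommendation algorithm based on support vector machines is proposed. The algorithm organizes the user model hierarchically into domain information and atomic need information. Taking into account similar information needs across multiple users, it uses a support vector machine to classify the atomic need information in each domain information node for collaborative recommendation. It then applies content-based filtering to the atomic information needs in each domain information node, and finally recommends the sets obtained for all domain information needs according to importance levels. The algorithm overcomes the drawbacks of using a single method and greatly improves recommendation quality. Tests on a standard test collection show superior recall and precision, and the algorithm is particularly suited to adaptive proactive recommendation for large numbers of users.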

15.
16.
Retrieving similar images from large image databases is a challenging task for today’s content-based retrieval systems. Aiming at high retrieval performance, these systems frequently capture the user’s notion of similarity through expressive image models and adaptive similarity measures. On the query side, image models can significantly differ in quality compared to those stored on the database side. Thus, similarity measures have to be robust against these individual quality changes in order to maintain high retrieval performance. In this paper, we investigate the robustness of the family of signature-based similarity measures in the context of content-based image retrieval. To this end, we introduce the generic concept of average precision stability, which measures the stability of a similarity measure with respect to changes in quality between the query and database side. In addition to the mathematical definition of average precision stability, we include a performance evaluation of the major signature-based similarity measures focusing on their stability with respect to querying image databases by examples of varying quality. Our performance evaluation on recent benchmark image databases reveals that the highest retrieval performance does not necessarily coincide with the highest stability.

17.
Estimating average precision when judgments are incomplete   总被引:2,自引:1,他引:1  
We consider the problem of evaluating retrieval systems with incomplete relevance judgments. Recently, Buckley and Voorhees showed that standard measures of retrieval performance are not robust to incomplete judgments, and they proposed a new measure, bpref, that is much more robust to incomplete judgments. Although bpref is highly correlated with average precision when the judgments are effectively complete, the value of bpref deviates from average precision and from its own value as the judgment set degrades, especially at very low levels of assessment. In this work, we propose three new evaluation measures (induced AP, subcollection AP, and inferred AP) that are equivalent to average precision when the relevance judgments are complete and that are statistical estimates of average precision when relevance judgments are a random subset of complete judgments. We consider natural scenarios which yield highly incomplete judgments, such as random judgment sets or very shallow depth pools. We compare and contrast the robustness of the three measures proposed in this work with bpref for both of these scenarios. Through the use of TREC data, we demonstrate that these measures are more robust to incomplete relevance judgments than bpref, both in terms of how well the measures estimate average precision (as measured with complete relevance judgments) and how well they estimate themselves (as measured with complete relevance judgments). Finally, since inferred AP is the most accurate approximation to average precision and the most robust measure in the presence of incomplete judgments, we provide a detailed analysis of this measure, both in terms of its behavior in theory and its implementation in practice. We gratefully acknowledge the support provided by NSF grants CCF-0418390 and IIS-0534482.
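To make the baseline concrete, the sketch below computes standard average precision and a simple version of induced AP. It assumes induced AP is obtained by dropping unjudged documents from the ranking and then computing average precision over the judged ones, which is how the measure is usually described; the function names and input formats are illustrative, and subcollection AP and inferred AP are not covered.

```python
def average_precision(ranking, relevant):
    """ranking: doc ids in rank order; relevant: set of relevant doc ids."""
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant document
    return precision_sum / len(relevant) if relevant else 0.0


def induced_average_precision(ranking, judgments):
    """judgments: dict doc_id -> bool, covering judged documents only
    (hypothetical format). Unjudged documents are removed before scoring."""
    judged_ranking = [doc for doc in ranking if doc in judgments]
    judged_relevant = {doc for doc, is_relevant in judgments.items() if is_relevant}
    return average_precision(judged_ranking, judged_relevant)


if __name__ == "__main__":
    run = ["a", "b", "c", "d", "e"]
    print(average_precision(run, {"a", "c", "e"}))
    print(induced_average_precision(run, {"a": True, "b": False, "e": True}))
```

With complete judgments the two functions return the same value, which matches the equivalence property stated in the abstract.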

18.
We argue that an evaluation of system behavior at the level of the music is required to usefully address the fundamental problems of music genre recognition (MGR), and indeed other tasks of music information retrieval, such as autotagging. A recent review of works in MGR since 1995 shows that most (82%) measure the capacity of a system to recognize genre by its classification accuracy. After reviewing evaluation in MGR, we show that neither classification accuracy, nor recall and precision, nor confusion tables necessarily reflect the capacity of a system to recognize genre in musical signals. Hence, such figures of merit cannot be used to reliably rank, promote or discount the genre recognition performance of MGR systems if genre recognition (rather than identification by irrelevant confounding factors) is the objective. This motivates the development of a richer experimental toolbox for evaluating any system designed to intelligently extract information from music signals.

19.
One of the challenges of modern information retrieval is to rank the most relevant documents at the top of a large system output. This calls for choosing proper methods to evaluate system performance. The traditional performance measures, such as precision and recall, are based on binary relevance judgments and are not appropriate for multi-grade relevance. The main objective of this paper is to propose a framework for system evaluation based on user preference of documents. It is shown that the notion of user preference is general and flexible for formally defining and interpreting multi-grade relevance. We review 12 evaluation methods and compare their similarities and differences. We find that the normalized distance performance measure is a good choice in terms of its sensitivity to document rank order, and that it gives higher credit to systems for their ability to retrieve highly relevant documents.
