Similar literature
1.
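XML keyword search lets users retrieve information from XML data without knowing its structure. Most earlier XML keyword search engines present the matching XML result fragments in a one-shot manner and give users no way to refine the results further. Because keyword queries are ambiguous, it is important for a search engine to return the required information accurately and consistently. This paper proposes XWord, a new XML keyword search engine that offers comprehensive support for effective user interaction, automatic identification of return units, and flexible matching and ranking semantics. XWord provides flexible input methods, lets users expand a result fragment to neighbouring fragments, and offers effective dynamic query suggestions. XWord is also highly adaptive: it can process arbitrary XML data without user intervention, which is essential for retrieving information from large volumes of heterogeneous XML data. Finally, extensive experimental results demonstrate the effectiveness and efficiency of XWord.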

2.
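Research on an Implementation Mechanism for Web Information Retrieval Based on Domain Ontology   Total citations: 1, self-citations: 1, citations by others: 1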
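Traditional keyword-based information retrieval often cannot express a user's real retrieval needs with one or a few "keywords". To address this problem, this paper proposes a retrieval mechanism based on a domain ontology: the keywords entered by the user are interpreted and expanded using the domain ontology; the weight of each expanded keyword is then computed from the relatedness of concepts in the ontology and used in the subsequent retrieval. Experiments show that the method significantly improves recall while largely maintaining precision.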
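As a rough illustration of the expand-then-weight idea in the abstract above, the sketch below expands query keywords through a toy domain ontology and weights each added term by its concept relatedness. The ontology contents, the relatedness values, the `min_relatedness` cut-off, and the name `expand_query` are all hypothetical; the abstract does not say how the paper actually measures relatedness.

```python
from typing import Dict, Iterable

# Hypothetical toy ontology: concept -> {related concept: relatedness in (0, 1]}.
ONTOLOGY: Dict[str, Dict[str, float]] = {
    "retrieval": {"search": 0.9, "indexing": 0.6},
    "ontology": {"taxonomy": 0.7},
}


def expand_query(
    keywords: Iterable[str],
    ontology: Dict[str, Dict[str, float]] = ONTOLOGY,
    min_relatedness: float = 0.5,
) -> Dict[str, float]:
    """Return {term: weight}; the user's own keywords always get weight 1.0."""
    keywords = list(keywords)
    weights: Dict[str, float] = {kw: 1.0 for kw in keywords}
    for kw in keywords:
        for related, score in ontology.get(kw, {}).items():
            if score >= min_relatedness:
                # Keep the strongest weight if a term is reachable several ways.
                weights[related] = max(weights.get(related, 0.0), score)
    return weights
```

The resulting weights could then be passed to whatever scoring function the retrieval step uses, so that expanded terms contribute less than the terms the user typed.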

3.
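As the concept of smart-city security spreads in China and its construction gradually takes shape, and as big data is applied in depth to smart-city security, higher demands are placed on the response speed of keyword retrieval. To address this problem, a multi-keyword parallel retrieval algorithm for streaming knowledge graphs based on an urban-security knowledge graph (MKPRASKG) is proposed. Given the query keywords entered by the user, the algorithm builds a set of query subgraphs over knowledge-graph entities in real time by constructing, pruning, and merging association class graphs; then, using a scoring function and guided by the highest-scoring query subgraphs, it searches the knowledge-graph instance data in parallel and returns the Top-k query results. Experimental results show that the algorithm has clear advantages in real-time search, response time, search quality, and scalability.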

4.
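Traditional searchable encryption algorithms for cloud computing do not semantically expand query keywords, which causes a semantic gap between the user's query intent and the returned results; their relevance ranking of results is also poorly justified, so they cannot meet users' demand for intelligent search. To address this, a searchable encryption method with semantic support is proposed. The method uses an ontology knowledge base to expand user queries semantically and controls the number of expansion terms through semantic similarity, so that too many expansion terms do not harm retrieval precision. In addition, the method partitions document vectors and query vectors into blocks to construct corresponding tag vectors that filter out irrelevant documents, and it introduces semantic similarity, keyword position weighting, and keyword-document relevance as factors in the query-document similarity score to rank results effectively. Experimental results show that the method significantly improves the ranking of retrieval results while also improving retrieval efficiency, and raises user satisfaction.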

5.
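Users often find it hard to describe what they are looking for with specific, unambiguous keywords, and most of the results returned have little to do with the keyword. A "pattern recognition" search method can find the data closest in content and present it to the user, avoiding the missed results caused by traditional "keyword retrieval".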

6.
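This paper analyses the application of traditional collaborative filtering to keyword recommendation systems and, building on the Apriori algorithm, proposes a topic-oriented keyword recommendation algorithm for personalized user search. The algorithm mines frequent itemsets from the set of keywords in a user's search history. Experiments show that, based on the user's historical keywords, the algorithm can recommend new keywords that match the user's current search interests, making queries more precise and personalized.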
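To make the frequent-itemset step above concrete, here is a minimal sketch of plain Apriori over search sessions, followed by a simple recommendation rule. It is not the paper's algorithm: the topic-oriented part is left out, and the names `frequent_itemsets` and `recommend`, the session data format, and the rule "suggest keywords that share a frequent itemset with what was typed" are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(sessions, min_support):
    """Apriori: map each keyword set found in >= min_support sessions to its count."""
    sessions = [frozenset(s) for s in sessions]
    counts = Counter(item for s in sessions for item in s)
    current = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        items = sorted({i for itemset in current for i in itemset})
        # Apriori pruning: keep a k-candidate only if all (k-1)-subsets are frequent.
        candidates = [
            frozenset(c)
            for c in combinations(items, k)
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        ]
        current = {}
        for cand in candidates:
            support = sum(1 for s in sessions if cand <= s)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


def recommend(sessions, typed, min_support=2):
    """Suggest keywords that frequently co-occur with the typed ones, best first."""
    typed = frozenset(typed)
    scores = Counter()
    for itemset, support in frequent_itemsets(sessions, min_support).items():
        if typed < itemset:  # the itemset strictly contains the typed keywords
            for kw in itemset - typed:
                scores[kw] = max(scores[kw], support)
    # Sort by support (descending), then alphabetically for a stable order.
    return [kw for kw, _ in sorted(scores.items(), key=lambda p: (-p[1], p[0]))]
```

With a history such as `[{"xml", "keyword", "search"}, {"xml", "search"}, {"xml", "keyword"}, {"cloud", "search"}]`, typing `xml` would surface `keyword` and `search`, since each appears with `xml` in two sessions.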

7.
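For search over encrypted cloud data, traditional fuzzy keyword search schemes can find relevant documents, but the results are unsatisfactory. When the user's input is correct, approximate search cannot be performed; when the user makes a spelling error, the results contain many documents with irrelevant keywords, which seriously wastes bandwidth. To address these shortcomings of fuzzy keyword search over encrypted cloud data, a new fuzzy keyword search scheme is proposed. It computes relevance scores for keywords, ranks documents by those scores, and returns the top-k documents (the k documents with the highest relevance) to the searching user, which reduces unnecessary bandwidth use and the time users spend looking for useful documents and gives more effective search results. By introducing a set of fake trapdoors, it also makes it harder for the cloud server to analyse document keywords, strengthening the system's privacy protection.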

8.
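With the rapid development of the Internet, database use has grown substantially. Most data today is stored in relational databases, so retrieval has to go through interaction with a database, and file-based keyword search techniques cannot be applied to this broader kind of retrieval. There is an urgent need for a technique that is user-friendly, efficient, and accurate, so that users who know neither SQL nor the database schema can still obtain satisfactory results simply by entering keywords, as in traditional file-based keyword search. This paper studies and analyses keyword search over databases.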

9.
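In a semantic search engine, to bring retrieval results closer to the user's needs without restricting user input, a query expansion method based on a film and television material ontology is proposed. Keywords in the user's query text are reasoned over according to the ontology model and semantically expanded by similarity, to obtain an expanded keyword set that better matches the user's retrieval needs. Film and television material is then retrieved on this basis, improving the recall of the search engine.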

10.
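A keyword-based retrieval method for unannotated image collections is proposed. After the user enters a keyword, the method first uses the text surrounding images to filter out of the web pages some images unrelated to the search topic. It then uses visual features to select, from the remaining images, those highly relevant to the query term. Finally, it refines the selected images further with data auditing techniques and uses the refined images to search the image collection. Experimental results show that, with the help of data auditing, the method effectively improves keyword-based retrieval performance on unannotated image collections.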

11.
于静, 吴国全, 卢燚. 《计算机应用》 (Journal of Computer Applications), 2010, 30(6): 1664-1667
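Existing government information retrieval systems have two main problems. First, retrieval based on keyword matching ignores the semantics of the user's search conditions and lacks an accurate description of what documents are actually about. Second, because users lack domain knowledge of government information, they cannot formulate search conditions that properly express their needs. To address these problems, a government information retrieval method based on a domain ontology is proposed: by introducing an ontology, a structured, ontology-based correspondence, composed of the ontology's vocabulary, is established between documents and search conditions. The corresponding concept-term extraction and search-condition expansion algorithms and a prototype system were designed and implemented. Experimental results show that the method substantially improves both recall and precision.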

12.
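A verbose query is a user-submitted query with complex sentence constituents. Current search engines achieve good results for keyword queries, but for a verbose query, using every word as a keyword often returns only very limited results. We try to use the degree of association between keywords to discover semantic entailment and delete keywords with little "information content", thereby improving retrieval. We evaluate the experimental results from two perspectives, "machine-oriented" and "user-oriented". In the machine-oriented evaluation, we assess results automatically from the rate of highlighted (red-marked) matched terms and the number of results returned by the search engine; in the user-oriented evaluation, we assess the returned documents manually. Experimental results show that our method clearly improves both the quantity and the quality of retrieval results.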

13.
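To address the ranking of keyword query results over relational databases based on data graphs, a multi-factor two-stage ranking method is proposed. The method combines the structural weight of a result with the content matching commonly used in information retrieval. It first ranks results coarsely by result path weight, which measures how closely the keywords are associated; then, for results with equal structural weight, it ranks them finely using the keyword term frequency in the information tuples and the amount of information that contains the keywords. Experimental analysis shows that the method places results highly relevant to the query conditions at the top and improves precision.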
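The coarse-then-fine ordering described above maps naturally onto a composite sort key. The sketch below is only an illustration under stated assumptions: the `Result` fields are hypothetical, a smaller `path_weight` is assumed to mean more tightly connected keywords, and the second fine-grained factor from the abstract (the amount of information containing the keywords) is omitted because the abstract does not define it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    name: str
    path_weight: float   # assumed: smaller means keywords are more closely linked
    term_frequency: int  # occurrences of query keywords in the result's tuples


def rank(results: List[Result]) -> List[Result]:
    """Coarse order by path weight; break ties by higher term frequency."""
    # Tuples compare element by element, so term frequency only matters
    # when two results have exactly the same path weight.
    return sorted(results, key=lambda r: (r.path_weight, -r.term_frequency))
```

Using one tuple key rather than two separate sorts keeps the tie-breaking rule explicit and applies both stages in a single pass.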

14.
Improved Clustering of Meta-Search Engine Results Using Association Rules   Total citations: 2, self-citations: 1, citations by others: 1
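The results returned by the target search engines are segmented into words and their main keywords are extracted; association rules are then used to build an associated-word matrix, and FCM (Fuzzy C-Means) is used to cluster the results. The cluster validity function FP(U, c) determines the best clustering, and the results are finally returned in descending order of relevance. Experimental comparison with the K-means algorithm shows that the method maintains running efficiency and a valid number of clusters, and raises the rank positions of relevant results, so it better meets users' needs.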

15.
One of the useful tools offered by existing web search engines is query suggestion (QS), which assists users in formulating keyword queries by suggesting keywords that are unfamiliar to users, offering alternative queries that deviate from the original ones, and even correcting spelling errors. The design goal of QS is to enrich users' web search experience and spare them the frustrating process of choosing controlled keywords to specify their particular information needs, which relieves them of the burden of creating web queries. Unfortunately, the algorithms and design methodologies of the QS module developed by Google, the most popular web search engine these days, are not publicly available, which means that software developers cannot duplicate them to build the tool for specifically designed software systems for enterprise search, desktop search, or vertical search, to name a few. Keywords suggested by Yahoo! and Bing, two other well-known web search engines, however, are mostly popular, currently searched words, which might not meet the specific information needs of the users. These problems can be solved by WebQS, our proposed web QS approach, which provides the same mechanism offered by Google, Yahoo!, and Bing to support users in formulating keyword queries that improve the precision and recall of search results. WebQS relies on frequency of occurrence, keyword similarity measures, and modification patterns of queries in user query logs, which capture information on millions of searches conducted by millions of users, to suggest useful queries and query keywords during the user query construction process and achieve the design goal of QS. Experimental results show that WebQS performs as well as Yahoo! and Bing in terms of effectiveness and efficiency and is comparable to Google in terms of query suggestion time.

16.
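Most current keyword-based search engines offer no personalized user service; results are ranked mainly by the similarity between keywords and documents, which can hardly satisfy users faced with ever-growing information resources. Given that users find it increasingly hard to retrieve the information they need quickly and precisely, this paper proposes a concept-based three-layer search engine model for use in a LAN, in which user interaction makes search personalized and intelligent.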

17.
Keyword search is the most popular technique for finding information in XML (eXtensible Markup Language) documents. It enables users to access XML data easily without learning a structured query language or studying complex data schemas. Existing keyword query methods are mainly based on LCA (lowest common ancestor) semantics, in which the returned results match all keywords at the granularity of elements. In many practical applications, information is often uncertain and vague, so identifying useful information in fuzzy data is becoming an important research topic. In this paper, we focus on keyword querying over fuzzy XML data at the granularity of objects. By introducing the concept of an "object tree", we propose query semantics for object-level keyword queries. We find the minimum whole-matching result object trees, which contain all keywords, and the partial-matching result object trees, which contain some of the keywords, and return the root nodes of these result object trees as query results. To identify the top-K answers with the highest scores effectively and accurately, we propose a scoring mechanism that considers tf*idf document relevance, users' preferences, and the possibilities of results. We propose a stack-based algorithm named object-stack to obtain the top-K answers with the highest scores. Experimental results show that the object-stack algorithm significantly outperforms traditional XML keyword query algorithms and obtains high-quality query results with high search efficiency on fuzzy XML documents.
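For readers unfamiliar with the LCA semantics that the abstract above takes as its starting point, the sketch below computes lowest common ancestors of keyword matches by brute force. It illustrates only the traditional element-level baseline, not the paper's object-stack algorithm or its fuzzy scoring. Dewey-style node labels (a tuple giving the child index at each level from the root) and the `index` mapping from keyword to matching nodes are assumptions introduced for this example.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

Dewey = Tuple[int, ...]  # e.g. (0, 1, 2) = root -> 2nd child -> 3rd child


def lca(labels: Iterable[Dewey]) -> Dewey:
    """The longest common prefix of Dewey labels is their lowest common ancestor."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def keyword_lcas(index: Dict[str, List[Dewey]], keywords: List[str]) -> Set[Dewey]:
    """All LCAs obtained by choosing one matching node per keyword."""
    match_lists = [index.get(kw, []) for kw in keywords]
    # If any keyword has no match, product() yields nothing and the set is empty.
    return {lca(combo) for combo in product(*match_lists)}
```

This enumeration grows with the product of the match-list sizes, which is one reason practical systems use stack-based or index-based algorithms instead.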

18.
More people than ever before have access to information through the World Wide Web, and both the volume of information and the number of users continue to grow. Traditional keyword-based search methods are not effective: they return large lists of documents, many of which are unrelated to users' needs. One way to improve information retrieval is to associate meaning with users' queries by using ontologies, knowledge bases that encode a set of concepts about one domain and their relationships. Encoding a knowledge base in a single ontology is usual, but a document collection can span different domains, each organized into its own ontology. This work presents a novel way to represent and organize knowledge from distinct domains using multiple ontologies that can be related. The model allows the ontologies, as well as the relationships between concepts from distinct ontologies, to be represented independently. Additionally, fuzzy set theory techniques are employed to deal with the subjectivity and uncertainty of knowledge. This approach to organizing knowledge and an associated query expansion method are integrated into a fuzzy model for information retrieval based on multi-related ontologies. The performance of a search engine using this model is compared with another fuzzy-based approach to information retrieval and with the Apache Lucene search engine. Experimental results show that this model improves precision and recall.

19.
Social networking sites such as Facebook and Twitter have become increasingly popular. Their high accessibility lets so-called social streams spread across the Internet more quickly and widely as more and more people are drawn into the social networking revolution. In this stream world, information seeking no longer means simply typing a few keywords into a search engine. In this study, we look for a way to use these diversified social streams to assist the search process without relying solely on the entered keywords. We propose a method that uses two algorithms we developed to analyze social streams and extract meaningful information matching users' current needs and interests. We then integrate the organized stream data, which we describe as associative ripples, into the search system, in order to improve the relevance of the results obtained from the search engine and to give users a new perspective on the issues they are exploring that guides further information seeking. This benefits both the enrichment of the search experience and the facilitation of the search process.

20.
User-Driven Visual Search of Microblogs   Total citations: 1, self-citations: 1, citations by others: 0
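Objective: As a platform for social interaction and information sharing, microblogs generate hundreds of millions of messages a day, and efficiently finding the information a user is interested in has become a pressing problem. A novel user-driven visual method for searching microblog information is proposed. Method: A user's interests are modelled by feature words and their weights, and the correlation between users and feature words is built on this model. To search microblog information, the method first locates the microblog users related to the query term and then selects the search-relevant posts from those users' microblogs. In addition, an attention propagation algorithm is used to expand the search; the returned feature words and microblog users are visualized, and interaction is provided so that the user can view the microblogs related to a selected feature word or user. Results: Experiments show that, with this method, users can efficiently locate the microblog information they are interested in. Conclusion: Using users as a bridge greatly narrows the search space for microblog information; the attention propagation algorithm expands the search, and the results are presented visually. Experiments show that the method lets users quickly find the information they are interested in.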
