Similar literature
 20 similar documents found (search time: 125 ms)
1.
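A Lucene-based semantic retrieval system   Total citations: 5, self-citations: 3, citations by others: 2
Zheng Ting (郑廷), Zheng Cheng (郑诚). Computer Engineering (《计算机工程》), 2008, 34(16): 92-94
This paper presents an experimental semantic retrieval system with a client/server (C/S) architecture, built on top of a traditional Lucene-based text retrieval engine. Users submit a query configuration from the client to the server as needed. Following this configuration, the server applies two query-expansion techniques, ontology navigation and synonym lookup, to the submitted group of query keywords. The expanded keyword group is then passed to the traditional text retrieval engine and matched against the document collection, and the results are ordered as the user requested and returned. Through the information exchange between user and server and the expansion of the query, the system improves both precision and recall.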
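The query-expansion step described in this abstract can be illustrated with a minimal sketch. The synonym table, the ontology table, and the function names below are hypothetical stand-ins; the abstract does not give the paper's actual resources or interfaces, and no real Lucene API is called here.

```python
from typing import Dict, List

# Hypothetical toy resources (assumptions, not taken from the paper).
SYNONYMS: Dict[str, List[str]] = {
    "car": ["automobile"],
    "film": ["movie"],
}
ONTOLOGY_NARROWER: Dict[str, List[str]] = {
    "car": ["sedan", "suv"],
}


def expand_query(keywords: List[str],
                 use_synonyms: bool = True,
                 use_ontology: bool = True) -> List[str]:
    """Expand each keyword with synonyms and narrower ontology terms.

    The two flags play the role of the user's query configuration.
    Order is preserved and duplicates are dropped.
    """
    expanded: List[str] = []
    for kw in keywords:
        candidates = [kw]
        if use_synonyms:
            candidates += SYNONYMS.get(kw, [])
        if use_ontology:
            candidates += ONTOLOGY_NARROWER.get(kw, [])
        for term in candidates:
            if term not in expanded:
                expanded.append(term)
    return expanded


def to_boolean_query(terms: List[str]) -> str:
    """Join the expanded terms into an OR query string for a keyword engine."""
    return " OR ".join(terms)
```

Calling `to_boolean_query(expand_query(["car"]))` yields a single string that could be handed to a keyword search engine's query parser; how the original system actually passes terms to its engine is not stated in the abstract.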

2.
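As the pace of life quickens, users tend to submit short queries to search engines and expect the results they need to appear near the top. Faced with a large number of ambiguous or broad queries, search engines strive to identify user intent and try to satisfy as many users as possible with a limited set of results. Search result diversification emerged to address this problem: its task is to re-rank search results so that a limited result list satisfies as many user intents as possible. This paper focuses on the granularity of subtopics in diversification algorithms. Subtopics of different granularities were generated with traditional methods, and their effect on search result diversification algorithms was compared. Experimental results show that classic diversification algorithms perform better with fine-grained subtopics.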
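As a rough illustration of re-ranking for subtopic coverage, the sketch below uses a simple greedy rule: repeatedly pick the document that covers the most subtopics not yet covered. This is an assumed, simplified stand-in and not one of the specific algorithms evaluated in the paper, which the abstract does not name. The document identifiers and subtopic labels are hypothetical.

```python
from typing import Dict, List, Set


def diversify(ranked_docs: List[str],
              doc_subtopics: Dict[str, Set[str]],
              k: int) -> List[str]:
    """Greedily select up to k documents that maximize new subtopic coverage.

    Ties are broken by the original ranking, because max() returns the
    first maximal element and `remaining` keeps the original order.
    Relevance scores are otherwise ignored in this sketch.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    remaining = list(ranked_docs)
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda d: len(doc_subtopics.get(d, set()) - covered))
        selected.append(best)
        covered |= doc_subtopics.get(best, set())
        remaining.remove(best)
    return selected
```

With finer-grained subtopics the sets in `doc_subtopics` become larger and more specific, which changes which documents the greedy rule prefers; that is the kind of effect the paper measures.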

3.
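To improve the quality of search engine results, increasing attention is being paid to identifying the intent behind the web queries users submit. Based on query sessions, multi-dimensional features are extracted from user queries so as to describe the query classification features as comprehensively and systematically as possible, and an SVM is used for classification. Experimental results show that combining multiple query features helps identify query intent; on a manually labeled test set, the accuracy of query intent classification reaches 80%.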

4.
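A Web search optimization model based on user query intent identification   Total citations: 2, self-citations: 1, citations by others: 1
Yang Yi (杨艺), Zhou Yuan (周元). Computer Science (《计算机科学》), 2012, 39(1): 264-267
Building on an analysis and classification of user query intent, a Web search optimization model is proposed. The model identifies the user's query intent in order to impose a dual constraint of intent feature words and content topic words, and then combines this with the user's query behavior to obtain the query target. This both ensures accurate matching of the user's query intent and automatically filters out irrelevant information. Compared with related work, its emphasis is on accurately capturing user query intent and improving user satisfaction. Experimental results show that the model clearly improves on traditional search methods in search accuracy and in user satisfaction with the query results.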

5.
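The diversity of user interests and behaviors makes it challenging to provide different users with search results that better match their query intent. Social tags in Web 2.0 are the result of users annotating web pages and other objects they are interested in; users use tags to describe topics of interest. These tags not only represent users' interests but are also the best indication of the information a web page carries. A tag recommendation method oriented to user query intent is proposed, which aims to select the tags that reflect the user's true query intent. As a complement to the query keywords, tags can compensate for the shortcomings of short queries. In addition, the relationship between the recommended tags and the tags previously assigned to a web page allows a more accurate judgment of the relevance between the user's query intent and the page content, so that results that better match the user's interests are ranked higher. Experimental results show that the method is more effective than other existing methods, which also indicates that social annotation plays an important role in capturing users' true query intent more accurately.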

6.
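Text database selection for metasearch engines   Total citations: 8, self-citations: 0, citations by others: 8
The information users need is often scattered across the databases of multiple search engines. For ordinary users, visiting several search engines and picking out the truly useful pages from the returned results is time-consuming and laborious. A metasearch engine gives users an integrated environment for accessing multiple search engines at once: it forwards each user query it receives to multiple underlying search engines. As a search tool, a metasearch engine has advantages such as broader Web coverage than a traditional engine and better scalability. This paper discusses several techniques for solving the database selection problem in metasearch engines. For a given user query, database selection identifies the underlying search engines most likely to return useful information.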

7.
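Query expansion aims to further understand the user's search intent so that the search engine returns more accurate information. Existing methods mainly study how to find words similar to the query terms, but similar words do not necessarily reflect the user's intent. Here, candidate expansion terms for the query are extracted from Web knowledge bases and then ranked with a general-purpose search engine. This approach makes full use of the collective intelligence of the Web and brings the expansion terms closer to the user's search expectations. Comparative experiments show that the method achieves good results.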

8.
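A personalized information search system for online stores is designed. The system analyzes user interest from the query conditions and user information records, presents a preliminary set of products of interest to the user, and feeds the user's usage of the results back to the database. The system can be used to pinpoint consumer demand and to mine users' implicit expressions. Its novelty lies in a proposed scheme for users' associated behavior, for which a model of associated behavior is designed.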

9.
10.
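Search engine user behavior analysis based on log mining   Total citations: 1, self-citations: 0, citations by others: 1
With the massive growth in the number of Web search users, user behavior analysis has become an important foundation for architecture analysis, performance optimization, and system maintenance in Web information retrieval systems, and it is one of the important research areas in Web information retrieval and knowledge mining. To better understand Web users' search behavior, this paper analyzes user behavior based on 756 million real user behavior log entries. We mainly examine query length, query modification rate, related-search click-through rate, the position distribution of first and last clicks, and the distribution of the number of clicks per query. The paper also examines, over different types of query sets, how user behavior differs under different query needs. The findings offer a useful reference for search engine algorithm optimization and system improvement.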
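Some of the statistics listed in this abstract can be computed along the following lines. The record layout `(query, clicked_positions)` is a hypothetical simplification, query length is counted in whitespace-separated terms (an assumption; the paper's own definition is not given in the abstract), and clicked positions are assumed to be listed in click order.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical record: (query text, result positions clicked, in click order)
Record = Tuple[str, List[int]]


def log_stats(records: List[Record]) -> Dict[str, object]:
    """Compute mean query length and click distributions from log records."""
    if not records:
        raise ValueError("records must not be empty")
    mean_len = sum(len(query.split()) for query, _ in records) / len(records)
    # Position of the first and last click, only for queries with clicks.
    first_click = Counter(clicks[0] for _, clicks in records if clicks)
    last_click = Counter(clicks[-1] for _, clicks in records if clicks)
    # How many queries received 0, 1, 2, ... clicks.
    clicks_per_query = Counter(len(clicks) for _, clicks in records)
    return {
        "mean_query_length": mean_len,
        "first_click_positions": first_click,
        "last_click_positions": last_click,
        "clicks_per_query": clicks_per_query,
    }
```

Query modification rate and related-search click-through rate would additionally need session boundaries and related-search events, which this toy record format does not carry.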

11.
Today, digitally stored information isn't only ubiquitous, it's also increasing in volume at an exponential rate. And not only is the volume increasing, but so is the variety, as well as the ways of combining information from different sources to derive insights. Not surprisingly, our most pressing technological and business problem is finding what we need in this sea of information. The dominant paradigm for addressing this problem is information retrieval (Modern Information Retrieval, Ricardo Baeza-Yates and Berthier Ribeiro-Neto, ACM Press, 1999). In this paradigm, the user enters a query (typically a few words typed into a search box), and the system retrieves documents matching the query, ranking the matches based on an estimate of their relevancy to the query. If the system finds many matches, the user sees only the highest-ranked matches. The popularity of Web search systems such as Google shows that the information retrieval paradigm can be effective. An information access framework empowers users by explicitly focusing on the interaction between users and the system. The key problem for information access systems isn't guessing which matching document is most relevant, but establishing a dialogue in which users progressively communicate their information goals while the system provides immediate, incremental feedback that guides users in the pursuit of those goals.

12.
Fuzzy User Modeling for Information Retrieval on the World Wide Web   总被引:5,自引:1,他引:4  
Information retrieval from the World Wide Web through the use of search engines is known to be unable to capture effectively the information needs of users. The approach taken in this paper is to add intelligence to information retrieval from the World Wide Web by modeling users to improve the interaction between the user and information retrieval systems; in other words, to improve the performance of the user in retrieving information from the information source. To effect such an improvement, it is necessary that any retrieval system should somehow make inferences concerning the information the user might want. The system then can aid the user, for instance by giving suggestions or by adapting any query based on predictions furnished by the model. So, by a combination of user modeling and fuzzy logic, a prototype system has been developed (the Fuzzy Modeling Query Assistant (FMQA)) which modifies a user's query based on a fuzzy user model. The FMQA was tested via a user study which clearly indicated that, for the limited domain chosen, the modified queries are better than those that are left unmodified. Received 10 November 1998 / Revised 14 June 2000 / Accepted in revised form 25 September 2000

13.
Abstract: Retrieving ad hoc data from information systems is difficult for non-expert users. Despite the efforts made in improving query tools (e.g. visual query construction, Query By Example, query templates), empirical research shows that constructing a request is still difficult (Reisner 1988). The core of the problem seems to be in the difference between the way the user perceives the application domain and the way the system requires the user to see it (Carroll & Olsen 1988).
In this paper we describe the design and implementation of an ad hoc query tool developed by RCC for a personnel information system, and how AI techniques contributed to this module. The main novel idea incorporated in the query tool is to present the user with a means to question the conceptual model of the system instead of the technical model. To support this, a blackboard architecture has been designed and implemented with knowledge sources that translate the user's questions to database queries. The query tool is used in daily practice by over 100 users.

14.
Private information retrieval (PIR) is normally modeled as a game between two players: a user and a database. The user wants to retrieve some item from the database without the latter learning which item is retrieved. Most current PIR protocols are ill-suited to provide PIR from a search engine or large database: (i) their computational complexity is linear in the size of the database; (ii) they assume active cooperation by the database server in the PIR protocol. If the database cannot be assumed to cooperate, a peer-to-peer (P2P) user community is a natural alternative to achieve some query anonymity: a user gets her queries submitted on her behalf by other users in the P2P community. In this way, the database still learns which item is being retrieved, but it cannot obtain the real query histories of users, which become diffused among the peer users. We name this relaxation of PIR user-private information retrieval (UPIR). A peer-to-peer UPIR system is described in this paper which relies on an underlying combinatorial structure to reduce the required key material and increase availability. Extensive simulation results are reported and a distributed key management version of the system is described.

15.
Information Retrieval (IR) systems assist users in finding information from the myriad of information resources available on the Web. A traditional characteristic of IR systems is that if different users submit the same query, the system would yield the same list of results, regardless of the user. Personalised Information Retrieval (PIR) systems take a step further to better satisfy the user’s specific information needs by providing search results that are not only of relevance to the query but are also of particular relevance to the user who submitted the query. PIR has thereby attracted increasing research and commercial attention as information portals aim at achieving user loyalty by improving their performance in terms of effectiveness and user satisfaction. In order to provide a personalised service, a PIR system maintains information about the users and the history of their interactions with the system. This information is then used to adapt the users’ queries or the results so that information that is more relevant to the users is retrieved and presented. This survey paper features a critical review of PIR systems, with a focus on personalised search. The survey provides an insight into the stages involved in building and evaluating PIR systems, namely: information gathering, information representation, personalisation execution, and system evaluation. Moreover, the survey provides an analysis of PIR systems with respect to the scope of personalisation addressed. The survey proposes a classification of PIR systems into three scopes: individualised systems, community-based systems, and aggregate-level systems. Based on the conducted survey, the paper concludes by highlighting challenges and future research directions in the field of PIR.

16.
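Design and implementation of a metadata-driven personalized query tool   Total citations: 2, self-citations: 0, citations by others: 2
Traditional query customization tools focus only on dynamically composing SQL statements and ignore business-related entities such as users and professional disciplines. Users cannot customize personalized queries, and enterprises cannot organize multi-dimensional, multi-discipline data queries by discipline, query object, or user. To solve these problems, a metadata-driven framework for personalized query customization is proposed, in which metadata describes user requirements and the enterprise environment. Users employ a personalization tool to produce a metadata description of their requirements; the query engine reads the requirements from the metadata, then queries the discipline-specific databases and builds a personalized interface. The framework offers both the generic data interface of general-purpose queries and a friendly, personalized user interface. It has been applied to information integration in oilfield enterprises with good results.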

17.
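User browsing guidance based on query structure in OLAP systems   Total citations: 4, self-citations: 0, citations by others: 4
Online analytical processing (OLAP) systems are the main front-end tools for data warehouses, and users access data in OLAP systems by browsing. OLAP users generally have relatively stable information needs, and the structure of the queries in an OLAP system reflects, to some extent, the information the user cares about, so the structure of the queries a user executes also remains fairly stable. Taking query structure as the basis, the paper analyzes the query behavior of OLAP users, proposes a method for building user profiles for OLAP systems, and discusses how to predict user behavior from the profile and then guide the user's browsing. On this basis, when a user browses information, the OLAP front end can mark places likely to interest the user and guide the upcoming browsing actions, so that the user can complete the information search with ease.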

18.
The popularity of location-based services (LBSs) leads to severe concerns on users’ privacy. With the fast growth of Internet applications such as online social networks, more user information becomes available to the attackers, which allows them to construct new contextual information. This gives rise to new challenges for user privacy protection and often requires improvements on the existing privacy-preserving methods. In this paper, we classify contextual information related to LBS query privacy and focus on two types of contexts: user profiles and query dependency. User profiles have not been deeply studied in LBS query privacy protection, while we are the first to show the impact of query dependency on users’ query privacy. More specifically, we present a general framework to enable the attackers to compute a distribution on users with respect to issuing an observed request. The framework can model attackers with different contextual information. We take user profiles and query dependency as examples to illustrate the implementation of the framework and their impact on users’ query privacy. Our framework subsequently allows us to show the insufficiency of existing query privacy metrics, e.g., k-anonymity, and propose several new metrics. In the end, we develop new generalisation algorithms to compute regions satisfying users’ privacy requirements expressed in these metrics. By experiments, our metrics and algorithms are shown to be effective and efficient for practical usage.
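The gap between k-anonymity and an attacker's distribution over possible issuers can be sketched as follows. This is only an illustrative toy, not the paper's framework or metrics: the per-user weights (standing in for how well each user's profile matches the observed query) and the function names are assumptions.

```python
from typing import Dict, List


def satisfies_k_anonymity(users_in_region: List[str], k: int) -> bool:
    """k-anonymity only counts how many users share the cloaked region."""
    return len(users_in_region) >= k


def issuer_posterior(users_in_region: List[str],
                     profile_weight: Dict[str, float]) -> Dict[str, float]:
    """Normalize hypothetical per-user weights into a distribution over issuers.

    Falls back to a uniform distribution when no weight information exists.
    """
    if not users_in_region:
        return {}
    weights = {u: profile_weight.get(u, 0.0) for u in users_in_region}
    total = sum(weights.values())
    if total == 0:
        return {u: 1.0 / len(users_in_region) for u in users_in_region}
    return {u: w / total for u, w in weights.items()}
```

For example, a region containing four users satisfies 4-anonymity, yet if one user's weight is 7 and the other three have weight 1, the attacker assigns that user probability 0.7 rather than 0.25. A count-based metric does not capture this.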

19.
Optimal closeness query in social networks requires obtaining the social datasets from each user so that he/she can find the shortest social distance to any target user. For example, we can make friends based on the most similar social relationships, such as family background, education level and hobbies. Unfortunately, social data concerning users’ attributes might reveal sensitive personal information and be exploited maliciously. Considering these privacy issues, this paper proposes a Privacy-Preserving Optimal Closeness Query (PP-OCQ) scheme, which achieves secure optimal closeness query in a distributed manner without revealing the users’ sensitive information. We construct an equivalent cost graph in which each user’s information is encrypted under his/her public key and the data are authenticated by signature. The scheme employs the ElGamal cryptosystem to achieve privacy protection in social networks, and gives an optimal closeness query protocol that operates on homomorphic user ciphertexts without leaking the users’ sensitive information. It then follows a routing protocol, the distributed Bellman-Ford shortest-paths protocol, to query the optimal closeness through the users’ message propagation in multiple iterations. The direction of propagation is controlled by some indicators so that each user performs the corresponding operations based on the homomorphism property and cannot obtain other users’ information due to the masking of random numbers. Our analysis and simulations show that the proposed scheme is efficient in terms of computation cost and communication overhead.
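For orientation, the plain (unencrypted) Bellman-Ford relaxation that underlies the routing step can be sketched as below. This sketch deliberately omits everything that makes PP-OCQ privacy-preserving: the ElGamal encryption, homomorphic operations, signatures, and random masking. It is a centralized loop rather than a distributed protocol and assumes non-negative edge costs, so no negative-cycle check is performed. The graph and cost values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (from_user, to_user, social distance)


def bellman_ford(nodes: List[str], edges: List[Edge],
                 source: str) -> Dict[str, float]:
    """Return the shortest social distance from source to every node.

    Each pass over the edge list corresponds to one round of message
    propagation; at most len(nodes) - 1 rounds are needed.
    """
    dist = {n: math.inf for n in nodes}
    dist[source] = 0
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:  # distances have converged early
            break
    return dist
```

Edges are treated as directed; an undirected social tie would be listed once in each direction.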

20.
Malicious users can exploit the correlation among data to infer sensitive information from a series of seemingly innocuous data accesses. Thus, we develop an inference violation detection system to protect sensitive data content. Based on data dependency, database schema and semantic knowledge, we constructed a semantic inference model (SIM) that represents the possible inference channels from any attribute to the pre-assigned sensitive attributes. The SIM is then instantiated to a semantic inference graph (SIG) for query-time inference violation detection. For a single user case, when a user poses a query, the detection system will examine his/her past query log and calculate the probability of inferring sensitive information. The query request will be denied if the inference probability exceeds the prespecified threshold. For multi-user cases, the users may share their query answers to increase the inference probability. Therefore, we develop a model to evaluate collaborative inference based on the query sequences of collaborators and their task-sensitive collaboration levels. Experimental studies reveal that information authoritativeness, communication fidelity and honesty in collaboration are three key factors that affect the level of achievable collaboration. An example is given to illustrate the use of the proposed technique to prevent multiple collaborative users from deriving sensitive information via inference.
