Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
More people than ever before have access to information through the World Wide Web; the volume of information and the number of users both continue to expand. Traditional keyword-based search methods are not effective, resulting in large lists of documents, many of which are unrelated to users’ needs. One way to improve information retrieval is to associate meaning with users’ queries by using ontologies, knowledge bases that encode a set of concepts about one domain and their relationships. Encoding a knowledge base using a single ontology is usual, but a document collection can deal with different domains, each organized into an ontology. This work presents a novel way to represent and organize knowledge from distinct domains, using multiple ontologies that can be related. The model allows the ontologies, as well as the relationships between concepts from distinct ontologies, to be represented independently. Additionally, fuzzy set theory techniques are employed to deal with knowledge subjectivity and uncertainty. This approach to organizing knowledge and an associated query expansion method are integrated into a fuzzy model for information retrieval based on multi-related ontologies. The performance of a search engine using this model is compared with another fuzzy-based approach for information retrieval, and with the Apache Lucene search engine. Experimental results show that this model improves precision and recall measures.
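For readers unfamiliar with the two measures reported above, the following is a minimal sketch of how precision and recall are usually computed for a single query. The function name and the document identifiers are illustrative assumptions; the abstract does not describe the evaluation code.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids returned by the search engine
    relevant:  document ids judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Precision is the share of returned documents that are relevant; recall is the share of relevant documents that were returned.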

2.
Although search engines are essential tools for finding information on the World Wide Web, the effective use of search engines for information retrieval (IR) is a crucial challenge for any Internet user. Based on the user-focused approach, this study investigates individual information retrieval behaviors using information processing theory. The results show that experience with search engines significantly affects users’ attitudes toward search engines for information retrieval, the query-based service is more popular than the directory-based service, users are not completely satisfied with the precision of retrieved information and the response time of search engines, and users’ motivation is a key factor that predicts their intention to use search engines for information retrieval. Furthermore, this study proposes a conceptual model for investigating individual attitudes toward search engines for information retrieval.

3.
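A statistical model for query expansion based on user logs   Total citations: 24, self-citations: 0, citations by others: 24
崔航 (Cui Hang), 文继荣 (Wen Jirong), 李敏强 (Li Minqiang). 《软件学报》 (Journal of Software), 2003, 14(9): 1593-1599
Information retrieval has long suffered from word ambiguity, and the problem is even more pronounced in Web search. This paper proposes a statistical query expansion model based on user query logs. The model links the words or phrases used in user queries to the corresponding words or phrases that appear in documents in the form of conditional probabilities, and uses Bayes' formula to select the document terms most closely associated with the query and add them to the original query, thereby expanding and optimizing it. Experimental results show that the method is better suited to improving information retrieval on the Web and can substantially improve query precision compared with traditional query expansion algorithms.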
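The log-based idea can be illustrated with a minimal sketch. It assumes a hypothetical log of sessions, each pairing the terms of a query with the terms of a document the user clicked, and it scores candidate terms by averaging the estimated conditional probabilities over the query terms. The function names and the scoring rule are illustrative assumptions, not the model from the paper.

```python
from collections import Counter, defaultdict

def learn_term_correlations(sessions):
    """Estimate P(doc_term | query_term) from (query_terms, clicked_doc_terms) pairs."""
    pair_counts = defaultdict(Counter)  # query term -> counts of co-occurring document terms
    query_term_counts = Counter()       # number of sessions containing each query term
    for query_terms, doc_terms in sessions:
        for q in set(query_terms):
            query_term_counts[q] += 1
            for d in set(doc_terms):
                pair_counts[q][d] += 1
    return {
        q: {d: c / query_term_counts[q] for d, c in docs.items()}
        for q, docs in pair_counts.items()
    }

def expand_query(query_terms, correlations, k=2):
    """Append the k document terms most strongly associated with the whole query."""
    scores = Counter()
    for q in query_terms:
        for d, p in correlations.get(q, {}).items():
            if d not in query_terms:
                scores[d] += p / len(query_terms)  # average over the query terms
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:k]]
```

Ties are broken alphabetically so that the expansion is deterministic; a real system would also filter stop words and smooth the probability estimates.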

4.
Human-computer interaction is a decisive factor in effective content-based access to large image repositories. In current image retrieval systems the user refines the query by selecting example images from a relevance ranking. Since the top-ranked images are all similar, user feedback often results only in a rearrangement of the presented images. For better incorporation of user interaction in the retrieval process, we have developed the Filter Image Browsing method. It also uses feedback through image selection. However, it is based on differences between images rather than similarities. Filter Image Browsing presents overviews of relevant parts of the database to users. Through interaction, users then zoom in on parts of the image collection. By repeatedly limiting the information space, the user quickly ends up with a small number of relevant images. The method can easily be extended for the retrieval of multimedia objects. For evaluation of the Filter Image Browsing retrieval concept, a user simulation is applied to a pictorial database containing 10,000 images acquired from the World Wide Web by a search robot. The simulation incorporates uncertainty in the definition of the information need by users. Results show that Filter Image Browsing outperforms plain interactive similarity ranking in the effort required from the user. Also, the method produces predictable results for retrieval sessions, so that the user quickly knows whether a successful session is possible at all. Furthermore, the simulations show that the overview techniques are suited to applications such as hand-held devices where screen space is limited.

5.
A mass of heterogeneous, distributed and dynamic information on the World Wide Web (the Web) has resulted in "information overload". Providing users with an effective information retrieval service on the Web is an important and urgent research issue. Web search engines attempt to solve this problem, yet their effect is far from satisfactory. In this paper, a distributed and cooperative strategy for information retrieval on the Web is proposed to replace the centralized mode adopted by current search engines. A new information retrieval system model, IRSM, is then presented, which supports the retrieval of metadata about Web documents and uses the Z39.50 standard protocol to unify the heterogeneous interfaces of different systems. Based on that, a distributed and cooperative information retrieval framework, called DCIRF, is designed to help users retrieve information on the Web quickly and effectively.

6.
7.
Personalized Web search for improving retrieval effectiveness   Total citations: 11, self-citations: 0, citations by others: 11
Current Web search engines are built to serve all users, independent of the special needs of any individual user. Personalization of Web search is to carry out retrieval for each user incorporating his/her interests. We propose a novel technique to learn user profiles from users' search histories. The user profiles are then used to improve retrieval effectiveness in Web search. A user profile and a general profile are learned from the user's search history and a category hierarchy, respectively. These two profiles are combined to map a user query into a set of categories which represent the user's search intention and serve as a context to disambiguate the words in the user's query. Web search is conducted based on both the user query and the set of categories. Several profile learning and category mapping algorithms and a fusion algorithm are provided and evaluated. Experimental results indicate that our technique to personalize Web search is both effective and efficient.

8.
You log in and find an urgent request for information about state-of-the-art tools for finding information on the World Wide Web. No problem. At the click of a mouse, you call up your favorite Web search engine and fire off the query “information retrieval WWW”. You get back lots of information in a nicely ordered list. Problem solved. It's not quite so easy, as library scientists have long known and users of the Web are discovering.

9.
With the recent advances in the World Wide Web development, more and more users have access to web information, and more and more information providers are able to put information of various types on the web. The web has now become one of the most important Internet information systems for various professionals and users. However, owing to the huge amount of information of various types available and various users on the Internet and Web, efficient query and information retrieval as well as the management of Internet information have become a challenging and difficult task. Therefore, systematic research on the design, implementation, and management of Internet and web-based information systems has been increasingly attractive and important. The Web Information Systems Engineering (WISE) Conference Series (see http://www.i-wise.org) has emerged since 2000 as an excellent forum for researchers, professionals, and industrial practitioners to share their rapidly developing knowledge and report on new advances in web-based information systems.

10.
The World Wide Web is a world of great richness, but finding information on the Web is also a great challenge. Keyword-based querying has been an immediate and efficient way to specify and retrieve the information a user is looking for. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given, as in most cases. In order to clarify the ambiguity of the short queries given by users, we propose the idea of concept-based relevance feedback for Web information retrieval. The idea is to have users give two to three times more feedback in the same amount of time that would be required to give feedback for conventional feedback mechanisms. Under this design principle, we apply clustering techniques to the initial search results to provide concept-based browsing. We show the performance of various feedback interface designs and compare their pros and cons. We measure precision and relative recall to show how clustering improves performance over conventional similarity ranking and, most importantly, we show how the assistance of concept-based presentation reduces browsing labor.

11.
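Research on an information retrieval system based on a fuzzy linguistic approach   Total citations: 4, self-citations: 2, citations by others: 2
This paper proposes a model of an information retrieval system based on a fuzzy linguistic approach. The system consists of three parts: a query interface subsystem, a database subsystem, and a retrieval subsystem. In the query interface subsystem, the user's query request is expressed as a Boolean expression, and each query keyword is assigned linguistic weights with two different semantics, which express the user's fuzzy retrieval requirements. In the database subsystem, the documents to be searched are represented by an index term-document fuzzy matrix, and a numeric weight is introduced for each index term according to its frequency of occurrence in the document. In the retrieval subsystem, the fuzzy linguistic approach is used to match the user's Boolean query expression against the index term-document fuzzy matrix bottom-up, and the results that satisfy the user's requirements are returned. Compared with traditional retrieval systems based on exact matching of query keywords, this system better accommodates the flexibility in users' query requirements.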
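The bottom-up matching step can be sketched as follows, assuming the common min/max/complement interpretation of fuzzy AND, OR and NOT. The query representation (nested tuples), the function names and the sample weights are illustrative assumptions, and the linguistic weights on query keywords described in the abstract are deliberately left out to keep the sketch short.

```python
def evaluate(query, doc_weights):
    """Bottom-up fuzzy evaluation of a Boolean query tree for one document.

    query:       an index term (str) or a tuple such as ("and", q1, q2)
    doc_weights: one column of the term-document fuzzy matrix, as {term: weight in [0, 1]}
    """
    if isinstance(query, str):  # leaf: look up the term's weight in this document
        return doc_weights.get(query, 0.0)
    op, *args = query
    values = [evaluate(arg, doc_weights) for arg in args]
    if op == "and":
        return min(values)
    if op == "or":
        return max(values)
    if op == "not":
        return 1.0 - values[0]
    raise ValueError(f"unknown operator: {op}")

def rank(query, matrix):
    """Return (document, degree) pairs with a non-zero match, best first."""
    scored = [(doc, evaluate(query, weights)) for doc, weights in matrix.items()]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: (-s[1], s[0]))
```

With this interpretation a document is no longer simply matched or rejected; it receives a degree of match between 0 and 1, which is what lets the system return a ranked list.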

12.
The World Wide Web has evolved from a distributed hypertext system to a platform-independent graphical user interface that integrates many network services. So far, its technology has restricted it mainly to applications for information retrieval. As networks become ubiquitous and more and more users have a permanent connection, there is an increasing demand for other network services, such as real-time data feeds, group communication, and teleconferencing. So far, these services have been provided by various proprietary software systems, which were hard to set up and use, and thus not very successful. Integrating real-time group communication services into the World Wide Web is a natural way to make them more accessible and will take the Web a step further on its way to becoming the universal network application. In this paper, we describe functionalities required for these services and present an implementation based on Sun Microsystems' Java2 programming language. We focus on the high-level functionalities and abstractions, but also describe an object-oriented programming model for group communication systems.

13.
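Research on Hidden Web retrieval based on tag-tree object extraction   Total citations: 6, self-citations: 0, citations by others: 6
Current standard search engines can retrieve only a small part of the information offered by the World Wide Web, known as the indexable Web. A large amount of Hidden Web information (estimated at 500 times the size of the indexable Web) is invisible to these search engines. This information is hidden behind the search forms of Web pages and stored in large dynamic databases. This paper proposes a set of methods for retrieving Hidden Web information, presents the framework of the system, and discusses the key implementation techniques in detail. The system uses a new Tag-Tree-based Object Extraction method to extract Hidden Web information from Web pages automatically, and on that basis gives a structured query algorithm for Hidden Web information. Finally, the paper discusses the experimental results.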

14.
This paper describes our research into a query-by-semantics approach to searching the World Wide Web. This research extends existing work, which had focused on a query-by-structure approach for the Web. We present a system that allows users not only to request documents containing specific content information, but also to specify that documents be of a certain type. The system captures and utilizes structure information as well as content during a distributed query of the Web. The system also gives users the option of creating their own document types by providing the system with example documents. In addition, although the system still gives users the option of dynamically querying the web, the incorporation of a document database has improved the response time involved in the search process. Based on extensive testing and validation presented herein, it is clear that a system that incorporates structure and document semantic information into the query process can significantly improve search results over the standard keyword search.

15.
As more information becomes available electronically, tools for finding information of interest to users become increasingly important. The goal of the research described here is to build a system for generating comprehensible user profiles that accurately capture user interest with minimum user interaction. The research focuses on the importance of a suitable generalization hierarchy and representation for learning profiles which are predictively accurate and comprehensible. In our experiments we evaluated both traditional features based on weighted term vectors and subject features corresponding to categories which could be drawn from a thesaurus. Our experiments, conducted in the context of a content-based profiling system for on-line newspapers on the World Wide Web (the IDD News Browser), demonstrate the importance of a generalization hierarchy and the promise of combining natural language processing techniques with machine learning (ML) to address an information retrieval (IR) problem.

16.
Searching for relevant information on the World Wide Web is often a laborious and frustrating task for casual and experienced users. To help improve searching on the Web based on a better understanding of user characteristics, we investigate what types of knowledge are relevant for Web-based information seeking, and which knowledge structures and strategies are involved. Two experimental studies are presented, which address these questions from different angles and with different methodologies. In the first experiment, 12 established Internet experts are first interviewed about search strategies and then perform a series of realistic search tasks on the World Wide Web. From this study a model of information seeking on the World Wide Web is derived and then tested in a second study. In the second experiment two potentially relevant types of knowledge are compared directly. Effects of Web experience and domain-specific background knowledge are investigated with a series of search tasks in an economics-related domain (introduction of the Euro currency). We find differential and combined effects of both Web experience and domain knowledge: while successful search performance requires the combination of the two types of expertise, specific strategies directly related to Web experience or domain knowledge can be identified.

17.
The explosion of the World Wide Web as a global information network brings with it a number of related challenges for automation. First, nontechnical users should be able to benefit from the information available on the Web without being overwhelmed by technical detail. Second, users should be freed from mundane and repetitive browsing tasks. Third, and most critical, information from the Web should be available in the format and combination that best fit the user's task, regardless of the pages on which the information was originally found. The article looks at these issues for Internet automation in the context of new software agent technologies that act as user surrogates for carrying out routine Web activity. Such surrogates enable automation of all interactions with HTML pages and forms (not merely the retrieval of specific URLs) and also the flexible integration of Web information into customized reports and other applications.

18.
Applied Soft Computing, 2007, 7(3): 746-771
The growth and advancement of the Internet and the World Wide Web have led to an explosion in the amount of available information. This staggering amount of information has made it extremely difficult for users to locate and retrieve information that is actually relevant to their task at hand. Dealing with this problem of “information overload” will need tools to customize the information space. In this paper we present MASACAD, a multi-agent system that learns to advise students by mining the Web, discuss important problems relating to information customization systems, and smooth the way for possible solutions. The main idea is to approach information customization using a multi-agent paradigm in combination with a number of aspects from the domains of machine learning, user modeling, and Web mining.

19.
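王志华 (Wang Zhihua), 金燕 (Jin Yan), 李占波 (Li Zhanbo). 《计算机工程》 (Computer Engineering), 2011, 37(11): 83-85, 88
Content-based semantic Web retrieval considers only the content itself and ignores differences between users, so it cannot accurately reflect user needs. To address this, an adaptive semantic Web retrieval framework is proposed. For Chinese Web documents, an ontology learning method is given with the help of the HowNet knowledge base. A user information base is built by extracting objective, explicit, and implicit user information, and algorithms are designed for constructing the user's initial query ontology and personalized query ontology, thereby achieving adaptive retrieval for the user. Experimental results show that the method achieves high retrieval efficiency.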

20.
Information sources in the World Wide Web usually offer two different schemas to their users: an Interface Schema, which the user can query, and a Result Schema, which the user can browse. Often the Interface Schema is more restricted than the Result Schema, and many sources offer keyword-search interfaces only. The query capabilities of such sources are therefore very limited, and a useful integration into a mediator-based information system using query capabilities is almost impossible. We propose the Query Tunneling architecture for the wrapping of these restricted web sources. Wrapping sources by Query Tunneling hides restrictive query interfaces and makes such sources fully queryable based on their Result Schema. The process of Query Tunneling is divided into two main steps: Query Relaxation, which makes a higher-order query suitable for a restricted interface, and Result Restriction, which filters the results using the original query.
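The two steps can be sketched as follows, assuming a query is a list of (attribute, operator, value) conditions over the Result Schema and the wrapped source exposes only a keyword search. The condition format, the operator names and the `keyword_search` callable are hypothetical and chosen for illustration; the abstract does not specify them.

```python
def relax(query):
    """Query Relaxation: keep only the keywords a keyword-only interface accepts."""
    return sorted({str(value).lower() for _, op, value in query if op == "contains"})

def restrict(records, query):
    """Result Restriction: re-apply the full original query to the returned records."""
    ops = {
        "contains": lambda field, value: str(value).lower() in str(field).lower(),
        "eq": lambda field, value: field == value,
        "ge": lambda field, value: field >= value,
    }
    return [
        record for record in records
        if all(ops[op](record[attr], value) for attr, op, value in query)
    ]

def tunnel(query, keyword_search):
    """Send the relaxed query to the source, then filter with the original query."""
    return restrict(keyword_search(relax(query)), query)
```

The relaxed query may return more records than the user asked for; the restriction step removes them again, so the caller sees the source as if it supported the richer query directly.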
