Similar Literature
 Found 20 similar documents; search time 140 ms
1.
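With the development of Web technology, more and more information must be obtained through the Deep Web. This article gives a comprehensive analysis of Deep Web search and proposes search strategies that address the shortcomings of traditional search engines. Finally, it introduces some effective search tools.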

2.
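Research on an Information Search System Based on Mobile Agents   Total citations: 2, self-citations: 0, citations by others: 2
This paper analyzes the shortcomings of network information search systems built on the traditional client/server model and introduces mobile Agent technology into the field of information search. It explains the concepts, characteristics, and development tools of Agents and mobile Agents, and analyzes the technical characteristics that make them suitable for information search. It proposes a prototype mobile-Agent-based network information search system (MAISS) to handle network information search, aiming for high efficiency, low overhead, and intelligent search, and analyzes and discusses in depth its structure and functions, implementation mechanism, and key technologies.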

3.
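Various evolutionary algorithms are examined from two different perspectives in search of a reasonable explanation. From the perspective of information theory, the paper analyzes how to represent, extract, transmit, exploit, and fuse information in algorithm design, establishing a link between optimization techniques and information theory. From the perspective of search, it analyzes single-point and multi-point search and points out that all evolutionary algorithms are numerical methods that place points randomly in an attempt to find the global optimum.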

4.
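Clustering-Based Browsing Techniques in Search Engines   Total citations: 1, self-citations: 0, citations by others: 1
Most search engines present search results to users as a list of documents. As the number of Web documents surges, it becomes increasingly difficult for users to find relevant information. One solution is to cluster the search results to make them easier to browse. Clustering-based browsing lets users view search results at a higher topical level and conveniently find information of interest. This paper introduces the basic requirements that clustering-based browsing in search engines places on clustering algorithms and how those algorithms are classified, analyzes the characteristics of the main clustering algorithms and their improved variants, discusses the evaluation of clustering quality, and finally points out development trends in clustering-based browsing.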

5.
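Survey of Content-Based Image and Video Search Reranking Techniques   Total citations: 1, self-citations: 1, citations by others: 1
Content-based image/video search reranking refers to reordering the original results of a text-based image/video search by some method that uses the visual information of the data. Its purpose is to improve search quality and the user's search experience, and it represents a new mode of Internet multimedia image/video search. This paper surveys the technique: it systematically analyzes the current state of reranking techniques, discusses in detail the characteristics and applications of each class of reranking technique, summarizes existing evaluation methods and databases, and points out current development trends.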

6.
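This paper uses Jacob technology to study search within embedded documents. It examines the shortcomings of the brute-force algorithm for search and proposes an algorithm that searches embedded documents with an improved KMP algorithm. To meet the need for fuzzy search, it also introduces the vector space model and applies it to embedded document search.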

7.
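Principles and Implementation of an Intelligent Search Engine   Total citations: 7, self-citations: 0, citations by others: 7
Chen Xin (陈鑫), Chang Zhiquan (常致全). 《计算机应用》 (Journal of Computer Applications), 2003, 23(Z2): 191-193
After analyzing the defects and shortcomings of traditional search engines, the paper introduces a search engine based on concept search and describes its basic implementation.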

8.
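The false positive rate is an important metric for static defect detection tools. After a comparative analysis of various false positive elimination techniques, this paper proposes a method that combines forward dataflow analysis with backward constraint search. The conservative dataflow solution from the forward analysis drives the iteration of defect states and yields the initial defect detection results. Based on the dataflow characteristics at the point where a defect occurs, a backward search finds the constraint conditions that could lead to the defect; these constraints can be fed to a general-purpose constraint solver to decide the satisfiability of the defect, thereby refining the initial detection results. In addition, symbolic execution is introduced into the dataflow analysis, which not only improves the precision of the dataflow analysis but also eases the representation and transformation of constraints and improves the efficiency of constraint search. Comparative experiments on 11 projects from SPEC CPU2000 show that the method effectively eliminates some false positives at a limited additional cost and scales better than comparable tools.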

9.
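Among protein structure prediction algorithms, homology modeling is considered the most successful to date. The paper points out a defect of the homology modeling algorithm and designs an improved algorithm to address it. The target-template alignment algorithm, based on structural information, improves search sensitivity and alignment accuracy.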

10.
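An Agent-Based Web Document Prefetching Model   Total citations: 1, self-citations: 2, citations by others: 1
The paper analyzes in depth users' access patterns for Internet resources and the update patterns of web documents themselves, and proposes a new agent-based architecture for a web document prefetching system. On this architecture, user access logs and various algorithms are used to discover topics of interest to a particular user and to proactively prefetch documents of interest, thereby improving the efficiency of information retrieval in distributed information systems.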

11.
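Information Discovery and Search Engine Technology on the WWW   Total citations: 36, self-citations: 1, citations by others: 36
As the Internet gradually comes into general use in China and the volume of Chinese-language information on the WWW keeps growing, there is an urgent need to develop Chinese-English Web indexing and retrieval service systems suited to China's conditions. The information discovery and search engine of the WWW, also called a robot, is responsible for searching for and retrieving relevant data within a specified scope. This paper discusses and analyzes the working principles and key technologies of Web search engines and describes our work on developing a Chinese-English Web indexing and retrieval server, including the overall system architecture and Chinese word segmentation.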

12.
This study investigates the cognitive strategies of 80 participants as they engaged in two researcher-defined tasks and two participant-defined information-seeking tasks using the WWW. Each researcher-defined task and participant-defined task was further divided into a directed search task and a general-purpose browsing task. On the basis of retrospective verbal protocols, log-file data and observations, 12 cognitive search strategies were identified and explained. The differences in cognitive search strategy choice between researcher-defined and participant-defined tasks and between directed search and general-purpose tasks were examined using correspondence analysis. These cognitive search strategies were compared to earlier investigations of search strategies on the WWW.

Relevance to industry

Describing information-seeking behaviours and cognitive search strategies in detail provides website developers and search engine developers with valuable insights into how users seek (and find) information of value to them. Using this information, website developers might gain some knowledge as to how to best represent the content and navigational properties of websites. Search engine developers might wish to make the search and collection strategies more transparent to users. There are also design implications for the designers of web browsers.


13.
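With the rapid development of Internet technology, the number of web pages has grown sharply, and search engines have become irreplaceable as the gateway through which people use the Internet. As the information source of a search engine, the web spider is an indispensable component. This paper introduces the key techniques in web spider design. In addition, as users' demand for personalization grows and the sharp increase in web pages leaves general-purpose search engines unable to meet the needs of specific users, specialized search engines have developed rapidly, and research on topic-focused crawlers has made major breakthroughs and progress. Unlike general-purpose crawlers, which emphasize completeness of crawling, topic-focused crawlers emphasize the relevance of pages to a specific topic. The paper also reviews and summarizes the current state of research on topic-focused crawlers.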

14.
The information accessible through the Internet is increasing explosively as the Web becomes more and more widespread. In this situation, the Web is an indispensable information resource for both information gathering and information searching. Though traditional information retrieval techniques have been applied to information gathering and searching on the Web, they are insufficient for this new form of information source. Fortunately, some AI techniques are straightforwardly applicable to such tasks on the Web, and many researchers are trying this approach. In this paper, we attempt to describe the current state of information gathering and searching technologies on the Web, and the application of AI techniques in these fields. Then we point out limitations of these traditional and AI approaches and introduce two approaches for overcoming them: navigation planning and the Mondou search engine. The navigation planning system tries to collect systematic knowledge, rather than Web pages, which are only pieces of knowledge. The Mondou search engine copes with the problems of query expansion/modification based on the techniques of text/web mining and information visualization.

Seiji Yamada, Dr. Eng.: He received the B.S., M.S. and Ph.D. degrees in control engineering and artificial intelligence from Osaka University, Osaka, Japan, in 1984, 1986 and 1989, respectively. From 1989 to 1991, he served as a Research Associate in the Department of Control Engineering at Osaka University. From 1991 to 1996, he served as a Lecturer in the Institute of Scientific and Industrial Research at Osaka University. In 1996, he joined the Department of Computational Intelligence and Systems Science at Tokyo Institute of Technology, Yokohama, Japan, as an Associate Professor. His research interests include artificial intelligence, planning, machine learning for robotics, intelligent information retrieval in the WWW, and human computer interaction. He is a member of AAAI, IEEE, JSAI, RSJ and IEICE.

Hiroyuki Kawano, Dr. Eng.: He is an Associate Professor at the Department of Systems Science, Graduate School of Informatics, Kyoto University, Japan. He obtained his B.Eng. and M.Eng. degrees in Applied Mathematics and Physics, and his Dr.Eng. degree in Applied Systems Science from Kyoto University. His research interests are in advanced database technologies, such as data mining, data warehousing, knowledge discovery and web search engine (Mondou). He has served on the program committees of several conferences in the areas of Data Base Systems, and technical committees of advanced information systems.

15.
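Information Processing in a Chinese-English WWW Search Engine   Total citations: 20, self-citations: 0, citations by others: 20
The paper describes issues related to information in WWW search engines and, in particular, discusses the key technologies of information processing in Chinese-language WWW search engines. On this basis, it proposes an implementation scheme for a Chinese-English WWW search engine and describes its information processing methods.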

16.
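With the rapid development of networks, search engines have increasingly become the mainstream tool for handling information. The Internet is the world's largest information repository with the most material. There are three ways to find information on the WWW: hypertext-based, directory-based, and search-engine-based information queries; the core tool of network information retrieval is the search engine. This paper covers three aspects: an overview of search engines, query techniques and methods, and future prospects.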

17.
External information search behaviour has long been of interest to consumer researchers. Experimental and post hoc survey research methodologies have typically used a large number of variables to record search activity. However, as these are usually considered in aggregate, there is little opportunity for the researcher to overview the search style of a consumer. To date, the diagrammatic illustration of search behaviour has been limited to experimental environments in which the available information was strictly bounded, for example, within databases or when information display boards have been used. This paper, which focuses largely on inter-site world wide web (WWW) search behaviour, discusses web search paradigms and the variables used to capture WWW search. It also provides a conceptual framework for the representation of external information search behaviour in diagrammatic form. The technique offers researchers an opportunity to holistically interpret information search data and search styles. The benefits include the identification of particular search styles, more precise interpretation of web search activity numeric data and the potential application for the training of web users to improve their search effectiveness.

18.
Search engines are useful because they allow the user to find information of interest from the World Wide Web (WWW). However, most of the popular search engines today are textual; they do not allow the user to find images from the web. For effective retrieval, determining the semantics of the images is essential. In this paper, we describe the problems in determining the semantics of images on the WWW and the approach of AMORE, a WWW search engine that we have developed. AMORE's techniques can be extended to other media like audio and video. We explain how we assign keywords to the images based on HTML pages and the method to determine similar images based on the assigned text. We also discuss some statistics showing the effectiveness of our technique. Finally, we present the visual interface of AMORE with the help of several retrieval scenarios.

19.
The World Wide Web (WWW) has been recognized as the ultimate and unique source of information for information retrieval and knowledge discovery communities. Tremendous amounts of knowledge are recorded using various types of media, producing an enormous number of web pages in the WWW. Retrieval of required information from the WWW is thus an arduous task. Different schemes for retrieving web pages have been used by the WWW community. One of the most widely used schemes is to traverse predefined web directories to reach a user's goal. These web directories are compiled or classified folders of web pages and are usually organized into hierarchical structures. The classification of web pages into proper directories and the organization of directory hierarchies are generally performed by human experts. In this work, we provide a corpus-based method that applies a kind of text mining technique on a corpus of web pages to automatically create web directories and organize them into hierarchies. The method is based on the self-organizing map learning algorithm and requires no human intervention during the construction of web directories and hierarchies. The experiments show that our method can produce comprehensible and reasonable web directories and hierarchies.

20.
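Distributed WWW Information Gathering Techniques   Total citations: 14, self-citations: 0, citations by others: 14
The paper discusses distributed information gathering techniques for WWW search engines, introduces the concept of optimal partitioning of robots' scope of operation, and gives a practical method for estimating the cost of information gathering together with a concrete algorithm for achieving the optimal partitioning.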
