Similar literature
1.
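Large websites are the core of online information; their scale and update speed far exceed those of small and medium-sized sites, and how well their pages are crawled directly affects the overall performance of a search engine. Building on an analysis of category-based page update strategies, this paper proposes an incremental information update method tailored to the characteristics of large websites. Experimental analysis shows that the method substantially improves the efficiency with which a search engine updates pages from large websites.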

2.
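This paper describes the relationship between websites and search engines and, from there, how website construction affects search engines. It examines in depth how page naming, titles, and header tags contribute to building a search-engine-friendly website. When a search engine visits and indexes such a site, it can quickly grasp the gist of each page and capture the page's information in full, so that users searching for information receive more content related to the site.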

3.
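This paper describes the relationship between websites and search engines and, from there, how website construction affects search engines. It examines in depth how page naming, titles, and header tags contribute to building a search-engine-friendly website. When a search engine visits and indexes such a site, it can quickly grasp the gist of each page and capture the page's information in full, so that users searching for information receive more content related to the site.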

4.
Research on web page de-duplication methods   Cited by: 2 (self-citations: 0, citations by others: 2)
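With the rapid development of Internet technology, the number of websites has multiplied. These sites provide a great deal of information, but much of it is duplicated across sites and is indexed repeatedly by search engines, so users often find the same information returned from many different websites. The proposed method uses information extraction to obtain the main text of each page, applies an encryption technique to convert the text string into a unique numeric string, and compares these numeric strings to mark pages with identical content, thereby improving search engine efficiency and quality.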
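
The fingerprint-and-compare step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the transformation it uses, so SHA-256 from Python's standard `hashlib` stands in for it, and the function names `fingerprint` and `mark_duplicates` are hypothetical. The main-text extraction step is assumed to have already happened.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Return a hex digest for the normalized body text of a page.

    SHA-256 is an assumption; the abstract only says the text is
    converted into a unique numeric string.
    """
    # Collapse whitespace and lowercase so that trivial formatting
    # differences do not change the digest.
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def mark_duplicates(pages: dict) -> dict:
    """Map each duplicate URL to the first URL seen with the same digest."""
    first_seen = {}   # digest -> first URL carrying that content
    duplicates = {}   # duplicate URL -> original URL
    for url, body in pages.items():
        digest = fingerprint(body)
        if digest in first_seen:
            duplicates[url] = first_seen[digest]
        else:
            first_seen[digest] = url
    return duplicates
```

An exact digest only catches pages whose normalized text is identical; near-duplicates with small edits would need a different technique.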

5.
Implementation of a Nutch-based full-text search engine for XML websites   Cited by: 2 (self-citations: 0, citations by others: 2)
吴敏琦, 丁岳伟. 《计算机工程》 (Computer Engineering), 2008, 34(15): 95-96, 1
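The page crawlers of ordinary search engines understand only common HTML tags and cannot effectively parse the content of XML websites. This paper builds a pure-XML website containing dynamic custom tags, proposes a scheme that uses XSL style information to help the crawler understand the meaning of XML page tags, and implements a Nutch-based full-text search engine for XML websites.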

6.
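Skill levels in web page optimization vary widely, and the techniques change quickly. Search engine optimization is only one part of website optimization design, whose core remains optimization for the user. An optimized page arranges multimedia elements coherently within limited screen space and expresses rational thinking in an individual way, so that it conveys information while also giving the viewer a sense of beauty and pleasure.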

7.
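Crawlers are an important component of search engines, and how to keep the local mirror fresh has become a focus of crawler research. Based on the observation that page updates follow a Poisson process, this paper proposes a method for synchronizing the local database with remote websites in a timely manner. By keeping a history of each page's updates, it estimates the page's update frequency and uses that to set how often the crawler visits the page. Experiments demonstrate the feasibility of the Poisson-process-based crawler scheduling strategy.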
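
As a rough illustration of the scheduling idea in this abstract, the sketch below estimates a per-page change rate from crawl history and derives a revisit interval from it. It is not the paper's algorithm: the function names, the simple changes-per-day estimator, and the `max_stale_prob` threshold are all assumptions made for the example. Under a Poisson process with rate λ, the probability of at least one change within t days is 1 − e^(−λt), which is the only formula used.

```python
import math


def estimate_rate(changes_detected: int, observed_days: float) -> float:
    """Estimate the change rate (changes per day) from crawl history.

    This naive ratio is an assumption; it undercounts when a page
    changes more than once between two visits.
    """
    if observed_days <= 0:
        raise ValueError("observed_days must be positive")
    return changes_detected / observed_days


def prob_changed(rate: float, days_since_visit: float) -> float:
    """Probability of at least one change in the interval (Poisson model)."""
    return 1.0 - math.exp(-rate * days_since_visit)


def revisit_interval(rate: float, max_stale_prob: float = 0.5) -> float:
    """Longest interval (days) that keeps the chance of a stale copy
    at or below max_stale_prob (hypothetical threshold, 0 < p < 1)."""
    if rate <= 0:
        return math.inf  # page never observed to change
    return -math.log(1.0 - max_stale_prob) / rate
```

Pages with a higher estimated rate get a shorter interval, which matches the abstract's idea of tying visit frequency to observed update frequency.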

8.
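Skill levels in web page optimization vary widely, and the techniques change quickly. Search engine optimization is only one part of website optimization design, whose core remains optimization for the user. An optimized page arranges multimedia elements coherently within limited screen space and expresses rational thinking in an individual way, so that it conveys information while also giving the viewer a sense of beauty and pleasure.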

9.
吕月娥, 李信利. 《福建电脑》 (Fujian Computer), 2007, (2): 99-99, 122
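With the development of Web technology, the number of web pages keeps growing. Current search engines mechanically return pages that match a logical combination of the user's query terms, which produces too much junk information. This paper considers page information category, page update time, and user click counts, and proposes a page filtering algorithm based on information category. The algorithm considerably improves query results and search engine performance.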

10.
夏斌, 徐彬. 《电脑开发与应用》 (Computer Development & Applications), 2007, 20(5): 16-17, 20
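Search engines currently return so many candidate results that users cannot accurately find those relevant to their topic. To address this, the paper proposes a method for clustering search results based on hyperlink information: by mining hyperlink anchor text and page content, pages are grouped into sub-categories. While clustering on page content, the method also makes full use of Web structure and hyperlink information, reflects the content characteristics of website documents better than traditional structure mining methods, and thus improves clustering accuracy.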

11.
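Discovering topical information sources is a prerequisite for topical Web information integration. This paper proposes a method that recasts information source discovery as a website topic classification problem and uses off-site links to discover new sources. Content feature terms and structural feature terms that reflect a website's topic are extracted from the site to build an improved vector space model describing its topic. On top of this model, websites are classified by topic using a combination of the class-centroid vector method and SVM. A Web search strategy is proposed that crawls as few pages as possible, fetching the pages most representative of a site's topic while discovering off-site links. The method is applied to forestry business information sources, and experiments verify its effectiveness.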
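
The class-centroid part of the classification step might look something like the sketch below. It covers only the centroid method over plain term-frequency vectors; the paper's improved vector space model, its structural feature terms, and the SVM stage are not reproduced. All function names and the sample terms are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroids(labelled_sites):
    """Average the term-frequency vectors of the sites in each topic.

    labelled_sites is an iterable of (topic, list_of_feature_terms).
    """
    sums = defaultdict(Counter)
    counts = Counter()
    for topic, terms in labelled_sites:
        sums[topic].update(Counter(terms))
        counts[topic] += 1
    return {
        topic: {term: total / counts[topic] for term, total in vec.items()}
        for topic, vec in sums.items()
    }


def classify(terms, topic_centroids):
    """Assign the topic whose centroid is most similar to the site's terms."""
    vec = Counter(terms)
    return max(topic_centroids, key=lambda t: cosine(vec, topic_centroids[t]))
```

A site is assigned to the nearest centroid by cosine similarity; in a combined scheme, an SVM could be reserved for sites whose top two similarities are close, though that split is a guess rather than something the abstract states.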

12.
Website browsing aid: A navigation graph-based recommendation system   Cited by: 1 (self-citations: 1, citations by others: 0)
Websites nowadays are an important and popular source of publicly available information. However, due to their exploding scale and complexity, overcoming information overload to find relevant information is a major challenge. In addition to website maps and search engines, self-adaptive websites or websites with intelligent navigation aid are very useful tools in addressing this issue. In this paper, a navigation graph-based recommendation system is proposed, in which the navigation patterns of previous website visitors are utilized to provide recommendations for newcomers. The performance of the proposed recommendation algorithm is tested using the data collected from a real world website. Experimental results reveal that the proposed system can yield satisfactory recommendations, especially to the visitors in their early navigation steps.
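
One simple way to picture a navigation graph built from earlier visitors is a table of page-to-page transition counts, as in the sketch below. The abstract does not describe the actual recommendation algorithm, so ranking by raw transition frequency is an assumption, as are the function names and the session format (a list of page identifiers per visit).

```python
from collections import Counter, defaultdict


def build_navigation_graph(sessions):
    """Count page-to-page transitions observed in past visitor sessions."""
    graph = defaultdict(Counter)
    for session in sessions:
        # Pair each page with the page visited immediately after it.
        for current_page, next_page in zip(session, session[1:]):
            graph[current_page][next_page] += 1
    return graph


def recommend(graph, current_page, visited=(), k=3):
    """Return up to k most frequent next pages not yet seen by the visitor."""
    candidates = graph.get(current_page, Counter())
    ranked = [page for page, _ in candidates.most_common()
              if page not in visited]
    return ranked[:k]
```

Because the recommendation depends only on the current page, it is available from a newcomer's first click, which fits the abstract's note that results are best in early navigation steps.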

13.
Decisions concerning everyday life activities such as patronizing restaurants require obtaining information about them. Some consumers go directly to content websites when they need such information; others go directly to search engines. How do search engine users differ from content website users for a given type of local information? This local information-seeking classification model posits that they differ in their prior experiences with their “go-to” websites, their perceived search skills, their habit of using search engines, their involvement with the activity for which information is sought, their tendency to conduct extensive information search, and their beliefs about their “go-to” website types. Empirical results support the model. By integrating everyday life information seeking (ELIS), technology acceptance model (TAM), and consumer behavior literatures, the model in this study fills a theoretical gap in the literature and opens new lines of inquiries for both ELIS and TAM research.

14.
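On-site search has become a hot topic in Web applications. After analyzing and comparing on-site search techniques, this paper proposes a general method for building a Web on-site search engine on Sphinx, based on the characteristics of Sphinx's system structure and operating mechanism. For websites built with LAMP, the method conveniently produces a high-performance on-site search engine without modifying the site's existing architecture.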

15.
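University websites are growing in scale, and site construction must move from "a collection of websites" to a "website cluster" in order to raise the university's overall management level. At present, however, departments have limited technical capacity, and information management and maintenance lag behind. Making website construction user-friendly, well-organized, and secure has therefore become an important issue, and raising the level of website management services by building high-quality, efficient services is a shared priority for university website staff.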

16.
This study examined the utility of the concept of expressive aesthetics by testing websites that did or did not match this concept. A website scoring highly on this concept was created and was then compared to websites that were either non-aesthetic or corresponded to the concept of classical aesthetics. Sixty website users of a broad age range (18–60 years) were allocated to three experimental groups (expressive, classical, and non-aesthetic) and asked to complete a series of information search tasks. During the experiment, measures were taken of performance, perceived usability, perceived aesthetics, emotion, and trustworthiness. The results showed that expressive aesthetics can be considered a distinct concept. It also emerged that the website scoring high on expressive aesthetics shows a similar pattern of results to classical aesthetics. Both aesthetically appealing websites received higher ratings of perceived usability and trustworthiness than the non-aesthetic website. The effects of website aesthetics on subjective measures were not moderated by age.

17.
The objectives of this research were to identify design attributes to develop easy-to-use websites for older adults. Forty-one males and 58 females (age range 58–90) were asked to retrieve information on a health-related topic from the NHS Direct and Medicdirect websites, and were asked to fill in a website evaluation questionnaire. An exploratory factor analysis of data identified navigation/search usability, link usability, usefulness and colour as important dimensions of a senior-friendly website. A two-stage, three-component regression model with these dimensions as predictor variables and the satisfaction level in using a website as the dependent variable has been proposed.

18.
Truth Discovery with Multiple Conflicting Information Providers on the Web   Cited by: 1 (self-citations: 0, citations by others: 1)
The World Wide Web has become the most important information source for most of us. Unfortunately, there is no guarantee for the correctness of information on the Web. Moreover, different websites often provide conflicting information on a subject, such as different specifications for the same product. In this paper, we propose a new problem, called Veracity, i.e., conformity to truth, which studies how to find true facts from a large amount of conflicting information on many subjects that is provided by various websites. We design a general framework for the Veracity problem and invent an algorithm, called TRUTHFINDER, which utilizes the relationships between websites and their information, i.e., a website is trustworthy if it provides many pieces of true information, and a piece of information is likely to be true if it is provided by many trustworthy websites. An iterative method is used to infer the trustworthiness of websites and the correctness of information from each other. Our experiments show that TRUTHFINDER successfully finds true facts among conflicting information and identifies trustworthy websites better than the popular search engines.
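
The mutual reinforcement between website trustworthiness and fact confidence can be illustrated with a deliberately simplified iteration. The sketch below implements only the two rules stated in this abstract and should not be read as the published algorithm: it ignores how conflicting facts about the same subject affect each other, and the function name, the `initial_trust` value, and the fixed iteration count are assumptions.

```python
def truth_finder(claims, initial_trust=0.9, iterations=20):
    """Iteratively estimate site trust and fact confidence.

    claims maps each website to the set of facts it asserts.
    Returns (trust per site, confidence per fact).
    """
    trust = {site: initial_trust for site in claims}
    facts = {fact for asserted in claims.values() for fact in asserted}
    confidence = {}
    for _ in range(iterations):
        # Rule 1: a fact is likely true if many trustworthy sites assert it.
        # Treating providers as independent, the chance that all of them
        # are wrong is the product of (1 - trust).
        for fact in facts:
            all_wrong = 1.0
            for site, asserted in claims.items():
                if fact in asserted:
                    all_wrong *= 1.0 - trust[site]
            confidence[fact] = 1.0 - all_wrong
        # Rule 2: a site is trustworthy if the facts it asserts are
        # confident on average.
        for site, asserted in claims.items():
            if asserted:
                trust[site] = sum(confidence[f] for f in asserted) / len(asserted)
    return trust, confidence
```

With this simplification, a fact backed by several sites ends up more confident than one backed by a single site, and sites asserting well-supported facts gain trust in turn.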

19.
Despite the rapidly increasing numbers of informational websites and learners using the World Wide Web to research topics, empirical evidence on the relationships between website design elements and website trustworthiness is scarce. This study used self-reported and behavioral screen-capture data to investigate the impact of Don Norman's (2003) emotional design levels and metacognitive awareness on website trustworthiness during an information search learning task. The results suggest that the interaction effects of website visual appeal (visceral level) and website usability (behavioral level) can override the effects of the quality or relevance of the information (reflective level) on website evaluation. In addition, in the context of limited time to find the answers, these effects on the evaluation of website trustworthiness are not moderated by users’ metacognitive awareness.

20.
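Many resources on the Internet embody collective human intelligence. Website directories organize sites by topic manually. This paper designs a URL topic classifier from the topic-labelled URLs in a website directory and, combining pseudo-relevance feedback with search engine query logs, proposes an automatic, fast, and effective method for classifying queries by topic. Specifically, the method combines two strategies. Strategy 1 predicts the query topic by computing the topic distribution of the URLs in the search results. Strategy 2 uses click relations in the query log and topic-labelled URLs to label queries, obtaining data with which to train a statistical classifier that predicts the query topic. Experiments show that the method achieves better accuracy than the current best algorithm and better online processing efficiency, and that it scales well because training data can be obtained automatically from query logs.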
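
Strategy 1 in this abstract, predicting a query's topic from the topic distribution of its result URLs, could be sketched along these lines. The paper trains a URL topic classifier; here a plain host-to-topic lookup table stands in for it, which is a substantial simplification. The function name, the `host_topics` table, and the example hosts are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def classify_query(result_urls, host_topics):
    """Predict a query's topic from the topics of its search-result URLs.

    host_topics is a stand-in for a URL topic classifier: it maps a
    host name to a topic label taken from a website directory.
    Returns (predicted topic or None, topic distribution).
    """
    counts = Counter()
    for url in result_urls:
        host = urlparse(url).netloc.lower()
        topic = host_topics.get(host)
        if topic is not None:  # skip URLs the directory does not cover
            counts[topic] += 1
    if not counts:
        return None, {}
    total = sum(counts.values())
    distribution = {topic: n / total for topic, n in counts.items()}
    return counts.most_common(1)[0][0], distribution
```

The returned distribution, not just the top label, is what a combined method could merge with the output of a log-trained classifier such as the one in Strategy 2.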
