Similar Literature
 20 similar documents found
1.
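A Web Page Prefetching Method Based on a Structural-Relevance Markov Model   Total citations: 2, self-citations: 0, citations by others: 2
Prefetching reduces the page-retrieval latency a user actually perceives by fetching, while the user is viewing the current page, the pages the user is most likely to request next. The main problems prefetching must solve are prediction accuracy and practical usability. To address the shortcomings of common Web page prefetching methods, this paper proposes a Web page prefetching method based on a structural-relevance Markov model. Simulation results show that the method maintains reasonable prediction accuracy while remaining practical to use, and that it achieves satisfactory results in reducing user access latency and improving response speed.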
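
The prefetching idea in this abstract can be illustrated with a minimal sketch. It assumes a plain first-order Markov model over observed page transitions; the abstract does not describe how structural relevance enters the model, so that part is not reproduced here. The function names, the `min_prob` threshold, and the sample sessions are all hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(sessions):
    """Count page-to-page transitions observed in browsing sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_prefetch(counts, current_page, top_k=1, min_prob=0.3):
    """Return up to top_k next pages whose estimated probability is >= min_prob."""
    followers = counts.get(current_page)
    if not followers:
        return []  # page never seen as a transition source: prefetch nothing
    total = sum(followers.values())
    return [page for page, n in followers.most_common(top_k) if n / total >= min_prob]


# Hypothetical sessions, for illustration only.
sessions = [
    ["/index", "/news", "/news/1"],
    ["/index", "/news", "/news/2"],
    ["/index", "/about"],
]
model = train_transitions(sessions)
print(predict_prefetch(model, "/index"))
```

The threshold reflects the trade-off the abstract names: a higher `min_prob` wastes fewer fetches on wrong guesses, at the cost of prefetching less often.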

2.
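Web Page Change and Incremental Crawling Techniques   Total citations: 9, self-citations: 1, citations by others: 8
Meng Tao, Wang Jimin, Yan Hongfei. Journal of Software (《软件学报》), 2006, 17(5): 1051-1067
The rapid growth of information on the Internet has made incremental crawling an effective means of acquiring online information, because it avoids the time and resources wasted on re-crawling pages that have not changed. Discovering and exploiting the patterns of Web page change is key to incremental crawling: these patterns are used to predict when a page will next change, and even by how much. On that basis, an incremental crawling system must also weigh each page's change frequency, degree of change, and importance, and choose an optimal task-scheduling algorithm that determines how often, and in what relative order, different pages are crawled. This paper surveys research results of recent years on Web page change and incremental crawling and introduces the latest progress. It first discusses the modeling of page-change patterns, the estimation of model parameters, and estimation efficiency. It then introduces several well-known incremental crawling systems, focusing on their task-scheduling algorithms. Finally, it analyzes and summarizes, from a theoretical standpoint, the optimal task-scheduling algorithm for incremental crawling systems and an approximate solution based on a heuristic strategy, and predicts future research trends. The work is a useful reference for designing incremental crawling systems and for studying the evolution of the Web.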

3.
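Lei Kai, Wang Donghai. Computer Engineering (《计算机工程》), 2008, 34(13): 78-80,1
To address the weaknesses of traditional periodic, centralized crawlers and the difficulties of incremental crawlers, this paper proposes a predictive update strategy; presents an MD5-based algorithm for detecting page updates, a URL scheduling algorithm, and a URL caching algorithm; describes the implementation of the distributed architecture of the system's modules; and builds a test data set to evaluate the algorithms. The system has run on the Peking University Tianwang search engine for more than half a year: the update cycle was shortened by 20 days, the hit rate of change prediction reached 79.4%, and timeliness, scalability, and stability improved.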
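
As a rough illustration of MD5-based update detection, the sketch below stores one digest per URL and compares it with the digest of the newly fetched content. This is only an assumption about how such a check might look, not the authors' algorithm; `page_digest`, `has_changed`, and the in-memory `digest_store` dictionary are hypothetical names.

```python
import hashlib


def page_digest(content: bytes) -> str:
    """Return the MD5 hex digest of the raw page content."""
    return hashlib.md5(content).hexdigest()


def has_changed(url: str, content: bytes, digest_store: dict) -> bool:
    """Report whether the page is new or differs from the last crawl.

    The stored digest is refreshed on every call so the next crawl
    compares against the latest version.
    """
    new_digest = page_digest(content)
    old_digest = digest_store.get(url)
    digest_store[url] = new_digest
    return old_digest != new_digest


store = {}
print(has_changed("http://example.com/", b"<html>v1</html>", store))  # first visit
print(has_changed("http://example.com/", b"<html>v1</html>", store))  # unchanged
print(has_changed("http://example.com/", b"<html>v2</html>", store))  # modified
```

A digest of the whole page flags any byte-level difference, so in practice volatile regions such as timestamps or advertisements would probably need to be stripped before hashing.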

4.
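This paper proposes an incremental Bayes automatic classification algorithm based on unlabeled Chinese Web pages. Experimental results show that the algorithm is feasible and effective.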

5.
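An Engineering Representation Method for Partial Web Page Updates   Total citations: 1, self-citations: 0, citations by others: 1
Li Xiaolin. Computer & Digital Engineering (《计算机与数字工程》), 2009, 37(10): 190-191,203
To reduce network traffic and shorten system response time, partial page update techniques such as iframe and Ajax are widely used in Web application systems. To express the technical requirements for partial page updates in a standardized way, this article proposes an engineering representation method based on planar composition diagrams and collaboration diagrams, and illustrates it with an example.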

6.
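In teaching the Web Page Design course (《网页制作》) to computer majors at vocational colleges, teachers design task sheets with different emphases, stage by stage and module by module. Tasks for the basic-knowledge part should integrate theory with practice; tasks for the page-layout part should build students' proficiency; tasks for the mastery and advancement stage aim to cultivate students' creativity and capacity for innovation. Carrying out the task sheets develops students' abilities in self-directed learning, innovative design, and teamwork.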

7.
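In teaching the Web Page Design course (《网页制作》) to computer majors at vocational colleges, teachers design task sheets with different emphases, stage by stage and module by module. Tasks for the basic-knowledge part should integrate theory with practice; tasks for the page-layout part should build students' proficiency; tasks for the mastery and advancement stage aim to cultivate students' creativity and capacity for innovation. Carrying out the task sheets develops students' abilities in self-directed learning, innovative design, and teamwork.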

8.
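General-purpose search engines suffer from topic drift, in which query results are unrelated to the domain of the keywords. This paper proposes a domain-specific Web page re-ranking algorithm, TSRR (Topic Sensitive Re-Ranking), which approaches the topic drift problem from a new perspective. TSRR designs a model, independent of page ranking, to represent the domain, and then builds a page information model; during retrieval it combines the domain vector model and the page information model to re-rank the search results. Evaluated on a crawled domain-specific data set by user satisfaction and accuracy, TSRR outperforms the classic Lucene-based ranking algorithm, improving user satisfaction by 17.3% and accuracy by 41.9% on average.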

9.
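An Efficient Incremental Updating Algorithm for Association Rules   Total citations: 9, self-citations: 0, citations by others: 9
To address the maintenance of association rules, an efficient incremental updating algorithm, FIUA, is designed. FIUA is compared with the existing IUA algorithm, and experiments confirm its efficiency.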

10.
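The neighborhood multigranulation rough set model is an important research branch of rough set theory. In big-data environments, however, data are updated continuously. For numerical information systems whose objects change dynamically, this paper proposes an incremental updating algorithm for the neighborhood multigranulation rough set model. It first uses matrices to represent two approximation relations between neighborhood classes and the target approximation set, called the subset approximation relation matrix and the intersection approximation relation matrix, and reconstructs the neighborhood multigranulation rough set model from these two matrices. It then studies how the two matrices are updated incrementally when objects are added to or removed from a numerical information system; theoretical analysis demonstrates the efficiency of this update method. Finally, it designs an incremental updating algorithm for the model based on the incremental update of the approximation relation matrices. Experimental results verify the effectiveness and superiority of the proposed incremental algorithm.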

11.
This paper presents a new, fast, deterministic model selection methodology for incremental radial basis function neural network (RBFNN) construction in time series prediction problems. The development of such a specially designed methodology is motivated by the problems that arise when using a K-fold cross-validation-based model selection methodology for this paradigm: its random nature and the subjective choice of a proper value of K, resulting in large bias for low values and high variance and computational cost for high values. Taking these drawbacks into account, the proposed model selection approach is a combined algorithm that takes advantage of two balanced and representative training and validation sets for use in RBFNN initialization, optimization and network model evaluation. In this way, the model prediction accuracy is improved, with small variance and bias, while the computation time spent in selecting the model is reduced and random, computationally expensive model selection methodologies based on K-fold cross-validation procedures are avoided.

12.
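To solve the color degradation of multimedia Web page images caused by inconsistent display brightness, an image color enhancement model based on the RGB mode is designed. Following the requirements of the RGB mode, the visual image is smoothed in terms of its illumination and reflectance components; on this basis, a color enhancement function for the visual image is built using enhancement coefficients in RGB format. The color of the visual image is enhanced by raising the overall brightness of visual images in multimedia Web pages, adjusting local contrast, and restoring image color. The enhancement effect was tested on Windows XP. The results show that the RGB-based color enhancement model effectively increases image brightness, information entropy, and saturation, indicating that the model is effective, outperforms the traditional histogram equalization model in color enhancement, and meets the standard for practical deployment.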

13.
To date, most of the focus regarding digital preservation has been on replicating copies of the resources to be preserved from the “living web” and placing them in an archive for controlled curation. Once inside an archive, the resources are subject to careful processes of refreshing (making additional copies to new media) and migrating (conversion to new formats and applications). For small numbers of resources of known value, this is a practical and worthwhile approach to digital preservation. However, due to the infrastructure costs (storage, networks, machines) and more importantly the human management costs, this approach is unsuitable for web scale preservation. The result is that difficult decisions need to be made as to what is saved and what is not saved. We provide an overview of our ongoing research projects that focus on using the “web infrastructure” to provide preservation capabilities for web pages and examine the overlap these approaches have with the field of information retrieval. The common characteristic of the projects is they creatively employ the web infrastructure to provide shallow but broad preservation capability for all web pages. These approaches are not intended to replace conventional archiving approaches, but rather they focus on providing at least some form of archival capability for the mass of web pages that may prove to have value in the future. We characterize the preservation approaches by the level of effort required by the web administrator: web sites are reconstructed from the caches of search engines (“lazy preservation”); lexical signatures are used to find the same or similar pages elsewhere on the web (“just-in-time preservation”); resources are pushed to other sites using NNTP newsgroups and SMTP email attachments (“shared infrastructure preservation”); and an Apache module is used to provide OAI-PMH access to MPEG-21 DIDL representations of web pages (“web server enhanced preservation”).
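
The "lexical signature" mentioned for just-in-time preservation can be pictured as a handful of terms that distinguish one page from the rest of a collection, which could then be submitted as a search query to relocate the page. The sketch below assumes a simple TF-IDF weighting; the abstract does not say how the signatures are computed, and the tokenizer, the smoothed IDF formula, and the tiny corpus are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_signature(document, corpus, k=5):
    """Return the k terms of `document` with the highest TF-IDF weight.

    Assumed weighting: tf * log((N + 1) / (df + 1)), so a term that
    occurs in every document of the corpus gets weight 0.
    """
    doc_terms = Counter(tokenize(document))
    doc_freq = Counter()
    for other in corpus:
        doc_freq.update(set(tokenize(other)))
    n_docs = len(corpus)
    scores = {
        term: tf * math.log((n_docs + 1) / (doc_freq[term] + 1))
        for term, tf in doc_terms.items()
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical three-page corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the quantum archive preserves the web",
]
print(lexical_signature(corpus[2], corpus, k=3))
```

Terms shared by every page, such as "the" here, score zero and drop out, which is the property that makes a signature useful as a query.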

14.
Social media sites (e.g., Flickr) generate a huge number of real-world landmark photos with temporal information, such as photos describing events happening near landmarks and photos showing different seasonal scenery. Analyzing this temporal information of landmarks can benefit various applications, such as landmark timeline construction and tour recommendation. In this paper, we propose a novel Incremental Spatio-Temporal Theme Model (ISTTM), which can incrementally mine temporal themes that characterize the temporal information of landmarks by differentiating them from three other kinds of themes: general themes shared by most landmarks, local themes related to certain landmarks, and the background theme comprising non-informative content. ISTTM works in an online way and is capable of selectively processing the updates of the distributions on different types of themes. Based on the proposed ISTTM, we present a framework, namely Temporal Theme Analysis for Landmarks (TTAL), which enables both periodic theme detection from discovered temporal themes and temporal theme visualization by selecting the relevant photos. We have conducted experiments on a large-scale landmark dataset from Flickr. Qualitative and quantitative evaluation results demonstrate the effectiveness of the ISTTM as well as the TTAL framework.

15.
Several techniques have been proposed to support user navigation of large information spaces (e.g., maps or web pages) on small-screen devices such as PDAs and Smartphones. In this paper, we present the results of an evaluation that compared three of these techniques to determine how they might affect performance and satisfaction of users. Two of the techniques are quite common on mobile devices: the first one (DoubleScrollbar) is the standard combination of two scrollbars for separate horizontal and vertical scrolling with zoom buttons to change the scale of the information space, the second one (Grab&Drag) enables users to navigate the information space by directly dragging its currently displayed portion, while zooming is handled through a slider control. The last technique (Zoom-Enhanced Navigator or ZEN) is an extension and adaptation to mobile screens of Overview&Detail approaches, which are based on displaying an overview of the information space together with a detail view of a portion of that space. In these approaches, navigation is usually supported by (i) highlighting in the overview which portion of space is displayed in the detail view, and (ii) allowing users to move the highlight within the overview area to change the corresponding portion of space in the detail area. Our experimental evaluation concerned tasks involving maps as well as web page navigation. The paper analyzes in detail the obtained results in terms of task completion times, number and duration of user interface actions, accuracy of the gained spatial knowledge, and subjective preferences.

16.
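The API model is widely used in flood forecasting, water resources management, and related fields, providing an important basis for major decisions on flood control, drought relief, and water resources utilization. The API model's curves are usually drawn by hand, which limits both accuracy and efficiency. To improve the forecasting accuracy of the API model, the Jiangsu Province flood early-warning and forecasting platform adopted the shuffled frog leaping algorithm (SFLA) during its development to optimize API model curve fitting. This paper describes how SFLA is used for parameter optimization and automated curve fitting of the API model. Curves were drawn for 71 flood events using both SFLA and the traditional method; the comparison confirms that SFLA effectively improves the forecasting accuracy of the API model.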

17.
In the era of ubiquitous computing, applications are emerging that benefit from using devices of different users and different capabilities together. This paper focuses on user-centric web browsing using multiple devices, where the content of a web page is partitioned, adapted and allocated to devices in the vicinity. We contribute two novel web page partitioning algorithms. They differ from existing approaches by allowing for both automatic and semi-automatic partitioning. On the one hand, this provides good automatic, web page independent results by utilizing sophisticated structural pre- and postprocessing of the web page. On the other hand, these results can be improved by considering additional semantic information provided through user-generated web page annotations. We further present a performance evaluation of our algorithms. Moreover, we contribute the results of a user study. These clearly show that (1) our algorithms provide good automatic results and (2) the application of user-centric, annotation-based semantic information leads to a significantly higher user satisfaction.

18.
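Search Engine Optimization of English-Language Websites and Their Overseas Promotion Strategies   Total citations: 2, self-citations: 0, citations by others: 2
This article gives an overview of search engines, optimization techniques, and overseas website promotion strategies, including the definition of a search engine, how search engines work, the main methods of website optimization, and overseas promotion strategies. Important search engine technologies are also described in detail.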

19.
Ranking web pages so as to present the pages most relevant to users' queries is one of the main issues in any search engine. In this paper, two new ranking algorithms are offered, using Reinforcement Learning (RL) concepts. RL is a powerful technique of modern artificial intelligence that tunes an agent's parameters interactively. In the first step, by formulating ranking as an RL problem, a new connectivity-based ranking algorithm, called RL_Rank, is proposed. In RL_Rank, the agent is considered as a surfer who travels between web pages by clicking randomly on a link in the current page. Each web page is considered as a state, and the value function of a state is used to determine the score of that state (page). The reward corresponds to the number of out-links from the current page. Rank scores in RL_Rank are computed recursively, and convergence of these scores is proved. In the next step, we introduce a new hybrid approach that combines BM25, a content-based algorithm, with RL_Rank. Both proposed algorithms are evaluated on well-known benchmark datasets and analyzed according to the relevant criteria. Experimental results show that using RL concepts leads to significant improvements in ranking algorithms.

20.
An incremental model is developed for the simulation of elasto-static multiphase frictional contact problems. The model employs a proposed local and nonlinear friction law, and exploits the incremental convex programming method in the framework of the finite element scheme. The proposed model accommodates the two types of inequality constraints, which represent the non-interpenetration and slipping conditions.
