Similar documents
20 similar documents found.
1.
A User Browsing Interest Migration Mining Algorithm Integrating Web Usage Mining and Content Mining
This paper proposes a model and algorithm for mining user browsing interest migration patterns that integrates Web usage mining and content mining. Web pages and their clustering are introduced. User browsing interest sequences are obtained by replacing the pages in each user transaction with their corresponding cluster, and user browsing interest migration patterns are then derived from these interest sequences. The model is of considerable value to site administrators for understanding user behavior and organizing Web site structure.
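A minimal Python sketch of the core idea described above (replacing pages in a transaction with their content-cluster IDs and counting cluster-to-cluster transitions); the page_to_cluster mapping and function names are illustrative assumptions, not taken from the paper:

```python
# Minimal sketch (not the paper's implementation): turn per-user page
# transactions into browsing-interest sequences by replacing each page
# with the ID of its content cluster, then pair consecutive cluster IDs
# as simple "interest migration" transitions.
from collections import Counter

# Hypothetical page -> cluster assignment (would come from content clustering)
page_to_cluster = {"/news/a.html": 0, "/news/b.html": 0,
                   "/sport/x.html": 1, "/shop/y.html": 2}

def to_interest_sequence(transaction):
    """Map a page transaction to a cluster-ID (interest) sequence,
    collapsing consecutive repeats of the same cluster."""
    seq = []
    for page in transaction:
        cid = page_to_cluster.get(page)
        if cid is not None and (not seq or seq[-1] != cid):
            seq.append(cid)
    return seq

def migration_counts(transactions):
    """Count cluster-to-cluster transitions across all user transactions."""
    counts = Counter()
    for t in transactions:
        seq = to_interest_sequence(t)
        counts.update(zip(seq, seq[1:]))
    return counts

transactions = [["/news/a.html", "/news/b.html", "/sport/x.html"],
                ["/sport/x.html", "/shop/y.html"]]
print(migration_counts(transactions))  # e.g. Counter({(0, 1): 1, (1, 2): 1})
```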

2.
This paper proposes DWLMST, a distributed Web usage mining model that incorporates Web text mining, together with LITP, a local browsing-interest migration pattern update algorithm, and GITP, a global one, both based on the model. Page clusters are used to represent user interests: user browsing interest sequences are obtained by replacing the pages in user transactions with their cluster numbers, and browsing-interest migration patterns are derived by analyzing these sequences. The algorithms largely overcome the difficulties that distributed storage and real-time growth of Web access data pose for pattern analysis, while also improving the accuracy with which user browsing interests are represented.

3.
A Method for Mining User Access Interest Paths
Current user access pattern mining algorithms treat frequent access paths alone as users' browsing interest paths. To address this, when mining pages of interest from Web logs, a page information content parameter is introduced, and user interest degree is defined by jointly considering page visit count, browsing time, and the amount of information on the page; an interest-degree-based user access pattern mining algorithm is then proposed. Experiments show that the algorithm is effective and measures user browsing interest more accurately than current frequent-access-path mining algorithms.
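The paper's exact interest-degree formula is not reproduced here; the sketch below only illustrates, with assumed weights and normalization, how visit count, browsing time and page information amount could be combined into a single interest degree:

```python
# Illustrative sketch only: one plausible way to combine visit count,
# browsing time, and page information amount into an "interest degree".
# The weights and normalization are assumptions, not the paper's formula.
def interest_degree(visits, dwell_seconds, info_bytes,
                    max_visits, max_dwell, max_info,
                    w_visits=0.4, w_dwell=0.4, w_info=0.2):
    """Weighted sum of normalized visit count, dwell time and page size."""
    return (w_visits * visits / max_visits
            + w_dwell * dwell_seconds / max_dwell
            + w_info * info_bytes / max_info)

# Example call with assumed per-site maxima used for normalization.
print(interest_degree(5, 120, 8_000, max_visits=10, max_dwell=300, max_info=50_000))
```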

4.
方刚 《计算机系统应用》2010,19(12):100-104
Since the page attributes of session patterns in Web server logs are Boolean, a sequence-number-based Web usage mining algorithm is proposed. The algorithm encodes each user session pattern as a binary number and searches candidate frequent itemsets by incrementing that number; support counts are computed from the dimensions of the sequence numbers, so the user session patterns need to be scanned only once, which effectively improves the efficiency of Web usage mining. Experiments show it is faster and more effective than existing algorithms.
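A rough sketch of the general bit-encoding idea (not the paper's sequence-number algorithm): sessions become bitmasks over the page set, and the support of a candidate itemset is counted with bitwise operations in a single pass:

```python
# Minimal sketch of the general bit-encoding idea: encode each session as a
# bitmask over the page set; a candidate itemset (also a bitmask) is
# supported by a session iff the candidate's bits are a subset of the
# session's bits.
pages = ["A", "B", "C", "D"]                 # page universe
bit = {p: 1 << i for i, p in enumerate(pages)}

def encode(session):
    """Encode a set of visited pages as an integer bitmask."""
    mask = 0
    for p in session:
        mask |= bit[p]
    return mask

sessions = [encode(s) for s in [{"A", "B"}, {"A", "B", "C"}, {"B", "D"}]]

def support(candidate_mask):
    """Count sessions whose bitmask contains every bit of the candidate."""
    return sum((s & candidate_mask) == candidate_mask for s in sessions)

print(support(encode({"A", "B"})))  # 2
```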

5.
Research on a Usage Pattern Mining Algorithm Based on Frequent Preference Degree
吴进  宋顺林  王迎春 《计算机应用》2006,26(10):2425-2426
A usage pattern mining algorithm based on frequent preference degree is proposed. It fully accounts for the effect of page dwell time on user preference, uses ASP.NET and XML to collect usage information, partitions it into user transactions, and mines frequent preferred usage patterns. Experiments show that the algorithm requires markedly less computation than current usage pattern mining algorithms, improves accuracy, and faithfully reflects the preferred access patterns of most users.

6.
By introducing the concept of page hierarchy, fully considering the time users spend on pages and the browsing preferences revealed by their path choices, and combining these with the structural hierarchy of the Web site, an improved algorithm for mining Web users' browsing preference patterns is proposed. Concrete examples and experimental data show that the new model finds users' browsing preference patterns more accurately, thereby revealing users' interests.

7.
Research on Real-Time Web Page Recommendation Based on User Access Pattern Mining
This paper applies data mining techniques to Web log files and proposes Predictor, a simple and efficient algorithm for mining association rules and sequential patterns. The algorithm responds quickly enough to meet the needs of real-time page recommendation and also supports incremental mining.

8.
This paper proposes s-Tree, a new association-rule-based algorithm for mining maximal frequent access paths, and uses it to analyze user access patterns, extracting the access patterns and preferred browsing paths of specific users. The results are then used to optimize site structure and to provide one-to-one personalized Web page access prediction and content recommendation.

9.
A Frame Page Filtering Algorithm for Web Log Mining Preprocessing
Web log mining applies data mining techniques to Web server logs to discover the behavior patterns of Web users. After reviewing typical data preprocessing techniques, this paper points out that frame pages lower the interestingness of mining results and proposes a frame page filtering algorithm to eliminate their influence. Experimental data validate the algorithm, showing that frame page filtering significantly improves the interestingness of Web log mining results.

10.
Most Web access is anonymous, and the main goal of Web log mining is to extract user behavior patterns from Web access records, analyze the mining results to understand user behavior, and thereby improve site structure. The first step of Web log mining is data preprocessing, which is the most time-consuming stage of Web page analysis. This paper first studies the preprocessing process, including data cleaning, user identification, session identification, and path completion, and then proposes a path completion algorithm, ...
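A minimal preprocessing sketch along the lines of the steps listed above, covering user identification by (IP, user-agent) and timeout-based session identification; the field layout and the 30-minute timeout are assumptions, and path completion is not shown:

```python
# Minimal preprocessing sketch, not from the paper: identify users by
# (IP, user-agent) and split each user's requests into sessions using a
# 30-minute inactivity timeout.
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds (assumed)

def sessionize(records):
    """records: iterable of (ip, user_agent, timestamp, url), timestamp in seconds.
    Returns a list of sessions, each a list of URLs in time order."""
    by_user = defaultdict(list)
    for ip, ua, ts, url in records:
        by_user[(ip, ua)].append((ts, url))

    sessions = []
    for visits in by_user.values():
        visits.sort()
        current, last_ts = [], None
        for ts, url in visits:
            if last_ts is not None and ts - last_ts > SESSION_TIMEOUT:
                sessions.append(current)
                current = []
            current.append(url)
            last_ts = ts
        if current:
            sessions.append(current)
    return sessions

log = [("1.2.3.4", "UA1", 0, "/a"), ("1.2.3.4", "UA1", 600, "/b"),
       ("1.2.3.4", "UA1", 5000, "/c")]
print(sessionize(log))  # [['/a', '/b'], ['/c']]
```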

11.
With the rapid development of Web services, their reliability has received increasing attention. To address the shortcomings of current Byzantine fault tolerance algorithms in supporting Web services, a Byzantine fault tolerance algorithm oriented to Web services is proposed and designed. It differs significantly from the well-known CLBFT algorithm: to support composite services, replicas must be created for both communicating Web service parties, whereas CLBFT creates replicas only on the server side. Active replication based on state machines is used to create replicas on both sides of the communication, and receive windows and receive points are introduced to acknowledge messages in batches in an asynchronous environment and to synchronize the times at which the replicas receive messages. An I/O automaton model of the algorithm is given, and the algorithm is implemented on an experimental platform that follows the TPC-App Benchmark specification, verifying its feasibility.

12.
A Data Cube Model for Prediction-Based Web Prefetching
Reducing web latency is one of the primary concerns of Internet research. Web caching and web prefetching are two effective techniques for latency reduction. A primary method for intelligent prefetching is to rank potential web documents based on prediction models trained on past web server and proxy server log data, and to prefetch the highly ranked objects. For this method to work well, the prediction model must be updated constantly, and different queries must be answered efficiently. In this paper we present a data cube model that represents Web access sessions for data mining and supports construction of the prediction model. The cube model organizes session data into three dimensions. With the data cube in place, we apply efficient data mining algorithms for clustering and correlation analysis; the resulting web page clusters can then be used to guide the prefetching system. We also propose an integrated web-caching and web-prefetching model in which prefetching aggressiveness, replacement policy and increased network traffic are addressed together in one framework. The core of the integrated solution is a prediction model based on statistical correlation between web objects, which can be updated frequently by querying the data cube of web server logs. To our knowledge, this integrated data cube and prediction-based prefetching framework is the first such effort.

13.
Web prefetching is a technique aimed at reducing user-perceived latencies in the World Wide Web. The spatial locality shown by user accesses makes it possible to predict future accesses from the previous ones. A prefetching engine uses these predictions to prefetch web objects before the user demands them. The existing prediction algorithms achieved an acceptable performance when they were proposed but the high increase in the number of embedded objects per page has reduced their effectiveness in the current web. In this paper, we show that most of the predictions made by the existing algorithms are not useful to reduce the user-perceived latency because these algorithms do not take into account the structure of the current web pages, i.e., an HTML object with several embedded objects. Thus, they predict the accesses to the embedded objects in an HTML after reading the HTML itself. For this reason, the prediction is not made early enough to prefetch the objects and, therefore, there is no latency reduction. In this paper we present the double dependency graph (DDG) algorithm that distinguishes between container objects (HTML) and embedded objects to create a new prediction model according to the structure of the current web. Results show that, for the same number of extra requests to the server, DDG reduces the perceived latency, on average, 40% more than the existing algorithms. Moreover, DDG distributes latency reductions more homogeneously among users.  相似文献   
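A simplified sketch of the distinction DDG draws between container and embedded objects (not the published algorithm): keeping separate container-to-container and container-to-embedded counts lets the embedded objects of the likely next page be predicted as soon as the current HTML is requested:

```python
# Simplified sketch of the container/embedded distinction: two transition
# structures, one for HTML->HTML navigation and one for HTML->embedded
# dependencies, so requesting an HTML page can immediately trigger
# predictions for embedded objects of the likely next page.
from collections import defaultdict

nav_counts = defaultdict(lambda: defaultdict(int))    # HTML -> next HTML
embed_counts = defaultdict(lambda: defaultdict(int))  # HTML -> embedded object

def is_container(url):
    """Crude heuristic (an assumption): treat .html pages as containers."""
    return url.endswith(".html") or url.endswith("/")

def observe(session):
    """Update both graphs from one user session (list of URLs in order)."""
    last_container = None
    for url in session:
        if is_container(url):
            if last_container is not None:
                nav_counts[last_container][url] += 1
            last_container = url
        elif last_container is not None:
            embed_counts[last_container][url] += 1

def predict(container, top=3):
    """Rank embedded objects of the likely next containers for prefetching."""
    scores = defaultdict(float)
    for nxt, n in nav_counts[container].items():
        for obj, m in embed_counts[nxt].items():
            scores[obj] += n * m
    return sorted(scores, key=scores.get, reverse=True)[:top]

observe(["/index.html", "/a.css", "/news.html", "/logo.png", "/photo.jpg"])
print(predict("/index.html"))  # ['/logo.png', '/photo.jpg'] (or similar order)
```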

14.
Web service compositions are becoming more and more complex, involving numerous interacting ad-hoc services. These services are often implemented as business processes themselves. By analysing such complex web service compositions one is able to better understand, control and eventually re-design them. Our contribution to this problem is a mining algorithm, based on a statistical technique to discover composite web service patterns from execution logs. Our approach is characterised by a “local” pattern’s discovery that covers partial results through a dynamic programming algorithm. Those locally discovered patterns are then composed iteratively until the composite Web service is discovered. The analysis of the disparities between the discovered model and the initial ad-hoc composite model (delta-analysis) enables initial design gaps to be detected and thus to re-engineer the initial Web service composition.  相似文献   

15.
Advances in data mining technologies have enabled intelligent Web capabilities in various applications by exploiting the hidden user behavior patterns discovered from Web logs. Intelligent methods for discovering and predicting users' patterns are important in supporting intelligent Web applications such as personalized services. Although numerous studies have been done on Web usage mining, few of them consider temporal evolution when discovering Web users' patterns. In this paper, we propose a novel data mining algorithm named Temporal N-Gram (TN-Gram) for constructing prediction models of Web user navigation that takes the temporal property of Web usage evolution into account. Moreover, three new kinds of measures are proposed for evaluating the temporal evolution of navigation patterns over different time periods. Experimental evaluation on both real-life and simulated datasets shows that the proposed TN-Gram model outperforms other approaches such as N-gram modeling in terms of prediction precision, in particular when Web users' navigation behavior changes significantly over time.
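In the spirit of weighting recent navigation more heavily, the sketch below shows a time-decayed bigram predictor; the exponential decay and its rate are assumptions, and this is not the TN-Gram model itself:

```python
# Illustrative sketch only: a time-weighted bigram (2-gram) predictor in
# which older sessions contribute exponentially less to transition counts.
import math
from collections import defaultdict

DECAY = 0.01  # per-time-unit decay rate (assumed)

counts = defaultdict(lambda: defaultdict(float))  # prev page -> next page -> weight

def train(sessions, now):
    """sessions: list of (timestamp, [page1, page2, ...]); older sessions decay."""
    for ts, pages in sessions:
        w = math.exp(-DECAY * (now - ts))
        for prev, nxt in zip(pages, pages[1:]):
            counts[prev][nxt] += w

def predict(prev, top=1):
    """Return the most likely next pages after `prev`."""
    nxts = counts[prev]
    return sorted(nxts, key=nxts.get, reverse=True)[:top]

train([(0, ["A", "B", "C"]), (90, ["A", "D"])], now=100)
print(predict("A"))  # ['D'] -- the recent transition outweighs the old one
```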

16.
Accelerated Evaluation Algorithm: A New Method for Improving the Quality of Web Structure Mining
Web structure mining can find high-quality pages on the Web and greatly improves the retrieval precision of search engines. Current Web structure mining algorithms evaluate a page by counting the hyperlinks pointing to it and weighing the quality of the source nodes. Algorithms based on link counts have a serious flaw: page evaluation becomes polarized, so that long-established high-quality pages keep appearing at the top of Web search results while newly added high-quality pages are hard for users to find. An accelerated evaluation algorithm is proposed to overcome this shortcoming of existing hyperlink analysis, and the algorithm is tested and verified on a search engine platform.

17.
Based on an analysis of the current drawbacks of procurement management and inventory control in Chinese coal group companies, this paper proposes building a networked procurement platform within the group, so that group-level online procurement combined with an inventory control system can manage the production material inventories of second-level units. The structure of the inventory control model based on networked procurement is designed, the core algorithm for controlling the production material inventories of second-level units is proposed and its principles are worked out, and the algorithm is validated with an example.

18.

Transactions through the web are now a progressive mechanism for accessing an ever-increasing range of services across more and more diverse environments. The internet gives companies many opportunities to provide personalized online services to their customers, but the quality and novelty of some web services may adversely affect their appeal and user satisfaction. Predicting consumer intention therefore needs to be a main focus when selecting web services for an application. The aim of this study is to predict online consumers' repurchase intentions; to accomplish this, a hybrid approach combining machine learning techniques with the artificial bee colony (ABC) algorithm is used. The study starts by identifying consumer characteristics relevant to repurchase intention, then uses ABC to select the consumer characteristics and shopping mall attributes (with a threshold value >0.1) for the prediction model. Finally, k-fold cross-validation is employed to measure the robustness of the best classification model. Classification models, namely decision trees (C5.0), AdaBoost, random forest, support vector machine and neural network, are used to predict consumer purchase intention. Performance evaluation of these models on a 70-30% training-testing partition of the data set shows that AdaBoost outperforms the other classification models, with sensitivity of 0.95 and accuracy of 97.58% on the test data. Examining consumer repurchase intentions by considering both shopping mall and consumer characteristics makes this study unique.
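A minimal sketch of the evaluation protocol described above (AdaBoost with k-fold cross-validation and a 70-30% split), using scikit-learn on synthetic data; ABC-based feature selection is omitted and all parameters are illustrative:

```python
# Minimal sketch of the evaluation setup: k-fold validation of an AdaBoost
# classifier, then a 70-30% train-test split. Synthetic data stands in for
# the consumer/shopping-mall feature set used in the study.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import accuracy_score, recall_score

X, y = make_classification(n_samples=1000, n_features=15, random_state=0)

clf = AdaBoostClassifier(n_estimators=100, random_state=0)
print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# 70-30% train-test partition as in the study's evaluation protocol.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred))
print("sensitivity (recall):", recall_score(y_te, pred))
```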


19.
With the rapid development of the Internet, Web application systems are widely used in e-government and e-commerce, and security problems arise accordingly. Intrusion detection is one of the important means of protecting Web application systems; using visualization techniques to help security experts build profiles improves the accuracy of normal behavior profiles and thus improves intrusion detection performance. However, traditional visualization models based on scattered points display large sample sets poorly, which limits their use in the Internet environment. To address these shortcomings, this paper proposes a density-field-based visualization model and related algorithms that provide security experts with richer visual information, so that they can build normal user behavior profiles more accurately. Experiments compare the display quality of the two visualization models.

20.
Quality of service (QoS) is a key issue in Web service discovery. This paper proposes an adaptive QoS-aware Web service discovery model that, after satisfying the user's functional requirements, selects and ranks Web services according to QoS and other non-functional indicators, and gives the corresponding selection and ranking algorithm. The algorithm evaluates non-functional indicators dynamically on the basis of service registration parameters, user feedback, and real-time monitoring data, and returns the most suitable services to the user. Experiments show that the model effectively improves the availability of the system.
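A small illustrative sketch of ranking candidate services by a weighted QoS score built from registration, feedback and monitoring data; the attributes, weights and normalization are assumptions, not the paper's algorithm:

```python
# Illustrative sketch only: rank candidate services by a weighted score that
# blends registration parameters, user feedback and monitored performance.
def qos_score(svc, w_reliability=0.4, w_feedback=0.3, w_latency=0.3):
    """Higher is better; latency is inverted so lower latency scores higher."""
    latency_term = 1.0 / (1.0 + svc["avg_latency_ms"] / 100.0)
    return (w_reliability * svc["reliability"]       # from registration/monitoring
            + w_feedback * svc["user_rating"] / 5.0  # from user feedback
            + w_latency * latency_term)              # from real-time monitoring

candidates = [
    {"name": "svcA", "reliability": 0.99, "user_rating": 4.2, "avg_latency_ms": 120},
    {"name": "svcB", "reliability": 0.95, "user_rating": 4.8, "avg_latency_ms": 60},
]
for svc in sorted(candidates, key=qos_score, reverse=True):
    print(svc["name"], round(qos_score(svc), 3))
```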
