Similar Literature
 Found 20 similar articles (search time: 27 ms)
1.
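A User Browsing Interest Migration Mining Algorithm Integrating Web Usage Mining and Content Mining   Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes a model and algorithm for mining users' browsing interest migration patterns that integrates Web usage mining and Web content mining. Web pages and their clustering are introduced first. Users' browsing interest sequences are obtained by replacing each page in a user transaction with the cluster it belongs to, and browsing interest migration patterns are then mined from these sequences. The model helps site administrators understand users' behavioral characteristics and arrange the structure of a Web site.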
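
The page-to-cluster substitution described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the function names, the `page_to_cluster` mapping, and the choice to merge consecutive visits to the same cluster are all assumptions made for the example.

```python
from collections import Counter

def interest_sequence(transaction, page_to_cluster):
    """Replace each page in a transaction by its cluster label.

    Consecutive pages from the same cluster are merged into one entry
    (an assumption of this sketch, not stated in the abstract).
    """
    sequence = []
    for page in transaction:
        cluster = page_to_cluster[page]
        if not sequence or sequence[-1] != cluster:
            sequence.append(cluster)
    return sequence

def interest_transitions(transactions, page_to_cluster):
    """Count how often one interest (cluster) is directly followed by another."""
    counts = Counter()
    for transaction in transactions:
        seq = interest_sequence(transaction, page_to_cluster)
        counts.update(zip(seq, seq[1:]))
    return counts

# Hypothetical data: three pages grouped into two content clusters.
page_to_cluster = {"a.html": "sports", "b.html": "sports", "c.html": "news"}
transactions = [["a.html", "b.html", "c.html", "a.html"]]
transitions = interest_transitions(transactions, page_to_cluster)
```

Frequent entries in `transitions` would then be candidates for interest migration patterns; how the paper actually thresholds or ranks them is not stated in the abstract.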

2.
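Search Engine User Behavior Analysis Based on Log Mining   Total citations: 1, self-citations: 0, citations by others: 1
With the large-scale growth in the number of Web search users, Web user behavior analysis has become an important foundation for the architecture analysis, performance optimization, and maintenance of Web information retrieval systems, and it is one of the important research areas in Web information retrieval and knowledge mining. To better understand Web users' search behavior, this paper analyzes user behavior based on 756 million real Web user behavior log entries. We mainly examine query length, query modification rate, click-through rate of related searches, the distribution of first and last click positions, and the distribution of the number of clicks per query. Using query sets of different types, the paper also examines how user behavior differs across query needs. The results offer a useful reference for search engine algorithm optimization and system improvement.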

3.
The Internet has developed rapidly over the past 10 years, and the amount of information on web sites has also grown quickly. Predicting web users' behavior has become a crucial issue for purposes such as increasing users' browsing speed, reducing users' latency as much as possible, and reducing the load on web servers. In this paper, we propose an efficient prediction model, the two-level prediction model (TLPM), which exploits the natural hierarchical property of web log data. TLPM can decrease the size of the candidate set of web pages and increase prediction speed with adequate accuracy. The experimental results show that TLPM can greatly enhance prediction performance as the number of web pages increases.

4.
Correlation-Based Web Document Clustering for Adaptive Web Interface Design   Total citations: 2, self-citations: 2, citations by others: 2
A great challenge for web site designers is how to ensure that users can reach important web pages easily and efficiently. In this paper we present a clustering-based approach to address this problem. Our approach is to perform efficient and effective correlation analysis based on web logs and construct clusters of web pages that reflect the co-visit behavior of web site users. We present a novel approach for adapting previous clustering algorithms, which were designed for databases, to the problem domain of web page clustering, and show that our new methods can generate high-quality clusters for very large web logs when previous methods fail. Based on the high-quality clustering results, we then apply the data-mined clustering knowledge to the problem of adapting web interfaces to improve users' performance. We develop an automatic method for web interface adaptation by introducing index pages that minimize overall user browsing costs. The index pages are aimed at providing shortcuts so that users get to their objective web pages fast, and we solve a previously open problem of how to determine an optimal number of index pages. We empirically show that our approach performs better than many of the previous algorithms based on experiments on several realistic web log files. Received 25 November 2000 / Revised 15 March 2001 / Accepted in revised form 14 May 2001

5.
Ensuring web service reliability is important for keeping web services running normally. The reliability of the web services that users invoke depends not only on the service itself but also on web load conditions (such as latency). Because of the dynamic nature of the web, traditional reliability methods have become inappropriate; at the same time, the sparsity of web condition parameters leads to inaccurate reliability prediction. To address these challenges, we propose a new machine-learning-based web service reliability prediction method that considers the user, the web service, and web conditions. First, we solve the web condition parameter sparsity problem; then we use k-means clustering to aggregate past invocation data and incorporate user, service, and web condition parameters to build a reliability feedback matrix; finally, we predict web service reliability by considering specific web condition environments. The experiments show that our method is able to address the data sparsity problem and improve the accuracy of web service reliability prediction, and we discuss how data sparsity and the number of feedback clusters affect web service reliability prediction.

6.
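A User Browsing Behavior Prediction Method Based on Markov-like Chains   Total citations: 2, self-citations: 0, citations by others: 2
He Li. 《计算机工程》 (Computer Engineering), 2008, 34(22): 32-33
Effectively clustering users according to their browsing history and building a browsing behavior prediction model on top of the user clusters is key to delivering personalized services in a Web environment. This paper clusters the system's users to produce groups of similar users and, based on the browsing characteristics of each group, builds a Markov-like chain model for predicting user browsing behavior. Experiments verify the effectiveness of the model.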
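
As a rough illustration of per-group next-page prediction, the sketch below uses a plain first-order Markov chain estimated from the sessions of one user group. The abstract does not define its "Markov-like" chain, so this is a stand-in technique under that assumption; the function names and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict

def build_transition_counts(sessions):
    """Count page-to-next-page transitions over the sessions of one user group."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts

def predict_next(counts, current_page):
    """Return (most likely next page, estimated probability), or None if unseen."""
    followers = counts.get(current_page)
    if not followers:
        return None
    total = sum(followers.values())
    page, hits = followers.most_common(1)[0]
    return page, hits / total

# Hypothetical sessions belonging to a single cluster of similar users.
group_sessions = [
    ["home", "list", "item"],
    ["home", "list", "cart"],
    ["home", "list", "item"],
]
group_counts = build_transition_counts(group_sessions)
prediction = predict_next(group_counts, "list")
```

In a setup like the one the abstract describes, one such table would be kept per user group, and a new user would be routed to the table of the group they are assigned to.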

7.
The use of online questionnaires is rapidly increasing. Despite their manifold advantages, not much is known about user behavior that can be measured outside the boundaries set by standard web technologies like HTML form elements. To show how the lack of knowledge about the user setting in web studies can be accounted for, we present a tool called UserActionTracer, with which it is possible to collect more behavior information than with any other paradata gathering tool, in order to (1) gather additional data unobtrusively from the process of answering questions and (2) visualize individual user behavior on web pages. In an empirical study on a large web sample (N = 1046) we observed and analysed online behaviors (e.g., clicking through). We found that only 10.5% of participants showed more than five single behaviors with highly negative influence on data quality in the whole online questionnaire (out of 132 possible single behavior judgments). Furthermore, results were validated by comparison with data from online address books. With the UserActionTracer it is possible to gain further insight into the process of answering online questionnaires.

8.
There is a significant commercial and research interest in location-based web search engines. Given a number of search keywords and one or more locations (geographical points) that a user is interested in, a location-based web search retrieves and ranks the most textually and spatially relevant web pages. In this type of search, both the spatial and textual information should be indexed. Currently, no efficient index structure exists that can handle both the spatial and textual aspects of data simultaneously and accurately. Existing approaches either index space and text separately or use inefficient hybrid index structures with poor performance and inaccurate results. Moreover, most of these approaches cannot accurately rank web-pages based on a combination of space and text and are not easy to integrate into existing search engines. In this paper, we propose a new index structure called Spatial-Keyword Inverted File for Points to handle point-based indexing of web documents in an integrated/efficient manner. To seamlessly find and rank relevant documents, we develop a new distance measure called spatial tf-idf. We propose four variants of spatial-keyword relevance scores and two algorithms to perform top-k searches. As verified by experiments, our proposed techniques outperform existing index structures in terms of search performance and accuracy.

9.
With the advent of the ubiquitous era, many studies have been devoted to various situation-aware services in the semantic web environment. One of the most challenging studies involves implementing a situation-aware personalized music recommendation service which considers the user’s situation and preferences. Situation-aware music recommendation requires multidisciplinary efforts including low-level feature extraction and analysis, music mood classification and human emotion prediction. In this paper, we propose a new scheme for a situation-aware/user-adaptive music recommendation service in the semantic web environment. To do this, we first discuss utilizing knowledge for analyzing and retrieving music contents semantically, and a user adaptive music recommendation scheme based on semantic web technologies that facilitates the development of domain knowledge and a rule set. Based on this discussion, we describe our Context-based Music Recommendation (COMUS) ontology for modeling the user’s musical preferences and contexts, and supporting reasoning about the user’s desired emotions and preferences. Basically, COMUS defines an upper music ontology that captures concepts on the general properties of music such as titles, artists and genres. In addition, it provides functionality for adding domain-specific ontologies, such as music features, moods and situations, in a hierarchical manner, for extensibility. Using this context ontology, we believe that logical reasoning rules can be inferred based on high-level (implicit) knowledge such as situations from low-level (explicit) knowledge. As an innovation, our ontology can express detailed and complicated relations among music clips, moods and situations, which enables users to find appropriate music. We present some of the experiments we performed as a case-study for music recommendation.

10.
11.
The topic of recommendation systems for mobile users has attracted a lot of attention in recent years. However, most existing recommendation techniques were developed based only on the geographic features of mobile users’ trajectories. In this paper, we propose a novel approach for recommending items to mobile users based on both the geographic and semantic features of users’ trajectories. The core idea of our recommendation system is a novel cluster-based location prediction strategy, namely TrajUtiRec, which improves the item recommendation model. The strategy estimates the next location of a mobile user from the frequent behaviors of similar users in the same cluster, where clusters are determined by analyzing users’ common behaviors in semantic trajectories. For each location, a high utility itemset mining algorithm is run to discover high utility itemsets. Accordingly, we can recommend the high utility itemset related to the location the user might visit. A comprehensive experimental evaluation shows that our proposal delivers excellent performance.

12.
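Web log mining analyzes users' access patterns to determine how interested users are in what they visit. At present, most Web log mining is frequency-based, and the information it yields is of limited value. The clustering technique proposed here is instead based on access time: it represents each user's browsing pattern as a fuzzy vector that records whether the user visited a page and how long the user stayed. Users' access sequences are clustered with several different clustering methods. By combining fuzzy rough k-means with cosine similarity, a two-layer clustering technique is proposed that reduces sensitivity to the initial cluster centers, and a series of experiments demonstrates its feasibility. The experiments also use the Davies-Bouldin index to evaluate and compare the different clustering methods. Because the algorithm is still inefficient on large volumes of data, the two-layer clustering is parallelized with MapReduce, which improves clustering efficiency.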

13.
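The updating of web page information is a very important property of the Web. As with other Web applications, as WWW content is updated ever faster, how to effectively track changes to specific sites and pages has increasingly become a topic of concern. This paper discusses ChangeSpider, an adaptive web page information tracking system, and examines its architecture and key techniques. Experiments show that ChangeSpider can effectively track changes in web page information and deliver the changed content to users in a timely manner.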

14.
As the web grows, the massive increase in information is placing severe burdens on information retrieval and sharing. Automated search engines and directories with small editorial staffs are unable to keep up with the increasing submission of web sites. To address the problem, this paper presents Infomarker, an Internet information service system based on Open Directory and Zero-Keyword Inquiry. The Open Directory sets up a net community in which the growing number of net citizens can each organize a small portion of the web and present it to the others. By means of Zero-Keyword Inquiry, a user can get the information he is interested in without inputting any of the keywords that search engines usually require. In Infomarker, a user can record the web addresses he likes and can put forward an information request based on his web records. The information matching engine checks the information in the Open Directory to find what fits the user's needs and adds it to the user's web address records. The key to the matching process is layered keyword mapping. Infomarker provides people with a whole new approach to getting information and shows broad prospects.

15.
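An Ontology-Based Web Usage Knowledge Discovery Model and Its Application   Total citations: 3, self-citations: 0, citations by others: 3
He Li, Yan Dongmei, Han Wenxiu. 《计算机工程》 (Computer Engineering), 2006, 32(14): 169-171
Applying ontologies on the Web can effectively solve the semantic problems of Web information sharing. This paper proposes a knowledge discovery model based on a Web ontology and server log files, and mainly discusses the representation of user access behavior and the definition of, and discovery algorithm for, semantic user distributions. Finally, it describes the application of the Web usage knowledge discovery model in a Web personalization system.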

16.
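A Study of Web Users' Browsing Patterns Based on Information Foraging Agents   Total citations: 3, self-citations: 0, citations by others: 3
The Internet keeps growing in scale, and its distributed, dynamic, and evolving nature makes it a good subject for complex systems research. Many regularities exist on the Internet. This paper uses a multi-agent approach to study Web users' browsing behavior. By studying how information agents browse a virtual network space, it shows that Web users' browsing behavior is related to the particular distribution of user interests, and that the network topology has an optimal value governing users' information acquisition and browsing steps.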

17.
Traditional collaborative filtering (CF) based recommender systems on the basis of user similarity often suffer from low accuracy because of the difficulty in finding similar users. Incorporating trust network into CF-based recommender system is an attractive approach to resolve the neighbor selection problem. Most existing trust-based CF methods assume that underlying relationships (whether inferred or pre-existing) can be described and reasoned in a web of trust. However, in online sharing communities or e-commerce sites, a web of trust is not always available and is typically sparse. The limited and sparse web of trust strongly affects the quality of recommendation. In this paper, we propose a novel method that establishes and exploits a two-faceted web of trust on the basis of users’ personal activities and relationship networks in online sharing communities or e-commerce sites, to provide enhanced-quality recommendations. The developed web of trust consists of interest similarity graphs and directed trust graphs and mitigates the sparsity of web of trust. Moreover, the proposed method captures the temporal nature of trust and interest by dynamically updating the two-faceted web of trust. Furthermore, this method adapts to the differences in user rating scales by using a modified Resnick’s prediction formula. As enabled by the Pareto principle and graph theory, new users highly benefit from the aggregated global interest similarity (popularity) in interest similarity graph and the global trust (reputation) in the directed trust graph. The experiments on two datasets with different sparsity levels (i.e., Jester and MovieLens datasets) show that the proposed approach can significantly improve the predictive accuracy and decision-support accuracy of the trust-based CF recommender system.
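
For orientation, the standard (unmodified) Resnick prediction formula that this abstract builds on can be sketched as below. The abstract does not describe its modification, so this shows only the classic mean-centered form; the function name and the tuple layout for neighbors are assumptions of the example.

```python
def resnick_predict(active_mean, neighbors):
    """Classic Resnick prediction for one item.

    active_mean: mean rating of the active user.
    neighbors: iterable of (weight, neighbor_rating_for_item, neighbor_mean_rating);
               the weight may be a similarity or a trust value.
    Returns active_mean plus the weighted average of the neighbors'
    deviations from their own means, which compensates for users
    who rate on different scales.
    """
    neighbors = list(neighbors)
    numerator = sum(w * (rating - mean) for w, rating, mean in neighbors)
    denominator = sum(abs(w) for w, _, _ in neighbors)
    if denominator == 0:
        return active_mean  # no usable neighbors: fall back to the user's mean
    return active_mean + numerator / denominator

# Hypothetical neighbors: (weight, rating of the target item, neighbor's mean rating).
predicted = resnick_predict(3.0, [(1.0, 5.0, 4.0), (0.5, 2.0, 3.0)])
```

Here the first neighbor rated the item one point above their mean and the second one point below, so the prediction lands slightly above the active user's mean because the first neighbor carries more weight.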

18.
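Users and user session sequences are identified from Web log files; the content of the Web pages corresponding to each session sequence is then extracted to obtain the core concepts of that content. These core concepts are used to describe session topics, and sessions are segmented on the basis of those topics. Finally, the accuracy of the method is verified on consumers' session records and Web content from a consumer platform.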

19.
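A Contextual Advertising Recommendation Algorithm Based on Joint Probabilistic Matrix Factorization   Total citations: 3, self-citations: 0, citations by others: 3
Contextual advertisements match users' interests and web page content, which can enhance the user experience and raise the ad click-through rate. Advertising revenue is directly tied to the click-through rate, so accurately predicting it is key to increasing contextual advertising revenue. Contextual ad recommendation currently faces the following problems: (1) the numbers of web pages and users are very large; (2) historical ad click data are very sparse, which leads to low click-through rate prediction accuracy. To address these problems, a factor model based on joint probabilistic matrix factorization, AdRec, is proposed. It combines information about users, ads, and web pages to recommend ads, addressing the low prediction accuracy under sparse data. The complexity of the algorithm grows linearly with the number of observations, so it can be applied to large-scale data.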

20.
Many parameters may affect the navigation behaviour of web users. Predicting the next page that a web user is likely to visit is important, since this information can be used for prefetching or for personalizing the page for that user. One successful method for determining the next web page is to construct behaviour models of the users by clustering. The success of clustering is highly correlated with the similarity measure used to compare navigation sequences. This work proposes a new approach for determining the next web page by extending standard clustering with a content-based semantic similarity method. The semantics of web pages are represented as sets of concepts, and thus user sessions are modelled as sequences of sets. As a result, session similarity is defined as an alignment of two sequences of sets. The success of the proposed method has been shown by applying it to real-life web log data.
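
The idea of scoring session similarity as an alignment of two sequences of concept sets can be sketched as follows. The abstract gives no scoring details, so this example makes its own assumptions: Jaccard similarity between concept sets, a global (Needleman-Wunsch style) alignment with a zero gap penalty, and normalization by the longer session length. All names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two concept sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def alignment_score(s, t, gap=0.0):
    """Best global alignment score of two sequences of sets via dynamic programming."""
    n, m = len(s), len(t)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + jaccard(s[i - 1], t[j - 1]),  # align two pages
                dp[i - 1][j] + gap,                              # skip a page in s
                dp[i][j - 1] + gap,                              # skip a page in t
            )
    return dp[n][m]

def session_similarity(s, t):
    """Alignment score normalized to [0, 1] by the longer session (zero gap penalty)."""
    if not s and not t:
        return 1.0
    return alignment_score(s, t) / max(len(s), len(t))

# Hypothetical sessions: each page is represented by its set of concepts.
session_a = [{"laptop", "price"}, {"review"}]
session_b = [{"review"}]
similarity = session_similarity(session_a, session_b)
```

A similarity of this kind could then be plugged into any clustering method that accepts a pairwise similarity, which is the role the abstract assigns to it.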
