Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Users of a Web site usually perform their interest-oriented actions by clicking or visiting Web pages, which are traced in access log files. Clustering Web user access patterns may capture common user interests in a Web site and, in turn, build user profiles for advanced Web applications such as Web caching and prefetching. Conventional Web usage mining techniques for clustering Web user sessions can discover usage patterns directly, but cannot identify the latent factors or hidden relationships among users' navigational behaviour. In this paper, we propose an approach based on a vector space model, called Random Indexing, to discover such intrinsic characteristics of Web users' activities. The underlying factors are then utilised for clustering individual user navigational patterns and creating common user profiles. The clustering results are used to predict and prefetch Web requests for grouped users. We demonstrate the usability and superiority of the proposed Web user clustering approach through experiments on a real Web log file. The clustering and prefetching tasks are evaluated by comparison with previous studies, demonstrating better clustering performance and higher prefetching accuracy.
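As a rough illustration of the Random Indexing idea mentioned in this abstract, the sketch below gives every page a sparse random index vector and represents each user as the sum of the vectors of the pages visited. The function names, the dimension, the number of non-zero entries and the sample sessions are all assumptions made for the example; the paper's actual construction and clustering step may differ.

```python
import math
import random


def index_vector(dim, nonzeros, rng):
    """Sparse ternary random vector: a few +1/-1 entries, the rest 0."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def user_vectors(sessions, dim=64, nonzeros=6, seed=0):
    """Sum the index vectors of the pages each user visited."""
    rng = random.Random(seed)
    pages = sorted({p for visited in sessions.values() for p in visited})
    page_index = {p: index_vector(dim, nonzeros, rng) for p in pages}
    vectors = {}
    for user, visited in sessions.items():
        acc = [0] * dim
        for page in visited:
            for i, value in enumerate(page_index[page]):
                acc[i] += value
        vectors[user] = acc
    return vectors


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical sessions: u1 and u2 visit the same pages in a different order.
sessions = {
    "u1": ["/news", "/sports", "/sports/scores"],
    "u2": ["/sports", "/news", "/sports/scores"],
    "u3": ["/shop", "/shop/cart", "/checkout"],
}
vectors = user_vectors(sessions)
```

Users with the same visited pages end up with identical vectors, so any similarity-based clustering (k-means on these vectors, for instance) would group them together; the reduced dimension keeps the representation small even when the site has many pages.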

2.
《Information Systems》2006,31(4-5):247-265
As more information becomes available on the Web, there has been growing interest in effective personalization techniques. Personal agents that provide assistance based on the content of Web documents and on user interests have emerged as a viable solution to this problem. Since these agents rely on knowledge about users contained in user profiles, i.e., models of user preferences and interests gathered by observing user behavior, the capacity to acquire and model user interest categories has become a critical component of personal agent design. User profiles have to summarize categories corresponding to diverse user information interests at different levels of abstraction in order to allow agents to decide on the relevance of new pieces of information. In accomplishing this goal, document clustering offers the advantage that a priori knowledge of categories is not needed, so the categorization is completely unsupervised. In this paper we present a document clustering algorithm, named WebDCC (Web Document Conceptual Clustering), that carries out incremental, unsupervised concept learning over Web documents in order to acquire user profiles. Unlike most user profiling approaches, this algorithm offers comprehensible clustering solutions that can be easily interpreted and explored by both users and other agents. By extracting semantics from Web pages, this algorithm also produces intermediate results that can ultimately be integrated into a machine-understandable format such as an ontology. Empirical results of using this algorithm in the context of an intelligent Web search agent showed that it can reach high levels of accuracy in suggesting Web pages.

3.
To exploit context for pro-active information recommendation, agents need to extract and understand user activities based on their knowledge of the user's interests. In this paper, we propose a novel approach for context-aware recommendation in browsing assistants based on the integration of user profiles, navigational patterns and contextual elements. In this approach, user profiles built using an unsupervised Web page clustering algorithm are used to characterize the user's ongoing activities and behavior patterns. Experimental evidence shows that using longer-term interests to explain active browsing goals effectively enhances user assistance.
Analía Amandi

4.
杨顺  赵佳程 《测控技术》2018,37(11):68-71
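With the development of WiFi technology, gym management in China is gradually shifting from traditional RFID card identification to WiFi indoor positioning, a change that effectively reduces fitness costs. WiFi indoor positioning techniques such as trilateration and Bayesian positioning algorithms have already been proposed, but they still suffer from large positioning errors and high computational complexity. Two positioning algorithms are therefore proposed: filter-based trilateration and pattern clustering. Filter-based trilateration adds a Kalman filter to reduce the influence of noisy data and thereby reduce the positioning error. Pattern-clustering positioning replaces the commonly used Bayesian probabilistic clustering with personalized clustering based on gym users' exercise patterns, which effectively improves the identification rate and accuracy. Experimental results show that, compared with baseline methods, the filter-based trilateration and pattern-clustering positioning algorithms are more precise and more user-friendly when applied in fitness venues.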
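The combination of Kalman filtering and trilateration described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it smooths range readings directly with a scalar Kalman filter (the conversion from RSSI to distance is omitted), the noise variances are hypothetical, and the anchor positions and readings are made up.

```python
def kalman_smooth(measurements, process_var=1e-3, measurement_var=4.0):
    """Scalar Kalman filter for a slowly varying quantity such as a range."""
    estimate, error = measurements[0], 1.0
    smoothed = [estimate]
    for z in measurements[1:]:
        error += process_var                      # predict: uncertainty grows
        gain = error / (error + measurement_var)  # update: Kalman gain
        estimate += gain * (z - estimate)
        error *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed


def trilaterate(anchors, distances):
    """Solve for (x, y) from three anchors by subtracting circle equations."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("anchors are collinear")
    # Cramer's rule on the 2x2 linear system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Hypothetical access-point positions and noisy range readings (metres).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
readings = [
    [1.5, 1.3, 1.6, 1.4],
    [3.1, 3.3, 3.0, 3.2],
    [3.2, 3.0, 3.3, 3.1],
]
smoothed = [kalman_smooth(r)[-1] for r in readings]
position = trilaterate(anchors, smoothed)
```

Filtering each range before solving the geometry means a single outlier reading moves the estimate only by a fraction (the Kalman gain) of its deviation, which is the error-reduction effect the abstract attributes to the added filter.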

5.
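An algorithm for mining user browsing-interest transitions that integrates Web usage mining and content mining   Cited by: 2 (self-citations: 0, citations by others: 2)
A model and an algorithm are proposed for mining user browsing-interest transition patterns by integrating Web usage mining and content mining. Web pages and their clustering are introduced. User browsing-interest sequences are obtained by replacing the pages in user transactions with their corresponding clusters, and browsing-interest transition patterns are then derived from these sequences. The model is of considerable value in helping network administrators understand user behavior and arrange Web site structure.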
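The substitution step in this abstract (pages in a transaction replaced by their clusters, then transitions read off the resulting sequence) might look like the sketch below. The page-to-cluster mapping, the sample transactions, and the choice to merge consecutive identical labels are assumptions made for illustration; the paper's pattern-mining step is likely more elaborate than simple pair counting.

```python
from collections import Counter


def interest_sequence(transaction, page_cluster):
    """Replace each page by its cluster label and merge consecutive repeats."""
    sequence = []
    for page in transaction:
        label = page_cluster[page]
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence


def transition_counts(transactions, page_cluster):
    """Count how often one interest is followed directly by another."""
    counts = Counter()
    for transaction in transactions:
        seq = interest_sequence(transaction, page_cluster)
        counts.update(zip(seq, seq[1:]))
    return counts


# Hypothetical content clusters and user transactions.
page_cluster = {
    "a.html": "sports",
    "b.html": "sports",
    "c.html": "news",
    "d.html": "shopping",
}
transactions = [
    ["a.html", "b.html", "c.html"],
    ["a.html", "c.html", "d.html"],
]
counts = transition_counts(transactions, page_cluster)
```

Frequent pairs in `counts` are candidate interest transitions; a support threshold could then be applied to keep only the patterns that occur often enough.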

6.
Understanding user contexts and group structures plays a central role in pervasive computing. These contexts and community structures are complex to mine from data collected in the wild due to the unprecedented growth of data, noise, uncertainties and complexities. Typical existing approaches first extract the latent patterns to explain human dynamics or behaviors and then use them to formulate consistent numerical representations for community detection, often via a clustering method. While able to capture high-order and complex representations, these two steps are performed separately. More importantly, they face a fundamental difficulty in determining the correct number of latent patterns and communities. This paper presents an approach that addresses these challenges by simultaneously discovering latent patterns and communities in a unified Bayesian nonparametric framework. Our Simultaneous Extraction of Context and Community (SECC) model is rooted in nested Dirichlet process theory, which allows a nested structure to be built to summarize data at multiple levels. We demonstrate our framework on five datasets, where the advantages of the proposed approach are validated.

7.
Social media and mobile devices have revolutionized the way people communicate and share information in various contexts, such as in cities. In today’s “smart” cities, massive amounts of geolocated content in multiple forms are generated daily in social media, from which knowledge about social interactions and urban dynamics can be derived. This work addresses the problem of detecting urban social activity patterns and interactions by modeling cities as “dynamic areas”, i.e., coherent geographic areas shaped through social activities. Social media users provide the information on such social activities and interactions when they are on the move around city neighborhoods. The proposed approach models city places as feature vectors which represent users’ visiting patterns (social activity), the time of observed visits (temporal activity), and the functional context of visited places (place category). To uncover the dynamics of city areas, a clustering approach is proposed which uses the derived feature vectors to group people’s activities with respect to location, time, and context. The proposed methodology has been implemented on the DynamiCITY platform, which demonstrates neighborhood analytics via a Web interface that allows end-users to explore neighborhood dynamics and gain insights into cross-neighborhood patterns and inter-relationships in the city.

8.
Mobile context modeling is a process of recognizing and reasoning about contexts and situations in a mobile environment, which is critical for the success of context-aware mobile services. While there are prior works on mobile context modeling, the use of unsupervised learning techniques for mobile context modeling is still under-explored. Indeed, unsupervised techniques have the ability to learn personalized contexts, which are difficult to predefine. To that end, in this paper, we propose an unsupervised approach to modeling personalized contexts of mobile users. Along this line, we first segment the raw context data sequences of mobile users into context sessions, where a context session contains a group of adjacent context records which are mutually similar and usually reflect similar contexts. Then, we exploit two methods for mining personalized contexts from context sessions. The first method is to cluster context sessions and then to extract the frequent contextual feature-value pairs from context session clusters as contexts. The second method leverages topic models to learn personalized contexts in the form of probabilistic distributions of raw context data from the context sessions. Finally, experimental results on real-world data show that the proposed approach is efficient and effective for mining personalized contexts of mobile users.

9.
In this study, we experiment with several multiobjective evolutionary algorithms to determine a suitable approach for clustering Web user sessions, which consist of sequences of Web pages visited by the users. Our experimental results show that the multiobjective evolutionary algorithm-based approaches are successful for sequence clustering. We use a commonly used cluster validity index to verify our findings. The results for this index indicate that the clustering solutions are of high quality. As a case study, the obtained clusters are then used in a Web recommender system for representing usage patterns. The experiments show that these approaches can be applied successfully to generate clustering solutions that lead to high recommendation accuracy in the recommender model used in this paper.

10.
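In semantic Web service discovery, the context of the service itself and of the user is a factor that cannot be ignored. To address the shortcomings of existing service discovery methods, a semantic Web service discovery method based on context and action reasoning is presented. The method builds an action-based context model to describe static and dynamic context information, uses action reasoning in dynamic description logic to perform context reasoning, and on this basis implements a context-sensitive semantic Web service discovery algorithm. A case study and a comparison with related work show that the method outperforms existing methods in both context description and reasoning capability. Experimental results also show that, at the cost of a reasonable increase in the time and space overhead of logical reasoning, the method provides service discovery results that better match user needs.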

11.
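Unsupervised word sense disambiguation based on k-means clustering   Cited by: 5 (self-citations: 3, citations by others: 5)
Unsupervised word sense disambiguation avoids the heavy workload of manual sense annotation, scales to large-scale disambiguation of polysemous words, and has broad application prospects. This paper proposes an unsupervised word sense disambiguation method that builds context vectors from second-order context, clusters them with the k-means algorithm, and finally disambiguates word senses by computing similarity. The experiments are carried out on the basis of extracted terms and achieve good average accuracies of 82.67% and 80.87% in two groups of tests on several high-frequency Chinese polysemous words.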

12.
Clustering has always been an exploratory but critical step in the knowledge discovery process. Although often unsupervised, the clustering task has attracted great interest when reinforced by different kinds of inputs provided by the user. This paper presents an approach that makes it possible to incorporate business knowledge in order to guide the clustering algorithm. A formalization of the fact that an intuitive a priori prioritization of the variables might exist is presented and applied in a direct marketing context using recent data. By providing the analyst with a new approach offering different clustering perspectives, this paper proposes a straightforward way to apply constrained clustering with soft attribute-level constraints based on feature order preferences.

13.
This paper proposes a dead-reckoning (DR)/WiFi fingerprinting/magnetic matching (MM) integration structure that uses off-the-shelf sensors in consumer portable devices and existing WiFi infrastructures. One key improvement of this structure over previous DR/WiFi/MM fusion structures is the introduction of a three-level quality-control (QC) mechanism based on the interaction between different techniques. On QC Level #1, several criteria are applied to filter out blunders or unreliable measurements in each separate technology. Then, on Level #2, a threshold-based approach is used to set the weight of WiFi results automatically through the investigation of the EKF innovation sequence. Finally, on Level #3, DR/WiFi results are utilized to limit the MM search space and in turn reduce both the mismatch rate and the computational load. The proposed structure reduced the root mean square (RMS) of position errors by 13.3% to 55.2% in walking experiments with two smartphones, under four motion conditions, and in two indoor environments. Furthermore, the proposed structure reduced the rate of mismatches (i.e., matching to an incorrect point that is geographically located over 15 m away from the true position) by over 75.0% when compared with previous DR/WiFi/MM integration structures.

14.
Clustering requires the user to define a distance metric, select a clustering algorithm, and set the hyperparameters of that algorithm. Getting these right, so that the resulting clustering meets the user's subjective criteria, can be difficult and tedious. Semi-supervised clustering methods make this easier by letting the user provide must-link or cannot-link constraints. These are then used to automatically tune the similarity measure and/or the optimization criterion. In this paper, we investigate a complementary way of using the constraints: they are used to select an unsupervised clustering method and tune its hyperparameters. It turns out that this very simple approach outperforms all existing semi-supervised methods. This implies that choosing the right algorithm and hyperparameter values is more important than modifying an individual algorithm to take constraints into account. In addition, the proposed approach allows for active constraint selection in a more effective manner than other methods.
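The selection idea in this abstract can be illustrated with a small sketch: run several unsupervised clusterings, score each by how many must-link and cannot-link constraints it satisfies, and keep the best one. The scoring rule (plain fraction of satisfied constraints), the candidate names and the label vectors are assumptions for the example; the paper may use a different score or search procedure.

```python
def constraint_score(labels, must_link, cannot_link):
    """Fraction of pairwise constraints that a clustering satisfies."""
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 0.0


def select_clustering(candidates, must_link, cannot_link):
    """Pick the candidate (name -> labels) with the highest constraint score."""
    return max(
        candidates.items(),
        key=lambda item: constraint_score(item[1], must_link, cannot_link),
    )[0]


# Hypothetical outputs of three algorithm/hyperparameter settings on 5 points.
candidates = {
    "kmeans_k2": [0, 0, 1, 1, 1],
    "kmeans_k3": [0, 0, 1, 1, 2],
    "dbscan_eps0.5": [0, 1, 1, 1, 1],
}
must_link = [(0, 1), (2, 3)]
cannot_link = [(1, 2), (3, 4)]
best = select_clustering(candidates, must_link, cannot_link)
```

Here the constraints never change how any single algorithm works; they only act as a validation signal for choosing among unmodified algorithms, which is the distinction the abstract draws from conventional semi-supervised clustering.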

15.
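Current Web service discovery methods do not make full use of user context information, which leads to shortcomings in discovery time and result accuracy. First, users with similar contexts, including the current user, are clustered to narrow the scope of service discovery. On this basis, using the current user's preference information and the QoS data of candidate services as perceived by historical users with similar contexts, a Web service discovery method based on historical users' QoS perception is presented, covering QoS data acquisition for candidate services and comprehensive weight calculation. Finally, experiments and a comparison with other Web service discovery methods show that the method achieves some improvement in both the accuracy of discovery results and time efficiency.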

16.
A recommender system is used in various fields to recommend items of interest to users. Most recommender approaches focus only on the users and items to make the recommendations. However, in many applications, it is also important to incorporate contextual information into the recommendation process. Although the use of contextual information has received considerable attention in recent years, there is a lack of automatic methods for obtaining such information for context-aware recommender systems. Some works address this problem by proposing supervised methods, which require greater human effort and whose results are not very satisfactory. In this scenario, we propose an unsupervised method to extract contextual information from Web page content. Our method builds topic hierarchies from the textual content of pages, considering, besides the traditional bag-of-words, valuable information in the texts such as named entities and domain terms (privileged information). The topics extracted from the hierarchies are used as contextual information in context-aware recommender systems. We conducted experiments using two data sets and two baselines: the first baseline is a recommendation system that does not use contextual information, and the second baseline is a method proposed in the literature to extract contextual information. The results are, in general, very good and present significant gains. In conclusion, our method has advantages and innovations: (i) it is unsupervised; (ii) it considers the context of the item (Web page), instead of the context of the user as in most of the few existing methods, which is an innovation; (iii) it uses privileged information in addition to the existing technical information from pages; and (iv) it presented good and promising empirical results. This work represents an advance in the state of the art in context extraction, which is an important contribution to context-aware recommender systems, a kind of specialized and intelligent system.

17.
Continuously identifying a user’s location context provides new opportunities to understand daily life and human behavior. Indoor location systems have been mainly based on WiFi infrastructures, which consume a great deal of energy, mostly due to keeping the user’s WiFi device connected to the infrastructure and to network communication, limiting the overall time during which a user can be tracked. In particular, such tracking systems on battery-limited mobile devices must be energy-efficient to limit the impact on the experience of using a phone. Recently, there have been many studies of energy-efficient positioning systems, but these have focused on outdoor positioning technologies. In this paper, we propose a novel indoor tracking framework that intelligently determines the location sampling rate and the frequency of network communication, to optimize the accuracy of the location data while remaining energy-efficient. This framework leverages an accelerometer, widely available on everyday smartphones, to reduce the duty cycle and the network communication frequency when a tracked user is moving slowly or not at all. Our framework can work for 14 h without charging, supporting applications that require this location information without affecting user experience.
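A very simple version of accelerometer-driven duty cycling, in the spirit of this abstract, is sketched below: the spread of the acceleration magnitude over a short window decides how long to wait before the next WiFi scan. The thresholds, the interval values and the linear interpolation between them are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def movement_level(samples):
    """Standard deviation of acceleration magnitude over a window of (x, y, z)."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    mean = sum(mags) / len(mags)
    return math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))


def scan_interval_s(samples, still=0.05, walking=0.5,
                    slow_interval=60.0, fast_interval=5.0):
    """Seconds until the next scan: long when still, short when moving.

    All thresholds and intervals are hypothetical defaults.
    """
    level = movement_level(samples)
    if level <= still:
        return slow_interval
    if level >= walking:
        return fast_interval
    # Linear interpolation between the two regimes.
    frac = (level - still) / (walking - still)
    return slow_interval + frac * (fast_interval - slow_interval)


# A phone lying on a desk (gravity only) versus one being carried.
resting = [(0.0, 0.0, 9.81)] * 10
carried = [(0.0, 0.0, 8.0), (0.0, 0.0, 12.0)] * 5
```

Calling `scan_interval_s(resting)` gives the long interval and `scan_interval_s(carried)` the short one, so radio use drops when the user is stationary, which is where the energy saving in such a scheme would come from.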

18.
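Unsupervised word sense disambiguation based on MDL clustering   Cited by: 2 (self-citations: 0, citations by others: 2)
Unsupervised word sense disambiguation avoids the heavy workload of manual sense annotation, scales to large-scale disambiguation of polysemous words, and has broad application prospects. An unsupervised word sense disambiguation method is proposed that uses the HowNet lexicon as its dictionary, builds context vectors from second-order context, clusters them with the MDL algorithm, and finally disambiguates word senses by computing similarity. The experiments are carried out on the basis of extracted terms and achieve a good average accuracy of 81.12% in tests on 8 high-frequency Chinese polysemous words.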

19.
The degree of personalization that a Web site offers in presenting its services to users is an important attribute contributing to the site's popularity. Web server access logs contain substantial data about user access patterns. One way to achieve such personalization is to group users on the basis of their Web interests and then organize the site's structure according to the needs of different groups. Two main difficulties inhibit this approach: the essentially infinite diversity of user interests and the change in these interests with time. We have developed a clustering algorithm that groups users according to their Web access patterns. The algorithm is based on the ART1 version of adaptive resonance theory. In our ART1-based algorithm, a prototype vector represents each user cluster by generalizing the URLs most frequently accessed by all cluster members. We have compared our algorithm's performance with the traditional k-means clustering algorithm. Results showed that the ART1-based technique performed better in terms of intracluster distances. We also applied the technique in a prefetching scheme that predicts future user requests.
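For readers unfamiliar with ART1, the sketch below shows a textbook-style fast-learning variant on binary user/URL vectors: a user joins the best-matching cluster whose prototype passes the vigilance test, and the prototype is narrowed to the URLs shared with the new member. This is a generic ART1 sketch under stated assumptions (the vigilance value, the `beta` constant and the sample data are made up), not the specific algorithm evaluated in the paper.

```python
def overlap(a, b):
    """Number of positions where both binary vectors are 1."""
    return sum(x & y for x, y in zip(a, b))


def art1_cluster(patterns, vigilance=0.6, beta=1.0):
    """Cluster binary vectors; returns (labels, prototypes)."""
    prototypes, labels = [], []
    for pattern in patterns:
        size = sum(pattern)
        # Rank existing prototypes by the ART1 choice function.
        order = sorted(
            range(len(prototypes)),
            key=lambda k: overlap(pattern, prototypes[k]) / (beta + sum(prototypes[k])),
            reverse=True,
        )
        chosen = None
        for k in order:
            # Vigilance test: enough of the input must match the prototype.
            if size and overlap(pattern, prototypes[k]) / size >= vigilance:
                # Fast learning: the prototype becomes the intersection.
                prototypes[k] = [x & y for x, y in zip(pattern, prototypes[k])]
                chosen = k
                break
        if chosen is None:
            # No prototype resonates: start a new cluster from this pattern.
            prototypes.append(list(pattern))
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes


# Rows are users, columns are URLs (1 = the user requested that URL).
users = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
labels, prototypes = art1_cluster(users)
```

Unlike k-means, the number of clusters is not fixed in advance; a higher vigilance value yields more, tighter clusters, and a lower one yields fewer, broader clusters.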

20.
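Mining user access patterns from Web logs   Cited by: 1 (self-citations: 0, citations by others: 1)
Web log mining is the application of data mining techniques to stored Web log data. This paper introduces Web log mining and, building on an analysis of an Apriori-like algorithm for discovering user access patterns, presents a rough-set-based method for clustering user access patterns.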
