Similar literature
 Found 20 similar documents (search time: 46 ms)
1.

Web page recommendation has attracted increasing attention in recent years. It differs from classical recommendation in several ways. For example, a recommender cannot simply apply the user-item utility prediction used in e-commerce recommendation, since that would run into the repeated item cold-start problem. Recent research generally classifies web page articles before recommending them. However, classification often requires manual labor, and each category may be too large. Some studies instead preprocess the web page corpus with a clustering method and achieve good results. Clustering methods differ considerably, though: some have high time complexity, and others rely on initial parameters refined by iterative computation, so their results may be unstable. To address these issues, we propose a web page recommendation method based on twofold clustering that considers both effectiveness and efficiency and takes popularity and freshness into account. The proposed clustering combines the strengths of density-based clustering and k-means clustering. The core idea is to run density-based clustering on sample data to obtain the number of clusters and the initial center of each cluster. The experimental results show that our method achieves better diversity and accuracy than state-of-the-art approaches.
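The core idea stated in this abstract (density-based clustering on a sample to obtain the cluster count and initial centers, then k-means on the full data) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes DBSCAN as the density-based method, treats pages as small numeric vectors, and leaves out the popularity and freshness factors. The names `dbscan`, `kmeans`, `twofold_cluster`, and the parameters `eps`, `min_pts`, and `sample_size` are hypothetical.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if math.dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = [k for k in range(len(points))
                      if math.dist(points[j], points[k]) <= eps]
            if len(nearby) >= min_pts:
                queue.extend(nearby)  # j is a core point, keep expanding
    return labels

def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))

def kmeans(points, centers, iters=20):
    """Plain k-means started from the given centers (fixed iteration count)."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, [nearest(p, centers) for p in points]

def twofold_cluster(points, sample_size, eps, min_pts, seed=0):
    """Stage 1: DBSCAN on a sample gives k and initial centers. Stage 2: k-means."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    labels = dbscan(sample, eps, min_pts)
    k = max(labels) + 1
    if k == 0:
        raise ValueError("density-based stage found no clusters; adjust eps/min_pts")
    centers = [mean([p for p, l in zip(sample, labels) if l == c]) for c in range(k)]
    return kmeans(points, centers)

# Hypothetical usage with two well-separated groups of 2-D points.
pages = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, assignment = twofold_cluster(pages, sample_size=6, eps=0.5, min_pts=2)
```

Seeding k-means from the density-based stage removes the need to guess the number of clusters and avoids random initial centers, which is the instability the abstract attributes to iterative methods.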


2.
Evolutionary algorithms (EAs) are often well-suited for optimization problems involving several, often conflicting objectives. Since 1985, various evolutionary approaches to multiobjective optimization have been developed that are capable of searching for multiple solutions concurrently in a single run. However, the few comparative studies of different methods presented up to now remain mostly qualitative and are often restricted to a few approaches. In this paper, four multiobjective EAs are compared quantitatively where an extended 0/1 knapsack problem is taken as a basis. Furthermore, we introduce a new evolutionary approach to multicriteria optimization, the strength Pareto EA (SPEA), that combines several features of previous multiobjective EAs in a unique manner. It is characterized by (a) storing nondominated solutions externally in a second, continuously updated population, (b) evaluating an individual's fitness dependent on the number of external nondominated points that dominate it, (c) preserving population diversity using the Pareto dominance relationship, and (d) incorporating a clustering procedure in order to reduce the nondominated set without destroying its characteristics. The proof-of-principle results obtained on two artificial problems as well as a larger problem, the synthesis of a digital hardware-software multiprocessor system, suggest that SPEA can be very effective in sampling from along the entire Pareto-optimal front and distributing the generated solutions over the tradeoff surface. Moreover, SPEA clearly outperforms the other four multiobjective EAs on the 0/1 knapsack problem.
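Features (a) and (b) of this abstract both rest on Pareto dominance, which can be illustrated with a small sketch. It assumes every objective is maximized (as in a knapsack profit setting) and represents a solution only by its tuple of objective values. `count_dominators` is a simplified stand-in for "the number of external nondominated points that dominate it", not SPEA's actual fitness formula; all function names are hypothetical.

```python
def dominates(a, b):
    """True if a dominates b under maximization:
    a is no worse in every objective and strictly better in at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def nondominated(points):
    """Keep the points that no other point dominates (the external set in (a))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def count_dominators(point, external):
    """Number of external nondominated points that dominate the given point."""
    return sum(dominates(e, point) for e in external)

# Hypothetical objective vectors (profit_1, profit_2).
population = [(1, 5), (3, 3), (5, 1), (2, 2), (1, 1)]
external = nondominated(population)
dominator_counts = {p: count_dominators(p, external) for p in population}
```

A lower count means a point is closer to the current front, so a selection scheme built on this count would favor points with few or no dominators.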

3.
4.
Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization   Cited by: 21 (self-citations: 1, citations by others: 20)
Web usage mining, possibly used in conjunction with standard approaches to personalization such as collaborative filtering, can help address some of the shortcomings of these techniques, including reliance on subjective user ratings, lack of scalability, and poor performance in the face of high-dimensional and sparse data. However, the discovery of patterns from usage data by itself is not sufficient for performing the personalization tasks. The critical step is the effective derivation of good quality and useful (i.e., actionable) aggregate usage profiles from these patterns. In this paper we present and experimentally evaluate two techniques, based on clustering of user transactions and clustering of pageviews, in order to discover overlapping aggregate profiles that can be effectively used by recommender systems for real-time Web personalization. We evaluate these techniques both in terms of the quality of the individual profiles generated, as well as in the context of providing recommendations as an integrated part of a personalization engine. In particular, our results indicate that using the generated aggregate profiles, we can achieve effective personalization at early stages of users' visits to a site, based only on anonymous clickstream data and without the benefit of explicit input by these users or deeper knowledge about them.

5.
Users of a Web site usually perform their interest-oriented actions by clicking or visiting Web pages, which are traced in access log files. Clustering Web user access patterns may capture common user interests in a Web site and, in turn, build user profiles for advanced Web applications, such as Web caching and prefetching. The conventional Web usage mining techniques for clustering Web user sessions can discover usage patterns directly, but cannot identify the latent factors or hidden relationships among users' navigational behaviour. In this paper, we propose an approach based on a vector space model, called Random Indexing, to discover such intrinsic characteristics of Web users' activities. The underlying factors are then utilised for clustering individual user navigational patterns and creating common user profiles. The clustering results will be used to predict and prefetch Web requests for grouped users. We demonstrate the usability and superiority of the proposed Web user clustering approach through experiments on a real Web log file. The clustering and prefetching tasks are evaluated by comparison with previous studies, demonstrating better clustering performance and higher prefetching accuracy.

6.
Web caching has been proposed as an effective solution to the problems of network traffic and congestion, Web object access, and Web load balancing. This paper presents a model for optimizing Web cache content by applying either a genetic algorithm or an evolutionary programming scheme for Web cache content replacement. Three policies are proposed for each of the genetic algorithm and the evolutionary programming techniques, in relation to object staleness factors and retrieval rates. A simulation model is developed, and long-term trace-driven simulation is used to experiment on the proposed techniques. The results indicate that all evolutionary techniques are beneficial to cache replacement, compared to the conventional replacement applied in most Web cache servers. Under an appropriate objective function, the genetic algorithm has been shown to be the best of all approaches with respect to cache hit and byte hit ratios.

7.
In this paper, we propose a parallel multiobjective evolutionary algorithm called Parallel Criterion-based Partitioning MOEA (PCPMOEA), with an application to the Multiobjective Knapsack Problem (MOKP). The suggested search strategy is based on a periodic partitioning of potentially efficient solutions, which are distributed to multiple multiobjective evolutionary algorithms (MOEAs). Each MOEA is dedicated to a sole objective, in which it combines both criterion-based and dominance-based approaches. The suggested algorithm addresses two main sub-objectives: minimizing the distance between the current non-dominated solutions and the ideal point, and ensuring the spread of the potentially efficient solutions. Experimental results are included, where we assess the performance of the suggested algorithm against the above mentioned sub-objectives, compared with state-of-the-art results using well-known multi-objective metaheuristics.

8.
9.
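Tang Zhe, Ding Eryu, Luo Bin, Chen Shifu. 《计算机科学》 (Computer Science), 2005, 32(12): 193-196
Recommender systems are used by e-commerce sites to provide customers with information that helps them choose products. The basic idea is to infer a customer's likely future behavior from statistical results or from records of the customer's past behavior, and to make recommendations accordingly. This paper briefly surveys recommender systems based on traditional techniques and on Web mining techniques, describes the workflow of a Web-mining-based recommender system, and focuses on analyzing the specific Web mining techniques applied in recommender systems and comparing their algorithms.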

10.
The segmentation task in the feature space of an image can be formulated as an optimization problem. Recent research has demonstrated that clustering techniques using only one objective may not obtain a suitable solution, because a single objective function can provide satisfactory results for only one kind of data set. In this letter, a novel multiobjective clustering approach, named the quantum-inspired multiobjective evolutionary clustering algorithm (QMEC), is proposed to deal with the problem of image segmentation, where two objectives are simultaneously optimized. Based on the concepts and principles of quantum computing, multi-state quantum bits are used to represent individuals, and a quantum rotation gate strategy is used to update the probabilistic individuals. The proposed algorithm can take advantage of the multiobjective optimization mechanism and the superposition of quantum states, and therefore has good population diversity and search capabilities. Because multiobjective clustering problems yield a set of nondominated solutions, a simple heuristic method is adopted to select a preferred solution from the final Pareto front, and the results show that a good image segmentation result is selected. Experiments on one simulated synthetic aperture radar (SAR) image and two real SAR images have shown the superiority of QMEC over three other known algorithms.

11.
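Clustering and Classification of Deep Web Data Sources   Cited by: 1 (self-citations: 0, citations by others: 1)
With the rapid growth of information on the Internet, much Web information has been deepened by all kinds of searchable online databases and is hidden behind Web query interfaces. For technical reasons, traditional search engines cannot index this information, known as Deep Web information. This paper analyzes the various types of Deep Web query interfaces, studies a data source clustering method based on query interface features and a data source classification method based on the clustering results, and discusses a rule extraction algorithm that extracts query probe sets from rule-based and linear document classifiers, as well as a query probing algorithm for classifying Web document databases.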

12.
The supply trajectory of electric power for a submerged arc magnesia furnace determines the yield and grade of magnesia grain during the manufacturing process. As the two production targets (i.e., the yield and the grade of magnesia grain) conflict and the process is subject to changing conditions, the supply of electric power needs to be dynamically optimized to track the Pareto optimal set as it moves over time. A hybrid evolutionary multiobjective optimization strategy is proposed to address the dynamic multiobjective optimization problem. The hybrid strategy is based on two techniques. The first uses case-based reasoning to immediately generate good solutions to adjust the power supply once the environment changes, and then applies a multiobjective evolutionary algorithm to solve the problem accurately. The second learns from the case solutions to guide and promote the search of the evolutionary algorithm, and the best solutions found by the evolutionary algorithm can be used to update the case library to improve the accuracy of case-based reasoning in the subsequent process. Owing to this mutual promotion, the hybrid strategy can continuously adapt and search in dynamic environments. Two prominent multiobjective evolutionary algorithms are integrated into the hybrid strategy to solve the dynamic multiobjective power supply optimization problem. The results from a series of experiments show that the proposed hybrid algorithms perform better than their component multiobjective evolutionary algorithms on the tested problems.

13.
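Chen Juan, Wang Xian, Huang Qingsong. 《现代计算机》 (Modern Computer), 2006, (9): 19-21, 62
In recent years, the Web has been rapidly deepened by online databases. On the deep Web, a large number of sources provide rich data schemas that specify their target domains and query capabilities, so integrating data at large scale is a current challenge. Cluster analysis is an important method in data mining. This paper discusses clustering structured Web sources through their query interfaces using agglomerative hierarchical clustering, with a slight refinement that clusters first and then classifies. Experiments show that, for clustering Web query schemas, agglomerative hierarchical clustering organizes the sources correctly.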

14.
Recommender systems combine ideas from information retrieval, user modelling, and artificial intelligence to focus on the provision of more intelligent and proactive information services. As such, recommender systems play an important role when it comes to assisting the user during both routine and specialised information retrieval tasks. As with any good assistant, it is important that users can trust in the ability of a recommender system to respond with timely and relevant suggestions. In this paper, we will look at a collaborative recommendation system operating in the domain of Web search. We will show how explicit models of trust can help to inform more reliable recommendations that translate into more relevant search results. Moreover, we demonstrate how the availability of this trust model facilitates important interface enhancements that provide a means to declare the provenance of result recommendations in a way that will allow searchers to evaluate their likely relevance based on the reputation and trustworthiness of the recommendation partners behind these suggestions.

15.
Time-Aware Web Users' Clustering   Cited by: 1 (self-citations: 0, citations by others: 1)
Web users' clustering is a crucial task for mining information related to users' needs and preferences. Up to now, popular clustering approaches build clusters based on usage patterns derived from users' page preferences. This paper emphasizes the need to discover similarities in users' accessing behavior with respect to the time locality of their navigational acts. In this context, we present two time-aware clustering approaches for tuning and binding the page and time visiting criteria. The two tracks of the proposed algorithms define clusters with users that show similar visiting behavior at the same time period, by varying the priority given to page or time visiting. The proposed algorithms are evaluated using both synthetic and real data sets and the experimentation has shown that the new clustering schemes result in enriched clusters compared to those created by the conventional non-time-aware user clustering approaches. These clusters contain users exhibiting similar access behavior in terms not only of their page preferences but also of their access time.

16.
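Session identification is the foundation and a key step of user access behavior analysis, and its quality has a decisive influence on identifying and discovering users' information needs. The commonly used approach today splits sessions by a time threshold, but its main problem is that the threshold is hard to determine accurately for different users. This paper proposes a new clustering-based optimization method for session identification: it first builds a clustering-based session identification optimization model and then performs session identification with an improved K-means algorithm. Experimental results show that the method performs better than traditional methods.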
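The time-threshold baseline that this abstract criticizes can be sketched in a few lines. This shows only the conventional splitting rule, not the proposed clustering-based method, whose details the abstract does not give. The function name `split_sessions` and the 1800-second default are illustrative assumptions.

```python
def split_sessions(timestamps, threshold_seconds=1800):
    """Split one user's request timestamps (in seconds) into sessions.

    A new session starts whenever the gap to the previous request
    exceeds threshold_seconds. The default of 1800 is only an
    illustrative value, not one taken from the abstract.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= threshold_seconds:
            sessions[-1].append(t)   # gap is small enough: same session
        else:
            sessions.append([t])     # first request or large gap: new session
    return sessions

# Hypothetical request times for a single user.
sessions = split_sessions([0, 60, 120, 4000, 4100])
```

Because a single fixed threshold is applied to every user, a slow reader and a fast clicker are cut by the same rule, which is the weakness the abstract points to.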

17.
The field of computational biology encompasses a wide range of optimization problems that show non-deterministic polynomial-time hard complexities. Nowadays, phylogeneticians are dealing with a growing amount of biological data that must be analyzed to explain the origins of modern species. Evolutionary relationships among organisms are often described by means of tree-shaped structures known as phylogenetic trees. When inferring phylogenies, two main challenges must be addressed. First, the inference of reliable evolutionary trees on data sets where different optimality principles support conflicting evolutionary hypotheses. Second, the processing of enormous tree search spaces where traditional sequential strategies cannot be applied. In this sense, phylogenetic inference can benefit from the combination of high performance computing and evolutionary computation to carry out the reconstruction of complex evolutionary histories in reduced execution times. In this paper, we introduce multiobjective phylogenetics, a hybrid OpenMP/MPI approach to parallelize a well-known multiobjective metaheuristic, the fast non-dominated sorting genetic algorithm (NSGA-II). This algorithm has been designed to conduct phylogenetic analyses on multi-core clusters in accordance with two principles: maximum parsimony and maximum likelihood. The main goal is to combine the benefits of shared-memory and distributed-memory programming paradigms to efficiently infer a set of high-quality Pareto solutions. Experiments on six real nucleotide data sets and comparisons with other hybrid parallel approaches show that multiobjective phylogenetics is able to achieve significant performance in terms of parallel, multiobjective, and biological results. Copyright © 2014 John Wiley & Sons, Ltd.

18.
Efficient phrase-based document indexing for Web document clustering   Cited by: 4 (self-citations: 0, citations by others: 4)
Document clustering techniques mostly rely on single term analysis of the document data set, such as the vector space model. To achieve more accurate document clustering, more informative features including phrases and their weights are particularly important in such scenarios. Document clustering is particularly useful in many applications such as automatic categorization of documents, grouping search engine results, building a taxonomy of documents, and others. This article presents two key parts of successful document clustering. The first part is a novel phrase-based document index model, the document index graph, which allows for incremental construction of a phrase-based index of the document set with an emphasis on efficiency, rather than relying on single-term indexes only. It provides efficient phrase matching that is used to judge the similarity between documents. The model is flexible in that it could revert to a compact representation of the vector space model if we choose not to index phrases. The second part is an incremental document clustering algorithm based on maximizing the tightness of clusters by carefully watching the pair-wise document similarity distribution inside clusters. The combination of these two components creates an underlying model for robust and accurate document similarity calculation that leads to much improved results in Web document clustering over traditional methods.

19.
SW-Store: a vertically partitioned DBMS for Semantic Web data management   Cited by: 3 (self-citations: 0, citations by others: 3)
Efficient management of RDF data is an important prerequisite for realizing the Semantic Web vision. Performance and scalability issues are becoming increasingly pressing as Semantic Web technology is applied to real-world applications. In this paper, we examine the reasons why current data management solutions for RDF data scale poorly, and explore the fundamental scalability limitations of these approaches. We review the state of the art for improving performance of RDF databases and consider a recent suggestion, “property tables”. We then discuss practically and empirically why this solution has undesirable features. As an improvement, we propose an alternative solution: vertically partitioning the RDF data. We compare the performance of vertical partitioning with prior art on queries generated by a Web-based RDF browser over a large-scale (more than 50 million triples) catalog of library data. Our results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design. Further, if a column-oriented DBMS (a database architected specially for the vertically partitioned case) is used instead of a row-oriented DBMS, another order of magnitude performance improvement is observed, with query times dropping from minutes to several seconds. Encouraged by these results, we describe the architecture of SW-Store, a new DBMS we are actively building that implements these techniques to achieve high performance RDF data management.
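The vertical partitioning proposed in this abstract can be pictured with a small in-memory sketch. It assumes that vertical partitioning means storing one two-column (subject, object) table per property, kept sorted by subject. The abstract does not spell out the layout, so this is an interpretation rather than SW-Store's design, and the function names and sample triples are hypothetical.

```python
from collections import defaultdict

def vertically_partition(triples):
    """Group (subject, property, object) triples into one table per property.

    Each table is a list of (subject, object) rows sorted by subject,
    which is the assumed layout for this sketch.
    """
    tables = defaultdict(list)
    for subject, prop, obj in triples:
        tables[prop].append((subject, obj))
    for rows in tables.values():
        rows.sort()
    return dict(tables)

def subjects_with(tables, prop, obj):
    """Answer 'which subjects have this property value?' by scanning one table."""
    return [s for s, o in tables.get(prop, []) if o == obj]

# Hypothetical library-catalog triples.
triples = [
    ("b1", "type", "Book"),
    ("b1", "title", "X"),
    ("b2", "type", "Book"),
    ("b3", "type", "DVD"),
]
tables = vertically_partition(triples)
books = subjects_with(tables, "type", "Book")
```

A query that touches one property reads only that property's table instead of scanning every triple, which suggests why a column-oriented store suits this layout.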

20.
Multiobjective evolutionary algorithms for electric power dispatch problem   Cited by: 6 (self-citations: 0, citations by others: 6)
The potential and effectiveness of the newly developed Pareto-based multiobjective evolutionary algorithms (MOEA) for solving a real-world power system multiobjective nonlinear optimization problem are comprehensively discussed and evaluated in this paper. Specifically, nondominated sorting genetic algorithm, niched Pareto genetic algorithm, and strength Pareto evolutionary algorithm (SPEA) have been developed and successfully applied to an environmental/economic electric power dispatch problem. A new procedure for quality measure is proposed in this paper in order to evaluate different techniques. A feasibility check procedure has been developed and superimposed on MOEA to restrict the search to the feasible region of the problem space. A hierarchical clustering algorithm is also imposed to provide the power system operator with a representative and manageable Pareto-optimal set. Moreover, an approach based on fuzzy set theory is developed to extract one of the Pareto-optimal solutions as the best compromise one. These multiobjective evolutionary algorithms have been individually examined and applied to the standard IEEE 30-bus six-generator test system. Several optimization runs have been carried out on different cases of problem complexity. The results of MOEA have been compared to those reported in the literature. The results confirm the potential and effectiveness of MOEA compared to the traditional multiobjective optimization techniques. In addition, the results demonstrate the superiority of the SPEA as a promising multiobjective evolutionary algorithm to solve different power system multiobjective optimization problems.
