Similar Literature
 20 similar articles found.
1.
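Personalized recommendation method based on fuzzy clusters   Total citations: 3, self-citations: 0, citations by others: 3
This paper proposes a recommendation method that uses fuzzy clustering to combine the similarity of item attribute features with collaborative filtering. The method converts a user's preferences for individual items into preferences for groups of similar items, with the aim of building dense user-to-fuzzy-cluster preference information. It also uses the similarity between items within those groups to make a preliminary prediction of the user's ratings for unrated items, and then performs user-based collaborative filtering on top of these predictions. Experimental results show that the method improves the recommendation accuracy of collaborative filtering.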

2.
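As personalized recommendation technology develops, recommender systems face a growing number of challenges. Traditional recommendation algorithms commonly suffer from data sparsity and low recommendation accuracy. To address these problems, this paper proposes K-TLFM (Time Based Latent Factor Model Integrated with k-means), a recommendation algorithm that combines time-aware latent-factor filling with subgroup partitioning. The algorithm uses a latent factor model that incorporates time to fill in the missing entries of the original user-item rating matrix. This avoids the error introduced by completing the matrix with the global mean or the user/item mean and effectively alleviates data sparsity, while the time factor captures how user preferences change over time. After the missing entries are filled, a bisecting k-means clustering algorithm groups objects with similar preferences and interests into the same subgroup, and a selected collaborative filtering algorithm generates the recommendation list within the target user's subgroup, which improves recommendation efficiency and accuracy. Comparative experiments on the MovieLens and Netflix datasets show that the algorithm achieves higher recommendation accuracy.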

3.
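Collaborative filtering recommendation algorithms learn from rating data. To address the sparsity of rating data and the scalability problem of collaborative filtering, this paper proposes a collaborative filtering recommendation algorithm based on clustering and user preferences. To mine user preferences, the algorithm adds each user's average rating for each item type to the rating matrix and incorporates a similarity based on the users' own attributes. To reduce data sparsity, it fills the unrated entries with the Weighted Slope One algorithm, then clusters the users in the filled rating data with a K-means algorithm whose initial cluster centers are optimized using density and distance, which narrows the search space for similar users. Finally, a traditional collaborative filtering algorithm generates recommendations for the target user within the clustered dataset. Experiments on the MovieLens100K dataset show that the proposed algorithm improves recommendation quality to some extent.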
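The Weighted Slope One filling step mentioned in the abstract above can be illustrated with a minimal sketch. This is the standard Weighted Slope One predictor, not the authors' implementation; the function name `weighted_slope_one` and the nested-dict rating layout are assumptions made for illustration.

```python
from collections import defaultdict

def weighted_slope_one(ratings, user, target):
    """Predict `user`'s rating of `target` with Weighted Slope One.

    ratings: dict mapping user -> {item: rating} (hypothetical layout).
    Returns None when no co-rated item pair supports a prediction.
    """
    diff_sum = defaultdict(float)  # sum of (r_target - r_j) over users who rated both
    count = defaultdict(int)       # number of users who rated both target and j
    for other_ratings in ratings.values():
        if target not in other_ratings:
            continue
        for j, r_j in other_ratings.items():
            if j != target:
                diff_sum[j] += other_ratings[target] - r_j
                count[j] += 1

    num = den = 0.0
    for j, r_uj in ratings[user].items():
        if j != target and count[j] > 0:
            dev = diff_sum[j] / count[j]      # average deviation of target over j
            num += (dev + r_uj) * count[j]    # weight each estimate by its support
            den += count[j]
    return num / den if den else None
```

Calling this for every missing (user, item) entry and storing the non-`None` results would produce a filled rating matrix of the kind the abstract then clusters.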

4.
Collaborative recommendation (CR) is a popular method of filtering items that may interest social users by referring to the opinions of friends and acquaintances in networks and computer applications. However, CR suffers from a cold-start problem, in which a newly established recommender system usually exhibits low recommendation accuracy because of insufficient data, such as a lack of ratings from users. In this study, we rigorously identify active users in social networks, who are likely to share and accept a recommendation in each data cluster, to enhance the performance of the recommendation system and solve the cold-start problem. A novel modified CR method called div-clustering is presented to cluster Web entities whose properties are specified formally in a recommendation framework, with the reusability of the user modeling component considered. We improve the traditional k-means clustering algorithm by applying supplementary steps such as compensating for nominal values supported by the knowledge base, as well as computing and updating the k value. We use data from two different cases to test for accuracy and demonstrate the high quality of div-clustering against a baseline CR algorithm. The experimental results of both offline and online evaluations, which also consider the volunteer profiles in detail, indicate that the CR system with div-clustering obtains more accurate results than the baseline system.

5.
In this paper, we present a fast global k-means clustering algorithm that makes use of the cluster membership and geometrical information of a data point. This algorithm is referred to as MFGKM. The algorithm uses a set of inequalities developed in this paper to determine a starting point for the jth cluster center of global k-means clustering. Adopting multiple cluster center selection (MCS) for MFGKM, we also develop another clustering algorithm called MFGKM+MCS. MCS determines more than one starting point for each cluster split, whereas the available fast and modified global k-means clustering algorithms select one starting point for each cluster split. Our proposed method MFGKM can obtain the least distortion, while MFGKM+MCS may give the least computing time. Compared to the modified global k-means clustering algorithm, our method MFGKM can reduce the computing time and number of distance calculations by a factor of 3.78-5.55 and 21.13-31.41, respectively, with an average distortion reduction of 5,487 for the Statlog data set. Compared to the fast global k-means clustering algorithm, our method MFGKM+MCS can reduce the computing time by a factor of 5.78-8.70 with an average distortion reduction of 30,564 using the same data set. The performance of our proposed methods is more remarkable when a data set of higher dimension is divided into more clusters.

6.
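He Ming, Sun Wang, Xiao Run, Liu Weishi. 《计算机科学》 (Computer Science), 2017, 44(Z11): 391-396
Collaborative filtering can predict the items a user may be interested in from the known preferences of users, and it is currently the most successful and widely applied recommendation technique. However, traditional collaborative filtering is limited by data sparsity and produces poor recommendations. Most current collaborative filtering algorithms analyze only the user-item rating matrix and ignore item attribute features and users' preferences for those features. To address these problems, this paper proposes a collaborative filtering recommendation algorithm that combines clustering with user interest preferences. First, a user interest preference matrix over item types is built from the user rating matrix and item type information. Then the K-Means algorithm clusters the item set, and the nearest-neighbor users for each entry to be estimated are found using the user interest preference matrix. On this basis, a weighted Slope One algorithm that incorporates item similarity fills the sparse matrix within each item cluster to alleviate data sparsity. Users are then clustered on the user interest preference matrix. Finally, on the filled rating matrix, user-based collaborative filtering predicts item ratings within each user cluster. Experimental results show that the proposed algorithm effectively alleviates the sparsity of the original rating matrix and improves recommendation quality.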

7.
We present the global k-means algorithm, which is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) executions of the k-means algorithm from suitable initial positions. We also propose modifications of the method to reduce the computational load without significantly affecting solution quality. The proposed clustering methods are tested on well-known data sets and they compare favorably to the k-means algorithm with random restarts.
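The incremental search described in the abstract above can be sketched as follows. This is a minimal illustration of the basic (unaccelerated) procedure only: to go from k-1 to k centers, every data point is tried as the initial position of the new center and the lowest-error k-means result is kept. The helper names (`sqdist`, `nearest`, `mean`, `kmeans`, `global_kmeans`) and the iteration cap are assumptions, and the speed-up modifications the abstract mentions are not shown.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: sqdist(p, centers[i]))

def mean(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(group) for col in zip(*group)]

def kmeans(points, centers, iters=100):
    """Plain Lloyd iterations from the given initial centers; returns (centers, sse)."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        new = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    sse = sum(sqdist(p, centers[nearest(p, centers)]) for p in points)
    return centers, sse

def global_kmeans(points, k):
    """Add one center at a time, trying every data point as its start."""
    centers = [mean(points)]  # the 1-center solution is the data mean
    sse = sum(sqdist(p, centers[0]) for p in points)
    for _ in range(1, k):
        best = None
        for p in points:  # N runs of k-means per added center
            cand = kmeans(points, centers + [list(p)])
            if best is None or cand[1] < best[1]:
                best = cand
        centers, sse = best
    return centers, sse
```

Because each added center costs N full k-means runs, this sketch is only practical for small data sets, which is the computational load the abstract's proposed modifications aim to reduce.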

8.
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach. They iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve on the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time-consuming and memory-demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need for computing or storing the whole affinity matrix and thereby to reduce computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms.

9.
Clustering is one of the most widely used knowledge discovery techniques for revealing structures in a dataset that can be extremely useful to the analyst. In iterative clustering algorithms, the procedure adopted for choosing initial cluster centers is extremely important, as it has a direct impact on the formation of the final clusters. Since clusters are separated groups in a feature space, it is desirable to select initial centers that are well separated. In this paper, we propose an algorithm to compute initial cluster centers for the k-means algorithm. The algorithm is applied to several datasets of different dimensions for illustrative purposes. It is observed that the newly proposed algorithm performs well in obtaining the initial cluster centers for the k-means algorithm.

10.
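Traditional collaborative filtering recommendation algorithms do not fully consider how factors such as user attributes and item category partitioning affect similarity computation, and they suffer from data sparsity, which leads to low recommendation accuracy. To address this, a collaborative filtering recommendation algorithm based on user attribute clustering and item partitioning is proposed, which gives full consideration to the similarity computation that strongly affects recommendation accuracy. Users are first clustered by their identity attributes with a clustering algorithm, and items are then partitioned into categories. Category similarity is added to the similarity computation, and the number of co-rating users is taken into account through a weighting coefficient to compute a combined similarity. Finally, the combined similarity is used together with the average similarity, and a threshold method determines the nearest neighbors. Experimental results show that the proposed algorithm effectively improves recommendation accuracy and provides users with more accurate recommended items.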

11.
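The main task of a recommender system is to proactively and accurately provide users with information or services they may be interested in. Collaborative filtering is one of the most widely adopted recommendation algorithms, but data sparsity often seriously degrades recommendation quality. To solve this problem, a collaborative filtering recommendation algorithm based on bipartite graph partitioning co-clustering is proposed. First, users and items are modeled as a bipartite graph and co-clustered, which maps them into a low-dimensional latent feature space. Second, two similarity computation strategies, cluster preference similarity and rating similarity, are improved using the clustering results and then combined. Based on the combined similarity, user-based and item-based methods are each used to predict the unknown target ratings. Finally, these predictions are fused. Experimental results show that the proposed algorithm outperforms the latest co-clustering collaborative filtering recommendation algorithms.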

12.
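To address the sparse rating data and low recommendation accuracy of traditional collaborative filtering recommendation algorithms, a collaborative filtering recommendation algorithm with optimized clustering is proposed. The original rating matrix is preprocessed according to differences in users' ratings, and the resulting user-item rating matrix is combined with the item type matrix to construct a user category preference matrix, which better reflects users' interests and alleviates data sparsity. On this matrix, a fuzzy clustering algorithm optimized by flower pollination clusters the users, improving clustering quality. The similarity of item preference information and the similarity from the item rating matrix are then combined by weighted sum to obtain multiple nearest neighbors. A time factor is incorporated when predicting item ratings for the target user, which reduces the effect of changing user interests on recommendation quality. Experiments on the MovieLens 100k dataset show that the proposed algorithm alleviates data sparsity and improves recommendation accuracy.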

13.
In this paper, we present a modified filtering algorithm (MFA) that makes use of center variations to speed up the clustering process. Our method first divides clusters into static and active groups. We use the information of cluster displacements to reject unlikely cluster centers for all nodes in the kd-tree. We reduce the computational complexity of the filtering algorithm (FA) by finding candidates for each node mainly from the set of active cluster centers. Two conditions for determining the set of candidate cluster centers for each node from active clusters are developed. Our approach differs from the major available algorithm, which passes no information from one stage of iteration to the next. Theoretical analysis shows that our method can reduce the computational complexity of FA, in terms of the number of distance calculations, at each stage of iteration by a factor of FC/AC, where FC and AC are the numbers of total clusters and active clusters, respectively. Compared with the FA, our algorithm can effectively reduce the computing time and number of distance calculations. It is noted that our proposed algorithm generates the same clusters as those produced by hard k-means clustering. The superiority of our method is more remarkable when a larger data set with higher dimension is used.

14.
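As the mobile Internet keeps growing, traditional recommender systems seldom consider the combined effect of multiple contextual factors and user confidence on user preference prediction, which biases the predictions of recommendation algorithms. To address this problem, contextual information is introduced into the personalized recommendation process, and a collaborative filtering algorithm based on context similarity and two-stage clustering is proposed. The algorithm first performs an initial clustering of users by the similarity of their contexts, then computes each user's rating confidence from the rating matrix and divides users into core and non-core users. It then adjusts the centers of the initial clusters according to the core users' ratings and re-clusters the non-core users in each cluster to form new clusters. Finally, user preferences are predicted from context similarity. The algorithm reduces, to some extent, the influence of noise in the rating matrix on the clustering result and improves the accuracy of the recommendations. Simulation experiments on a real dataset show that, compared with traditional collaborative filtering, the algorithm effectively improves the accuracy of user preference prediction and the precision of collaborative filtering recommendation.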

15.
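Collaborative filtering is currently one of the most important techniques in e-commerce recommender systems, but as systems grow it faces two major challenges: algorithm scalability and data sparsity. To address these problems, this paper proposes a hybrid recommendation algorithm based on clustering and collaborative filtering. First, clustering is used to classify items, and recommendation is computed only within the classes the user is interested in, which effectively addresses scalability. Next, within each class, item-based collaborative filtering predicts ratings for unrated items, and the better predictions are filled into the original user-item set, which effectively alleviates data sparsity. Finally, collaborative filtering computes neighbor users within the range of similar items, gives the final predicted ratings, and generates recommendations. Experimental results show that the algorithm effectively addresses both problems and improves the recommendation quality of the recommender system.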

16.
Intrusion detection is a necessary step to identify unusual access or attacks in order to secure internal networks. In general, intrusion detection can be approached with machine learning techniques. In the literature, advanced techniques based on hybrid learning or ensemble methods have been considered, and related work has shown that they are superior to models using single machine learning techniques. This paper proposes a hybrid learning model based on triangle area based nearest neighbors (TANN) in order to detect attacks more effectively. In TANN, k-means clustering is first used to obtain cluster centers corresponding to the attack classes. Then, the area of the triangle formed by two cluster centers and one data point from the given dataset is calculated and forms a new feature signature of the data. Finally, the k-NN classifier is used to classify similar attacks based on the new feature represented by triangle areas. Using KDD-Cup ’99 as the simulation dataset, the experimental results show that TANN can effectively detect intrusion attacks and provides higher accuracy and detection rates, and a lower false alarm rate, than three baseline models based on support vector machines, k-NN, and the hybrid centroid-based classification model that combines k-means and k-NN.
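The triangle-area feature step in the abstract above can be illustrated with a short sketch. The abstract does not say how the areas are assembled into a signature, so this sketch assumes one area per pair of cluster centers, computed with Heron's formula from Euclidean side lengths. The names `triangle_area` and `triangle_features` are hypothetical.

```python
import math
from itertools import combinations

def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c (any dimension), via Heron's formula."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    s = (ab + bc + ca) / 2  # semi-perimeter
    # max(..., 0.0) guards against tiny negative values from rounding
    # when the three points are (nearly) collinear.
    return math.sqrt(max(s * (s - ab) * (s - bc) * (s - ca), 0.0))

def triangle_features(x, centers):
    """Assumed signature: one triangle area per pair of cluster centers."""
    return [triangle_area(x, ci, cj) for ci, cj in combinations(centers, 2)]
```

With m cluster centers this yields m(m-1)/2 features per data point, which a k-NN classifier could then consume in place of the raw attributes.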

17.
DIVCLUS-T is a divisive hierarchical clustering algorithm based on a monothetic bipartitional approach allowing the dendrogram of the hierarchy to be read as a decision tree. It is designed for either numerical or categorical data. Like the Ward agglomerative hierarchical clustering algorithm and the k-means partitioning algorithm, it is based on the minimization of the inertia criterion. However, unlike Ward and k-means, it provides a simple and natural interpretation of the clusters. The price paid by construction in terms of inertia by DIVCLUS-T for this additional interpretation is studied by applying the three algorithms on six databases from the UCI Machine Learning repository.

18.
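Traditional item-based collaborative filtering relies only on rating data when computing item similarity and does not consider the features of the items themselves. With the emergence of social tagging, tags can reflect item features to some extent, but tags are semantically ambiguous, so incorporating them directly into collaborative filtering is problematic. To solve these problems, an improved item-based collaborative filtering recommendation algorithm is proposed. The algorithm clusters tags to generate topic tag clusters, computes the relevance between items and topics from how the items are tagged to produce an item-topic relevance matrix, and combines this matrix with the item-rating matrix to compute the similarity between items. Collaborative filtering then predicts ratings for the target items to deliver personalized recommendations. Experimental results on the Movielens dataset show that the algorithm resolves the semantic ambiguity of tags and improves recommendation quality.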

19.
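Collaborative filtering, the most widely used recommendation algorithm in the field of recommender systems, still suffers from data sparsity, cold-start, and scalability problems. Targeting the user cold-start and scalability problems, a PCEDS (pearson correlation coefficient and euclidean distance similarity) collaborative filtering recommendation algorithm based on improved clustering is proposed. First, users are clustered on their attribute features with an optimized K-means clustering algorithm. Then a trust-based user attribute feature similarity model is combined with a user preference similarity model to form a novel PCEDS similarity model, and a prediction model is built on the clustering results. Experimental results show that the proposed PCEDS algorithm reduces the root mean square error (RMSE) by about 5% compared with traditional collaborative filtering, and that both precision and recall improve noticeably. The algorithm alleviates the cold-start problem, and the clustering technique saves in-memory computation space, which improves recommendation efficiency.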

20.
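Ding Yonggang, Li Shijun, Yu Wei, Wang Jun. 《计算机科学》 (Computer Science), 2017, 44(10): 182-186
Traditional collaborative filtering recommendation algorithms commonly suffer from data sparsity, and they compute user similarity from a single overall rating, so they cannot find users with similar preferences across multiple criteria, which hurts recommendation accuracy. Multi-criteria rating recommendation algorithms try to find users with similar preferences across multiple criteria, but their high rating cost makes data sparsity even worse. To find users whose preferences are similar to the target user's across multiple criteria, the paper proposes using codebook-based clustering to obtain information about users' rating styles on each criterion, then co-clustering users and items on each criterion based on that rating-style information, and finally making recommendations with a factorization machine model (Factorization Machines, FMs) that uses the users, items, multi-criteria rating information, and rating-style information within the same cluster. Experimental results show that, compared with traditional collaborative filtering and other multi-criteria recommendation methods, the factorization machine recommendation algorithm based on multi-criteria rating information alleviates data sparsity to some extent and improves recommendation accuracy.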
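For readers unfamiliar with factorization machines, the prediction rule of a standard second-order FM can be sketched as below. This shows only the generic model equation, a bias plus linear terms plus pairwise interactions weighted by inner products of latent vectors. How the paper encodes users, items, criteria, and rating styles into the feature vector is not given in the abstract, so the function name `fm_predict` and its argument layout are assumptions.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction.

    x:  feature vector (list of floats)
    w0: global bias
    w:  linear weights, one per feature
    V:  latent factors, V[i] is the k-dimensional vector of feature i

    The pairwise term sum_{i<j} <V[i], V[j]> * x[i] * x[j] is computed with
    the usual identity 0.5 * sum_f ((sum_i V[i][f] x[i])^2 - sum_i (V[i][f] x[i])^2),
    which costs O(k * n) instead of O(k * n^2).
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s2 = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s2)
    return w0 + linear + pairwise
```

Because interactions are factorized through the latent vectors, the model can estimate an interaction between two features that never co-occur in the training data, which is one reason FMs are often applied to sparse rating data.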
