Similar literature (20 results)
1.
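This paper recasts the problem of clustering high-dimensional data as a hypergraph partitioning optimization problem and proposes a clustering method for high-dimensional data based on a hypergraph model. The method does not need to reduce the dimensionality of the data items. It describes the relationships among the original data directly with a hypergraph model and, by choosing an appropriate support threshold, effectively removes noise points and preserves clustering quality.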
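The role of the support threshold can be illustrated with a minimal sketch. It assumes, as in frequent-itemset-based hypergraph clustering, that a hyperedge is a set of items that co-occur in enough records; the abstract does not spell out the construction, so the function `build_hyperedges` and its parameters `max_size` and `min_support` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_hyperedges(transactions, max_size=3, min_support=0.5):
    """Return {itemset: support} for itemsets of size 2..max_size
    whose support (fraction of records containing them) is >= min_support."""
    n = len(transactions)
    counts = Counter()
    for record in transactions:
        items = sorted(set(record))
        for size in range(2, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Toy data (assumed): the last record is a rare pattern that behaves like noise.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"d", "e"},
]
edges = build_hyperedges(transactions, max_size=3, min_support=0.5)
```

With this threshold the rare pair `("d", "e")` never becomes a hyperedge, so it cannot influence a later partitioning step. That is one plausible reading of how thresholding removes noise points.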

2.
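From a technical point of view, money-laundering detection is essentially a data analysis problem. This paper first presents a model for judging suspicious transactions and proposes a high-dimensional clustering algorithm based on a hypergraph model. The algorithm is applied to a case base to form suspicious transaction patterns, and finally a method for judging suspicious transactions is given. The hypergraph-based high-dimensional clustering algorithm has the following characteristics: 1) it can handle large data sets; 2) it adapts to high-dimensional data; 3) its clustering results are understandable, interpretable, and usable.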

3.
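Non-negative tensor train decomposition, an important tensor decomposition model, preserves the internal structure of data and is widely used for feature extraction and representation of high-dimensional data. From the manifold learning perspective, the information in high-dimensional data usually lies on a nonlinear manifold in a low-dimensional space, yet existing graph learning theory can model only pairwise relationships between objects and struggles to capture similarity in high-dimensional data with complex manifold structure. This work introduces hypergraph learning and proposes a hypergraph-regularized non-negative tensor train (HGNTT) decomposition method. While extracting a low-dimensional representation from high-dimensional data, it constructs a hypergraph to describe higher-order relationships among samples and thereby preserves the nonlinear manifold structure. The HGNTT model is solved with multiplicative updates, and its convergence is proved. Clustering experiments on the public ORL and Faces95 data sets show that, compared with methods such as NMF and GNMF, HGNTT improves clustering accuracy by 1.2%~7.6% and normalized mutual information by 0.2%~3.0%, which supports the effectiveness of the method.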

4.
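Clustering is applicable to many aspects of modern life, where data objects are often high-dimensional and sparse. Traditional clustering algorithms cannot handle such high-dimensional data effectively. This paper proposes an improved hypergraph clustering algorithm based on attribute similarity. Building on the original hypergraph clustering algorithm, it forms the hypergraph model according to a hyperedge distance threshold, clusters the data objects by hypergraph partitioning, and evaluates clustering quality with intra-cluster singular feature values.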

5.
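An efficient density clustering algorithm for stream data based on k-means partitioning   Total citations: 2 (self-citations: 0; citations by others: 2)
Data stream clustering is an important topic in data stream mining. Most existing data stream clustering algorithms use k-medoids (k-means) methods and cannot effectively cluster irregularly distributed data or high-dimensional data streams. This paper proposes a density clustering algorithm for stream data based on k-means partitioning. The data stream is first partitioned, and k-means clustering is run on each partition to produce intermediate results (a set of mean reference points); density clustering is then applied to these reference points. Theoretical analysis and experimental results show that the algorithm handles irregularly distributed data and high-dimensional data streams effectively and is feasible in practice.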
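The two-stage idea (k-means per partition, then density clustering over the resulting mean reference points) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's algorithm: the seeding rule, the parameter `eps`, and the simple chaining step used in place of a full density clustering method are all choices made for this example.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids
    (an assumed, deterministic choice for this sketch)."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        centroids = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def density_groups(refs, eps):
    """Label reference points so that points chained by distances <= eps
    share a label (a simplified density step with no minimum-points rule)."""
    labels = [-1] * len(refs)
    cluster = 0
    for i in range(len(refs)):
        if labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            a = stack.pop()
            for b in range(len(refs)):
                if labels[b] == -1 and math.dist(refs[a], refs[b]) <= eps:
                    labels[b] = cluster
                    stack.append(b)
        cluster += 1
    return labels


# Two stream partitions (toy data), each summarised by k=2 mean reference points.
chunk1 = [(0.0, 0.0), (10.0, 10.0), (0.2, 0.1), (10.1, 9.9)]
chunk2 = [(0.1, 0.3), (9.8, 10.2), (0.3, 0.2), (10.2, 10.1)]
refs = kmeans(chunk1, 2) + kmeans(chunk2, 2)
labels = density_groups(refs, eps=1.0)
```

Only the reference points, not the raw stream, reach the density step, which is what keeps the second stage cheap.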

6.
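项响琴 (Xiang Xiangqin), 汪彩梅 (Wang Caimei). 《微机发展》 (Microcomputer Development), 2010, (1): 124-127, 131
Outlier mining is a research branch of data mining, and cluster analysis is one of its important methods. The paper first studies outlier mining methods, compares several outlier mining algorithms, and discusses their respective strengths and weaknesses; it also analyzes methods for finding outliers in high-dimensional data in light of the characteristics of such data. It then discusses cluster analysis algorithms and analyzes a clustering method that is both grid-based and density-based, the high-dimensional space clustering algorithm (CLIQUE), which is better suited to mining outliers in high-dimensional space. Finally, it proposes ideas for improving CLIQUE and points out directions for future research.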

7.
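Research on outlier data mining based on the high-dimensional space clustering algorithm   Total citations: 3 (self-citations: 1; citations by others: 2)
Outlier mining is a research branch of data mining, and cluster analysis is one of its important methods. The paper first studies outlier mining methods, compares several outlier mining algorithms, and discusses their respective strengths and weaknesses; it also analyzes methods for finding outliers in high-dimensional data in light of the characteristics of such data. It then discusses cluster analysis algorithms and analyzes a clustering method that is both grid-based and density-based, the high-dimensional space clustering algorithm (CLIQUE), which is better suited to mining outliers in high-dimensional space. Finally, it proposes ideas for improving CLIQUE and points out directions for future research.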
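The grid-and-density idea behind this kind of outlier mining can be sketched in a few lines. The example below covers only the full-space "dense unit" step; CLIQUE itself searches subspaces bottom-up, which is omitted here. The function names and the parameters `cell_width` and `min_points` are assumptions made for illustration.

```python
from collections import Counter


def cell_of(point, cell_width):
    """Grid cell index of a point (one integer per dimension)."""
    return tuple(int(x // cell_width) for x in point)


def dense_units(points, cell_width, min_points):
    """Cells that contain at least min_points points."""
    cells = Counter(cell_of(p, cell_width) for p in points)
    return {cell for cell, n in cells.items() if n >= min_points}


def outliers(points, cell_width, min_points):
    """Points that fall outside every dense cell."""
    dense = dense_units(points, cell_width, min_points)
    return [p for p in points if cell_of(p, cell_width) not in dense]


# Toy data (assumed): two dense regions and one isolated point.
points = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (5.5, 5.5), (5.6, 5.7), (9.9, 0.1)]
dense = dense_units(points, cell_width=1.0, min_points=2)
isolated = outliers(points, cell_width=1.0, min_points=2)
```

Treating points outside all dense cells as outlier candidates is one simple reading of the approach; a fuller version would also check neighbouring cells and lower-dimensional subspaces.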

8.
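A subspace clustering method for high-dimensional data streams   Total citations: 2 (self-citations: 0; citations by others: 2)
颜晓龙 (Yan Xiaolong), 沈鸿 (Shen Hong). 《计算机应用》 (Journal of Computer Applications), 2007, 27(7): 1680-1684
Inspired by the FP-tree algorithm in frequent pattern mining and drawing on the ideas of the CLIQUE algorithm for clustering static high-dimensional data, this work designs a tree data structure, the DenseGrid tree (DG tree), to record summary information of the data stream for clustering. Low-dimensional subspaces that contain clusters are discovered from the high-dimensional data stream by searching paths in the tree, so the high-dimensional clustering problem becomes one of building the DG tree and using it to search for high-density grid cells. Experiments show that the method achieves good clustering quality and scalability.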

9.
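k-LDCHD: a k-neighborhood local density clustering algorithm for high-dimensional space   Total citations: 7 (self-citations: 0; citations by others: 7)
Clustering is an important topic in data mining, and clustering in high-dimensional space is difficult because the data are sparsely distributed, noise is abundant, and the gap between distances tends to zero. After analyzing the shortcomings of existing clustering algorithms, this paper introduces concepts such as the k-neighborhood point set and the k-neighborhood radius and proposes k-PCLDHD, a single-parameter k-neighborhood local density clustering algorithm for high-dimensional space. To improve efficiency, it further defines concepts such as the reference distance and preprocesses the data objects with "two reference data points" to reduce the cost of scanning the data set, which yields k-LDCHD, an optimized version of k-PCLDHD. Theoretical analysis and experimental results show that the algorithm solves the high-dimensional clustering problem effectively and is feasible in practice.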
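A k-neighborhood radius can be computed with a brute-force scan, as in the sketch below. The definition used here (distance to the k-th nearest neighbour, with a small radius read as high local density) is an assumption for illustration; the abstract does not give the paper's exact definitions, and the reference-point preprocessing it mentions is not shown.

```python
import math


def k_radius(points, k):
    """Distance from each point to its k-th nearest neighbour (itself excluded)."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii


# Toy data (assumed): four corners of a unit square plus one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
radii = k_radius(points, k=2)
```

The single parameter `k` replaces a global distance threshold: the four square corners get a small radius, while the distant point gets a large one and would be treated as lying in a sparse region.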

10.
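Traditional spectral clustering considers only pairwise relationships between data points and ignores complex correlations that may be hidden in the data. To address this, a spectral clustering method based on hypergraphs and self-representation is proposed. First, a hypergraph of the data is built and its Laplacian matrix is obtained. Next, the $\ell_{2,1}$-norm is used to obtain a row-sparse self-representation of the samples, and the hypergraph is incorporated to describe multi-level relationships among the data. Finally, spectral clustering is performed with the resulting self-representation coefficients. Hypergraph-based sample self-representation thus accounts for complex correlations among samples. Experiments on data sets such as Hopkins155 show that, measured by clustering error rate, the algorithm outperforms existing ordinary-graph-based spectral clustering algorithms such as SSC and SRC.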
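For readers unfamiliar with the two ingredients, the definitions below may help. The abstract does not state which hypergraph Laplacian the authors use; the normalized form shown here is one widely used choice and is given as an assumption. $H$ is the vertex-by-hyperedge incidence matrix, $W$ the diagonal matrix of hyperedge weights $w(e)$, and $D_v$, $D_e$ the diagonal vertex-degree and hyperedge-degree matrices.

```latex
% Vertex and hyperedge degrees
d(v) = \sum_{e} w(e)\, h(v, e), \qquad \delta(e) = \sum_{v} h(v, e)

% Normalized hypergraph Laplacian (one common definition, assumed here)
L = I - D_v^{-1/2} H W D_e^{-1} H^{\top} D_v^{-1/2}

% \ell_{2,1}-norm of a coefficient matrix Z: the sum of the \ell_2 norms of its rows
\|Z\|_{2,1} = \sum_{i} \sqrt{\sum_{j} z_{ij}^{2}}
```

Because the $\ell_{2,1}$-norm sums row norms, penalizing it pushes entire rows of $Z$ toward zero, which is what "row-sparse" self-representation refers to.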

11.
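Research on a hypergraph-based high-dimensional clustering algorithm using attribute distribution similarity   Total citations: 4 (self-citations: 0; citations by others: 4)
In many clustering applications the data objects are high-dimensional, sparse, and binary, and traditional clustering algorithms cannot handle such data effectively. This paper proposes a high-dimensional clustering algorithm based on a hypergraph model. It defines an attribute distribution feature vector for each object and an attribute distribution similarity between objects, builds a hypergraph model from them, and clusters by hypergraph partitioning. The clustering results are evaluated with intra-cluster singular feature values. Experimental results and algorithm analysis show that the algorithm mines clustering knowledge effectively.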

12.
In the last few years, hypergraph-based methods have gained considerable attention in the resolution of real-world clustering problems, since this mode of representation can handle higher-order relationships between elements, unlike standard graph theory. The most popular and promising approach to hypergraph clustering arises from concepts in spectral hypergraph theory [53]: clustering is formulated as a hypergraph cut problem in which an appropriate objective function has to be optimized. The spectral relaxation of this optimization problem yields a clustering that is close to the optimum, but the approach generally suffers from high computational demands, especially in real-world problems where the data involved become too large. A natural way to overcome this limitation is to reduce the hypergraph and apply spectral clustering to a hypergraph of smaller size. In this paper, we introduce two novel hypergraph reduction algorithms that maintain the hypergraph structure as accurately as possible. These algorithms allowed us to design a new approach to hypergraph clustering, based on the multilevel paradigm, that operates in three steps: (i) hypergraph reduction; (ii) initial spectral clustering of the reduced hypergraph; and (iii) clustering refinement. The accuracy of our hypergraph clustering framework has been demonstrated by extensive experiments in comparison with other hypergraph clustering algorithms, and the framework has been successfully applied to image segmentation, for which an appropriate hypergraph-based model has been designed. The low running times of our algorithm also demonstrate that, unlike the standard spectral clustering approach, it can handle datasets of considerable size.

13.
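Non-negative tensor decomposition, when applied to image clustering, ignores the internal geometric structure of high-dimensional data. To address this, a hypergraph regularization term is added to the classical non-negative Tucker decomposition so that as much of the intrinsic geometric structure of the original data as possible is preserved, giving a hypergraph-regularized non-negative Tucker decomposition model, HGNTD. A hypergraph is constructed to capture higher-order relationships among the samples, which improves the accuracy of the geometric structure description. For this model, a fast and effective algorithm based on alternating non-negative least squares is designed, and it is proved to converge under the non-negativity constraint. The algorithm is then applied to image clustering. Experiments on two commonly used public data sets, Yale and COIL, show that, compared with k-means, non-negative matrix factorization, graph-regularized non-negative matrix factorization, non-negative Tucker decomposition, and graph-regularized non-negative Tucker decomposition, the proposed algorithm improves clustering accuracy by 8.6%~11.4% and normalized mutual information by 2.0%~7.5%, giving better clustering results.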

14.
Clustering ensemble integrates multiple base clustering results to obtain a consensus result and thus improves on the stability and robustness of a single clustering method. Since it is natural to use a hypergraph to represent multiple base clustering results, with instances represented by nodes and base clusters represented by hyperedges, several hypergraph-based clustering ensemble methods have been proposed. Conventional hypergraph-based methods obtain the final consensus result by partitioning a pre-defined static hypergraph. However, since base clusters may be imperfect due to the unreliability of base clustering methods, the pre-defined hypergraph constructed from them is also unreliable, and obtaining the final clustering result by directly partitioning it is inappropriate. To tackle this problem, we propose a clustering ensemble method via structured hypergraph learning: instead of being constructed directly, the hypergraph is dynamically learned from the base results, which makes it more reliable. Moreover, when learning the hypergraph, we enforce a clear clustering structure, which is more appropriate for clustering tasks and removes the need for any uncertain postprocessing such as hypergraph partitioning. Extensive experiments show that our method not only performs better than conventional hypergraph-based ensemble methods but also outperforms state-of-the-art clustering ensemble methods.
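The static representation that the abstract takes as its starting point (instances as nodes, base clusters as hyperedges) amounts to an incidence matrix. The sketch below builds that matrix only; it does not attempt the learned hypergraph that the paper proposes, and the function name `ensemble_incidence` is hypothetical.

```python
def ensemble_incidence(base_labelings):
    """Build an n x m incidence matrix: rows are instances, columns are
    hyperedges, with one hyperedge per cluster of each base clustering."""
    n = len(base_labelings[0])
    columns = []
    for labels in base_labelings:
        for cluster_id in sorted(set(labels)):
            columns.append([1 if label == cluster_id else 0 for label in labels])
    # Transpose the list of columns into row-major form.
    return [[col[i] for col in columns] for i in range(n)]


# Toy data (assumed): two base clusterings of four instances.
base = [
    [0, 0, 1, 1],  # base clustering 1
    [0, 1, 1, 1],  # base clustering 2
]
H = ensemble_incidence(base)
```

Each instance belongs to exactly one hyperedge per base clustering, so every row sums to the number of base clusterings. An unreliable base clustering therefore contributes unreliable columns, which is the weakness the abstract points out.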

15.
Hypergraph Models and Algorithms for Data-Pattern-Based Clustering   Total citations: 2 (self-citations: 0; citations by others: 2)
In traditional approaches for clustering market-basket-type data, relations among transactions are modeled according to the items occurring in these transactions. However, an individual item might induce different relations in different contexts. Since such contexts might be captured by interesting patterns in the overall data, we represent each transaction as a set of patterns by modifying the conventional pattern semantics. By clustering the patterns in the dataset, we infer a clustering of the transactions represented this way. For this, we propose a novel hypergraph model to represent the relations among the patterns. Instead of a local measure that depends only on common items among patterns, we propose a global measure that is based on the co-occurrences of these patterns in the overall data. The success of existing hypergraph-partitioning-based algorithms in other domains depends on the sparsity of the hypergraph and on explicit objective metrics. We therefore propose a two-phase clustering approach for the above hypergraph, which is expected to be dense. In the first phase, the vertices of the hypergraph are merged by a multilevel algorithm to obtain a large number of high-quality clusters. Here, we propose new quality metrics for merging decisions in hypergraph clustering specifically for this domain. To enable the use of existing metrics in the second phase, we introduce a vertex-to-cluster affinity concept and use it to construct a sparse hypergraph from the obtained clustering. The experiments we have performed show the effectiveness of the proposed framework.

16.
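林国平 (Lin Guoping), 李绍滋 (Li Shaozi). 《软件学报》 (Journal of Software), 2009, 20(Z1): 330-335
Considering that experimental data are large-scale and incomplete, and drawing on set pair analysis theory, this paper proposes a new hypergraph-model clustering algorithm for incomplete text systems. The identity-discrepancy-contrary connection degree and the similarity connection degree of set pairs are introduced into the hyperedge weights to build the hypergraph model, and clustering is then performed by hypergraph partitioning. The algorithm overcomes shortcomings of traditional clustering algorithms, reduces the dimensionality of the text space more effectively, and improves the accuracy and speed of clustering for incomplete text information systems. A final example illustrates the feasibility and effectiveness of the algorithm.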

17.
18.
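The hypergraph partitioning problem is analyzed and modeled with cellular automaton theory, and a cellular automaton model is proposed together with an optimization algorithm for weighted hypergraph partitioning based on it. In the model, a cell corresponds to a node of the weighted hypergraph, neighboring cells correspond to the nodes contained in adjacent hyperedges, and the state of a cell corresponds to the partition subset it belongs to. A two-dimensional auxiliary array stores, for each hyperedge, the number of its nodes in each partition subset, and fast methods are given for computing cell gains and the partition cut value, which avoids traversing the nodes of each hyperedge. Experimental results show that, compared with weighted graph partitioning algorithms and migration methods, the algorithm finds better partitions with low time and space complexity.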
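The auxiliary count array described above can be sketched as follows. This is a minimal illustration for a two-way partition under the cut-net objective (a hyperedge counts as cut when it spans both parts); the function names are hypothetical, and the cellular automaton update rule itself is not shown.

```python
def pin_counts(hyperedges, part, k):
    """counts[e][p] = number of nodes of hyperedge e that lie in part p."""
    counts = [[0] * k for _ in hyperedges]
    for e, nodes in enumerate(hyperedges):
        for v in nodes:
            counts[e][part[v]] += 1
    return counts


def cut_weight(counts, weights):
    """Total weight of hyperedges that span more than one part."""
    return sum(
        w for row, w in zip(counts, weights) if sum(1 for c in row if c > 0) > 1
    )


def move_gain(v, target, incident, counts, weights, part):
    """Reduction in cut weight if node v moves to part `target`
    (valid for a two-way partition). Uses only the count array,
    so the nodes of each hyperedge are never traversed."""
    src = part[v]
    gain = 0
    for e in incident[v]:
        if counts[e][src] == 1 and counts[e][target] > 0:
            gain += weights[e]  # hyperedge becomes uncut after the move
        elif counts[e][src] > 1 and counts[e][target] == 0:
            gain -= weights[e]  # hyperedge becomes cut after the move
    return gain


# Toy weighted hypergraph (assumed): 5 nodes, 3 hyperedges, 2 parts.
hyperedges = [[0, 1, 2], [2, 3], [3, 4]]
weights = [2, 1, 3]
part = [0, 0, 1, 1, 1]
incident = {v: [e for e, nodes in enumerate(hyperedges) if v in nodes] for v in range(5)}

counts = pin_counts(hyperedges, part, k=2)
before = cut_weight(counts, weights)
gain = move_gain(2, 0, incident, counts, weights, part)

# Recompute from scratch after the move to confirm the gain.
moved = [0, 0, 0, 1, 1]
after = cut_weight(pin_counts(hyperedges, moved, k=2), weights)
```

In a real implementation the count array would be updated incrementally after each move (two entries per incident hyperedge) instead of being rebuilt, which is where the time saving comes from.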

19.
In spatial networks, clustering adjacent data to disk pages is highly likely to reduce the number of disk page accesses made by the aggregate network operations during query processing. For this purpose, different techniques based on the clustering graph model have been proposed in the literature. In this work, we show that the state-of-the-art clustering graph model is not able to correctly capture the disk access costs of aggregate network operations. Moreover, we propose a novel clustering hypergraph model that correctly captures the disk access costs of these operations. The proposed model aims to minimize the total number of disk page accesses in aggregate network operations. Based on this model, we further propose two adaptive recursive bipartitioning schemes to reduce the number of allocated disk pages while trying to minimize the number of disk page accesses. We evaluate our clustering hypergraph model and recursive bipartitioning schemes on a wide range of road network datasets. The results of the conducted experiments show that the proposed model is quite effective in reducing the number of disk accesses incurred by the network operations.

20.
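杨伟英 (Yang Weiying), 王英 (Wang Ying), 吴越 (Wu Yue). 《计算机应用研究》 (Application Research of Computers), 2021, 38(5): 1508-1513, 1519
Using hyperedges to model multi-way relationships in network data and predicting potential hyperedge links is of practical importance. Existing methods focus mainly on network data with pairwise relationships, and applying existing link prediction methods directly to hyperedge link prediction in hypergraph networks has limitations. A hyperedge link prediction model based on a heterogeneous variational hypergraph autoencoder (HVGAE) is therefore proposed. First, hypergraph convolution is used to implement a variational hypergraph autoencoder that maps hypergraph network data to a low-dimensional representation. Second, a node proximity function is added to preserve as much structural information as possible, which yields a hyperedge link prediction model for heterogeneous hypergraph networks. Experiments on three different types of hypergraph networks show that HVGAE achieves better prediction results than the baseline methods, indicating that it handles hyperedge link prediction in hypergraph networks well.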
