首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 171 毫秒
1.
Spectral clustering based on matrix perturbation theory   总被引:5,自引:1,他引:5  
This paper exposes some intrinsic characteristics of the spectral clustering method by using the tools from the matrix perturbation theory. We construct a weight ma- trix of a graph and study its eigenvalues and eigenvectors. It shows that the num- ber of clusters is equal to the number of eigenvalues that are larger than 1, and the number of points in each of the clusters can be approximated by the associated eigenvalue. It also shows that the eigenvector of the weight matrix can be used directly to perform clustering; that is, the directional angle between the two-row vectors of the matrix derived from the eigenvectors is a suitable distance measure for clustering. As a result, an unsupervised spectral clustering algorithm based on weight matrix (USCAWM) is developed. The experimental results on a number of artificial and real-world data sets show the correctness of the theoretical analysis.  相似文献   

2.
This paper proposes a sampling based hierarchical approach for solving the computational demands of the spectral clustering methods when applied to the problem of image segmentation. The authors first define the distance between a pixel and a cluster, and then derive a new theorem to estimate the number of samples needed for clustering. Finally, by introducing a scale parameter into the simi- larity function, a novel spectral clustering based image segmentation method has been developed. An important characteristic of the approach is that in the course of image segmentation one needs not only to tune the scale parameter to merge the small size clusters or split the large size clusters but also take samples from the data set at the different scales. The multiscale and stochastic nature makes it feasible to apply the method to very large grouping problem. In addition, it also makes the segmentation compute in time that is linear in the size of the image. The experimental results on various synthetic and real world images show the effective- ness of the approach.  相似文献   

3.
Clustering analysis is an unsupervised method to find out hidden structures in datasets.Most partitional clustering algorithms are sensitive to the selection of initial exemplars,the outliers and noise.In this paper,a novel technique called data competition algorithm is proposed to solve the problems.First the concept of aggregation field model is defined to describe the partitional clustering problem.Next,the exemplars are identified according to the data competition.Then,the members will be assigned to the suitable clusters.Data competition algorithm is able to avoid poor solutions caused by unlucky initializations,outliers and noise,and can be used to detect the coexpression gene,cluster the image,diagnose the disease,distinguish the variety,etc.The provided experimental results validate the feasibility and effectiveness of the proposed schemes and show that the proposed approach of data competition algorithm is simple,stable and efficient.The experimental results also show that the proposed approach of data competition clustering outperforms three of the most well known clustering algorithms K-means clustering,affinity propagation clustering,hierarchical clustering.  相似文献   

4.
In this paper, we present a perturbation analysis for the matrices in the multiway normalized cut spectral clustering method based on the matrix perturbation theory. The analytical results show that the eigenvalues and the eigenspaces of the normalized Laplacian matrices are continuous. Therefore, clustering algorithms can be designed according to the special properties of the normalized Laplacian matrices in the ideal case and the method can be extended to the general case based on the continuity of the eigenvalues and the eigenspaces of the normalized Laplacian matrices. The numerical results are consistent with the theoretical results.  相似文献   

5.
The probabilistic real-time automaton (PRTA) is a representation of dynamic processes arising in the sciences and industry. Currently, the induction of automata is divided into two steps: the creation of the prefix tree acceptor (PTA) and the merge procedure based on clustering of the states. These two steps can be very time intensive when a PRTA is to be induced for massive or even unbounded datasets. The latter one can be efficiently processed, as there exist scalable online clustering algorithms. However, the creation of the PTA still can be very time consuming. To overcome this problem, we propose a genuine online PRTA induction approach that incorporates new instances by first collapsing them and then using a maximum frequent pattern based clustering. The approach is tested against a predefined synthetic automaton and real world datasets, for which the approach is scalable and stable. Moreover, we present a broad evaluation on a real world disease group dataset that shows the applicability of such a model to the analysis of medical processes.  相似文献   

6.
A class of hybrid niching evolutionary algorithms (HNE) using clustering crowding and parallel local searching is proposed. By analyzing topology of fitness landscape and extending the space for searching similar individual, HNE determines the locality of search space more accurately, and decreases the replacement errors of crowding and suppressing genetic drift of the population. The integration of deterministic and probabilistic crowding increases the capacity of both parallel local hill-climbing and maintaining multiple subpopulations. Parallel local search based on simplex method over disjoint subpopulations greatly speeds up the convergence of the population towards various optima simultaneously. Real coded representation and Gaussian mutation improve the precision of the solutions founded. The experimental results optimizing various multimodal functions show that, the performances of HNE such as the number of effective peaks generated and maintained, average peak ratio, global optimum ratio and CPU time consumed are uniformly superior to those of genetic algorithms using the sharing and deterministic crowding method.  相似文献   

7.
A novel technique is proposed for the incremental construction of sparse radial basis function (RBF) networks. The correlation between an RBF regressor and the training data is used as the criterion to position and shape the RBF node, and it is shown that this is equivalent to incrementally minimise the modelling mean square error. A guided random search optimisation method, called the repeated weighted boosting search, is adopted to append RBF nodes one by one in an incremental regression modelling procedure. The experimental results obtained using the proposed method demonstrate that it provides a viable alternative to the existing state-of-the-art modelling techniques for constructing parsimonious RBF models that generalise well.  相似文献   

8.
Support vector clustering (SVC) is an important boundary-based clustering algorithm in multiple applications for its capability of handling arbitrary cluster shapes.However,SVC’s popularity is degraded by its highly intensive time complexity and poor label performance.To overcome such problems,we present a novel efficient and robust convex decomposition based cluster labeling (CDCL) method based on the topological property of dataset.The CDCL decomposes the implicit cluster into convex hulls and each one is comprised by a subset of support vectors (SVs).According to a robust algorithm applied in the nearest neighboring convex hulls,the adjacency matrix of convex hulls is built up for finding the connected components;and the remaining data points would be assigned the label of the nearest convex hull appropriately.The approach’s validation is guaranteed by geometric proofs.Time complexity analysis and comparative experiments suggest that CDCL improves both the efficiency and clustering quality significantly.  相似文献   

9.
We study the problem of searching for a mobile intruder in a polygonal region P by two guards. The objective is to decide whether there should exist a search schedule for the two guards to detect the intruder, no matter how fast the intruder moves, and if so, generate a search schedule. During the search, the two guards are required to walk on the boundary of P continuously and be mutually visible all the time. We present a characterization of the class of polygons searchable by two guards in terms of non-redundant components, and thus solve a long-standing open problem in computational geometry. Also, we give an optimal O(n) time algorithm to determine the two-guard searchability in a polygon, and an O(nlog n m) time algorithm to generate a search schedule, if it exists, where n is the number of vertices of P and m (≤n2) is the number of search instructions reported.  相似文献   

10.
Resolution and parameters estimations for multiple maneuvering targets in the same range cell is addressed in this work. The low-resolution radar cannot distinguish multiple targets in both distance and angle, but the detection of Doppler frequency variation of the multiple maneuvering targets can be used to resolve this problem. At present, most of researches on detection of Doppler frequency variation are carried out with time-frequency analysis methods, such as Fractional Fourier transformation (FRFT), Adaptive Chirplet transformation (ACT), and Wigner-Ville distribution (WVD) and so on, which need satisfy enough time duration and sampling theorem. This paper proposes a new method of resolution and parameters estimation for multiple maneuvering targets based on Compressive Sensing (CS) and clustering technique, which samples at low rate and short time duration without sacrificing estimation performance. Simulation results validate the effectiveness of the proposed algorithm, and also show that the performance of the proposed method is superior to that of FRFT in the condition of multiple targets.  相似文献   

11.
This paper describes a novel feature selection algorithm for unsupervised clustering, that combines the clustering ensembles method and the population based incremental learning algorithm. The main idea of the proposed unsupervised feature selection algorithm is to search for a subset of all features such that the clustering algorithm trained on this feature subset can achieve the most similar clustering solution to the one obtained by an ensemble learning algorithm. In particular, a clustering solution is firstly achieved by a clustering ensembles method, then the population based incremental learning algorithm is adopted to find the feature subset that best fits the obtained clustering solution. One advantage of the proposed unsupervised feature selection algorithm is that it is dimensionality-unbiased. In addition, the proposed unsupervised feature selection algorithm leverages the consensus across multiple clustering solutions. Experimental results on several real data sets demonstrate that the proposed unsupervised feature selection algorithm is often able to obtain a better feature subset when compared with other existing unsupervised feature selection algorithms.  相似文献   

12.
聚类分析是模式识别中的一个重要问题,是非监督学习的重要方法。K -means 算法是其中最经典的聚类算法之一。但是这种方法面对大规模数据的时候工作量非常巨大,并且保证不了聚类结果的最优性。提出了一种基于量子进化算法的改进的 K -means 聚类算法。该方法结合了两个方法的优点,用量子进化算法进行优化,并且改进了量子进化算法中的交叉算子和更新算子,提高了基于量子进化算法的 K -means 算法局部搜索能力。实验结果表明,改进算法取得了较好的效果。  相似文献   

13.
Supervised clustering is a new research area that aims to improve unsupervised clustering algorithms exploiting supervised information. Today, there are several clustering algorithms, but the effective supervised cluster adjustment method which is able to adjust the resulting clusters, regardless of applied clustering algorithm has not been presented yet. In this paper, we propose a new supervised cluster adjustment method which can be applied to any clustering algorithm. Since the adjustment method is based on finding the nearest neighbors, a novel exact nearest neighbor search algorithm is also introduced which is significantly faster than the classic one. Several datasets and clustering evaluation metrics are employed to examine the effectiveness of the proposed cluster adjustment method and the proposed fast exact nearest neighbor algorithm comprehensively. The experimental results show that the proposed algorithms are significantly effective in improving clusters and accelerating nearest neighbor searches.  相似文献   

14.
针对基于无监督特征提取的目标检测方法效率不高的问题,提出一种在无标记数据集中准确检测前景目标的方法.其基本出发点是:正确的特征聚类结果可以指导目标特征提取,同时准确提取的目标特征可以提高特征聚类的精度.该方法首先对无标记样本图像进行局部特征提取,然后根据最小化特征距离进行无监督特征聚类.将同一个聚类内的图像两两匹配,将特征匹配的重现程度作为特征权重,最后根据更新后的特征权重指导下一次迭代的特征聚类.多次迭代后同时得到聚类结果和前景目标.实验结果表明,该方法有效地提高Caltech-256数据集和Google车辆图像的检测精度.此外,针对目前绝大部分无监督目标检测方法不具备增量学习能力这一缺点,提出了增量学习方法实现,实验结果表明,增量学习方法有效地提高了计算速度.  相似文献   

15.
Wu  Yue  Wang  Can  Zhang  Yue-qing  Bu  Jia-jun 《浙江大学学报:C卷英文版》2019,20(4):538-553

Feature selection has attracted a great deal of interest over the past decades. By selecting meaningful feature subsets, the performance of learning algorithms can be effectively improved. Because label information is expensive to obtain, unsupervised feature selection methods are more widely used than the supervised ones. The key to unsupervised feature selection is to find features that effectively reflect the underlying data distribution. However, due to the inevitable redundancies and noise in a dataset, the intrinsic data distribution is not best revealed when using all features. To address this issue, we propose a novel unsupervised feature selection algorithm via joint local learning and group sparse regression (JLLGSR). JLLGSR incorporates local learning based clustering with group sparsity regularized regression in a single formulation, and seeks features that respect both the manifold structure and group sparse structure in the data space. An iterative optimization method is developed in which the weights finally converge on the important features and the selected features are able to improve the clustering results. Experiments on multiple real-world datasets (images, voices, and web pages) demonstrate the effectiveness of JLLGSR.

  相似文献   

16.
针对无监督分类问题,提出一种多尺度并行免疫克隆优化聚类算法.算法中,进化在多个子群之间并行进行,不同子群的抗体根据子群适应度采用不同变异尺度.进化初期,利用大尺度变异子群实现全局最优解空间的快速定位,同时变异尺度随着适应值的提升逐渐降低;进化后期,利用小尺度变异子群完成局部解空间的精确搜索.将新算法与其他聚类算法进行比较,所得结果表明新算法具有较好的聚类性能和鲁棒性.  相似文献   

17.
基于无监督学习的问答模式抽取技术   总被引:4,自引:0,他引:4  
本文提出了一种基于无监督学习算法的问答模式抽取技术从互联网上抽取应用于汉语问答系统的答案模式。该算法可以避免有监督学习算法的不足,它无需用户提供<提问,答案>对作为训练集,只需用户提供每种提问类型两个或以上的提问实例,算法即可通过Web检索、主题划分、模式提取、垂直聚类和水平聚类等步骤完成该类型提问的答案模式的学习。实验结果表明,论文提出的无监督问答模式学习方法是有效的,基于模式匹配的答案抽取技术能够较大幅度地提高汉语问答系统的性能。  相似文献   

18.
为了解决用户查询经常存在表意模糊或歧义性等问题,明确用户的查询意图,该文提出了一种无指导的子主题挖掘方法。该方法首先在检索结果文档集中利用ATF × PDF模型挖掘候选主题词;其次,为保证子主题的多样性,该文基于HowNet语义相似度方法对候选主题词进行了层次聚类分析,进而得到潜在主题;最后,利用LCS算法生成多样性子主题。实验结果显示,系统平均D#-nDCG@10达到0.573,结果说明该方法在明确查询主题表意方面取得了较好效果。  相似文献   

19.
In this paper, we propose a general learning framework based on local and global regularization. In the local regularization part, our algorithm constructs a regularized classifier for each data point using its neighborhood, while the global regularization part adopts a Laplacian regularizer to smooth the data labels predicted by those local classifiers. We show that such a learning framework can easily be incorporated into either unsupervised learning, semi-supervised learning, and supervised learning paradigm. Moreover, many existing learning algorithms can be derived from our framework. Finally we present some experimental results to show the effectiveness of our method.  相似文献   

20.
用无监督模糊聚类方法进行视频内容的分层表示   总被引:3,自引:0,他引:3  
为了在视频数据库中提供有效的视频检索和浏览功能,必须用简明的方式表示视频的内容。由于视频数据具有层次性结构,在镜头边界检测后,可以利用聚类方法按不同的相似性尺度选取代表帧和代表镜头,对视频内容进行抽象概括的表示。文中提出了一种基于无监督模糊聚类对视频内容进行分层表示的算法,它用无监督聚类方法选取镜头的代表帧,并用模糊聚类算法对代表帧进行层次化聚类以选取代表镜头和代表场景。实验结果表明这种方法可以较好地概括视频的内容,方便用户检索和浏览。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号