首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 437 毫秒
1.
Video annotation is an important issue in video content management systems. Rapid growth of the digital video data has created a need for efficient and reasonable mechanisms that can ease the annotation process. In this paper, we propose a novel hierarchical clustering based system for video annotation. The proposed system generates a top-down hierarchy of the video streams using hierarchical k-means clustering. A tree-based structure is produced by dividing the video recursively into sub-groups, each of which consists of similar content. Based on the visual features, each node of the tree is partitioned into its children using k-means clustering. Each sub-group is then represented by its key frame, which is selected as the closest frame to the centroids of the corresponding cluster, and then can be displayed at the higher level of the hierarchy. The experiments show that very good hierarchical view of the video sequences can be created for annotation in terms of efficiency.  相似文献   

2.
The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach. They iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time consuming and memory demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need of computing or storing the whole affinity matrix and thereby to reduce computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms.  相似文献   

3.
The k nearest neighbor (k-NN) classifier has been a widely used nonparametric technique in Pattern Recognition, because of its simplicity and good performance. In order to decide the class of a new prototype, the k-NN classifier performs an exhaustive comparison between the prototype to classify and the prototypes in the training set T. However, when T is large, the exhaustive comparison is expensive. For this reason, many fast k-NN classifiers have been developed, some of them are based on a tree structure, which is created during a preprocessing phase using the prototypes in T. Then, in a search phase, the tree is traversed to find the nearest neighbor. The speed up is obtained, while the exploration of some parts of the tree is avoided using pruning rules which are usually based on the triangle inequality. However, in soft sciences as Medicine, Geology, Sociology, etc., the prototypes are usually described by numerical and categorical attributes (mixed data), and sometimes the comparison function for computing the similarity between prototypes does not satisfy metric properties. Therefore, in this work an approximate fast k most similar neighbor classifier, for mixed data and similarity functions that do not satisfy metric properties, based on a tree structure (Tree k-MSN) is proposed. Some experiments with synthetic and real data are presented.  相似文献   

4.
This paper presents a new and simple segmentation method based on the K-means clustering procedure and a two-step process. The first step relies on an original de-texturing procedure which aims at converting the input natural textured color image into a color image, without texture, that will be easier to segment. Once, this de-textured (color) image is estimated, a final segmentation is achieved by a spatially-constrained K-means segmentation. These spatial constraints help the iterative K-means labeling process to succeed in finding an accurate segmentation by taking into account the inherent spatial relationships and the presence of pre-estimated homogeneous textural regions in the input image. This procedure has been successfully applied on the Berkeley image database. The experiments reported in this paper demonstrate that the proposed method is efficient (in terms of visual evaluation and quantitative performance measures) and performs competitively compared to the best existing state-of-the-art segmentation methods recently proposed in the literature.  相似文献   

5.
Intrusion detection is a necessary step to identify unusual access or attacks to secure internal networks. In general, intrusion detection can be approached by machine learning techniques. In literature, advanced techniques by hybrid learning or ensemble methods have been considered, and related work has shown that they are superior to the models using single machine learning techniques. This paper proposes a hybrid learning model based on the triangle area based nearest neighbors (TANN) in order to detect attacks more effectively. In TANN, the k-means clustering is firstly used to obtain cluster centers corresponding to the attack classes, respectively. Then, the triangle area by two cluster centers with one data from the given dataset is calculated and formed a new feature signature of the data. Finally, the k-NN classifier is used to classify similar attacks based on the new feature represented by triangle areas. By using KDD-Cup ’99 as the simulation dataset, the experimental results show that TANN can effectively detect intrusion attacks and provide higher accuracy and detection rates, and the lower false alarm rate than three baseline models based on support vector machines, k-NN, and the hybrid centroid-based classification model by combining k-means and k-NN.  相似文献   

6.
A k-means clustering algorithm for designing binary tree classifiers is introduced for the classification of cervical cells. At each nonterminal node of the designed binary tree classifier, two sets of effective feature are selected: one is based on the Bhattacharyya distance, a measure of separability between two classes; the other is based on the merits of classification accuracy. The classification result has shown the effectiveness of the features and the binary tree classifier used.  相似文献   

7.
Cluster ensembles in collaborative filtering recommendation   总被引:1,自引:0,他引:1  
Recommender systems, which recommend items of information that are likely to be of interest to the users, and filter out less favored data items, have been developed. Collaborative filtering is a widely used recommendation technique. It is based on the assumption that people who share the same preferences on some items tend to share the same preferences on other items. Clustering techniques are commonly used for collaborative filtering recommendation. While cluster ensembles have been shown to outperform many single clustering techniques in the literature, the performance of cluster ensembles for recommendation has not been fully examined. Thus, the aim of this paper is to assess the applicability of cluster ensembles to collaborative filtering recommendation. In particular, two well-known clustering techniques (self-organizing maps (SOM) and k-means), and three ensemble methods (the cluster-based similarity partitioning algorithm (CSPA), hypergraph partitioning algorithm (HGPA), and majority voting) are used. The experimental results based on the Movielens dataset show that cluster ensembles can provide better recommendation performance than single clustering techniques in terms of recommendation accuracy and precision. In addition, there are no statistically significant differences between either the three SOM ensembles or the three k-means ensembles. Either the SOM or k-means ensembles could be considered in the future as the baseline collaborative filtering technique.  相似文献   

8.
Recently there has been a considerable interest in dynamic textures due to the explosive growth of multimedia databases. In addition, dynamic texture appears in a wide range of videos, which makes it very important in applications concerning to model physical phenomena. Thus, dynamic textures have emerged as a new field of investigation that extends the static or spatial textures to the spatio-temporal domain. In this paper, we propose a novel approach for dynamic texture segmentation based on automata theory and k-means algorithm. In this approach, a feature vector is extracted for each pixel by applying deterministic partially self-avoiding walks on three orthogonal planes of the video. Then, these feature vectors are clustered by the well-known k-means algorithm. Although the k-means algorithm has shown interesting results, it only ensures its convergence to a local minimum, which affects the final result of segmentation. In order to overcome this drawback, we compare six methods of initialization of the k-means. The experimental results have demonstrated the effectiveness of our proposed approach compared to the state-of-the-art segmentation methods.  相似文献   

9.
In recent years, there have been numerous attempts to extend the k-means clustering protocol for single database to a distributed multiple database setting and meanwhile keep privacy of each data site. Current solutions for (whether two or more) multiparty k-means clustering, built on one or more secure two-party computation algorithms, are not equally contributory, in other words, each party does not equally contribute to k-means clustering. This may lead a perfidious attack where a party who learns the outcome prior to other parties tells a lie of the outcome to other parties. In this paper, we present an equally contributory multiparty k-means clustering protocol for vertically partitioned data, in which each party equally contributes to k-means clustering. Our protocol is built on ElGamal's encryption scheme, Jakobsson and Juels's plaintext equivalence test protocol, and mix networks, and protects privacy in terms that each iteration of k-means clustering can be performed without revealing the intermediate values.  相似文献   

10.
Color quantization is an important operation with many applications in graphics and image processing. Most quantization methods are essentially based on data clustering algorithms. However, despite its popularity as a general purpose clustering algorithm, k-means has not received much respect in the color quantization literature because of its high computational requirements and sensitivity to initialization. In this paper, we investigate the performance of k-means as a color quantizer. We implement fast and exact variants of k-means with several initialization schemes and then compare the resulting quantizers to some of the most popular quantizers in the literature. Experiments on a diverse set of images demonstrate that an efficient implementation of k-means with an appropriate initialization strategy can in fact serve as a very effective color quantizer.  相似文献   

11.
Rough k-means clustering algorithm and its extensions are introduced and successfully applied to real-life data where clusters do not necessarily have crisp boundaries. Experiments with the rough k-means clustering algorithm have shown that it provides a reasonable set of lower and upper bounds for a given dataset. However, the same weight was used for all the data objects in a lower or upper approximate set when computing the new centre for each cluster while the different impacts of the objects in a same approximation were ignored. An improved rough k-means clustering based on weighted distance measure with Gaussian function is proposed in this paper. The validity of this algorithm is demonstrated by simulation and experimental analysis.  相似文献   

12.
We present the global k-means algorithm which is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) executions of the k-means algorithm from suitable initial positions. We also propose modifications of the method to reduce the computational load without significantly affecting solution quality. The proposed clustering methods are tested on well-known data sets and they compare favorably to the k-means algorithm with random restarts.  相似文献   

13.
In this paper, we present a fast global k-means clustering algorithm by making use of the cluster membership and geometrical information of a data point. This algorithm is referred to as MFGKM. The algorithm uses a set of inequalities developed in this paper to determine a starting point for the jth cluster center of global k-means clustering. Adopting multiple cluster center selection (MCS) for MFGKM, we also develop another clustering algorithm called MFGKM+MCS. MCS determines more than one starting point for each step of cluster split; while the available fast and modified global k-means clustering algorithms select one starting point for each cluster split. Our proposed method MFGKM can obtain the least distortion; while MFGKM+MCS may give the least computing time. Compared to the modified global k-means clustering algorithm, our method MFGKM can reduce the computing time and number of distance calculations by a factor of 3.78-5.55 and 21.13-31.41, respectively, with the average distortion reduction of 5,487 for the Statlog data set. Compared to the fast global k-means clustering algorithm, our method MFGKM+MCS can reduce the computing time by a factor of 5.78-8.70 with the average reduction of distortion of 30,564 using the same data set. The performances of our proposed methods are more remarkable when a data set with higher dimension is divided into more clusters.  相似文献   

14.
DIVCLUS-T is a divisive hierarchical clustering algorithm based on a monothetic bipartitional approach allowing the dendrogram of the hierarchy to be read as a decision tree. It is designed for either numerical or categorical data. Like the Ward agglomerative hierarchical clustering algorithm and the k-means partitioning algorithm, it is based on the minimization of the inertia criterion. However, unlike Ward and k-means, it provides a simple and natural interpretation of the clusters. The price paid by construction in terms of inertia by DIVCLUS-T for this additional interpretation is studied by applying the three algorithms on six databases from the UCI Machine Learning repository.  相似文献   

15.
This paper proposes a novel scheme for 3D model compression based on mesh segmentation using multiple principal plane analysis. This algorithm first performs a mesh segmentation scheme, based on fusion of the well-known k-means clustering and the proposed principal plane analysis to separate the input 3D mesh into a set of disjointed polygonal regions. The boundary indexing scheme for the whole object is created by assembling local regions. Finally, the current work proposes a triangle traversal scheme to encode the connectivity and geometry information simultaneously for every patch under the guidance of the boundary indexing scheme. Simulation results demonstrate that the proposed algorithm obtains good performance in terms of compression rate and reconstruction quality.  相似文献   

16.
Image segmentation denotes a process of partitioning an image into distinct regions. A large variety of different segmentation approaches for images have been developed. Among them, the clustering methods have been extensively investigated and used. In this paper, a clustering based approach using a hierarchical evolutionary algorithm (HEA) is proposed for medical image segmentation. The HEA can be viewed as a variant of conventional genetic algorithms. By means of a hierarchical structure in the chromosome, the proposed approach can automatically classify the image into appropriate classes and avoid the difficulty of searching for the proper number of classes. The experimental results indicate that the proposed approach can produce more continuous and smoother segmentation results in comparison with four existing methods, competitive Hopfield neural networks (CHNN), dynamic thresholding, k-means, and fuzzy c-means methods.  相似文献   

17.
In practical cluster analysis tasks, an efficient clustering algorithm should be less sensitive to parameter configurations and tolerate the existence of outliers. Based on the neural gas (NG) network framework, we propose an efficient prototype-based clustering (PBC) algorithm called enhanced neural gas (ENG) network. Several problems associated with the traditional PBC algorithms and original NG algorithm such as sensitivity to initialization, sensitivity to input sequence ordering and the adverse influence from outliers can be effectively tackled in our new scheme. In addition, our new algorithm can establish the topology relationships among the prototypes and all topology-wise badly located prototypes can be relocated to represent more meaningful regions. Experimental results1on synthetic and UCI datasets show that our algorithm possesses superior performance in comparison to several PBC algorithms and their improved variants, such as hard c-means, fuzzy c-means, NG, fuzzy possibilistic c-means, credibilistic fuzzy c-means, hard/fuzzy robust clustering and alternative hard/fuzzy c-means, in static data clustering tasks with a fixed number of prototypes.  相似文献   

18.
Clustering is one of the widely used knowledge discovery techniques to reveal structures in a dataset that can be extremely useful to the analyst. In iterative clustering algorithms the procedure adopted for choosing initial cluster centers is extremely important as it has a direct impact on the formation of final clusters. Since clusters are separated groups in a feature space, it is desirable to select initial centers which are well separated. In this paper, we have proposed an algorithm to compute initial cluster centers for k-means algorithm. The algorithm is applied to several different datasets in different dimension for illustrative purposes. It is observed that the newly proposed algorithm has good performance to obtain the initial cluster centers for the k-means algorithm.  相似文献   

19.
By using a kernel function, data that are not easily separable in the original space can be clustered into homogeneous groups in the implicitly transformed high-dimensional feature space. Kernel k-means algorithms have recently been shown to perform better than conventional k-means algorithms in unsupervised classification. However, few reports have examined the benefits of using a kernel function and the relative merits of the various kernel clustering algorithms with regard to the data distribution. In this study, we reformulated four representative clustering algorithms based on a kernel function and evaluated their performances for various data sets. The results indicate that each kernel clustering algorithm gives markedly better performance than its conventional counterpart for almost all data sets. Of the kernel clustering algorithms studied in the present work, the kernel average linkage algorithm gives the most accurate clustering results.  相似文献   

20.
Kernel-based clustering is one of the most popular methods for partitioning nonlinearly separable datasets. However, exhaustive search for the global optimum is NP-hard. Iterative procedure such as k-means can be used to seek one of the local minima. Unfortunately, it is easily trapped into degenerate local minima when the prototypes of clusters are ill-initialized. In this paper, we restate the optimization problem of kernel-based clustering in an online learning framework, whereby a conscience mechanism is easily integrated to tackle the ill-initialization problem and faster convergence rate is achieved. Thus, we propose a novel approach termed conscience online learning (COLL). For each randomly taken data point, our method selects the winning prototype based on the conscience mechanism to bias the ill-initialized prototype to avoid degenerate local minima and efficiently updates the winner by the online learning rule. Therefore, it can more efficiently obtain smaller distortion error than k-means with the same initialization. The rationale of the proposed COLL method is experimentally analyzed. Then, we apply the COLL method to the applications of digit clustering and video clustering. The experimental results demonstrate the significant improvement over existing kernel-based clustering methods.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号