20 similar documents found (search time: 0 ms)
1.
Spectral clustering: A semi-supervised approach (total citations: 2; self-citations: 0; citations by others: 2)
Graph-based spectral clustering algorithms have developed rapidly in recent years. They are posed as discrete combinatorial optimization problems and solved approximately by relaxing them into tractable eigenvalue decomposition problems. In this paper, we first review existing spectral clustering algorithms within a unified framework and give a straightforward explanation of spectral clustering. We also present a novel model that generalizes unsupervised spectral clustering to semi-supervised spectral clustering. Under this model, prior information given by instance-level constraints can be generalized to space-level constraints. We find that the (undirected) graph built on the enlarged prior information is more meaningful, and hence the cluster boundaries are more accurate. Experimental results on toy data, real-world data, and image segmentation demonstrate the advantages of the proposed model.
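The relaxation described in this abstract can be illustrated with a minimal sketch. It is not the paper's semi-supervised model: it shows only plain unsupervised two-way spectral partitioning, assuming a Gaussian similarity graph on 1-D toy points and using the sign of the second eigenvector of the unnormalized Laplacian L = D - W. The function names, the `sigma` parameter, and the toy data are illustrative assumptions, and the eigenvector is approximated by power iteration rather than a library eigensolver.

```python
import math

def gaussian_affinity(points, sigma=1.0):
    """Symmetric similarity matrix W with zero diagonal (assumed Gaussian kernel)."""
    n = len(points)
    return [[0.0 if i == j
             else math.exp(-((points[i] - points[j]) ** 2) / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]

def fiedler_vector(W, iters=200):
    """Approximate the eigenvector of L = D - W with the second-smallest eigenvalue.

    Power iteration is run on (c*I - L), whose largest eigenvalues correspond to
    the smallest eigenvalues of L. The constant vector (eigenvalue 0 of L) is
    projected out at every step, so the iteration converges to the next one.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    c = 2 * max(deg)  # upper bound on the eigenvalues of L (Gershgorin)
    v = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        w = [(c - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]  # w = (c*I - L) v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Two well-separated groups of toy points.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
v = fiedler_vector(gaussian_affinity(points))
labels = [1 if x > 0 else 0 for x in v]  # the sign of each entry gives the cluster
print(labels)
```

The discrete problem (assign each point to one of two clusters) is replaced by a continuous eigenvector problem, and thresholding the eigenvector recovers a discrete partition.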
2.
This paper proposes an algorithm that solves the shape recovery problem from N arbitrary images. By introducing a polygonal carving technique, the proposed algorithm reconstructs an image-consistent polygonal shape that is patched by the input images. The algorithm eliminates invalid vertices and polygons from the initial polygonal grid space according to the color variance that represents their image consistency. The carved shape is then refined by moving the outlier vertices on the boundary of each image. The final reconstructed shape faithfully accounts for the input images, and its textured appearance reflects the color properties of the target object.
3.
Semi-supervised learning (SSL) involves training a decision rule from both labeled and unlabeled data. In this paper, we propose a novel SSL algorithm based on the multiple-clusters-per-class assumption. The proposed algorithm consists of two stages. In the first stage, we capture the local cluster structure of the training data by using the k-nearest-neighbor (kNN) algorithm to split the data into a number of disjoint subsets. In the second stage, a maximal margin classifier based on second-order cone programming (SOCP) is introduced to learn an inductive decision function globally from the obtained subsets. For linear classification problems, once the kNN algorithm has been performed, the proposed algorithm trains a classifier using only the first- and second-order moments of the subsets, without considering individual data points. Since the number of subsets is usually much smaller than the number of training points, the proposed algorithm handles large data sets with many unlabeled points efficiently. Despite its simplicity, the classification performance of the proposed algorithm is guaranteed by the maximal margin classifier. We demonstrate the efficiency and effectiveness of the proposed algorithm on both synthetic and real-world data sets.
4.
Exploiting constraint inconsistence for dimension selection in subspace clustering: A semi-supervised approach (total citations: 1; self-citations: 0; citations by others: 1)
Xianchao Zhang, Yang Qiu, Yao Wu 《Neurocomputing》2011,74(17):3598-3608
Selecting the correct dimensions is very important to subspace clustering and is a challenging issue. This paper studies a semi-supervised approach to the problem. In this setting, limited domain knowledge is available in the form of space-level pairwise constraints, i.e., must-links and cannot-links. We propose a semi-supervised subspace clustering (S3C) algorithm that exploits constraint inconsistence for dimension selection. Our algorithm first correlates globally inconsistent constraints with the dimensions in which they are consistent, then unites constraints that share correlating dimensions, and finally forms the subspaces according to the constraint unions. Experimental results show that S3C is superior to the typical unsupervised subspace clustering algorithm FINDIT and to the constraint-based semi-supervised subspace clustering algorithm SC-MINER.
5.
Modelling a complex carving surface is the most important step in digitizing art carvings such as Chinese classical furniture carving, and it is difficult to accomplish. However, a complex 2D curve flower pattern can easily be acquired or drawn by hand or with drawing software. This paper presents a fast, integrated 3D modelling method for complex carving surfaces based on a 2D curve flower pattern. The proposed method uses a scanning analysis algorithm, a normal distribution function, and a distance function to model and create carving tracks. The delamination, combination, and interpolation steps of the modelling process are also described. The method makes the modelling of complex carving surfaces more intelligent and agile, and meets the requirements of integrated 3D modelling for digital art carving. Experimental results show that the method is fast, supports multiple models, allows interactive design, and is highly practical.
6.
A well-designed graph tends to yield good performance in graph-based semi-supervised learning. Although most graph-based semi-supervised dimensionality reduction approaches perform very well on clean data sets, they usually cannot construct a faithful graph, which is essential for good performance, when applied to high-dimensional, sparse, or noisy data. This generally leads to dramatic performance degradation. To deal with these issues, this paper proposes a strategy called relative semi-supervised dimensionality reduction (RSSDR), which applies perceptual relativity to semi-supervised dimensionality reduction. In RSSDR, a relative transformation is first performed over the training samples to build the relative space. The relative transformation improves the ability to distinguish between data points and diminishes the impact of noise on semi-supervised dimensionality reduction. Second, the edge weights of the neighborhood graph are determined by minimizing the local reconstruction error in the relative space, so that both the global and the local geometric structure of the data are preserved. Extensive experiments on face, UCI, gene expression, artificial, and noisy data sets validate the feasibility and effectiveness of the proposed algorithm, with promising results in both classification accuracy and robustness.
7.
Deformable surface 3D tracking is a severely under-constrained problem, and great efforts have been made to solve it. A recent state-of-the-art approach formulates it as a second-order cone programming (SOCP) problem. However, one drawback of this approach is that it is time-consuming. In this paper, we propose an effective method for 3D deformable surface tracking. First, we formulate the deformable surface tracking problem as a linear programming (LP) problem. Then, we solve the LP problem with an algorithm that converges superlinearly, rather than with the bisection algorithm, whose convergence is linear. Our experiments on synthetic and real data demonstrate that the proposed method not only reliably recovers the 3D structure of surfaces but also runs faster than the state-of-the-art method.
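The practical difference between linear and superlinear convergence can be seen in a small, generic root-finding sketch. The abstract does not name its superlinear algorithm, so the secant method is used here purely as a stand-in, and the test function f(x) = x^2 - 2 is an assumption chosen for illustration. Nothing below is specific to the LP tracking formulation.

```python
def bisection(f, lo, hi, tol=1e-12):
    """Halve the bracketing interval each step: linear convergence."""
    steps = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # the sign change lies in the left half
        else:
            lo = mid  # the sign change lies in the right half
        steps += 1
    return (lo + hi) / 2, steps

def secant(f, x0, x1, tol=1e-12, max_steps=100):
    """Secant iteration: superlinear convergence near a simple root."""
    steps = 0
    while abs(x1 - x0) > tol and steps < max_steps:
        f0, f1 = f(x0), f(x1)
        if f1 == f0:  # flat secant line: stop rather than divide by zero
            break
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        steps += 1
    return x1, steps

def f(x):
    return x * x - 2  # root at sqrt(2)

root_b, steps_b = bisection(f, 1.0, 2.0)
root_s, steps_s = secant(f, 1.0, 2.0)
print(f"bisection: {root_b:.12f} in {steps_b} steps")
print(f"secant:    {root_s:.12f} in {steps_s} steps")
```

Bisection gains roughly one binary digit per step, while the secant method increases the number of correct digits by a growing factor per step, which is why it needs far fewer iterations for the same tolerance.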
8.
The semi-supervised Gaussian mixture model (SGMM) has been successfully applied in a wide range of engineering and scientific fields, including text classification, image retrieval, and biometric identification. Recently, many studies have shown that naturally occurring data may reside on or near manifold structures in the ambient space. In this paper, we study the use of the SGMM for data sets containing multiple separated or intersecting manifold structures. We propose a new multi-manifold regularized, semi-supervised Gaussian mixture model (M2SGMM) for classifying multiple manifolds. Specifically, we model the data manifold using a similarity graph with local and geometrical consistency properties. The geometrical similarity is measured by a novel application of the local tangent space. We regularize the model parameters of the SGMM by incorporating the enhanced Laplacian of the graph. Experiments demonstrate the effectiveness of the proposed approach.
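As a rough illustration of graph-Laplacian regularization, the sketch below computes the standard smoothness penalty f^T L f for a function f defined on the graph nodes, with L = D - W. It uses the ordinary unnormalized Laplacian, not the paper's enhanced Laplacian, and the toy graph and function names are assumptions made for the example.

```python
def laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def quadratic_form(L, f):
    """Smoothness penalty f^T L f."""
    n = len(f)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))

def pairwise_penalty(W, f):
    """Equivalent form: 0.5 * sum_ij W_ij * (f_i - f_j)^2."""
    n = len(f)
    return 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))

# Path graph 0 - 1 - 2 with unit edge weights.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
L = laplacian(W)

smooth = [1.0, 1.0, 1.0]   # constant on the graph: no penalty
rough = [1.0, 1.0, -1.0]   # changes across the edge 1 - 2: penalized
print(quadratic_form(L, smooth), quadratic_form(L, rough))
```

The penalty is small when strongly connected nodes receive similar values, which is the property a Laplacian regularizer adds to a model's objective.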
9.
A global optimization method for semi-supervised clustering (total citations: 1; self-citations: 0; citations by others: 1)
Yu Xia 《Data mining and knowledge discovery》2009,18(2):214-256
In this paper, we adapt Tuy’s concave cutting plane method to semi-supervised clustering. We also give properties of locally optimal solutions of semi-supervised clustering. Numerical examples show that this method can give better solutions than other semi-supervised clustering algorithms.
10.
11.
Literature on supervised machine-learning (ML) approaches for classifying text-based safety reports in the construction sector has been growing. Recent studies have emphasized the need to build ML approaches that balance high classification accuracy with performance on management criteria, such as resource intensiveness. However, despite being highly accurate, the widely studied supervised ML approaches may not perform well on management criteria, because many factors contribute to their resource intensiveness. The potential of semi-supervised ML approaches to achieve balanced performance, by contrast, has rarely been explored in the construction safety literature. The current study contributes to the scarce knowledge on semi-supervised ML approaches by demonstrating the applicability of a state-of-the-art semi-supervised learning approach, namely Yet Another Keyword Extractor (YAKE) integrated with Guided Latent Dirichlet Allocation (GLDA), to construction safety report classification. Construction-safety-specific knowledge is extracted as keywords through YAKE, relying on accessible literature with minimal manual intervention. The keywords from YAKE are then seeded into the GLDA model to classify safety reports automatically without requiring a large prelabeled dataset. The YAKE-GLDA classification performance (F1 score of 0.66) is superior to that of existing unsupervised methods on benchmark data containing injury narratives from the Occupational Safety and Health Administration (OSHA). The YAKE-GLDA approach is also applied to near-miss safety reports from a construction site. The study demonstrates a high degree of generality of the YAKE-GLDA approach through a moderately high F1 score of 0.86 for a few categories in the near-miss data.
The current research demonstrates that, unlike existing supervised approaches, the semi-supervised YAKE-GLDA approach can consistently achieve reasonably good classification performance across various construction-specific safety datasets while remaining resource-efficient. Results from an objective comparative and sensitivity analysis provide much-needed insights into the functioning and applicability of YAKE-GLDA. The results of the current study will help construction organizations implement and optimize an efficient ML-based knowledge-mining strategy for domains beyond safety and across sites where the availability of a prelabeled dataset is a significant limitation.
12.
13.
This paper proposes a fast and stable image-based modeling method that generates 3D models with high-quality face textures in a semi-automatic way. The modeler guides untrained users to quickly obtain 3D model data through a few simple user interface operations on predefined 3D primitives. In the model estimation step, the proposed method uses an iterative non-linear error minimization technique with an error function based on finite line segments instead of infinite lines. The error corresponds to the difference between the observed structure and the structure predicted from the current model parameters. Experimental results on real images validate the robustness and accuracy of the algorithm.
14.
Ruichu Cai, Zhenjie Zhang 《Pattern recognition》2011,44(4):811-820
Feature selection is an important preprocessing step for building efficient, generalizable, and interpretable classifiers on high-dimensional data sets. Assuming sufficient labelled samples, the Markov Blanket provides a complete and sound solution to the selection of optimal features by exploring the conditional independence relationships among the features. In real-world applications, unfortunately, it is usually easy to get unlabelled samples but expensive to obtain accurate labels for them. This leads to the potential waste of valuable classification information buried in unlabelled samples. In this paper, we propose a new BAyesian Semi-SUpervised Method, or BASSUM for short, to exploit unlabelled samples in the classification feature selection problem. Generally speaking, including unlabelled samples helps the feature selection algorithm by (1) pinpointing more specific conditional independence tests involving fewer variable features and (2) improving the robustness of individual conditional independence tests with additional statistical information. Our experimental results show that BASSUM enhances the efficiency of traditional feature selection methods and overcomes the difficulties that existing semi-supervised solutions have with redundant features.
15.
Exploring human relationships is an important area of study in mobile communication networks. However, relationship prediction accuracy is poor when the number of known relationship labels (e.g., “friend” and “colleague”) is small, especially when the relation classes are imbalanced. To deal with these issues, we present a semi-supervised model for inferring social relationships. The model can infer relationships from a large amount of unlabeled data and a small amount of labeled data. It is a co-training-style semi-supervised model that combines a support vector machine with naive Bayes. The final relationship labels are decided by the two classifiers. The proposed model is evaluated on a real mobile communication network dataset, and the experimental results show that the model is effective in relationship mining, especially when the relationship network is in a stable state.
16.
The spatially asymptotic theory is a useful approach to the neutron transport model for nuclear reactor physics applications. For steady-state problems, the transport equation is considered in an infinite medium and treated by the Fourier transform. A formal solution is thus obtained for any assumed order of anisotropy, leading to the B_N formulation. In the case of isotropic emission, the Green function of the problem can be given explicitly by the inverse Fourier transform, leading to the same solution that can be obtained by the Case method.
17.
Traditional data-based soft sensors are constructed with equal numbers of input and output data samples, and the collected process data are assumed to be clean and free of outliers. However, such assumptions are too strict in practice. On the one hand, the easily collected input variables are sometimes corrupted by outliers. On the other hand, output variables, also called quality variables, are usually difficult to obtain. These two problems make traditional soft sensors cumbersome. To deal with both issues, this paper uses Student's t distributions in mixture probabilistic principal component regression modeling, so that outliers are tolerated through adjustable heavy tails. Furthermore, a semi-supervised mechanism is incorporated into traditional probabilistic regression to deal with the unbalanced modeling issue. Two case studies are provided to demonstrate the robustness and reliability of the new method.
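Why heavy tails tolerate outliers can be shown with a small univariate sketch. It compares the log-density that a Gaussian and a Student's t distribution assign to a far-away point, and computes the standard per-sample weight (nu + 1) / (nu + z^2) that appears in EM fitting of a univariate t distribution. This is a generic illustration, not the paper's mixture regression model; the function names and the values nu = 3 and x = 8 are assumptions.

```python
import math

def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a normal distribution."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a location-scale Student's t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def t_weight(x, nu, mu=0.0, sigma=1.0):
    """EM weight of a sample under a univariate t: small for outliers."""
    z = (x - mu) / sigma
    return (nu + 1) / (nu + z * z)

outlier = 8.0
print("Gaussian log-density:", gaussian_logpdf(outlier))
print("Student-t log-density (nu=3):", student_t_logpdf(outlier, nu=3))
print("weight of inlier 0.5:", t_weight(0.5, nu=3))
print("weight of outlier 8.0:", t_weight(outlier, nu=3))
```

The Gaussian penalizes the outlier quadratically in its distance from the mean, so a single bad sample can dominate a fit, whereas the t distribution penalizes it only logarithmically and the corresponding weight shrinks toward zero. Smaller `nu` gives heavier tails.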
18.
For the management of digital document collections, automatic database analysis still has difficulty dealing with the semantic queries and abstract concepts that users are looking for. Although interactive learning strategies may improve search results, system performance still depends on the representation of the document collection. In this paper, we introduce a weakly supervised optimization of a feature vector set. Using an incomplete set of partial labels, the method improves the representation of the collection even if the size, number, and structure of the concepts are unknown. Experiments have been carried out on synthetic and real data to validate our approach.
19.
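Cone-beam volume reconstruction algorithms have attracted wide attention because they acquire projection data quickly, make efficient use of X-rays, and preserve the spatial and density isotropy of the reconstructed object. For cone-beam volume reconstruction with a single circular source trajectory, this paper proposes a T-FDK algorithm based on a flat-panel detector (FT-FDK for short). The algorithm first rebins the cone-beam projection data into tilted parallel projection data, and then obtains the three-dimensional structure of the object under test through weighted filtering and backprojection. Experimental results show that the algorithm has the same computational complexity as the traditional FDK algorithm while clearly improving the quality of the reconstructed images, and that it therefore has significant practical value in fields such as medical imaging and non-destructive testing.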
20.
J. R. J. Lee, M. L. Smith, L. N. Smith, P. S. Midha 《Machine Vision and Applications》2005,16(5):282-288
Angularity is a critically important property for the performance of natural particulate materials. It is also one of the most difficult to measure objectively using traditional methods. Here we present an innovative and efficient approach to determining particle angularity using image analysis. The direct use of three-dimensional data offers a more robust solution than the two-dimensional methods proposed previously. The algorithm applies mathematical morphological techniques to range imagery and effectively simulates the natural wear processes by which rock particles become rounded. The analysis of simulated volume loss provides a measure of angularity that is geometrically commensurate with the traditional definitions. Experimental data obtained from real particle samples are presented, and the results are correlated with existing methods to demonstrate the validity of the new approach. The implementation of technologies such as these has the potential to offer significant process optimisation and environmental benefits to producers of aggregates and their composites. The technique is theoretically extendable to the quantification of surface texture.
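The idea of measuring angularity through simulated volume loss can be sketched with a grayscale morphological opening on a small height map. This is only an illustration under stated assumptions: the range image is a plain list of lists of heights, the structuring element is a flat square (the paper's actual element is not given in the abstract), and the function names are hypothetical.

```python
def _window(img, r, c, radius):
    """Heights inside the square neighbourhood of (r, c), clipped at the border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]

def erode(img, radius=1):
    """Grayscale erosion with a flat square element: local minimum."""
    return [[min(_window(img, r, c, radius)) for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img, radius=1):
    """Grayscale dilation with a flat square element: local maximum."""
    return [[max(_window(img, r, c, radius)) for c in range(len(img[0]))]
            for r in range(len(img))]

def opening(img, radius=1):
    """Erosion followed by dilation: removes peaks narrower than the element."""
    return dilate(erode(img, radius), radius)

def volume_loss(img, radius=1):
    """Fraction of the total height volume removed by the opening."""
    opened = opening(img, radius)
    total = sum(map(sum, img))
    lost = sum(a - b for row_a, row_b in zip(img, opened)
               for a, b in zip(row_a, row_b))
    return lost / total

# A flat surface versus the same surface with one sharp spike.
flat = [[5.0] * 5 for _ in range(5)]
spiky = [row[:] for row in flat]
spiky[2][2] = 9.0

print(volume_loss(flat), volume_loss(spiky))
```

The opening leaves broad, smooth features untouched and shaves off sharp protrusions, so the fraction of volume removed grows with how angular the surface is. This mirrors the wear process described in the abstract.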