Similar articles
19 similar articles found (search time: 140 ms)
1.
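Wang Jing (王靖). Journal of Software (《软件学报》), 2011, 22(7): 1571-1579
In recent years, manifold learning has attracted wide attention in pattern recognition, machine learning, data mining, and many other fields. However, standard manifold learning methods are not robust to outliers. To address this, a manifold outlier detection method based on reconstruction weights is proposed. The method constructs a local "strong" neighborhood at each sample point, uses the local reconstruction weights to compute a reliability value for each sample point, and finally uses the reliability values to detect outliers. The algorithm is fast to compute, has few parameters, and is insensitive to their settings. Building on this outlier detection method, a robust Isomap algorithm is proposed. Experimental results show that the method detects outliers effectively and thereby improves the robustness of manifold learning methods to outliers.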

2.
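A comparative study of several manifold learning algorithms   Cited by: 1 (self-citations: 0, citations by others: 1)
The main goal of manifold learning is to discover meaningful low-dimensional embeddings of manifolds in high-dimensional data spaces. At present, most manifold learning algorithms are used for nonlinear dimensionality reduction or data visualization, for example isometric mapping (Isomap), locally linear embedding (LLE), and Laplacian Eigenmap. This article analyzes and compares these three manifold learning algorithms experimentally in order to understand their characteristics and so better support dimensionality reduction and data analysis.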

3.
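Isometric mapping (Isomap) is a representative of manifold learning, a family of nonlinear dimensionality reduction methods that has emerged in recent years. The algorithm is simple and effective, but its computational complexity is high. L-Isomap, which is based on landmark points, reduces the computational complexity, but the landmark points are mostly chosen at random, which makes the algorithm unstable. Taking into account the relative positions of each sample point and its neighbors, sample points that have a larger influence on the embedded manifold are given higher weights. Landmark points are then selected according to these weights while also considering the relative positions of the landmark points themselves, so that the selected landmarks are not overly concentrated and are much less likely to lie approximately along a straight line, which ensures the stability of the algorithm. Experimental results show that, with a small number of landmark points, the algorithm is more stable than L-Isomap, and that on incomplete manifolds with missing data it still yields results close to those of Isomap.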

4.
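Low-dimensional embedding of high-dimensional data manifolds and the embedding dimension   Cited by: 29 (self-citations: 0, citations by others: 29)
Discovering meaningful low-dimensional embeddings of manifolds in high-dimensional data spaces is a classical hard problem. Isomap is an effective nonlinear dimensionality reduction method based on manifold theory; it can not only reveal the intrinsic structure of high-dimensional observed data but also discover the underlying low-dimensional parameter space. The theoretical foundation of Isomap is the assumption that an isometric mapping exists between the high-dimensional data space and the low-dimensional parameter space, but this assumption was not proved. This paper first proves the existence of an isometric mapping between a continuous manifold of high-dimensional data and the low-dimensional parameter space, then distinguishes between the embedding-space dimension, the intrinsic dimension of the high-dimensional data space, and the manifold dimension, and proves that for high-dimensional data spaces containing a loop manifold the dimension of the parameter space is smaller than that of the embedding space. Finally, an algorithm for discovering loop manifolds is proposed, which determines whether a high-dimensional data space contains a loop manifold and then estimates its intrinsic dimension and latent-space dimension. Experiments on multi-pose three-dimensional objects demonstrate the effectiveness of the algorithm and yield the correct low-dimensional parameter space.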

5.
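Data from chemical production processes are mostly nonlinear and high-dimensional, so feature extraction is a necessary step in soft sensor modeling. Manifold learning, as a nonlinear dimensionality reduction method, can obtain an embedding of high-dimensional data in a low-dimensional space. To address the poor robustness and poor topological stability of isometric mapping (Isomap), a kernel matrix is constructed by the constant-shifting method, giving kernel isometric mapping (KIsomap), which improves the robustness and topological stability of the Isomap algorithm. The neighborhood graph is constructed by a method that combines K-nearest neighbors with the ε-radius method; features are extracted from the data with KIsomap, and a Gaussian process regression soft sensor model is built, which improves the generalization ability and learning efficiency of the model. The method is applied to soft sensor modeling of a bisphenol A plant, and simulation results show that it achieves higher estimation accuracy and learning efficiency than soft sensor modeling methods based on other feature extraction techniques.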
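
The abstract above mentions a neighborhood graph that combines K-nearest neighbors with the ε-radius rule but does not say how the two are combined. The following is only a minimal sketch under one assumed reading: an edge is kept when a point is among the K nearest neighbors and also lies within radius ε. The function name `knn_eps_neighbors`, the intersection rule, and the toy data are all illustrative assumptions, not the authors' method.

```python
import math

def knn_eps_neighbors(points, k, eps):
    """Neighbor sets where j is among i's k nearest points AND within radius eps.

    Assumption: the two criteria are combined by intersection; the abstract
    does not specify the actual rule.
    """
    n = len(points)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        ranked = sorted((math.dist(points[i], points[j]), j)
                        for j in range(n) if j != i)
        for d, j in ranked[:k]:
            if d <= eps:
                neighbors[i].add(j)
                neighbors[j].add(i)  # keep the graph undirected
    return neighbors

# Toy data: three nearby points on a line plus one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
nbrs = knn_eps_neighbors(points, k=2, eps=1.5)
```

Under this assumed rule the distant point stays isolated, whereas a plain K-nearest-neighbor graph would link it to the cluster with a long edge that can distort shortest-path distances.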

6.
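For classification-oriented dimensionality reduction of hyperspectral remote sensing data, a new supervised isometric mapping method (S-Isomap) is proposed that takes into account both the intrinsic nonlinear structure of hyperspectral remote sensing data and the unsupervised nature of traditional manifold learning. Based on the idea that inter-class distances should be larger than intra-class distances, the method first clusters the original data with the KMEANS algorithm to obtain initial class labels for the samples, then uses a new distance to search for the K nearest neighbors of each data point, and finally performs isometric mapping for dimensionality reduction. Experiments show that the method outperforms traditional Isomap.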

7.
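For classification-oriented dimensionality reduction of hyperspectral remote sensing data, a new supervised isometric mapping method (S-Isomap) is proposed that takes into account both the intrinsic nonlinear structure of hyperspectral remote sensing data and the unsupervised nature of traditional manifold learning. Based on the idea that inter-class distances should be larger than intra-class distances, the method first clusters the original data with the KMEANS algorithm to obtain initial class labels for the samples, then uses a new distance to search for the K nearest neighbors of each data point, and finally performs isometric mapping for dimensionality reduction. Experiments show that the method outperforms traditional Isomap.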

8.
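An overview of manifold learning   Cited by: 37 (self-citations: 2, citations by others: 37)
Manifold learning is a new unsupervised learning method that in recent years has attracted growing attention from researchers in machine learning and cognitive science. To deepen the understanding of manifold learning, this paper starts from its topological concepts and traces its development. After clarifying the different formulations of manifold learning, it analyzes the strengths and weaknesses of several major manifold algorithms and then presents application examples of Isomap and LLE. The results show that, compared with traditional linear dimensionality reduction methods, manifold learning can effectively discover the intrinsic dimension of nonlinear high-dimensional data, which benefits dimensionality reduction and data analysis. Finally, future research directions for manifold learning are discussed, with a view to further broadening its areas of application.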

9.
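Most current manifold learning algorithms cannot obtain the mapping from the high-dimensional input space to the low-dimensional embedding space and cannot handle newly added data, so they have no incremental learning ability. Existing incremental manifold learning algorithms mostly extend one specific manifold learning algorithm to make it incremental and are therefore not general. To address this problem, a general incremental manifold learning (GIML) algorithm is proposed. The method takes full account of local smoothness, an essential property of manifolds: it uses local principal component analysis to extract the locally smooth structure of the data set and finds the best transformation from the locally smooth structure containing the new sample points to the low-dimensional embedding coordinates of the corresponding training data. GIML then uses this transformation to compute the low-dimensional embedding coordinates of the new sample points. Systematic and extensive comparative experiments on synthetic data sets and real image data sets show that GIML is an efficient and general incremental manifold learning method and that it obtains the low-dimensional embedding coordinates of incremental data more accurately than the main existing incremental algorithms.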

10.
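Parameter selection in manifold learning algorithms   Cited by: 1 (self-citations: 0, citations by others: 1)
Manifold learning algorithms are nonlinear dimensionality reduction machine learning algorithms developed in recent years. Isometric feature mapping (Isomap) and locally linear embedding (LLE) are two typical manifold learning algorithms. Experiments compare and analyze how the choice of the neighborhood parameter K and the number of sample points N in the two algorithms affects the dimensionality reduction results and the running time. The results show that Isomap is more tolerant of the neighborhood parameter K and the number of sample points N, whereas LLE has a clear advantage in computation speed.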

11.
In the past few years, some nonlinear dimensionality reduction (NLDR) or nonlinear manifold learning methods have aroused a great deal of interest in the machine learning community. These methods are promising in that they can automatically discover the low-dimensional nonlinear manifold in a high-dimensional data space and then embed the data points into a low-dimensional embedding space, using tractable linear algebraic techniques that are easy to implement and are not prone to local minima. Despite their appealing properties, these NLDR methods are not robust against outliers in the data, yet so far very little has been done to address the robustness problem. In this paper, we address this problem in the context of an NLDR method called locally linear embedding (LLE). Based on robust estimation techniques, we propose an approach to make LLE more robust. We refer to this approach as robust locally linear embedding (RLLE). We also present several specific methods for realizing this general RLLE approach. Experimental results on both synthetic and real-world data show that RLLE is very robust against outliers.

12.
Many manifold learning procedures try to embed given feature data into a flat space of low dimensionality while preserving as much as possible the metric in the natural feature space. The embedding process usually relies on distances between neighboring features, mainly because distances between features that are far apart from each other often provide an unreliable estimate of the true distance on the feature manifold due to its non-convexity. Distortions resulting from using long geodesics indiscriminately lead to a known limitation of the Isomap algorithm when used to map non-convex manifolds. Presented is a framework for nonlinear dimensionality reduction that uses both local and global distances in order to learn the intrinsic geometry of flat manifolds with boundaries. The resulting algorithm filters out potentially problematic distances between distant feature points based on the properties of the geodesics connecting those points and their relative distance to the boundary of the feature manifold, thus avoiding an inherent limitation of the Isomap algorithm. Since the proposed algorithm matches non-local structures, it is robust to strong noise. We show experimental results demonstrating the advantages of the proposed approach over conventional dimensionality reduction techniques, both global and local in nature.

13.
Existing approaches to recovering the structure of 3D deformable objects and camera motion parameters from uncalibrated images assume that the object's shape can be modelled well by a linear subspace. These methods have proven effective and well suited when the deformations are relatively small, but fail to reconstruct objects with relatively large deformations. This paper describes a novel approach for 3D non-rigid shape reconstruction, based on a manifold decision forest technique. The use of this technique can be justified by noting that a specific type of shape variation might be governed by only a small number of parameters, and can therefore be well represented in a low-dimensional manifold. The key contributions of this work are the use of random decision forests for shape manifold learning and a robust metric for calculating the re-projection error. The learned manifold defines constraints imposed on the reconstructed shapes. Due to the nonlinear structure of the learned manifold, this approach is better suited to large and complex object deformations than linear constraints are. The robust metric is applied to reduce the effect of measurement outliers on the quality of the reconstruction. In many practical applications outliers cannot be completely removed, and therefore the use of robust techniques is of particular practical interest. The proposed method is validated on 2D point sequences projected from 3D motion capture data for ground truth comparison and also on real 2D video sequences. Experiments show that the newly proposed method provides better performance than previously proposed ones, including robustness with respect to measurement noise, missing measurements, and outliers present in the data.

14.
Recently, the Isomap algorithm has been proposed for learning a parameterized manifold from a set of unorganized samples from the manifold. It is based on extending the classical multidimensional scaling method for dimension reduction, replacing pairwise Euclidean distances by the geodesic distances on the manifold. A continuous version of Isomap called continuum Isomap is proposed. Manifold learning in the continuous framework is then reduced to an eigenvalue problem of an integral operator. It is shown that the continuum Isomap can perfectly recover the underlying parameterization if the mapping associated with the parameterized manifold is an isometry and its domain is convex. The continuum Isomap also provides a natural way to compute low-dimensional embeddings for out-of-sample data points. Some error bounds are given for the case when the isometry condition is violated. Several illustrative numerical examples are also provided.
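
The step described above, replacing pairwise Euclidean distances by geodesic distances, can be illustrated for the discrete (sampled) case. The following is a minimal sketch, assuming a symmetric k-nearest-neighbor graph and Floyd-Warshall shortest paths as the geodesic estimate; the helper names `knn_graph` and `geodesic_distances` and the half-circle toy data are illustrative choices, and the sketch does not cover the continuum formulation of the abstract.

```python
import math

def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph as a dense matrix of edge lengths."""
    n = len(points)
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i.
        ranked = sorted((math.dist(points[i], points[j]), j)
                        for j in range(n) if j != i)
        for d, j in ranked[:k]:
            graph[i][j] = graph[j][i] = d  # keep the graph undirected
    return graph

def geodesic_distances(graph):
    """All-pairs shortest paths (Floyd-Warshall) as a geodesic estimate."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for m in range(n):
        for i in range(n):
            via = dist[i][m]
            for j in range(n):
                if via + dist[m][j] < dist[i][j]:
                    dist[i][j] = via + dist[m][j]
    return dist

# Toy data: 50 points sampled evenly along a unit half circle.
n = 50
points = [(math.cos(math.pi * t / (n - 1)), math.sin(math.pi * t / (n - 1)))
          for t in range(n)]
geo = geodesic_distances(knn_graph(points, k=2))
euclid_end_to_end = math.dist(points[0], points[-1])
geo_end_to_end = geo[0][n - 1]
```

On this toy curve the straight-line distance between the two endpoints is the diameter, while the graph-based estimate follows the curve and comes out slightly below the arc length, because each graph edge is a chord.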

15.
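Nonlinear dimensionality reduction for loop manifold data   Cited by: 1 (self-citations: 0, citations by others: 1)
Meng Deyu (孟德宇), Gu Nannan (古楠楠), Xu Zongben (徐宗本), Liang Yi (梁怡). Journal of Software (《软件学报》), 2008, 19(11): 2908-2920
In recent years a variety of new nonlinear dimensionality reduction methods have appeared and have shown good results in some applications. However, these methods often fail on nonlinear manifold data generated by loop manifolds such as spheres and cylinders. To address this problem, a loop structure detection algorithm and a nonlinear dimensionality reduction method for loop manifold data are proposed. In theory, a necessary and sufficient condition for identifying a loop manifold is given, based on the working principle of the widely studied Isomap method. Algorithmically, the resulting theorem is used to devise a data-based loop manifold detection algorithm. Finally, based on the loop structure found, a nonlinear dimensionality reduction strategy for loop manifold data is designed using the idea of polar coordinate unfolding. Simulation experiments on a series of typical loop manifold data sets show that the method has a significant advantage over other manifold learning methods for dimensionality reduction of loop manifold data.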

16.
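Intrusion detection is an active area of computer security research, and the high dimensionality of intrusion detection data is the main problem faced in visualizing and classifying it. The manifold learning algorithm Isomap is an effective nonlinear dimensionality reduction tool. In practice, however, Isomap cannot guarantee that the constructed neighborhood graph is connected and does not exploit the known class labels of the samples. To address these shortcomings, a robust supervised algorithm, S-kv-Isomap, is proposed. It uses class labels to guide dimensionality reduction and uses the k-variable algorithm to construct a connected neighborhood graph. Experiments use the KDDCUP1999 data set for visualization and classification of four types of intrusion data: Dos, R2L, Probe, and U2R. In visualization, S-kv-Isomap is compared with kv-Isomap and gives better results. In classification, S-kv-Isomap, kv-Isomap, SVM, and k-NN are compared; the results show that in intrusion detection S-kv-Isomap not only maintains a high detection rate but also has a very low false alarm rate.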

17.
Isomap is one of the widely used low-dimensional embedding methods, in which geodesic distances on a weighted graph are combined with classical scaling (metric multidimensional scaling). In this paper we focus on two critical issues that were not considered in Isomap: (1) the generalization property (projection property) and (2) topological stability. We then present a robust kernel Isomap method that has both properties. We present a method that relates Isomap to Mercer kernel machines, so that the generalization property emerges naturally through kernel principal component analysis. For topological stability, we investigate the network flow in a graph, providing a method for eliminating critical outliers. The useful behavior of the robust kernel Isomap is confirmed through numerical experiments with several data sets.
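
Classical scaling, mentioned above, starts by turning a matrix of squared distances into a Gram (inner product) matrix through double centering. The following is a minimal sketch of that single step only; the function name `double_center` and the toy data are illustrative assumptions, and the kernel construction and outlier removal of the robust kernel Isomap are not reproduced here.

```python
def double_center(sq_dists):
    """Return B = -1/2 * J * D2 * J, where J = I - (1/n) * ones(n, n)."""
    n = len(sq_dists)
    row_means = [sum(row) / n for row in sq_dists]
    col_means = [sum(sq_dists[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[-0.5 * (sq_dists[i][j] - row_means[i] - col_means[j] + total_mean)
             for j in range(n)]
            for i in range(n)]

# Toy data: four points on a line, so the squared distances are Euclidean.
xs = [0.0, 1.0, 3.0, 6.0]
sq = [[(a - b) ** 2 for b in xs] for a in xs]
gram = double_center(sq)
mean_x = sum(xs) / len(xs)
```

For Euclidean input the result equals the matrix of inner products of the mean-centered points, which is why its leading eigenvectors give embedding coordinates. When the input is squared graph geodesic distances, the resulting matrix is not guaranteed to be positive semidefinite, which is one motivation for relating Isomap to Mercer kernels.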

18.
This paper proposes a new approach to analyzing high-dimensional data sets using a low-dimensional manifold. This manifold-based approach provides a unified formulation for both learning from and synthesis back to the input space. The manifold learning method aims to solve two problems found in many existing algorithms. The first problem is the local manifold distortion caused by the cost averaging of the global cost optimization during manifold learning. The second problem results from the unit variance constraint generally used in spectral embedding methods, under which global metric information is lost. For out-of-sample data points, the proposed approach gives simple solutions for traversing between the input space and the feature space. In addition, this method can be used to estimate the underlying dimension and is robust to the number of neighbors. Experiments on both low-dimensional data and real image data are performed to illustrate the theory.

19.
Manifold learning is a well-known dimensionality reduction scheme that can detect intrinsic low-dimensional structures in non-linear high-dimensional data. It has recently been widely employed in data analysis, pattern recognition, and machine learning applications. Isomap is one of the most promising manifold learning algorithms; it extends metric multi-dimensional scaling by using approximate geodesic distances. However, when Isomap is applied to real-world problems, it may have difficulties in dealing with noisy data. Although many applications represent a sample by multiple feature vectors in different spaces, Isomap uses samples from a single observation space. In this paper, two extensions of Isomap to the multiple feature spaces problem, namely fusion of dissimilarities and fusion of geodesic distances, are presented. We exploit the advantages of several spaces and obtain a Euclidean distance on the learned manifold that is more compatible with the semantic distance. To show the effectiveness and validity of the proposed method, experiments have been carried out on shape analysis using the MPEG7 CE Part B and Fish data sets.
