Similar Documents
 20 similar documents found (search time: 187 ms)
1.
Low-dimensional embedding of high-dimensional data manifolds and the study of embedding dimension   (Total citations: 29; self: 0; others: 29)
Discovering meaningful low-dimensional embeddings of manifolds in high-dimensional data spaces is a classical and difficult problem. Isomap is an effective nonlinear dimensionality reduction method based on manifold theory: it can not only reveal the intrinsic structure of high-dimensional observations but also discover the underlying low-dimensional parameter space. Isomap rests on the assumption that an isometric mapping exists between the high-dimensional data space and the low-dimensional parameter space, but this assumption has not been proved. This paper first proves the existence of an isometric mapping between the continuous manifold of high-dimensional data and the low-dimensional parameter space. It then distinguishes among the embedding-space dimension, the intrinsic dimension of the high-dimensional data space, and the manifold dimension, and proves that for ring-shaped manifolds the dimension of the parameter space can be smaller than that of the embedding space. Finally, a detection algorithm for ring-shaped manifolds is proposed, which judges whether such a manifold exists in the high-dimensional data space and then estimates its intrinsic dimension and the dimension of the latent space. Experiments on multi-pose 3D objects demonstrate the effectiveness of the algorithm and recover the correct low-dimensional parameter space.
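As a point of reference (not the paper's own algorithm), a minimal Isomap run with a residual-variance scan over candidate embedding dimensions looks roughly like the sketch below; the Swiss-roll data set, neighborhood size, and dimension range are illustrative assumptions.

```python
# Sketch: Isomap embedding plus a residual-variance scan to guess the
# intrinsic dimension. Data set, n_neighbors and the dimension range are
# illustrative assumptions, not values from the paper.
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap
from scipy.spatial.distance import pdist

X, _ = make_swiss_roll(n_samples=1000, random_state=0)

for d in range(1, 6):
    iso = Isomap(n_neighbors=10, n_components=d)
    Y = iso.fit_transform(X)
    # Residual variance: 1 - R^2 between geodesic and embedded distances.
    geo = iso.dist_matrix_[np.triu_indices_from(iso.dist_matrix_, k=1)]
    emb = pdist(Y)
    r = np.corrcoef(geo, emb)[0, 1]
    print(f"d={d}: residual variance = {1 - r**2:.4f}")
```

The dimension at which the residual variance stops dropping sharply is a common heuristic estimate of the intrinsic dimension.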

2.
To address the instability of the topological structure produced by isometric feature mapping (Isomap) on perturbed images, an improved algorithm is proposed that embeds the image Euclidean distance (IMED) into Isomap. First, a coordinate metric coefficient is introduced to compute the coordinate metric matrix of the images, and a linear transformation converts the original images from the Euclidean distance (ED) space into the image Euclidean distance space. Then, the Euclidean distance matrix of the samples in the transformed space is computed, a neighborhood graph is constructed on this basis, and an approximate geodesic distance matrix is obtained. Finally, multidimensional scaling (MDS) is applied to construct the low-dimensional representations of the samples. Dimensionality reduction experiments on the ORL and Yale face databases combined with a nearest-neighbor classifier show average recognition-rate improvements of 5.57% and 3.95%, respectively, indicating that the improved algorithm is more robust to image perturbations in face recognition than the original algorithm.
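The IMED step can be sketched as follows: build a pixel-coordinate metric matrix G with a Gaussian kernel over pixel positions and map each image x to G^(1/2)x, after which ordinary Euclidean distances (and hence the Isomap neighborhood graph) are computed in the transformed space. The image size, kernel width, and use of a matrix square root follow the standard IMED formulation rather than any details given in the abstract.

```python
# Sketch of the IMED transform: distances between images x, y become
# (x - y)^T G (x - y), with G a Gaussian kernel over pixel coordinates.
# Equivalently, map every image to G^(1/2) x and use plain Euclidean
# distance afterwards (e.g., inside Isomap). Sizes and sigma are assumptions.
import numpy as np
from scipy.linalg import sqrtm

h, w, sigma = 16, 16, 1.0                       # assumed image size / kernel width
ys, xs = np.mgrid[0:h, 0:w]
coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
G = np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
G_half = np.real(sqrtm(G))                      # symmetric PSD square root

def imed_transform(images):
    """Map flattened images (n, h*w) into IMED space."""
    return images @ G_half.T                    # G_half is symmetric

rng = np.random.default_rng(0)
imgs = rng.random((5, h * w))                   # placeholder images
Z = imed_transform(imgs)                        # feed Z to Isomap / k-NN graph
```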

3.
Manifold learning is a well-known dimensionality reduction scheme which can detect intrinsic low-dimensional structures in non-linear high-dimensional data. It has recently been widely employed in data analysis, pattern recognition, and machine learning applications. Isomap is one of the most promising manifold learning algorithms; it extends metric multi-dimensional scaling by using approximate geodesic distances. However, when Isomap is applied to real-world problems, it may have difficulty dealing with noisy data. Moreover, although many applications represent a single sample by multiple feature vectors in different spaces, Isomap works with samples in a single observation space. In this paper, two extensions of Isomap to the multiple-feature-space setting, namely fusion of dissimilarities and fusion of geodesic distances, are presented. The proposed methods exploit the advantages of several spaces and yield a Euclidean distance on the learned manifold that is more compatible with the semantic distance. To show the effectiveness and validity of the proposed method, experiments have been carried out on shape analysis using the MPEG7 CE Part B and Fish data sets.
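One plausible reading of the "fusion of geodesic distances" variant (an illustrative sketch, not the authors' exact formulation) is to compute a geodesic distance matrix per feature space, average them, and embed the fused matrix with classical MDS:

```python
# Sketch: fuse geodesic distance matrices from two feature spaces and embed
# the result with classical MDS. The equal-weight fusion, k=10 neighborhood
# and synthetic data are assumptions for illustration.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(X, k=10):
    graph = kneighbors_graph(X, n_neighbors=k, mode="distance")
    return shortest_path(graph, method="D", directed=False)

def classical_mds(D, d=2):
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J                 # double centering
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

rng = np.random.default_rng(0)
X1 = rng.random((200, 30))                      # feature space 1 (synthetic)
X2 = rng.random((200, 50))                      # feature space 2 (synthetic)
D_fused = 0.5 * (geodesic_distances(X1) + geodesic_distances(X2))
# Guard against disconnected neighborhood graphs.
D_fused[~np.isfinite(D_fused)] = D_fused[np.isfinite(D_fused)].max()
Y = classical_mds(D_fused, d=2)
```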

4.
5.
Riemannian manifold learning   (Total citations: 1; self: 0; others: 1)
Recently, manifold learning has been widely exploited in pattern recognition, data analysis, and machine learning. This paper presents a novel framework, called Riemannian manifold learning (RML), based on the assumption that the input high-dimensional data lie on an intrinsically low-dimensional Riemannian manifold. The main idea is to formulate the dimensionality reduction problem as a classical problem in Riemannian geometry: how to construct coordinate charts for a given Riemannian manifold. We implement the Riemannian normal coordinate chart, the most widely used chart in Riemannian geometry, for a set of unorganized data points. First, two input parameters (the neighborhood size k and the intrinsic dimension d) are estimated based on an efficient simplicial reconstruction of the underlying manifold. Then, the normal coordinates are computed to map the input high-dimensional data into a low-dimensional space. Experiments on synthetic data as well as real-world images demonstrate that our algorithm can learn intrinsic geometric structures of the data, preserve radial geodesic distances, and yield regular embeddings.
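The abstract does not spell out how k and d are estimated; a common stand-in (purely illustrative, not the paper's simplicial reconstruction) estimates the intrinsic dimension d from local PCA spectra of each point's neighborhood:

```python
# Sketch: estimate the intrinsic dimension d by local PCA - for each point,
# count how many principal components of its k-neighborhood are needed to
# explain a fixed fraction of the variance, then take the median.
# k and the 95% threshold are assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_pca_dimension(X, k=12, var_ratio=0.95):
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)
    dims = []
    for neighborhood in idx:
        P = X[neighborhood]
        P = P - P.mean(axis=0)
        s = np.linalg.svd(P, compute_uv=False) ** 2   # local PCA spectrum
        cum = np.cumsum(s) / s.sum()
        dims.append(int(np.searchsorted(cum, var_ratio) + 1))
    return int(np.median(dims))
```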

6.
Recently, the Isomap algorithm has been proposed for learning a parameterized manifold from a set of unorganized samples from the manifold. It is based on extending the classical multidimensional scaling method for dimension reduction, replacing pairwise Euclidean distances by the geodesic distances on the manifold. A continuous version of Isomap called continuum Isomap is proposed. Manifold learning in the continuous framework is then reduced to an eigenvalue problem of an integral operator. It is shown that the continuum Isomap can perfectly recover the underlying parameterization if the mapping associated with the parameterized manifold is an isometry and its domain is convex. The continuum Isomap also provides a natural way to compute low-dimensional embeddings for out-of-sample data points. Some error bounds are given for the case when the isometry condition is violated. Several illustrative numerical examples are also provided.
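In discrete form, out-of-sample embedding for Isomap/classical MDS is usually done with a Nyström-style extension (as popularized by Bengio et al.); the sketch below illustrates that idea and is not necessarily the paper's exact operator formulation.

```python
# Sketch of a Nystrom-style out-of-sample extension for Isomap / classical MDS.
# D2 is the n x n matrix of squared geodesic distances on the training set,
# d2_new the squared (approximate geodesic) distances from a new point to the
# training points. Shown only as an illustration of the idea.
import numpy as np

def mds_train(D2, d=2):
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * J @ D2 @ J                        # double-centered kernel
    lam, V = np.linalg.eigh(K)
    order = np.argsort(lam)[::-1][:d]
    lam, V = lam[order], V[:, order]
    Y = V * np.sqrt(np.maximum(lam, 0))          # training embedding
    return Y, V, lam

def mds_extend(d2_new, D2, V, lam):
    # Centered kernel values between the new point and the training points.
    k_new = -0.5 * (d2_new - d2_new.mean() - D2.mean(axis=0) + D2.mean())
    return (V.T @ k_new) / np.sqrt(np.maximum(lam, 1e-12))
```

For a training point, this extension reproduces its original embedding coordinates, which is the usual sanity check for the formula.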

7.
As large-scale databases of hundreds of thousands of face images collected from the Internet and digital cameras become available, how to utilize them to train a well-performing face detector is quite a challenging problem. In this paper, we propose a method to resample a representative training set from a collected large-scale database to train a robust human face detector. First, in a high-dimensional space, we estimate geodesic distances between pairs of face samples/examples inside the collected face set by isometric feature mapping (Isomap) and then subsample the face set. After that, we embed the face set into a low-dimensional manifold space and obtain the low-dimensional embedding. Subsequently, in the embedding, we interweave the face set based on the weights computed by locally linear embedding (LLE). Furthermore, we resample nonfaces by Isomap and LLE likewise. Using the resulting face and nonface samples, we train an AdaBoost-based face detector and run it on a large database to collect false alarms. We then use the false detections to train a one-class support vector machine (SVM). Combining the AdaBoost and one-class SVM-based face detectors, we obtain a stronger detector. Experimental results on the MIT + CMU frontal face test set demonstrate that the proposed method significantly outperforms other state-of-the-art methods.
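A highly simplified sketch of the cascade idea at the end of the abstract (generic scikit-learn estimators on synthetic features; the resampling step, the detector architecture, the combination rule, and the data are all placeholders rather than the paper's actual system):

```python
# Sketch: train an AdaBoost detector, collect its false alarms on extra
# negatives, fit a one-class SVM on those false alarms, and combine the two.
# Features, sizes, and hyperparameters are synthetic placeholders.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
faces = rng.normal(0.5, 1.0, size=(500, 4))      # stand-in "face" features
nonfaces = rng.normal(-0.5, 1.0, size=(500, 4))  # stand-in "nonface" features

X = np.vstack([faces, nonfaces])
y = np.r_[np.ones(500), np.zeros(500)]
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)

# Run the detector over a large pool of negatives and keep its false alarms.
pool = rng.normal(-0.5, 1.0, size=(5000, 4))
false_alarms = pool[ada.predict(pool) == 1]

# The one-class SVM models the false-alarm region; here a candidate is
# accepted only if AdaBoost says "face" AND it does not resemble a known
# false alarm (one possible combination rule, assumed for illustration).
ocsvm = OneClassSVM(nu=0.1, gamma="scale").fit(false_alarms)

def detect(x):
    x = x.reshape(1, -1)
    return ada.predict(x)[0] == 1 and ocsvm.predict(x)[0] == -1
```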

8.
Locality preserving projections on Riemannian manifolds for image set matching   (Total citations: 1; self: 1; others: 0)
Objective: An image set matching method that preserves local structural features on a Riemannian manifold is proposed. Methods: The method models each image set with a covariance matrix; the symmetric positive-definite, nonsingular covariance matrices form a subspace on a Riemannian manifold, so that matching image sets is transformed into matching points on the manifold. A kernel function based on covariance-matrix metric learning maps the covariance matrices from the Riemannian manifold into Euclidean space. Unlike other discriminant analysis methods on Riemannian manifolds, the proposed locality preserving discriminant analysis for image sets takes the local geometric structure of the sample distribution into account, preserving the local neighborhood structure while improving the separability of the samples. Results: The algorithm was tested on image-set-based object recognition tasks, with object recognition on ETH80 and face recognition on YouTube Celebrities, reaching recognition rates of 91.5% and 65.31%, respectively. Conclusion: The experimental results show that the method outperforms other image set matching algorithms.
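A common way to realize the "map SPD covariance matrices into Euclidean space" step is the log-Euclidean mapping; the sketch below uses that mapping as an illustrative assumption (the paper's actual kernel is learned from the data, not fixed like this).

```python
# Sketch: model an image set by its (regularized) covariance matrix and map
# the SPD matrix into a Euclidean vector via the log-Euclidean mapping.
# Regularization constant and feature dimensions are illustrative assumptions.
import numpy as np
from scipy.linalg import logm

def covariance_descriptor(features, eps=1e-3):
    """features: (n_images, n_dims) array of per-image feature vectors."""
    C = np.cov(features, rowvar=False)
    return C + eps * np.eye(C.shape[0])          # keep it nonsingular (SPD)

def log_euclidean_vector(C):
    L = np.real(logm(C))                         # matrix logarithm of an SPD matrix
    iu = np.triu_indices_from(L, k=1)
    # Off-diagonal entries weighted by sqrt(2) so that the Euclidean distance
    # between vectors equals the log-Euclidean (Frobenius) distance.
    return np.concatenate([np.diag(L), np.sqrt(2) * L[iu]])

rng = np.random.default_rng(0)
set_a = rng.random((40, 10))                     # two synthetic image sets
set_b = rng.random((55, 10))
va = log_euclidean_vector(covariance_descriptor(set_a))
vb = log_euclidean_vector(covariance_descriptor(set_b))
dist = np.linalg.norm(va - vb)                   # Euclidean distance after mapping
```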

9.
A comparative study of locality preserving manifold learning algorithms   (Total citations: 1; self: 1; others: 0)
Locality preserving manifold learning follows a local-to-global strategy: by preserving the local geometric properties shared by the observation space and the intrinsic embedding space, it discovers the intrinsic low-dimensional manifold embedded in a high-dimensional Euclidean space. This paper analyzes the basic implementation framework of locality preserving manifold learning algorithms, compares the characteristics of several such algorithms in detail, and suggests several promising research topics.

10.
The traditional Isomap algorithm focuses only on analyzing the data at hand and does not provide a fast, direct mapping from the high-dimensional space to the low-dimensional space, so it cannot be used for feature extraction or high-dimensional data retrieval. To address this problem, this paper proposes a fast data retrieval algorithm based on Isomap. The algorithm quickly obtains the low-dimensional embedding coordinates of a new sample and uses these coordinates to retrieve the reference samples most similar to the input. Experiments on typical test sets show that the algorithm provides a fast mapping of new samples onto the low-dimensional manifold while preserving the neighborhood relations of the samples well.

11.
A comparative study of several manifold learning algorithms   (Total citations: 1; self: 0; others: 1)
Discovering meaningful low-dimensional embeddings of manifolds in high-dimensional data spaces is the main goal of manifold learning. At present, most manifold learning algorithms are used for nonlinear dimensionality reduction or data visualization, such as isometric mapping (Isomap), locally linear embedding (LLE), and Laplacian Eigenmaps. This paper experimentally analyzes and compares these three manifold learning algorithms in order to understand their characteristics and thereby support better dimensionality reduction and analysis of data.
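The same three methods can be compared side by side with scikit-learn; the S-curve data set and parameter values below are illustrative choices, not the experimental setup of the paper.

```python
# Sketch: run Isomap, LLE, and Laplacian Eigenmaps (SpectralEmbedding) on the
# same synthetic data and compare the 2-D embeddings. Parameters are not tuned.
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap, LocallyLinearEmbedding, SpectralEmbedding

X, color = make_s_curve(n_samples=1500, random_state=0)

methods = {
    "Isomap": Isomap(n_neighbors=12, n_components=2),
    "LLE": LocallyLinearEmbedding(n_neighbors=12, n_components=2),
    "Laplacian Eigenmaps": SpectralEmbedding(n_neighbors=12, n_components=2),
}
embeddings = {name: est.fit_transform(X) for name, est in methods.items()}
for name, Y in embeddings.items():
    print(name, Y.shape)
```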

12.
To effectively handle speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space, this paper proposes an adaptive supervised manifold learning algorithm based on locally linear embedding (LLE) for nonlinear dimensionality reduction, used to extract low-dimensional embedded data representations for phoneme recognition. The proposed method aims to maximize the interclass dissimilarity while minimizing the intraclass dissimilarity, in order to promote the discriminating power and generalization ability of the low-dimensional embedded data representations. The performance of the proposed method is compared with five well-known dimensionality reduction methods, i.e., principal component analysis, linear discriminant analysis, isometric mapping (Isomap), LLE, and the original supervised LLE. Experimental results on three benchmark speech databases, i.e., the Deterding database, the DARPA TIMIT database, and the ISOLET E-set database, demonstrate that the proposed method obtains promising performance on the phoneme recognition task, outperforming the other methods.
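The supervised element in SLLE-style methods is often implemented by inflating distances between samples from different classes before neighbors are selected; the sketch below shows only that adjustment (with an assumed mixing parameter), not the authors' full adaptive scheme.

```python
# Sketch of the supervised distance adjustment used by SLLE-style methods:
# add a penalty to pairwise distances whenever two samples come from
# different classes, so neighbors are preferentially drawn from the same
# class. alpha is an assumed mixing parameter in [0, 1].
import numpy as np
from scipy.spatial.distance import squareform, pdist

def supervised_distances(X, labels, alpha=0.3):
    D = squareform(pdist(X))
    different = (labels[:, None] != labels[None, :]).astype(float)
    return D + alpha * D.max() * different

rng = np.random.default_rng(0)
X = rng.random((100, 12))                        # placeholder acoustic features
labels = rng.integers(0, 3, size=100)            # placeholder phoneme labels
D_sup = supervised_distances(X, labels)
# D_sup can now drive neighbor selection in LLE or any graph-based embedding.
```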

13.
To effectively improve the performance of spoken emotion recognition, it is necessary to perform nonlinear dimensionality reduction for speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. In this paper, a new supervised manifold learning algorithm for nonlinear dimensionality reduction, called the modified supervised locally linear embedding algorithm (MSLLE), is proposed for spoken emotion recognition. MSLLE aims at enlarging the interclass distance while shrinking the intraclass distance in an effort to promote the discriminating power and generalization ability of low-dimensional embedded data representations. To compare the performance of MSLLE, not only three unsupervised dimensionality reduction methods, i.e., principal component analysis (PCA), locally linear embedding (LLE) and isometric mapping (Isomap), but also five supervised dimensionality reduction methods, i.e., linear discriminant analysis (LDA), supervised locally linear embedding (SLLE), local Fisher discriminant analysis (LFDA), neighborhood component analysis (NCA) and maximally collapsing metric learning (MCML), are used to perform dimensionality reduction on spoken emotion recognition tasks. Experimental results on two emotional speech databases, i.e., the spontaneous Chinese database and the acted Berlin database, confirm the validity and promising performance of the proposed method.

14.
A novel binning and learning framework is presented for analyzing and applying large data sets that have no explicit knowledge of distribution parameterizations and can only be assumed to be generated by underlying probability density functions (PDFs) lying on a nonparametric statistical manifold. For model discretization, a uniform sampling-based partition of the data space is used to bin flat-distributed data sets, while quantile-based binning is adopted for complexly distributed data sets to reduce the average number of under-smoothed bins in the histograms. The compactified histogram embedding is designed so that the Fisher–Riemannian structured multinomial manifold is compatible with the intrinsic geometry of the nonparametric statistical manifold, providing a computationally efficient model space for information-distance calculation between binned distributions. In particular, instead of seeking an optimal bin number for histogramming, we use multiple random partitions of the data space to embed the associated data sets onto a product multinomial manifold, integrating the complementary bin information with an information metric built from factor geodesic distances and further alleviating the over-smoothing problem. Using the metric on the embedded submanifold, we improve classical manifold learning and dimension estimation algorithms into metric-adaptive versions to facilitate lower-dimensional Euclidean embedding. The effectiveness of our method is verified by visualization of data sets drawn from known manifolds, visualization and recognition on a subset of the ALOI object database, and Gabor feature-based face recognition on the FERET database.
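On the multinomial manifold, the information (Fisher–Rao) geodesic distance between two binned distributions has the closed form d(p, q) = 2 arccos(Σ_i √(p_i q_i)). A minimal sketch of binning two samples and computing this distance is given below; the number of bins and the smoothing constant are arbitrary assumptions, and this is only the basic ingredient of the framework, not the full product-manifold construction.

```python
# Sketch: bin two 1-D samples into histograms over shared edges and compute
# the Fisher-Rao geodesic distance on the multinomial manifold,
#     d(p, q) = 2 * arccos( sum_i sqrt(p_i * q_i) ).
# Bin count and smoothing constant are arbitrary assumptions.
import numpy as np

def fisher_rao_distance(p, q, eps=1e-12):
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    bc = np.sum(np.sqrt(p * q))                  # Bhattacharyya coefficient
    return 2.0 * np.arccos(np.clip(bc, 0.0, 1.0))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 2000)
b = rng.normal(0.5, 1.2, 2000)
edges = np.linspace(-5, 5, 31)                   # 30 shared bins (assumed)
p, _ = np.histogram(a, bins=edges)
q, _ = np.histogram(b, bins=edges)
d = fisher_rao_distance(p, q)
```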

15.
This paper proposes unified video coding algorithms based on a 1D representation of isometric feature mapping (Isomap). First, distance-preserving 1D Isomap representations are generated, which can achieve a very high compression ratio. Next, embedding and reconstruction algorithms for the 1D Isomap representation are presented that can transform samples from a high-dimensional space to a low-dimensional space and vice versa. Then, dictionary learning algorithms for training samples are proposed to compress the input samples. Finally, a unified coding framework for diverse videos based on the 1D Isomap representation is built. The proposed methods make full use of correlations between internal and external videos, which are not considered by classical methods. Simulation experiments show that the proposed methods can obtain higher peak signal-to-noise ratios than standard High Efficiency Video Coding at similar bits-per-pixel levels in low bit rate situations.

16.
We consider the problems of clustering, classification, and visualization of high-dimensional data when no straightforward Euclidean representation exists. In this paper, we propose using the properties of information geometry and statistical manifolds to define similarities between data sets using the Fisher information distance. We show that this metric can be approximated using entirely nonparametric methods, as the parameterization and geometry of the manifold are generally unknown. Furthermore, by using multidimensional scaling methods, we are able to reconstruct the statistical manifold in a low-dimensional Euclidean space, enabling effective learning on the data. As a whole, we refer to our framework as Fisher information nonparametric embedding (FINE) and illustrate its use on practical problems, including a biomedical application and document classification.
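A toy version of a FINE-style pipeline might look like the sketch below, with kernel density estimates on a shared grid and the Hellinger distance standing in as one possible nonparametric approximation of the Fisher information distance; all of these concrete choices are illustrative assumptions rather than the paper's exact estimators.

```python
# Sketch of a FINE-style pipeline: estimate a density per data set with a KDE
# on a shared grid, approximate the Fisher information distance between
# densities by the Hellinger distance, and embed the resulting distance
# matrix with classical MDS. All concrete choices are illustrative.
import numpy as np
from scipy.stats import gaussian_kde

def hellinger(p, q, dx):
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2) * dx)

rng = np.random.default_rng(0)
datasets = [rng.normal(mu, 1.0, 300) for mu in np.linspace(0, 3, 8)]

grid = np.linspace(-5, 8, 400)
dx = grid[1] - grid[0]
densities = [gaussian_kde(d)(grid) for d in datasets]

n = len(densities)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = hellinger(densities[i], densities[j], dx)

# Classical MDS on the distance matrix (double centering + eigendecomposition).
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
w, V = np.linalg.eigh(B)
idx = np.argsort(w)[::-1][:2]
embedding = V[:, idx] * np.sqrt(np.maximum(w[idx], 0))
```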

17.
Traditional multimodal registration methods ignore the structural information of images and the spatial relations between pixels, and assume globally consistent intensities. To address this, this paper proposes a multimodal medical image registration algorithm on a Riemannian manifold. First, linear dynamic models are used to capture the nonlinear structure and local information of the high-dimensional image space. The parameterized dynamic models are then used to construct Lie group elements, which form a Riemannian manifold; the manifold is subsequently embedded into a high-dimensional reproducing kernel Hilbert space, and a similarity measure is learned in the kernel space. Experiments on simulated and clinical data show that the proposed algorithm outperforms the traditional mutual-information method and neighborhood-based similarity-measure learning in both rigid and affine registration accuracy.

18.
Image and video classification tasks often suffer from the problem of a high-dimensional feature space. How to discover meaningful, low-dimensional representations of such high-order, high-dimensional observations remains a fundamental challenge. In this paper, we present a unified framework for tensor-based dimensionality reduction, including a new tensor distance (TD) metric and a novel multilinear globality preserving embedding (MGPE) strategy. Unlike the traditional Euclidean distance, which is constrained by an orthogonality assumption, TD measures the distance between data points by considering the relationships among different coordinates of high-order data. To preserve the natural tensor structure in the low-dimensional space, MGPE works directly on the high-order form of the input data and employs an iterative strategy to learn the transformation matrices. To provide a faithful global representation for data sets, MGPE aims to preserve the distances between all pairs of data points. Based on the proposed TD metric and MGPE strategy, we further derive two algorithms, dubbed tensor distance based multilinear multidimensional scaling (TD-MMDS) and tensor distance based multilinear isometric embedding (TD-MIE). TD-MMDS finds the transformation matrices by preserving the TDs between all pairs of input data in the embedded space, while TD-MIE preserves all pairwise distances calculated according to TDs along shortest paths in the neighborhood graph. By integrating the tensor distance into tensor-based embedding, TD-MMDS and TD-MIE perform tensor-based dimensionality reduction throughout the whole learning procedure and achieve clear performance improvements on various standard data sets.

19.
This paper proposes a new approach to analyzing high-dimensional data sets using a low-dimensional manifold. The manifold-based approach provides a unified formulation for both learning from the input space and synthesizing back to it. The manifold learning method is designed to address two problems found in many existing algorithms. The first is local manifold distortion caused by the cost averaging of the global cost optimization during manifold learning. The second results from the unit-variance constraint generally used in spectral embedding methods, under which global metric information is lost. For out-of-sample data points, the proposed approach gives simple solutions for traversing between the input space and the feature space. In addition, the method can be used to estimate the underlying dimension and is robust to the choice of the number of neighbors. Experiments on both low-dimensional data and real image data illustrate the theory.

20.
High-dimensional data is involved in many fields of information processing. Sometimes, however, the intrinsic structure of these data can be described by a few degrees of freedom. To discover these degrees of freedom, or the low-dimensional nonlinear manifold underlying a high-dimensional space, many manifold learning algorithms have been proposed. Here we describe a novel algorithm, locally linear inlaying (LLI), which combines simple geometric intuitions and rigorously established optimality to compute the global embedding of a nonlinear manifold. Thanks to its divide-and-conquer strategy, LLI has several advantages. First, its time complexity is linear in the number of data points, so LLI can be implemented efficiently. Second, LLI overcomes problems caused by nonuniform sample distributions. Third, unlike existing algorithms such as isometric feature mapping (Isomap), local tangent space alignment (LTSA), and locally linear coordination (LLC), LLI is robust to noise. In addition, to evaluate embedding results quantitatively, two criteria, based on information theory and Kolmogorov complexity theory respectively, are proposed. Furthermore, we demonstrate the efficiency and effectiveness of our proposal on synthetic and real-world data sets.
