Similar literature
1.
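An Overview of Nonlinear Dimensionality Reduction Methods in Manifold Learning   Cited: 4 in total (self-citations: 1; citations by others: 3)
This paper reviews nonlinear dimensionality reduction methods in manifold learning in some detail and analyzes their respective strengths and weaknesses. Compared with traditional linear dimensionality reduction methods, these methods can uncover the intrinsic dimensionality of nonlinear high-dimensional data, which facilitates dimensionality reduction and data analysis. Finally, future research directions for nonlinear dimensionality reduction in manifold learning are outlined, with the aim of further broadening the application areas of manifold learning.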

2.
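Manifold learning and dimensionality reduction methods can effectively protect sensitive data. To address privacy protection in mining sensitive data on traffic accident black spots, a method is proposed that combines two algorithms, isometric transformation and differential manifolds, to improve the confidentiality of the original data. A rotation-based isometric transformation perturbs the data, and Laplacian Eigenmap then performs nonlinear dimensionality reduction on the high-dimensional data, perturbing the data further while preserving its intrinsic geometric structure. The method is applied effectively to privacy protection for traffic accident black spot data and also reduces the dimensionality of the original data, which facilitates subsequent data mining and analysis.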
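
The rotation-based isometric perturbation mentioned in this abstract can be illustrated with a small sketch. It is not the paper's implementation: the 2-D setting, the random angle, and the helper names (`random_rotation_2d`, `rotate`, `pairwise_distances`) are assumptions made for illustration. It only shows that a rotation changes the coordinates while leaving all pairwise distances intact, which is why distance-based mining still works on the perturbed data.

```python
import math
import random

def random_rotation_2d(rng):
    """Return a 2x2 rotation matrix for a random angle (an orthogonal map)."""
    theta = rng.uniform(0.0, 2.0 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rotate(points, rot):
    """Apply the rotation matrix to every 2-D point."""
    return [(rot[0][0] * x + rot[0][1] * y, rot[1][0] * x + rot[1][1] * y)
            for x, y in points]

def pairwise_distances(points):
    """Euclidean distances for all unordered pairs, in a fixed order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = [(rng.random(), rng.random()) for _ in range(6)]  # toy data, not real records
perturbed = rotate(original, random_rotation_2d(rng))

# Largest change in any pairwise distance caused by the rotation.
max_change = max(abs(a - b) for a, b in
                 zip(pairwise_distances(original), pairwise_distances(perturbed)))
```

Here `max_change` stays at floating-point noise level even though `perturbed` no longer matches `original` coordinate by coordinate.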

3.
In this paper we introduce a novel supervised manifold learning technique called Supervised Laplacian Eigenmaps (S-LE), which makes use of class label information to guide the procedure of non-linear dimensionality reduction by adopting the large margin concept. The graph Laplacian is split into two components, a within-class graph and a between-class graph, to better characterize the discriminant property of the data. Our approach has two important characteristics: (i) it adaptively estimates the local neighborhood surrounding each sample based on data density and similarity, and (ii) the objective function simultaneously maximizes the local margin between heterogeneous samples and pushes the homogeneous samples closer to each other. Our approach has been tested on several challenging face databases and compared with other linear and non-linear techniques, demonstrating its superiority. Although we have concentrated in this paper on the face recognition problem, the proposed approach could also be applied to other categories of objects characterized by large variations in their appearance (such as hand or body pose, for instance).
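
The split into a within-class graph and a between-class graph can be sketched as follows. This is a simplified illustration, not S-LE itself: it assumes a fixed neighborhood size `k` instead of the adaptive neighborhood estimate described in the abstract, and the function name `split_neighbor_graph` and the toy data are hypothetical.

```python
import math

def split_neighbor_graph(points, labels, k):
    """Build k-nearest-neighbor edges and split them by class agreement."""
    within, between = set(), set()
    for i, p in enumerate(points):
        # Indices of the other points, nearest first (Euclidean distance).
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        for j in order[:k]:
            edge = (min(i, j), max(i, j))  # undirected edge, stored once
            if labels[i] == labels[j]:
                within.add(edge)   # homogeneous pair: to be pulled together
            else:
                between.add(edge)  # heterogeneous pair: to be pushed apart
    return within, between

# Toy data: two classes on a line (assumed for illustration only).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
labels = ["a", "a", "b", "b"]
within, between = split_neighbor_graph(points, labels, k=2)
```

With this toy input, `within` holds the edges (0, 1) and (2, 3), and `between` holds (0, 2), (1, 2) and (1, 3). A graph Laplacian could then be formed from each edge set separately.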

4.
Recent years have witnessed great success of manifold learning methods in understanding the structure of multidimensional patterns. However, most of these methods operate in a batch mode and cannot be effectively applied when data are collected sequentially. In this paper, we propose a general incremental learning framework, capable of dealing with one or more new samples each time, for the so-called spectral embedding methods. In the proposed framework, the incremental dimensionality reduction problem reduces to an incremental eigen-problem of matrices. Furthermore, we present, using this framework as a tool, an incremental version of Hessian eigenmaps, the IHLLE method. Finally, we show several experimental results on both synthetic and real world datasets, demonstrating the efficiency and accuracy of the proposed algorithm.

5.
A well-designed graph plays a fundamental role in graph-based semi-supervised learning; however, the topological structure of a constructed neighborhood is unstable in most current approaches, since they are very sensitive to high-dimensional, sparse and noisy data. This generally leads to dramatic performance degradation. To deal with this issue, we developed a relative manifold based semi-supervised dimensionality reduction (RMSSDR) approach that uses the relative manifold to construct a better neighborhood graph with fewer short-circuit edges. Based on the relative cognitive law and manifold distance, a relative transformation is used to construct the relative space and the relative manifold. A relative transformation can improve the ability to distinguish between data points and reduce the impact of noise, so that it may be more intuitive, and the relative manifold can reflect the manifold structure more faithfully, since data sets commonly have a nonlinear structure. Specifically, RMSSDR makes full use of pairwise constraints, which define the edge weights of the neighborhood graph by minimizing the local reconstruction error, and can preserve the global and local geometric structures of the data set. The experimental results on face data sets demonstrate that RMSSDR outperforms the compared state-of-the-art methods in both classification performance and robustness.

6.
Figure 8 of this article shows YaleB and CMU PIE with incorrect legend titles: YaleB(Tr=1900,Te=514,NOC=100) should be YaleB(Tr=1900,Te=514,d=100) (Fig. 8(a)); TIE(Tr=1200,Te=2880,d=100) should be PIE(Tr=1200,Te=2880,d=100) (Fig. 8(b)). In Fig. 9, the legend keys and the legend texts are mismatched. The correct figure is illustrated as follows.

7.
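Existing mainstream nonlinear dimensionality reduction algorithms, such as SIE and Isomap, set their neighborhood parameter globally. Simulations show that, for data sets whose local manifold structure varies considerably, a globally uniform neighborhood parameter may fail to produce a reasonable embedding. An adaptive neighborhood selection algorithm based on local principal direction reconstruction is therefore presented. The algorithm first selects a neighborhood set for each reference point such that each neighborhood set lies approximately in the local principal linear subspace, and computes a set of basis vectors for each neighborhood set. It then uses the linear fitting error of the basis vectors with respect to each neighbor to judge how far that neighbor deviates from the principal linear subspace, and deletes the points that deviate most. Simulations show that adaptive neighborhood selection based on local principal direction reconstruction can effectively handle data sets whose local manifold structure varies considerably and, compared with existing adaptive neighborhood selection algorithms, better screens out isolated noise points near the reference point as well as spurious connectivity caused by large spatial curvature.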
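
The pruning step described above (fit a local principal subspace, then drop neighbors with a large fitting error) can be sketched in a much simplified form. The sketch assumes 2-D data, a one-dimensional principal subspace centered at the neighborhood mean, and a fixed threshold `max_residual`; these choices and the name `prune_neighbors` are illustrative assumptions, not the algorithm from the paper.

```python
import math

def prune_neighbors(neighbors, max_residual):
    """Keep neighbors lying close to the principal line fitted through the neighborhood."""
    n = len(neighbors)
    mx = sum(x for x, _ in neighbors) / n
    my = sum(y for _, y in neighbors) / n
    # Entries of the 2x2 scatter matrix of the centered neighborhood.
    sxx = sum((x - mx) ** 2 for x, _ in neighbors)
    syy = sum((y - my) ** 2 for _, y in neighbors)
    sxy = sum((x - mx) * (y - my) for x, y in neighbors)
    # Angle of the leading eigenvector, i.e. the local principal direction.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    kept = []
    for x, y in neighbors:
        # Perpendicular distance to the principal line, used as the fitting error.
        residual = abs(-(x - mx) * uy + (y - my) * ux)
        if residual <= max_residual:
            kept.append((x, y))
    return kept

# Four nearly collinear neighbors plus one point far off the line (toy data).
neighborhood = [(0.0, 0.0), (1.0, 0.02), (2.0, -0.01), (3.0, 0.01), (1.5, 1.0)]
kept = prune_neighbors(neighborhood, max_residual=0.5)
```

In this toy neighborhood the principal direction is the x-axis, so the four nearly collinear points are kept and the off-line point `(1.5, 1.0)` is removed. In higher dimensions the closed-form angle would be replaced by an eigendecomposition of the local covariance matrix.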

8.
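Subspace segmentation has long been an important class of machine learning methods, achieving good clustering accuracy in studies such as face recognition and gene expression data recognition. However, these methods struggle to give satisfactory results when clustering high-dimensional, small-sample data. To address this problem, a manifold-dimensionality-reduction least squares regression subspace segmentation method is proposed, drawing on locality preserving projections from manifold dimensionality reduction and on least squares regression subspace segmentation. The method first reduces dimensionality with locality preserving projections and then clusters using least squares regression subspace segmentation. Experiments on six biological gene expression data sets and two image data sets demonstrate the effectiveness of the method.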

9.
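A Survey of Incremental and Evolutionary Manifold Learning   Cited: 1 in total (self-citations: 0; citations by others: 1)
The goal of manifold learning is to discover the low-dimensional smooth manifold on which observed data lie, embedded in a high-dimensional data space. In recent years, discovering the intrinsic low-dimensional manifold structure online or incrementally has become a research hotspot in manifold learning. This paper surveys existing progress in the field from two perspectives: incremental learning and evolutionary learning. Compared with traditional batch manifold learning methods, incremental manifold learning can update dynamically and incrementally, while evolutionary manifold learning can discover the intrinsic regularities of massive dynamic data online, which facilitates dimensionality reduction and data analysis. The paper describes the basic principles and characteristics of the main incremental and evolutionary manifold learning algorithms, analyzes their respective strengths and weaknesses, identifies open problems in the field, and discusses directions for further research.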

10.
11.
12.
The curse of dimensionality has prompted intensive research in effective methods of mapping high dimensional data. Dimensionality reduction and subspace learning have been studied extensively and widely applied to feature extraction and pattern representation in image and vision applications. Although PCA has long been regarded as a simple, efficient linear subspace technique, many nonlinear methods such as kernel PCA, local linear embedding, and self-organizing networks have been proposed recently for dealing with increasingly complex nonlinear data. The intensive research in nonlinear methods often creates an impression that they are highly superior and preferred, although often only limited experiments were reported and the results were not tested for significance. In this paper, we systematically investigate and compare the capabilities of various linear and nonlinear subspace methods for face representation and recognition. The performances of these methods are analyzed and discussed along with statistical significance tests on the obtained results. The experiments on a range of data sets show that nonlinear methods do not always outperform linear ones, especially on data sets containing noise and outliers or having discontinuous or multiple submanifolds. Certain nonlinear methods with certain classifiers do yield consistently better performance than others. However, the differences among them are small and in most cases are not significant. A measure is used to quantify the nonlinearity of a data set in a subspace. It explains why good performance is achievable in reduced dimensions with a low degree of nonlinearity.

13.
The human heart is a complex system that reveals many clues about its condition in its electrocardiogram (ECG) signal, and ECG monitoring is the most important and efficient way of preventing heart attacks. ECG analysis and recognition are both important and tempting topics in modern medical research. The purpose of this paper is to develop an algorithm which investigates kernel methods, locally linear embedding (LLE), principal component analysis (PCA), and support vector machine (SVM) algorithms for dimensionality reduction, feature extraction, and classification for recognizing and classifying the given ECG signals. To do so, a nonlinear dimensionality reduction method based on kernel LLE is proposed to reduce the high dimensionality of the variational ECG signals, and the principal characteristics of the signals are extracted from the original database by means of PCA, each signal representing a single and complete heart beat. The SVM method is applied to classify the ECG data into several categories of heart disease. The experimental results demonstrated that the performance of the proposed method was similar to, and sometimes better than, that of other ECG recognition techniques, thus indicating a viable and accurate technique.

14.
Real-world applications of multivariate data analysis often stumble upon the barrier of interpretability. Simple data analysis methods are usually easy to interpret, but they risk providing poor data models. More involved methods may instead yield faithful data models, but limited interpretability. This is the case of linear and nonlinear methods for multivariate data visualization through dimensionality reduction. Even though the latter have provided some of the most exciting visualization developments, their practicality is hindered by the difficulty of explaining them in an intuitive manner. The interpretability, and therefore the practical applicability, of data visualization through nonlinear dimensionality reduction (NLDR) methods would improve if, first, we could accurately calculate the distortion introduced by these methods in the visual representation and, second, if we could faithfully reintroduce this distortion into such representation. In this paper, we describe a technique for the reintroduction of the distortion into the visualization space of NLDR models. It is based on the concept of density-equalizing maps, or cartograms, recently developed for the representation of geographic information. We illustrate it using Generative Topographic Mapping (GTM), a nonlinear manifold learning method that can provide both multivariate data visualization and a measure of the local distortion that the model generates. Although illustrated here with GTM, it could easily be extended to other NLDR visualization methods, provided a local distortion measure could be calculated. It could also serve as a guiding tool for interactive data visualization.

15.
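To address the problem that linear dimensionality reduction techniques give unsatisfactory results on data with nonlinear structure, a new nonlinear stochastic dimensionality reduction algorithm (NNSE) is proposed that focuses on preserving local nearest-neighbor information from the high-dimensional space. The algorithm first finds the nearest neighbors of each sample point in the high-dimensional space by computing Euclidean distances between sample points, and generates a random initial distribution in the low-dimensional space. It then repeatedly moves each sample point in the low-dimensional space toward the mean position of its nearest neighbors until a stable low-dimensional embedding is obtained. Compared with a state-of-the-art nonlinear stochastic dimensionality reduction algorithm, t-distributed stochastic neighbor embedding (t-SNE), NNSE produces low-dimensional results that differ little from those of t-SNE in terms of visualization, but a comparison of quantitative metrics shows that NNSE is clearly better than t-SNE at preserving nearest-neighbor information.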
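
The three steps named in this abstract (nearest-neighbor search in the high-dimensional space, random low-dimensional initialization, repeated moves toward the neighbors' mean position) can be sketched as below. This is only a reading of the abstract, not the published NNSE algorithm: the fixed step count `steps`, the step size `rate`, the synchronous update, and all function names are assumptions, and the real method's stopping rule and any safeguards against collapse are not given in the source.

```python
import math
import random

def nearest_neighbors(points, k):
    """For each point, return the indices of its k nearest neighbors (Euclidean)."""
    result = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        result.append(order[:k])
    return result

def mean_neighbor_distance(embedding, neighbors):
    """Average low-dimensional distance between each point and its high-dimensional neighbors."""
    total = sum(math.dist(embedding[i], embedding[j])
                for i, nbrs in enumerate(neighbors) for j in nbrs)
    return total / sum(len(nbrs) for nbrs in neighbors)

def embed(points, k=2, dim=2, steps=20, rate=0.1, seed=0):
    rng = random.Random(seed)
    neighbors = nearest_neighbors(points, k)
    # Random initial layout in the low-dimensional space.
    low = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in points]
    initial = [row[:] for row in low]
    for _ in range(steps):
        updated = []
        for i, nbrs in enumerate(neighbors):
            # Mean position of this point's neighbors in the current layout.
            target = [sum(low[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
            # Move a fraction `rate` of the way toward that mean.
            updated.append([low[i][d] + rate * (target[d] - low[i][d])
                            for d in range(dim)])
        low = updated  # synchronous update: all points move together
    return initial, low, neighbors

# Two well-separated pairs in 3-D (toy data).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
initial, final, neighbors = embed(points, k=1)
before = mean_neighbor_distance(initial, neighbors)
after = mean_neighbor_distance(final, neighbors)
```

With `k=1` each pair consists of mutual nearest neighbors, so every step shrinks the gap inside a pair by the factor `1 - 2 * rate`, and `after` ends up far smaller than `before`. A purely attractive update like this one would eventually merge connected points, which is why the fixed step count matters in the sketch.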

16.
A new quality assessment criterion for evaluating the performance of nonlinear dimensionality reduction (NLDR) methods is proposed in this paper. Unlike current quality assessment criteria, which focus on the local-neighborhood-preserving performance of NLDR methods, the proposed criterion capitalizes on a new aspect: their global-structure-holding performance. By taking both properties into consideration, the intrinsic capability of NLDR methods can be reflected more faithfully, and hence a more rational measure for the proper selection of NLDR methods in real-life applications can be offered. The theoretical argument is supported by experimental results on a series of benchmark data sets.
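
To make the distinction between local and global quality concrete, the sketch below computes one simple score of each kind. Neither is the criterion proposed in the paper, whose definition is not given in this abstract. The stand-ins are a k-nearest-neighbor overlap for the local view and a Pearson correlation of pairwise distances for the global view. The function names and toy data are hypothetical.

```python
import math

def pairwise(points):
    """Euclidean distances for all unordered pairs, in a fixed order."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

def knn_sets(points, k):
    """Set of the k nearest neighbor indices for every point."""
    sets = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        sets.append(set(order[:k]))
    return sets

def knn_preservation(high, low, k):
    """Local score: average fraction of each point's k nearest neighbors kept."""
    kept = sum(len(a & b) for a, b in zip(knn_sets(high, k), knn_sets(low, k)))
    return kept / (k * len(high))

def distance_pearson(high, low):
    """Global score: Pearson correlation of pairwise distances before and after."""
    dh, dl = pairwise(high), pairwise(low)
    mh, ml = sum(dh) / len(dh), sum(dl) / len(dl)
    cov = sum((a - mh) * (b - ml) for a, b in zip(dh, dl))
    var_h = sum((a - mh) ** 2 for a in dh)
    var_l = sum((b - ml) ** 2 for b in dl)
    return cov / math.sqrt(var_h * var_l)

high = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
good = [(0.0,), (1.0,), (2.0,), (3.0,)]  # keeps the ordering along the line
bad = [(0.0,), (3.0,), (1.0,), (2.0,)]   # scrambles the ordering

good_local = knn_preservation(high, good, k=2)
bad_local = knn_preservation(high, bad, k=2)
good_global = distance_pearson(high, good)
bad_global = distance_pearson(high, bad)
```

The faithful embedding `good` scores 1.0 on the local measure and essentially 1.0 on the global one, while the scrambled embedding `bad` scores lower on both. Reporting the two scores side by side reflects the abstract's point that local and global behavior should both be considered.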

17.
Pattern Recognition, 2014, 47(2): 758-768
Sentiment analysis, which detects the subjectivity or polarity of documents, is one of the fundamental tasks in text data analytics. Recently, the number of documents available online and offline has been increasing dramatically, and preprocessed text data have more features. This development makes effective analysis more complex. This paper proposes a novel semi-supervised Laplacian eigenmap (SS-LE). The SS-LE removes redundant features effectively by decreasing detection errors of sentiments. Moreover, it enables visualization of documents in a perceptible low-dimensional embedded space to provide a useful tool for text analytics. The proposed method is evaluated on a multi-domain review data set in sentiment visualization and classification by comparison with other dimensionality reduction methods. SS-LE provides a better similarity measure in the visualization result by separating positive and negative documents properly. Sentiment classification models trained on data reduced by SS-LE show higher accuracy. Overall, experimental results suggest that SS-LE has the potential to be used to visualize documents for ease of analysis and to train a predictive model in sentiment analysis. SS-LE can also be applied to any other partially annotated text data sets.

18.
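涂腾涛, 顾嗣扬. 《计算机应用》 (Journal of Computer Applications), 2008, 28(8): 2030-2032
A new supervised nonlinear kernel subspace method for face recognition is proposed. A supervision mechanism is introduced into the local neighborhood construction of the kernel neighborhood preserving projection method, making better use of the class information of the face training samples and improving recognition efficiency. A regularization constraint is introduced when computing the optimal reconstruction weight matrix, reducing its sensitivity to noise. In the experiments, the method was tested on the AT&T and Yale face databases with a nearest-neighbor classifier. The results show that the method is effective and achieves a higher recognition rate and greater robustness than the unsupervised KNPP method and traditional classical face recognition methods.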

19.
When performing visualization and classification, people often confront the problem of dimensionality reduction. Isomap is one of the most promising nonlinear dimensionality reduction techniques. However, when Isomap is applied to real-world data, it shows some limitations, such as being sensitive to noise. In this paper, an improved version of Isomap, namely S-Isomap, is proposed. S-Isomap utilizes class information to guide the procedure of nonlinear dimensionality reduction. This kind of procedure is called supervised nonlinear dimensionality reduction. In S-Isomap, the neighborhood graph of the input data is constructed according to a certain kind of dissimilarity between data points, which is specially designed to integrate the class information. The dissimilarity has several good properties which help to discover the true neighborhood of the data and thus make S-Isomap a robust technique for both visualization and classification, especially for real-world problems. In the visualization experiments, S-Isomap is compared with Isomap, LLE, and WeightedIso. The results show that S-Isomap performs the best. In the classification experiments, S-Isomap is used as a preprocessing step for classification and compared with Isomap, WeightedIso, as well as some other well-established classification methods, including the K-nearest neighbor classifier, BP neural network, J4.8 decision tree, and SVM. The results reveal that S-Isomap outperforms Isomap and WeightedIso in classification, and it is highly competitive with those well-known classification methods.

20.
A distance-preserving method is presented to map high-dimensional data sequentially to low-dimensional space. It preserves exact distances of each data point to its nearest neighbor and to some other near neighbors. Intrinsic dimensionality of data is estimated by examining the preservation of interpoint distances. The method has no user-selectable parameter. It can successfully project data when the data points are spread among multiple clusters. Results of experiments show its usefulness in projecting high-dimensional data.
