Similar Documents
 20 similar documents found (search time: 515 ms)
1.
Studying Manifold Learning Algorithms via the Amplification Factor and Extending Direction   Cited by: 16 (self-citations: 0, other citations: 16)
何力, 张军平, 周志华 《计算机学报》 (Chinese Journal of Computers), 2005, 28(12): 2000-2009
Manifold learning is a new class of unsupervised learning methods that can effectively discover the intrinsic dimensionality of high-dimensional nonlinear data sets and perform dimensionality reduction; in recent years it has attracted growing attention from researchers in machine learning and cognitive science. Although many effective manifold learning algorithms have appeared, such as Isometric Mapping (ISOMAP) and Locally Linear Embedding (LLE), the quantitative relationship between the high-dimensional data in the observation space and the reduced low-dimensional data remains difficult to analyze intuitively. This hinders both a deeper investigation of the intrinsic regularities of the data and an intuitive comparison of the reduction quality of different manifold learning algorithms. This paper proposes a method that reveals the connection between the high-dimensional observed data and the reduced low-dimensional data in terms of two aspects: the amplification factor and the extending direction. The performance of two well-known manifold learning algorithms (ISOMAP and LLE) is compared, some meaningful conclusions are drawn, and corresponding algorithms are proposed to realize the theory. Experiments on several data sets demonstrate the validity and significance of this research.
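The amplification factor described above can be estimated numerically as the ratio of distances between nearby points before and after an embedding. A minimal sketch, in which the mapping `f` and the sample points are illustrative assumptions rather than anything from the paper:

```python
import math

def amplification_factor(f, x, delta):
    """Estimate how much a mapping f stretches space near point x
    along the direction of the (small) perturbation delta."""
    x_shifted = [xi + di for xi, di in zip(x, delta)]
    stretched = math.dist(f(x), f(x_shifted))  # distance after mapping
    original = math.dist(x, x_shifted)         # distance before mapping
    return stretched / original

# Illustrative mapping: scales the first coordinate by 3.
f = lambda p: [3.0 * p[0], p[1]]
print(amplification_factor(f, [1.0, 1.0], [1e-6, 0.0]))  # ~3.0 along x
print(amplification_factor(f, [1.0, 1.0], [0.0, 1e-6]))  # ~1.0 along y
```

Probing several directions `delta` at the same point recovers the extending direction: the direction in which the factor is largest.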

2.
An orthogonal neighborhood-preserving algorithm for dimensionality reduction and classification   Cited by: 1 (self-citations: 0, other citations: 1)
To address the problems that Neighborhood Preserving Embedding (NPE) is sensitive to the reduced dimensionality and that its performance depends on a correct estimate of that dimensionality, an orthogonalized neighborhood-preserving embedding method, ONPE, is proposed. ONPE builds an adjacency graph from the neighbor relations among data points; assuming each data point can be represented as a linear combination of its neighbors, it extracts the local geometric information of the data, preserves it during reduction, and iteratively computes orthogonal bases to obtain the low-dimensional embedding coordinates. On the basis of ONPE, a classification algorithm, ONPC, is further proposed that uses linear neighborhood propagation (LNP) of labels in the low-dimensional space: it assumes that the local neighbor relations of the high-dimensional space are preserved after reduction and that the class of a data point can be obtained from the classes of its neighbors. Experiments on synthetic data and face data show that the algorithm reduces the dependence on the target dimensionality while effectively improving the classification performance of NPE.
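The neighbor-based label propagation idea above (a point's class inferred from its nearest labeled neighbors) can be sketched as follows; the toy data and the simple nearest-labeled-neighbor rule are illustrative assumptions, not the LNP formulation from the abstract:

```python
import math

def propagate_labels(points, labels, k=1, iters=10):
    """Iteratively assign each unlabeled point (label None) the majority
    label among its k nearest currently-labeled neighbors."""
    labels = list(labels)
    for _ in range(iters):
        for i, p in enumerate(points):
            if labels[i] is not None:
                continue
            cand = [(math.dist(p, q), labels[j])
                    for j, q in enumerate(points)
                    if j != i and labels[j] is not None]
            cand.sort(key=lambda t: t[0])
            votes = [lab for _, lab in cand[:k]]
            if votes:
                labels[i] = max(set(votes), key=votes.count)
    return labels

# Two clusters, one seed label each; labels spread to each cluster.
pts = [(0, 0), (0.1, 0), (0.2, 0.1), (5, 5), (5.1, 5), (5.2, 5.1)]
seed = ["a", None, None, "b", None, None]
print(propagate_labels(pts, seed))  # ['a', 'a', 'a', 'b', 'b', 'b']
```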

3.
An improved locally linear embedding algorithm and its application   Cited by: 1 (self-citations: 0, other citations: 1)
The locally linear embedding algorithm (LLE) usually measures similarity between samples with the Euclidean distance, but for high-dimensional data lying on a low-dimensional manifold, the Euclidean distance cannot capture the relative positions of two points on the manifold. An LLE variant based on the geodesic rank-order distance (GRDLLE) is proposed. A shortest-path algorithm (Dijkstra's algorithm) is applied to approximate the geodesic distance between any two samples by the shortest-path length, and the rank-order distance computed from it serves as the similarity measure in LLE. GRDLLE is compared with other improved LLE-style manifold learning algorithms and with 2DPCA on the ORL and Yale data sets; the face recognition rate improves after reducing the data with GRDLLE, showing that GRDLLE achieves a good dimensionality reduction effect.
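Approximating geodesic distances by Dijkstra shortest paths over a neighborhood graph, as the abstract describes, can be sketched like this; the tiny quarter-circle data set and the 2-nearest-neighbor graph are illustrative choices:

```python
import heapq, math

def knn_graph(points, k=2):
    """Undirected k-nearest-neighbor graph with Euclidean edge weights."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph

def geodesic(graph, src):
    """Dijkstra: shortest-path (approximate geodesic) distances from src."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Points along a quarter circle: the along-manifold (geodesic) distance
# between the endpoints exceeds their straight-line Euclidean distance.
pts = [(math.cos(t), math.sin(t))
       for t in [i * math.pi / 8 for i in range(5)]]
g = geodesic(knn_graph(pts), 0)
print(g[4] > math.dist(pts[0], pts[4]))  # True
```

Converting these geodesic distances to ranks (the rank-order distance in the abstract) then needs only a sort per source point.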

4.
Isometric mapping (ISOMAP) is a representative nonlinear manifold dimensionality reduction algorithm; it reduces dimensionality while preserving, as far as possible, the correspondence between geodesic distances in the high-dimensional data and spatial distances in the low-dimensional data. However, ISOMAP is sensitive to noise, so the reduced data may fail to preserve the high-dimensional topology. To address this, an isometric mapping algorithm based on optimal density directions (ODD-ISOMAP) is proposed. The algorithm determines the optimal density direction of each data point along the manifold by screening the point's natural neighbors, then rescales local neighborhood distances according to the angle, direction, and length of the projection of each neighbor vector onto that optimal density direction, guiding the geodesic distance computation along the manifold and thereby reducing sensitivity to noise. To verify effectiveness, two synthetic and five real data sets were chosen as test sets and reduced with ISOMAP, LLE, HLLE, LTSA, LEIGS, PCA, and ODD-ISOMAP, and the reduced data were clustered with K-medoids. The algorithms were evaluated by comparing clustering accuracy and its sensitivity to noise of different magnitudes. The results show that ODD-ISOMAP improves the reduction quality markedly over the other six common algorithms and is more resistant to noise.

5.
Manifold learning has become a research focus in machine learning and data mining. For example, Locally Linear Embedding (LLE), a nonlinear dimensionality reduction algorithm with good generalization performance, is widely applied to image classification and object recognition, but it assumes that the data lie on a single manifold. MM-LLE (Multiple Manifold Locally Linear Embedding), an improved algorithm that considers the multi-manifold case, still has several shortcomings. An improved MM-LLE algorithm is therefore proposed that raises classification accuracy by combining the local low-dimensional manifolds between every pair of classes and building a classifier on them; the original algorithm's method for computing the optimal dimensionality is improved as well. Comparisons of classification accuracy with ISOMAP, LLE, and MM-LLE verify the effectiveness of the improved algorithm.

6.
The locally linear embedding algorithm (LLE) is one of the important nonlinear dimensionality reduction methods in manifold learning. Since data points are mostly distributed non-uniformly, LLE's way of selecting nearest neighbors can lose a great deal of information. To remedy this, an improved locally linear embedding algorithm based on the tightness of data points, tLLE, is proposed; for data sets with non-uniformly distributed points, tLLE reduces dimensionality effectively and achieves better reduction quality than LLE. Embedding and classification results on synthetic and real-world data demonstrate the effectiveness of tLLE.
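The local linear reconstruction at the heart of LLE (each point expressed as a weighted combination of its neighbors, with weights summing to one) can be sketched for a single point with two neighbors; the closed-form 2x2 solve and the regularization constant below are illustrative, not tLLE's tightness-based selection:

```python
def lle_weights_2nn(x, n1, n2, reg=1e-9):
    """Reconstruction weights for point x from two neighbors n1, n2:
    minimize |x - w1*n1 - w2*n2|^2 subject to w1 + w2 = 1,
    via the regularized 2x2 local Gram system C w = 1."""
    z1 = [a - b for a, b in zip(n1, x)]   # neighbors centered on x
    z2 = [a - b for a, b in zip(n2, x)]
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    c11 = dot(z1, z1) + reg
    c12 = dot(z1, z2)
    c22 = dot(z2, z2) + reg
    det = c11 * c22 - c12 * c12
    w1 = (c22 - c12) / det               # solve C w = [1, 1]
    w2 = (c11 - c12) / det
    s = w1 + w2                          # enforce the sum-to-one constraint
    return w1 / s, w2 / s

# A point midway between its two neighbors gets equal weights.
print(lle_weights_2nn((0.5, 0.0), (0.0, 0.0), (1.0, 0.0)))  # (0.5, 0.5)
```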

7.
A sparse local Fisher discriminant analysis (SLFDA) is proposed. Building on the dimensionality reduction of local Fisher discriminant analysis, the algorithm introduces sparsity preserving projections through a balance parameter, so that the global geometric structure and the local neighborhood information of the data are both preserved during the projection. Experiments on UCI data sets and the YaleB face data set show that the algorithm combines the advantages of local Fisher discriminant analysis and sparsity preserving projections; compared with existing semi-supervised local Fisher discriminant analysis reduction algorithms, it improves the accuracy of classification based on the shortest Euclidean distance.

8.
The LLE algorithm in manifold learning can reduce high-dimensional data to a low-dimensional manifold subspace while preserving local neighborhood structure, yielding a set of embedded vectors whose local structure resembles that of the original sample set. However, LLE ignores class information during reduction. To address this, an improved supervised locally linear embedding algorithm (MSLLE) is proposed, and its results are compared against LLE in MATLAB experiments. The experiments show that MSLLE preserves the internal structure of the data points better than LLE.
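A common way to make LLE supervised, in the spirit of the abstract above, is to inflate distances between points of different classes before selecting neighbors, so neighbor selection prefers same-class points. The adjustment rule and the factor `alpha` below are illustrative assumptions, not necessarily the MSLLE formulation:

```python
import math

def supervised_dist(p, q, label_p, label_q, alpha=0.5, max_d=1.0):
    """Euclidean distance, inflated by a fraction of the data diameter
    (max_d) when the two points belong to different classes."""
    d = math.dist(p, q)
    return d if label_p == label_q else d + alpha * max_d

same = supervised_dist((0, 0), (1, 0), "a", "a")
diff = supervised_dist((0, 0), (1, 0), "a", "b")
print(same, diff)  # 1.0 1.5
```

With `alpha = 0` this reduces to plain LLE neighbor selection; larger `alpha` makes neighborhoods increasingly class-pure.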

9.
The locally linear embedding algorithm (LLE) suits many dimensionality reduction problems thanks to its low computational complexity and efficiency. A new adaptive locally linear embedding (ALLE) algorithm reduces data nonlinearly, extracts the essential features of high-dimensional data, and preserves the global geometric structure of the data. Comparative experiments show that its reduction results on non-ideal data are consistently better than those of LLE.

10.
LLE is an effective nonlinear dimensionality reduction method, but data reduced directly by LLE are much smaller in scale than the original data and do not preserve the data's overall characteristics. Building on LLE, this paper proposes an LLE variant that preserves the overall structure of the original manifold; experiments show that the algorithm preserves the scale of the original manifold while reducing dimensionality, and is an effective improvement on LLE.

11.
We propose a novel supervised dimensionality reduction method named local tangent space discriminant analysis (TSD) which is capable of utilizing the geometrical information from tangent spaces. The proposed method aims to seek an embedding space where the local manifold structure of the data belonging to the same class is preserved as much as possible, and the marginal data points with different class labels are better separated. Moreover, TSD has an analytic form of the solution and can be naturally extended to non-linear dimensionality reduction through the kernel trick. Experimental results on multiple real-world data sets demonstrate the effectiveness of the proposed method.

12.
The problem of learning from both labeled and unlabeled data is considered. In this paper, we present a novel semisupervised multimodal dimensionality reduction (SSMDR) algorithm for feature reduction and extraction. SSMDR can preserve the local and multimodal structures of labeled and unlabeled samples. As a result, data pairs that lie in close vicinity in the original space are projected close to one another in the embedding space. Due to overfitting, supervised dimensionality reduction methods tend to perform poorly when only a few labeled samples are available. In such cases, unlabeled samples play a significant role in boosting the learning performance. The proposed discriminant technique has an analytical form of the embedding transformations that can be effectively obtained by applying the eigen decomposition, or finding two close optimal sets of transforming basis vectors. By employing the standard kernel trick, SSMDR can be extended to nonlinear dimensionality reduction scenarios. We verify the feasibility and effectiveness of SSMDR through extensive simulations, including data visualization and classification, on synthetic and real-world datasets. Our results reveal that SSMDR offers significant advantages over some widely used techniques. Compared with other methods, the proposed SSMDR exhibits superior performance on multimodal cases.

13.
Within the Semi-Supervised Dimensionality Reduction (SSDR) framework, a semi-supervised reduction algorithm based on pairwise constraints, SCSSDR, is proposed. A graph is constructed from the constrained sample pairs so that the global structure of the data is respected while local structure is preserved; by optimizing the objective function, same-class samples become more compact and different-class samples more scattered. Quantitative analysis on UCI data sets shows that the method outperforms PCA and classical manifold learning algorithms, and further classification experiments on UCI and hyperspectral data sets show that it is well suited to feature extraction for classification purposes.
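The role of pairwise constraints above (must-link pairs pulled together, cannot-link pairs pushed apart) can be illustrated by scoring candidate 1-D projection directions; the scoring rule and toy data are illustrative sketches, not the SCSSDR objective itself:

```python
import math

def constraint_score(points, direction, must_link, cannot_link):
    """Sum of squared projected distances over must-link pairs minus the
    same sum over cannot-link pairs; lower means the projection keeps
    same-class pairs compact and different-class pairs scattered."""
    norm = math.hypot(*direction)
    proj = [(p[0] * direction[0] + p[1] * direction[1]) / norm
            for p in points]
    gap = lambda i, j: (proj[i] - proj[j]) ** 2
    return (sum(gap(i, j) for i, j in must_link)
            - sum(gap(i, j) for i, j in cannot_link))

# Two classes separated along x: projecting onto the x axis keeps
# must-link pairs close and cannot-link pairs apart, so it scores
# lower than projecting onto y.
pts = [(0, 0), (0.2, 1), (5, 0.1), (5.2, 0.9)]
ml = [(0, 1), (2, 3)]     # same-class pairs
cl = [(0, 2), (1, 3)]     # different-class pairs
sx = constraint_score(pts, (1, 0), ml, cl)
sy = constraint_score(pts, (0, 1), ml, cl)
print(sx < sy)  # True
```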

14.
In this paper, a novel unsupervised dimensionality reduction algorithm, unsupervised Globality-Locality Preserving Projections in Transfer Learning (UGLPTL), is proposed, based on the conventional Globality-Locality Preserving dimensionality reduction algorithm (GLPP), which does not work well in real-world Transfer Learning (TL) applications. In TL applications, one application (source domain) contains sufficient labeled data, but the related application (target domain) contains only unlabeled data. Compared to existing TL methods, our proposed method incorporates all the objectives, such as minimizing the marginal and conditional distributions between both domains, maximizing the variance of the target domain, and performing Geometrical Diffusion on Manifolds, all of which are essential for transfer learning applications. UGLPTL seeks a projection vector that projects the source and target domain data into a common subspace where both the labeled source data and the unlabeled target data can be utilized to perform dimensionality reduction. Comprehensive experiments have verified that the proposed method outperforms many state-of-the-art non-transfer-learning and transfer-learning methods on two popular real-world cross-domain visual transfer learning data sets. Our proposed UGLPTL approach achieved 82.18% and 87.14% mean accuracies over all the tasks of the PIE Face and Office-Caltech data sets, respectively.

15.
The problem of dimensionality reduction is to map data from high dimensional spaces to low dimensional spaces. In the process of dimensionality reduction, the data structure, which is helpful to discover the latent semantics and simultaneously respect the intrinsic geometric structure, should be preserved. In this paper, to discover a low-dimensional embedding space with the nature of structure preservation and basis compactness, we propose a novel dimensionality reduction algorithm, called Structure Preserving Non-negative Matrix Factorization (SPNMF). In SPNMF, three kinds of constraints, namely local affinity, distant repulsion, and embedding basis redundancy elimination, are incorporated into the NMF framework. SPNMF is formulated as an optimization problem and solved by an effective iterative multiplicative update algorithm. The convergence of the proposed update solutions is proved. Extensive experiments on both synthetic data and six real world data sets demonstrate the encouraging performance of the proposed algorithm in comparison to the state-of-the-art algorithms, especially some related works based on NMF. Moreover, the convergence of the proposed updating rules is experimentally validated.
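The iterative multiplicative updates mentioned above follow the classic NMF scheme (H <- H * (W^T V) / (W^T W H), and symmetrically for W). The sketch below implements plain NMF without SPNMF's three extra constraints, with a small epsilon added for numerical safety; it is a baseline illustration, not the paper's algorithm:

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Plain NMF via multiplicative updates: V ~ W H with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(m)] for i in range(rank)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(rank)] for i in range(n)]
    return W, H

def reconstruction_error(V, W, H):
    WH = matmul(W, H)
    return sum((V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0])))

V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]   # an exactly rank-1 matrix
W, H = nmf(V, rank=1)
print(reconstruction_error(V, W, H) < 1e-3)  # near-exact factorization
```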

16.
Similarity search usually encounters a serious problem in the high-dimensional space, known as the “curse of dimensionality.” In order to speed up the retrieval efficiency, most previous approaches reduce the dimensionality of the entire data set to a fixed lower value before building indexes (referred to as global dimensionality reduction (GDR)). More recent works focus on locally reducing the dimensionality of data to different values (called the local dimensionality reduction (LDR)). In addition, random projection is proposed as an approximate dimensionality reduction (ADR) technique to answer the approximate similarity search instead of the exact one. However, so far little work has formally evaluated the effectiveness and efficiency of GDR, LDR, and ADR for the range query. Motivated by this, in this paper, we propose general cost models for evaluating the query performance over the reduced data sets by GDR, LDR, and ADR, in light of which we introduce a novel (A)LDR method, Partitioning based on RANdomized Search (PRANS). It can achieve high retrieval efficiency with the guarantee of optimality given by the formal models. Finally, a B+-tree index is constructed over the reduced partitions for fast similarity search. Extensive experiments validate the correctness of our cost models on both real and synthetic data sets and demonstrate the efficiency and effectiveness of the proposed PRANS method.
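The approximate dimensionality reduction (ADR) by random projection mentioned above can be sketched with a Gaussian projection matrix; the dimensions and the 1/sqrt(out_dim) scaling follow the usual Johnson-Lindenstrauss style and are illustrative choices:

```python
import math, random

def random_projection(vectors, out_dim, seed=0):
    """Project vectors to out_dim dimensions with a random Gaussian
    matrix scaled by 1/sqrt(out_dim), so pairwise Euclidean distances
    are roughly preserved in expectation."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    R = [[rng.gauss(0, 1) / math.sqrt(out_dim) for _ in range(in_dim)]
         for _ in range(out_dim)]
    return [[sum(r * x for r, x in zip(row, v)) for row in R]
            for v in vectors]

# Three 100-dimensional points reduced to 10 dimensions.
rng = random.Random(1)
vs = [[rng.random() for _ in range(100)] for _ in range(3)]
low = random_projection(vs, 10)
print(len(low), len(low[0]))  # 3 10
```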

17.
马宗杰, 刘华文 《计算机应用》 (Journal of Computer Applications), 2014, 34(7): 2058-2060
To handle the label correlations and high dimensionality of multi-label data, a multi-label classification algorithm based on singular value decomposition and partial least squares regression is proposed, which performs both dimensionality reduction and regression analysis on multi-label data. First, the set of class labels is treated as a whole so that label correlations are taken into account; next, singular value decomposition (SVD) is used to obtain score vectors for the sample and label spaces, carrying out the reduction; finally, a multi-label classification model is built on top of partial least squares regression (PLSR). Experimental results on four high-dimensional real-world data sets show that the algorithm obtains effective classification results.

18.
Principal component analysis (PCA) is a widely used dimensionality reduction method. However, classical PCA is built on the L2 norm, which makes it sensitive to outliers and noise, and it is not sparse. To address this, a sparse principal component analysis method based on the Lp norm (LpSPCA) is proposed. LpSPCA maximizes the Lp-norm sample variance with a sparsity regularizer, ensuring both sparsity and robustness during reduction. LpSPCA can be solved by a simple iterative algorithm, and the convergence of this algorithm is theoretically guaranteed when p >= 1. Moreover, by choosing different values of p, LpSPCA applies to a wider range of data types. Experimental results on synthetic and face data show that LpSPCA not only reduces dimensionality well but also resists noise strongly.
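For the classical L2 case that LpSPCA generalizes, the leading principal component can be found by power iteration on the covariance matrix; the 2-D data and iteration count below are illustrative:

```python
import math

def first_pc(points, iters=100):
    """Leading principal component of 2-D data via power iteration
    on the (unnormalized) covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    xs = [p[0] - mx for p in points]          # center the data
    ys = [p[1] - my for p in points]
    cxx = sum(x * x for x in xs)
    cxy = sum(x * y for x, y in zip(xs, ys))
    cyy = sum(y * y for y in ys)
    v = (1.0, 0.0)
    for _ in range(iters):                    # v <- C v / |C v|
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(*w)
        v = (w[0] / norm, w[1] / norm)
    return v

# Data spread along the line y = x: the first PC is ~(0.707, 0.707).
pts = [(0, 0), (1, 1.1), (2, 1.9), (3, 3.05)]
pc = first_pc(pts)
print(round(abs(pc[0]), 2), round(abs(pc[1]), 2))  # 0.71 0.71
```

LpSPCA replaces the L2 objective maximized here with an Lp-norm variance plus a sparsity term, so this sketch is the baseline being improved upon, not the paper's method.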

19.
In this paper, we propose a source localization algorithm based on a sparse Fast Fourier Transform (FFT)-based feature extraction method and spatial sparsity. We represent the sound source positions as a sparse vector by discretely segmenting the space with a circular grid. The location vector is related to microphone measurements through a linear equation, which can be estimated at each microphone. For this linear dimensionality reduction, we have utilized a Compressive Sensing (CS) and two-level FFT-based feature extraction method which combines two sets of audio signal features and covers both short-time and long-time properties of the signal. The proposed feature extraction method leads to a sparse representation of audio signals. As a result, a significant reduction in the dimensionality of the signals is achieved. In comparison to the state-of-the-art methods, the proposed method improves the accuracy while the complexity is reduced in some cases.

20.
郭喻栋, 郭志刚, 陈刚, 魏晗 《计算机应用》 (Journal of Computer Applications), 2017, 37(9): 2665-2670
To address the high dimensionality of rating features, the slow k-nearest-neighbor search, and the rating cold-start problem in k-nearest-neighbor collaborative filtering, a kNN collaborative filtering recommendation algorithm based on dimensionality reduction and Exact Euclidean Locality-Sensitive Hashing (E2LSH) is proposed. First, rating data, user attribute data, and item category data are fused, and the fused data are used to train a Stacked Denoising Autoencoder (SDA) neural network; the values of the last hidden layer of the encoder serve as the feature encoding of the input, completing the nonlinear dimensionality reduction. Then, E2LSH builds an index over the reduced data, and retrieval yields the near neighbors of a target user or target item. Finally, the similarity between the target and its neighbors is computed and used to weight the neighbors' rating records, producing the predicted rating of the target user for the target item. Experimental results on standard data sets show that in cold-start scenarios the root mean square error is on average about 7.2% lower than that of the LSH-based recommendation algorithm LSH-ICF, with average running time comparable to LSH-ICF, indicating that the method alleviates the rating cold-start problem while preserving recommendation efficiency.
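The E2LSH indexing step above relies on hash functions of the form h(v) = floor((a.v + b) / w) with Gaussian a and uniform b, so that nearby vectors tend to fall into the same bucket. A minimal sketch of one such hash function, with illustrative parameters:

```python
import math, random

def make_e2lsh_hash(dim, w=4.0, seed=0):
    """One E2LSH hash function h(v) = floor((a.v + b) / w), where a has
    i.i.d. Gaussian entries and b is uniform on [0, w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(dim)]
    b = rng.uniform(0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_e2lsh_hash(dim=3)
p = [1.0, 2.0, 3.0]
q = [1.01, 2.0, 3.0]   # a very close point
print(h(p), h(q))      # close points usually land in the same bucket
```

A full index concatenates several such functions into a composite key and keeps multiple independent tables, trading memory for recall.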


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号