Similar literature
 Found 19 similar documents (search time: 109 ms)
1.
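Semi-supervised regularization learning   Total citations: 1 (self-citations: 1, citations by others: 0)
This paper studies semi-supervised linear dimensionality reduction. Unlike traditional supervised dimensionality reduction algorithms, semi-supervised algorithms use side information and a large number of unlabeled samples to achieve better generalization performance. Within the semi-supervised framework, the goal of this paper is to learn a smooth, discriminative subspace. Specifically, cannot-link pairwise constraints are used to maximize the distance between samples of different classes, and must-link pairwise constraints are used to minimize the distance between samples of the same class; at the same time, the geometric structure of the unlabeled samples and the feature structure of the projection vectors serve as regularization terms that guide the dimensionality reduction. In addition, the proposed algorithm handles the out-of-sample problem easily. Experimental results verify the effectiveness of the new algorithm.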
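
The abstract above does not give its objective function. As a rough illustration only, one common way to write an objective of this kind is sketched below. All symbols are assumptions introduced here and are not taken from the paper: W is the projection matrix, x_i are samples, C and M are the cannot-link and must-link pair sets, L is a graph Laplacian built from the unlabeled samples, X is the data matrix, and \lambda_1, \lambda_2 are regularization weights.

```latex
\max_{W}\;
\sum_{(i,j)\in C} \bigl\| W^{\top}(x_i - x_j) \bigr\|^{2}
\;-\; \sum_{(i,j)\in M} \bigl\| W^{\top}(x_i - x_j) \bigr\|^{2}
\;-\; \lambda_1 \,\operatorname{tr}\!\bigl( W^{\top} X L X^{\top} W \bigr)
\;-\; \lambda_2 \,\| W \|_F^{2}
```

Under this assumed form, the first two terms push cannot-link pairs apart and pull must-link pairs together, while the last two terms play the role of the regularizers. Because the map is linear, a new sample x is embedded as W^{\top} x, which is one way an out-of-sample problem can be handled.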

2.
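Semi-supervised clustering uses supervision information about the samples to improve the performance of unsupervised learning. In semi-supervised clustering, pairwise constraints (must-link and cannot-link constraints) are widely used as prior knowledge about the samples. Agglomerative hierarchical clustering (AHC), also called merging clustering, is one type of hierarchical clustering. A semi-supervised agglomerative hierarchical clustering algorithm based on pairwise constraints (PS-AHC) is proposed; it uses the pairwise constraints to change the distances between clusters so that these distances better reflect the true structure. Experiments on UCI data sets show that PS-AHC effectively improves clustering accuracy and is a promising semi-supervised clustering algorithm.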
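
The abstract does not say how PS-AHC changes the inter-cluster distances. The following is a minimal sketch of one simple possibility, not the authors' algorithm: must-link pairs get distance 0, cannot-link pairs get infinite distance, and complete-linkage merging runs on the adjusted matrix. The function names `adjust_distances` and `constrained_ahc` are hypothetical.

```python
from itertools import combinations

INF = float("inf")

def adjust_distances(dist, must_link, cannot_link):
    """Return a copy of dist with constrained pairs overridden (assumed rule)."""
    adj = [row[:] for row in dist]
    for i, j in must_link:
        adj[i][j] = adj[j][i] = 0.0   # pull must-link pairs together
    for i, j in cannot_link:
        adj[i][j] = adj[j][i] = INF   # forbid merging cannot-link pairs
    return adj

def constrained_ahc(dist, k, must_link=(), cannot_link=()):
    """Complete-linkage agglomerative clustering on the adjusted distances."""
    adj = adjust_distances(dist, must_link, cannot_link)
    clusters = [[i] for i in range(len(adj))]

    def linkage(a, b):
        # Complete linkage: the largest pairwise distance between two clusters.
        return max(adj[i][j] for i in a for j in b)

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[a], clusters[b]) == INF:
            break  # every remaining merge would join a cannot-link pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]

# Four points on a line at positions 0, 1, 5, 6.
pos = [0, 1, 5, 6]
dist = [[abs(p - q) for q in pos] for p in pos]
plain = constrained_ahc(dist, 2)
constrained = constrained_ahc(dist, 2, must_link=[(1, 2)], cannot_link=[(0, 1)])
```

With complete linkage, an infinite entry makes any merge containing a cannot-link pair the last choice, so such pairs are never joined; must-link pairs are only strongly encouraged to merge, not guaranteed.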

3.
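Semi-supervised dimensionality reduction uses side information together with a large number of unlabeled samples to find an optimal low-dimensional discriminative space within a high-dimensional data space, which facilitates subsequent classification or clustering. It is regarded as an effective way to understand high-dimensional data such as gene sequences, text, and face images. A general framework for semi-supervised dimensionality reduction based on pairwise constraints (SSPC) is proposed. The method first learns a discriminative adjacency matrix from the pairwise constraints and the intrinsic geometric structure of the unlabeled samples. It then applies the learned projection to map the data from the original high-dimensional space into a low-dimensional space, so that samples within a cluster become more compact and samples from different clusters are pushed as far apart as possible. The proposed algorithm not only finds an optimal linear discriminative subspace but also reveals the nonlinear structure of manifold data. Experimental results on several real data sets show that the new method outperforms current mainstream dimensionality reduction algorithms based on pairwise constraints.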

4.
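Existing semi-supervised dimensionality reduction methods treat all side information equally and therefore cannot fully exploit the information it contains. To address this, a weighted pairwise constraint semi-supervised local dimensionality reduction algorithm (WSLDR) is proposed. A neighbor graph is built to expand the side information, which increases the number of constraints. In addition, a weight coefficient matrix for the side information is constructed according to the amount of information each constraint carries. The side information is merged into the neighbor graph to correct it, and an optimal projection is then sought from the corrected neighbor graph and the weighted pairwise constraints. The algorithm preserves the intrinsic local geometric structure of the data while making the within-class distribution more compact and the between-class distribution more dispersed. Experimental results on UCI data sets verify the effectiveness of the algorithm.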

5.
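潘银松, 王攀峰, 黄鸿, 刘艳. 《计算机科学》 (Computer Science), 2013, 40(Z11): 333-336, 373
Locality preserving projections is an unsupervised dimensionality reduction algorithm; it does not make effective use of the class information of the samples and cannot extract discriminative features effectively. To address this problem, a semi-supervised locality preserving projections (SSLPP) algorithm is proposed. The algorithm uses a small amount of labeled data together with unlabeled data as the training set to construct the intrinsic graph Gi, treats labeled and unlabeled samples differently, and increases the weights between sample points of the same class, which favors the extraction of discriminative features. Experimental results on the AVIRIS KSC and Botswana hyperspectral remote sensing image data sets show that SSLPP discovers the intrinsic structure of data in high-dimensional space fairly effectively and yields a clear improvement in overall classification accuracy.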

6.
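王磊. 《计算机科学》 (Computer Science), 2009, 36(10): 234-236
Two selective ensemble algorithms for support vector machines based on constraint projection are proposed. First, randomly selected sets of must-link and cannot-link pairwise constraints are used to determine projection matrices, and the original training samples are projected into different low-dimensional spaces to train a group of base classifiers. Then, two selective ensemble techniques, genetic optimization and minimization of deviation error, are used to combine the base classifiers. Experiments on UCI data show that both proposed ensemble algorithms effectively improve the generalization performance of support vector machines and significantly outperform ensemble algorithms such as Bagging, Boosting, feature Bagging, and LoBag.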

7.
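Semi-supervised canonical correlation analysis algorithm   Total citations: 13 (self-citations: 2, citations by others: 11)
彭岩, 张道强. 《软件学报》 (Journal of Software), 2008, 19(11): 2822-2832
Building on canonical correlation analysis (CCA), a semi-supervised canonical correlation analysis algorithm (Semi-CCA) is proposed by introducing supervision information given in the form of pairwise constraints. In addition to a large number of unlabeled samples, the algorithm takes pairwise constraint information into account, that is, knowledge that two samples belong to the same class (positive constraint) or to different classes (negative constraint), and the relative importance of the two kinds of constraints is also examined. Experimental results on artificial data sets, a multiple-feature handwritten digit data set, and face data sets (Yale and AR) show that Semi-CCA can effectively use a small amount of supervision information to improve classification performance.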

8.
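With the arrival of the big data era, community detection in complex networks has become an important research direction. Hierarchical clustering, a classical approach to community detection, is widely used, but it has high time complexity and low running efficiency. To improve the efficiency of community detection, a new semi-supervised community detection algorithm based on node similarity, called SSGN, is proposed. It makes full use of the prior knowledge in the must-link and cannot-link constraint sets, expands this prior information through derivation rules, and verifies the expanded information with a distance-based measure. The algorithm is validated on artificial and real networks; experimental results on UCI data sets and large real data sets show that the node-similarity-based semi-supervised community detection algorithm is more accurate and more efficient than other semi-supervised clustering algorithms.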
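
The abstract does not list SSGN's derivation rules. As a hedged sketch, the block below applies two rules that are commonly used for expanding pairwise constraints and that are assumed here rather than taken from the paper: must-link is transitive, and a cannot-link between two items extends to everything must-linked with either of them. The function name `expand_constraints` is hypothetical, and the distance-based verification step mentioned in the abstract is not modeled.

```python
def expand_constraints(n, must_link, cannot_link):
    """Expand pairwise constraints over items 0..n-1 (assumed rules).

    Rule 1: must-link is transitive, so items form must-link groups.
    Rule 2: a cannot-link between two groups applies to every cross pair.
    Returns (must_link_set, cannot_link_set) of (small, large) index pairs.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in must_link:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    ml = {(a, b) for g in groups.values() for a in g for b in g if a < b}

    cl = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"constraints conflict on pair ({a}, {b})")
        for x in groups[ra]:
            for y in groups[rb]:
                cl.add((min(x, y), max(x, y)))
    return ml, cl

ml, cl = expand_constraints(5, must_link=[(0, 1), (1, 2)], cannot_link=[(2, 3)])
```

Here two must-link pairs and one cannot-link pair expand into three of each, and item 4, which appears in no constraint, stays unconstrained.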

9.
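Sparsity preserving projections is an unsupervised, global, linear dimensionality reduction method that cannot cope with insufficient training samples or with excessive variation among samples of the same class. To address this problem, a neighborhood sparsity preserving projection algorithm combined with a pairwise constraint mechanism is proposed. Sparse coefficients are computed from neighboring samples to retain local structural information; the idea of pairwise constraint supervision is introduced so that sample class information guides the sparse reconstruction; finally, a low-dimensional subspace is defined that preserves as much as possible of the class information contained in the sparse coefficients. The algorithm is applied to face recognition, and experimental results demonstrate its effectiveness and feasibility in terms of recognition rate and running time.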

10.
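Linear discriminant analysis (LDA) is one of the most classical methods for subspace learning and supervised discriminative feature extraction. Inspired by manifold learning, many improved methods based on LDA have been proposed in recent years. Although their starting points differ, these algorithms all essentially measure the spatial scatter of the samples with the Euclidean distance. The nonlinear nature of the Euclidean distance causes two problems: 1) the algorithms are sensitive to noise and outliers; 2) the algorithms overemphasize sample points with large local scatter in manifold or multimodal data sets, so that the essential structure of the data is destroyed during feature extraction. To solve these problems, a new dimensionality reduction method based on nonparametric discriminant analysis (NDA) is proposed, called dynamic weighted nonparametric discriminant analysis (DWNDA). DWNDA computes the between-class and within-class scatter with a dynamically weighted distance, which not only preserves the essential structure of multimodal data sets but also makes effective use of the discriminative information between pairs of boundary sample points. As a result, DWNDA shows strong robustness to noise and outliers in noise experiments. In addition, DWNDA achieves excellent results in experiments on face and handwriting databases.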

11.
Principal component analysis (PCA) is one of the most widely used unsupervised dimensionality reduction methods in pattern recognition. It preserves the global covariance structure of the data when labels are not available. However, in many practical applications, besides the large amount of unlabeled data, it is also possible to obtain partial supervision such as a few labeled data and pairwise constraints, which contain much more valuable information for discrimination than unlabeled data. Unfortunately, PCA cannot utilize that discriminant information effectively. On the other hand, traditional supervised dimensionality reduction methods such as linear discriminant analysis operate only on labeled data, and their performance deteriorates when labeled data are insufficient. In this paper, we propose a novel discriminant PCA (DPCA) model to boost the discriminant power of PCA when unlabeled data, labeled data, and pairwise constraints are all available. The derived DPCA algorithm is efficient and has a closed-form solution. Experimental results on several UCI and face data sets show that DPCA is superior to several established dimensionality reduction methods.

12.
Dimensionality reduction plays an important role in many machine learning tasks. This paper studies semi-supervised dimensionality reduction using pairwise constraints. In this setting, domain knowledge is given in the form of pairwise constraints, which specify whether a pair of instances belongs to the same class (must-link constraint) or to different classes (cannot-link constraint). A novel semi-supervised dimensionality reduction method called LGS3DR is proposed, which integrates both the local and global topological structures of the data as well as the pairwise constraints. The LGS3DR method is effective and has a closed-form solution. Experiments on data visualization and face recognition show that LGS3DR is superior to many existing dimensionality reduction methods.

13.
Graph structure is crucial to graph-based dimensionality reduction. A mixture graph based semi-supervised dimensionality reduction (MGSSDR) method with pairwise constraints is proposed. MGSSDR first constructs multiple diverse graphs on different random subspaces of the dataset, then combines these graphs into a mixture graph and performs dimensionality reduction on it. MGSSDR preserves the pairwise constraints and the local structure of the samples in the reduced subspace, and it is robust to noise and to the neighborhood size. Experimental results on facial image feature extraction demonstrate its effectiveness.

14.
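To address the lack of supervision information in network traffic feature selection, a semi-supervised network traffic feature selection algorithm based on pairwise constraint expansion is proposed. The algorithm considers a small number of pairwise constraints together with a large number of unlabeled samples, and uses the correlation and autocorrelation between sample sets to extend the pairwise constraint set to the unlabeled samples, producing more highly reliable pairwise constraints that reveal the spatial distribution of the samples. Finally, the expanded pairwise constraint set is used for feature selection. Experiments show that, compared with algorithms that do not expand the pairwise constraints, the algorithm achieves better classification performance with only a small number of initial pairwise constraints.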

15.
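Semi-supervised dimensionality reduction based on sparse representation   Total citations: 1 (self-citations: 1, citations by others: 0)
A semi-supervised dimensionality reduction method based on sparse representation (SpSSDR) is proposed. Unlike other graph-based semi-supervised dimensionality reduction methods, which construct the graph in separate steps, SpSSDR uses sparse reconstruction coefficients to define the edge connectivity and the edge weights of the graph simultaneously, and then combines them with side-information constraints to perform dimensionality reduction. Experiments on high-dimensional face data show that SpSSDR is robust to noise and also uses side information more effectively.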

16.
Semi-supervised dimensionality reduction has attracted an increasing amount of attention in this big-data era. Many algorithms have been developed that use a small number of pairwise constraints to achieve performance comparable to that of fully supervised methods. However, one challenging problem with semi-supervised approaches is the appropriate choice of the constraint set, including its cardinality and composition, which to a large extent affects the performance of the resulting algorithm. In this work, we address the problem by incorporating ensemble subspaces and active learning into dimensionality reduction, and we propose a new global and local scatter based semi-supervised dimensionality reduction method with active constraint selection. Unlike traditional methods that select the supervised information in one subspace, we pick pairwise constraints in ensemble subspaces, where a novel active learning algorithm is designed with both exploration and filtering to generate informative pairwise constraints. The automatic constraint selection approach proposed in this paper can be generalized to all constraint-based semi-supervised learning algorithms. Comparative experiments are conducted on four face databases, and the results validate the effectiveness of the proposed method.

17.
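Discriminative semi-supervised clustering analysis based on pairwise constraints   Total citations: 10 (self-citations: 1, citations by others: 9)
尹学松, 胡恩良, 陈松灿. 《软件学报》 (Journal of Software), 2008, 19(11): 2791-2802
Some typical existing semi-supervised clustering methods have difficulty resolving violations of pairwise constraints and, at the same time, cannot handle high-dimensional data. A discriminative semi-supervised clustering analysis method based on pairwise constraints is proposed to solve both problems together. The method makes effective use of the supervision information to integrate dimensionality reduction and clustering: the data are clustered in the projected space with a pairwise-constraint-based K-means algorithm, and the clustering result is then used to select the projection space. The algorithm also reduces the computational complexity of constraint-based semi-supervised clustering and resolves the violation of pairwise constraints during clustering. Experimental results on a set of real data sets show that, compared with existing related semi-supervised clustering algorithms, the new method not only handles high-dimensional data but also effectively improves clustering performance.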
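
To make the notion of constraint violation concrete, the short sketch below counts how many must-link and cannot-link constraints a given cluster assignment breaks. It only illustrates the quantity at issue and is not part of the method described above; the function name `count_violations` is hypothetical.

```python
def count_violations(labels, must_link, cannot_link):
    """Count constraints broken by a cluster assignment.

    labels[i] is the cluster id of sample i. A must-link pair is violated
    when its two samples land in different clusters; a cannot-link pair is
    violated when they land in the same cluster.
    """
    ml_violated = sum(1 for i, j in must_link if labels[i] != labels[j])
    cl_violated = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return ml_violated, cl_violated

labels = [0, 0, 1, 1]
violations = count_violations(
    labels,
    must_link=[(0, 1), (1, 2)],
    cannot_link=[(2, 3), (0, 3)],
)
```

In this toy assignment one must-link pair, (1, 2), and one cannot-link pair, (2, 3), are violated.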

18.
The curse of dimensionality is a troublesome problem in high-dimensional data analysis. To improve the performance of classification or clustering on such data, their dimensionality should be reduced beforehand. Locality Preserving Projections (LPP) is a widely used linear dimensionality reduction method. It seeks a subspace in which the neighborhood graph structure of the samples is preserved. However, like most dimensionality reduction methods based on graph embedding, LPP is sensitive to noise and outliers, and its effectiveness depends on choosing suitable parameters for constructing the neighborhood graph. Unfortunately, it is difficult to choose effective parameters for LPP. To address these problems, we propose an Enhanced LPP (ELPP) using a similarity metric based on robust paths and a Semi-supervised ELPP (SELPP) with pairwise constraints. In comparison with the original LPP, our methods are not only robust to noise and outliers but also less sensitive to parameter selection. Besides, SELPP makes use of pairwise constraints more efficiently than the compared methods. Experimental results on real-world face databases confirm their effectiveness.

19.
The problem of learning from both labeled and unlabeled data is considered. In this paper, we present a novel semisupervised multimodal dimensionality reduction (SSMDR) algorithm for feature reduction and extraction. SSMDR can preserve the local and multimodal structures of labeled and unlabeled samples. As a result, data pairs that are close in the original space are projected near each other in the embedding space. Due to overfitting, supervised dimensionality reduction methods tend to perform poorly when only a few labeled samples are available. In such cases, unlabeled samples play a significant role in boosting the learning performance. The proposed discriminant technique has an analytical form of the embedding transformations that can be effectively obtained by applying the eigen decomposition, or finding two close optimal sets of transforming basis vectors. By employing the standard kernel trick, SSMDR can be extended to nonlinear dimensionality reduction scenarios. We verify the feasibility and effectiveness of SSMDR through extensive simulations, including data visualization and classification on synthetic and real-world datasets. Our results reveal that SSMDR offers significant advantages over some widely used techniques. Compared with other methods, the proposed SSMDR exhibits superior performance in multimodal cases.
