18 similar articles found (search time: 140 ms)
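Multiple Kernel Local Learning-Based Domain Adaptation   Total citations: 1, self-citations: 0, citations by others: 1
Tao Jianwen (陶剑文), Wang Shitong (王士同). Journal of Software (《软件学报》), 2012, 23(9): 2297-2310
Domain adaptation (or cross-domain) learning aims to learn a robust target classifier from labeled samples in a source (or auxiliary) domain; the key problem is how to reduce the distribution discrepancy between domains as much as possible. To handle shifts in feature distribution between domains, a three-stage method, multiple kernel local learning-based domain adaptation (MKLDA), is proposed: 1) based on the maximum mean discrepancy (MMD) criterion and the structural risk minimization model, a reproducing multiple kernel Hilbert space and an initial support vector machine (SVM) are learned simultaneously, and the SVM produces an initial partition of the target-domain data; 2) in the learned multiple kernel Hilbert space, the class information of the target-domain data is learned by local reconstruction; 3) finally, using the learned class information, a robust target classifier is trained in the target domain. Experimental results show that the proposed method achieves superior or comparable domain adaptation performance.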
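The MMD criterion named in stage 1 of the abstract above can be illustrated with a minimal sketch. This is not the MKLDA implementation: it only computes the standard biased empirical estimate of squared MMD with a single Gaussian kernel. The function names, the bandwidth `sigma`, and the toy samples are assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_squared(source, target, sigma=1.0):
    """Biased empirical estimate of squared MMD between two samples."""
    def mean_kernel(A, B):
        total = sum(gaussian_kernel(a, b, sigma) for a in A for b in B)
        return total / (len(A) * len(B))
    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Hypothetical toy samples: one target close to the source, one far away.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
near_target = [[0.05, 0.0], [0.0, 0.05], [-0.05, -0.05]]
far_target = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.1]]

print(mmd_squared(source, near_target))
print(mmd_squared(source, far_target))
```

A smaller value indicates that the two samples look more alike in the kernel-induced feature space, which is the quantity a domain adaptation method of this kind tries to drive down.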

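Web Image Annotation Based on Sparsity-Enhanced Feature Selection   Total citations: 1, self-citations: 0, citations by others: 1
Shi Caijuan (史彩娟), Ruan Qiuqi (阮秋琦). Journal of Software (《软件学报》), 2015, 26(7): 1800-1811
With the explosive growth of web images, web image annotation has become a hot research topic in recent years, and sparse feature selection plays an important role in improving its efficiency and performance. A sparsity-enhanced feature selection algorithm is proposed for web image annotation: semi-supervised sparse feature selection based on the l2,1/2-matrix norm with shared subspace learning (SFSLS). SFSLS applies the l2,1/2-matrix norm to select the sparsest and most discriminative features, and uses shared subspace learning to take the correlations between different features into account. In addition, graph Laplacian-based semi-supervised learning lets SFSLS exploit both labeled and unlabeled data. An efficient iterative algorithm is designed to optimize the objective function. SFSLS is compared with other sparse feature selection algorithms on two large-scale web image databases, and the results show that SFSLS is better suited to annotating large-scale web images.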
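To make the sparsity argument above concrete, here is a small sketch of a row-wise l2,p matrix quasi-norm, assuming the commonly used definition (sum over rows of the row l2 norm raised to p, then raised to 1/p) and assuming rows correspond to features. The abstract does not state the exact formula, so both points are assumptions, as are the function name and the toy matrices.

```python
import math

def l2p_norm(W, p=0.5):
    """Row-wise l2,p quasi-norm: (sum_i ||w_i||_2^p)^(1/p)."""
    row_norms = [math.sqrt(sum(v * v for v in row)) for row in W]
    return sum(r ** p for r in row_norms) ** (1.0 / p)

# Two hypothetical weight matrices with the same Frobenius norm (2.0).
row_sparse = [[2.0, 0.0], [0.0, 0.0]]   # a single active row (feature)
dense = [[math.sqrt(2.0), 0.0],         # two active rows (features)
         [math.sqrt(2.0), 0.0]]

print(l2p_norm(row_sparse, p=0.5), l2p_norm(dense, p=0.5))
print(l2p_norm(row_sparse, p=1.0), l2p_norm(dense, p=1.0))
```

With p = 1/2 the dense matrix is penalized more heavily relative to the row-sparse one than with p = 1, which is the usual motivation for preferring l2,1/2 over l2,1 when stronger row sparsity is wanted.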

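Hu Qinghui (胡庆辉), Ding Lixin (丁立新), He Jinrong (何进荣). Journal of Software (《软件学报》), 2013, 24(11): 2522-2534
In machine learning, kernel methods are an effective means of solving nonlinear pattern recognition problems. Replacing traditional single-kernel learning with multiple kernel learning has become a new research focus; on heterogeneous, irregular, and unevenly distributed sample data, it offers better flexibility and interpretability and stronger generalization. Building on multiple kernel learning methods from supervised learning, an optimization model is proposed for a multiple kernel semi-supervised support vector machine (S3VM) with an Lp-norm constraint. The parameters to be optimized are the decision functions fm in the high-dimensional space and the kernel combination weights θm. The model inherits the non-convex, non-smooth nature of the single-kernel semi-supervised SVM. A two-level optimization procedure is used to optimize the two groups of parameters: an improved quasi-Newton method and a local search algorithm based on pairwise label switching handle, respectively, the non-smoothness and the non-convexity of the model with respect to fm, yielding an approximately optimal solution. Both basic kernels and manifold kernels are included in the multiple kernel framework to make full use of the geometric properties of the data. Experimental results verify the effectiveness and good generalization performance of the algorithm.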
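The role of the kernel combination weights θm can be sketched as follows. The sketch assumes the usual multiple kernel learning form, a weighted sum of base Gram matrices with weights of unit Lp norm. The abstract does not spell out the combination rule, and the optimization of fm and θm is not attempted here. All function names and the toy data are hypothetical.

```python
import math

def linear_kernel(X):
    """Gram matrix of the linear kernel."""
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]

def gaussian_kernel(X, sigma=1.0):
    """Gram matrix of the Gaussian (RBF) kernel."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * sigma ** 2))
             for y in X] for x in X]

def normalize_lp(theta, p=2.0):
    """Rescale the weights so that their Lp norm equals 1."""
    norm = sum(abs(t) ** p for t in theta) ** (1.0 / p)
    return [t / norm for t in theta]

def combine_kernels(kernels, theta):
    """Weighted sum of Gram matrices: K = sum_m theta_m * K_m."""
    n = len(kernels[0])
    return [[sum(t * K[i][j] for t, K in zip(theta, kernels)) for j in range(n)]
            for i in range(n)]

X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]   # hypothetical samples
theta = normalize_lp([3.0, 4.0], p=2.0)    # non-negative weights, unit L2 norm
K = combine_kernels([linear_kernel(X), gaussian_kernel(X)], theta)
```

Because the weights are non-negative and each base matrix is positive semidefinite, the combined matrix is again a valid kernel matrix and can be passed to any kernel classifier.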

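To address the sensitivity to model parameters and the insufficient discriminative information in the data space that affect existing graph-based semi-supervised learning (GSSL) methods, and inspired by nearest feature space embedding and sparse representation of data, a sparse approximated nearest feature space embedding label propagation algorithm (SANFSP) is proposed. SANFSP first uses the feature space embedding projection points to sparsely represent the original data; it then measures the similarity between the original data and their sparse approximated nearest feature space embedding projections; next, it derives a sparse approximated nearest feature space embedding regularization term; finally, it propagates data labels smoothly using the label propagation algorithm of traditional GSSL methods. SANFSP is also extended in a straightforward way to out-of-sample learning. SANFSP achieves effective experimental results on synthetic and real-world datasets (such as face recognition, visual object recognition, and handwritten digit classification).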
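The final step above relies on the label propagation scheme of traditional GSSL methods. The sketch below shows one widely used variant, iterating F ← αSF + (1 − α)Y with a symmetrically normalized affinity matrix S. It is swapped in for illustration and is not the SANFSP algorithm: the sparse embedding and the regularization term are omitted. The graph, `alpha`, and the iteration count are assumptions.

```python
import math

def propagate_labels(W, Y, alpha=0.9, iterations=200):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y and return the arg-max class per node.

    W is a symmetric affinity matrix with positive node degrees;
    Y holds one-hot rows for labeled nodes and all-zero rows for unlabeled nodes.
    """
    n, n_classes = len(W), len(Y[0])
    degree = [sum(row) for row in W]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iterations):
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n)) + (1 - alpha) * Y[i][c]
              for c in range(n_classes)]
             for i in range(n)]
    return [max(range(n_classes), key=row.__getitem__) for row in F]

# Hypothetical chain graph 0 - 1 - 2 - 3 with a weak link between nodes 1 and 2.
W = [[0.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.1, 0.0],
     [0.0, 0.1, 0.0, 1.0],
     [0.0, 0.0, 1.0, 0.0]]
# Only the two end nodes are labeled: node 0 is class 0, node 3 is class 1.
Y = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]

predicted = propagate_labels(W, Y)
```

Each unlabeled node ends up with the class of the labeled node it is most strongly connected to, which is the smoothness behavior that graph-based methods exploit.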

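An Improved Weighted Sparse Representation Algorithm for Face Recognition   Total citations: 1, self-citations: 0, citations by others: 1
Traditional weighted sparse representation classification is computationally inefficient both in obtaining the training-sample weights and in solving the l1-norm minimization problem. To address this, a face recognition algorithm, WSRC_DALM, is proposed that combines weighted sparse representation with the dual augmented Lagrange multiplier method (DALM). The algorithm uses a Gaussian kernel function to compute the correlation between each training sample and the test sample, that is, the weight of each training sample relative to the test sample; it then uses DALM to solve the l1-norm minimization model, achieving accurate reconstruction and classification of the test sample. The algorithm is validated on the ORL and FEI face datasets. On ORL, WSRC_DALM reaches a recognition rate of 99%, which is 7% and 4.8% higher than the classic SRC and WSRC algorithms, respectively, and its computational efficiency is about 20 times higher than that of WSRC; on FEI, the recognition rate under multi-pose variation is close to 92%. The experimental results show that WSRC_DALM has clear advantages in recognition accuracy and computational efficiency and is robust to large intra-class variation.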
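The weighting step described above can be sketched in a few lines, assuming the standard Gaussian kernel form exp(−‖x − y‖² / (2σ²)). The bandwidth `sigma`, the function name, and the toy vectors are assumptions, and the DALM solver for the l1-norm minimization is not shown.

```python
import math

def gaussian_weights(train_samples, test_sample, sigma=1.0):
    """Weight of each training sample relative to the test sample (Gaussian kernel)."""
    weights = []
    for sample in train_samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(sample, test_sample))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return weights

# Hypothetical feature vectors; real inputs would be vectorized face images.
train_samples = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
test_sample = [1.0, 0.0]

weights = gaussian_weights(train_samples, test_sample, sigma=0.5)
```

Training samples close to the test sample receive weights near 1 and distant ones receive weights near 0, so the subsequent sparse coding favors nearby samples.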

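Objective: Feature dimensionality reduction is a hot research problem in machine learning. Existing low-rank sparse preserving projection methods ignore the information loss between the original data space and the reduced low-dimensional space, and cannot effectively handle the case of a small amount of labeled data and a large amount of unlabeled data. To address these two problems, a semi-supervised feature selection method based on low-rank sparse graph embedding (LRSE) is proposed. Method: LRSE has two steps. The first makes full use of the labeled and the unlabeled data to learn their low-rank sparse representations separately. The second considers, in a single objective function, both the information difference before and after dimensionality reduction and the preservation of structural information during reduction: minimizing an information loss function retains as much useful information in the data as possible, and embedding the low-rank sparse graph, which captures the global structure and the intrinsic geometric structure of the data, in the low-dimensional space preserves the structural information of the original data space, so that more discriminative features can be selected. Results: The method was tested on six public datasets; KNN classification on the reduced data was used to measure classification accuracy, and the method was compared with other existing dimensionality reduction algorithms. Its classification accuracy improved, and it achieved the highest accuracy on five of the datasets: 11.19% higher than the second-best algorithm, robust unsupervised feature selection (RUFS), on Wine; 0.57% higher than RUFS on Breast; 1% higher than the second-best algorithm, multi-cluster feature selection (MCFS), on Orlraws10P; 1.07% higher than MCFS on Coil20; and 2.5% higher than MCFS on Orl64. Conclusion: The proposed semi-supervised feature selection algorithm based on low-rank sparse graph embedding lets the reduced data retain as much as possible of the information in the original data, and effectively handles the case of few labeled and many unlabeled samples. Experimental results show that it classifies better than existing algorithms. However, because the method assumes that all features lie on a linear manifold, it applies only to data on linear manifolds.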

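Kernel Distribution Consistency Based Local Domain Adaptation Learning   Total citations: 3, self-citations: 3, citations by others: 0
Tao Jianwen (陶剑文), Wang Shitong (王士同). Acta Automatica Sinica (《自动化学报》), 2013, 39(8): 1295-1309
For the domain adaptation learning (DAL) problem, a kernel distribution consistency based local domain adaptation classifier (KDC-LDAC) is proposed. In a universally reproduced kernel Hilbert space (URKHS), and based on the structural risk minimization model, KDC-LDAC first learns a kernel distribution consistency regularized support vector machine (SVM) to produce an initial partition of the target data; then, following the idea of kernel local learning, it reconstructs the class information of the target data by local regression; finally, using the learned class information, it trains a classifier suited to target discrimination in the target domain. Experimental results on synthetic and real-world datasets show that the proposed method achieves superior or comparable domain adaptation learning performance.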

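To address the fact that multiple kernel subspace spectral clustering algorithms do not account for noise or for the structure of the affinity graph, a new joint low-rank and sparse multiple kernel subspace clustering algorithm (JLSMKC) is proposed. First, subspace learning is performed with a joint low-rank and sparse representation, so that the affinity graph has both low-rank and sparse structure. Second, a robust multiple kernel low-rank sparse constrained model is built to reduce the effect of noise on the affinity graph and to handle the nonlinear structure of the data. Finally, the multiple kernel approach makes full use of a consensus kernel matrix to improve the quality of the affinity graph. Experimental results on seven datasets show that JLSMKC outperforms five popular multiple kernel clustering algorithms in clustering accuracy (ACC), normalized mutual information (NMI), and purity, while reducing clustering time and improving the block-diagonal quality of the affinity graph. The algorithm has a considerable advantage in clustering performance.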
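Of the three evaluation measures named above, purity is the simplest to state in code. The sketch below assumes the usual definition: each cluster is credited with its most frequent true class, and purity is the credited fraction of all samples. The function name and the toy labels are hypothetical and unrelated to the seven datasets in the abstract.

```python
from collections import Counter

def purity(cluster_ids, true_labels):
    """Fraction of samples that belong to the majority true class of their cluster."""
    members = {}
    for cluster, label in zip(cluster_ids, true_labels):
        members.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(counts.most_common(1)[0][1] for counts in members.values())
    return majority_total / len(true_labels)

# Hypothetical clustering of six samples into two clusters.
cluster_ids = [0, 0, 0, 1, 1, 1]
true_labels = ["a", "a", "b", "b", "b", "b"]

print(purity(cluster_ids, true_labels))
```

Here cluster 0 contributes 2 correctly grouped samples and cluster 1 contributes 3, giving a purity of 5/6.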

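Multi-source adaptation learning is an effective machine learning approach aimed at improving learning performance on a target task. For multi-label visual classification, and building on existing research, a novel multi-source adaptation multi-label classification framework with joint feature selection and shared feature subspace learning is proposed. Within the existing graph Laplacian regularized semi-supervised learning paradigm, the framework considers the optimized processing of target visual features, the embedding of multi-label correlation information in the shared feature subspace, and the bridging and use of discriminative information from multiple related domains, and integrates them into a unified learning model. It is proved theoretically that the locally optimal solutions can each be obtained simply by solving a generalized eigendecomposition problem, and the algorithm implementation and its convergence theorem are given. In-depth experimental analysis on two real-world multi-label visual classification tasks confirms the robustness and effectiveness of the proposed framework and its classification performance, which is better than that of existing related methods.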

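To address the low recommendation accuracy on sparse data, a sparse-data recommendation algorithm based on a multiple kernel learning convolutional neural network is proposed. Auxiliary information about items is fed into a convolutional neural network to learn features; the vectors are combined in a reproducing kernel Hilbert space, and multiple kernel learning is used to strengthen the network's feature learning ability. A non-negative matrix model is initialized from the learned convolutional feature set and is used to predict the missing ratings. Experimental results show that the algorithm effectively improves recommendation performance on sparse datasets, verifying the effectiveness of the multiple kernel learning convolutional neural network.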

Domain adaptation learning (DAL) methods have shown promising results by using labeled samples from the source (or auxiliary) domain(s) to learn a robust classifier for a target domain that has few or even no labeled samples. However, several key issues remain to be addressed in state-of-the-art DAL methods, such as sufficient and effective distribution discrepancy metric learning, effective kernel space learning, and transfer learning from multiple source domains. To address these issues, this paper proposes a unified kernel learning framework for domain adaptation learning, together with an effective extension based on the multiple kernel learning (MKL) schema. The framework is regularized by a new minimum distribution distance metric criterion, which minimizes both the distribution mean discrepancy and the distribution scatter discrepancy between source and target domains, and many existing kernel methods (such as the support vector machine (SVM), ν-SVM, and least-square SVM) can be readily incorporated into it. The framework, referred to as kernel learning for domain adaptation learning (KLDAL), simultaneously learns an optimal kernel space and a robust classifier by minimizing both the structural risk functional and the distribution discrepancy between domains. KLDAL is further extended to a multiple kernel learning framework referred to as MKLDAL. Under the KLDAL or MKLDAL framework, three formulations are proposed: KLDAL-SVM or MKLDAL-SVM with respect to the SVM, its variant μ-KLDALSVM or μ-MKLDALSVM with respect to the ν-SVM, and KLDAL-LSSVM or MKLDAL-LSSVM with respect to the least-square SVM. Comprehensive experiments on real-world data sets verify that the proposed frameworks are superior or comparable in effectiveness.

Domain adaptation learning (DAL) is a novel and effective technique for pattern classification problems in which the prior information for training is unavailable or insufficient. Its effectiveness depends on the discrepancy between the two distributions that generate, respectively, the training data for the source domain and the testing data for the target domain. However, DAL may not work well when only the distribution mean discrepancy between source and target domains is considered and minimized. In this paper, we first construct a generalized projected maximum distribution discrepancy (GPMDD) metric for DAL on reproducing kernel Hilbert space (RKHS) based domain distributions by simultaneously considering both the projected maximum distribution mean discrepancy and the projected maximum distribution scatter discrepancy between the source and the target domain. Then, based on both the structural risk and the GPMDD minimization principle, we propose a novel domain adaptation kernelized support vector machine (DAKSVM) with respect to the classical SVM, and its two extensions, LS-DAKSVM and μ-DAKSVM, with respect to the least-square SVM and the ν-SVM, respectively. Moreover, our theoretical analysis shows that the proposed GPMDD metric can effectively measure the consistency not only between the RKHS embedding domain distributions but also between the scatter information of source and target domains. Hence, the proposed methods are distinctive in that the more consistency between the scatter information of source and target domains is achieved by tuning the kernel bandwidth, the better the convergence of GPMDD metric minimization, which in turn improves the scalability and generalization capability of the proposed methods for DAL. Experimental results on artificial and real-world problems indicate that the performance of the proposed methods is superior to, or at least comparable with, existing benchmarking methods.

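Chen Sibao (陈思宝), Zhao Ling (赵令), Luo Bin (罗斌). Acta Automatica Sinica (《自动化学报》), 2014, 40(10): 2295-2305
To exploit the kernel trick to improve classification performance, two kernelized sparse representation dictionary learning methods are proposed, building on locality preserving sparse representation dictionary learning. In the first, the original training data are projected into a high-dimensional kernel space, where locality preserving kernel sparse representation dictionary learning is performed. In the second, a kernel locality preserving constraint is imposed on the sparse coefficients, giving kernel sparse representation dictionary learning based on kernel locality preservation. Experimental results show that the classification and recognition results of the proposed methods are better than those of other methods.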

In many machine learning algorithms, a major assumption is that the training and the test samples are in the same feature space and have the same distribution. However, for many real applications this assumption does not hold. In this paper, we survey the problem where the training samples and the test samples are from different distributions. This problem is referred to as domain adaptation. The training samples, always with labels, are obtained from what are called source domains, while the test samples, which usually have no labels or only a few labels, are obtained from what are called target domains. The source domains and the target domains are different but related to some extent; the learners can learn some information from the source domains for the learning of the target domains. We focus on the multi-source domain adaptation problem where there is more than one source domain available together with only one target domain. A key issue is how to select good sources and samples for the adaptation. In this survey, we review some theoretical results and well developed algorithms for the multi-source domain adaptation problem. We also discuss some open problems which can be explored in future work.

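Wang Yunyun (汪云云), Sun Guwei (孙顾威), Zhao Guoxiang (赵国祥), Xue Hui (薛晖). Journal of Software (《软件学报》), 2022, 33(4): 1170-1182
Unsupervised domain adaptation (UDA) aims to use a source domain with a large amount of labeled data to help learning in a target domain that has no label information. UDA usually assumes that the source and target domains have different data distributions but share the same class label space. In real open learning scenarios, however, the label spaces of the domains may well differ. In the extreme case, the classes of the domains do not intersect at all, that is, the classes in the target domain...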

Predicting the response variables of the target dataset is one of the main problems in machine learning. Predictive models are desired to perform satisfactorily in a broad range of target domains. However, that may not be plausible if there is a mismatch between the source and target domain distributions. The goal of domain adaptation algorithms is to solve this issue and deploy a model across different target domains. We propose a method based on kernel distribution embedding and the Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both source and target data into a new feature space with two properties: 1) the distributions of the source and the target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can lie in a lower-dimensional space while preserving these properties, so the method can also be considered a dimensionality reduction method. Our proposed method has a closed-form solution, and the experimental results show that it works well in practice.
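For readers unfamiliar with HSIC, the following sketch computes the standard biased empirical estimate, trace(KHLH) / (n − 1)², with Gaussian kernels. It illustrates the dependence measure only, not the embedding method proposed in the abstract. The function names, the bandwidth `sigma`, and the toy data are assumptions.

```python
import math

def gram_matrix(samples, sigma=1.0):
    """Gaussian (RBF) Gram matrix of a list of equal-length vectors."""
    def k(x, y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))
    return [[k(x, y) for y in samples] for x in samples]

def center(K):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
            for i in range(n)]

def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(xs)
    Kc = center(gram_matrix(xs, sigma))
    L = gram_matrix(ys, sigma)
    return sum(Kc[i][j] * L[j][i] for i in range(n) for j in range(n)) / (n - 1) ** 2

xs = [[0.0], [1.0], [2.0], [3.0]]
dependent = [[2.0 * x[0]] for x in xs]   # deterministic function of xs
constant = [[5.0] for _ in xs]           # carries no information about xs

print(hsic(xs, dependent), hsic(xs, constant))
```

The estimate is clearly positive for the dependent pair and vanishes (up to rounding) for the constant one, which is why HSIC can serve as a criterion for how much structure of one variable is retained in another.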

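Kernel Support Vector Machine for Domain Adaptation   Total citations: 6, self-citations: 4, citations by others: 2
Domain adaptation learning is a novel and effective approach to pattern classification problems that lack prior information. Minimizing the discrepancy between the sample distributions of the domains is one of the key factors in its success, but minimizing only the discrepancy between distribution means limits its usefulness on specific domain adaptation problems. To address this, in a reproducing kernel Hilbert space, and taking into account the minimization of both the mean discrepancy and the scatter discrepancy between domain distributions, a kernel support vector machine for domain adaptation (DAKSVM) and its least-squares form are proposed, based on the structural risk minimization model. Experimental results on synthetic and real-world datasets show that the proposed method achieves superior or comparable pattern classification performance.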

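A Local Weighted Mean Based Domain Adaptation Learning Framework   Total citations: 2, self-citations: 0, citations by others: 2
Gao Jun (皋军), Huang Lili (黄丽莉), Sun Changyin (孙长银). Acta Automatica Sinica (《自动化学报》), 2013, 39(7): 1037-1052
Maximum mean discrepancy (MMD) has been applied successfully as a criterion for measuring the distribution difference between source and target domains. However, as a global measure, MMD largely reflects differences in global distribution and global structure between domains. This paper therefore introduces the local weighted mean method and theory into MMD and proposes a projected maximum local weighted mean discrepancy (PMLWD) measure with locality preserving ability, so that PMLWD can more effectively measure the differences in distribution and structure between local blocks of the source and target domains. Combined with traditional learning theory, a local weighted mean based domain adaptation learning framework (LDAF) is proposed, from which two domain adaptation learning methods are derived: LDAF_MLC and LDAF_SVM. Finally, tests on artificial datasets, high-dimensional text datasets, and face datasets show that LDAF has advantages over other domain adaptation learning methods.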
