Similar Documents
20 similar documents found (search time: 968 ms)
1.
Cross-domain learning methods have shown promising results by leveraging labeled patterns from the auxiliary domain to learn a robust classifier for the target domain which has only a limited number of labeled samples. To cope with the considerable change between feature distributions of different domains, we propose a new cross-domain kernel learning framework into which many existing kernel methods can be readily incorporated. Our framework, referred to as Domain Transfer Multiple Kernel Learning (DTMKL), simultaneously learns a kernel function and a robust classifier by minimizing both the structural risk functional and the distribution mismatch between the labeled and unlabeled samples from the auxiliary and target domains. Under the DTMKL framework, we also propose two novel methods by using SVM and prelearned classifiers, respectively. Comprehensive experiments on three domain adaptation data sets (i.e., TRECVID, 20 Newsgroups, and email spam data sets) demonstrate that DTMKL-based methods outperform existing cross-domain learning and multiple kernel learning methods.
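Methods in this family commonly quantify the distribution mismatch between domains with the maximum mean discrepancy (MMD). The following is a minimal NumPy sketch of the empirical MMD statistic under an RBF kernel; it is an illustration of the criterion, not the DTMKL implementation, and the kernel bandwidth and toy data are our own choices.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mmd2(Xs, Xt, gamma=1.0):
    # Biased empirical estimate of the squared maximum mean discrepancy
    # between source samples Xs and target samples Xt.
    Kss = rbf_kernel(Xs, Xs, gamma)
    Ktt = rbf_kernel(Xt, Xt, gamma)
    Kst = rbf_kernel(Xs, Xt, gamma)
    return Kss.mean() + Ktt.mean() - 2 * Kst.mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(0, 1, (200, 3)), rng.normal(0, 1, (200, 3)))
shifted = mmd2(rng.normal(0, 1, (200, 3)), rng.normal(2, 1, (200, 3)))
```

Samples drawn from the same Gaussian yield a far smaller value than mean-shifted ones, which is what makes the statistic usable as a mismatch regularizer alongside the structural risk.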

2.
Domain adaptation learning (DAL) methods have shown promising results by utilizing labeled samples from the source (or auxiliary) domain(s) to learn a robust classifier for the target domain, which has few or even no labeled samples. However, several key issues remain to be addressed in state-of-the-art DAL methods, such as sufficient and effective distribution discrepancy metric learning, effective kernel space learning, and transfer learning from multiple source domains. Aiming at these issues, in this paper we propose a unified kernel learning framework for domain adaptation learning, together with an effective extension based on the multiple kernel learning (MKL) schema, regularized by a new minimum distribution distance metric criterion that minimizes both the distribution mean discrepancy and the distribution scatter discrepancy between the source and target domains; many existing kernel methods (such as the support vector machine (SVM), v-SVM, and least-square SVM) can be readily incorporated into it. Our framework, referred to as kernel learning for domain adaptation learning (KLDAL), simultaneously learns an optimal kernel space and a robust classifier by minimizing both the structural risk functional and the distribution discrepancy between different domains. Moreover, we extend KLDAL to a multiple kernel learning framework referred to as MKLDAL. Under the KLDAL or MKLDAL framework, we also propose three effective formulations: KLDAL-SVM or MKLDAL-SVM with respect to SVM, its variant μ-KLDALSVM or μ-MKLDALSVM with respect to v-SVM, and KLDAL-LSSVM or MKLDAL-LSSVM with respect to the least-square SVM. Comprehensive experiments on real-world data sets verify the superior or comparable effectiveness of the proposed frameworks.

3.
Multiple Kernel Local Learning-based Domain Adaptation
陶剑文  王士同 《软件学报》2012,23(9):2297-2310
Domain adaptation (or cross-domain) learning aims to leverage labeled samples from the source (or auxiliary) domain to learn a robust classifier for the target domain; the key problem is how to maximally reduce the distribution discrepancy between domains. To effectively handle the change of feature distributions across domains, a three-stage multiple kernel local learning-based domain adaptation (MKLDA) method is proposed: 1) based on the maximum mean discrepancy (MMD) criterion and the structural risk minimization model, it simultaneously learns a reproducing multiple-kernel Hilbert space and an initial support vector machine (SVM), which produces an initial partition of the target-domain data; 2) in the learned multiple-kernel Hilbert space, it locally reconstructs the class information of the target-domain data; 3) finally, using the learned class information, it trains a robust target classifier on the target domain. Experimental results show that the proposed method achieves superior or comparable domain adaptation performance.

4.
Most existing cross-domain sentiment classification methods exploit only the features transferred from a single source domain to the target domain, without fully considering the relationship between target-domain instances and the different source domains. To address this problem, this paper proposes an unsupervised multi-source cross-domain sentiment classification model. First, base classifiers are trained on the features transferred from each single source domain to the target domain, and the base classifiers are weighted. Then the ensemble consensus of the base classifiers' predictions on target-domain instances is taken as the objective function, whose optimization yields the weight of each base classifier. Finally, the weighted base classifiers produce the sentiment classification results for the target domain. Multi-source sentiment transfer experiments on the Amazon data set show good results: compared with other baseline models, the model improves accuracy by 0.75% on average across four groups of experiments.
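The consensus-weighting step can be sketched as follows. This is a hypothetical simplification: `probs`, the multiplicative update, and the learning rate are our own illustration of weighting base classifiers by their agreement with the weighted ensemble, not the paper's exact objective or optimizer.

```python
import numpy as np

def consensus_weights(probs, iters=50, lr=0.5):
    # probs: per-source class-probability predictions on the target domain,
    # shape (n_sources, n_samples, n_classes). Iteratively reward each base
    # classifier in proportion to its agreement with the ensemble prediction.
    k = probs.shape[0]
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        ens = np.tensordot(w, probs, axes=1)            # weighted ensemble
        agree = np.array([(p * ens).sum() for p in probs])
        w = w * np.exp(lr * agree / agree.max())        # multiplicative update
        w /= w.sum()                                    # keep weights normalized
    return w

rng = np.random.default_rng(1)
good = np.tile([[0.9, 0.1]], (100, 1))     # two sources that agree confidently
noisy = rng.dirichlet([1, 1], 100)         # one uninformative source
w = consensus_weights(np.stack([good, good, noisy]))
```

The uninformative source disagrees with the ensemble more often, so its weight decays while the consistent sources dominate the final prediction.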

5.
Domain adaptation learning (DAL) is a novel and effective technique for pattern classification problems where the prior information for training is unavailable or insufficient. Its effectiveness depends on the discrepancy between the two distributions that respectively generate the training data for the source domain and the testing data for the target domain. However, DAL may not work well when only the distribution mean discrepancy between the source and target domains is considered and minimized. In this paper, we first construct a generalized projected maximum distribution discrepancy (GPMDD) metric for DAL on reproducing kernel Hilbert space (RKHS) based domain distributions, which simultaneously considers both the projected maximum distribution mean discrepancy and the projected maximum distribution scatter discrepancy between the source and target domains. Then, based on both the structural risk and the GPMDD minimization principle, we propose a novel domain adaptation kernelized support vector machine (DAKSVM) with respect to the classical SVM, and two extensions called LS-DAKSVM and μ-DAKSVM with respect to the least-square SVM and the v-SVM, respectively. Moreover, our theoretical analysis shows that the proposed GPMDD metric can effectively measure the consistency not only between the RKHS-embedded domain distributions but also between the scatter information of the source and target domains. Hence, the proposed methods are distinctive in that the more consistent the scatter information of the source and target domains can be made by tuning the kernel bandwidth, the better the GPMDD metric minimization converges, which improves the scalability and generalization capability of the proposed methods for DAL. Experimental results on artificial and real-world problems indicate that the performance of the proposed methods is superior to, or at least comparable with, existing benchmark methods.
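The GPMDD idea of penalizing both the mean and the scatter discrepancy can be illustrated in the input space. The paper works with projected RKHS embeddings; the covariance-based scatter term, `alpha`, and the toy data below are our own simplifications meant only to convey why a mean-only metric is insufficient.

```python
import numpy as np

def mean_scatter_discrepancy(Xs, Xt, alpha=0.5):
    # Input-space analogue of GPMDD: penalize both the gap between the
    # distribution means and the gap between the distribution scatters
    # (here measured by the Frobenius distance of sample covariances).
    mean_gap = np.linalg.norm(Xs.mean(0) - Xt.mean(0)) ** 2
    scatter_gap = np.linalg.norm(np.cov(Xs.T) - np.cov(Xt.T), ord="fro") ** 2
    return alpha * mean_gap + (1 - alpha) * scatter_gap

rng = np.random.default_rng(0)
Xs = rng.normal(0, 1, (500, 2))
Xt = rng.normal(0, 3, (500, 2))   # same mean as Xs, much larger scatter
d = mean_scatter_discrepancy(Xs, Xt)
```

Here the two domains have (nearly) identical means, so a mean-only discrepancy is close to zero, yet `d` stays large because the scatter term detects the difference in spread.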

6.
Most current unsupervised domain adaptation networks try to alleviate domain shift by considering only the classifier-induced difference between the source and target domains, without considering task-specific decision boundaries between categories. In addition, these networks aim to align data distributions completely, which is difficult because each domain has its own characteristics. In light of these issues, we develop a Gaussian-guided adversarial adaptation transfer network (GAATN) for bearing fault diagnosis. Specifically, GAATN introduces a Gaussian-guided distribution alignment strategy that draws the data distributions of the two domains toward a Gaussian distribution to reduce data distribution discrepancies. Furthermore, GAATN adopts a novel adversarial training mechanism for domain adaptation, designing two task-specific classifiers to identify target data so as to take the relationship between target data and category boundaries into account. Extensive experimental results demonstrate that the proposed method outperforms existing popular methods in both performance and robustness.

7.
王帆  韩忠义  尹义龙 《软件学报》2022,33(4):1183-1199
Unsupervised domain adaptation is one of the effective approaches to handling the distribution mismatch between the training set (source domain) and the test set (target domain). Existing unsupervised domain adaptation theories and methods have achieved some success in relatively closed, static environments. In open, dynamic task environments, however, under constraints such as privacy protection and data silos, the source-domain data are often not directly accessible, and the robustness of existing unsupervised domain adaptation methods faces severe challenges. In view of this, this work studies a more challenging yet…

8.
汪云云  孙顾威  赵国祥  薛晖 《软件学报》2022,33(4):1170-1182
Unsupervised domain adaptation (UDA) aims to use a source domain with large amounts of labeled data to help the learning of a target domain without any label information. In UDA, the source and target domains are usually assumed to have different data distributions but share the same class label space. In real-world open learning scenarios, however, the label spaces across domains are highly likely to differ. In extreme cases, the domains share no common classes, i.e., the classes in the target domain…

9.
Domain adaptation for object detection has been extensively studied in recent years. Most existing approaches focus on single-source unsupervised domain adaptive object detection. A more practical scenario, however, is that the labeled source data are collected from multiple domains with different feature distributions. Conventional approaches do not work well in this setting because multiple domain gaps exist. We propose a Multi-source domain Knowledge Transfer (MKT) method to handle this situation. First, the low-level features from multiple domains are aligned by learning a shallow feature extraction network. Then, the high-level features from each pair of source and target domains are aligned by the subsequent multi-branch network. After that, we perform two kinds of information fusion: (1) we train a detection network shared by all branches based on the transferability of each source sample feature, where transferability means the degree to which a source sample feature is indistinguishable from the target-domain sample features; and (2) at inference time, the target sample features output by the multi-branch network are fused based on the average transferability of each domain. Moreover, we leverage both image-level and instance-level attention to promote positive cross-domain transfer and suppress negative transfer. Our main contributions are the two-stage feature alignment and the information fusion. Extensive experimental results on various transfer scenarios show that our method achieves state-of-the-art performance.

10.
In unsupervised domain adaptation, the classifier is prone to confused predictions when assigning classes to target-domain samples. Prior work has proposed algorithms that extract inter-class correlations of samples and thereby reduce the classifier's class-confusion predictions on the target domain, but it still fails to address the weak transfer capability caused by the sparsity of features shared between the source and target domains. To address this problem, this work applies style transfer to the source domain using a generative adversarial network, expanding the feature space of each source class with shared features that the target domain can match; this remedies the insufficient positive transfer caused by sparse shared features and further reduces the classifier's class-confusion predictions on the target domain. When the classifier uses the expanded shared features to predict class probabilities for target-domain samples, an uncertainty-based weighting mechanism increases the weight of the prediction probabilities so that they stand out with higher values at a few probability peaks, accurately quantifying class confusion, minimizing cross-domain class-confusion predictions, and suppressing cross-domain negative transfer. Under the UDA setting, domain adaptation experiments on the standard ImageCLEF-DA data set and the three subsets of Office-31 improve the average recognition accuracy over the RADA algorithm by 1.3 and 1.7 percentage points, respectively.

11.
When building unsupervised domain adaptation classifiers based on the extreme learning machine (ELM), the hidden-layer parameters are usually chosen at random, and randomly chosen parameters have no domain adaptation capability. To strengthen the knowledge transfer ability of cross-domain ELMs, this paper proposes a new ELM-based unsupervised domain adaptation classifier learning method. The method uses an ELM autoencoder to reconstruct the source- and target-domain data, yielding hidden-layer parameters with domain-invariant properties. Further, combining joint probability distribution matching with manifold regularization, the output-layer weights are adaptively adjusted. The proposed algorithm thus endows both layers of ELM parameters with domain adaptation capability; experimental results on character and object recognition data sets show high cross-domain classification accuracy.

12.
In many applications, a face recognition model learned on a source domain but applied to a novel target domain degrades, often significantly, due to the mismatch between the two domains. Aiming at learning a better face recognition model for the target domain, this paper proposes a simple but effective domain adaptation approach that transfers the supervision knowledge from a labeled source domain to the unlabeled target domain. Our basic idea is to convert the source domain images to the target domain (hereinafter termed targetizing the source domain) while keeping their supervision information. For this purpose, each source domain image is simply represented as a linear combination of sparse target domain neighbors in the image space, with the combination coefficients, however, learned in a common subspace. The principle behind this strategy is that the common knowledge is only favorable for accurate cross-domain reconstruction; for classification in the target domain, the specific knowledge of the target domain is also essential and thus should be mostly preserved (through targetization in the image space in this work). To discover the common knowledge, specifically, a common subspace is learned in which the structures of both domains are preserved and the disparity between the source and target domains is reduced. The proposed method is extensively evaluated under three face recognition scenarios, i.e., domain adaptation across view angle, across ethnicity, and across imaging condition. The experimental results illustrate the superiority of our method over competitive ones.
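The targetization step can be sketched as a sparse reconstruction: each source sample is rebuilt from a few target-domain neighbors. In the paper the combination coefficients are learned in a common subspace; the plain input-space least-squares fit below (with `targetize` and `k` as our illustrative choices) only conveys the idea.

```python
import numpy as np

def targetize(xs, Xt, k=5):
    # Rebuild source sample xs as a linear combination of its k nearest
    # target-domain samples. The k-neighbor restriction gives the sparse
    # support; least squares gives the combination coefficients.
    d = np.linalg.norm(Xt - xs, axis=1)
    idx = np.argsort(d)[:k]              # indices of the k nearest neighbors
    B = Xt[idx].T                        # (dim, k) basis of target neighbors
    coef, *_ = np.linalg.lstsq(B, xs, rcond=None)
    return B @ coef, idx, coef

rng = np.random.default_rng(2)
Xt = rng.normal(0, 1, (200, 10))         # target-domain samples
xs = rng.normal(0, 1, 10)                # one source-domain sample
recon, idx, coef = targetize(xs, Xt)
err = np.linalg.norm(recon - xs)
```

Because the least-squares fit optimizes over the span of the neighbors, the reconstruction is never worse than simply snapping to the single nearest target sample.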

13.
李庆勇  何军    张春晓 《智能系统学报》2021,16(6):999-1006
Adversarial training has become the mainstream approach to domain adaptation: a domain classifier aligns the feature distributions of the source and target domains, reducing the distribution discrepancy across domains. However, existing domain adaptation methods only shrink the distance between data of different domains without considering the relationship between the target-domain data distribution and the decision boundary, which degrades the intra-domain separability of features of different classes in the target domain. To remedy these shortcomings, this paper proposes an unsupervised domain adaptation algorithm based on adversarial training on classification discrepancy and information entropy (ACDIE). The algorithm aligns the inter-domain discrepancy using the disagreement between two classifiers, and reduces uncertainty by minimizing information entropy, which pushes target-domain features away from the decision boundary and improves the separability of different classes. Experimental results on digit recognition data sets and the Office-31 data set show that ACDIE learns better feature representations and clearly improves domain adaptation classification accuracy.
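The entropy term such a method minimizes can be illustrated directly: the Shannon entropy of a class-probability vector is low for confident predictions far from the decision boundary and high near it. This is a generic sketch of the quantity, not the authors' code.

```python
import numpy as np

def prediction_entropy(p, eps=1e-12):
    # Shannon entropy of each row of class probabilities; low entropy means
    # a confident prediction far from the decision boundary.
    return -np.sum(p * np.log(p + eps), axis=1)

confident = np.array([[0.95, 0.05], [0.02, 0.98]])   # far from the boundary
uncertain = np.array([[0.55, 0.45], [0.49, 0.51]])   # near the boundary
low = prediction_entropy(confident).mean()
high = prediction_entropy(uncertain).mean()
```

Using the mean of this quantity over target samples as an extra loss term drives the feature extractor to place target features away from the decision boundary.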

14.
Unsupervised Domain Adaptation (UDA) aims to use a source domain with large amounts of labeled data to help the learning of a target domain without any label information. In UDA, the source and target domains are usually assumed to have different data distributions but share the same class label space. Nevertheless, in real-world open learning scenarios, label spaces are highly likely to differ across domains. In extreme cases, the domains share no common classes, i.e., all classes in the target domain are new classes. In such a case, directly transferring the class-discriminative knowledge from the source domain may impair the performance in the target domain and lead to negative transfer. For this reason, this paper proposes unsupervised new-set domain adaptation with self-supervised knowledge (SUNDA), which transfers sample contrastive knowledge from the source domain and uses self-supervised knowledge from the target domain to guide the knowledge transfer. Specifically, the initial features of the source and target domains are learned by self-supervised learning, and some network parameters are frozen to preserve target domain information. Sample contrastive knowledge from the source domain is then transferred to the target domain to assist the learning of class-discriminative features in the target domain. Moreover, a graph-based self-supervised classification loss is adopted to handle the problem of target-domain classification with no inter-domain common classes. SUNDA is evaluated on cross-domain transfer for handwritten digits without any common class and cross-race transfer for face data without any common class. The experiments show that SUNDA outperforms UDA, unsupervised clustering, and new class discovery methods in learning performance.

15.
Predicting the response variables of the target dataset is one of the main problems in machine learning. Predictive models are desired to perform satisfactorily in a broad range of target domains. However, that may not be plausible if there is a mismatch between the source and target domain distributions. The goal of domain adaptation algorithms is to solve this issue and deploy a model across different target domains. We propose a method based on kernel distribution embedding and the Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both source and target data into a new feature space with two properties: 1) the distributions of the source and the target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can be in a lower dimensional space while preserving the aforementioned properties, and therefore the method can be considered a dimensionality reduction method as well. Our proposed method has a closed-form solution and the experimental results show that it works well in practice.
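The HSIC dependence measure underlying such methods has a simple empirical form, tr(KHLH)/(n-1)^2 with centering matrix H. A sketch of the biased estimator with RBF kernels follows; the toy data and bandwidth are our own choices, not the paper's setup.

```python
import numpy as np

def hsic(K, L):
    # Biased empirical Hilbert-Schmidt Independence Criterion from two
    # kernel matrices K (inputs) and L (responses): tr(K H L H) / (n-1)^2.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def rbf(X, gamma=1.0):
    # RBF kernel matrix of a sample with itself.
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
x = rng.normal(size=(300, 1))
y_dep = x + 0.1 * rng.normal(size=(300, 1))   # strongly dependent response
y_ind = rng.normal(size=(300, 1))             # independent response
h_dep = hsic(rbf(x), rbf(y_dep))
h_ind = hsic(rbf(x), rbf(y_ind))
```

A dependent response yields a clearly larger HSIC value than an independent one, which is why maximizing HSIC between the embedding and the data preserves structural information.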

16.
17.
田青  孙灿宇  储奕 《软件学报》2024,35(4):1703-1716
As an emerging area of machine learning, multi-source partial domain adaptation (MSPDA) poses challenges for research due to the complexity of the source domains themselves, the discrepancies between domains, and the unsupervised nature of the target domain, so few related works have been proposed. In this scenario, samples of irrelevant classes in the multiple source domains cause large accumulated errors and negative transfer during domain adaptation. In addition, most existing multi-source domain adaptation methods do not consider that different source domains contribute differently to the target-domain task. Therefore, a multi-source partial domain adaptation method with adaptive weights (AW-MSPDA) is proposed. First, a diverse feature extractor is constructed to exploit the prior knowledge of the source domains effectively. Meanwhile, a multi-level distribution alignment strategy is designed to eliminate distribution discrepancies at different levels and promote positive transfer. Moreover, to quantify the contribution of each source domain and filter out irrelevant-class source samples, adaptive weights are constructed from similarity measures and pseudo-label weighting. Finally, extensive experiments verify the generalization ability and superiority of the proposed AW-MSPDA algorithm.

18.
The deep decision-tree transfer learning boosting method (DTrBoost) can effectively transfer from a single supervised source domain to a target domain, but it cannot handle unsupervised transfer from multiple source domains. To address this problem, an unsupervised transfer learning boosting method with optimized weights under multi-source distributions is proposed. The main idea is to compute the KL divergence between each source domain's distribution and the target domain's, select an appropriate number of samples from the different source domains by comparison to train classifiers, and assign pseudo-labels to the target-domain samples. Finally, learning weights are allocated according to each source domain's KL distance, and the labeled source-domain samples are trained jointly with the pseudo-labeled target domain to obtain the final result. Comparative experiments show that the proposed algorithm achieves better classification accuracy and adapts to different data sets, reducing the classification error rate by 2.4% on average and by more than 6% on the best-performing marketing data set.
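The KL-based source weighting can be sketched with histogram density estimates: sources whose feature distribution is closer to the target (smaller KL divergence) receive larger training weight. `kl_weights`, the bin count, and the inverse-KL normalization below are our illustrative choices, not the paper's exact scheme.

```python
import numpy as np

def kl_weights(sources, target, bins=20, eps=1e-8):
    # Estimate a histogram density per source on shared bins, compute
    # KL(target || source), and weight each source inversely to its KL
    # distance from the target distribution.
    lo = min(min(s.min() for s in sources), target.min())
    hi = max(max(s.max() for s in sources), target.max())
    edges = np.linspace(lo, hi, bins + 1)
    q = np.histogram(target, edges)[0] + eps
    q = q / q.sum()
    kls = []
    for s in sources:
        p = np.histogram(s, edges)[0] + eps
        p = p / p.sum()
        kls.append(np.sum(q * np.log(q / p)))
    inv = 1.0 / np.array(kls)
    return inv / inv.sum()

rng = np.random.default_rng(3)
target = rng.normal(0.0, 1, 1000)
near = rng.normal(0.1, 1, 1000)     # source close to the target distribution
far = rng.normal(3.0, 1, 1000)      # source far from the target distribution
w = kl_weights([near, far], target)
```

The nearby source dominates the learning weight, which matches the intuition that samples from well-aligned sources should contribute most to the ensemble.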

19.
When the training data and test data come from different cover sources, i.e., under cover-source mismatch, the detection accuracy of an otherwise well-performing steganalyzer usually drops. In practice, steganalysts often have to process images collected from the Internet. Compared with the training data, these suspect images are likely to have completely different capture and processing histories, so the steganalysis model may suffer varying degrees of performance degradation, which is why steganalysis tools are hard to deploy successfully in real applications. To increase the practical value of deep-learning-based steganalysis, this work exploits the information in the test samples and applies domain adaptation to the cover-source mismatch problem: the training data serve as the source domain and the test data as the target domain, and the detection performance of the steganalyzer on the target domain is improved by minimizing the feature-distribution discrepancy between the two. An adversarial subdomain adaptation network (ASAN) is proposed. On one hand, from the perspective of feature generation, the steganalysis model is required to generate source- and target-domain features that are as similar as possible, so that the discriminator cannot tell which domain a feature comes from; on the other hand, to reduce the discrepancy of feature distributions across domains, a subdomain adaptation method reduces undesirable variation in the distributions of the relevant subdomains, effectively enlarging the distance between cover and stego samples and benefiting classification accuracy. Through…

20.
A theory of learning from different domains
Discriminative learning methods for classification perform well when training and test data are drawn from the same distribution. Often, however, we have plentiful labeled training data from a source domain but wish to learn a classifier which performs well on a target domain with a different distribution and little or no labeled training data. In this work we investigate two questions. First, under what conditions can a classifier trained from source data be expected to perform well on target data? Second, given a small amount of labeled target data, how should we combine it during training with the large amount of labeled source data to achieve the lowest target error at test time? We address the first question by bounding a classifier's target error in terms of its source error and the divergence between the two domains. We give a classifier-induced divergence measure that can be estimated from finite, unlabeled samples from the domains. Under the assumption that there exists some hypothesis that performs well in both domains, we show that this quantity together with the empirical source error characterize the target error of a source-trained classifier. We answer the second question by bounding the target error of a model which minimizes a convex combination of the empirical source and target errors. Previous theoretical work has considered minimizing just the source error, just the target error, or weighting instances from the two domains equally. We show how to choose the optimal combination of source and target error as a function of the divergence, the sample sizes of both domains, and the complexity of the hypothesis class. The resulting bound generalizes the previously studied cases and is always at least as tight as a bound which considers minimizing only the target error or an equal weighting of source and target errors.
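The first bound the abstract describes is usually stated as follows (h is a hypothesis, ε_S and ε_T its source and target errors, and λ the error of the ideal joint hypothesis assumed to exist):

```latex
% Target error bounded by source error, the H-delta-H divergence between
% the domain distributions, and the best joint-hypothesis error \lambda:
\epsilon_T(h) \;\le\; \epsilon_S(h)
  + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T)
  + \lambda,
\qquad
\lambda = \min_{h' \in \mathcal{H}} \bigl[ \epsilon_S(h') + \epsilon_T(h') \bigr].
```

The second question is then analyzed for a model minimizing the convex combination of empirical errors, α·ε̂_T(h) + (1−α)·ε̂_S(h), with the optimal α depending on the divergence, the two sample sizes, and the hypothesis-class complexity.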


Copyright©北京勤云科技发展有限公司  京ICP备09084417号