Similar literature
 Found 20 similar articles; search took 15 ms
1.
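Maximum mean discrepancy (MMD) reflects only the overall distribution and global structure of the sample space and ignores how individual samples differ in their contribution to the global measure. To address this, a maximum distribution weighted mean discrepancy (MDWMD) measure is proposed, which uses whitened cosine similarity to assign a distribution weight to every sample in the source and target domains, so that each sample's distribution discrepancy is reflected in the global measure. Building on MDWMD and the idea of joint distribution adaptation, a domain adaptation learning algorithm is further proposed: joint distribution adaptation based on maximum distribution weighted mean embedding, which adjusts both the marginal and the conditional distributions of the source-domain and target-domain data. Experimental results show that, compared with typical existing transfer learning and non-transfer learning algorithms, the proposed algorithm achieves higher classification accuracy on several types of cross-domain image datasets.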
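As a rough illustration of the weighting idea in this abstract, the sketch below compares the weighted sample means of two domains under a linear kernel. The weights are hypothetical placeholders: the abstract derives them from whitened cosine similarity, which is not reproduced here, and the function names are assumptions rather than the authors' implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def weighted_mean(samples: Sequence[Vector], weights: Sequence[float]) -> List[float]:
    """Weighted mean of the samples; weights are normalized to sum to 1."""
    total = sum(weights)
    dim = len(samples[0])
    return [
        sum(w * x[d] for x, w in zip(samples, weights)) / total
        for d in range(dim)
    ]


def weighted_mean_discrepancy(
    source: Sequence[Vector],
    source_weights: Sequence[float],
    target: Sequence[Vector],
    target_weights: Sequence[float],
) -> float:
    """Squared Euclidean distance between the two weighted means.

    This is the linear-kernel special case; a kernelized version would
    replace the explicit means with weighted kernel sums.
    """
    mu_s = weighted_mean(source, source_weights)
    mu_t = weighted_mean(target, target_weights)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


source = [[0.0, 0.0], [2.0, 0.0]]
target = [[1.0, 1.0], [3.0, 1.0]]

# Uniform weights reduce to the ordinary (unweighted) mean discrepancy.
uniform = weighted_mean_discrepancy(source, [1.0, 1.0], target, [1.0, 1.0])

# Hypothetical non-uniform weights change each sample's contribution.
reweighted = weighted_mean_discrepancy(source, [1.0, 3.0], target, [3.0, 1.0])
```

With uniform weights every sample contributes equally, which is the behavior the abstract criticizes; the non-uniform call shows how per-sample weights shift the measured discrepancy.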

2.
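To minimize the joint distribution discrepancy between the source and target domains in domain adaptation, a two-stage domain adaptation learning method is proposed. The first stage takes into account the discriminative information in the sample labels and data structure and learns a shared projection that minimizes the marginal distribution discrepancy in the projected shared space. The second stage uses labeled source-domain data and unlabeled target-domain data to learn an adaptive classifier with structural risk, which not only minimizes the conditional distribution discrepancy between the source and target domains but also preserves the manifold consistency of their marginal distributions. Experiments on three benchmark datasets show that the method performs well on both evaluation metrics: average classification accuracy and the Kappa coefficient.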
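The Kappa coefficient mentioned above is presumably Cohen's kappa, although the abstract does not say so. Under that assumption, a minimal sketch of the metric follows; the function name and the toy labels are illustrative only.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two labelings, corrected for chance.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected from the label frequencies alone.
    Undefined (division by zero) when p_e equals 1.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Three of four predictions agree (p_o = 0.75); chance agreement is 0.5.
kappa = cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0])
```

Unlike plain accuracy, kappa discounts the agreement a classifier would reach by chance, which is why the two metrics are often reported together.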

3.
In this work, we study the problem of cross-domain video concept detection, where the distributions of the source and target domains differ. Active learning can be used to iteratively refine a source domain classifier by querying labels for a few samples in the target domain, which reduces the labeling effort. However, traditional active learning methods, which often use a discriminative query strategy that queries the samples most ambiguous to the source domain classifier, fail when the distribution difference between the two domains is too large. We tackle this problem by proposing a joint active learning approach that combines a novel generative query strategy with the existing discriminative one. The approach adaptively fits the distribution difference and is more robust than approaches that use a single strategy. Experimental results on two synthetic datasets and the TRECVID video concept detection task highlight the effectiveness of our joint active learning approach.

4.
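Objective: Deep neural networks have been applied successfully to many machine learning tasks and have delivered striking performance gains. However, traditional deep networks and machine learning algorithms assume that training and test data follow the same distribution, an assumption that often fails in practice. If the two distributions differ greatly, the performance of a classifier trained by traditional machine learning algorithms degrades substantially. To solve this problem, an unsupervised domain adaptation method based on multi-layer correction is proposed. Methods: First, multi-layer correction is used to adjust an existing deep network, and additive superposition is used to perfectly align the data representations of the source and target domains. Then, a multi-layer weighted maximum mean discrepancy is adopted to adapt to the target domain and increase the representational power of the network. Finally, the learned domain-invariant features are extracted for classification to obtain the recognition results for the target images. Results: The algorithm was tested on four digital datasets, including the Office-31 image dataset, to compare the image recognition and classification performance of different algorithms and to measure accuracy. The results show that, compared with algorithms in the same field, the proposed algorithm improves accuracy by at least 5% and still classifies well under interference such as illumination changes, complex backgrounds, and poor image quality, demonstrating stronger robustness. Conclusion: Experimental results on domain adaptation datasets show that the method has a degree of generalization ability, achieves high classification performance, and outperforms other existing unsupervised domain adaptation methods.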

5.
Automatic annotation of images is one of the fundamental problems in computer vision applications. With the increasing amount of freely available images, it is quite possible that the training data used to learn a classifier has a different distribution from the data used for testing. This degrades classifier performance and gives rise to the problem known as domain adaptation. A framework for domain adaptation typically requires a classification model that can combine the results of several classifiers to reach the desired accuracy. This work proposes depth-based and iterative depth-based fusion methods, which are rank-based fusion methods that use the rank of the labels predicted by different classifiers. Two frameworks are also proposed for domain adaptation. The first uses traditional machine learning algorithms, while the second works with metric learning as well as a transfer learning algorithm. Motivated by ImageCLEF’s 2014 domain adaptation task, these frameworks with the proposed fusion methods are validated by experiments on images from five domains with varied distributions. Bing, Caltech, ImageNet, and PASCAL are used as source domains, and the target domain is SUN. Twelve object categories are chosen from these domains. The experimental results show a performance improvement not only over the baseline system but also over the winner of ImageCLEF’s 2014 domain adaptation challenge.

6.
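During inter-domain distribution adaptation, important domain-specific information is easily lost, which makes it hard to train an effective classifier on the source domain and degrades its generalization and labeling performance on the target domain. To address this, a transfer learning method with joint inter-class and inter-domain distribution adaptation is proposed. A common projection matrix is learned that maps the source and target domains into a common subspace. Maximum mean discrepancy is used to measure both the inter-class and the inter-domain distribution distances. While optimizing the objective function, the method not only explicitly reduces the inter-domain distribution discrepancy but also enlarges the differences between classes, improving knowledge transfer between the source and target domains. Experiments on transfer learning datasets demonstrate the effectiveness of the method.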

7.
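To address the poor generalization that results from matching only the marginal probability distribution when reducing the discrepancy between the source and target domains, a feature-based and instance-based transfer learning algorithm is proposed that reduces the inter-domain discrepancy by jointly matching the marginal and conditional probability distributions. Kernel principal component analysis is used to find a new feature representation of the samples in a subspace, where minimizing the maximum mean discrepancy jointly matches the marginal and conditional probability distributions to reduce the discrepancy between the source and target domains. In addition, an L2,1-norm constraint selects relevant source-domain instances for training, further improving the generalization of the model obtained by transfer learning. Experiments on character and object recognition datasets demonstrate the effectiveness of the algorithm.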

8.
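Li Zhiheng, 《计算机应用研究》 (Application Research of Computers), 2021, 38(2): 591-594, 599
To address the mismatch between the probability distributions of training and test samples in machine learning, a semi-supervised domain adaptation method based on dropout regularization is proposed to transfer the feature representation of a neural network from a label-rich source domain to an unlabeled target domain. Taking a semi-supervised perspective, the method adds a small amount of labeled target-domain data to the source-domain data, so that the network learns the feature distribution of the target domain along with that of the source domain. Guided by this prior knowledge, the network can fit the target-domain data well even without abundant label information. Experimental results show that the algorithm outperforms other existing algorithms on domain adaptation tasks over the typical digit datasets SVHN, MNIST, and USPS, and is fairly robust on domain adaptation tasks over the real-world datasets CIFAR-10 and STL-10, which cover a wide range of natural categories.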

9.
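Objective: Existing image recognition methods perform well when training and test data are drawn from the same distribution, but this condition does not hold in real-world scenarios, which lowers recognition accuracy. Domain adaptation is an effective way to address this problem; it targets data from two domains that are related but differently distributed. Methods: Based on an analysis of the data distributions, a joint balanced adaptation method based on attention transfer is proposed, which transfers image features extracted from labeled source-domain data to the unlabeled target domain. First, an attention transfer mechanism transfers the spatial category information of the labeled source-domain data to the unlabeled target domain; attention is defined for the convolutional neural network, and the attended information is used to improve recognition accuracy. Second, a prior distribution over the network parameters is introduced based on the target dataset, and the network is given the ability to automatically adjust the feature alignment of each domain alignment layer. Finally, a cross-domain bias describes the input distribution of the domain-specific feature alignment layers and quantifies the degree of domain adaptation learned at each layer. Results: The method achieves an average recognition accuracy of 77.6% on Office-31 and 90.7% on Office-Caltech, far ahead of traditional hand-crafted feature methods and comparable to the current best methods. Conclusion: The attention-transfer-based joint balanced domain adaptation method not only achieves high recognition accuracy but also automatically learns the degree of feature alignment between domains; it also confirms that inter-domain feature transfer can improve network optimization.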

10.
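Intelligent image cropping has long suffered from a lack of training data and is still confined to public datasets. Because a domain shift exists between real application scenarios and training scenarios, an intelligent cropping algorithm based on sequential adversarial domain adaptation is proposed. First, experiments confirm that a domain shift exists between the cropping datasets GAICD and CPC. Then, an algorithm composed of an aesthetic scoring module and an adversarial domain adaptation module is constructed. The aesthetic scoring module predicts the aesthetic score of an image and helps extract invariant features for the cropping task. The adversarial domain adaptation module performs adversarial domain adaptation learning. Domain shift experiments between different cropping datasets and between indoor and outdoor scenes both verify the effectiveness of the algorithm.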

11.
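Cross-project defect prediction (CPDP) has become an important research direction in software engineering and data mining. It builds prediction models from the defective code of other, data-rich projects, which solves the problem of insufficient data during model construction. However, the distribution difference between the code files of the source and target projects leads to poor cross-project prediction. Most studies use domain adaptation to address this problem, but existing methods consider only the effect of either the conditional or the marginal distribution on defect prediction and ignore its dynamic nature, and they do not select suitable pseudo-labels. To address both issues, this paper proposes a cross-project defect prediction method based on dynamic distribution alignment and pseudo-label learning (DPLD). Specifically, adversarial domain adaptation reduces the marginal and conditional distribution differences between projects in a domain alignment module and a class alignment module, respectively, and a dynamic distribution factor describes the relative importance of the two distributions dynamically and quantitatively. In addition, a pseudo-label learning method is proposed that uses the geometric similarity between data points to improve the accuracy of pseudo-labels used in place of true labels. In experiments on the PROMISE dataset, F-measure and AUC improved by 22.98% and 15.21%, respectively, demonstrating the effectiveness of the method in reducing distribution differences between projects and improving cross-project defect prediction.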

12.
Despite the recent success of data-driven machinery fault diagnosis, cross-domain diagnostic tasks, in which the supervised training data and unsupervised testing data are collected under different operating conditions, remain challenging. To address the domain shift problem, most existing studies minimize the marginal domain distribution discrepancy. While improvements have been achieved, class-level alignment between domains is generally neglected, which degrades testing performance. This paper proposes an adversarial multi-classifier optimization method for cross-domain fault diagnosis based on deep learning. Through adversarial training, the overfitting of different classifiers is exploited to achieve class-level domain adaptation, facilitating the extraction of domain-invariant features and the development of cross-domain classifiers. Experiments on three rotating machinery datasets are carried out for validation, and the results suggest that the proposed method is promising for cross-domain fault diagnosis tasks.

13.
Many machine learning algorithms assume that the training and test samples lie in the same feature space and have the same distribution. However, this assumption does not hold for many real applications. In this paper, we survey the problem in which the training samples and the test samples come from different distributions, which is referred to as domain adaptation. The training samples, always labeled, are obtained from the source domains, while the test samples, which usually have no labels or only a few, are obtained from the target domains. The source and target domains are different but related to some extent, so a learner can use information from the source domains when learning in the target domains. We focus on the multi-source domain adaptation problem, in which more than one source domain is available together with a single target domain. A key issue is how to select good sources and samples for the adaptation. In this survey, we review theoretical results and well-developed algorithms for the multi-source domain adaptation problem. We also discuss open problems that can be explored in future work.

14.
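Transfer learning transfers knowledge from a source domain to solve problems in a target domain and can effectively handle inconsistent data distributions. Traditional multi-source transfer methods lack a sound analysis of the transferability of multiple source domains and an effective treatment of the transfer results. To address this, a multi-source adaptive transfer learning method based on manifold structure is proposed, which aims to improve single-source transfer while achieving effective multi-source transfer. First, the transferability of the source domains is analyzed and transferable source domains are selected. Then, the marginal and conditional distributions are adapted and a balance factor is introduced to obtain a balanced distribution adaptation, while manifold regularization constrains the data structure so that the information in each single source domain is used to the fullest. Finally, weighting factors adaptively weight the classifiers of the different source domains so that the information from multiple source domains is fully used to solve the target-domain problem. The algorithm is applied to the selection of abrasive media in barrel finishing: a similarity matching method for abrasive media is established, and a selection model based on manifold-structured multi-source adaptive transfer learning is built. Extensive comparative experiments show that the proposed method performs better, with accuracy reaching up to 73.44%, and can provide more effective decision guidance for selecting abrasive media in barrel finishing.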

15.
Transfer learning is a widely investigated learning paradigm that was initially proposed to reuse informative knowledge from related domains, since supervised information is scarce in the target domain but sufficiently available in multiple source domains. One of the challenging issues in transfer learning is how to handle the distribution differences between the source domains and the target domain. Most studies implicitly assume that the data distributions of the source domains and the target domain are similar in a well-designed feature space. However, the label assignments for data in the source domains and the target domain often differ significantly. In practice, therefore, even if the distribution difference between a source domain and a target domain is reduced, the knowledge from multiple source domains is not transferred well to the target domain unless the label information is carefully considered. In addition, noisy data often appear in real-world applications, and handling them in the transfer learning setting is challenging because they inevitably cause side effects during knowledge transfer. For these reasons, we propose a framework that is robust to noise in the transfer learning setting. We also explicitly consider the differences in data distributions and label assignments among multiple source domains and the target domain. Experimental results on one synthetic dataset, three UCI datasets, and one real-world text dataset at different noise levels demonstrate the effectiveness of our method.

16.
In this paper, we study the problem of domain adaptation, a crucial ingredient of transfer learning with two domains: a source domain with labeled data and a target domain with few or no labels. Domain adaptation aims to extract knowledge from the source domain to improve the performance of the learning task in the target domain. A popular approach to this problem is adversarial training, which is explained by the $\mathcal H \Delta \mathcal H$-distance theory. However, traditional adversarial network architectures align only the marginal feature distribution in the feature space; alignment of the class-conditional distribution is not guaranteed. Therefore, we propose a novel method based on pseudo-labels and the cluster assumption to avoid incorrect class alignment in the feature space. The experiments demonstrate that our framework improves accuracy on typical transfer learning tasks.

17.
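To overcome the feature differences between image domains and bridge the distribution gap, a cross-domain image classification method based on regularized transfer sparse concept coding is proposed. The distribution discrepancy between image domains and label correlation information are incorporated into the sparse coding model to learn robust sparse representations of cross-domain images. The low-dimensional manifold structure of the images is mined from the high-dimensional image feature space to form a set of basis vectors, from which the transfer sparse concept coding of the cross-domain images is constructed. The method mines feature representations common to the different image domains and thereby transfers image labels across domains. Comparative experiments on several image databases show that the method obtains more robust image feature representations and that its classification performance is significantly better than that of the other methods compared.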

18.

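In the real world, training and test data often differ in distribution, so models built on the independent and identically distributed assumption lose robustness. Unsupervised domain adaptation is an important remedy with great practical value. Researchers in China and abroad have therefore done extensive work on its theoretical foundations and methods, advancing many application areas such as autonomous driving and smart healthcare. However, current mainstream methods still leave open questions: whether the probability distribution distance between the source and target domains truly represents the difference between them, and how to measure the difference between two distributions more accurately. How to use pseudo-labels more effectively also deserves further exploration. Backward pseudo-label and optimal transport (BPLOT) is proposed. It uses the Wasserstein distance and the Gromov-Wasserstein distance to compute the difference between two distributions more accurately from the perspective of optimal feature-topology transport, and it introduces a backward pseudo-label validation module that uses pseudo-labels more effectively by verifying their quality during training. The method is evaluated on several unsupervised domain adaptation datasets, and the results show that BPLOT outperforms all compared baseline methods.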
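To make the Wasserstein distance in this abstract concrete, the sketch below computes it in the simplest setting: two one-dimensional samples of equal size with uniform weights, where the optimal transport plan simply pairs the sorted values. This is only an illustration of the distance itself, not of BPLOT, and the function name is an assumption.

```python
from typing import Sequence


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    With uniform weights and equal sample counts, the optimal plan matches
    the i-th smallest value of xs to the i-th smallest value of ys, so the
    distance is the mean absolute difference of the sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# A pure shift of 5 units costs exactly 5 per unit of mass.
shifted = wasserstein_1d([0.0, 1.0, 3.0], [5.0, 6.0, 8.0])

# The same values in a different order describe the same distribution.
permuted = wasserstein_1d([0.0, 2.0], [2.0, 0.0])
```

Higher-dimensional features and the Gromov-Wasserstein distance require solving a transport problem and are usually handled by a dedicated optimal transport library rather than a closed form like this one.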


19.
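Xu Chunqiao, Zhang Bingbing, Li Peihua, 《计算机应用研究》 (Application Research of Computers), 2021, 38(10): 3040-3043, 3048
Domain adversarial learning is a mainstream domain adaptation approach that learns discriminative, domain-invariant features through a classifier and a domain discriminator. However, most existing domain adversarial methods learn domain-invariant features from first-order features and ignore second-order features, which are more expressive. A conditional adversarial domain adaptation network is proposed that jointly models the second-order representation of images and the cross-covariance between features and classifier predictions, so as to learn discriminative domain-invariant features more effectively. In addition, an entropy condition is introduced to balance the uncertainty of classifier predictions and ensure feature transferability. The method is validated on two commonly used domain adaptation datasets, Office-31 and ImageCLEF-DA, and the experimental results show that it outperforms comparable methods and achieves leading performance.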
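The abstract does not give the form of its entropy condition. One common choice in conditional adversarial domain adaptation is to weight each sample by 1 + exp(-H(p)), where H(p) is the entropy of the classifier's predicted distribution, so that confident predictions count more in the adversarial loss. The sketch below assumes that form; it may differ from what the authors use, and the function names are illustrative.

```python
import math
from typing import Sequence


def prediction_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_weight(probs: Sequence[float]) -> float:
    """Assumed weighting 1 + exp(-H(p)): lies in (1, 2], larger when confident."""
    return 1.0 + math.exp(-prediction_entropy(probs))


# A one-hot prediction has zero entropy and gets the maximum weight of 2.
confident = entropy_weight([1.0, 0.0, 0.0])

# A uniform prediction over 4 classes has entropy ln(4), giving 1 + 1/4.
uncertain = entropy_weight([0.25, 0.25, 0.25, 0.25])
```

Down-weighting uncertain samples in this way keeps predictions that are hard to transfer from dominating the domain discriminator's training signal.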

20.
Unsupervised domain adaptation (UDA) has achieved great success in cross-domain machine learning applications. It typically benefits model training on an unlabeled target domain by leveraging knowledge from a labeled source domain. For this purpose, existing work widely adopts minimization of the marginal and conditional distribution divergences between the source and target domains. Nevertheless, for the sake of privacy preservation, the source domain is usually provided not as training data but as a trained predictor (e.g., a classifier). This renders the above approaches infeasible because the marginal and conditional distributions of the source domain cannot be computed. To this end, this article proposes a source-free UDA method that jointly models domain adaptation and sample transport learning, namely Sample Transport Domain Adaptation (STDA). Specifically, STDA constructs a pseudo source domain according to the aggregated decision boundaries that multiple source classifiers produce on the target domain. It then refines the pseudo source domain by augmenting it with transported target samples of high confidence, and consequently generates labels for the target domain. We train the STDA model by alternating between domain adaptation and sample transport across the above steps, eventually achieving knowledge adaptation to the target domain and attaining confident labels for it. Finally, evaluation results validate the effectiveness and superiority of the proposed method.
