Similar Documents
20 similar documents found (search time: 312 ms)
1.
Many applications face the problem of learning from multiple information sources, where sources may be labeled or unlabeled, and information from multiple sources may be beneficial but cannot be integrated into a single source for learning. In this paper, we propose an ensemble learning method for different labeled and unlabeled sources. We first present two label propagation methods to infer the labels of training objects from unlabeled sources by making full use of class label information from labeled sources and internal structure information from unlabeled sources, processes referred to as global consensus and local consensus, respectively. We then predict the labels of testing objects using the ensemble learning model over the multiple information sources. Experimental results show that our method outperforms two baseline methods, scales better to large information sources, and is more robust to labeled sources with noisy data.
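The core label-propagation idea (inferring labels for objects from unlabeled sources via graph structure anchored by a labeled source) can be illustrated with a minimal sketch. The graph weights, the binary labels, and the clamping scheme below are illustrative assumptions, not the paper's exact global/local consensus procedure:

```python
def propagate_labels(edges, seed_labels, n_iter=100):
    """Iterative label propagation on a weighted graph.

    edges: dict mapping (i, j) -> similarity weight (each undirected edge once)
    seed_labels: dict mapping node -> +1.0 / -1.0 for nodes with known labels
    Returns real-valued scores; the sign gives the predicted label.
    """
    nodes = set()
    for i, j in edges:
        nodes.update((i, j))
    score = {v: float(seed_labels.get(v, 0.0)) for v in nodes}
    for _ in range(n_iter):
        new_score = {}
        for v in nodes:
            if v in seed_labels:            # clamp labeled nodes to their labels
                new_score[v] = float(seed_labels[v])
                continue
            num = den = 0.0
            for (i, j), w in edges.items():
                if i == v:
                    num += w * score[j]; den += w
                elif j == v:
                    num += w * score[i]; den += w
            new_score[v] = num / den if den else 0.0
        score = new_score
    return score

# A 4-node chain 0-1-2-3 with labels only at the endpoints:
scores = propagate_labels({(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0},
                          {0: +1.0, 3: -1.0})
```

After convergence, unlabeled nodes take the sign of their nearer labeled endpoint, which is the intuition behind propagating labels through internal structure.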

2.
Boosting for transfer learning from multiple data sources
Transfer learning aims at adapting a classifier trained on one domain with adequate labeled samples to a new domain where samples come from a different distribution and have no class labels. In this paper, we explore transfer learning problems with multiple data sources and present a novel boosting algorithm, SharedBoost. The algorithm is applicable to very high-dimensional data, such as in text mining where the feature dimension runs into the tens of thousands. The experimental results illustrate that SharedBoost significantly outperforms traditional methods that transfer knowledge with supervised learning techniques. SharedBoost also provides better classification accuracy and more stable performance than other typical transfer learning methods, such as structural correspondence learning (SCL) and structural learning, on multiple-source transfer learning problems.

3.
In many supervised learning problems, determining the true labels of training instances is expensive, laborious, or even practically impossible. As an alternative, it is much easier to collect multiple subjective (possibly noisy) labels from human labelers, especially through crowdsourcing services such as Amazon's Mechanical Turk. The collected labels are then aggregated to estimate the true labels. To reduce the negative effects of novices, spammers, and malicious labelers, the accuracies of the labelers must be taken into account. However, in the absence of true labels, the main source of information for estimating labeler accuracies is missing. This paper demonstrates that the agreements and disagreements among labeler opinions are useful sources of information that facilitate the accuracy estimation problem. We formulate the estimation as an optimization problem whose goal is to minimize the differences between the analytical probabilities of disagreement based on the estimated accuracies and the probabilities of disagreement according to the provided labels. We present an efficient semi-exhaustive search method to solve this optimization problem. Our experiments on simulated data and three real datasets show that the proposed method is promising in this emerging area. The source code of the proposed method is available at http://ceit.aut.ac.ir/~amirkhani.
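For binary labels, two independent labelers with accuracies p_i and p_j disagree with probability p_i(1-p_j) + (1-p_i)p_j, so observed disagreement rates constrain the accuracies even without ground truth. A toy version of the fit, using a coarse grid search as a much-simplified stand-in for the paper's semi-exhaustive method, might look like:

```python
import itertools

def disagreement_prob(p_i, p_j):
    # Probability that two independent labelers with these accuracies disagree
    return p_i * (1 - p_j) + (1 - p_i) * p_j

def estimate_accuracies(observed, n_labelers, grid=(0.55, 0.7, 0.85, 0.95)):
    """Pick the accuracy vector whose predicted pairwise disagreement
    probabilities best match the observed ones (least squares over a grid)."""
    best, best_err = None, float("inf")
    for acc in itertools.product(grid, repeat=n_labelers):
        err = sum((disagreement_prob(acc[i], acc[j]) - d) ** 2
                  for (i, j), d in observed.items())
        if err < best_err:
            best, best_err = acc, err
    return best

# Disagreement rates generated from true accuracies (0.95, 0.7, 0.85):
obs = {(0, 1): disagreement_prob(0.95, 0.7),
       (0, 2): disagreement_prob(0.95, 0.85),
       (1, 2): disagreement_prob(0.7, 0.85)}
est = estimate_accuracies(obs, 3)
```

With all candidate accuracies above 0.5, the three pairwise constraints pin down the three accuracies uniquely, which is why the search recovers the true vector here.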

4.
In the supervised classification framework, human supervision is required to label a set of learning data which are then used to build the classifier. However, in many applications, human supervision is imprecise, difficult, or expensive. In this paper, the problem of learning a supervised multi-class classifier from data with uncertain labels is considered, and a model-based classification method is proposed to solve it. The idea of the proposed method is to confront an unsupervised modeling of the data with the supervised information carried by the labels of the learning data in order to detect inconsistencies. The method can then build a robust classifier that takes the detected label inconsistencies into account. Experiments on artificial and real data highlight the main features of the proposed method, along with an application to object recognition under weak supervision.

5.
To bridge the gap between low-level image features and high-level semantics and to improve the precision of automatic image annotation, this paper combines graph-based learning with classification-based annotation and proposes a semi-supervised image semantic annotation method based on continuous prediction, along with an analysis of its complexity. The method exploits the information provided by labeled data and the relationships between labeled and unlabeled instances: based on the observation that neighboring points (instances) tend to belong to the same class, it constructs a K-nearest-neighbor graph, and a graph-based classifier computes adjacency information efficiently via a kernel function. On top of the constructed graph, the partitioned sets of sample nodes propagate labels through multi-label semi-supervised learning based on continuous prediction. Experiments show that the proposed algorithm significantly improves the average precision and average recall of annotation keywords in image annotation.
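The K-nearest-neighbor graph underlying this kind of method can be sketched as follows; the Euclidean distance and the small fixed k are illustrative choices, not the paper's exact construction:

```python
def knn_graph(points, k=2):
    """Build a k-nearest-neighbor graph: connect each point to its
    k closest points by squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    edges = set()
    for i, p in enumerate(points):
        dists = sorted((sqdist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))   # store undirected edges once
    return edges

# Two tight clusters; with k=1 no edge should cross between them:
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
g = knn_graph(pts, k=1)
```

Because edges only connect nearby points, labels propagated over such a graph stay within clusters, which is the assumption ("neighbors share a class") the abstract relies on.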

6.
As a variant of supervised learning, multi-instance learning (MIL) attempts to learn a classifier from the instances contained in bags. In MIL, labels are associated with bags rather than with individual instances: bag labels are known, while instance labels are not. MIL can address label ambiguity, but problems with weak labels are not easily solved. In the weak-label setting, both bag labels and instance labels are unknown and are treated as latent variables; given multiple labels and instances, the bag and instance labels can be approximated by weighting the different labels. This paper proposes a new multi-instance learning framework based on transfer learning to address the weak-label problem. A multi-instance transfer learning model is first constructed that transfers knowledge from a source task to a target task, thereby converting the weak-label problem into a multi-instance learning problem. On this basis, an iterative framework for solving the multi-instance transfer learning model is proposed. Experimental results show that the method outperforms existing multi-instance learning methods.

7.
Min-Ling, Zhi-Jian. Neurocomputing, 2009, 72(16-18): 3951
In multi-instance multi-label learning (MIML), each example is not only represented by multiple instances but also associated with multiple class labels. Several learning frameworks, such as traditional supervised learning, can be regarded as degenerated versions of MIML. Therefore, an intuitive way to solve a MIML problem is to identify its equivalent in one of these degenerated versions. However, this identification process loses useful information encoded in the training examples and thus impairs the learning algorithm's performance. In this paper, RBF neural networks are adapted to learn from MIML examples. Connections between instances and labels are directly exploited in the first-layer clustering and second-layer optimization. The proposed method demonstrates superior performance on two real-world MIML tasks.

8.
Supervised neural-network learning algorithms have proven very successful at solving a variety of learning problems. However, they suffer from the common requirement of explicit output labels, which makes such algorithms implausible as biological models. In this paper, it is shown that pattern classification can be achieved in a multilayered feedforward neural network without explicit output labels, through a process of supervised self-coding. The class projection is achieved by optimizing appropriate within-class uniformity and between-class discernability criteria. The mapping function and the class labels are developed together, iteratively, using the derived self-coding backpropagation algorithm. The ability of the self-coding network to generalize to unseen data is experimentally evaluated on real data sets and compares favorably with traditional labeled supervision with neural networks. Interesting features also emerge from the proposed self-coding supervision that are absent in conventional approaches. Further implications of supervised self-coding with neural networks are discussed.

9.
In the era of Big Data, a practical yet challenging task is to make learning techniques more universally applicable to complex learning problems such as multi-source multi-label learning. While early work has developed many effective solutions for multi-label classification and multi-source fusion separately, in this paper we study the two problems jointly and propose a novel method for the joint learning of multiple class labels and data sources, in which an optimization framework formulates the learning problem and the multi-label classification result is induced by the weighted combination of decisions from multiple sources. The proposed method is effective in exploiting label correlations and fusing multi-source data, especially long-tail data. Experiments on various multi-source multi-label data sets reveal the advantages of the proposed method.

10.
In this paper, we propose a general learning framework based on local and global regularization. In the local regularization part, our algorithm constructs a regularized classifier for each data point using its neighborhood, while the global regularization part adopts a Laplacian regularizer to smooth the data labels predicted by those local classifiers. We show that this learning framework can easily be incorporated into the unsupervised, semi-supervised, or supervised learning paradigm. Moreover, many existing learning algorithms can be derived from our framework. Finally, we present experimental results that show the effectiveness of our method.
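The Laplacian regularizer in the global part penalizes label functions that change across strongly connected points. A minimal sketch of the penalty term (the graph and labelings are illustrative, not the paper's full objective):

```python
def laplacian_penalty(edges, f):
    """Graph-smoothness penalty sum over edges of w_ij * (f_i - f_j)^2,
    i.e. f^T L f for the graph Laplacian L (each undirected edge once).

    edges: dict mapping (i, j) -> weight;  f: list of per-node label values.
    """
    return sum(w * (f[i] - f[j]) ** 2 for (i, j), w in edges.items())

# A 4-node chain: a labeling with one transition vs. one that oscillates.
chain = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
smooth = laplacian_penalty(chain, [1.0, 1.0, -1.0, -1.0])       # one sign flip
oscillating = laplacian_penalty(chain, [1.0, -1.0, 1.0, -1.0])  # three flips
```

Minimizing this term drives the predictions of the local classifiers toward agreement along graph edges, which is what "smoothing the data labels" means here.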

11.
An improved link prediction method for co-authorship networks
This work focuses on link prediction in social networks where the class-label attributes of entities are unknown. Since entity class labels depend on the specific social network, the problem is addressed concretely for co-authorship networks. First, a structural representation of the co-authorship graph is given, and whether an author is prolific is defined as the class-label attribute of author entities in the graph. An improved link prediction method based on supervised learning is then proposed, which introduces a new feature for each pair of authors: whether at least one of them is prolific. When the class-label attributes of author entities in the co-authorship graph are not fully known, the improved ICCLP algorithm, which incorporates this improved link prediction method, is used to predict co-authorship links and thereby improve link prediction performance.

12.
This paper presents an algorithm for learning fuzzy classification rules with neural network techniques: the supervised resonance competitive learning algorithm (SRCL). SRCL organically combines unsupervised ART learning with supervised competitive learning and can learn fuzzy classification rules effectively. The vigilance parameter adapts automatically, so the number of connection weight vectors is determined automatically. A numerical example is given and the experimental results are analyzed.

13.
Weakly supervised semantic segmentation from image-level labels is currently a popular research direction, and class activation map (CAM) generation is the most common approach to this class of problems. Because CAMs are sparse, the accuracy of the discriminative regions degrades. To address this, an improved Transformer-based weakly supervised image learning method is proposed. First, a spatial attention exchange layer is introduced to expand the coverage of the class activation maps; second, an attention-adaptive module is designed to guide the model to strengthen class responses in weak regions; in particular, an adaptive cross-domain module is constructed during class generation to improve the model's classification performance. The method achieves 73.5% on the Pascal VOC 2012 validation set and 73.0% on the test set. The experimental results show that the refined Transformer network learning method helps improve weakly supervised semantic segmentation performance.

14.
Learning-based hashing methods are becoming the mainstream for large-scale visual search. They consist of two main components: learning hash codes for training data and learning hash functions for encoding new data points. The performance of a content-based image retrieval system crucially depends on the feature representation, and Convolutional Neural Networks (CNNs) have proven effective for extracting high-level visual features for large-scale image retrieval. In this paper, we propose a Multiple Hierarchical Deep Hashing (MHDH) approach for large-scale image retrieval. MHDH integrates multiple hierarchical non-linear transformations with a hidden neural network layer for hash code generation. The learned binary codes represent latent concepts that connect to class labels. Extensive experiments on two popular datasets demonstrate the superiority of MHDH over both supervised and unsupervised hashing methods.

15.
Crowdsourcing services have proven efficient at collecting large amounts of labeled data for supervised learning tasks. However, the low cost of crowd workers leads to unreliable labels, creating a new problem for learning a reliable classifier. Although various methods have been proposed to infer the ground truth or to learn directly from crowd data, there is no guarantee that these methods work well for highly biased or noisy crowd labels. Motivated by this limitation of crowd data, in this paper we propose a novel framework for improving the performance of crowdsourcing learning tasks with some additional expert labels: we treat each labeler as a personal classifier and combine all labelers' opinions from a model-combination perspective, summarizing the evidence from crowds and experts naturally via a Bayesian classifier in the intermediate feature space formed by the personal classifiers. We also introduce active learning into our framework and propose an uncertainty sampling algorithm for actively obtaining expert labels. Experiments show that our method significantly improves learning quality compared with methods that use crowd labels alone.
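One simple way to let a few expert labels re-weight crowd opinions, as a much-simplified stand-in for the paper's Bayesian model combination, is to estimate each labeler's accuracy on the expert-labeled items and then take a weighted vote:

```python
def expert_weighted_vote(crowd, expert):
    """crowd: {labeler: {item: label}}; expert: {item: true_label} for a
    small subset of items.  Weight each labeler by Laplace-smoothed accuracy
    on the expert items, then take a weighted vote per item."""
    weights = {}
    for labeler, anns in crowd.items():
        hits = [anns[i] == y for i, y in expert.items() if i in anns]
        # Laplace smoothing keeps weights sane when few expert items overlap
        weights[labeler] = (sum(hits) + 1) / (len(hits) + 2)

    items = set().union(*(anns.keys() for anns in crowd.values()))
    result = {}
    for item in items:
        votes = {}
        for labeler, anns in crowd.items():
            if item in anns:
                votes[anns[item]] = votes.get(anns[item], 0.0) + weights[labeler]
        result[item] = max(votes, key=votes.get)
    return result

# Labeler "a" agrees with the experts; "b" and "c" do not.
crowd = {"a": {"x1": 1, "x2": 0, "x3": 1},
         "b": {"x1": 0, "x2": 1, "x3": 0},
         "c": {"x1": 0, "x2": 1, "x3": 0}}
labels = expert_weighted_vote(crowd, {"x1": 1, "x2": 0})
```

On item `x3` the raw majority (2 vs. 1) is overturned: the expert-calibrated weight of the reliable labeler outweighs the two unreliable ones, which is the effect the framework aims for.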

16.
A new approach for classifying circular knitted fabric defects is proposed, based on accepting uncertainty in the labels of the learning data. Basic classification methodologies assume that correct labels are assigned to the samples and concentrate on the strength of categorization. However, there are classification problems in which a considerable amount of uncertainty exists in the sample labels. The core innovation of this research is the use of uncertain labeling information combined with the Dempster-Shafer theory of evidence. The experimental results show the robustness of the proposed method in comparison with usual supervised classification techniques, where certain labels are assigned to the training data.
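Dempster's rule of combination, the core operation of the evidence theory this approach builds on, merges two bodies of evidence by multiplying the masses of intersecting hypothesis sets and renormalizing away the conflict. A minimal sketch with illustrative numbers (the hypothesis names and masses are made up for the example):

```python
def dempster_combine(m1, m2):
    """Dempster's rule: combine two mass functions defined over
    frozensets of hypotheses, discarding and renormalizing conflict."""
    combined, conflict = {}, 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            inter = set_a & set_b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:                       # disjoint hypotheses: conflicting mass
                conflict += mass_a * mass_b
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

A, B = frozenset({"defect"}), frozenset({"no_defect"})
both = A | B                            # mass on {A, B} = total ignorance
m = dempster_combine({A: 0.6, both: 0.4},
                     {A: 0.5, B: 0.3, both: 0.2})
```

Assigning mass to the whole set `both` is how label uncertainty is expressed: an uncertain labeler commits belief to "defect or no_defect" rather than forcing a single class.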

17.
This paper addresses supervised learning in which the class memberships of the training data are subject to ambiguity. The problem is tackled within the ensemble learning and Dempster-Shafer theory of evidence frameworks. The initial labels of the training data are ignored and, by utilizing the prototypes of the main classes, each training pattern is reassigned to one class or a subset of the main classes based on the level of ambiguity concerning its class label. A multilayer perceptron neural network is employed to learn the characteristics of the data with the new labels, and for a given test pattern its outputs are treated as a basic belief assignment. Experiments with artificial and real data demonstrate that taking the ambiguity in the labels of the learning data into account can provide better classification results than single and ensemble classifiers that solve the classification problem using data with the initial imperfect labels.

18.
Multi-label learning is a very important machine learning paradigm. Traditional multi-label learning methods are designed for supervised or semi-supervised settings and usually require accurate multi-category annotations for all or part of the data. In many real applications, large amounts of annotated label information are hard to obtain, which limits the adoption and application of multi-label learning. By contrast, label correlation, a common form of weak supervision, places much lower demands on annotation. How to exploit label correlations for multi-label learning is an important but unstudied problem. This paper proposes a weakly supervised multi-label learning method that uses label correlations as a prior (WSMLLC). The model restates sample similarity via label correlations and can effectively obtain a label indicator matrix; at the same time, it constrains the data projection matrix with the prior information and introduces a regression term to correct the indicator matrix. Compared with existing methods, the salient advantage of WSMLLC is that the label assignment task for multi-label samples can be accomplished given only a label-correlation prior. Experiments on several public datasets show that, when the label matrix is entirely missing, WSMLLC clearly outperforms current state-of-the-art multi-label learning methods.

19.
李绍园, 韦梦龙, 黄圣君. 《软件学报》 (Journal of Software), 2022, 33(4): 1274-1286
Traditional supervised learning requires the ground-truth labels of the training samples, which in many cases are not easy to collect. By contrast, crowdsourcing learning collects annotations from multiple fallible non-experts and estimates the true labels of the samples through some fusion scheme. Noting that existing deep crowdsourcing learning work models annotator correlation insufficiently, while work on non-deep crowdsourcing learning has shown that modeling annotator correlation helps improve learning, a deep generative crowdsourcing learning method is proposed that ...

20.
A machine learning framework is described that uses unlabeled data from a related task domain in supervised classification tasks. The unlabeled data come from related domains that share the same class labels or generative distribution as the labeled data. Patterns in the unlabeled data are learned via a neural network and transferred to the target domain, from which the labeled data are generated, so as to improve the performance of the supervised learning task. We call this approach self-taught transfer learning from unlabeled data. We introduce a general-purpose feature learning algorithm that produces features retaining information from the unlabeled data; this information preservation ensures that the obtained features are useful for improving the classification performance of the supervised tasks.

