Found 17 similar documents; search took 31 ms.
1.
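A multi-instance regression algorithm based on manifold learning. Total citations: 2, self-citations: 0, other citations: 2
Multi-instance learning is a new machine learning framework. Earlier research focused mainly on multi-instance classification; recently, multi-instance regression has attracted attention from the international machine learning community. Manifold learning aims to recover the intrinsic structure of nonlinearly distributed data and can be used for nonlinear dimensionality reduction. Based on manifold learning techniques, this paper proposes the Mani MIL algorithm for multi-instance regression problems. The algorithm first reduces the dimensionality of the instances in the training bags, then exploits the collapse that appears in the reduced representation to make predictions for multi-instance bags. Experiments show that Mani MIL performs better than existing multi-instance algorithms such as Citation-kNN.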
2.
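In multi-instance learning (MIL), core instances play an important role in predicting bag labels. If two instances are surrounded by different numbers of same-class instances, they differ in how representative they are. To select the most representative instances from the bags as a core instance set, and thereby improve classification accuracy, this paper proposes the multi-instance learning with instance_level covering algorithm (MILICA). The algorithm first builds an initial core instance set using the maximum Hausdorff distance and a covering algorithm, then obtains the final core instance set and the number of instances in each cover through the covering algorithm and reverse verification, and finally uses a similarity function to convert each bag into a single instance. Experiments on two-class datasets and multi-class image datasets show that MILICA achieves good classification performance.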
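The maximum Hausdorff distance mentioned in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance between instances and uses made-up example bags; the function name `max_hausdorff` is hypothetical, and the sketch does not reproduce the MILICA covering or reverse-verification steps.

```python
import math

def max_hausdorff(bag_a, bag_b):
    """Maximum (classical) Hausdorff distance between two bags of instances."""
    def directed(src, dst):
        # Largest distance from any instance in src to its nearest instance in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    # The symmetric distance is the larger of the two directed distances.
    return max(directed(bag_a, bag_b), directed(bag_b, bag_a))

# Illustrative bags only (not taken from the paper).
bag_a = [(0.0, 0.0), (1.0, 0.0)]
bag_b = [(0.0, 0.0), (4.0, 0.0)]
distance = max_hausdorff(bag_a, bag_b)
print(distance)
```

Because the distance is driven by the single worst-matched instance, one outlying instance in a bag is enough to make two otherwise similar bags look far apart.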
4.
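To address multi-instance image classification effectively, a new multi-instance image classification method based on sparse representation is proposed. The method treats an image as a multi-instance bag and the regions in the image as instances in the bag, and computes bag features with an instance embedding strategy. The bag feature of the image to be classified is then expressed as a sparse linear combination over the set of training image bag features, and the sparse solution is obtained by ℓ1 optimization. Finally, a method for predicting the label of the image to be classified from the sparse coefficients is proposed. Experimental results on the Corel dataset show that the proposed method achieves higher classification accuracy than other methods.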
5.
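An image representation method based on multi-instance learning is proposed. The image is treated as a multi-instance bag: it is filtered with a Gaussian filter and sampled into a matrix of color regions, and bags are generated with the single blob with neighbors method. Positive and negative bags are generated from example images selected by the user, and experiments are run with the MIL-SVDD_I and MIL-SVDD_B algorithms. The experiments show that the image representation method is feasible.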
6.
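In multi-instance learning, describing a bag with bag-space features tends to overlook local information within the bag, while describing it with instance-space features tends to overlook the bag's overall structural information. To address this, a multi-instance learning method that fuses bag-space and instance-space features is proposed. First, a graph model is built to express the relations among the instances in a bag, and the graph model is converted into an affinity matrix to construct the bag-space features. Second, the instances in positive bags that correlate strongly with the positive class and the instances in negative bags that correlate weakly with the positive class are selected, and their features serve as the instance-space features of the positive and negative bags, respectively. Finally, a Gaussian RBF kernel maps the bag-space and instance-space features into the same feature space, where they are fused with a weight-based feature fusion method. Experiments on multi-instance benchmark datasets, public image datasets, and text datasets show that the method improves classification performance.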
7.
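Multi-instance multi-label learning is a new machine learning framework in which samples take the form of bags: a bag consists of multiple instances and is annotated with multiple labels. Previous work on multi-instance multi-label learning usually assumes that the instances in a bag are independent and identically distributed, but this assumption is hard to guarantee in practice. To exploit the correlations among the instances in a bag, a multi-instance multi-label classification algorithm based on non-i.i.d. instances is proposed. The algorithm first builds a correlation matrix that captures the relations among the instances within a bag, so that each multi-instance bag is represented by one correlation matrix. It then constructs kernel functions over correlation matrices at different scales. Finally, since predicting different labels calls for different kernel functions, multiple kernel learning is introduced to construct and train multi-kernel SVM classifiers for the individual labels. Experimental results on image and text datasets show that the algorithm greatly improves multi-label classification accuracy.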
8.
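As a variant of supervised learning, multi-instance learning (MIL) attempts to learn a classifier from the instances in bags. In MIL, labels are associated with bags rather than with individual instances: bag labels are known, whereas instance labels are unknown. MIL can handle label ambiguity, but it does not readily solve problems with weak labels. In a weak-label problem, the labels of both bags and instances are unknown and are treated as latent variables. Given multiple labels and instances, the labels of bags and instances can be estimated approximately by weighting the different labels. A new multi-instance learning framework based on transfer learning is proposed to solve the weak-label problem. First, a transfer learning model based on a multi-instance method is constructed; it transfers knowledge from the source task to the target task and thereby converts the weak-label problem into a multi-instance learning problem. On this basis, an iterative framework for solving the multi-instance transfer learning model is proposed. Experimental results show that the method outperforms existing multi-instance learning methods.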
9.
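Multi-instance neural networks are a class of neural networks for solving multi-instance learning problems. Because they contain non-differentiable functions, training with backpropagation requires approximations, so their prediction accuracy is not high. To improve prediction accuracy, an improved genetic algorithm for optimizing the parameters of multi-instance neural networks is constructed. It uses a local search operator based on backpropagation training, a crowding operation, and an adaptive way of computing operator probabilities to speed up convergence and prevent premature convergence. Analysis and comparison of experimental results on recognized benchmark datasets confirm that the improved genetic algorithm clearly improves the prediction accuracy of multi-instance neural networks and also converges faster than other algorithms.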
10.
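Introducing a mechanism that exploits unlabeled examples into multi-instance learning can reduce training cost and improve the learner's generalization ability. Most current semi-supervised multi-instance learning algorithms label every instance in a bag, turning multi-instance learning into a single-instance semi-supervised learning problem. Considering that a bag's class label is determined by the instances in the bag and by the bag's structure, a multi-instance learning algorithm that performs semi-supervised learning directly at the bag level is proposed. A multi-instance kernel is defined, and all bags (labeled and unlabeled) are used to compute a bag-level graph Laplacian matrix, which serves as the smoothness penalty term in the optimization objective. Finding the optimal solution in the RKHS spanned by the multi-instance kernel reduces to determining a multi-instance kernel function modified by the unlabeled data, which can be used directly in classical kernel learning methods. The algorithm was tested on experimental datasets and compared with existing algorithms. The results show that the algorithm based on the semi-supervised multi-instance kernel reaches the same accuracy as supervised learning algorithms with less training data, and that, for the same labeled dataset, using unlabeled data effectively improves the learner's generalization ability.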
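A bag-level graph Laplacian used as a smoothness penalty can be sketched as follows. The abstract does not specify the multi-instance kernel, so this sketch assumes a mean RBF similarity over all instance pairs; the function names, the `gamma` parameter, and the example bags are all hypothetical and only illustrate the general idea.

```python
import math

def mi_kernel(bag_a, bag_b, gamma=0.5):
    """Assumed multi-instance kernel: mean RBF similarity over all instance pairs."""
    total = sum(
        math.exp(-gamma * math.dist(a, b) ** 2)
        for a in bag_a
        for b in bag_b
    )
    return total / (len(bag_a) * len(bag_b))

def bag_graph_laplacian(bags, gamma=0.5):
    """Unnormalised graph Laplacian L = D - W, with edge weights W from the kernel."""
    n = len(bags)
    W = [
        [0.0 if i == j else mi_kernel(bags[i], bags[j], gamma) for j in range(n)]
        for i in range(n)
    ]
    # Diagonal holds the degree (row sum of W); off-diagonal entries are -W[i][j].
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]

def smoothness_penalty(f, L):
    """Quadratic form f^T L f: small when similar bags receive similar outputs."""
    n = len(f)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))

# Illustrative bags: two near the origin, two near (3, 3); labels are not needed here.
bags = [
    [(0.0, 0.0), (0.1, 0.0)],
    [(0.0, 0.1), (0.2, 0.0)],
    [(3.0, 3.0), (3.1, 3.0)],
    [(3.0, 3.2), (2.9, 3.0)],
]
L = bag_graph_laplacian(bags)
smooth = smoothness_penalty([1.0, 1.0, -1.0, -1.0], L)   # respects the two groups
rough = smoothness_penalty([1.0, -1.0, 1.0, -1.0], L)    # splits each group
print(smooth < rough)
```

The Laplacian is built from all bags regardless of labels, which is how unlabeled bags can influence the learned function in this kind of formulation.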
11.
In the setting of multi-instance learning, each object is represented by a bag composed of multiple instances instead of by a single instance in a traditional learning setting. Previous works in this area only concern multi-instance prediction problems where each bag is associated with a binary (classification) or real-valued (regression) label. However, unsupervised multi-instance learning where bags are without labels has not been studied. In this paper, the problem of unsupervised multi-instance learning is addressed where a multi-instance clustering algorithm named Bamic is proposed. Briefly, by regarding bags as atomic data items and using some form of distance metric to measure distances between bags, Bamic adapts the popular k-Medoids algorithm to partition the unlabeled training bags into k disjoint groups of bags. Furthermore, based on the clustering results, a novel multi-instance prediction algorithm named Bartmip is developed. Firstly, each bag is re-represented by a k-dimensional feature vector, where the value of the i-th feature is set to be the distance between the bag and the medoid of the i-th group. After that, bags are transformed into feature vectors so that common supervised learners are used to learn from the transformed feature vectors each associated with the original bag’s label. Extensive experiments show that Bamic could effectively discover the underlying structure of the data set and Bartmip works quite well on various kinds of multi-instance prediction problems.
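The two stages described above (k-medoids over bags, then re-representing each bag by its distances to the k medoids) can be sketched roughly as follows. This is not the authors' implementation: the abstract only says "some form of distance metric", so the maximal Hausdorff distance is an assumption here, as are the deterministic initialisation, the function names, and the example bags.

```python
import math

def bag_distance(a, b):
    # Assumption: maximal Hausdorff distance between bags.
    d_ab = max(min(math.dist(p, q) for q in b) for p in a)
    d_ba = max(min(math.dist(p, q) for q in a) for p in b)
    return max(d_ab, d_ba)

def k_medoids(bags, k, max_iter=100):
    """Return indices of k medoid bags using a simple alternating k-medoids loop."""
    medoids = list(range(k))  # hypothetical initialisation: the first k bags
    for _ in range(max_iter):
        # Assignment step: each bag joins the group of its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, bag in enumerate(bags):
            nearest = min(medoids, key=lambda m: bag_distance(bag, bags[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance within its group.
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)  # keep the old medoid for an empty group
                continue
            best = min(
                members,
                key=lambda c: sum(bag_distance(bags[c], bags[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids

def represent(bags, medoid_bags):
    """Map each bag to a k-dimensional vector of distances to the medoid bags."""
    return [[bag_distance(bag, m) for m in medoid_bags] for bag in bags]

# Illustrative bags: two near the origin, two near (10, 10).
bags = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
    [(0.0, 0.5), (1.0, 0.0)],
    [(11.0, 10.0), (10.0, 10.0)],
]
medoid_idx = k_medoids(bags, k=2)
features = represent(bags, [bags[m] for m in medoid_idx])
print(medoid_idx, features)
```

The resulting feature vectors, paired with the original bag labels, could then be passed to any ordinary supervised learner.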
12.
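Research progress in automatic image annotation. Total citations: 1, self-citations: 0, other citations: 1
In recent years, automatic image annotation (AIA) has become a hot topic in image semantic understanding research. Its basic idea is to use an annotated image set or other available information to automatically learn the latent association or mapping between the semantic concept space and the visual feature space, in order to predict annotations for unseen images. With the continuing development of machine learning theory, different learning models, including relevance models and classifier models, have been widely applied to automatic image annotation. Existing automatic image annotation algorithms can be roughly divided into three categories: classification-based, probabilistic-association-model-based, and graph-learning-based annotation algorithms. First, according to differences in feature extraction and representation, existing algorithms are divided into global-feature-based and region-based automatic image annotation methods. Second, region-based algorithms are further divided by learning algorithm into classification-based, probabilistic-association-model-based, and graph-learning-based annotation methods, and representative algorithms in each category are introduced along with their strengths and weaknesses. The latest research progress in automatic image annotation is then presented, and finally directions for further research are discussed.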
13.
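Object-based image retrieval using semi-supervised multi-instance learning. Total citations: 2, self-citations: 0, other citations: 2
For object-based image retrieval, a new semi-supervised multi-instance learning (MIL) algorithm is proposed. The algorithm treats an image as a bag and the visual features of its segmented regions as the instances in the bag. Following a maximum "point density" principle, it extracts "visual semantics" to construct a projection space. It then uses a defined nonlinear function to map each bag to a point in the projection space, yielding the image's "projection features", and applies a rough set (RS) method for attribute reduction. Finally, a transductive support vector machine (TSVM) performs semi-supervised learning to obtain the classifier. Experimental results show that the method is effective and outperforms other methods.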
14.
A two-class classification problem is considered where the objects to be classified are bags of instances in d-space. The classification rule is defined in terms of an open d-ball. A bag is labeled positive if it meets the ball and labeled negative otherwise. Determining the center and radius of the ball is modeled as a SVM-like margin optimization problem. Necessary optimality conditions are derived leading to a polynomial algorithm in fixed dimension. A VNS type heuristic is developed and experimentally tested. The methodology is extended to classification by several balls and to more than two classes.
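The classification rule stated in this abstract is simple enough to sketch directly: a bag is positive when at least one instance falls strictly inside the open ball. The sketch below assumes the center and radius are already known (it does not perform the margin optimization), and the function name and example values are hypothetical.

```python
import math

def bag_label(bag, center, radius):
    """Return +1 if any instance lies strictly inside the open ball, else -1."""
    return 1 if any(math.dist(x, center) < radius for x in bag) else -1

# Illustrative ball and bags (not taken from the paper).
center, radius = (0.0, 0.0), 1.0
meets_ball = [(2.0, 2.0), (0.5, 0.0)]    # second instance is inside the ball
misses_ball = [(2.0, 2.0), (1.0, 0.0)]   # second instance is on the boundary only

print(bag_label(meets_ball, center, radius), bag_label(misses_ball, center, radius))
```

Because the ball is open, an instance lying exactly on the boundary does not make its bag positive.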
15.
Solving multi-instance problems with classifier ensemble based on constructive clustering. Total citations: 3, self-citations: 0, other citations: 3
In multi-instance learning, the training set is composed of labeled bags each consisting of many unlabeled instances, that is, an object is represented by a set of feature vectors instead of only one feature vector. Most current multi-instance learning algorithms work through adapting single-instance learning algorithms to the multi-instance representation, while this paper proposes a new solution which goes the opposite way, that is, adapting the multi-instance representation to single-instance learning algorithms. In detail, the instances of all the bags are collected together and clustered into d groups first. Each bag is then re-represented by d binary features, where the value of the ith feature is set to one if the concerned bag has instances falling into the ith group and zero otherwise. Thus, each bag is represented by one feature vector so that single-instance classifiers can be used to distinguish different classes of bags. Through repeating the above process with different values of d, many classifiers can be generated and then they can be combined into an ensemble for prediction. Experiments show that the proposed method works well on standard as well as generalized multi-instance problems.
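The re-representation step described above can be sketched for a single value of d. The abstract does not name the clustering method, so a tiny Lloyd-style k-means with a deterministic initialisation is assumed here; the function names and example bags are hypothetical, and the ensemble over several values of d is not shown.

```python
import math

def kmeans(points, d, iters=50):
    """Tiny Lloyd-style k-means, initialised with the first d points (an assumption)."""
    centers = [tuple(p) for p in points[:d]]
    for _ in range(iters):
        groups = [[] for _ in range(d)]
        for p in points:
            j = min(range(d), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def binary_features(bag, centers):
    """Feature i is 1 if the bag has at least one instance in group i, else 0."""
    feats = [0] * len(centers)
    for x in bag:
        j = min(range(len(centers)), key=lambda c: math.dist(x, centers[c]))
        feats[j] = 1
    return feats

# Illustrative bags: one near the origin, one near (5, 5), one spanning both regions.
bags = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(5.0, 5.0), (5.1, 4.9)],
    [(0.1, 0.0), (5.0, 5.1)],
]
all_instances = [x for bag in bags for x in bag]  # pool instances from all bags
centers = kmeans(all_instances, d=2)
features = [binary_features(bag, centers) for bag in bags]
print(features)
```

Each bag now has one fixed-length binary vector, so any single-instance classifier could be trained on these vectors together with the bag labels.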
Zhi-Hua Zhou is currently Professor in the Department of Computer Science & Technology and head of the LAMDA group at Nanjing University. His main research interests include machine learning, data mining, information retrieval, and pattern recognition. He is associate editor of Knowledge and Information Systems and on the editorial boards of Artificial Intelligence in Medicine, International Journal of Data Warehousing and Mining, Journal of Computer Science & Technology, and Journal of Software. He has also been involved in various conferences.
Min-Ling Zhang received his B.Sc. and M.Sc. degrees in computer science from Nanjing University, China, in 2001 and 2004, respectively. Currently he is a Ph.D. candidate in the Department of Computer Science & Technology at Nanjing University and a member of the LAMDA group. His main research interests include machine learning and data mining, especially in multi-instance learning and multi-label learning.
16.
Songhe Feng, Hong Bao, Congyan Lang. Neurocomputing, 2011, 74(17): 3619-3627
Tag ranking has emerged as an important research topic recently due to its potential application on web image search. Existing tag relevance ranking approaches mainly rank the tags according to their relevance levels with respect to a given image. Nonetheless, such algorithms heavily rely on the large-scale image dataset and the proper similarity measurement to retrieve semantic relevant images with multi-labels. In contrast to the existing tag relevance ranking algorithms, in this paper, we propose a novel tag saliency ranking scheme, which aims to automatically rank the tags associated with a given image according to their saliency to the image content. To this end, this paper presents an integrated framework for tag saliency ranking, which combines both visual attention model and multi-instance learning to investigate the saliency ranking order information of tags with respect to the given image. Specifically, tags annotated on the image-level are propagated to the region-level via an efficient multi-instance learning algorithm firstly; then, visual attention model is employed to measure the importance of regions in the given image. Finally, tags are ranked according to the saliency values of the corresponding regions. Experiments conducted on the COREL and MSRC image datasets demonstrate the effectiveness and efficiency of the proposed framework.
17.
In multi-instance multi-label learning (MIML), each example is not only represented by multiple instances but also associated with multiple class labels. Several learning frameworks, such as the traditional supervised learning, can be regarded as degenerated versions of MIML. Therefore, an intuitive way to solve MIML problem is to identify its equivalence in its degenerated versions. However, this identification process would make useful information encoded in training examples get lost and thus impair the learning algorithm's performance. In this paper, RBF neural networks are adapted to learn from MIML examples. Connections between instances and labels are directly exploited in the process of first layer clustering and second layer optimization. The proposed method demonstrates superior performance on two real-world MIML tasks.