Similar literature
 16 similar documents found
1.
Random set framework for multiple instance learning   Cited by: 1 (self-citations: 0, citations by others: 1)
Multiple instance learning (MIL) is a technique used for learning a target concept in the presence of noise or in a condition of uncertainty. While standard learning techniques present the learner with individual samples, MIL alternatively presents the learner with sets of samples. Although sets are the primary elements used for analysis in MIL, research in this area has focused on using standard analysis techniques. In the following, a random set framework for multiple instance learning (RSF-MIL) is proposed that can directly perform analysis on sets. The proposed method uses random sets and fuzzy measures to model the MIL problem, thus providing a more natural mathematical framework, a more general MIL solution, and a more versatile learning tool. Comparative experimental results using RSF-MIL are presented for benchmark data sets. RSF-MIL is further compared to the state-of-the-art in landmine detection using ground penetrating radar data.

2.
In machine learning, the so-called curse of dimensionality, pertinent to many classification algorithms, denotes the drastic increase in computational complexity and classification error for data with a great number of dimensions. In this context, feature selection techniques try to reduce dimensionality by finding a new, more compact representation of instances, selecting the most informative features and removing redundant, irrelevant, and/or noisy ones. In this paper, we propose a filter-based feature selection method for the multiple-instance learning scenario called ReliefF-MI; it is based on the principles of the well-known ReliefF algorithm. Different extensions are designed and implemented, and their performance is evaluated in multiple instance learning. ReliefF-MI is applied as a pre-processing step that is completely independent of the multi-instance classifier learning process and is therefore more efficient and generic than the wrapper approaches proposed in this area. Experimental results on five benchmark real-world data sets and 17 classification algorithms confirm the utility and efficiency of this method, both statistically and in terms of execution time.

3.
G3P-MI: A genetic programming algorithm for multiple instance learning   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper introduces a new Grammar-Guided Genetic Programming algorithm for solving multi-instance learning problems. This algorithm, called G3P-MI, is evaluated and compared to other multi-instance classification techniques in different application domains. Computational experiments show that G3P-MI often obtains better results than other algorithms in terms of accuracy, sensitivity and specificity. Moreover, it makes the knowledge discovery process clearer and more comprehensible by expressing information in the form of IF-THEN rules. Our results confirm that evolutionary algorithms are very appropriate for dealing with multi-instance learning problems.

4.
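To address the large intra-class variation of mural images, we propose a grouping strategy that partitions the sample space into subspaces and trains a classifier model on all training samples within each subspace. At test time, the classification model is chosen according to the subspace into which the test sample falls. When training the classifier in each subspace, we treat each mural image as a collection of multiple instances and train the classifier by multiple-instance learning, in order to overcome the strong background noise of mural images. During training, we introduce latent variables to identify each instance. Because of these latent variables, the classifier's optimization problem is not convex, so it cannot be solved directly by gradient descent; instead, we train a Latent SVM iteratively as the classifier for each subspace. Experiments show that the proposed classification model largely mitigates the effect of intra-class variation and background noise on classification results.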

5.
Min-Ling  Zhi-Jian. Neurocomputing, 2009, 72(16-18): 3951
In multi-instance multi-label learning (MIML), each example is not only represented by multiple instances but also associated with multiple class labels. Several learning frameworks, such as traditional supervised learning, can be regarded as degenerated versions of MIML. Therefore, an intuitive way to solve the MIML problem is to identify its equivalence in its degenerated versions. However, this identification process can lose useful information encoded in the training examples and thus impair the learning algorithm's performance. In this paper, RBF neural networks are adapted to learn from MIML examples. Connections between instances and labels are directly exploited in the process of first layer clustering and second layer optimization. The proposed method demonstrates superior performance on two real-world MIML tasks.

6.
The paper presents a supervised discriminative dictionary learning algorithm specially designed for classifying HEp-2 cell patterns. The proposed algorithm is an extension of the popular K-SVD algorithm: at the training phase, it takes into account the discriminative power of the dictionary atoms and reduces their intra-class reconstruction error during each update. Meanwhile, their inter-class reconstruction effect is also considered. Compared to the existing extension of K-SVD, the proposed algorithm is more robust to parameters and has better discriminative power for classifying HEp-2 cell patterns. Quantitative evaluation shows that the proposed algorithm significantly outperforms general object classification algorithms on a standard HEp-2 cell pattern classification benchmark and also achieves competitive performance on a standard natural image classification benchmark.

7.
Liu Bo, Jing Liping, Yu Jian. Journal of Software (软件学报), 2017, 28(8): 2113-2125
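With the rapid development of video capture and network transmission technology and the widespread use of personal mobile devices, large amounts of image data now exist in the form of sets. Because of the complexity of a set's internal structure, a key problem in image set classification is how to measure the distance between sets. To address this problem, this paper proposes a dual sparse regularized image set distance learning framework (DSRID). In this framework, the distance between two sets is modeled as the distance between their corresponding internal representative substructures, which ensures that the metric is robust and discriminative. For different set representations, the paper gives implementations in the traditional Euclidean space and on two common manifolds, the symmetric positive definite matrices manifold (SPD manifold) and the Grassmann manifold. The effectiveness of the framework is verified on a series of set-based face recognition, action recognition, and object classification tasks.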

8.
In this paper, multiple kernel learning (MKL) is formulated as a supervised classification problem. We deal with binary classification data, so the data modelling problem involves computing two decision boundaries, one for kernel learning and the other for the input data. In our approach, both are found with a single cost function: we construct a global reproducing kernel Hilbert space (RKHS) as the direct sum of the RKHSs corresponding to the decision boundaries of kernel learning and input data, and search the global RKHS for a function that can be represented as the direct sum of the two decision boundaries. In our experimental analysis, the proposed model showed superior performance compared with the existing two-stage function approximation formulation of MKL, where the decision functions of kernel learning and input data are found separately using two different cost functions. This is because the single-stage representation enables knowledge transfer between the procedures that compute the decision boundaries of kernel learning and input data, which in turn boosts the generalisation capacity of the model.

9.
We introduce a coefficient update procedure into existing batch and online dictionary learning algorithms. We first propose an algorithm which is a coefficient-updated version of the Method of Optimal Directions (MOD) dictionary learning algorithm (DLA). The MOD algorithm with coefficient updates presents a computationally expensive dictionary learning iteration with a high convergence rate. Secondly, we present a periodically coefficient-updated version of the online Recursive Least Squares (RLS)-DLA, where the data is used sequentially to gradually improve the learned dictionary. The developed algorithm provides a periodic update improvement over the RLS-DLA, and we call it the Periodically Updated RLS Estimate (PURE) algorithm for dictionary learning. The performance of the proposed DLAs in synthetic dictionary learning and image denoising settings demonstrates that the coefficient update procedure improves the dictionary learning ability.

10.
The annoyance of spam emails increasingly plagues both individuals and organizations. In response, most prior research treats spam filtering as a classical text categorization task, in which training examples must include both spam (positive examples) and legitimate (negative examples) emails. However, in many spam filtering scenarios, obtaining legitimate emails for training purposes can be more difficult than collecting spam and unclassified emails. Hence, it is more appropriate to construct a classification model for spam filtering that uses only positive training examples (i.e., spam) and unlabeled instances and does not require legitimate emails as negative training examples. Several single-class learning techniques, such as PNB and PEBL, have been proposed in the literature. However, they have inherent limitations with regard to spam filtering. In this study, we propose and develop an ensemble approach, referred to as E2, to address these limitations. Specifically, we follow the two-stage framework of PEBL but extend each stage with an ensemble strategy. The empirical evaluation results from two spam filtering corpora suggest that our proposed E2 technique generally outperforms benchmark techniques (i.e., PNB and PEBL) and exhibits more stable performance than its counterparts.

11.
Multi-label learning originated from the investigation of the text categorization problem, where each document may belong to several predefined topics simultaneously. In multi-label learning, the training set is composed of instances each associated with a set of labels, and the task is to predict the label sets of unseen instances by analyzing training instances with known label sets. In this paper, a multi-label lazy learning approach named ML-KNN is presented, which is derived from the traditional K-nearest neighbor (KNN) algorithm. In detail, for each unseen instance, its K nearest neighbors in the training set are first identified. After that, based on statistical information gained from the label sets of these neighboring instances, i.e. the number of neighboring instances belonging to each possible class, the maximum a posteriori (MAP) principle is used to determine the label set for the unseen instance. Experiments on three different real-world multi-label learning problems, i.e. Yeast gene functional analysis, natural scene classification and automatic web page categorization, show that ML-KNN achieves superior performance to some well-established multi-label learning algorithms.
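
The two steps described in this abstract (find the K nearest neighbors, then apply a MAP decision per label from the neighbor counts) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `knn_indices`, `fit_mlknn` and `predict_mlknn`, the Euclidean distance, and the Laplace smoothing constant `s` are all assumptions made here for the example.

```python
import math
from collections import Counter


def knn_indices(X, q, k, skip=None):
    """Indices of the k training points closest to q (Euclidean distance).

    `skip` excludes one training index, so a point is not its own neighbor.
    """
    order = sorted((i for i in range(len(X)) if i != skip),
                   key=lambda i: math.dist(X[i], q))
    return order[:k]


def fit_mlknn(X, Y, labels, k=3, s=1.0):
    """Estimate, per label, a prior and neighbor-count likelihoods.

    Y[i] is the label set of X[i]; s is a smoothing constant (an assumption).
    """
    m = len(X)
    model = {"X": X, "Y": Y, "k": k, "labels": labels, "prior": {}, "like": {}}
    for l in labels:
        n_pos = sum(1 for y in Y if l in y)
        model["prior"][l] = (s + n_pos) / (2 * s + m)
        # c_pos[j]: training points WITH label l that have j neighbors with l.
        # c_neg[j]: training points WITHOUT label l that have j such neighbors.
        c_pos, c_neg = Counter(), Counter()
        for i in range(m):
            cnt = sum(1 for j in knn_indices(X, X[i], k, skip=i) if l in Y[j])
            (c_pos if l in Y[i] else c_neg)[cnt] += 1
        tot_pos, tot_neg = sum(c_pos.values()), sum(c_neg.values())
        model["like"][l] = (
            [(s + c_pos[j]) / (s * (k + 1) + tot_pos) for j in range(k + 1)],
            [(s + c_neg[j]) / (s * (k + 1) + tot_neg) for j in range(k + 1)],
        )
    return model


def predict_mlknn(model, q):
    """MAP decision per label: keep l if P(l) P(cnt | l) > P(not l) P(cnt | not l)."""
    nbrs = knn_indices(model["X"], q, model["k"])
    predicted = set()
    for l in model["labels"]:
        cnt = sum(1 for j in nbrs if l in model["Y"][j])
        p1 = model["prior"][l]
        like_pos, like_neg = model["like"][l]
        if p1 * like_pos[cnt] > (1 - p1) * like_neg[cnt]:
            predicted.add(l)
    return predicted


# Toy data (hypothetical): two well-separated clusters, one label each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
Y = [{"a"}] * 4 + [{"b"}] * 4
model = fit_mlknn(X, Y, labels=["a", "b"], k=3)
print(predict_mlknn(model, (0.5, 0.5)))
```

Because each label is decided independently, the predicted label set can be empty or contain several labels, which is what distinguishes this scheme from single-label KNN voting.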

12.
Hierarchical Dirichlet process (HDP) is an unsupervised method which has been widely used for topic extraction and document clustering problems. One advantage of HDP is that it has an inherent mechanism to determine the total number of clusters/topics. However, HDP has three weaknesses: (1) there is no mechanism to use known labels or incorporate expert knowledge into the learning procedure, thus precluding users from directing the learning and making the final results incomprehensible; (2) it cannot detect the categories expected by applications without expert guidance; (3) it does not automatically adjust the model parameters and structure in a changing environment. To address these weaknesses, this paper proposes an incremental learning method with partial supervision for HDP, which enables the topic model (initially guided by partial knowledge) to incrementally adapt to the latest available information. An important contribution of this work is the application of granular computing to HDP for partial supervision and incremental learning, which results in a more controllable and interpretable model structure. These enhancements provide a more flexible approach with expert guidance for the model learning and hence result in better prediction accuracy and interpretability.

13.
In this paper, we propose a novel self-organizing framework to construct multiple, low-dimensional eigenspaces from a set of training images. Grouping of images is systematically and robustly performed via eigenspace-growing in terms of low-dimensional eigenspaces. To further increase the robustness, the eigenspace-growing is initiated independently with many small groups of images (seeds). All these grown eigenspaces are treated as hypotheses that are subject to a selection procedure, eigenspace-selection, based on the MDL principle. This procedure selects the final set of eigenspaces as an efficient representation of the training set, taking into account the number of images encompassed by the eigenspaces, the dimensions of the eigenspaces, and their corresponding residual errors. We have tested the proposed method on a number of standard image sets, and the significance of the approach with respect to the recognition rate has been demonstrated.

14.
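Text classification is a fundamental and important task in natural language processing. Most deep-learning-based text classification methods study only a single model structure in depth. Such a single structure lacks the ability to capture and exploit global and local semantic features at the same time, and deepening the network loses more semantic information. To address this, a text classification model fusing multiple neural networks is proposed, FMNN (A Text Classification Model ...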

15.
16.
This paper proposes an improved variational model, the multiple piecewise constant with geodesic active contour (MPC-GAC) model, which generalizes the region-based active contour model by Chan and Vese, 2001 [11] and merges the edge-based active contour by Caselles et al., 1997 [7] to inherit the advantages of region-based and edge-based image segmentation models. We show that the new MPC-GAC energy functional can be iteratively minimized by graph cut algorithms with high computational efficiency compared with the level set framework. This iterative algorithm alternates between the piecewise constant functional learning and the foreground and background updating so that the energy value gradually decreases to the minimum of the energy functional. The k-means method is used to compute the piecewise constant values of the foreground and background of the image. We use a graph cut method to detect and update the foreground and background. Numerical experiments show that the proposed interactive segmentation method based on the MPC-GAC model by graph cut optimization can effectively segment images with inhomogeneous objects and background.
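
As a rough illustration of the k-means step mentioned in this abstract, the sketch below runs Lloyd's algorithm on the scalar intensities of one region (foreground or background) and returns its piecewise constant values. It is only an assumption-laden toy: the function name `kmeans_1d`, the evenly spaced initial centers, and the use of grayscale intensities are choices made here, and the graph cut update of the regions is not shown.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar intensities; returns the sorted cluster means."""
    lo, hi = min(values), max(values)
    # Assumption: initial centers are evenly spaced over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        # Assignment step: each intensity goes to its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Update step: each center becomes the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [sum(g) / len(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


# Hypothetical foreground intensities with a dark part and a bright part.
foreground = [10, 12, 11, 200, 202, 198]
print(kmeans_1d(foreground, k=2))  # → [11.0, 200.0]
```

Using several constants per region, rather than a single mean, is what lets a region model describe an inhomogeneous object or background.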
