Similar literature
1.
Approaches to distance metric learning (DML) for the Mahalanobis distance metric involve estimating a parametric matrix that is associated with a linear transformation. For complex pattern analysis tasks, it is necessary to consider DML approaches that estimate a parametric matrix associated with a nonlinear transformation. One such approach performs DML of the Mahalanobis distance in the feature space of a Mercer kernel. In this approach, estimating the parametric matrix of the Mahalanobis distance is formulated as learning an optimal kernel Gram matrix from the kernel Gram matrix of a base kernel by minimizing the LogDet divergence between the two kernel Gram matrices. We propose to use the optimal kernel Gram matrices learnt from the kernel Gram matrices of the base kernels in pattern analysis tasks such as clustering, multi-class pattern classification and nonlinear principal component analysis. We consider commonly used kernels such as the linear, polynomial, radial basis function and exponential kernels, as well as hyper-ellipsoidal kernels, as the base kernels for optimal kernel learning. We study the performance of the DML-based class-specific kernels for multi-class pattern classification using support vector machines. Results of our experimental studies on benchmark datasets demonstrate the effectiveness of the DML-based kernels for different pattern analysis tasks.
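
As a rough illustration of the first sentence of this abstract, the sketch below computes a Mahalanobis distance from a parametric matrix M and checks that, when M = LᵀL, it equals the Euclidean distance after the linear transformation L. The function names and the example matrix are assumptions made for illustration; they do not come from the paper, and the kernelized LogDet learning step is not reproduced here.

```python
import math


def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)) for a symmetric positive semidefinite M."""
    diff = [a - b for a, b in zip(x, y)]
    M_diff = mat_vec(M, diff)
    return math.sqrt(sum(d * md for d, md in zip(diff, M_diff)))


def metric_from_transform(L):
    """Build M = L^T L from a linear transformation L (illustrative helper)."""
    rows, cols = len(L), len(L[0])
    return [[sum(L[k][i] * L[k][j] for k in range(rows)) for j in range(cols)]
            for i in range(cols)]


def euclidean(x, y):
    """Plain Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical transformation: stretch the first coordinate by a factor of 2.
L = [[2.0, 0.0], [0.0, 1.0]]
M = metric_from_transform(L)
x, y = [1.0, 1.0], [0.0, 0.0]
d_metric = mahalanobis(x, y, M)
d_transformed = euclidean(mat_vec(L, x), mat_vec(L, y))
```

Both `d_metric` and `d_transformed` are the same number here, which is the sense in which learning M amounts to learning a linear transformation of the input space.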

2.
In this paper, multiple kernel learning (MKL) is formulated as a supervised classification problem. We deal with binary classification data, so the data modelling problem involves computing two decision boundaries, one for kernel learning and the other for the input data. In our approach, both are found with a single cost function: we construct a global reproducing kernel Hilbert space (RKHS) as the direct sum of the RKHSs corresponding to the decision boundaries of kernel learning and input data, and search the global RKHS for a function that can be represented as the direct sum of the two decision boundaries. In our experimental analysis, the proposed model showed superior performance compared with the existing two-stage function approximation formulation of MKL, in which the decision functions of kernel learning and input data are found separately using two different cost functions. This is because the single-stage representation helps knowledge transfer between the procedures for finding the decision boundaries of kernel learning and input data, which in turn boosts the generalisation capacity of the model.

3.
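沈健, 蒋芸, 张亚男, 胡学伟. 《计算机科学》 (Computer Science), 2016, 43(12): 139-145
Multiple kernel learning is a new hot topic in machine learning. Kernel methods increase the computational power of linear classifiers by mapping data into a high-dimensional space, and are currently an effective way to solve nonlinear pattern analysis and classification problems. In some complex situations, however, kernel learning methods built on a single kernel function cannot fully meet practical requirements such as heterogeneous or irregular data, large sample sizes, and uneven sample distributions, so combining multiple kernel functions to obtain better results is a natural development. This paper therefore proposes a sample-weighted multi-scale kernel support vector machine method, which weights the kernel functions of different scales according to their ability to fit the samples and thereby obtains a sample-weighted multi-scale kernel SVM decision function. Experiments on several datasets show that the proposed method achieves high classification accuracy on each of them.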

4.
This paper addresses the problem of optimal feature extraction from a wavelet representation. Our work aims at building features by selecting wavelet coefficients resulting from signal or image decomposition on an adapted wavelet basis. For this purpose, we jointly learn, in a kernelized large-margin context, the wavelet shape as well as the appropriate scale and translation of the wavelets, hence the name “wavelet kernel learning”. This problem is posed as a multiple kernel learning problem in which the number of kernels can be very large. To solve such a problem, we introduce a novel multiple kernel learning algorithm based on active-constraint methods. We furthermore propose some variants of this algorithm that can produce approximate solutions more efficiently. Empirical analysis shows that our active-constraint MKL algorithm achieves state-of-the-art efficiency. When used for wavelet kernel learning, our experimental results show that the approaches we propose are competitive with the state of the art on brain–computer interface and Brodatz texture datasets.

5.
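Research on kernel learning machines   Total citations: 2, self-citations: 2, citations by others: 2
This paper surveys kernel learning machines, a hot topic in machine learning research in recent years. It first analyzes the main ideas of kernel methods, then focuses on several recently developed kernel learning machines, including supervised learning algorithms such as support vector machines and kernel Fisher discriminant analysis and unsupervised learning algorithms such as kernel principal component analysis, and finally discusses their applications and prospects.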

6.
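王铁建, 吴飞, 荆晓远. 《计算机科学》 (Computer Science), 2017, 44(12): 131-134, 168
This paper proposes a multiple kernel dictionary learning method for predicting whether software modules contain defects. The historical data used for software defect prediction are structurally complex and class-imbalanced. A composite kernel built from multiple kernel functions maps these data into a high-dimensional feature space, and a class-balanced multiple kernel dictionary is obtained by selecting the multiple kernel dictionary bases; this dictionary is then used to classify and predict new software modules and to determine whether they contain defects. Experiments on the NASA MDP datasets show that, compared with other software defect prediction methods, the multiple kernel dictionary learning method handles the complex structure and class imbalance of historical defect data and solves the software defect prediction problem comparatively well.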

7.
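A kernel principal component analysis interpretation of the local tangent space alignment algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Kernel-based dimensionality reduction and manifold learning are two effective and widely used classes of nonlinear dimensionality reduction techniques. They have different starting points and theoretical foundations, and few previous studies have examined the connection between them. The LTSA algorithm uses the local structure of the data to construct a special kernel matrix and then performs kernel principal component analysis with that kernel matrix. Focusing on local tangent space alignment (LTSA) as a manifold learning algorithm, this paper studies the intrinsic connection between LTSA and kernel PCA. The study shows that LTSA is in essence a kernel-based principal component analysis technique.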

8.
Constrained clustering methods (which usually use must-link and/or cannot-link constraints) have received much attention in the last decade. Recently, kernel adaptation or kernel learning has been considered a powerful approach for constrained clustering. However, these methods usually either allow only special forms of kernels or learn non-parametric kernel matrices and scale very poorly. Therefore, they either learn a metric that has low flexibility or are applicable only to small data sets due to their high computational complexity. In this paper, we propose a more efficient non-linear metric learning method that learns a low-rank kernel matrix from must-link and cannot-link constraints and the topological structure of the data. We formulate the proposed method as a trace ratio optimization problem and learn appropriate distance metrics by finding optimal low-rank kernel matrices. We solve the proposed optimization problem much more efficiently than SDP solvers. Additionally, we show that spectral clustering methods can be considered a special form of low-rank kernel learning methods. Extensive experiments have demonstrated the superiority of the proposed method compared to recently introduced kernel learning methods.

9.
Kernel machines such as Support Vector Machines (SVM) have performed successfully in pattern classification problems, mainly because they exploit potentially nonlinear affinity structures of data through the kernel functions. Hence, selecting an appropriate kernel function, or equivalently learning the kernel parameters accurately, has a crucial impact on the classification performance of kernel machines. In this paper we consider the problem of learning a kernel matrix in a binary classification setup, where the hypothesis kernel family is represented as a convex hull of fixed basis kernels. While many existing approaches involve computationally intensive quadratic or semi-definite optimization, we propose novel kernel learning algorithms based on large margin estimation of Parzen window classifiers. The optimization is cast as instances of linear programming. This significantly reduces the complexity of kernel learning compared to existing methods, while our large-margin-based formulation provides tight upper bounds on the generalization error. We empirically demonstrate that the new kernel learning methods maintain or improve the accuracy of the existing classification algorithms while significantly reducing the learning time on many real datasets in both supervised and semi-supervised settings.
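
For readers unfamiliar with the Parzen window classifier named in this abstract, here is a minimal sketch of its standard decision rule: the mean kernel similarity to the positive class minus the mean kernel similarity to the negative class. The RBF kernel, the `gamma` value and the toy points are assumptions chosen for illustration; the paper's large-margin linear programming step for learning the kernel is not reproduced.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def parzen_score(x, positives, negatives, kernel):
    """Mean similarity to the positive class minus mean similarity to the negative class."""
    pos = sum(kernel(x, p) for p in positives) / len(positives)
    neg = sum(kernel(x, n) for n in negatives) / len(negatives)
    return pos - neg


def parzen_predict(x, positives, negatives, kernel):
    """Return +1 if the score is non-negative, otherwise -1."""
    return 1 if parzen_score(x, positives, negatives, kernel) >= 0 else -1


# Toy data (hypothetical): two well-separated clusters.
positives = [[0.0, 0.0], [0.0, 1.0]]
negatives = [[5.0, 5.0], [6.0, 5.0]]
label_near_pos = parzen_predict([0.2, 0.5], positives, negatives, rbf_kernel)
label_near_neg = parzen_predict([5.5, 5.0], positives, negatives, rbf_kernel)
```

Because the classifier depends on the data only through `kernel`, swapping in a weighted combination of basis kernels changes the decision function without changing this code.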

10.
Kernel learning is widely used in many areas, and many methods have been developed. Kernel principal component analysis (KPCA), a well-known kernel learning method, suffers from two problems in practical applications. First, all training samples need to be stored in order to compute the kernel matrix during kernel learning. Second, the kernel and its parameters heavily influence the performance of kernel learning. To solve these problems, we present a novel kernel learning method, sparse data-dependent kernel principal component analysis, which reduces the training samples with a sparse learning-based least squares support vector machine and adaptively self-optimizes the kernel structure according to the input training samples. Experimental results on UCI datasets, the ORL and YALE face databases, and the Wisconsin Breast Cancer database show that it is feasible to improve KPCA by saving storage space and optimizing the kernel structure.

11.
The advantage of a kernel method often depends critically on a proper choice of the kernel function. A promising approach is to learn the kernel from data automatically. In this paper, we propose a novel method for learning the kernel matrix based on maximizing a class separability criterion that is similar to those used by linear discriminant analysis (LDA) and kernel Fisher discriminant (KFD). It is interesting to note that optimizing this criterion function does not require inverting the possibly singular within-class scatter matrix, which is a computational problem encountered by many LDA and KFD methods. We have conducted experiments on both synthetic data and real-world data from UCI and FERET, showing that our method consistently outperforms some previous kernel learning methods.

12.
Asymptotic properties of the Fisher kernel   Total citations: 1, self-citations: 0, citations by others: 1
This letter analyzes the Fisher kernel from a statistical point of view. The Fisher kernel is a particularly interesting method for constructing a model of the posterior probability that makes intelligent use of unlabeled data (i.e., of the underlying data density). It is important to analyze and ultimately understand the statistical properties of the Fisher kernel. To this end, we first establish sufficient conditions under which the constructed posterior model is realizable (i.e., it contains the true distribution). Realizability immediately leads to consistency results. Subsequently, we focus on an asymptotic analysis of the generalization error, which elucidates the learning curves of the Fisher kernel and how unlabeled data contribute to learning. We also point out that the squared or log loss is theoretically preferable (because both yield consistent estimators) to other losses such as the exponential loss when a linear classifier is used together with the Fisher kernel. Therefore, this letter underlines that the Fisher kernel should be viewed not as a heuristic but as a powerful statistical tool with well-controlled statistical properties.

13.
We develop a general theoretical framework for statistical logical learning with kernels based on dynamic propositionalization, where structure learning corresponds to inferring a suitable kernel on logical objects, and parameter learning corresponds to function learning in the resulting reproducing kernel Hilbert space. In particular, we study the case where structure learning is performed by a simple FOIL-like algorithm, and propose alternative scoring functions for guiding the search process. We present an empirical evaluation on several data sets in the single-task as well as in the multi-task setting.

14.
Software defect prediction aims to predict the defect proneness of new software modules from historical defect data so as to improve the quality of a software system. Historical software defect data have a complicated structure and marked class imbalance; how to fully analyze and utilize existing historical defect data and build more precise and effective classifiers has attracted considerable interest from researchers in both academia and industry. Multiple kernel learning and ensemble learning are effective machine learning techniques. Multiple kernel learning can map the historical defect data to a higher-dimensional feature space in which they are better represented, and ensemble learning can use a series of weak classifiers to reduce the bias caused by the majority class and obtain better predictive performance. In this paper, we propose to use multiple kernel learning to predict software defects. Using the characteristics of metrics mined from open source software, we obtain a multiple kernel classifier through an ensemble learning method, which has the advantages of both multiple kernel learning and ensemble learning. We thus propose a multiple kernel ensemble learning (MKEL) approach for software defect classification and prediction. Considering the cost of risk in software defect prediction, we design a new sample weight vector updating strategy to reduce the cost of risk caused by misclassifying defective modules as non-defective ones. We employ the widely used NASA MDP datasets as test data to evaluate the performance of all compared methods; experimental results show that MKEL outperforms several representative state-of-the-art defect prediction methods.

15.
Kernel-based methods have been widely investigated in the soft-computing community. However, they focus mainly on numeric data. In this paper, we propose a novel method for kernel learning on categorical data, and show how the method can be used to derive effective classifiers for linear classification. Based on kernel density estimation for categorical attributes, three popular classification methods, i.e., Naive Bayes, nearest neighbor and prototype-based classification, are effectively extended to classify categorical data. We also propose two data-driven approaches to the bandwidth selection problem, one aimed at minimizing the mean squared error of the kernel estimate and the other at optimizing attribute weights. Theoretical analysis indicates that, as in the numeric case, kernel learning of categorical attributes can make the classes more separable, resulting in outstanding performance of the new classifiers on various real-world data sets.

16.
Small sample size and high computational complexity are two major problems encountered when traditional kernel discriminant analysis methods are applied to high-dimensional pattern classification tasks such as face recognition. In this paper, we introduce a new kernel discriminant learning method, which is able to effectively address the two problems by using regularization and subspace decomposition techniques. Experiments performed on real face databases indicate that the proposed method outperforms, in terms of classification accuracy, existing kernel methods, such as kernel principal component analysis and kernel linear discriminant analysis, at a significantly reduced computational cost.

17.
Kernel methods and deep learning are two of the most remarkable current machine learning techniques and have achieved great success in many applications. Kernel methods are powerful tools for capturing nonlinear patterns behind data. They implicitly learn high (even infinite) dimensional nonlinear features in the reproducing kernel Hilbert space (RKHS) while making the computation tractable by leveraging the kernel trick. It is commonly agreed that the success of kernel methods depends heavily on the choice of kernel. Multiple kernel learning (MKL) is one possible scheme that performs kernel combination and selection for a variety of learning tasks, such as classification, clustering, and dimensionality reduction. Deep learning models project input data through several layers of nonlinearity and learn different levels of abstraction. The composition of multiple layers of nonlinear functions can approximate a rich set of naturally occurring input-output dependencies. To bridge kernel methods and deep learning, deep kernel learning has been proven to be an effective method for learning complex feature representations by combining the nonparametric flexibility of kernel methods with the structural properties of deep learning. This article presents a comprehensive overview of the state-of-the-art approaches that bridge MKL and deep learning techniques. Specifically, we systematically review the typical hybrid models, training techniques, and their theoretical and practical benefits, followed by remaining challenges and future directions. We hope that our perspectives and discussions serve as valuable references for new practitioners and theoreticians seeking to innovate in applying approaches that combine the advantages of both paradigms and to explore new synergies.

18.
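Multiple kernel learning methods   Total citations: 56, self-citations: 5, citations by others: 51
Multiple kernel learning is a new hot topic in kernel machine learning. Kernel methods are an effective way to solve nonlinear pattern analysis problems, but in some complex situations a kernel machine built on a single kernel function cannot meet practical requirements such as heterogeneous or irregular data, very large sample sizes, and uneven sample distributions, so combining multiple kernel functions to obtain better results is a natural choice. Based on how the multiple kernels are constructed, this paper systematically reviews the construction theory of multiple kernel methods from three perspectives (composite kernels, multi-scale kernels, and infinite kernels), analyzes the characteristics and shortcomings of typical multiple kernel learning methods, summarizes their respective application areas, and distills directions for further research.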
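
To make the idea of combining several kernel functions concrete, the following sketch builds a composite kernel as a convex combination of Gaussian RBF kernels at different scales. The bandwidths, weights and function names are illustrative assumptions, not values or methods taken from the survey; in multiple kernel learning the weights would be learned from data rather than fixed by hand.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel with bandwidth sigma: exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def combined_kernel(x, z, sigmas, weights):
    """Convex combination of RBF kernels, one per scale in `sigmas`."""
    if len(sigmas) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * rbf_kernel(x, z, s) for w, s in zip(weights, sigmas))


def gram_matrix(points, sigmas, weights):
    """Gram matrix of the combined kernel over a list of points."""
    return [[combined_kernel(p, q, sigmas, weights) for q in points] for p in points]


# Hypothetical scales and weights: a narrow, a medium and a wide kernel.
sigmas = [0.5, 1.0, 4.0]
weights = [0.2, 0.3, 0.5]
points = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = gram_matrix(points, sigmas, weights)
```

Non-negative weights keep the combination a valid positive semidefinite kernel, which is why the weights are validated before the kernels are summed.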

19.
Recent literature has shown the merits of having deep representations in the context of neural networks. An emerging challenge in kernel learning is the definition of similar deep representations. In this paper, we propose a general methodology to define a hierarchy of base kernels with increasing expressiveness and combine them via multiple kernel learning (MKL) with the aim of generating overall deeper kernels. As a leading example, this methodology is applied to learning the kernel in the space of Dot-Product Polynomials (DPPs), that is, a positive combination of homogeneous polynomial kernels (HPKs). We show theoretical properties about the expressiveness of HPKs that make their combination empirically very effective. This can also be seen as learning the coefficients of the Maclaurin expansion of any positive definite dot-product kernel, thus making our proposed method generally applicable. We empirically show the merits of our approach by comparing the effectiveness of the kernel generated by our method against baseline kernels (including homogeneous and non-homogeneous polynomials, RBF, etc.) and against another hierarchical approach on several benchmark datasets.
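
The dot-product polynomial described in this abstract can be sketched in a few lines: a homogeneous polynomial kernel of degree d is (x·z)^d, and a DPP is a non-negative weighted sum of such kernels. The coefficient lists below are assumptions picked for illustration (in the paper the coefficients are learned via MKL), and the degree-0 term is included only so that the second example matches a familiar inhomogeneous polynomial kernel.

```python
def dot(x, z):
    """Standard inner product of two vectors."""
    return sum(a * b for a, b in zip(x, z))


def hpk(x, z, degree):
    """Homogeneous polynomial kernel of the given degree: (x . z)^degree."""
    return dot(x, z) ** degree


def dpp_kernel(x, z, coeffs):
    """Dot-product polynomial: sum over d of coeffs[d] * (x . z)^d, with coeffs[d] >= 0."""
    if any(a < 0 for a in coeffs):
        raise ValueError("coefficients must be non-negative")
    return sum(a * hpk(x, z, d) for d, a in enumerate(coeffs))


x, z = [1.0, 2.0], [3.0, 1.0]   # x . z = 5
# Hypothetical coefficients for degrees 0, 1 and 2.
k_custom = dpp_kernel(x, z, [0.0, 0.5, 0.25])
# (1 + x . z)^2 = 1 + 2 (x . z) + (x . z)^2, so its Maclaurin coefficients are [1, 2, 1].
k_poly2 = dpp_kernel(x, z, [1.0, 2.0, 1.0])
```

The second call illustrates the remark about Maclaurin expansions: the inhomogeneous quadratic kernel (1 + x·z)² is itself a DPP with coefficients 1, 2, 1.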

20.
The success of kernel-based learning methods depends on the choice of kernel. Recently, kernel learning methods have been proposed that use data to select the most appropriate kernel, usually by combining a set of base kernels. We introduce a new algorithm for kernel learning that combines a continuous set of base kernels, without the common step of discretizing the space of base kernels. We demonstrate that our new method achieves state-of-the-art performance across a variety of real-world datasets. Furthermore, we explicitly demonstrate the importance of combining the right dictionary of kernels, which is problematic for methods that combine a finite set of base kernels chosen a priori. Our method is not the first approach to work with continuously parameterized kernels. We adopt a two-stage kernel learning approach. We also show that our method requires substantially less computation than previous such approaches, and so is more amenable to multi-dimensional parameterizations of base kernels, which we demonstrate.
