共查询到20条相似文献,搜索用时 15 毫秒
1.
2.
Kernel methods and deep learning are two of the most currently remarkable machine learning techniques that have achieved great success in many applications. Kernel methods are powerful tools to capture nonlinear patterns behind data. They implicitly learn high (even infinite) dimensional nonlinear features in the reproducing kernel Hilbert space (RKHS) while making the computation tractable by leveraging the kernel trick. It is commonly agreed that the success of kernel methods is very much dependent on the choice of kernel. Multiple kernel learning (MKL) is one possible scheme that performs kernel combination and selection for a variety of learning tasks, such as classification, clustering, and dimensionality reduction. Deep learning models project input data through several layers of nonlinearity and learn different levels of abstraction. The composition of multiple layers of nonlinear functions can approximate a rich set of naturally occurring input-output dependencies. To bridge kernel methods and deep learning, deep kernel learning has been proven to be an effective method to learn complex feature representations by combining the nonparametric flexibility of kernel methods with the structural properties of deep learning. This article presents a comprehensive overview of the state-of-the-art approaches that bridge the MKL and deep learning techniques. Specifically, we systematically review the typical hybrid models, training techniques, and their theoretical and practical benefits, followed by remaining challenges and future directions. We hope that our perspectives and discussions serve as valuable references for new practitioners and theoreticians seeking to innovate in the applications of the approaches incorporating the advantages of both paradigms and exploring new synergies. 相似文献
3.
一种支持向量机的混合核函数 总被引:2,自引:0,他引:2
核函数是支持向量机的核心,不同的核函数将产生不同的分类效果.由于普通核函数各有其利弊,为了得到学习能力和泛化能力较强的核函数,根据核函数的基本性质,两个核函数之和仍然是核函数,将局部核函数和全局核函数线性组合构成新的核函数--混合核函数.该核函数吸取了局部核函数和全局核函数的优点.利用混合核函数进行流程企业供应链预测实验,仿真结果验证了该核函数的有效性和正确性. 相似文献
4.
In machine learning literature, deep learning methods have been moving toward greater heights by giving due importance in both data representation and classification methods. The recently developed multilayered arc-cosine kernel leverages the possibilities of extending deep learning features into the kernel machines. Even though this kernel has been widely used in conjunction with support vector machines (SVM) on small-size datasets, it does not seem to be a feasible solution for the modern real-world applications that involve very large size datasets. There are lot of avenues where the scalability aspects of deep kernel machines in handling large dataset need to be evaluated. In machine learning literature, core vector machine (CVM) is being used as a scaling up mechanism for traditional SVMs. In CVM, the quadratic programming problem involved in SVM is reformulated as an equivalent minimum enclosing ball problem and then solved by using a subset of training sample (Core Set) obtained by a faster \((1+\epsilon )\) approximation algorithm. This paper explores the possibilities of using principles of core vector machines as a scaling up mechanism for deep support vector machine with arc-cosine kernel. Experiments on different datasets show that the proposed system gives high classification accuracy with reasonable training time compared to traditional core vector machines, deep support vector machines with arc-cosine kernel and deep convolutional neural network. 相似文献
5.
一种支持向量机的组合核函数 总被引:11,自引:0,他引:11
核函数是支持向量机的核心,不同的核函数将产生不同的分类效果,核函数也是支持向量机理论中比较难理解的一部分。通过引入核函数,支持向量机可以很容易地实现非线性算法。首先探讨了核函数的本质,说明了核函数与所映射空间之间的关系,进一步给出了核函数的构成定理和构成方法,说明了核函数分为局部核函数与全局核函数两大类,并指出了两者的区别和各自的优势。最后,提出了一个新的核函数——组合核函数,并将该核函数应用于支持向量机中,并进行了人脸识别实验,实验结果也验证了该核函数的有效性。 相似文献
6.
Multi-view learning deals with data that is described through multiple representations, or views. While various real-world data can be represented by three or more views, several existing multi-view classification methods can only handle two views. Previously proposed methods usually solve this issue by optimizing pairwise combinations of views. Although this can numerically deal with the issue of multiple views, it ignores the higher order correlations which can only be examined by exploring all views simultaneously. In this work new multi-view classification approaches are introduced which aim to include higher order statistics when three or more views are available. The proposed model is an extension to the recently proposed Restricted Kernel Machine classifier model and assumes shared hidden features for all views, as well as a newly introduced model tensor. Experimental results show an improvement with respect to state-of-the art pairwise multi-view learning methods, both in terms of classification accuracy and runtime. 相似文献
7.
Rhinelander J Liu XP 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2012,42(3):616-626
Kernel machines have gained much popularity in applications of machine learning. Support vector machines (SVMs) are a subset of kernel machines and generalize well for classification, regression, and anomaly detection tasks. The training procedure for traditional SVMs involves solving a quadratic programming (QP) problem. The QP problem scales super linearly in computational effort with the number of training samples and is often used for the offline batch processing of data. Kernel machines operate by retaining a subset of observed data during training. The data vectors contained within this subset are referred to as support vectors (SVs). The work presented in this paper introduces a subset selection method for the use of kernel machines in online, changing environments. Our algorithm works by using a stochastic indexing technique when selecting a subset of SVs when computing the kernel expansion. The work described here is novel because it separates the selection of kernel basis functions from the training algorithm used. The subset selection algorithm presented here can be used in conjunction with any online training technique. It is important for online kernel machines to be computationally efficient due to the real-time requirements of online environments. Our algorithm is an important contribution because it scales linearly with the number of training samples and is compatible with current training techniques. Our algorithm outperforms standard techniques in terms of computational efficiency and provides increased recognition accuracy in our experiments. We provide results from experiments using both simulated and real-world data sets to verify our algorithm. 相似文献
8.
9.
We propose an ensemble method for kernel machines. The training data is randomly split into a number of mutually exclusive partitions defined by a row and column parameter. Each partition forms an input space and is transformed by an automatically selected kernel function into a kernel matrix K. Subsequently, each K is used as training data for a base binary classifier (Random Forest). This results in a number of predictions equal to the number of partitions. A weighted average combines the predictions into one final prediction. To optimize the weights, a genetic algorithm is used. This approach has the advantage of simultaneously promoting (1) diversity, (2) accuracy, and (3) computational speed. (1) Diversity is fostered because the individual K’s are based on a subset of features and observations, (2) accuracy is sought by automatic kernel selection and the genetic algorithm, and (3) computational speed is obtained because the computation of each K can be parallelized. Using five times twofold cross validation we benchmark the classification performance of Kernel Factory against Random Forest and Kernel-Induced Random Forest (KIRF). We find that Kernel Factory has significantly better performance than Kernel-Induced Random Forest. When the right kernel is selected Kernel Factory is also significantly better than Random Forest. In addition, an open-source R-software package of the algorithm (kernelFactory) is available from CRAN. 相似文献
10.
Nicola Ancona Author Vitae Rosalia Maglietta Author VitaeAuthor Vitae 《Pattern recognition》2006,39(9):1588-1603
This paper focuses on the problem of how data representation influences the generalization error of kernel based learning machines like support vector machines (SVM) for classification. Frame theory provides a well founded mathematical framework for representing data in many different ways. We analyze the effects of sparse and dense data representations on the generalization error of such learning machines measured by using leave-one-out error given a finite amount of training data. We show that, in the case of sparse data representations, the generalization error of an SVM trained by using polynomial or Gaussian kernel functions is equal to the one of a linear SVM. This is equivalent to saying that the capacity of separating points of functions belonging to hypothesis spaces induced by polynomial or Gaussian kernel functions reduces to the capacity of a separating hyperplane in the input space. Moreover, we show that, in general, sparse data representations increase or leave unchanged the generalization error of kernel based methods. Dense data representations, on the contrary, reduce the generalization error in the case of very large frames. We use two different schemes for representing data in overcomplete systems of Haar and Gabor functions, and measure SVM generalization error on benchmarked data sets. 相似文献
11.
The main purpose of this paper is to approach the use of formal methods in computing. In more specific terms, we use a temporal logic to formalize the most fundamental aspects of the semantics of UML state machines. We pay special attention to the dynamic aspects of the different operations associated with states and transitions, as well as the behaviour of transitions related with composite states. This, to the best of our knowledge, has not been done heretofore using temporal logic.Our formalization is based on a temporal logic that combines points, intervals, and dates. Moreover this new temporal logic is built over an innovative and simple topological semantics, which simplifies the metatheory development. 相似文献
12.
Linear kernel Support Vector Machine Recursive Feature Elimination (SVM-RFE) is known as an excellent feature selection algorithm. Nonlinear SVM is a black box classifier for which we do not know the mapping function F{Phi} explicitly. Thus, the weight vector w cannot be explicitly computed. In this paper, we proposed a feature selection algorithm utilizing Support Vector Machine with RBF kernel based on Recursive Feature Elimination(SVM-RBF-RFE), which expands nonlinear RBF kernel into its Maclaurin series, and then the weight vector w is computed from the series according to the contribution made to classification hyperplane by each feature. Using wi2{w_i^2} as ranking criterion, SVM-RBF-RFE starts with all the features, and eliminates one feature with the least squared weight at each step until all the features are ranked. We use SVM and KNN classifiers to evaluate nested subsets of features selected by SVM-RBF-RFE. Experimental results based on 3 UCI and 3 microarray datasets show SVM-RBF-RFE generally performs better than information gain and SVM-RFE. 相似文献
13.
Asymptotic behaviors of support vector machines with Gaussian kernel 总被引:97,自引:0,他引:97
Support vector machines (SVMs) with the gaussian (RBF) kernel have been popular for practical use. Model selection in this class of SVMs involves two hyperparameters: the penalty parameter C and the kernel width sigma. This letter analyzes the behavior of the SVM classifier when these hyperparameters take very small or very large values. Our results help in understanding the hyperparameter space that leads to an efficient heuristic method of searching for hyperparameter values with small generalization errors. The analysis also indicates that if complete model selection using the gaussian kernel has been conducted, there is no need to consider linear SVM. 相似文献
14.
In this paper, we propose a light-weight framework using kernel machines for the detection of shellcodes used in drive-by download attacks. As the shellcodes are passed in webpages as JavaScript strings, we studied the effectiveness of the proposed approach on about 9850 shellcodes and 10000 JavaScript strings collected from the wild. Our analysis shows that the trained SVMs (Support Vector Machines) classified with an accuracy of over 99 %. Our evaluation of the trained SVM models with different proportions of training datasets proved to perform consistently with an average accuracy of 99.51 % and the proposed static approach proved to be effective against detecting even the polymorphic shellcode variants. The performance of our approach was compared to an emulation based approach and observed that our approach performed with slightly better accuracies by consuming about 33 % of the time consumed by the emulation based approach. 相似文献
15.
Zhu Xiaobin Li Zhuangzi Zhang Xiao-Yu Li Peng Xue Ziyu Wang Lei 《Multimedia Tools and Applications》2019,78(20):29271-29290
Multimedia Tools and Applications - Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image classification and related tasks. However, the fully-connected... 相似文献
16.
Asymptotic efficiency of kernel support vector machines (SVM) 总被引:1,自引:0,他引:1
The paper analyzes the asymptotic properties of Vapnik’s SVM-estimates of a regression function as the size of the training
sample tends to infinity. The estimation problem is considered as infinite-dimensional minimization of a regularized empirical
risk functional in a reproducing kernel Hilbert space. The rate of convergence of the risk functional on SVM-estimates to
its minimum value is established. The sufficient conditions for the uniform convergence of SVM-estimates to a true regression
function with unit probability are given.
Translated from Kibernetika i Sistemnyi Analiz, No. 4, pp. 81–97, July–August 2009 相似文献
17.
New results on error correcting output codes of kernel machines 总被引:1,自引:0,他引:1
We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of I he margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters. 相似文献
18.
Gaussian mixture model (GMM) based approaches have been commonly used for speaker recognition tasks. Methods for estimation of parameters of GMMs include the expectation-maximization method which is a non-discriminative learning based method. Discriminative classifier based approaches to speaker recognition include support vector machine (SVM) based classifiers using dynamic kernels such as generalized linear discriminant sequence kernel, probabilistic sequence kernel, GMM supervector kernel, GMM-UBM mean interval kernel (GUMI) and intermediate matching kernel. Recently, the pyramid match kernel (PMK) using grids in the feature space as histogram bins and vocabulary-guided PMK (VGPMK) using clusters in the feature space as histogram bins have been proposed for recognition of objects in an image represented as a set of local feature vectors. In PMK, a set of feature vectors is mapped onto a multi-resolution histogram pyramid. The kernel is computed between a pair of examples by comparing the pyramids using a weighted histogram intersection function at each level of pyramid. We propose to use the PMK-based SVM classifier for speaker identification and verification from the speech signal of an utterance represented as a set of local feature vectors. The main issue in building the PMK-based SVM classifier is construction of a pyramid of histograms. We first propose to form hard clusters, using k-means clustering method, with increasing number of clusters at different levels of pyramid to design the codebook-based PMK (CBPMK). Then we propose the GMM-based PMK (GMMPMK) that uses soft clustering. We compare the performance of the GMM-based approaches, and the PMK and other dynamic kernel SVM-based approaches to speaker identification and verification. The 2002 and 2003 NIST speaker recognition corpora are used in evaluation of different approaches to speaker identification and verification. Results of our studies show that the dynamic kernel SVM-based approaches give a significantly better performance than the state-of-the-art GMM-based approaches. For speaker recognition task, the GMMPMK-based SVM gives a performance that is better than that of SVMs using many other dynamic kernels and comparable to that of SVMs using state-of-the-art dynamic kernel, GUMI kernel. The storage requirements of the GMMPMK-based SVMs are less than that of SVMs using any other dynamic kernel. 相似文献
19.
Echo state networks (ESNs) are large, random recurrent neural networks with a single trained linear readout layer. Despite the untrained nature of the recurrent weights, they are capable of performing universal computations on temporal input data, which makes them interesting for both theoretical research and practical applications. The key to their success lies in the fact that the network computes a broad set of nonlinear, spatiotemporal mappings of the input data, on which linear regression or classification can easily be performed. One could consider the reservoir as a spatiotemporal kernel, in which the mapping to a high-dimensional space is computed explicitly. In this letter, we build on this idea and extend the concept of ESNs to infinite-sized recurrent neural networks, which can be considered recursive kernels that subsequently can be used to create recursive support vector machines. We present the theoretical framework, provide several practical examples of recursive kernels, and apply them to typical temporal tasks. 相似文献
20.
The paper focuses on the task of approximate classification of semantically annotated individual resources in ontological knowledge bases. The method is based on classification models built through kernel methods, a well-known class of effective statistical learning algorithms. Kernel functions encode a notion of similarity among elements of some input space. The definition of a family of parametric language-independent kernel functions for individuals occurring in an ontology allows the application of these statistical learning methods on Semantic Web knowledge bases. The classification models induced by kernel methods offer an alternative way to classify individuals with respect to the typical exact and approximate deductive reasoning procedures. The proposed statistical setting enables further inductive approaches to a variety of other tasks that can better cope with the inherent incompleteness of the knowledge bases in the Semantic Web and with their potential incoherence due to their distributed nature. The effectiveness of the proposed method is empirically proved through experiments on the task of approximate classification with real ontologies collected from standard repositories. 相似文献