20 related articles found.
1.
Akira Tanaka Hideyuki Imai Masaaki Miyakoshi 《Pattern recognition》2007,40(11):2930-2938
Kernel machines are widely regarded as powerful tools in many areas of information science. By using a kernel, an unknown target is represented by a function that belongs to the reproducing kernel Hilbert space (RKHS) corresponding to that kernel. The application area can be widened by enlarging the RKHS so that it includes a broad class of functions. In this study, we demonstrate a way to achieve this by integrating a parameterized kernel over its parameter. Numerical experiments show that the otherwise unresolved problem of finding a good parameter value can thereby be sidestepped.
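As an illustration of the idea, the sketch below numerically integrates a Gaussian kernel over its bandwidth parameter: a nonnegatively weighted sum of positive semidefinite kernels is again a valid kernel, and its RKHS contains functions from each component RKHS. The Gaussian family, the log-spaced grid, and the uniform weights are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel with bandwidth sigma."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def integrated_kernel(x, y, sigmas, weights):
    """Quadrature approximation of K(x, y) = integral of k_sigma(x, y)
    over the bandwidth. A nonnegatively weighted sum of PSD kernels is
    again PSD, so this defines a valid, enlarged-RKHS kernel."""
    return sum(w * gaussian_kernel(x, y, s) for s, w in zip(sigmas, weights))

sigmas = np.logspace(-1, 1, 20)                    # log-spaced bandwidth grid
weights = np.full(len(sigmas), 1.0 / len(sigmas))  # uniform weighting

x, y = np.array([0.0, 1.0]), np.array([0.5, 0.2])
print(integrated_kernel(x, y, sigmas, weights))
```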
2.
《Expert systems with applications》2014,41(2):735-743
Short-text classification is increasingly used in a wide range of applications. It remains a challenging problem, however, because word occurrences in short-text documents are sparse, although some recently developed methods that exploit syntactic or semantic information have improved short-text classification performance. The language-dependency problem caused by the heavy use of grammatical tags and lexical databases is the major drawback of these methods when they are applied across diverse languages. In this article, we propose a novel kernel, called the language-independent semantic (LIS) kernel, which can effectively compute the similarity between short-text documents without using grammatical tags or lexical databases. Experiments on English and Korean datasets show that the LIS kernel outperforms several existing kernels.
3.
Properties of Kernel Functions and Methods for Their Construction
The support vector machine is a machine learning technique that has been developing for nearly a decade. It has been applied successfully to pattern recognition, regression estimation, clustering, and more, and has given rise to the broader family of kernel methods. A support vector machine is completely characterized by its kernel function and its training set. The key to further improving support vector machine performance is to design an appropriate kernel function for the given problem, which requires a deep understanding of kernel functions themselves. This paper first analyzes some important properties of kernel functions and then proposes simple, practical criteria for deciding whether a function in each of three classes, translation-invariant kernels, rotation-invariant kernels, and convolution kernels, is a valid kernel. On this basis, many important kernel functions are verified and constructed.
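A necessary property behind any such criterion is Mercer's condition: every Gram matrix of a valid kernel must be positive semidefinite. The sketch below is a minimal empirical check rather than the paper's criteria; it samples random points, tests the Gram matrix, and uses the standard closure rules (sums and products of kernels are kernels) to build a new kernel from a translation-invariant Gaussian kernel and a linear kernel.

```python
import numpy as np

def is_psd_gram(kernel, X, tol=1e-10):
    """Empirical Mercer check: the Gram matrix of a valid kernel must be
    positive semidefinite on every finite sample."""
    K = np.array([[kernel(a, b) for b in X] for a in X])
    return np.linalg.eigvalsh((K + K.T) / 2.0).min() >= -tol

# Translation-invariant kernel k(x, y) = f(x - y): here a Gaussian.
gauss = lambda x, y: np.exp(-np.sum((x - y) ** 2))
# Closure rules: sums and products of kernels are again kernels.
combined = lambda x, y: gauss(x, y) + gauss(x, y) * np.dot(x, y)

X = [np.random.randn(3) for _ in range(50)]
print(is_psd_gram(gauss, X), is_psd_gram(combined, X))  # True True
```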
4.
This paper presents a new regularized kernel-based approach for estimating the second-order moments of stationary stochastic processes. The proposed estimator is defined by a Tikhonov-type variational problem. It contains only a few unknown parameters, which can be estimated by cross-validation by solving a sequence of problems whose computational complexity scales linearly with the number of noisy moments (derived from samples of the process). The correlation functions are assumed to be summable, and the hypothesis space is a reproducing kernel Hilbert space induced by the recently introduced stable spline kernel. In this way, information on the decay to zero of the functions to be reconstructed is incorporated into the estimation process. An application to the identification of transfer functions in the case of white-noise input is also presented. Numerical simulations show that the proposed method compares favorably with standard nonparametric estimation algorithms that exploit oracle-type tuning of their parameters.
5.
Polynomial Time Learnability of Simple Deterministic Languages
This paper is concerned with the problem of learning simple deterministic languages. The algorithm described in this paper is based on the theory of model inference given by Shapiro. In our setting, however, nonterminal membership queries, except for the start symbol, are not permitted. Extended equivalence queries are used instead. Nonterminals that are necessary for a correct grammar and their intended models are introduced automatically. We give an algorithm that, for any simple deterministic language L, outputs a grammar G in 2-standard form, such that L = L(G), using membership queries and extended equivalence queries. We also show that the algorithm runs in time polynomial in the length of the longest counterexample and the number of nonterminals in a minimal grammar for L.
6.
Gregory Gutin Eun Jung Kim Matthias Mnich Anders Yeo 《Journal of Computer and System Sciences》2010,76(8):872-878
We study ordinal embedding relaxations in the realm of parameterized complexity. We prove the existence of a quadratic kernel for the Betweenness problem parameterized above its tight lower bound, which is stated as follows. For a set V of variables and a set C of constraints "vi is between vj and vk", decide whether there is a bijection from V to the set {1,…,|V|} satisfying at least |C|/3+κ of the constraints in C. Our result solves an open problem attributed to Benny Chor in Niedermeier's monograph "Invitation to Fixed-Parameter Algorithms". The Betweenness problem is of interest in molecular biology. The approach developed in this paper can be used to determine the parameterized complexity of a number of other optimization problems on permutations parameterized above or below tight bounds.
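For concreteness, the brute-force sketch below evaluates the decision question directly: it counts how many betweenness constraints a candidate ordering satisfies and searches all orderings for one meeting the |C|/3 + κ target. It is exponential and only for tiny instances; it illustrates the problem statement, not the quadratic kernel, and the example instance is made up.

```python
from itertools import permutations

def count_satisfied(pos, constraints):
    """Count constraints 'v_i is between v_j and v_k' that a placement
    pos (variable -> position) satisfies."""
    return sum(1 for (i, j, k) in constraints
               if min(pos[j], pos[k]) < pos[i] < max(pos[j], pos[k]))

def betweenness_above_tight_bound(variables, constraints, kappa):
    """Decide whether some bijection satisfies at least |C|/3 + kappa
    constraints. Exponential brute force -- tiny instances only."""
    target = len(constraints) / 3.0 + kappa
    for order in permutations(variables):
        pos = {v: p for p, v in enumerate(order)}
        if count_satisfied(pos, constraints) >= target:
            return True
    return False

V = ["a", "b", "c", "d"]
C = [("b", "a", "c"), ("c", "b", "d"), ("a", "b", "d")]
print(betweenness_above_tight_bound(V, C, kappa=1))  # True
```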
7.
Ben Shneiderman 《Software》1973,3(1):5-8
By plotting the key value against the record position in a sorted file and then fitting a least-squares polynomial through the points, a fast retrieval technique is obtained. The target key of a search is substituted into the polynomial, and the first access to the file is made on the basis of the evaluation. Since the maximum deviation of the fit can be determined, an efficient local search can then be made. If the maximum deviation is less than half the size of the file, polynomial searching is more efficient than binary searching. While this method can be applied in many cases, it is most useful in disk-oriented file systems where the goal is to minimize the number of accesses, even at the expense of some additional calculation.
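A minimal in-memory sketch of the technique follows; the original setting is a disk-oriented file system where each probe is a disk access, and the polynomial degree and synthetic keys here are assumptions for illustration.

```python
import numpy as np

def build_index(keys, degree=3):
    """Fit a least-squares polynomial mapping key value -> position in a
    sorted file, and record the fit's maximum deviation."""
    positions = np.arange(len(keys))
    coeffs = np.polyfit(keys, positions, degree)
    max_dev = int(np.ceil(np.max(np.abs(np.polyval(coeffs, keys) - positions))))
    return coeffs, max_dev

def poly_search(keys, coeffs, max_dev, target):
    """First access at the polynomial's estimate, then a bounded local
    search within +/- max_dev records of it."""
    guess = int(round(np.polyval(coeffs, target)))
    lo = max(0, guess - max_dev)
    hi = min(len(keys) - 1, guess + max_dev)
    for i in range(lo, hi + 1):   # a binary search over the window would also work
        if keys[i] == target:
            return i
    return -1

keys = np.sort(np.random.randint(0, 10**6, size=1000))
coeffs, max_dev = build_index(keys)
print(poly_search(keys, coeffs, max_dev, keys[500]))  # 500 (or a duplicate key's slot)
```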
8.
Justine Lebrun Philippe-Henri Gosselin Sylvie Philipp-Foliguet 《Image and vision computing》2011,29(11):716-729
In the framework of online object retrieval with learning, we address the problem of graph matching using kernel functions. An image is represented by a graph of regions whose edges represent the spatial relationships. Kernels on graphs are built from kernels on walks in the graph. This paper first proposes new kernels on graphs and on walks that are very efficient for graphs of regions. Second, we propose fast solutions for exact or approximate computation of these kernels. Third, we show results for the retrieval of images containing a specific object, using very few examples and counter-examples, within an active retrieval scheme.
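The walk-based construction can be illustrated with the classical direct-product-graph walk kernel: walks taken jointly in two labeled graphs correspond to walks in their label-compatible product graph, so counting down-weighted common walks reduces to powers of the product adjacency matrix. This sketch shows that baseline construction only, not the paper's new region-graph kernels or its fast approximations.

```python
import numpy as np

def product_walk_kernel(A1, labels1, A2, labels2, max_len=3, decay=0.5):
    """Count label-compatible common walks of length 1..max_len in two
    graphs via their direct product graph, with geometric down-weighting."""
    pairs = [(i, j) for i in range(len(labels1)) for j in range(len(labels2))
             if labels1[i] == labels2[j]]
    n = len(pairs)
    Ax = np.zeros((n, n))
    for s, (i, j) in enumerate(pairs):
        for t, (u, v) in enumerate(pairs):
            Ax[s, t] = A1[i][u] * A2[j][v]   # edge present in both graphs
    value, power = 0.0, np.eye(n)
    for length in range(1, max_len + 1):
        power = power @ Ax                   # walks of this length
        value += (decay ** length) * power.sum()
    return value

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # 3-node path graph
print(product_walk_kernel(A, ["r", "g", "r"], A, ["r", "g", "r"]))
```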
9.
Harald K. Wimmer 《Systems & Control Letters》1981,1(3):200-203
10.
Nonlinear Kernel Forms of Classical Linear Algorithms
Nonlinear kernel forms of classical linear algorithms are a class of nonlinear machine learning techniques developed over the past decade. Their most distinctive feature is the use of kernel functions satisfying Mercer's condition to elegantly derive nonlinear forms of linear algorithms, formulated as optimization problems whose size depends on the number of samples but not on the dimensionality. To improve numerical stability, control generalization ability, and improve the convergence of iterative procedures, some of these algorithms also employ regularization. Building on an overview of the kernel idea, kernel functions, and regularization techniques, this paper systematically surveys the nonlinear kernel forms of classical linear algorithms, analyzes their strengths and weaknesses, and discusses directions for further development.
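Kernel ridge regression is perhaps the simplest instance of the pattern this survey describes: the regularized linear solution is rewritten through the kernel trick as a problem in the number of samples rather than the feature dimension. The sketch below is a generic textbook example, assuming an RBF kernel and synthetic data; it is not tied to any specific algorithm from the survey.

```python
import numpy as np

def kernel_ridge_fit(X, y, kernel, lam=1e-2):
    """Nonlinear kernel form of ridge regression: instead of solving for
    a weight vector in feature space, solve alpha = (K + lam*I)^{-1} y,
    a system whose size is the number of samples, not the dimension."""
    K = np.array([[kernel(a, b) for b in X] for a in X])
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, kernel, x):
    """Prediction is a kernel expansion over the training samples."""
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, X_train))

rbf = lambda a, b: np.exp(-np.sum((a - b) ** 2))   # a Mercer kernel
X = np.random.randn(100, 2)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(100)
alpha = kernel_ridge_fit(X, y, rbf)                # lam is the regularizer
print(kernel_ridge_predict(X, alpha, rbf, X[0]))
```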
11.
Xueying Zhang Xiaofeng Liu Zizhong John Wang 《Engineering Applications of Artificial Intelligence》2013,26(10):2574-2580
The kernel function is the core of the Support Vector Machine (SVM), and its selection directly affects the performance of the SVM. There has been no theoretical basis for choosing a kernel function for speech recognition. To improve the learning ability and generalization ability of SVMs for speech recognition, this paper presents the Optimal Relaxation Factor (ORF) kernel, a family of new SVM kernel functions for speech recognition, and proves that ORF functions are Mercer kernels. Experiments show the ORF kernel's effectiveness on mapping-trend, bi-spiral, and speech recognition problems. The paper concludes that the ORF kernel performs better than the Radial Basis Function (RBF), the Exponential Radial Basis Function (ERBF), and the Kernel with Moderate Decreasing (KMOD). Furthermore, speech recognition with the ORF kernel yields higher recognition accuracy.
12.
Hans L. Bodlaender Rodney G. Downey Michael R. Fellows Danny Hermelin 《Journal of Computer and System Sciences》2009,75(8):423-434
Kernelization is a strong and widely applied technique in parameterized complexity. A kernelization algorithm, or simply a kernel, is a polynomial-time transformation that transforms any given parameterized instance into an equivalent instance of the same problem, with size and parameter bounded by a function of the parameter in the input. A kernel is polynomial if the size and parameter of the output are polynomially bounded by the parameter of the input. In this paper we develop a framework that allows showing that a wide range of FPT problems do not have polynomial kernels. Our evidence relies on hypotheses made in the classical world (i.e., non-parameterized complexity) and revolves around a new type of algorithm for classical decision problems, called a distillation algorithm, which is of independent interest. Using the notion of distillation algorithms, we develop a generic lower-bound engine that allows us to show that a variety of FPT problems, fulfilling certain criteria, cannot have polynomial kernels unless the polynomial hierarchy collapses. These problems include k-Path, k-Cycle, k-Exact Cycle, k-Short Cheap Tour, k-Graph Minor Order Test, k-Cutwidth, k-Search Number, k-Pathwidth, k-Treewidth, k-Branchwidth, and several optimization problems parameterized by treewidth and other structural parameters.
13.
Chemoinformatics is a research field concerned with the study of physical or biological molecular properties through computer science research fields such as machine learning and graph theory. From this point of view, graph kernels provide a nice framework that naturally combines machine learning and graph theory techniques. Graph kernels based on bags of patterns have proven their efficiency on several problems, both in accuracy and in computational time. The treelet kernel is a graph kernel based on a bag of small subtrees. In this paper we propose several extensions of this kernel devoted to chemoinformatics problems. These extensions aim to weight each pattern according to its influence, to include the comparison of non-isomorphic patterns, to include stereo information, and finally to explicitly encode cyclic information into the kernel computation.
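The bag-of-patterns idea underlying the treelet kernel reduces, in its unweighted core, to an inner product of pattern-count histograms. The sketch below treats extracted treelets as opaque identifiers; the identifiers are made up for illustration, and the plain dot product omits the influence weighting and the stereo/cycle extensions the paper proposes.

```python
from collections import Counter

def bag_of_patterns_kernel(bag1, bag2):
    """Inner product of pattern-count histograms: the unweighted core of
    treelet-style kernels over common small substructures."""
    c1, c2 = Counter(bag1), Counter(bag2)
    return sum(c1[p] * c2[p] for p in c1.keys() & c2.keys())

# Hypothetical treelet identifiers extracted from two molecular graphs.
g1 = ["C-C", "C-O", "C-C-O", "C-C", "C-N"]
g2 = ["C-C", "C-C-O", "C-O", "C-O"]
print(bag_of_patterns_kernel(g1, g2))  # 2*1 + 1*2 + 1*1 = 5
```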
14.
Multiple kernel learning is a new focus of research in kernel-based machine learning. Kernel methods are an effective way to solve nonlinear pattern analysis problems, but in some complex situations a kernel machine built from a single kernel function cannot meet practical requirements such as heterogeneous or irregular data, very large sample sizes, or unevenly distributed samples, so combining multiple kernel functions to obtain better results is a natural choice. Organized by how multiple kernels are constructed, this paper systematically surveys the theory of multiple kernel construction from three angles: composite kernels, multi-scale kernels, and infinite kernels. It analyzes the characteristics and shortcomings of typical multiple kernel learning methods, summarizes their respective application domains, and distills directions for further research.
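The simplest composite-kernel construction mentioned above is a convex combination of base Gram matrices, which is guaranteed to remain positive semidefinite. The sketch below fixes the weights by hand over a multi-scale family of Gaussian kernels; an actual multiple kernel learning method would learn the weights jointly with the classifier, and the data here are synthetic.

```python
import numpy as np

def composite_gram(grams, weights):
    """Convex combination of base Gram matrices: the result is again a
    valid (PSD) Gram matrix, i.e. a composite kernel."""
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0) and abs(weights.sum() - 1.0) < 1e-9
    return sum(w * K for w, K in zip(weights, grams))

X = np.random.randn(30, 4)

def gaussian_gram(sigma):
    """Gram matrix of a Gaussian kernel at one scale."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

# Multi-scale family with hand-fixed weights (a real MKL method learns them).
K = composite_gram([gaussian_gram(s) for s in (0.5, 1.0, 2.0)],
                   [0.2, 0.5, 0.3])
print(np.linalg.eigvalsh(K).min() >= -1e-10)  # PSD check: True
```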
15.
Dimensionality reduction is an important and challenging task in machine learning and data mining. Feature selection and feature extraction are two commonly used techniques for decreasing the dimensionality of data and increasing the efficiency of learning algorithms. In particular, feature selection in the absence of class labels, namely unsupervised feature selection, is challenging and interesting. In this paper, we propose a new unsupervised feature selection criterion developed from the viewpoint of subspace learning, treated as a matrix factorization problem. The advantages of this work are four-fold. First, building on matrix factorization, a unified framework is established for feature selection, feature extraction, and clustering. Second, an iterative update algorithm is provided via matrix factorization, which is an efficient way to deal with high-dimensional data. Third, an effective method for feature selection with numeric data is put forward that does not rely on a discretization step. Fourth, the new criterion provides a sound foundation for embedding kernel tricks into feature selection, and an algorithm based on kernel methods is also proposed. The algorithms are compared with four state-of-the-art feature selection methods on six publicly available datasets. Experimental results demonstrate that, in terms of clustering results, the proposed two algorithms outperform the others on almost all datasets tested.
16.
The minimum class variance support vector machine (MCVSVM) and the large margin linear projection (LMLP) classifier, in contrast with the traditional support vector machine (SVM), take the distribution of the data into account and can obtain better performance. However, when the within-class scatter matrix is singular, both MCVSVM and LMLP exploit the discriminant information in only a single subspace of the within-class scatter matrix and discard the discriminant information in the other subspace. In this paper, a so-called twin-space support vector machine (TSSVM) algorithm is proposed for high-dimensional classification tasks where the within-class scatter matrix is singular. TSSVM is rooted in both the non-null space and the null space of the within-class scatter matrix, takes full advantage of the discriminant information in the two subspaces, and so can achieve better classification accuracy. We first discuss the linear case of TSSVM and then develop the nonlinear TSSVM. Experimental results on real datasets validate the effectiveness of TSSVM and indicate its superior performance over MCVSVM and LMLP.
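The subspace split at the heart of this discussion can be computed directly: when there are fewer samples than dimensions the within-class scatter matrix is singular, and an eigendecomposition separates its non-null space from its null space. The sketch below shows just that split on a small synthetic two-class dataset; it is not an implementation of TSSVM itself.

```python
import numpy as np

def within_class_subspaces(X, y, tol=1e-10):
    """Split feature space into the non-null space and the null space of
    the within-class scatter matrix Sw (singular when dim > #samples)."""
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        D = Xc - Xc.mean(axis=0)
        Sw += D.T @ D
    U, s, _ = np.linalg.svd(Sw)
    r = int((s > tol * s.max()).sum())   # numerical rank of Sw
    return U[:, :r], U[:, r:]            # non-null-space and null-space bases

X = np.random.randn(10, 50)              # 10 samples in 50 dimensions
y = np.array([0] * 5 + [1] * 5)
B_range, B_null = within_class_subspaces(X, y)
print(B_range.shape, B_null.shape)       # (50, 8) (50, 42)
```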
17.
Generally, links among objects exhibit certain patterns and contain rich semantic clues that can be used to improve classification accuracy. However, many real-world link data exhibit more complex regularity; for example, there may be noisy links that carry no human editorial endorsement of a semantic relationship. To effectively capture such regularity, this paper proposes latent linkage semantic kernels (LLSKs) by first introducing linkage kernels to model the local and global dependency structure of a link graph and then applying the singular value decomposition (SVD) in the kernel-induced space. For computational efficiency on large datasets, we also develop a block-based algorithm for LLSKs. A kernel-based contextual dependency network (KCDN) model is then presented to exploit the dependencies in a network of objects for collective classification. We provide experimental results showing that the KCDN model, together with LLSKs, is relatively robust on datasets with complex link regularity, and that the block-based computation method scales well with the size of the problem.
18.
Combining visual dictionary, kernel-based similarity and learning strategy for image category retrieval
Philippe Henri Gosselin Matthieu Cord Sylvie Philipp-Foliguet 《Computer Vision and Image Understanding》2008,110(3):403
This paper presents a search engine architecture, RETIN, aimed at retrieving complex categories in large image databases. For indexing, a scheme based on a two-step quantization process is presented to compute visual codebooks. The similarity between images is represented in a kernel framework. This similarity is combined with online learning strategies motivated by recent machine-learning developments such as active learning. Additionally, offline supervised learning is embedded in the kernel framework, offering a real opportunity to learn semantic categories. Experiments in a realistic scenario, carried out on the Corel photo database, demonstrate the efficiency and relevance of the RETIN strategy and its strong performance in comparison with state-of-the-art strategies.
19.
Enliang Hu Songcan Chen Lishan Qiao 《Neurocomputing》2011,74(17):2725-2733
We introduce a kernel learning algorithm, called kernel propagation (KP), to learn a nonparametric kernel from a mixture of a few pairwise constraints and plentiful unlabeled samples. Specifically, KP consists of two stages: the first learns a small sub-kernel matrix restricted to the samples with constraints, and the second propagates this learned sub-kernel matrix into a full-sized kernel matrix over all samples. Interestingly, our approach exposes a natural connection between KP and label propagation (LP): an LP can naturally induce its KP counterpart. We thus develop three KPs from the three typical LPs. Following the idea in KP, we also develop an out-of-sample extension that directly yields a kernel matrix for data outside the training set without relearning. Experiments verify that our developments are more efficient, more error-tolerant, and comparably effective relative to the state-of-the-art algorithm.
20.
Aurélien Bellet Marc Bernard Thierry Murgue Marc Sebban 《Pattern recognition》2010,43(6):2330-2339
During the past few years, several works have derived string kernels from probability distributions. For instance, the Fisher kernel uses a generative model M (e.g. a hidden Markov model) and compares two strings according to how they are generated by M. Marginalized kernels, on the other hand, compute the joint similarity between two instances by summing conditional probabilities. In this paper, we adapt this approach to edit distance-based conditional distributions and present a way to learn a new string edit kernel. We show that the practical computation of such a kernel between two strings x and x′ built from an alphabet Σ requires (i) learning edit probabilities in the form of the parameters of a stochastic state machine and (ii) calculating an infinite sum over Σ* by resorting to the intersection of probabilistic automata, as done for rational kernels. On a handwritten character recognition task, our new kernel outperforms not only the state-of-the-art string kernels and string edit kernels but also the standard edit distance used by a neighborhood-based classifier.
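The baseline the authors compare against, the standard edit distance turned into a similarity, can be sketched in a few lines. The distance-substitution kernel below, exp(-γ·d(x, x′)), is a common naive construction and is not guaranteed positive semidefinite in general; the paper's contribution (learning edit probabilities with a stochastic state machine and marginalizing over Σ*) is precisely what this sketch does not do.

```python
import numpy as np

def levenshtein(s, t):
    """Standard dynamic-programming edit distance."""
    m, n = len(s), len(t)
    D = np.zeros((m + 1, n + 1), dtype=int)
    D[:, 0], D[0, :] = np.arange(m + 1), np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            D[i, j] = min(D[i - 1, j] + 1, D[i, j - 1] + 1,
                          D[i - 1, j - 1] + cost)
    return int(D[m, n])

def naive_edit_kernel(s, t, gamma=0.5):
    """Distance substitution k(x, x') = exp(-gamma * d(x, x')): simple,
    but not guaranteed PSD, unlike a properly learned string kernel."""
    return np.exp(-gamma * levenshtein(s, t))

print(naive_edit_kernel("kernel", "kernels"))  # exp(-0.5) ~ 0.607
```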