Similar literature
 20 similar documents found; search time 15 ms
1.
We solve the argument mining problem by investigating discourse and communicative text structure. A new formal graph-based structure called the communicative discourse tree (CDT) is defined. It consists of a discourse tree with additional labels on edges, which stand for verbs. These verbs represent communicative actions. Discourse trees are based on rhetorical relations, extracted from a text according to Rhetorical Structure Theory. The problem is tackled as a binary classification task, where the positive class corresponds to texts with arguments and the negative class corresponds to texts with no arguments. Feature engineering for the classification task determines which syntactic and discourse features are associated with logical argumentation. A text classification framework based on syntactic, discourse and communicative discourse text structures is implemented with a number of learning approaches. An evaluation on a combined dataset is provided.

2.
Composite kernel learning   Total citations: 2, self-citations: 0, citations by others: 2
The Support Vector Machine is an acknowledged powerful tool for building classifiers, but it lacks flexibility, in the sense that the kernel is chosen prior to learning. Multiple Kernel Learning makes it possible to learn the kernel from an ensemble of basis kernels, whose combination is optimized in the learning process. Here, we propose Composite Kernel Learning to address the situation where distinct components give rise to a group structure among kernels. Our formulation of the learning problem encompasses several setups, putting more or less emphasis on the group structure. We characterize the convexity of the learning problem, and provide a general wrapper algorithm for computing solutions. Finally, we illustrate the behavior of our method on multi-channel data where groups correspond to channels.
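One simple way to picture a group structure among kernels is a two-level weighting, with one weight per group (for example, per channel) and one weight per basis kernel. The sketch below only illustrates that idea: the function name, the weighting scheme and all numeric values are assumptions chosen by hand, and nothing here reproduces the optimization described in the abstract.

```python
def combine_grouped_kernels(gram_matrices, groups, group_weights, kernel_weights):
    """Weighted sum of Gram matrices with a two-level (group, kernel) weighting.

    gram_matrices: list of n x n Gram matrices (lists of lists).
    groups: list of lists of kernel indices, e.g. one group per channel.
    group_weights / kernel_weights: non-negative weights, fixed by hand here
    (hypothetical values; a learning method would optimize them).
    """
    n = len(gram_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for g, members in enumerate(groups):
        for m in members:
            w = group_weights[g] * kernel_weights[m]
            for i in range(n):
                for j in range(n):
                    combined[i][j] += w * gram_matrices[m][i][j]
    return combined


# Three basis kernels on two samples; kernels 0 and 1 form one group
# (say, channel A) and kernel 2 forms another (channel B).
K = [
    [[1.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.2], [0.2, 1.0]],
    [[1.0, 0.8], [0.8, 1.0]],
]
combined = combine_grouped_kernels(
    K,
    groups=[[0, 1], [2]],
    group_weights=[0.5, 0.5],
    kernel_weights=[0.5, 0.5, 1.0],
)
```

Setting a group weight to zero removes every kernel of that group at once, which is the kind of group-level sparsity a channel-selection setting would look for.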

3.
Kernel machines such as Support Vector Machines (SVM) have exhibited successful performance in pattern classification problems mainly due to their exploitation of potentially nonlinear affinity structures of data through the kernel functions. Hence, selecting an appropriate kernel function, equivalently learning the kernel parameters accurately, has a crucial impact on the classification performance of the kernel machines. In this paper we consider the problem of learning a kernel matrix in a binary classification setup, where the hypothesis kernel family is represented as a convex hull of fixed basis kernels. While many existing approaches involve computationally intensive quadratic or semi-definite optimization, we propose novel kernel learning algorithms based on large margin estimation of Parzen window classifiers. The optimization is cast as instances of linear programming. This significantly reduces the complexity of the kernel learning compared to existing methods, while our large margin based formulation provides tight upper bounds on the generalization error. We empirically demonstrate that the new kernel learning methods maintain or improve the accuracy of the existing classification algorithms while significantly reducing the learning time on many real datasets in both supervised and semi-supervised settings.
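As a rough illustration of the two ingredients named in this abstract, the sketch below builds a kernel as a convex combination of two RBF basis kernels and uses it in a Parzen window classifier, which labels a point by comparing its mean kernel similarity to each class. The weights, the RBF widths and the toy data are hypothetical and hand-picked; the abstract's large margin linear program for choosing the weights is not implemented here.

```python
import math


def rbf(x, y, gamma):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def combined_kernel(x, y, weights, gammas):
    """Convex combination of RBF basis kernels (weights >= 0, summing to 1)."""
    return sum(w * rbf(x, y, g) for w, g in zip(weights, gammas))


def parzen_score(x, train, labels, weights, gammas):
    """Mean similarity to the positive class minus mean similarity to the negative class."""
    pos = [p for p, y in zip(train, labels) if y == +1]
    neg = [p for p, y in zip(train, labels) if y == -1]
    s_pos = sum(combined_kernel(x, p, weights, gammas) for p in pos) / len(pos)
    s_neg = sum(combined_kernel(x, p, weights, gammas) for p in neg) / len(neg)
    return s_pos - s_neg


def parzen_predict(x, train, labels, weights, gammas):
    """Predict +1 or -1 from the sign of the Parzen window score."""
    return 1 if parzen_score(x, train, labels, weights, gammas) >= 0 else -1


train = [(0.0, 0.0), (0.2, 0.1), (2.0, 2.0), (2.1, 1.9)]
labels = [+1, +1, -1, -1]
weights = [0.7, 0.3]  # hypothetical kernel weights, not learned
gammas = [0.5, 2.0]   # hypothetical widths of the two basis kernels
```

For fixed data the score is a linear function of the kernel weights, which is consistent with the abstract's statement that the weight optimization can be cast as linear programming.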

4.
We claim and present arguments to the effect that a large class of manifold learning algorithms that are essentially local and can be framed as kernel learning algorithms will suffer from the curse of dimensionality, at the dimension of the true underlying manifold. This observation invites an exploration of nonlocal manifold learning algorithms that attempt to discover shared structure in the tangent planes at different positions. A training criterion for such an algorithm is proposed, and experiments estimating a tangent plane prediction function are presented, showing its advantages with respect to local manifold learning algorithms: it is able to generalize very far from training data (on learning handwritten character image rotations), where local nonparametric methods fail.

5.
Constrained clustering methods (which usually use must-link and/or cannot-link constraints) have received much attention in the last decade. Recently, kernel adaptation or kernel learning has been considered a powerful approach for constrained clustering. However, these methods usually either allow only special forms of kernels or learn non-parametric kernel matrices and scale very poorly. Therefore, they either learn a metric that has low flexibility or are applicable only to small data sets due to their high computational complexity. In this paper, we propose a more efficient non-linear metric learning method that learns a low-rank kernel matrix from must-link and cannot-link constraints and the topological structure of data. We formulate the proposed method as a trace ratio optimization problem and learn appropriate distance metrics by finding optimal low-rank kernel matrices. We solve the proposed optimization problem much more efficiently than SDP solvers. Additionally, we show that spectral clustering methods can be considered a special form of low-rank kernel learning methods. Extensive experiments have demonstrated the superiority of the proposed method compared to recently introduced kernel learning methods.

6.
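Wang Tiejian (王铁建), Wu Fei (吴飞), Jing Xiaoyuan (荆晓远). 《计算机科学》 (Computer Science), 2017, 44(12): 131-134, 168
A multiple kernel dictionary learning method is proposed for predicting whether software modules contain defects. The historical data used for software defect prediction has a complex structure and is class-imbalanced. A composite kernel built from multiple kernel functions maps the data into a high-dimensional feature space, and selecting the multiple kernel dictionary bases yields a class-balanced multiple kernel dictionary, which is used to classify new software modules and determine whether they contain defects. Experiments on the NASA MDP datasets show that, compared with other software defect prediction methods, multiple kernel dictionary learning handles the complex structure and class imbalance of historical defect data and solves the software defect prediction problem well.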

7.
This paper addresses the problem of optimal feature extraction from a wavelet representation. Our work aims at building features by selecting wavelet coefficients resulting from signal or image decomposition on an adapted wavelet basis. For this purpose, we jointly learn in a kernelized large-margin context the wavelet shape as well as the appropriate scale and translation of the wavelets, hence the name “wavelet kernel learning”. This problem is posed as a multiple kernel learning problem, where the number of kernels can be very large. To solve such a problem, we introduce a novel multiple kernel learning algorithm based on active constraints methods. We furthermore propose some variants of this algorithm that can produce approximate solutions more efficiently. Empirical analysis shows that our active constraint MKL algorithm achieves state-of-the-art efficiency. When used for wavelet kernel learning, our experimental results show that the approaches we propose are competitive with the state of the art on brain–computer interface and Brodatz texture datasets.

8.
We propose a general framework to incorporate first-order logic (FOL) clauses, which are thought of as an abstract and partial representation of the environment, into kernel machines that learn within a semi-supervised scheme. We rely on a multi-task learning scheme where each task is associated with a unary predicate defined on the feature space, while higher level abstract representations consist of FOL clauses made of those predicates. We re-use the kernel machine mathematical apparatus to solve the problem as primal optimization of a function composed of the loss on the supervised examples, the regularization term, and a penalty term that enforces real-valued constraints derived from the predicates. Unlike for classic kernel machines, however, depending on the logic clauses, the overall function to be optimized is no longer convex. An important contribution is to show that while tackling the optimization by classic numerical schemes is likely to be hopeless, a stage-based learning scheme, in which we first learn the supervised examples until convergence is reached and then continue by enforcing the logic clauses, is a viable direction to attack the problem. Some promising experimental results are given on artificial learning tasks and on the automatic tagging of bibtex entries to emphasize the comparison with plain kernel machines.

9.
Software defect prediction aims to predict the defect proneness of new software modules from historical defect data so as to improve the quality of a software system. Historical software defect data has a complicated structure and is markedly class-imbalanced; how to fully analyze and utilize the existing historical defect data and build more precise and effective classifiers has attracted considerable interest from researchers in both academia and industry. Multiple kernel learning and ensemble learning are effective techniques in the field of machine learning. Multiple kernel learning can map the historical defect data to a higher-dimensional feature space where they are better represented, and ensemble learning can use a series of weak classifiers to reduce the bias generated by the majority class and obtain better predictive performance. In this paper, we propose to use multiple kernel learning to predict software defects. Using the characteristics of the metrics mined from open source software, we obtain a multiple kernel classifier through an ensemble learning method, which has the advantages of both multiple kernel learning and ensemble learning. We thus propose a multiple kernel ensemble learning (MKEL) approach for software defect classification and prediction. Considering the cost of risk in software defect prediction, we design a new sample weight vector updating strategy to reduce the cost of risk caused by misclassifying defective modules as non-defective ones. We employ the widely used NASA MDP datasets as test data to evaluate the performance of all compared methods; experimental results show that MKEL outperforms several representative state-of-the-art defect prediction methods.

10.
Kernels and Distances for Structured Data   Total citations: 4, self-citations: 2, citations by others: 4
Gärtner, Thomas; Lloyd, John W.; Flach, Peter A. Machine Learning, 2004, 57(3): 205-232
This paper brings together two strands of machine learning of increasing importance: kernel methods and highly structured data. We propose a general method for constructing a kernel following the syntactic structure of the data, as defined by its type signature in a higher-order logic. Our main theoretical result is the positive definiteness of any kernel thus defined. We report encouraging experimental results on a range of real-world data sets. By converting our kernel to a distance pseudo-metric for 1-nearest neighbour, we were able to improve the best accuracy from the literature on the Diterpene data set by more than 10%.
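The conversion from a kernel to a distance mentioned at the end of this abstract can be sketched with the standard identity d(x, y) = sqrt(k(x, x) - 2 k(x, y) + k(y, y)), which holds for any positive definite kernel k. The example below is a minimal sketch under stated assumptions: it uses a plain set intersection kernel as a stand-in for the paper's type-driven kernels, and the toy sets and labels are hypothetical.

```python
import math


def set_kernel(a, b):
    """Intersection kernel on finite sets: the inner product of indicator vectors."""
    return float(len(a & b))


def kernel_distance(a, b, kernel):
    """Pseudo-metric induced by a positive definite kernel."""
    sq = kernel(a, a) - 2.0 * kernel(a, b) + kernel(b, b)
    return math.sqrt(max(sq, 0.0))  # guard against tiny negative round-off


def nearest_neighbour(query, examples, kernel):
    """Return the label of the training example closest to the query (1-NN)."""
    _, label = min(
        ((kernel_distance(query, x, kernel), y) for x, y in examples),
        key=lambda pair: pair[0],
    )
    return label


# Hypothetical toy data: each example is a set of tokens with a class label.
examples = [
    (frozenset({"a", "b", "c"}), "class-1"),
    (frozenset({"a", "b", "d"}), "class-2"),
]
query = frozenset({"a", "b", "c", "e"})
predicted = nearest_neighbour(query, examples, set_kernel)
```

For the intersection kernel the squared distance is simply the size of the symmetric difference of the two sets, so the query above lies at distance 1 from the first example and sqrt(3) from the second.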

11.
A new document management system is proposed in this paper. Its kernel is based on a new set of neuro-fuzzy systems of the ART family: FasArt and RFasArt. The first one, FasArt, is used to support a simple Optical Character Recognition (OCR) module that inherits fine properties of ART architectures, such as fast and incremental learning, stability and modularity. RFasArt, in turn, is a new recurrent version of FasArt that efficiently exploits contextual information in the task of logical labeling. The proposed system is extensively tested in two real-world applications, i.e. e-mail of printed business letters and a digital library of scientific papers. Experimental results show logical labeling and OCR rates over 90%. The proposed system compares favorably with a previous system proposed by the group, which employed a postprocessing Viterbi-based model instead of using contextual information in an integrated way.

12.
In this paper, multiple kernel learning (MKL) is formulated as a supervised classification problem. We deal with binary classification data, and hence the data modelling problem involves the computation of two decision boundaries, one related to kernel learning and the other to the input data. In our approach, they are found with the aid of a single cost function by constructing a global reproducing kernel Hilbert space (RKHS) as the direct sum of the RKHSs corresponding to the decision boundaries of kernel learning and input data, and searching the global RKHS for the function that can be represented as the direct sum of the decision boundaries under consideration. In our experimental analysis, the proposed model showed superior performance in comparison with the existing two-stage function approximation formulation of MKL, where the decision functions of kernel learning and input data are found separately using two different cost functions. This is because the single-stage representation helps knowledge transfer between the computation procedures for finding the decision boundaries of kernel learning and input data, which in turn boosts the generalisation capacity of the model.

13.
Kernel learning is widely used in many areas, and many methods have been developed. As a well-known kernel learning method, kernel principal component analysis (KPCA) suffers from two problems in practical applications. The first is that all training samples need to be stored for computing the kernel matrix during kernel learning. The second is that the kernel and its parameter have a heavy influence on the performance of kernel learning. To solve these problems, we present a novel kernel learning method, namely sparse data-dependent kernel principal component analysis, which reduces the training samples with a sparse learning-based least squares support vector machine and adaptively self-optimizes the kernel structure according to the input training samples. Experimental results on UCI datasets, the ORL and YALE face databases, and the Wisconsin Breast Cancer database show that it is feasible to improve KPCA by reducing storage requirements and optimizing the kernel structure.

14.
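The most important part of document image understanding is the extraction of logical structure. Current research focuses mainly on page layout analysis; the few studies of document logical structure address only single-page documents or multi-page documents with simple relationships between pages. Construction bid documents are special in that their hierarchical logical structure is not marked by explicit index information. This paper proposes a method that obtains the logical structure of a document from the reference relationships between pages. The method represents the logical structure with a modified tree structure; building the logical tree is itself the process of obtaining the logical structure, and the tree also facilitates higher-level semantic processing and restored output. The method has been implemented in an automatic bid document processing system, ensuring the flexibility and efficiency of that system.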

15.
Approaches to distance metric learning (DML) for the Mahalanobis distance metric involve estimating a parametric matrix that is associated with a linear transformation. For complex pattern analysis tasks, it is necessary to consider approaches to DML that involve estimating a parametric matrix that is associated with a nonlinear transformation. One such approach involves performing the DML of the Mahalanobis distance in the feature space of a Mercer kernel. In this approach, the problem of estimating a parametric matrix of the Mahalanobis distance is formulated as a problem of learning an optimal kernel Gram matrix from the kernel Gram matrix of a base kernel by minimizing the logdet divergence between the kernel Gram matrices. We propose to use the optimal kernel Gram matrices learnt from the kernel Gram matrix of the base kernels in pattern analysis tasks such as clustering, multi-class pattern classification and nonlinear principal component analysis. We consider commonly used kernels such as the linear kernel, polynomial kernel, radial basis function kernel and exponential kernel, as well as hyper-ellipsoidal kernels, as the base kernels for optimal kernel learning. We study the performance of the DML-based class-specific kernels for multi-class pattern classification using support vector machines. Results of our experimental studies on benchmark datasets demonstrate the effectiveness of the DML-based kernels for different pattern analysis tasks.
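The link between the parametric matrix and a linear transformation in the first sentence can be checked numerically: for M = L^T L, the Mahalanobis distance between x and y equals the Euclidean distance between Lx and Ly. The sketch below assumes a hand-picked 2 x 2 matrix L and two sample points; it does not learn M and does not cover the kernelized, logdet-based approach of the abstract.

```python
import math


def mahalanobis(x, y, M):
    """sqrt((x - y)^T M (x - y)) for a symmetric positive semi-definite M."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(M[i][j] * d[j] for j in range(len(d))) for i in range(len(d))]
    return math.sqrt(sum(d[i] * Md[i] for i in range(len(d))))


def transform(L, x):
    """Apply the linear map L to the vector x."""
    return [sum(L[i][j] * x[j] for j in range(len(x))) for i in range(len(L))]


def euclidean(x, y):
    """Ordinary Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical linear transformation L and the matrix M = L^T L it induces.
L = [[2.0, 0.0], [1.0, 1.0]]
M = [[sum(L[k][i] * L[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

x, y = [1.0, 2.0], [3.0, -1.0]
d_mahalanobis = mahalanobis(x, y, M)
d_transformed = euclidean(transform(L, x), transform(L, y))
```

Both quantities agree up to floating-point round-off, which is why learning M can be read as learning a linear map of the input space.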

16.
17.
This paper addresses the problem of combining multi-modal kernels in situations in which object correspondence information is unavailable between modalities, for instance, where missing feature values exist, or when using proprietary databases in multi-modal biometrics. The method thus seeks to recover inter-modality kernel information so as to enable classifiers to be built within a composite embedding space. This is achieved through a principled group-wise identification of objects within differing modal kernel matrices in order to form a composite kernel matrix that retains the full freedom of linear kernel combination existing in multiple kernel learning. The underlying principle is derived from the notion of tomographic reconstruction, which has been applied successfully in conventional pattern recognition. In setting out this method, we aim to improve upon object-correspondence insensitive methods, such as kernel matrix combination via the Cartesian product of object sets, to which the method defaults when no pairwise object identifications are discovered. We benchmark the method against the augmented kernel method, an order-insensitive approach derived from the direct sum of constituent kernel matrices, and also against straightforward additive kernel combination where the correspondence information is given a priori. We find that the proposed method gives rise to substantial performance improvements.

18.
An introduction to kernel-based learning algorithms   Total citations: 155, self-citations: 0, citations by others: 155
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples of successful kernel-based learning methods. We first give a short background about Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel-based learning in supervised and unsupervised scenarios, including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.

19.
A new formulation for multiway spectral clustering is proposed. This method corresponds to a weighted kernel principal component analysis (PCA) approach based on primal-dual least-squares support vector machine (LS-SVM) formulations. The formulation allows the extension to out-of-sample points. In this way, the proposed clustering model can be trained, validated, and tested. The clustering information is contained in the eigendecomposition of a modified similarity matrix derived from the data. This eigenvalue problem corresponds to the dual solution of a primal optimization problem formulated in a high-dimensional feature space. A model selection criterion called the Balanced Line Fit (BLF) is also proposed. This criterion is based on the out-of-sample extension and exploits the structure of the eigenvectors and the corresponding projections when the clusters are well formed. The BLF criterion can be used to obtain clustering parameters in a learning framework. Experimental results with difficult toy problems and image segmentation show improved performance in terms of generalization to new samples and computation times.

20.
In machine learning and statistics, kernel density estimators are rarely used on multivariate data due to the difficulty of finding an appropriate kernel bandwidth that avoids overfitting. However, recent advances in information-theoretic learning have revived interest in these models. With this motivation, in this paper we revisit the classical statistical problem of data-driven bandwidth selection by cross-validation maximum likelihood for Gaussian kernels. We find a solution to the optimization problem in both the spherical case and the general case, where a full covariance matrix is considered for the kernel. The fixed-point algorithms proposed in this paper obtain the maximum likelihood bandwidth in a few iterations, without performing an exhaustive bandwidth search, which is infeasible in the multivariate case. The convergence of the proposed methods is proved. A set of classification experiments is performed to demonstrate the usefulness of the obtained models in pattern recognition.
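For the spherical case, setting the derivative of the leave-one-out log-likelihood with respect to the variance to zero gives a natural fixed-point update: the new variance is the responsibility-weighted mean of squared pairwise distances, divided by the dimension. The sketch below implements that update in pure Python. It is an illustration derived from the stationarity condition; whether it matches the paper's exact algorithm is an assumption, and the data and starting value are hypothetical.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def loo_log_likelihood(data, var):
    """Leave-one-out log-likelihood of a spherical Gaussian KDE with variance var."""
    n, d = len(data), len(data[0])
    norm = (2.0 * math.pi * var) ** (-d / 2.0)
    total = 0.0
    for i, xi in enumerate(data):
        dens = sum(
            norm * math.exp(-sq_dist(xi, xj) / (2.0 * var))
            for j, xj in enumerate(data)
            if j != i
        )
        total += math.log(dens / (n - 1))
    return total


def fixed_point_step(data, var):
    """One update: var <- (1 / (n d)) * sum_i sum_{j != i} w_ij * ||x_i - x_j||^2,
    where w_ij is the normalized kernel weight of x_j in the estimate at x_i."""
    n, d = len(data), len(data[0])
    acc = 0.0
    for i, xi in enumerate(data):
        sq = [sq_dist(xi, xj) for j, xj in enumerate(data) if j != i]
        g = [math.exp(-s / (2.0 * var)) for s in sq]
        acc += sum(gi * s for gi, s in zip(g, sq)) / sum(g)
    return acc / (n * d)


data = [(0.0, 0.1), (0.4, -0.2), (1.0, 0.9), (1.3, 1.1), (2.2, 0.3), (-0.5, 0.6)]
var = 1.0  # hypothetical starting value
history = [loo_log_likelihood(data, var)]
for _ in range(20):
    var = fixed_point_step(data, var)
    history.append(loo_log_likelihood(data, var))
```

This update has the same form as an EM step for a Gaussian mixture with fixed centres and a shared variance, so the values in `history` should not decrease from one iteration to the next. The sketch assumes no duplicate points, since a leave-one-out likelihood degenerates when a sample coincides with another.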
