首页 | 本学科首页   官方微博 | 高级检索  
     


Moment Kernels for Regular Distributions
Authors:Corinna Cortes  Mehryar Mohri
Affiliation:(1) Google Labs, 1440 Broadway, New York, NY, 10018;(2) AT&T Labs—Research, 180 Park Avenue, Florham Park, NJ, 07932
Abstract:Many machine learning problems in natural language processing, transaction-log analysis, or computational biology, require the analysis of variable-length sequences, or, more generally, distributions of variable-length sequences.Kernel methods introduced for fixed-size vectors have proven very successful in a variety of machine learning tasks. We recently introduced a new and general kernel framework, rational kernels, to extend these methods to the analysis of variable-length sequences or more generally distributions given by weighted automata. These kernels are efficient to compute and have been successfully used in applications such as spoken-dialog classification with Support Vector Machines.However, the rational kernels previously introduced in these applications do not fully encompass distributions over alternate sequences. They are based only on the counts of co-occurring subsequences averaged over the alternate paths without taking into accounts information about the higher-order moments of the distributions of these counts.In this paper, we introduce a new family of rational kernels, moment kernels, that precisely exploits this additional information. These kernels are distribution kernels based on moments of counts of strings. We describe efficient algorithms to compute moment kernels and apply them to several difficult spoken-dialog classification tasks. Our experiments show that using the second moment of the counts of n-gram sequences consistently improves the classification accuracy in these tasks.Editors: Dan Roth and Pascale Fung
Keywords:statistical learning  kernel methods  rational kernels  string kernels  weighted automata  weighted finite-state transducers  spoken-dialog classification
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号