156 results found (search time: 31 ms)
1.
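Pattern recognition, function fitting, and probability density estimation are all problems of learning from data. Existing methods rest largely on classical statistics, which presupposes a sufficiently large number of samples; when the sample size is limited, they struggle to achieve satisfactory results. Statistical learning theory (SLT), proposed by Vapnik et al., is a small-sample statistical theory that focuses on statistical regularities and the properties of learning methods in the small-sample setting. SLT provides a sound theoretical framework for machine learning problems and has also given rise to a new general-purpose learning algorithm, the support vector machine (SVM), which handles small-sample learning problems well. SLT and SVM have now become a new research focus in the international machine learning community. This survey introduces the basic ideas, characteristics, and current state of research on SLT and SVM, with the aim of drawing further attention from researchers in China.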
2.
A Tutorial on Support Vector Machines for Pattern Recognition Total citations: 661, self-citations: 0, citations by others: 661
Christopher J.C. Burges 《Data mining and knowledge discovery》1998,2(2):121-167
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
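The kernel mapping technique mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the tutorial itself: the function names and the two sample points are assumptions chosen for illustration. The sketch checks that a degree-2 homogeneous polynomial kernel evaluated in the input space equals an ordinary dot product in an explicitly constructed feature space.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(x, y, degree=2):
    """Homogeneous polynomial kernel k(x, y) = (x . y) ** degree."""
    return dot(x, y) ** degree


def feature_map_2d(x):
    """Explicit degree-2 feature map for 2-D inputs (illustrative helper).

    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2), so that
    phi(x) . phi(y) = (x . y)^2.
    """
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Two arbitrary sample points (assumed values, not from the source).
x, y = (1.0, 2.0), (3.0, -1.0)

k_implicit = poly_kernel(x, y)                           # computed in input space
k_explicit = dot(feature_map_2d(x), feature_map_2d(y))   # computed in feature space
```

Both values agree up to floating-point rounding, which is why a learner that only needs dot products can work in the mapped space without ever constructing the mapped vectors.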
3.
4.
Bayesian Networks for Data Mining Total citations: 76, self-citations: 0, citations by others: 76
David Heckerman 《Data mining and knowledge discovery》1997,1(1):79-119
A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data modeling. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling approach using a real-world case study.
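The first advantage listed in this abstract, handling missing data entries, can be sketched with a toy network. Everything below is an assumption made for illustration and does not come from the paper: the three binary variables (rain, sprinkler, wet), the probability tables, and the function names. The sketch answers a query by enumeration, summing out any variable that was not observed.

```python
from itertools import product

# Hypothetical network: rain -> wet <- sprinkler. All numbers are made up.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
P_WET_GIVEN = {  # P(wet=True | rain, sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability factorized along the network structure."""
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1.0 - p_wet)


def posterior_rain(evidence):
    """Return P(rain=True | evidence).

    `evidence` maps any subset of {"sprinkler", "wet"} to a bool.
    Variables absent from `evidence` are treated as missing and summed out.
    """
    totals = {True: 0.0, False: 0.0}
    for rain, sprinkler, wet in product([True, False], repeat=3):
        assignment = {"sprinkler": sprinkler, "wet": wet}
        if all(assignment[name] == value for name, value in evidence.items()):
            totals[rain] += joint(rain, sprinkler, wet)
    return totals[True] / (totals[True] + totals[False])


# The sprinkler reading is missing in the first query and present in the second.
p_missing = posterior_rain({"wet": True})
p_complete = posterior_rain({"wet": True, "sprinkler": True})
```

Enumeration is exponential in the number of variables, so it only suits tiny networks; it is used here because it makes the marginalization over the missing entry explicit.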
5.
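Current intrusion detection systems generalize poorly when little prior knowledge is available. Applying the support vector machine algorithm allows an intrusion detection system to retain good generalization ability under small-sample conditions (that is, with little prior knowledge). The paper first reviews the development of intrusion detection research and the SVM classification algorithm, then proposes an SVM-based intrusion detection model, and then, taking system call traces, a commonly used type of intrusion detection data, as an example, discusses the model's operation in detail. Finally, it compares computer simulation results with those of other detection methods. The experiments and comparison show that the SVM-based intrusion detection system not only requires far less prior knowledge than the other methods but also needs less training time at equal detection performance.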
6.
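A Survey of Support Vector Machines and Their Applications Total citations: 74, self-citations: 1, citations by others: 73
Building on an analysis of the principles of support vector machines, the paper surveys SVM application research in face detection, verification, and recognition; speaker and speech recognition; character and handwriting recognition; image processing; and other areas. It also discusses the strengths and weaknesses of SVMs and the outlook for their application research.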
7.
8.
9.
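Estimation of distribution algorithms (EDAs) are a newly emerging class of stochastic optimization algorithms in evolutionary computation and a current research focus in the international evolutionary computation community. EDAs combine genetic algorithms with statistical learning: they use statistical learning to build a probabilistic model of the distribution of individuals in the solution space, then randomly sample from that model to generate a new population, and repeat this process to evolve the population. EDAs have no traditional genetic operators such as crossover and mutation and thus represent an entirely new evolutionary paradigm; because the technique can model relationships among variables with probabilistic graphical models, it can effectively solve optimization problems with many correlated variables. According to the complexity of the probabilistic model, the paper presents the corresponding EDAs in three classes: independent variables, bivariate dependencies, and multivariate dependencies. As a survey, it aims to introduce this new technique comprehensively and systematically to readers in China and to summarize the current state of EDA research and future research directions.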
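The loop described in this abstract (build a probability model, sample a new population from it, repeat) can be sketched for the simplest of the three classes, where variables are treated as independent. The sketch below follows the style of the univariate marginal distribution algorithm; that choice, the OneMax fitness function (count of ones), the parameter values, and the clamping bounds are all assumptions made for illustration and are not taken from the survey.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA sketch on OneMax; returns (best individual seen, final model)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # model: independent probability that each bit is 1
    best = None
    for _ in range(generations):
        # Sample a new population from the current probability model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Rank by fitness (number of ones) and keep the top individuals.
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # Re-estimate each marginal from the selected individuals, clamped
        # away from 0 and 1 so that neither bit value becomes unreachable.
        probs = [
            min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
            for i in range(n_bits)
        ]
    return best, probs


best, probs = umda_onemax()
```

There is no crossover or mutation step anywhere in the loop: variation comes only from sampling the model, which matches the abstract's description. Handling bivariate or multivariate dependencies would require replacing the list of marginals with a richer model.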
10.
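Introduction to the Support Vector Machine Algorithm and the ChemSVM Software Total citations: 53, self-citations: 27, citations by others: 26
Statistical learning theory (SLT) and the support vector machine (SVM) algorithm, proposed by Vladimir N. Vapnik and colleagues, have produced encouraging research results. This paper introduces the principles of this new theory and algorithm and considers the prospects for applying this new result from computer science in chemistry and chemical engineering. The "ChemSVM" software provides a general-purpose SVM implementation and integrates it with databases, knowledge bases, atomic parameters, and other data mining methods.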