Similar Literature
 Found 20 similar documents; search took 15 ms
1.
Langford, John; Blum, Avrim. Machine Learning, 2003, 51(2): 165-179
A major topic in machine learning is to determine good upper bounds on the true error rates of learned hypotheses based upon their empirical performance on training data. In this paper, we demonstrate new adaptive bounds designed for learning algorithms that operate by making a sequence of choices. These bounds, which we call Microchoice bounds, are similar to Occam-style bounds and can be used to make learning algorithms self-bounding in the style of Freund (1998). We then show how to combine these bounds with Freund's query-tree approach, producing a version of Freund's query-tree structure that can be implemented with much more algorithmic efficiency.

2.
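A Method for Eliminating Accumulated Error in 3D Geometry Data Compression Algorithms   Cited by: 1 (self-citations: 1, citations by others: 0)
Existing 3D geometry data compression algorithms generally encode geometric information using successive quantization and predictive coding, which suffer from fairly serious accumulated error. This error arises because the value of a predicted point differs before and after encoding. To address this, a solution is proposed that replaces the predicted points of the original algorithm with a set of reference planes whose values are invariant, which effectively removes this root cause of accumulated error in predictive coding. Theory and experiments show that the method not only further improves the precision of the compression algorithm at the same compression ratio and speed, but is also compatible with a variety of geometry compression algorithms.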

3.
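For both classification and regression problems, constructing a practical learning algorithm that finds a decision function f(x) first requires a criterion for judging how good f(x) is. The performance of a decision function is usually evaluated by using the sample set to estimate the error rate it will incur on the test set. This paper presents several error-rate estimation algorithms and analyzes the advantages and disadvantages of each estimator in detail. The final experiments report the test error rates predicted by k-fold cross-validation, RM-bounds, and the εα-estimator function, further showing that different data sets may call for different risk estimation algorithms when predicting the optimal parameters of the selected model.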
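
As a rough illustration of the first of those estimators, k-fold cross-validation can be sketched as below. This is a minimal sketch, not the paper's implementation: the 1-D data layout, the interleaved fold assignment, and the toy learner `train_threshold` are all assumptions made for the example. RM-bounds and the εα-estimator are not shown.

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label); a toy 1-D setting assumed here


def k_fold_error(
    data: Sequence[Sample],
    k: int,
    train: Callable[[Sequence[Sample]], Callable[[float], int]],
) -> float:
    """Estimate the error rate of the learner `train` by k-fold cross-validation."""
    if not 2 <= k <= len(data):
        raise ValueError("k must be between 2 and len(data)")
    errors = 0
    for fold in range(k):
        # Samples whose index is congruent to `fold` modulo k are held out.
        held_out = [s for i, s in enumerate(data) if i % k == fold]
        training = [s for i, s in enumerate(data) if i % k != fold]
        f = train(training)
        errors += sum(1 for x, y in held_out if f(x) != y)
    # Every sample is held out exactly once, so normalise by len(data).
    return errors / len(data)


def train_threshold(samples: Sequence[Sample]) -> Callable[[float], int]:
    """Hypothetical toy learner: threshold halfway between the class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:
        # Degenerate fold: always predict the only class that was seen.
        label = 1 if pos else 0
        return lambda x: label
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    t = (mean_pos + mean_neg) / 2
    above, below = (1, 0) if mean_pos >= mean_neg else (0, 1)
    return lambda x: above if x > t else below
```

Calling `k_fold_error(data, 4, train_threshold)` on a list of `(x, label)` pairs returns the fraction of held-out samples that were misclassified across the four folds.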

4.
In this paper we introduce and illustrate non-trivial upper and lower bounds on the learning curves for one-dimensional Gaussian Processes. The analysis is carried out emphasising the effects induced on the bounds by the smoothness of the random process described by the Modified Bessel and the Squared Exponential covariance functions. We present an explanation of the early, linearly-decreasing behavior of the learning curves and the bounds as well as a study of the asymptotic behavior of the curves. The effects of the noise level and the lengthscale on the tightness of the bounds are also discussed.

5.
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter–phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the mutual differences in information gain of features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
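
The information gain heuristic mentioned in this abstract can be illustrated with a short sketch. It only computes the gain of each symbolic feature and ranks the features by it; the tree compression step of IGTree itself is not reproduced here. The instance layout and the names `information_gain` and `rank_features` are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple

# Assumed layout: (tuple of symbolic feature values, class label)
Instance = Tuple[Tuple[Hashable, ...], Hashable]


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(instances: Sequence[Instance], feature: int) -> float:
    """Entropy of the labels minus the expected entropy after splitting on `feature`."""
    labels = [label for _, label in instances]
    by_value: Dict[Hashable, List[Hashable]] = {}
    for feats, label in instances:
        by_value.setdefault(feats[feature], []).append(label)
    remainder = sum(
        len(subset) / len(instances) * entropy(subset) for subset in by_value.values()
    )
    return entropy(labels) - remainder


def rank_features(instances: Sequence[Instance]) -> List[int]:
    """Feature indices ordered from highest to lowest information gain."""
    n_features = len(instances[0][0])
    return sorted(
        range(n_features), key=lambda f: information_gain(instances, f), reverse=True
    )
```

When two features have nearly equal gain, the ranking carries little information, which is consistent with the hyphenation result reported in the abstract.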

6.
This paper considers a family of spatially semi-discrete approximations, including boundary treatments, to hyperbolic and parabolic equations. We derive the dependence of the error-bounds on time as well as on mesh size.

7.
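This paper studies the robustness of optimal bounding ellipsoid (OBE) algorithms, used to identify systems with unknown-but-bounded (UBB) errors, to underestimation of the error bound. It proves that, under certain conditions, any OBE algorithm retains its convergence even when the error bound is underestimated. This result can be applied to parameter estimation for practical systems with UBB errors in order to obtain less conservative results.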

8.
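Because text classification typically involves multiple heterogeneous data sources, a multiple-kernel SVM learning algorithm is proposed. The algorithm reformulates the quadratic combination of classification kernel matrices as a semi-infinite program and shows that it can be solved efficiently by repeatedly invoking an SVM. Experimental results show that the proposed algorithm can combine hundreds of kernels or hundreds of thousands of samples, and that it achieves high recall and precision on text classification with multiple heterogeneous data sources.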

9.
This paper considers the relative entropy between the conditional distribution and an incorrectly initialized filter for the estimation of one component of a Markov process given observations of the second component. Using the Markov property, we first establish a decomposition of the relative entropy between the measures on observation path space associated to different initial conditions. Using this decomposition, it is shown that the relative entropy of the optimal filter relative to an incorrectly initialized filter is a positive supermartingale. By applying the decomposition to signals observed in additive, white noise, a relative entropy bound is obtained on the integrated, expected, mean square difference between the optimal and incorrectly initialized estimates of the observation function. Date received: October 6, 1997. Date revised: April 9, 1999.

10.
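Image Classification for Image Compression and Prediction of Compression Results   Cited by: 4 (self-citations: 1, citations by others: 4)
Image compression is possible because image data contain redundancy, but the degree of redundancy, spatial redundancy in particular, varies widely from image to image. It is therefore necessary to study the spatial redundancy of the image to be compressed, an intrinsic property of the image, and so reduce the guesswork in compressing images and choosing a compression method. To this end, the new concept of image classification for image compression is proposed, together with a concrete classification algorithm. The algorithm exploits the distribution of an image's high-frequency wavelet coefficients, uses the image's edge degree as the measure of its spatial redundancy, and classifies images of different content by edge degree. The classification can be used to predict the compression results of different images. Experimental results show that the classification and the predicted compression results are meaningful and agree with human visual perception. The classification idea is also a useful reference for selecting and optimizing other image processing algorithms.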

11.
A modification of the stochastic approximation estimation algorithm is presented. Under assumptions on the statistical distribution of the training data, it becomes possible to estimate not only the mean value but also deliberately chosen deviating values of the data distribution. Detailed error models can thus be identified by means of a parameter-linear formulation of the new algorithm. With suitably defined probabilities, these parametric error models estimate soft error bounds. This provides an experimental identification method that can support robust controller design. The method was applied to an industrial robot controlled by feedback linearisation. Based on a dynamic model realised by a neural network, the presented approach is used for the robust design of the stabilising decentralised controllers.

12.
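Based on the MultiBoost classifier ensemble technique, an algorithm is proposed that uses incremental cross-validation to find the minimum classification error of MultiBoost, so that the combined classifier with the smallest classification error can be found within a specified number of classifiers T.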

13.
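A Survey of Online Learning for Streaming Data Classification   Cited by: 1 (self-citations: 0, citations by others: 1)
Zhai Tingting; Gao Yang; Zhu Junwu. Journal of Software (软件学报), 2020, 31(4): 912-931
Streaming data classification aims to incrementally learn, from continuously arriving streaming data, a mapping function from input variables to the class label variable, so that test data arriving at any time can be classified accurately. The online learning paradigm, as an incremental machine learning technique, is an effective tool for streaming data classification. This paper surveys the state of research on streaming data classification algorithms, mainly from the perspective of online learning. Specifically, it first introduces the basic framework of online learning and its performance evaluation methods. It then focuses on the state of work on online learning algorithms for general streaming data, on addressing the "curse of dimensionality" in high-dimensional streaming data, and on handling "concept drift" in evolving streaming data. Finally, it discusses the remaining challenges and the directions that urgently need research in classifying high-dimensional and evolving streaming data.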

14.
Consider a set of n advertisements (hereafter called ads) A = {A_1, ..., A_n} competing to be placed in a planning horizon which is divided into N time intervals called slots. An ad A_i is specified by its size s_i and frequency w_i. The size s_i represents the amount of space the ad occupies in a slot. Ad A_i is said to be scheduled if exactly w_i copies of A_i are placed in the slots subject to the restriction that a slot contains at most one copy of an ad. In this paper, we consider two problems. The MINSPACE problem minimizes the maximum fullness among all slots in a feasible schedule where the fullness of a slot is the sum of the sizes of ads assigned to the slot. For the MAXSPACE problem, in addition, we are given a common maximum fullness S for all slots. The total size of the ads placed in a slot cannot exceed S. The objective is to find a feasible schedule of ads such that the total occupied slot space is maximized. We examine the complexity status of both problems and provide heuristics with performance guarantees.
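
To make the MINSPACE setting concrete, here is a minimal greedy sketch that places ads in decreasing order of size, putting the w_i copies of each ad into the w_i currently least-full slots. It is only an illustration of the problem definition; the abstract does not say which heuristics the paper analyzes, and this sketch comes with no performance guarantee. The function names and the `(size, frequency)` tuple layout are assumptions.

```python
import heapq
from typing import List, Sequence, Tuple


def greedy_minspace(ads: Sequence[Tuple[float, int]], num_slots: int) -> List[List[int]]:
    """ads[i] = (size s_i, frequency w_i). Returns the list of ad indices per slot."""
    if any(w > num_slots for _, w in ads):
        raise ValueError("an ad needs more copies than there are slots")
    fullness = [0.0] * num_slots
    schedule: List[List[int]] = [[] for _ in range(num_slots)]
    # Place the largest ads first.
    for i in sorted(range(len(ads)), key=lambda j: ads[j][0], reverse=True):
        size, freq = ads[i]
        # Pick the `freq` least-full slots; they are distinct, so each slot
        # receives at most one copy of ad i.
        targets = heapq.nsmallest(freq, range(num_slots), key=lambda s: fullness[s])
        for slot in targets:
            fullness[slot] += size
            schedule[slot].append(i)
    return schedule


def max_fullness(ads: Sequence[Tuple[float, int]], schedule: List[List[int]]) -> float:
    """The MINSPACE objective: the fullness of the fullest slot."""
    return max(sum(ads[i][0] for i in slot) for slot in schedule)
```

For MAXSPACE, a similar loop would additionally have to skip an ad whenever fewer than w_i slots can still hold it without exceeding S.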

15.
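Packet classification is one of the key technologies for next-generation network devices, and the study of efficient packet classification algorithms is currently a hot topic in networking. The basic idea of the level-compressed tree packet classification algorithm is to apply level compression to a path-compressed binary tree so that the nodes of the compressed tree can be stored in order in an array; packet headers are then classified quickly by jumping between array elements during lookup. Simulation results show that the algorithm classifies packet headers quickly with large rule sets, processing close to 2M packet headers per second, with O(d) time complexity (d is the number of fields). With medium-sized rule sets it has O(dN) space complexity, and its storage requirement is lower than that of other algorithms (such as the Bitmap and region-partitioning packet classification algorithms). Because the level-compressed tree algorithm looks up each field of the packet header independently, a hardware implementation that searches the fields in parallel would improve its lookup performance further.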

16.
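Solving expensive optimization problems often incurs prohibitive computational cost. To reduce the number of true evaluations of the objective function, an ordinal prediction method is used to select candidate solutions in evolutionary algorithms. The relative ranking of candidate solutions is obtained directly through classification-based prediction, which removes the need to build an accurate surrogate model of the objective function. An ordinal sample set reduction method is also designed to lower the redundancy of the ordinal sample set and improve the training efficiency of the ordinal prediction model. Ordinal prediction is then combined with a genetic algorithm. Simulation experiments with the ordinal-prediction-assisted genetic algorithm on expensive optimization test functions show that ordinal prediction can effectively reduce the computational cost of solving expensive optimization problems.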

17.
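An error-controlled compression algorithm for sensor signal acquisition is proposed. To cope with the changing characteristics of sensor signals, the coefficients of the gradient predictor are adjusted dynamically according to the autocorrelation coefficient of each signal segment; quantization noise is reduced by improving the maximum-step-size uniform quantizer; and the quantized prediction error sequence is encoded with the Golomb-Rice coding algorithm. The upper limit of the compression error parameter is determined from the noise level of the data acquisition system's front end, which in turn gives the upper limit of the compression ratio. Application of the algorithm to the acquisition of water supply pipeline leak signals shows that, at a compression ratio of 2.63, the leak location error of the signal reconstructed after compression increases by less than 0.2 m.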
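
The final coding stage named in this abstract, Golomb-Rice coding of quantized prediction errors, can be sketched as follows. This is a generic textbook-style sketch under stated assumptions, not the paper's coder: the zigzag mapping for signed residuals, the fixed parameter `k`, and the bit-string representation are choices made for the example, and the adaptive gradient predictor and the quantizer are not shown.

```python
from typing import List, Sequence


def zigzag(n: int) -> int:
    """Map signed residuals to non-negative integers: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u: int) -> int:
    """Inverse of zigzag."""
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def rice_encode(residuals: Sequence[int], k: int) -> str:
    """Encode each residual as a unary quotient followed by a k-bit remainder."""
    bits: List[str] = []
    for r in residuals:
        u = zigzag(r)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("1" * quotient + "0")  # unary part, terminated by a 0
        if k:
            bits.append(format(remainder, f"0{k}b"))  # fixed-width binary part
    return "".join(bits)


def rice_decode(bits: str, k: int) -> List[int]:
    """Decode a bit string produced by rice_encode with the same k."""
    out: List[int] = []
    pos = 0
    while pos < len(bits):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the terminating 0 of the unary part
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((quotient << k) | remainder))
    return out
```

Small residuals get short codewords, which is why the code suits prediction errors that cluster around zero; a practical coder would typically choose `k` per block from the residual statistics rather than fix it.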

18.
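An Incremental Learning Vector Quantization Algorithm Based on Sample Density and Classification Error Rate   Cited by: 1 (self-citations: 0, citations by others: 1)
Li Juan; Wang Yuping. Acta Automatica Sinica (自动化学报), 2015, 41(6): 1187-1200
As a simple and mature classification method, the K nearest neighbor (KNN) algorithm is widely used in data mining, pattern recognition, and other fields, but it still suffers from heavy computation, high space consumption, and long running time. To address these problems, this paper builds on the single-layer competitive learning of incremental learning vector quantization (ILVQ), incorporates a neighborhood concept based on sample density and classification error rate, and proposes a new incremental learning vector quantization method. Through a competitive learning strategy, the method adaptively inserts, deletes, merges, and splits the neighborhoods of representative points, quickly obtains a prototype set of the original data set, and thereby achieves high compression of large-scale data while preserving classification accuracy. In addition, the traditional nearest neighbor classification algorithm is improved by including the sample density and classification error rate of the prototype neighbor set in the nearest neighbor decision rule. The proposed algorithm quickly generates an effective set of representative prototypes in a single pass over the training set and has good generality. Experimental results show that, compared with other algorithms, the method maintains or even improves classification accuracy and compression ratio, and also classifies quickly.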

19.
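Zhang Dong; Ke Changqing; Yu Kan. Remote Sensing Information (遥感信息), 2010, (3): 26-29, 111
This paper first introduces the principles of three machine learning algorithms, CART, C5.0, and the probabilistic neural network, and then compares them in terms of overall accuracy and sensitivity to training sample size and to noise, using ALOS imagery covering Gong'an County, Hubei Province, as the data source. The results show that C5.0 achieves the highest overall classification accuracy, 83.59%. The probabilistic neural network is least affected by training sample size and noise: when the training sample is reduced to 40% of the original sample size, its accuracy is 78.52%, and when noise makes up 10% of the training sample, its accuracy falls by only 4.3%. The analysis indicates that C5.0 gives the best classification accuracy when training samples are plentiful, whereas the probabilistic neural network gives better classification results than the other two algorithms when samples are insufficient or contain noise.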

20.
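Virtual Machine Energy Consumption Prediction with an Incremental ELM Based on a Compressed Momentum Term   Cited by: 1 (self-citations: 0, citations by others: 1)
Zou Weidong; Xia Yuanqing. Acta Automatica Sinica (自动化学报), 2019, 45(7): 1290-1297
In the Infrastructure as a Service (IaaS) cloud service model, accurate prediction of virtual machine energy consumption is very important for formulating virtual machine scheduling strategies across many physical servers. Prediction models based on the traditional incremental extreme learning machine (I-ELM) contain many redundant nodes that lower the accuracy and efficiency of virtual machine energy consumption prediction. To address this, a compressed momentum term is added to the existing I-ELM model to feed the network training error back into the output of the hidden layer, bringing the prediction closer to the output samples. This reduces the redundant hidden-layer nodes of I-ELM, thereby speeding up its network convergence and improving its generalization performance.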
