Similar Literature
20 similar documents found.
1.
This paper focuses on the problem of how data representation influences the generalization error of kernel-based learning machines such as support vector machines (SVMs) for classification. Frame theory provides a well-founded mathematical framework for representing data in many different ways. We analyze the effects of sparse and dense data representations on the generalization error of such learning machines, measured by the leave-one-out error given a finite amount of training data. We show that, for sparse data representations, the generalization error of an SVM trained with polynomial or Gaussian kernel functions equals that of a linear SVM. Equivalently, the point-separating capacity of functions in the hypothesis spaces induced by polynomial or Gaussian kernels reduces to that of a separating hyperplane in the input space. Moreover, we show that, in general, sparse data representations increase or leave unchanged the generalization error of kernel-based methods. Dense data representations, on the contrary, reduce the generalization error in the case of very large frames. We use two different schemes for representing data in overcomplete systems of Haar and Gabor functions, and measure SVM generalization error on benchmark data sets.
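A rough illustration of the representation effect studied here (not the authors' code): the sketch below projects data onto an overcomplete random dictionary standing in for the Haar/Gabor frames, derives a dense and a crudely sparsified representation, and compares Gaussian-kernel SVM accuracy on each. The dictionary size and sparsification threshold are illustrative assumptions.

```python
# Minimal sketch: effect of overcomplete data representations on an SVM.
# The random dictionary stands in for the Haar/Gabor frames of the paper;
# all sizes and thresholds here are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Overcomplete dictionary: 100 unit-norm atoms in a 20-dimensional input space.
D = rng.standard_normal((20, 100))
D /= np.linalg.norm(D, axis=0)

coeffs = X @ D                                         # dense representation
sparse = np.where(np.abs(coeffs) > 1.0, coeffs, 0.0)   # crude sparsification

for name, Z in [("input", X), ("dense", coeffs), ("sparse", sparse)]:
    acc = cross_val_score(SVC(kernel="rbf"), Z, y, cv=5).mean()
    print(f"{name:6s} representation: CV accuracy = {acc:.3f}")
```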

2.
Recent studies have employed simple linear dynamical systems to model trial-by-trial dynamics in various sensorimotor learning tasks. Here we explore the theoretical and practical considerations that arise when employing the general class of linear dynamical systems (LDS) as a model for sensorimotor learning. In this framework, the state of the system is a set of parameters that define the current sensorimotor transformation: the function that maps sensory inputs to motor outputs. The class of LDS models provides a first-order approximation for any Markovian (state-dependent) learning rule that specifies the changes in the sensorimotor transformation that result from sensory feedback on each movement. We show that modeling the trial-by-trial dynamics of learning provides a substantially enhanced picture of the process of adaptation compared to measurements of the steady state of adaptation derived from more traditional blocked-exposure experiments. Specifically, these models can be used to quantify sensory and performance biases, the extent to which learned changes in the sensorimotor transformation decay over time, and the portion of motor variability due to either learning or performance variability. We show that previous attempts to fit such models with linear regression have not generally yielded consistent parameter estimates. Instead, we present an expectation-maximization algorithm for fitting LDS models to experimental data and describe the difficulties inherent in estimating the parameters associated with feedback-driven learning. Finally, we demonstrate the application of these methods in a simple sensorimotor learning experiment: adaptation to shifted visual feedback during reaching.
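As a concrete instance of this model class, the sketch below simulates a scalar LDS with a retention factor and an error-correction gain during adaptation to a feedback shift. All parameter values are illustrative assumptions; the paper fits such models to data with EM rather than simulating them with known parameters.

```python
# Minimal sketch of a scalar LDS model of trial-by-trial adaptation:
# x[t] is the state of the sensorimotor transformation, updated from the
# error observed on each movement. Parameter values are illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n_trials, shift = 200, 10.0        # visual feedback shifted by 10 units
a, b = 0.98, 0.1                   # retention and error-correction gains (assumed)
state_noise, output_noise = 0.2, 0.5

x = np.zeros(n_trials)             # internal estimate of the shift
y = np.zeros(n_trials)             # observed movement error per trial
for t in range(n_trials - 1):
    y[t] = shift - x[t] + output_noise * rng.standard_normal()  # performance
    x[t + 1] = a * x[t] + b * y[t] + state_noise * rng.standard_normal()

print("residual error over last 20 trials:", y[-21:-1].mean())
```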

3.
This paper focuses on the problem of how data representation influences the generalization error of kernel-based learning machines like support vector machines (SVMs). We analyse the effects of sparse and dense data representations on the generalization error of SVMs. We show that with sparse representations the performance of classifiers belonging to hypothesis spaces induced by polynomial or Gaussian kernel functions reduces to that of linear classifiers. Sparse representations reduce the generalization error as long as the representation is not too sparse, as it is with very large dictionaries. Dense data representations reduce the generalization error even with very large dictionaries. We use two schemes for representing data in data-independent overcomplete Haar and Gabor dictionaries, and measure the generalization error of SVMs on benchmark datasets. We also study sparse and dense representations in the case of data-dependent overcomplete dictionaries and show how this leads to principal component analysis.

4.
The standard BP neural network suffers from slow training, susceptibility to local minima, and poor generalization. This paper improves the standard BP network by combining an additional momentum term with an adaptive learning rate, and applies the result to predicting the service life of historic timber buildings. Simulation results show that, compared with the standard BP network, the improved network has better generalization ability, fits the training data more accurately, and avoids the computational errors that arise when determining the calculation parameters.
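A minimal sketch of the two modifications described above, shown on a single-layer regression for brevity. The momentum coefficient, rate-adaptation factors, and data are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: gradient descent with a momentum term plus a simple
# adaptive learning rate (grow while the loss falls, shrink when it rises).
# All hyperparameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(100)

w = np.zeros(3)
delta_prev = np.zeros(3)
lr, momentum = 0.05, 0.9
prev_loss = np.inf
for epoch in range(200):
    err = X @ w - y
    loss = (err ** 2).mean()
    grad = 2 * X.T @ err / len(y)
    lr *= 1.05 if loss < prev_loss else 0.7      # adaptive learning rate
    delta = -lr * grad + momentum * delta_prev   # additional momentum term
    w += delta
    delta_prev, prev_loss = delta, loss

print("learned weights:", np.round(w, 3))
```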

5.
Subspace information criterion for model selection
The problem of model selection is considerably important for acquiring higher levels of generalization capability in supervised learning. In this article, we propose a new criterion for model selection, the subspace information criterion (SIC), which is a generalization of Mallows's C_L. It is assumed that the learning target function belongs to a specified functional Hilbert space and the generalization error is defined as the Hilbert space squared norm of the difference between the learning result function and target function. SIC gives an unbiased estimate of the generalization error so defined. SIC assumes the availability of an unbiased estimate of the target function and the noise covariance matrix, which are generally unknown. A practical calculation method of SIC for least-mean-squares learning is provided under the assumption that the dimension of the Hilbert space is less than the number of training examples. Finally, computer simulations in two examples show that SIC works well even when the number of training examples is small.
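For reference, a minimal sketch of Mallows's C_p (the subset-selection form of the C_L criterion that SIC generalizes), computed for nested ordinary-least-squares models. The noise variance is assumed known here, and the data are synthetic; this is a reference point, not the SIC computation itself.

```python
# Minimal sketch of Mallows's C_p = RSS/sigma^2 - n + 2p for OLS models.
# sigma2 is assumed known here; in practice it must be estimated.
import numpy as np

def mallows_cp(X, y, sigma2):
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return rss / sigma2 - n + 2 * p

rng = np.random.default_rng(3)
X = rng.standard_normal((50, 4))
y = X[:, :2] @ np.array([1.0, -1.0]) + 0.5 * rng.standard_normal(50)
for p in range(1, 5):   # compare nested models of increasing size
    print(p, "features: Cp =", round(mallows_cp(X[:, :p], y, 0.25), 2))
```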

6.
When adapting distributions across domains, important domain-specific information is easily lost, making it difficult to train an effective classifier on the source domain and hurting its generalization and labeling performance on the target domain. To address this, we propose a transfer learning method that jointly adapts inter-class and inter-domain distributions. A common projection matrix is learned to map the source and target domains into a shared subspace, and the maximum mean discrepancy (MMD) is used to measure both inter-class and inter-domain distribution distances. During optimization of the objective function, the inter-domain distribution discrepancy is explicitly reduced while the separation between different classes is enlarged, improving knowledge transfer from the source domain to the target domain. Experiments on transfer learning datasets demonstrate the effectiveness of the proposed method.
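A minimal sketch of the MMD distance used above to measure the discrepancy between source and target distributions: the biased V-statistic estimate with a Gaussian kernel. The kernel bandwidth and the synthetic domains are illustrative assumptions.

```python
# Minimal sketch of the squared maximum mean discrepancy (MMD) between
# source samples Xs and target samples Xt, with a Gaussian kernel.
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(Xs, Xt, sigma=1.0):
    return (gaussian_kernel(Xs, Xs, sigma).mean()
            + gaussian_kernel(Xt, Xt, sigma).mean()
            - 2 * gaussian_kernel(Xs, Xt, sigma).mean())

rng = np.random.default_rng(4)
Xs = rng.standard_normal((100, 5))          # source domain
Xt = rng.standard_normal((100, 5)) + 0.5    # shifted target domain
print("MMD^2 =", round(mmd2(Xs, Xt), 4))
```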

7.
Label distribution learning (LDL), a new learning paradigm for resolving label ambiguity, has attracted wide attention in recent years. To further improve its predictive performance, we propose an LDL method that combines deep forests with heterogeneous ensembles (LDLDF). The method uses the cascade structure of deep forests to mimic deep learning models with multi-layer processing, and combines multiple heterogeneous classifiers within each cascade layer to increase ensemble diversity. Compared with existing LDL methods, LDLDF processes information layer by layer, learns better feature representations, and mines the rich semantic information in the data, giving it strong representation learning and generalization abilities. Moreover, to counter the model degradation that deep models may suffer, LDLDF adopts a layer feature reuse mechanism to reduce training error and make effective use of the predictive power of each layer. Extensive experimental results show that the proposed method outperforms recent methods of the same kind.

8.
Spatial architecture neural network (SANN), which is inspired by the connection pattern of excitatory pyramidal neurons and inhibitory interneurons in neocortex, is a multilayer artificial neural network with good learning accuracy and generalization ability in real applications. However, the backpropagation-based learning algorithm (named BP-SANN) may be time-consuming and slow to converge. In this paper, a new fast and accurate two-phase sequential learning scheme for SANN is introduced to guarantee the network performance. With this new learning approach (named SFSL-SANN), only the weights connecting to output neurons are trained during the learning process. In the first phase, a least-squares method is applied to estimate the span-output-weight on the basis of fixed, randomly generated initial weight values. An improved iterative learning algorithm is then used to learn the feedforward-output-weight in the second phase. SFSL-SANN is compared in detail with BP-SANN and other popular neural network approaches on benchmark problems drawn from classification, regression, and time-series prediction applications. The results demonstrate that SFSL-SANN converges faster and takes less time than BP-SANN, and produces better learning accuracy and generalization performance than the other approaches.
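The first phase, solving for output weights in closed form over fixed random weights, can be sketched as follows. A generic single-hidden-layer network stands in for the SANN topology, and all sizes are illustrative assumptions.

```python
# Minimal sketch of phase one of the scheme above: fix randomly generated
# hidden weights and solve for the output weights by least squares.
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 8))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

W_hidden = rng.standard_normal((8, 50))        # fixed, never trained
H = np.tanh(X @ W_hidden)                      # hidden-layer activations
w_out, *_ = np.linalg.lstsq(H, y, rcond=None)  # closed-form output weights

print("training RMSE:", np.sqrt(((H @ w_out - y) ** 2).mean()))
```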

9.
Suppes, Patrick; Böttner, Michael; Liang, Lin. Machine Learning, 1995, 19(2): 133-152.
We are developing a theory of probabilistic language learning in the context of robotic instruction in elementary assembly actions. We describe the process of machine learning in terms of the various events that happen on a given trial, including the crucial association of words with internal representations of their meaning. Of central importance in learning is the generalization from utterances to grammatical forms. Our system derives a comprehension grammar for a superset of a natural language from pairs of verbal stimuli like "Go to the screw!" and corresponding internal representations of coerced actions. For the derivation of a grammar, no knowledge of the language to be learned is assumed, only knowledge of an internal language. We present grammars for English, Chinese, and German generated from a finite sample of about 500 commands that are roughly equivalent across the three languages. All three grammars, which are context-free in form, accept an infinite set of commands in the given language.

10.
Computational neuroscience studies have examined the human visual system through functional magnetic resonance imaging (fMRI) and identified a model in which the mammalian brain pursues two independent pathways for recognizing biological movement. The dorsal stream analyzes motion information by applying optical flow, which captures the fast features, while the ventral stream analyzes form information with slow features. The proposed approach suggests that motion perception in the human visual system comprises interactions of fast and slow features to identify biological movements. Form features are obtained by applying the active basis model (ABM) with incremental slow feature analysis (IncSFA). Episodic observation is required to extract the slowest features, whereas the fast features update the processing of motion information in every frame. Applying IncSFA provides an opportunity to abstract human actions and use action prototypes. The fast features are obtained from the optical flow, and the final recognition is performed by combining the optical flow and ABM-IncSFA information through a kernel extreme learning machine. Applying IncSFA in the ventral stream and involving both slow and fast features in the recognition mechanism are the major contributions of this research. Results on two benchmark human action datasets (KTH and Weizmann) highlight the promising performance of this approach.

11.
Concerns the effect of noise on the performance of feedforward neural nets. We introduce and analyze various methods of injecting synaptic noise into dynamically driven recurrent nets during training. Theoretical results show that applying a controlled amount of noise during training may improve convergence and generalization performance. We analyze the effects of various noise parameters and predict that best overall performance can be achieved by injecting additive noise at each time step. Noise contributes a second-order gradient term to the error function which can be viewed as an anticipatory agent to aid convergence. This term appears to find promising regions of weight space in the beginning stages of training when the training error is large and should improve convergence on error surfaces with local minima. The first-order term is a regularization term that can improve generalization. Specifically, it can encourage internal representations where the state nodes operate in the saturated regions of the sigmoid discriminant function. While this effect can improve performance on automata inference problems with binary inputs and target outputs, it is unclear what effect it will have on other types of problems. To substantiate these predictions, we present simulations on learning the dual parity grammar from temporal strings for all noise models, and present simulations on learning a randomly generated six-state grammar using the predicted best noise model.
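A minimal sketch of the noise model predicted above to perform best: additive noise injected at each time step of the recurrent computation during training. Network sizes and the noise level are illustrative assumptions, and only the forward pass is shown.

```python
# Minimal sketch: additive per-step noise in a recurrent forward pass.
# Sizes and noise level are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(6)
W_in, W_rec = rng.standard_normal((4, 8)), 0.5 * rng.standard_normal((8, 8))

def forward(inputs, noise_std=0.0):
    """Run the recurrent net; noise_std > 0 only during training."""
    h = np.zeros(8)
    for x in inputs:
        pre = x @ W_in + h @ W_rec
        pre += noise_std * rng.standard_normal(8)  # additive per-step noise
        h = np.tanh(pre)
    return h

seq = rng.standard_normal((10, 4))
print("noisy-vs-clean state gap:",
      np.linalg.norm(forward(seq, 0.1) - forward(seq)))
```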

12.
To address the difficulty of predicting energy consumption in the copper smelting process, a support vector regression (SVR) based parameter optimization and learning method for energy saving is proposed. The parameters that influence copper smelting energy consumption are first analyzed, and SVR is then used to learn the relationship between the input parameters and the output energy consumption, from which the optimal parameters are selected, providing a basis for a production energy-consumption control model. Experimental results show that, compared with the traditional BP neural network algorithm, the proposed method learns faster, converges better, and generalizes more strongly, with a mean relative error of energy prediction below 7%.
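A minimal sketch of the SVR step described above: learning the mapping from process parameters to energy consumption with scikit-learn. The synthetic data and hyperparameters are illustrative assumptions, not the paper's values.

```python
# Minimal sketch: SVR mapping process parameters to energy consumption.
# Synthetic data and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(300, 6))          # process parameters
energy = 1.0 + 5 * X[:, 0] + 2 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(300)

X_tr, X_te, y_tr, y_te = train_test_split(X, energy, random_state=0)
model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X_tr, y_tr)

rel_err = np.abs(model.predict(X_te) - y_te) / np.abs(y_te)
print("mean relative error:", round(rel_err.mean(), 4))
```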

13.
Sprekeler, H. Neural Computation, 2011, 23(12): 3287-3302.
The past decade has seen a rise of interest in Laplacian eigenmaps (LEMs) for nonlinear dimensionality reduction. LEMs have been used in spectral clustering, in semisupervised learning, and for providing efficient state representations for reinforcement learning. Here, we show that LEMs are closely related to slow feature analysis (SFA), a biologically inspired, unsupervised learning algorithm originally designed for learning invariant visual representations. We show that SFA can be interpreted as a function approximation of LEMs, where the topological neighborhoods required for LEMs are implicitly defined by the temporal structure of the data. Based on this relation, we propose a generalization of SFA to arbitrary neighborhood relations and demonstrate its applicability for spectral clustering. Finally, we review previous work with the goal of providing a unifying view on SFA and LEMs.
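A minimal sketch of linear SFA, the algorithm whose relation to LEMs the paper establishes: whiten the signal, then take the direction whose temporal derivative has minimal variance. The toy sources and mixing matrix are illustrative assumptions.

```python
# Minimal sketch of linear slow feature analysis (SFA).
import numpy as np

rng = np.random.default_rng(8)
t = np.linspace(0, 10 * np.pi, 2000)
# A slow and a fast source, observed through a random mixing matrix.
sources = np.column_stack([np.sin(t), np.sin(25 * t)])
X = sources @ rng.standard_normal((2, 2))

X = X - X.mean(0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
Z = U * np.sqrt(len(X))               # whitened signal (unit covariance)
dZ = np.diff(Z, axis=0)               # temporal derivative
eigvals, eigvecs = np.linalg.eigh(dZ.T @ dZ / len(dZ))
slow = Z @ eigvecs[:, 0]              # smallest eigenvalue = slowest feature

print("corr with slow source:",
      round(abs(np.corrcoef(slow, sources[:, 0])[0, 1]), 3))
```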

14.
Quadrupeds show several locomotion patterns when adapting to environmental conditions. An immediate transition among walk, trot, and gallop implies the existence of a memory for locomotion patterns. In this article, we postulate that motion pattern learning requires the repetitive presentation of the same environmental conditions, and we aim at constructing a mathematical model for new pattern learning. The model construction considers a decerebrate cat experiment in which only the left forelimb is driven at higher speed by a belt on a treadmill. A central pattern generator (CPG) model that qualitatively describes this decerebrate cat's behavior has already been proposed. In developing this model, we introduce a memory mechanism to store the locomotion pattern, where a memory is represented as a minimal point of a potential function. The recollection process is described as a gradient system of this potential function, while in the memorization process the learning of a new pattern is regarded as the generation of a new minimal point through bifurcation from an already existing one. Finally, we discuss the generalization of this model to motion adaptation and learning.
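A minimal sketch of the recollection process described above: gradient flow dx/dt = -V'(x) toward the nearest minimum of a potential. The one-dimensional double-well below is an illustrative stand-in for the paper's CPG potential, with each minimum storing one pattern.

```python
# Minimal sketch of recollection as a gradient system dx/dt = -V'(x),
# where each minimum of V stores one locomotion pattern (illustrative).
import numpy as np

def dV(x):                 # derivative of V(x) = (x**2 - 1)**2 / 4
    return x * (x ** 2 - 1)

x, dt = 0.3, 0.01
for _ in range(2000):      # flow toward the nearest minimum (x = +1 or -1)
    x -= dt * dV(x)
print("recalled pattern:", round(x, 3))
```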

15.
The subject of this article is the modelling of the influence of non-minimum phase discrete-time system dynamics on the performance of norm optimal iterative learning control (NOILC) algorithms, with the intent of explaining the observed phenomenon and predicting its primary characteristics. It is established that performance in the presence of one or more non-minimum phase plant zeros typically has two phases. These consist of an initial fast monotonic reduction of the L2 error norm (mean square error) followed by a very slow asymptotic convergence. Although the norm of the tracking error does eventually converge to zero, the practical implication over a finite number of trials is apparent convergence to a non-zero error. The source of this slow convergence is identified using the singular value distribution of the system's all-pass component. A predictive model of the onset of slow-convergence behaviour is developed as a set of linear constraints and shown to be valid when the iteration time interval is sufficiently long. The results provide a good prediction of the magnitude of the error norm at which slow convergence begins. Formulae for this norm and the associated error time series are obtained for single-input single-output systems with several non-minimum phase zeros outside the unit circle using Lagrangian techniques. Numerical simulations are given to confirm the validity of the analysis.
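A minimal sketch of the NOILC trial update on a lifted SISO system: each trial solves min ||y_d - Gu||^2 + w||u - u_prev||^2 in closed form. The impulse response (chosen to have a zero outside the unit circle) and the weight w are illustrative assumptions.

```python
# Minimal sketch of norm optimal ILC on a lifted SISO plant. The update
# (G'G + wI) u_new = G' y_d + w u solves the per-trial quadratic cost.
import numpy as np

N = 50
h = np.array([0.0, 1.0, -1.3])          # impulse response; zero at z = 1.3 (NMP)
G = sum(np.eye(N, k=-i) * h[i] for i in range(len(h)))  # lifted plant matrix
y_d = np.sin(np.linspace(0, 2 * np.pi, N))              # reference trajectory

w, u = 0.1, np.zeros(N)
A = G.T @ G + w * np.eye(N)
for k in range(30):
    u = np.linalg.solve(A, G.T @ y_d + w * u)   # NOILC trial update
    err = np.linalg.norm(y_d - G @ u)
    if k % 10 == 0:
        print(f"trial {k:2d}: ||e|| = {err:.4f}")
```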

16.
Multi-goal decision making plays a key role in the brain's cognitive functions. In this study, we extend a frontal visual area network model into a learning-based model, train it on a cognitive decision task (the non-choice task), and use the simulation results to explain the cognitive process of the brain during multi-goal selection. After more than a thousand training trials, the model shifts from choosing decision targets at random to choosing the decision associated with the largest reward. During training, the order in which the model selects among multiple goals also correlates with the reward gradient associated with the targets. In addition, changing the reward difference between decisions strongly affects the model's decision speed, driving the model into two learning phases: a fast learning phase and a slow learning phase.

17.
Sparse representations have gained much interest due to the rapid growth of intelligent embedded systems, the need to reduce the time to mine large datasets, and the need to reduce the footprint of recognition-based applications on portable devices. Computational learning theory tells us that the Vapnik–Chervonenkis (VC) dimension of a learning model directly impacts its structural risk and generalization ability. The minimal complexity machine (MCM) was recently proposed as a way to learn a hyperplane classifier by minimizing a tight bound on the VC dimension; results show that it learns very sparse representations that yield test set accuracies comparable to the state-of-the-art. The MCM formulation works in the primal itself, both when the classifier is learnt in the input space and when it is learnt implicitly in a higher-dimensional feature space. In the latter case, the hyperplane is constructed in the empirical feature space (EFS). In this paper, we examine the hyperplane restricted to the EFS. The EFS is a finite-dimensional vector space spanned by the image vectors in a higher-dimensional feature space. Since the VC dimension of a linear hyperplane classifier is exactly the number of features used by the classifier, the dimension of the EFS is a direct measure of both the sparsity of the model and the VC dimension. This allows us to formulate optimization problems that focus on learning sparse representations that nonetheless generalize well. We derive an EFS version of the MCM that allows us to minimize model complexity and improve sparsity. We also propose a novel least squares version of the MCM in the EFS. Experimental results demonstrate that the EFS variants yield sparse models with generalization comparable to the state-of-the-art.
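The empirical feature space can be made concrete as follows: the columns of the kernel matrix serve as a finite-dimensional feature map, and a regularized least-squares hyperplane is fit in that space. This is a simplified stand-in for the least-squares MCM, not the paper's formulation; all data and parameters are illustrative.

```python
# Minimal sketch of learning in the empirical feature space (EFS): kernel
# matrix columns as features, then a ridge least-squares hyperplane.
import numpy as np

rng = np.random.default_rng(9)
X = rng.standard_normal((120, 2))
y = np.sign(X[:, 0] ** 2 + X[:, 1] ** 2 - 1.5)   # nonlinear labels

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)                                          # EFS coordinates of X
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(K)), y)  # ridge least squares
pred = np.sign(K @ alpha)
print("training accuracy:", (pred == y).mean())
```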

18.
The phasic firing of dopamine neurons has been theorized to encode a reward-prediction error as formalized by the temporal-difference (TD) algorithm in reinforcement learning. Most TD models of dopamine have assumed a stimulus representation, known as the complete serial compound, in which each moment in a trial is distinctly represented. We introduce a more realistic temporal stimulus representation for the TD model. In our model, all external stimuli, including rewards, spawn a series of internal microstimuli, which grow weaker and more diffuse over time. These microstimuli are used by the TD learning algorithm to generate predictions of future reward. This new stimulus representation injects temporal generalization into the TD model and enhances correspondence between model and data in several experiments, including those when rewards are omitted or received early. This improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations.
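A minimal sketch of TD(0) learning with a microstimulus representation: a cue spawns a decaying memory trace, and each microstimulus is a Gaussian bump in trace height that grows weaker and more diffuse as time elapses. All constants are illustrative assumptions.

```python
# Minimal sketch: TD(0) value learning over a microstimulus representation.
# A cue at t=0 spawns a decaying trace; microstimuli are Gaussian bumps in
# trace height. All constants are illustrative assumptions.
import numpy as np

T, n_micro = 25, 10
gamma, alpha, sigma = 0.98, 0.1, 0.08
centers = np.linspace(1.0, 0.1, n_micro)

def microstimuli(t):
    trace = 0.985 ** t                    # decaying memory trace of the cue
    return trace * np.exp(-(trace - centers) ** 2 / (2 * sigma ** 2))

w = np.zeros(n_micro)
for trial in range(500):                  # cue at t=0, reward at t=T-1
    for t in range(T - 1):
        r = 1.0 if t + 1 == T - 1 else 0.0
        delta = r + gamma * w @ microstimuli(t + 1) - w @ microstimuli(t)
        w += alpha * delta * microstimuli(t)

print("predicted value just after cue:", round(w @ microstimuli(0), 3))
```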

19.
Soft sensing of the Kappa number in the pulp cooking process based on support vector machines
The Kappa number is an important quality index characterizing pulp quality in the cooking (digesting) process, and its soft sensing is an important route to quality control of the cooking process. The support vector machine is a novel machine learning method that adopts the structural risk minimization principle; compared with traditional machine learning methods, it guarantees a small generalization error while minimizing the training error. Applying support vector machines to soft-sensor modeling of the Kappa number in the pulp cooking process yields better prediction results than an empirical model.

20.
Traditional decision trees find decision boundaries by recursively partitioning the feature space, giving a "hard" partition of that space. For large data and complex patterns, however, such precise decision boundaries reduce the tree's generalization ability. To allow the decision tree algorithm to automatically acquire imprecise knowledge, fuzzy theory is introduced into the decision tree, and neural networks are used as its leaf nodes during tree construction, yielding an improved fuzzy decision tree algorithm based on neural networks. In the neural-network fuzzy decision tree, classifier learning has two stages: in the first stage, a heuristic based on uncertainty reduction partitions the data until a node's partitioning ability falls below a truth-degree threshold ε, at which point the fuzzy tree stops growing; in the second stage, the neural networks at the leaf nodes perform classification with generalization ability. Experimental results show that, compared with traditional classification learning algorithms, the proposed algorithm is more accurate and can structurally adapt the tree size for classification problems involving large data and complex patterns.
