Similar Documents
 20 similar documents found (search time: 31 ms)
1.
Learning to predict future visual dynamics given input video sequences is a challenging but essential task. Although many stochastic video prediction models have been proposed, they still suffer from "multi-modal entanglement", the ambiguity of learned representations in multi-modal dynamics modeling. While most existing video prediction models are label-free, we propose a self-supervised labeling strategy that improves spatiotemporal prediction networks without extra supervision. Starting from a set of clustered pseudo-labels, our framework alternates between model optimization and label updating. The key insight of our method is that the reconstruction error of the optimized model itself serves as an indicator to progressively refine the label assignment on the training set. The two steps are interdependent: the predictive model guides the direction of label updates, and in turn, effective pseudo-labels help the model learn a better-disentangled multi-modal representation. Experiments on two different video prediction datasets demonstrate the effectiveness of the proposed method.
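A minimal sketch of the alternating scheme this abstract describes, assuming one mode-specific predictor per pseudo-class; the predictor interface (fit, error) and the parameter names are illustrative assumptions, not taken from the paper.

import numpy as np

def refine_pseudo_labels(predictors, videos, labels, n_rounds=5):
    """Alternate between (1) fitting one predictor per pseudo-class and
    (2) reassigning each video to the predictor that reconstructs it best."""
    for _ in range(n_rounds):
        # Step 1: model optimization -- fit each mode-specific predictor
        # on the videos currently assigned to it.
        for k, model in enumerate(predictors):
            subset = [v for v, y in zip(videos, labels) if y == k]
            if subset:
                model.fit(subset)  # hypothetical training interface
        # Step 2: label updating -- the model's own reconstruction error
        # indicates which dynamic mode each video belongs to.
        labels = [int(np.argmin([m.error(v) for m in predictors]))
                  for v in videos]
    return labels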

2.
Scene text recognition is a highly challenging research direction with important application value, but because text appears in rich and varied forms, recognition results are often unsatisfactory. To address this problem, this paper proposes a training method based on curriculum learning. Instead of selecting training samples from the dataset at random, the method sorts the dataset to obtain a training sequence of increasing difficulty, allowing the model to learn more accurate features in the early stages of training and improving its robustness. Experimental analysis shows that the proposed method accelerates model convergence: training the ASTER algorithm with different curriculum sequences yields improvements of 1.8% and 1% on the COCO-Text dataset, and the CRNN algorithm gains 0.2% on COCO-Text.
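A hedged sketch of the curriculum ordering idea: sort samples by a difficulty score before training rather than sampling at random. The difficulty measure used here (ground-truth text length) is an illustrative stand-in; the paper's actual curriculum criterion is not specified in this abstract.

def curriculum_order(samples, difficulty=lambda s: len(s["text"])):
    """Return training samples sorted from easy to hard."""
    return sorted(samples, key=difficulty)

samples = [{"image": "a.png", "text": "STOP"},
           {"image": "b.png", "text": "NO PARKING 8AM-6PM"}]
for sample in curriculum_order(samples):
    pass  # train_step(model, sample): short words first, long phrases later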

3.
Sequence-to-sequence dialogue generation mostly relies on external emotion-word embeddings to achieve emotional state transfer; the encoder can hardly capture the decoder's emotional state, and the decoder is disturbed by the forcibly embedded external emotion words, leading to stacked emotion words in generated replies and a lack of emotional context. To solve these problems, this paper proposes a position-aware, emotion-controllable dialogue generation model. During encoding, the current input word vector and a position vector jointly participate in encoding, so that without affecting the current input...
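An illustrative sketch (not the paper's architecture, which the truncated abstract does not fully specify): the encoder input is the sum of a token embedding and a learned position embedding, so position information enters encoding jointly with each word vector. All dimensions and names are assumptions.

import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_len, d = 1000, 64, 128
tok_emb = rng.normal(0, 0.02, (vocab_size, d))   # word vectors
pos_emb = rng.normal(0, 0.02, (max_len, d))      # position vectors

def encode_inputs(token_ids):
    """Each input word vector and its position vector jointly enter encoding."""
    positions = np.arange(len(token_ids))
    return tok_emb[token_ids] + pos_emb[positions]

h = encode_inputs([5, 42, 7])   # shape (3, 128), fed to the seq2seq encoder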

4.
Over successive stages, the ventral visual system develops neurons that respond with view, size and position invariance to objects, including faces. A major challenge is to explain how invariant representations of individual objects could develop given visual input from environments containing multiple objects. Here we show that the neurons in a 1-layer competitive network learn to represent combinations of three objects simultaneously present during training if the number of objects in the training set is low (e.g. 4), to represent combinations of two objects as the number of objects is increased to e.g. 10, and to represent individual objects as the number of objects in the training set is increased further to e.g. 20. We next show that translation-invariant representations can be formed even when multiple stimuli are always present during training, by including a temporal trace in the learning rule. Finally, we show that these concepts can be extended to a multi-layer hierarchical network model (VisNet) of the ventral visual system. This approach provides a way to understand how a visual system can, by self-organizing competitive learning, form separate invariant representations of each object even when each object is presented in a scene with multiple other objects present, as in natural visual scenes.
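A minimal sketch of the temporal-trace Hebbian rule mentioned above, in the standard VisNet form: the postsynaptic trace is a leaky memory of recent firing, so features seen in temporal succession (e.g. a translating object) bind to the same neuron. The constants eta and alpha and the weight normalization are conventional choices, stated here as assumptions.

import numpy as np

def trace_update(w, x, y_trace, y, eta=0.8, alpha=0.1):
    """w: (n_out, n_in) weights; x: input firing; y: current output firing."""
    y_trace = (1 - eta) * y + eta * y_trace          # leaky trace of activity
    w += alpha * np.outer(y_trace, x)                # Hebbian update with trace
    w /= np.linalg.norm(w, axis=1, keepdims=True)    # competitive weight norm
    return w, y_trace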

5.
Visual analysis of human behavior has generated considerable interest in the field of computer vision because of its wide spectrum of potential applications. Human behavior can be segmented into atomic actions, each of which indicates a basic and complete movement. Learning and recognizing atomic human actions are essential to human behavior analysis. In this paper, we propose a framework for handling this task using variable-length Markov models (VLMMs). The framework comprises two modules: a posture labeling module and a VLMM atomic-action learning and recognition module. First, a posture template selection algorithm, based on a modified shape-context matching technique, is developed. The selected posture templates form a codebook that converts input posture sequences into discrete symbol sequences for subsequent processing. Then, the VLMM technique is applied to learn the training symbol sequences of atomic actions. Finally, the constructed VLMMs are transformed into hidden Markov models (HMMs) for recognizing input atomic actions. This approach combines the advantages of the excellent learning capability of a VLMM and the fault-tolerant recognition ability of an HMM. Experiments on realistic data demonstrate the efficacy of the proposed system.
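A hedged sketch of the variable-length Markov idea over the discrete posture symbols: count all contexts up to a maximum order, then predict the next symbol from the longest context actually observed, backing off to shorter ones. This shows only the VLMM prediction step, not the paper's VLMM-to-HMM conversion; max_order and the back-off policy are illustrative.

from collections import defaultdict, Counter

def train_vlmm(sequences, max_order=3):
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq)):
            for k in range(0, max_order + 1):
                if i - k >= 0:
                    counts[tuple(seq[i - k:i])][seq[i]] += 1
    return counts

def predict(counts, history, max_order=3):
    """Back off from the longest matching context to shorter ones."""
    for k in range(min(max_order, len(history)), -1, -1):
        ctx = tuple(history[len(history) - k:])
        if counts[ctx]:
            return counts[ctx].most_common(1)[0][0]
    return None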

6.
Brader JM, Senn W, Fusi S. Neural Computation, 2007, 19(11): 2881-2912
We present a model of spike-driven synaptic plasticity inspired by experimental observations and motivated by the desire to build an electronic hardware device that can learn to classify complex stimuli in a semisupervised fashion. During training, patterns of activity are sequentially imposed on the input neurons, and an additional instructor signal drives the output neurons toward the desired activity. The network is made of integrate-and-fire neurons with constant leak and a floor. The synapses are bistable, and they are modified by the arrival of presynaptic spikes. The sign of the change is determined by both the depolarization and the state of a variable that integrates the postsynaptic action potentials. Following the training phase, the instructor signal is removed, and the output neurons are driven purely by the activity of the input neurons weighted by the plastic synapses. In the absence of stimulation, the synapses preserve their internal state indefinitely. Memories are also very robust to the disruptive action of spontaneous activity. A network of 2000 input neurons is shown to be able to classify correctly a large number (thousands) of highly overlapping patterns (300 classes of preprocessed Latex characters, 30 patterns per class, and a subset of the NIST characters data set) and to generalize with performances that are better than or comparable to those of artificial neural networks. Finally, we show that the synaptic dynamics is compatible with many of the experimental observations on the induction of long-term modifications (spike-timing-dependent plasticity and its dependence on both the postsynaptic depolarization and the frequency of pre- and postsynaptic neurons).
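A simplified sketch of the update logic described above: on each presynaptic spike, an internal synaptic variable moves up or down depending on the postsynaptic depolarization and on a trace that integrates postsynaptic spikes, then drifts toward one of two stable states between spikes. All thresholds and step sizes are illustrative placeholders, not the paper's fitted values.

def on_pre_spike(X, V_post, ca_trace, theta_V=0.8,
                 ca_low=1.0, ca_high=3.0, a=0.1, b=0.1):
    """X: internal synaptic variable in [0, 1], bistable at 0 and 1.
    V_post: postsynaptic depolarization; ca_trace: integrated post spikes."""
    if ca_low < ca_trace < ca_high:          # plasticity-permissive window
        if V_post > theta_V:
            X = min(1.0, X + a)              # candidate potentiation
        else:
            X = max(0.0, X - b)              # candidate depression
    # Between spikes, X drifts to the nearest stable state (0 or 1),
    # which is why memories persist indefinitely without stimulation.
    return X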

7.
We have combined competitive and Hebbian learning in a neural network designed to learn and recall complex spatiotemporal sequences. In such sequences, a particular item may occur more than once, or the sequence may share states with another sequence. Processing of repeated/shared states is a hard problem that occurs very often in the domain of robotics. The proposed model consists of two groups of synaptic weights: competitive interlayer and Hebbian intralayer connections, which encode the spatial and temporal features of the input sequence, respectively. Three additional mechanisms allow the network to deal with shared states: context units, neurons disabled from learning, and redundancy in the encoding of sequence states. The network operates by determining the current and the next state of the learned sequences. The model is simulated over various sets of robot trajectories to evaluate its storage and retrieval abilities, its sequence-sampling effects, its robustness to noise, and its tolerance to faults.

8.
A Speech Recognition Model Based on Recurrent Neural Networks (cited 5 times: 1 self-citation, 4 by others)
朱小燕, 王昱, 徐伟. 《计算机学报》 (Chinese Journal of Computers), 2001, 24(2): 213-218
In recent years, speech recognition based on hidden Markov models (HMMs) has advanced considerably. However, the HMM has inherent limitations, and how to overcome the problems caused by its first-order and independence assumptions has long been a hot research topic; introducing neural networks into speech recognition is one way to overcome these limitations. This paper applies recurrent neural networks to Mandarin speech recognition, modifies the original network model, and proposes a corresponding training method. Experimental results show that the model handles continuous signals well, performing comparably to a traditional HMM, and that the new training strategy both speeds up training and markedly improves the model's classification performance.

9.
Many important machine learning models, supervised and unsupervised, are based on simple Euclidean distance or orthogonal projection in a high-dimensional feature space. When estimating such models from small training sets we face the problem that the span of the training input vectors is not the full input space. Hence, when applying the model to future data, the model is effectively blind to the missed orthogonal subspace. This can lead to an inflated variance of hidden variables estimated on the training set, and when the model is applied to test data we may find that the hidden variables follow a different probability law with less variance. While the problem and basic means to reconstruct and deflate are well understood in unsupervised learning, the case of supervised learning is less well understood. We here investigate the effect of variance inflation in supervised learning, including the case of support vector machines (SVMs), and we propose a non-parametric scheme to restore proper generalizability. We illustrate the algorithm and its ability to restore performance on a wide range of benchmark data sets.
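A toy numpy demonstration of the span problem described above, not the paper's exact scheme: with fewer training points than dimensions, latent variances estimated on the training set are inflated relative to test data, and the observed ratio gives a simple deflation factor.

import numpy as np

rng = np.random.default_rng(1)
d, n_train, n_test = 50, 20, 1000
X_train = rng.normal(size=(n_train, d))     # small training set, n < d
X_test = rng.normal(size=(n_test, d))

mu = X_train.mean(axis=0)
U, S, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
z_train = (X_train - mu) @ Vt.T             # latents, training span only
z_test = (X_test - mu) @ Vt.T

print(z_train.var(axis=0)[:3])              # inflated (> 1 on this isotropic toy)
print(z_test.var(axis=0)[:3])               # near 1: a law with less variance
ratio = z_train.var(axis=0) / np.maximum(z_test.var(axis=0), 1e-12)
# ratio > 1 quantifies the inflation a deflation scheme must correct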

10.
A Multi-Factor Prediction Model Based on Rough Sets and a BP Neural Network (cited 3 times: 0 self-citations, 3 by others)
Using rough set methods and the concept of information entropy, the input parameter set is reduced according to the importance of each input factor with respect to the output, without changing the classification quality of the training samples, thereby determining the input-layer variables and the number of neurons of the neural network. By learning from typical samples, a rough-set BP neural network multi-factor prediction model is built and applied to predicting the development cost of missile systems. The results show that the method reduces network training time, improves learning efficiency, and achieves high prediction accuracy, demonstrating that it is feasible and effective.
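A hedged sketch of the entropy-based importance measure behind this kind of attribute reduction, assuming discretized inputs: a factor's importance is the rise in the conditional entropy of the output when that factor is removed, and near-zero columns can be pruned before building the BP network. The exact rough-set reduct procedure of the paper is not reproduced here.

import numpy as np
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * np.log2(c / n) for c in Counter(labels).values())

def cond_entropy(X_cols, y):
    """H(y | X): entropy of y within each combination of input values."""
    groups = {}
    for row, label in zip(map(tuple, X_cols), y):
        groups.setdefault(row, []).append(label)
    n = len(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())

def importance(X, y, j):
    """Information lost when column j is dropped from the input set."""
    keep = [i for i in range(X.shape[1]) if i != j]
    return cond_entropy(X[:, keep], y) - cond_entropy(X, y)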

11.
A novel neural network model is described that implements context-dependent learning of complex sequences. The model utilises leaky integrate-and-fire neurons to extract timing information from its input and modifies its weights using a learning rule with synaptic noise. Learning and recall phases are seamlessly integrated, so the network can gradually shift from learning to predicting its input. Experimental results using data from a real-world problem domain demonstrate that the use of context has three important benefits: (a) it prevents catastrophic interference during learning of multiple overlapping sequences, (b) it enables the completion of sequences from missing or noisy patterns, and (c) it provides a mechanism to selectively explore the space of learned sequences during free recall.

12.
Confirming configurations in EFSM testing (cited 1 time: 0 self-citations, 1 by others)
We investigate the problem of configuration verification for the extended FSM (EFSM) model, an extension of the FSM state identification problem. Specifically, given a configuration ("state vector") and an arbitrary set of configurations, determine an input sequence such that the EFSM in the given configuration produces an output sequence different from that of the configurations in the given set, or at least in a maximal proper subset of it. Such a sequence can be used in a test case to confirm the destination configuration of a particular EFSM transition. We demonstrate that this problem can be reduced to the EFSM traversal problem, so that existing methods and tools developed in the context of model checking become applicable. We introduce notions of EFSM projections and products and, based on these notions, develop a theoretical framework for determining configuration-confirming sequences. The proposed approach is illustrated on a realistic example.

13.
This paper proposes a person re-identification method based on unsupervised domain adaptation. Given a labeled source-domain training set and an unlabeled target-domain training set, it explores how to improve the generalization of a re-identification model on the target-domain test set. To this end, the source- and target-domain training sets are fed into the model simultaneously during training; global features are extracted along with local features that describe pedestrian images, so that finer-grained features are learned. A long short-term memory network (LSTM) is applied end-to-end to model a pedestrian as a head-to-foot sequence of body parts. The method consists of two main steps: (1) StarGAN is used to augment the unlabeled target-domain images; (2) the source- and target-domain datasets are fed simultaneously into a global branch and an LSTM-based local branch for joint training. The proposed model achieves good performance on both the Market-1501 and DukeMTMC-reID datasets, fully demonstrating its effectiveness.

14.
Learning and convergence properties of linear threshold elements, or perceptrons, are well understood for the case where the input vectors (or the training sets) to the perceptron are linearly separable. Little is known, however, about the behavior of the perceptron learning algorithm when the training sets are linearly nonseparable. We present the first known results on the structure of linearly nonseparable training sets and on the behavior of perceptrons when the set of input vectors is linearly nonseparable. More precisely, we show that using the well-known perceptron learning algorithm, a linear threshold element can learn the input vectors that are provably learnable, and we identify those vectors that cannot be learned without committing errors. We also show how a linear threshold element can be used to learn large linearly separable subsets of any given nonseparable training set. In order to develop our results, we first establish formal characterizations of linearly nonseparable training sets and define learnable structures for such patterns. We also prove computational complexity results for the related learning problems. Next, based on such characterizations, we show that a perceptron does the best one can expect for linearly nonseparable sets of input vectors and learns as much as is theoretically possible.
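One concrete way to get "the best one can expect" on nonseparable data is the well-known pocket variant of perceptron learning (Gallant's algorithm, shown here as an illustration rather than the paper's own construction): ordinary updates cycle forever on nonseparable sets, so we keep the best weight vector seen so far.

import numpy as np

def pocket_perceptron(X, y, epochs=100, lr=1.0):
    """X: (n, d) inputs; y: labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    best_w, best_errors = w.copy(), len(y)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:            # misclassified: perceptron update
                w = w + lr * yi * xi
            errors = int(np.sum(y * (X @ w) <= 0))
            if errors < best_errors:          # pocket: remember the best w
                best_w, best_errors = w.copy(), errors
    return best_w, best_errors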

15.
In recent years, hydroforming has become the topic of much active research. Researchers have been looking for better procedures and prediction tools to improve the quality of the product and reduce the prototyping cost. Like any other metal forming process, hydroforming leads to non-homogeneous plastic deformation of the workpiece. In this paper, a model is developed to predict the amount of deformation caused by hydroforming using random neural networks (RNNs). RNNs learn the behavior of a system from the provided input/output data in a manner similar to the way the human brain does, unlike the usual connectionist neural network (NN) models, which are based on simple functional analyses. Experimental data were collected and used to train and test the RNNs. The RNN models have feedforward architectures and use a generalized learning algorithm in the training process. Multi-layer RNNs with as few as six neurons were used to capture the nonlinear correlations between the input and output data collected from an experimental setup. The RNN models were able to predict the center deflection, the thickness variation, and the deformed shape of circular plate specimens with good accuracy.

16.
We survey recent research on the supervised training of feedforward neural networks. The goal is to expose how the networks work, how to engineer them so they can learn data with less extraneous noise, how to train them efficiently, and how to assure that the training is valid. The scope covers gradient descent and polynomial line search, from backpropagation through conjugate gradients and quasi-Newton methods. There is a consensus among researchers that adaptive step gains (learning rates) can stabilize and accelerate convergence and that a good starting weight set improves both the training speed and the learning quality. The training problem includes both the design of a network function and the fitting of the function to a set of input and output data points by computing a set of coefficient weights. The form of the function can be adjusted by adjoining new neurons and pruning existing ones, and by setting other parameters such as biases and exponential rates. Our exposition reveals several useful results that are readily implementable.
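A small sketch of the adaptive step-gain idea the survey highlights, using the classic "bold driver" heuristic as a representative example (the survey covers a family of such rules): grow the learning rate while the loss falls, shrink it and retry when the loss rises. The factors 1.1 and 0.5 are conventional choices.

import numpy as np

def train_bold_driver(w, grad_fn, loss_fn, lr=0.01, steps=100):
    prev_loss = loss_fn(w)
    for _ in range(steps):
        w_new = w - lr * grad_fn(w)
        loss = loss_fn(w_new)
        if loss < prev_loss:
            w, prev_loss, lr = w_new, loss, lr * 1.1   # accept, accelerate
        else:
            lr *= 0.5                                  # reject step, back off
    return w

# Usage: fit a least-squares line with an adaptive step gain.
X = np.c_[np.ones(10), np.arange(10.0)]
t = 3 * np.arange(10.0) + 1
w = train_bold_driver(np.zeros(2),
                      grad_fn=lambda w: 2 * X.T @ (X @ w - t) / len(t),
                      loss_fn=lambda w: float(np.mean((X @ w - t) ** 2)))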

17.
To address the problem that methods based on fixed-order Markov chain models cannot fully exploit the structural features of subsequences of different orders, a new Bayesian classification method for symbol sequences based on multi-order Markov models is proposed. First, a conditional probability distribution model based on multi-order Markov models is established. Second, an n-order subsequence suffix-tree structure with an attached suffix table and an efficient tree-construction algorithm are proposed; the algorithm builds the multi-order conditional probability model in a single scan of the sequence set. Finally, a Bayesian classifier for symbol sequences is proposed, whose training algorithm learns the weights of the different-order models by maximum likelihood, and whose classification algorithm makes Bayesian predictions using the weighted conditional probabilities of each order. Experiments on real sequence sets from three application domains show that the new classifier is insensitive to changes in model order; compared with existing methods such as support vector machines using fixed-order models, it improves classification accuracy by more than 40% on gene and speech sequences and can output a reference value for the optimal order of the symbol-sequence Markov model.
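A hedged sketch of the weighted multi-order scoring described above: per class, estimate k-order conditional probabilities for k = 0..K with plain dictionaries (standing in for the paper's suffix-tree structure), then score a sequence by a weighted log-likelihood. Laplace smoothing and the weight vector are illustrative choices, not the paper's exact estimator.

from collections import defaultdict, Counter
import math

def fit_class(sequences, K=3):
    counts = [defaultdict(Counter) for _ in range(K + 1)]
    for s in sequences:
        for i, sym in enumerate(s):
            for k in range(K + 1):
                if i >= k:
                    counts[k][tuple(s[i - k:i])][sym] += 1
    return counts

def log_score(counts, s, weights, alphabet, K=3):
    total = 0.0
    for i, sym in enumerate(s):
        for k, wk in enumerate(weights):
            if i >= k:
                ctx = counts[k][tuple(s[i - k:i])]
                p = (ctx[sym] + 1) / (sum(ctx.values()) + len(alphabet))
                total += wk * math.log(p)   # weighted k-order log-probability
    return total  # classify by argmax over per-class scores plus log prior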

18.
A new constructive algorithm is presented for building neural networks that learn to reproduce output temporal sequences based on one or several input sequences. The algorithm builds a network for the task of system modelling, dealing with continuous variables in the discrete time domain. The constructive scheme makes it user-independent. The network's structure consists of an ordinary set and a classification set, so it is a hybrid network like that of Stokbro et al. [6], but with a binary classification. The networks can easily be interpreted, so the learned representation can be transferred to a human engineer, unlike in many other network models. This allows for a better understanding of the system structure than mere simulation. The constructive algorithm limits network complexity automatically, thereby preserving extrapolation capabilities. Examples with real data from three totally different sources show good performance and suggest a promising line of research.

19.
A new algorithm for error-tolerant subgraph isomorphism detection (cited 3 times: 0 self-citations, 3 by others)
We propose a new algorithm for error-correcting subgraph isomorphism detection from a set of model graphs to an unknown input graph. The algorithm is based on a compact representation of the model graphs, derived from the set of model graphs in an off-line preprocessing step. The main advantage of the proposed representation is that common subgraphs of different model graphs are represented only once. Therefore, at run time, given an unknown input graph, the work of matching the common subgraphs onto the input graph is done only once rather than once per model graph. Consequently, the new algorithm depends only sublinearly on the number of model graphs. Furthermore, it can be combined with a future cost estimation method that greatly improves its run-time performance.

20.
Applied Soft Computing, 2008, 8(1): 166-173
Almost all current training algorithms for neural networks are based on gradient descent, which leads to long training times. In this paper, we propose a novel fast training algorithm called the Fast Constructive-Covering Algorithm (FCCA) for neural network construction, based on geometrical expansion. Parameters are updated according to the geometrical location of the training samples in the input space, and each sample in the training set is learned only once. By doing this, FCCA avoids iterative computing and is much faster than traditional training algorithms. Given an input sequence in an arbitrary order, FCCA learns "easy" samples first, and "confusing" samples are easily learned after these "easy" ones. This sample reordering is done on the fly, based on geometrical concepts. In addition, FCCA begins with an empty hidden layer and adds new hidden neurons when necessary. This constructive learning avoids blind selection of the neural network structure. Experimental work on classification problems illustrates the advantages of FCCA, especially its learning speed.
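An illustrative covering sketch in the spirit of the constructive scheme above, simplified to omit the on-the-fly reordering: each hidden unit is a sphere around a training sample; a sample already inside a sphere of its own class needs no learning, otherwise it spawns a new hidden unit whose radius stops short of the nearest other-class sample. The 0.5 margin factor is an assumption.

import numpy as np

def fcca_fit(X, y):
    units = []                                    # (center, radius, label)
    for xi, yi in zip(X, y):
        if any(lab == yi and np.linalg.norm(xi - c) <= r
               for c, r, lab in units):
            continue                              # covered: learned only once
        rivals = [np.linalg.norm(xi - xj)
                  for xj, yj in zip(X, y) if yj != yi]
        radius = 0.5 * min(rivals) if rivals else 1.0
        units.append((xi, radius, yi))            # add a new hidden neuron
    return units

def fcca_predict(units, x):
    # Nearest covering sphere (by signed distance to the sphere surface).
    c, r, lab = min(units, key=lambda u: np.linalg.norm(x - u[0]) - u[1])
    return lab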
