Similar articles
20 similar articles found.
1.
2.
Human action recognition is an important issue in the pattern recognition field, with applications ranging from remote surveillance to the indexing of commercial video content. However, human actions are characterized by non-linear dynamics and are therefore not easily learned and recognized. Accordingly, this study proposes a silhouette-based human action recognition system in which a three-step procedure is used to construct an efficient discriminant spatio-temporal subspace for k-NN classification purposes. In the first step, an Adaptive Locality Preserving Projection (ALPP) method is proposed to obtain a low-dimensional spatial subspace in which the linearity in the local data structure is preserved. To resolve the problem of overlaps in the spatial subspace resulting from the ambiguity of the human body shape among different action classes, temporal data are extracted using a Non-base Central-Difference Action Vector (NCDAV) method. Finally, the Large Margin Nearest Neighbor (LMNN) metric learning method is applied to construct an efficient spatio-temporal subspace for classification purposes. The experimental results show that the proposed system accurately recognizes a variety of human actions in real time and outperforms most existing methods. In addition, a robustness test with noisy data indicates that our system is remarkably robust toward noise in the input images.

3.
A method for electrocardiogram (ECG) pattern modeling and recognition via deterministic learning theory is presented in this paper. Instead of recognizing ECG signals beat by beat, each ECG signal, which contains a number of heartbeats, is recognized as a whole. The method is based entirely on the temporal features (i.e., the dynamics) of ECG patterns, which contain complete information about the ECG patterns. A dynamical model capable of generating synthetic ECG signals is employed to demonstrate the method. Based on this dynamical model, the method proceeds in two phases: the identification (training) phase and the recognition (test) phase. In the identification phase, the dynamics of ECG patterns are accurately modeled and expressed as constant RBF neural weights through deterministic learning. In the recognition phase, the modeling results are used for ECG pattern recognition. The main feature of the proposed method is that the dynamics of ECG patterns are accurately modeled and used for ECG pattern recognition. Experimental studies using the Physikalisch-Technische Bundesanstalt (PTB) database are included to demonstrate the effectiveness of the approach.

4.
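Multi-level, in-depth analysis of group activity is an important open problem in activity recognition. Building on research into deep neural networks, this paper proposes a hierarchical analysis model for group activity recognition. Transfer learning based on a regulation network achieves temporally consistent detection of the multiple persons in a group; fused spatio-temporal feature learning recognizes individual actions of unconstrained duration within the group activity; and fusing the individual action categories in the scene with the contextual information of the interaction scene yields stable and effective recognition of group activity. Extensive experiments on public datasets show that, compared with existing methods, the model performs well in analyzing and recognizing group activity.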

5.
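Traditional human action recognition algorithms cannot fully exploit the spatio-temporal information of human actions in video, and their recognition accuracy is low. This paper proposes a new human action recognition method based on a three-dimensional densely connected convolutional network. With a two-stream network as the basic framework, the spatial network uses a 3D dense network with an added attention mechanism to extract appearance features of the actions in the video, while the temporal network extracts motion features from the optical flow of consecutive video frames; the final recognition result is obtained after fusing the spatio-temporal features and the classification layers. In addition, to extract features more accurately and to model the interaction between the spatial and temporal networks, cross-stream connections are added between the two streams to fuse features at the convolutional layers. Experimental results on the UCF101 and HMDB51 datasets show that the model achieves recognition accuracies of 94.52% and 69.64%, respectively, makes full use of the spatio-temporal information in video, and extracts the key motion information.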

6.
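Research on coding for photoelectric shaft-angle encoders using an image sensor   Total citations: 2 (self-citations: 0; citations by others: 2)
The encoding principle and decoding method of absolute angle encoders are studied. Based on an analysis of existing encoding principles in China and abroad, a new code-disc pattern based on a shift-continuous code is proposed. Under background illumination, an image sensor captures an optical image of the code disc; after fast digitization, a code recognition algorithm extracts the absolute position. The pattern is simple and easy to manufacture; it resembles an incremental encoder pattern yet provides absolute encoding, which makes the encoder easy to miniaturize. Experimental results confirm that the system design and its theoretical basis are correct.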

7.
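To address spatio-temporal modeling in video action recognition, a temporally enhanced action recognition method based on fused spatio-temporal features is proposed within a deep learning framework. First, a sparse temporal sampling strategy is applied to the input video to accommodate varying video lengths and to reduce the cost of video-level temporal modeling. In the recognition stage, temporal differences between adjacent feature maps are computed, and the results are used to enhance feature-level motion information. Finally, a combination of residual structures and temporal enhancement structures improves the network's overall spatio-temporal modeling ability. Experiments show that the algorithm achieves high accuracy on the UCF101 and HMDB51 datasets and, in a real industrial operation action recognition scenario, attains good recognition performance with a relatively small network.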
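The two ingredients named in this abstract, sparse temporal sampling and differences between adjacent feature maps, can be illustrated with a minimal sketch. The abstract does not give the sampling rule or the feature layout, so the choices below are assumptions: one frame is taken from the centre of each equal-length segment, and each "feature map" is flattened to a plain list of floats. The function names are hypothetical.

```python
from typing import List


def sparse_sample_indices(num_frames: int, num_segments: int) -> List[int]:
    """Pick the centre frame of each of num_segments equal-length segments.

    Assumed sampling rule; the abstract only says the sampling is sparse.
    """
    if num_frames <= 0 or num_segments <= 0:
        raise ValueError("num_frames and num_segments must be positive")
    seg_len = num_frames / num_segments
    return [
        min(num_frames - 1, int(seg_len * i + seg_len / 2))
        for i in range(num_segments)
    ]


def temporal_differences(features: List[List[float]]) -> List[List[float]]:
    """Element-wise difference between each pair of adjacent feature vectors."""
    return [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(features, features[1:])
    ]


if __name__ == "__main__":
    print(sparse_sample_indices(100, 4))
    print(temporal_differences([[1.0, 2.0], [3.0, 5.0], [3.0, 1.0]]))
```

Because the number of sampled frames depends only on `num_segments`, the cost of the later temporal modeling stays fixed regardless of video length, which is the property the abstract attributes to sparse sampling.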

8.
The paper proposes a novel approach to fuzzy modeling of human working memory (WM) using electroencephalographic (EEG) signals, acquired during human face encoding and recall experiments in connection with a face recognition problem. The EEG signals acquired from the short term memory (STM) during memory encoding instances are considered as the input of the proposed working memory model. The EEG responses of the WM to visual stimuli, acquired during WM recall instances, are considered as the output of the model. The experiment is divided into two phases. In the first phase, the WM of a human subject is modeled by a fuzzy implication relation, describing a mapping from the STM response (during encoding) to the WM responses (during recall) to visual stimuli. During STM encoding, the subject is visually presented with the full face stimulus of a person. During WM recall, four partial face stimuli of the same person (made familiar during encoding) are used for the subject to recall the respective full face. The second phase is undertaken to validate the WM model by visually stimulating the subject again with randomly selected partial faces of people made familiar in the first phase, while the WM EEG responses are recorded. The WM responses, along with the WM model developed in the first phase, are used to retrieve the STM information by using an inverse fuzzy (implication) relation. Besides WM modeling, another important contribution of the paper lies in devising a solution to the inverse fuzzy relation computation in the settings of an optimization problem. An error metric is then defined to measure the discrepancy between the model-predicted STM encoding pattern and the actual pattern encoded by the STM (as captured by the EEG signal during encoding in the first phase). The smaller the error magnitude, the better the accuracy of the proposed model in differentiating people with memory failures.
Experimentally, it is observed that the proposed model yields a very small error, on the order of 10^-4, thus showing a high level of similarity between the actual and model-predicted STM responses for all the healthy subjects. An experiment undertaken using eLORETA software confirms that the orbito-frontal cortex of the prefrontal lobe is responsible for STM encoding, whereas the dorsolateral prefrontal region is responsible for WM recall. An analysis reveals that the proposed WM model produces the best response in the theta frequency band of the EEG spectra, thus supporting the association of the theta frequency range with the face recognition task. Comparative analysis also substantiates that the proposed technique of computing the max–min inverse fuzzy relation outperforms the existing techniques for inverse fuzzy computation, with a successful retrieval accuracy of 87.92%. The proposed study would find interesting applications in diagnosing memory failures in people with prefrontal lobe amnesia.

9.
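To improve the environment perception of indoor mobile robots, this work studies the classification of the structured corridor scenes in which such robots commonly operate, spiking neural networks (SNNs), and NeuCube, a new SNN-based computational model. SNNs transmit temporal and spatial information with spikes and are better suited than traditional neural networks to analyzing dynamic, time-series information and to recognizing and classifying various kinds of pattern information. SNNs are also easier to implement in hardware. After discussing the basic principles, learning methods, and computational steps of NeuCube, multiple ultrasonic sensor readings and NeuCube are used to recognize seven types of corridor scenes that indoor mobile robots commonly encounter. Experimental results show that the corridor scene classification method based on multi-ultrasonic sensing and NeuCube recognizes the seven corridor scenes effectively, and that the method helps to increase the autonomy and intelligence of mobile robots.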

10.
In this paper, we present a new silhouette-based gait recognition method via deterministic learning theory, which combines spatio-temporal motion characteristics and physical parameters of a human subject by analyzing shape parameters of the subject's silhouette contour. It has been validated only in sequences with lateral view, recorded in laboratory conditions. The ratio of the silhouette's height and width (H–W ratio), the width of the outer contour of the binarized silhouette, the silhouette area and the vertical coordinate of the centroid of the outer contour are combined as gait features for recognition. They represent the dynamics of gait motion and can more effectively reflect the tiny variance between different gait patterns. The gait recognition approach consists of two phases: a training phase and a test phase. In the training phase, the gait dynamics underlying different individuals' gaits are locally accurately approximated by radial basis function (RBF) networks via deterministic learning theory. The obtained knowledge of approximated gait dynamics is stored in constant RBF networks. In the test phase, a bank of dynamical estimators is constructed for all the training gait patterns. The constant RBF networks obtained from the training phase are embedded in the estimators. By comparing the set of estimators with a test gait pattern, a set of recognition errors is generated, and the average L1 norms of the errors are taken as the similarity measure between the dynamics of the training gait patterns and the dynamics of the test gait pattern. The test gait pattern similar to one of the training gait patterns can be recognized according to the smallest error principle. Finally, the recognition performance of the proposed algorithm is compared with published gait recognition approaches on the most well-known public gait databases: CASIA, CMU MoBo and TUM GAID.
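Two steps of this pipeline are simple enough to sketch: extracting shape features from a binarized silhouette, and choosing a label by the smallest average L1 error. The code below is an illustration under stated assumptions, not the authors' implementation. The silhouette is assumed to be a rectangular list of 0/1 rows; the centroid is taken over all foreground pixels rather than over the outer contour, which is a simplification; and the per-pattern error signals are assumed to be given already, since the RBF-based dynamical estimators that produce them are not reproduced here. All function names are hypothetical.

```python
from typing import Dict, List, Sequence


def silhouette_features(mask: List[List[int]]) -> Dict[str, float]:
    """Bounding-box H-W ratio, pixel area and centroid row of a binary silhouette."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    area = sum(sum(row) for row in mask)
    # Simplification: centroid of all foreground pixels, not of the outer contour.
    centroid_row = sum(r * sum(row) for r, row in enumerate(mask)) / area
    return {"hw_ratio": height / width, "area": float(area), "centroid_row": centroid_row}


def average_l1_norm(errors: Sequence[float]) -> float:
    """Mean absolute value of a recognition-error signal."""
    return sum(abs(e) for e in errors) / len(errors)


def recognize(error_signals: Dict[str, Sequence[float]]) -> str:
    """Smallest error principle: return the training label with the lowest average L1 error."""
    return min(error_signals, key=lambda label: average_l1_norm(error_signals[label]))


if __name__ == "__main__":
    mask = [[0, 1, 0], [1, 1, 1], [0, 1, 0], [0, 1, 0]]
    print(silhouette_features(mask))
    print(recognize({"subject_a": [0.5, -0.5], "subject_b": [0.1, -0.2]}))
```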

11.
This paper presents a novel approach for action recognition, localization and video matching based on a hierarchical codebook model of local spatio-temporal video volumes. Given a single example of an activity as a query video, the proposed method finds similar videos to the query in a target video dataset. The method is based on the bag of video words (BOV) representation and does not require prior knowledge about actions, background subtraction, motion estimation or tracking. It is also robust to spatial and temporal scale changes, as well as some deformations. The hierarchical algorithm codes a video as a compact set of spatio-temporal volumes, while considering their spatio-temporal compositions in order to account for spatial and temporal contextual information. This hierarchy is achieved by first constructing a codebook of spatio-temporal video volumes. Then a large contextual volume containing many spatio-temporal volumes (ensemble of volumes) is considered. These ensembles are used to construct a probabilistic model of video volumes and their spatio-temporal compositions. The algorithm was applied to three available video datasets for action recognition with different complexities (KTH, Weizmann, and MSR II) and the results were superior to other approaches, especially in the case of a single training example and cross-dataset action recognition.

12.
It has been proposed that invariant pattern recognition might be implemented using a learning rule that utilizes a trace of previous neural activity which, given the spatio-temporal continuity of the statistics of sensory input, is likely to be about the same object though with differing transforms in the short time scale. Recently, it has been demonstrated that a modified Hebbian rule which incorporates a trace of previous activity but no contribution from the current activity can offer substantially improved performance. In this paper we show how this rule can be related to error correction rules, and explore a number of error correction rules that can be applied to and can produce good invariant pattern recognition. An explicit relationship to temporal difference learning is then demonstrated, and from this further learning rules related to temporal difference learning are developed. This relationship to temporal difference learning allows us to begin to exploit established analyses of temporal difference learning to provide a theoretical framework for better understanding the operation and convergence properties of these learning rules, and more generally, of rules useful for learning invariant representations. The efficacy of these different rules for invariant object recognition is compared using VisNet, a hierarchical competitive network model of the operation of the visual system.
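The difference between a trace rule that includes the current activity and the modified rule that uses only previous activity can be shown in a few lines. The abstract gives no equations, so the form below is an assumption based on a commonly used formulation: the trace is an exponential average of the output, trace(t) = (1 - eta) * y(t) + eta * trace(t-1), and the weight change is alpha * post * x(t), where post is either trace(t) or trace(t-1). The function name and parameter values are hypothetical.

```python
from typing import List, Sequence


def trace_rule_updates(
    inputs: Sequence[Sequence[float]],
    outputs: Sequence[float],
    weights: Sequence[float],
    alpha: float = 0.1,
    eta: float = 0.8,
    use_current: bool = False,
) -> List[float]:
    """Apply a Hebbian trace rule over a sequence of (input vector, output) pairs.

    use_current=True  -> postsynaptic term is the trace including the current output.
    use_current=False -> postsynaptic term is the trace of previous activity only.
    """
    trace = 0.0
    w = list(weights)
    for x, y in zip(inputs, outputs):
        prev_trace = trace
        trace = (1.0 - eta) * y + eta * prev_trace  # exponential average of output
        post = trace if use_current else prev_trace
        w = [wi + alpha * post * xi for wi, xi in zip(w, x)]
    return w


if __name__ == "__main__":
    seq_in = [[1.0, 0.0], [0.0, 1.0]]
    seq_out = [1.0, 1.0]
    print(trace_rule_updates(seq_in, seq_out, [0.0, 0.0], alpha=1.0, eta=0.5))
    print(trace_rule_updates(seq_in, seq_out, [0.0, 0.0], alpha=1.0, eta=0.5, use_current=True))
```

With `use_current=False` the first input in a sequence causes no weight change, because no earlier activity exists yet; later inputs are bound to the activity evoked by earlier transforms of the same object.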

13.
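Deep learning has achieved good results in human action recognition, but the appearance and motion information of the people in a video still needs to be exploited more fully. To use the spatial and temporal information in video for recognizing human actions, a spatio-temporal two-stream video human action recognition model is proposed. The model first uses two convolutional neural networks to extract the spatial and temporal features of video action clips, respectively; it then fuses the two networks and extracts mid-level spatio-temporal features; finally, the extracted mid-level features are fed to a 3D convolutional neural network to recognize the human actions in the video. Video human action recognition experiments are carried out on the UCF101 and HMDB51 datasets. The results show that the proposed spatio-temporal two-stream 3D convolutional neural network model recognizes human actions in video effectively.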

14.
This paper presents new findings in the design and application of biologically plausible neural networks based on spiking neuron models, which represent a more plausible model of real biological neurons where time is considered as an important feature for information encoding and processing in the brain. The design approach consists of an evolutionary strategy based supervised training algorithm, newly developed by the authors, and the use of different biologically plausible neuronal models. A dynamic synapse (DS) based neuron model, a biologically more detailed model, and the spike response model (SRM) are investigated in order to demonstrate the efficacy of the proposed approach and to further our understanding of the computing capabilities of the nervous system. Unlike the conventional synapse, represented as a static entity with a fixed weight, employed in conventional and SRM-based neural networks, a DS is weightless and its strength changes upon the arrival of incoming input spikes. Therefore its efficacy depends on the temporal structure of the impinging spike trains. In the proposed approach, the training of the network free parameters is achieved using an evolutionary strategy where, instead of binary encoding, real values are used to encode the static and DS parameters which underlie the learning process. The results show that spiking neural networks based on both types of synapse are capable of learning non-linearly separable data by means of spatio-temporal encoding. Furthermore, a comparison of the obtained performance with classical neural networks (multi-layer perceptrons) is presented.

15.
Spike timing-dependent plasticity (STDP) is a learning rule that modifies the strength of a neuron's synapses as a function of the precise temporal relations between input and output spikes. In many brain areas, temporal aspects of spike trains have been found to be highly reproducible. How will STDP affect a neuron's behavior when it is repeatedly presented with the same input spike pattern? We show in this theoretical study that repeated inputs systematically lead to a shaping of the neuron's selectivity, emphasizing its very first input spikes, while steadily decreasing the postsynaptic response latency. This was obtained under various conditions of background noise, and even under conditions where spiking latencies and firing rates, or synchrony, provided conflicting information. The key role of first spikes demonstrated here provides further support for models using a single wave of spikes to implement rapid neural processing.
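The dependence of the weight change on spike timing can be made concrete with the widely used pair-based exponential STDP window. This is a generic sketch, not the specific model of the study above: the exponential form, the amplitudes `a_plus` and `a_minus`, and the time constants (in milliseconds) are assumed illustrative values.

```python
import math


def stdp_delta_w(
    t_pre: float,
    t_post: float,
    a_plus: float = 0.01,
    a_minus: float = 0.012,
    tau_plus: float = 20.0,
    tau_minus: float = 20.0,
) -> float:
    """Weight change for one pre/post spike pair under a pair-based STDP window.

    Pre before post (dt > 0) potentiates; post before pre (dt < 0) depresses.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


if __name__ == "__main__":
    for dt in (-40.0, -10.0, 10.0, 40.0):
        print(dt, stdp_delta_w(0.0, dt))
```

Under such a window, inputs that fire shortly before the output spike are strengthened the most, which is consistent with the reported emphasis on the earliest input spikes and the shrinking response latency.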

16.
Spatio-temporal pattern recognition problems are particularly challenging. They typically involve detecting change that occurs over time in two-dimensional patterns. Analytic techniques devised for temporal data must take into account the spatial relationships among data points. An artificial neural network known as the self-organizing feature map (SOM) has been used to analyze spatial data. This paper further investigates the use of the SOM with spatio-temporal pattern recognition. The principles of the two-dimensional SOM are developed into a novel three-dimensional network and experiments demonstrate that (i) the three-dimensional network makes a better topological ordering and (ii) there is a difference in terms of the spatio-temporal analysis that can be made with the three-dimensional network. Received 21 October 1999 / Revised 11 February 2000 / Accepted 2 May 2000

17.
The timing information contained in the response of a neuron to noisy periodic synaptic input is analyzed for the leaky integrate-and-fire neural model. We address the question of the relationship between the timing of the synaptic inputs and the output spikes. This requires an analysis of the interspike interval distribution of the output spikes, which is obtained in the Gaussian approximation. The conditional output spike density in response to noisy periodic input is evaluated as a function of the initial phase of the inputs. This enables the phase transition matrix to be calculated, which relates the phase at which the output spike is generated to the initial phase of the inputs. The interspike interval histogram and the period histogram for the neural response to ongoing periodic input are then evaluated by using the leading eigenvector of this phase transition matrix. The synchronization index of the output spikes is found to increase sharply as the inputs become synchronized. This enhancement of synchronization is most pronounced for large numbers of inputs and lower frequencies of modulation and also for rates of input near the critical input rate. However, the mutual information between the input phase of the stimulus and the timing of output spikes is found to decrease at low input rates as the number of inputs increases. The results show close agreement with those obtained from numerical simulations for large numbers of inputs.
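For readers unfamiliar with the two quantities discussed here, the following sketch shows a leaky integrate-and-fire neuron and a synchronization index. It is a simplified illustration and not the paper's analysis: the neuron is integrated with a deterministic forward-Euler step and a given input sequence (no synaptic noise), and the synchronization index is computed as the vector strength of the spike phases, which is one standard definition and is assumed here. Units and parameter values are hypothetical.

```python
import math
from typing import List, Sequence


def simulate_lif(
    input_current: Sequence[float],
    dt: float = 0.1,
    tau: float = 10.0,
    v_rest: float = 0.0,
    v_thresh: float = 1.0,
    v_reset: float = 0.0,
) -> List[float]:
    """Forward-Euler leaky integrate-and-fire; returns spike times (step index * dt)."""
    v = v_rest
    spikes: List[float] = []
    for step, i_in in enumerate(input_current):
        # Leak toward v_rest plus drive from the input.
        v += dt * (-(v - v_rest) + i_in) / tau
        if v >= v_thresh:
            spikes.append(step * dt)
            v = v_reset
    return spikes


def synchronization_index(spike_times: Sequence[float], period: float) -> float:
    """Vector strength: 1.0 for perfect phase locking, near 0.0 for uniform phases."""
    if not spike_times:
        return 0.0
    phases = [2.0 * math.pi * (t % period) / period for t in spike_times]
    c = sum(math.cos(p) for p in phases) / len(phases)
    s = sum(math.sin(p) for p in phases) / len(phases)
    return math.hypot(c, s)


if __name__ == "__main__":
    period = 20.0
    drive = [1.5 + math.sin(2.0 * math.pi * (k * 0.1) / period) for k in range(5000)]
    spikes = simulate_lif(drive)
    print(len(spikes), synchronization_index(spikes, period))
```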

18.
An optimal design of the driving current pattern for a disc-type axial-flux brushless DC wheel motor of an electric vehicle is proposed in this paper. The electro-magnetic dynamic model of the motor is established with magnetic circuits, describing the relationship between the output torque and excitation current. The optimal current pattern, in terms of magnitude and phase angle, is then obtained by maximizing the output torque with respect to the rotor shift. Compared with the traditional three-phase-on current pattern of fixed 120-degree phase shift, both the average torque and efficiency with the driving current of an optimal advanced switching angle are seen to be improved under various loading conditions. The motor performance with the optimal driving waveform is simulated and verified by experiments.

19.
20.
石祥滨  李怡颖  刘芳  代钦 《计算机应用研究》2021,38(4):1235-1239,1276
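Two-stream methods for video action recognition ignore the relationships between feature channels, and their features contain a large amount of redundant spatio-temporal information. To address these problems, an end-to-end action recognition model based on a two-stream spatio-temporal attention mechanism, T-STAM, is proposed to make full use of the key spatio-temporal information in video. First, a channel attention mechanism is introduced into the two-stream backbone network; it recalibrates channel information by modeling the dependencies between feature channels, improving the expressive power of the features. Second, a CNN-based temporal attention model is proposed that learns an attention score for each frame with few parameters and focuses on frames with pronounced motion. A multi-spatial attention model is also proposed that computes attention scores for each position in every frame from different perspectives, extracts multiple motion-salient regions, and fuses the spatio-temporal features to further strengthen the video feature representation. Finally, the fused features are fed to a classification network, and the outputs of the two streams are fused with different weights to obtain the action recognition result. Experimental results on the HMDB51 and UCF101 datasets show that T-STAM recognizes actions in video effectively.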
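The final step, fusing the outputs of the two streams with different weights, can be sketched as a weighted sum of per-class probabilities. The abstract does not state the weights or whether fusion happens on logits or probabilities, so the softmax normalization and the default weights below are assumptions chosen only for illustration; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(
    spatial_logits: Sequence[float],
    temporal_logits: Sequence[float],
    w_spatial: float = 1.0,
    w_temporal: float = 1.5,
) -> Tuple[int, List[float]]:
    """Weighted late fusion; returns (predicted class index, fused scores).

    The weights are illustrative assumptions, not values from the paper.
    """
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    fused = [w_spatial * a + w_temporal * b for a, b in zip(p_spatial, p_temporal)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


if __name__ == "__main__":
    print(fuse_streams([2.0, 0.0], [0.0, 2.0]))
```

When the two streams disagree, the stream with the larger weight decides the prediction, so the weights encode how much each modality is trusted.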
