Similar literature
1.
Slow Feature Analysis (SFA) extracts slowly varying features from a quickly varying input signal. It has been successfully applied to modeling the visual receptive fields of the cortical neurons. Ample experimental results in neuroscience suggest that the temporal slowness principle is a general learning principle in visual perception. In this paper, we introduce the SFA framework to the problem of human action recognition by incorporating the discriminative information with SFA learning and considering the spatial relationship of body parts. In particular, we consider four kinds of SFA learning strategies, including the original unsupervised SFA (U-SFA), the supervised SFA (S-SFA), the discriminative SFA (D-SFA), and the spatial discriminative SFA (SD-SFA), to extract slow feature functions from a large amount of training cuboids which are obtained by random sampling in motion boundaries. Afterward, to represent action sequences, the squared first order temporal derivatives are accumulated over all transformed cuboids into one feature vector, which is termed the Accumulated Squared Derivative (ASD) feature. The ASD feature encodes the statistical distribution of slow features in an action sequence. Finally, a linear support vector machine (SVM) is trained to classify actions represented by ASD features. We conduct extensive experiments, including two sets of control experiments, two sets of large scale experiments on the KTH and Weizmann databases, and two sets of experiments on the CASIA and UT-interaction databases, to demonstrate the effectiveness of SFA for human action recognition. Experimental results suggest that the SFA-based approach (1) is able to extract useful motion patterns and improves the recognition performance, (2) requires fewer intermediate processing steps but achieves comparable or even better performance, and (3) has good potential to recognize complex multiperson activities.
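The ASD feature described in this abstract can be illustrated with a minimal sketch. It assumes each transformed cuboid is already available as a short time series of slow-feature outputs, and it approximates the temporal derivative by a first-order difference between consecutive frames. The function name `asd_feature` and the nested-list data layout are hypothetical choices for illustration, not the authors' implementation.

```python
def asd_feature(cuboids):
    """Accumulate squared first-order temporal differences over all cuboids.

    cuboids: list of sequences; each sequence is a list of T frames, and each
    frame is a list of K slow-feature outputs (assumed layout for this sketch).
    Returns one K-dimensional vector.
    """
    num_features = len(cuboids[0][0])
    asd = [0.0] * num_features
    for sequence in cuboids:
        # Pair every frame with its successor to form finite differences.
        for prev, curr in zip(sequence, sequence[1:]):
            for k in range(num_features):
                diff = curr[k] - prev[k]
                asd[k] += diff * diff
    return asd


# Two toy cuboids with K = 2 slow features each.
toy_cuboids = [
    [[0.0, 1.0], [1.0, 1.0], [3.0, 0.0]],
    [[2.0, 2.0], [2.0, 4.0]],
]
print(asd_feature(toy_cuboids))
```

The resulting vector has one entry per slow feature, so its length does not depend on how many cuboids were sampled from the sequence.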

2.
Understanding the guiding principles of sensory coding strategies is a main goal in computational neuroscience. Among others, the principles of predictive coding and slowness appear to capture aspects of sensory processing. Predictive coding postulates that sensory systems are adapted to the structure of their input signals such that information about future inputs is encoded. Slow feature analysis (SFA) is a method for extracting slowly varying components from quickly varying input signals, thereby learning temporally invariant features. Here, we use the information bottleneck method to state an information-theoretic objective function for temporally local predictive coding. We then show that the linear case of SFA can be interpreted as a variant of predictive coding that maximizes the mutual information between the current output of the system and the input signal in the next time step. This demonstrates that the slowness principle and predictive coding are intimately related.

3.
Wiskott L 《Neural computation》2003,15(9):2147-2177
Temporal slowness is a learning principle that allows learning of invariant representations by extracting slowly varying features from quickly varying input signals. Slow feature analysis (SFA) is an efficient algorithm based on this principle and has been applied to the learning of translation, scale, and other invariances in a simple model of the visual system. Here, a theoretical analysis of the optimization problem solved by SFA is presented, which provides a deeper understanding of the simulation results obtained in previous studies.
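For reference, the optimization problem that SFA addresses is usually written as follows. This is the standard formulation from the SFA literature, restated here as background rather than quoted from this abstract; the symbols $x(t)$, $g_j$, $y_j$, and $\langle\cdot\rangle_t$ (average over time) are notation chosen for this sketch.

```latex
\begin{aligned}
&\text{Given an input signal } x(t), \text{ find functions } g_j \text{ with } y_j(t) = g_j(x(t)) \text{ such that} \\
&\min_{g_j}\ \Delta(y_j) = \langle \dot{y}_j^{\,2} \rangle_t \\
&\text{subject to}\quad \langle y_j \rangle_t = 0, \qquad \langle y_j^{2} \rangle_t = 1, \qquad \langle y_i\, y_j \rangle_t = 0 \quad \forall\, i < j .
\end{aligned}
```

The zero-mean and unit-variance constraints exclude the trivial constant solution, and the decorrelation constraint forces different outputs to carry different information while ordering them by slowness.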

4.
Li M  Clark JJ 《Neural computation》2004,16(11):2293-2321
Incorporation of visual-related self-action signals can help neural networks learn invariance. We describe a method that can produce a network with invariance to changes in visual input caused by eye movements and covert attention shifts. Training of the network is controlled by signals associated with eye movements and covert attention shifting. A temporal perceptual stability constraint is used to drive the output of the network toward remaining constant across temporal sequences of saccadic motions and covert attention shifts. We use a four-layer neural network model to perform the position-invariant extraction of local features and temporal integration of invariant presentations of local features in a bottom-up structure. We present results on both simulated data and real images to demonstrate that our network can acquire both position and attention shift invariance.

5.
Over successive stages, the ventral visual system develops neurons that respond with view, size and position invariance to objects including faces. A major challenge is to explain how invariant representations of individual objects could develop given visual input from environments containing multiple objects. Here we show that the neurons in a 1-layer competitive network learn to represent combinations of three objects simultaneously present during training if the number of objects in the training set is low (e.g. 4), to represent combinations of two objects as the number of objects is increased (e.g. to 10), and to represent individual objects as the number of objects in the training set is increased further (e.g. to 20). We next show that translation invariant representations can be formed even when multiple stimuli are always present during training, by including a temporal trace in the learning rule. Finally, we show that these concepts can be extended to a multi-layer hierarchical network model (VisNet) of the ventral visual system. This approach provides a way to understand how a visual system can, by self-organizing competitive learning, form separate invariant representations of each object even when each object is presented in a scene with multiple other objects present, as in natural visual scenes.
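The idea of "including a temporal trace in the learning rule" can be sketched with a generic Hebbian trace rule. The form below (an exponentially decaying trace of the postsynaptic activity multiplied by the current input) is the common textbook version and is an assumption here, as are the function name `trace_rule_update` and the parameters `eta` and `alpha`; the abstract does not give the exact rule, and weight normalization is omitted for brevity.

```python
def trace_rule_update(weights, inputs, outputs, eta=0.8, alpha=0.1):
    """Apply a Hebbian trace rule over a temporal sequence.

    weights: initial synaptic weights of one neuron
    inputs:  list of input vectors, one per time step
    outputs: list of the neuron's firing rates, one per time step
    eta:     how much of the previous trace is kept (assumed parameter)
    alpha:   learning rate (assumed parameter)
    """
    trace = 0.0
    w = list(weights)
    for x, y in zip(inputs, outputs):
        # The trace mixes the current activity with activity from earlier steps.
        trace = (1.0 - eta) * y + eta * trace
        # Hebbian update driven by the trace instead of the instantaneous rate.
        w = [wi + alpha * trace * xi for wi, xi in zip(w, x)]
    return w


# The neuron fires for the first view only, yet the second view is also
# strengthened because the trace carries activity forward in time.
updated = trace_rule_update([0.0, 0.0], [[1, 0], [0, 1]], [1.0, 0.0],
                            eta=0.5, alpha=1.0)
print(updated)
```

Because successive views of the same object tend to occur close together in time, a rule of this kind links them onto the same output neuron, which is how the trace supports invariance.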

6.
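To address the limitations of traditional communication-signal recognition methods that rely on hand-crafted expert features, namely their restricted applicability and low accuracy at low signal-to-noise ratio (SNR), a new method is proposed that combines complex baseband signals with automatic modulation recognition by a convolutional neural network (CNN). The received signal is preprocessed to obtain a complex baseband signal containing in-phase and quadrature components, which serves as the dataset fed to the CNN model. The model structure and hyperparameters such as the convolution kernels, stride, feature maps, and activation function are adjusted over multiple training runs, and the trained model is then used to extract features from communication signals and recognize them. The method classifies seven types of digital communication signals: 2FSK, 4FSK, BPSK, 8PSK, QPSK, QAM16, and QAM64. Experimental results show that at an SNR of 0 dB the average recognition accuracy over the seven signal types reaches 94.61%, which verifies that the algorithm is effective and achieves high accuracy under low-SNR conditions.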
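A minimal sketch of the input preparation step: splitting complex baseband samples into an in-phase (I) row and a quadrature (Q) row, giving a 2 x N real array of the kind a CNN can consume. The function name `iq_to_two_rows` and the row layout are assumptions for illustration; the abstract does not specify the exact tensor shape or any normalization.

```python
def iq_to_two_rows(samples):
    """Convert complex baseband samples into [I row, Q row].

    samples: iterable of Python complex numbers (I + jQ).
    Returns a list of two lists of floats, each of length N.
    """
    samples = list(samples)
    in_phase = [s.real for s in samples]      # I component
    quadrature = [s.imag for s in samples]    # Q component
    return [in_phase, quadrature]


# Four QPSK-like constellation points as a toy received burst.
burst = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
print(iq_to_two_rows(burst))
```

Keeping I and Q as separate rows preserves the phase information that distinguishes PSK and QAM constellations, which a magnitude-only representation would discard.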

7.
Zhang Zhen  Xu Shaohua 《软件》2020,(2):102-107
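For the classification of multichannel nonlinear time-varying signals, a deep wavelet process neural network based on a sparse autoencoder (SAE-DWPNN) is proposed. A multi-input/multi-output wavelet process neural network (WPNN) is constructed to perform multiscale decomposition of the time-varying signals and a preliminary extraction of their process distribution features. An SAE deep network is stacked after the WPNN hidden layer to synthesize and represent the extracted signal features at a higher level, and a softmax classifier then classifies the time-varying signals. SAE-DWPNN extends existing process neural networks to a deep structure and, at the same time, extends the information processing mechanism of the deep SAE network to the time domain, which expands the information processing capability of both types of model. The network can extract the distribution features and structural features of multichannel time-series signals while preserving the diversity of sample features, improving the ability to analyze the time-frequency characteristics and structural features of signals. The properties of SAE-DWPNN are analyzed and a comprehensive training algorithm is given. Taking the classification and diagnosis of seven cardiovascular diseases based on 12-lead ECG signals as an example, the experimental results verify the effectiveness of the model and algorithm.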

8.
Zhou Peng 《计算机应用研究》2023,40(6):1728-1733
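The classification performance of existing multi-class methods for finger motor-imagery EEG signals falls short of a usable level. Based on a detailed analysis of the multiple components of EEG signals on the time scale, a self-supervised subnetwork for signal sub-segment extraction is designed; the extracted sub-segment is then fed to a second subnetwork for signal classification, and the two subnetworks are combined into a self-supervised hybrid multi-task deep network. In the training phase, the sub-segment extraction subnetwork extracts a different sub-segment from each EEG signal, and the downstream classification subnetwork judges whether that sub-segment is optimal so that the sub-segment position is adjusted automatically. The overall loss function is a weighted sum of the two subnetworks' loss functions, and the learning algorithm for the whole network extracts the best sub-segment signal and obtains the best classification result. In the validation and test phases, the sub-segment extraction subnetwork automatically extracts the corresponding sub-segment using the trained parameters and feeds it to the classification subnetwork. Network performance is verified on the largest SCP data of Motor-Imagery and on Data sets 4 of BCI Competition IV. On the SCP dataset, the average test classification accuracy over all subjects exceeds 70% for the 3-finger task, is about 60% for the 4-finger task, and is about 50% for the 5-finger task, a clear improvement over existing reports. This confirms that the network can effectively extract sub-segments of motor-imagery EEG signals and has good classification performance and generalization ability.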

9.
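Underwater acoustic signal recognition has attracted much attention in recent years. Because the ocean channel varies in time and space, signal propagation suffers from fading, and underwater target sound sources are complex and variable, the recognition task faces great challenges. Traditional underwater acoustic signal recognition methods struggle to capture sufficient representational information about the target and lack good noise resistance, so their recognition performance needs improvement. To address these problems, this paper proposes an underwater acoustic signal recognition method based on a multi-branch external attention network (MEANet), which can fully capture the features of underwater acoustic signals and recognize them in complex marine environments. MEANet consists of a multi-branch backbone network, channel and spatial attention modules, and an external attention module. First, the input data pass through multiple parallel backbone branches that extract feature information of the underwater acoustic signal at different levels. Second, channel and spatial attention modules weight the channel and spatial dimensions of the signal separately, adjusting the importance of different channels and spatial positions to the feature representation. Finally, an external attention module is integrated, which uses an external memory unit and additional computation to guide the network's feature extraction and prediction, significantly improving the recognition rate and robustness of the model. Experimental results show that the proposed MEANet reaches an underwater acoustic signal recognition rate of 98.84% on the ShipsEar dataset, significantly outperforming the other compared algorithms and confirming its effectiveness.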

10.
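Objective: Remote photoplethysmography (rPPG) is a video-based, non-contact heart rate measurement technique that has received wide attention from researchers. Extracting the pulse signal from video data requires considering temporal and spatial information together, yet existing methods often separate spatial processing from temporal processing, which leads to inaccurate modeling and low measurement accuracy. This paper proposes a neural network model based on multi-view 2D convolution that models intra-frame and inter-frame correlations to improve measurement accuracy. Method: The proposed network comprises ordinary 2D convolution blocks and multi-view convolution blocks. The ordinary 2D convolution blocks perform a preliminary abstraction of the input data in the spatial dimensions. The multi-view convolution block has three channels, which apply 2D convolution to the input data from three views (height-width, height-time, and width-time) and then fuse the complementary spatiotemporal features of the three views to obtain the final pulse signal. The proposed multi-view 2D convolution extends the traditional single-view 2D convolutional network into the time dimension. The method does not destroy the original structure of the video and mines complementary spatiotemporal features through convolution over the three views, thereby improving pulse measurement accuracy. Results: Experiments on the public dataset PURE (pulse rate detection dataset) and the self-built dataset Self-rPPG (self-built rPPG dataset) show that, compared with traditional methods, the signal-to-noise ratio of the pulse signal extracted by the proposed network improves by 3.92 dB and 1.92 dB on the two datasets, and the mean absolute error decreases by 3.81 bpm and 2.91 bpm, respectively; compared with a single-view network, the signal-to-noise ratio improves by 2.93 dB and 3.20 dB and the mean absolute error decreases by 2.20 bpm and 3.61 bpm, respectively. Conclusion: The proposed network can estimate a subject's pulse signal with high accuracy in complex environments, demonstrating the effectiveness of multi-view 2D convolution for rPPG pulse extraction. Compared with rPPG algorithms based on single-view 2D neural networks, the pulse signal extracted by the proposed method contains less noise and fewer low-frequency components and generalizes better.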
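The three views named in this abstract (height-width, height-time, width-time) can be made concrete by slicing a video clip along different axes. The sketch below only shows how the 2D planes for each view are obtained from a clip stored as `clip[t][h][w]`; the function name `three_views` and the nested-list layout are assumptions for illustration, and the convolution and fusion steps of the actual network are not reproduced.

```python
def three_views(clip):
    """Return the 2D planes of a clip for the three views.

    clip: nested list indexed as clip[t][h][w] (assumed layout).
    Returns (hw, ht, wt):
      hw[t] is an H x W plane (one per frame),
      ht[w] is an H x T plane (one per image column),
      wt[h] is a W x T plane (one per image row).
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    hw = [[[clip[t][h][w] for w in range(W)] for h in range(H)] for t in range(T)]
    ht = [[[clip[t][h][w] for t in range(T)] for h in range(H)] for w in range(W)]
    wt = [[[clip[t][h][w] for t in range(T)] for w in range(W)] for h in range(H)]
    return hw, ht, wt


# Toy clip with T=2, H=2, W=3; each value encodes its own (t, h, w) index.
toy_clip = [[[t * 100 + h * 10 + w for w in range(3)] for h in range(2)]
            for t in range(2)]
hw, ht, wt = three_views(toy_clip)
print(len(hw), len(ht), len(wt))
```

A 2D kernel applied to the height-time or width-time planes sees how pixel values change across frames, which is how a 2D convolution can pick up temporal structure without a full 3D kernel.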

11.
Yao Jiaqi  Jing Hua  Zhao Chunhui 《控制与决策》2023,38(7):1918-1926
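Rotating machinery is key equipment in industrial production, and diagnosing its faults efficiently is of great importance for safe industrial production. Traditional intelligent fault diagnosis methods for rotating machinery rely on manual feature extraction, which depends on expert knowledge and yields features with poor generalization and insufficient completeness; as a result, the diagnosis models have low accuracy, and their performance degrades markedly in noisy environments. To address this, a multimodal coupled-input neural network model for rotating machinery fault diagnosis is proposed. First, a signal decomposition method decomposes the original input signal into multiple sub-signals, and each sub-signal is paired with the original signal to form a two-dimensional matrix that is fed to the neural network, so that the network can extract the important correlation features between them. Then, a two-channel parallel convolutional neural network and long short-term memory network extract the spatial and temporal features of the signal, respectively, and fuse them, which greatly improves the completeness of the model's feature representation and enables high-accuracy fault classification for rotating machinery. Experiments verify that the proposed model achieves higher accuracy than traditional fault diagnosis models and also adapts well to noise interference.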

12.
Recently, there has been interest in developing diagnosis methods that combine model-based and data-driven diagnosis. In both approaches, selecting the relevant measurements or extracting important features from historical data is a key determinant of the success of the algorithm. Recently, deep learning methods have been effective in automating the feature selection process. Autoencoders have been shown to be an effective neural network configuration for extracting features from complex data; however, they may also learn irrelevant features. In addition, end-to-end classification neural networks have also been used for diagnosis, but like autoencoders, this method may also learn unimportant features, thus making the diagnostic inference scheme inefficient. To rapidly extract significant fault features, this paper employs end-to-end networks and develops a new feature extraction method based on importance analysis and knowledge distilling. First, a set of cumbersome neural network models is trained to predict faults and some of their internal values are defined as features. Then an occlusion-based importance analysis method is developed to select the most relevant input variables and learned features. Finally, a simple student neural network model is designed based on the previous analysis results and an improved knowledge distilling method is proposed to train the student model. Because of the way the cumbersome networks are trained, only fault features are learned, with the importance analysis further pruning the relevant feature set. These features can be rapidly generated by the student model. We discuss the algorithms and then apply our method to two typical dynamic systems, a communication system and a 10-tank system, to demonstrate the proposed approach.
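Occlusion-based importance analysis can be sketched generically: replace one input variable at a time with a baseline value and record how much the model output changes. The version below is the generic technique, not the paper's specific method; the names `occlusion_importance` and `toy_model`, the zero baseline, and the use of an absolute difference as the score are all assumptions for illustration.

```python
def occlusion_importance(predict, sample, baseline=0.0):
    """Score each input variable by the output change when it is occluded.

    predict:  callable mapping a list of floats to a single float score
    sample:   list of input variable values
    baseline: value substituted for the occluded variable (assumed to be 0.0)
    """
    reference = predict(sample)
    scores = []
    for i in range(len(sample)):
        occluded = list(sample)
        occluded[i] = baseline  # hide variable i from the model
        scores.append(abs(reference - predict(occluded)))
    return scores


def toy_model(x):
    """Hypothetical stand-in for a trained fault predictor; it ignores x[1]."""
    return 2.0 * x[0] + 0.0 * x[1] - 1.0 * x[2]


print(occlusion_importance(toy_model, [1.0, 5.0, 3.0]))
```

Variables whose occlusion leaves the output unchanged, such as the second input of the toy model, are candidates for pruning before a smaller student model is designed.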

13.
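Because noise always affects detection results, signal detection at low signal-to-noise ratio is currently a hot topic in the detection field, and extracting weak signals from a strong noise background is a difficult part of signal detection. Wavelet neural networks are better suited than digital filters to detecting weak signals. A wavelet neural network is an adaptive system for time-frequency analysis that can detect small changes in a signal. This paper proposes a new method for detecting weak signals in white noise. Simulation results show that the wavelet neural network is a very effective method for detecting the features of weak signals and improving the signal-to-noise ratio.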

14.
Liao Hui  Zhou Guorong 《信息与控制》2003,32(5):413-417
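In systems subject to a large amount of mixed random interference, an intelligent method for extracting characteristic-value signals is adopted. By searching for features in the waveform of the object signal, the method identifies and extracts, in real time, the characteristic signals that reflect the true state of the measured object, effectively suppresses the influence of interference on the waveform signal, and greatly improves the detection and monitoring accuracy of the whole system. The paper gives a description of the object's pattern features, together with a theoretical description and an implementation method for generating and extracting the characteristic signals. Successful application in practical engineering projects shows that the method has a strong ability to handle signal interference, is especially suitable for the analysis and monitoring of systems affected by mixed on-site interference, and has good value for application and wider adoption.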

15.
An artificial neural network that self-organizes to recognize various images presented as a training set is described. One application of the network uses multiple functionally disjoint stages to provide pattern recognition that is invariant to translations of the object in the image plane. The general form of the network uses three stages that perform the functionally disjoint tasks of preprocessing, invariance, and recognition. The preprocessing stage is a single layer of processing elements that performs dynamic thresholding and intensity scaling. The invariance stage is a multilayered connectionist implementation of a modified Walsh-Hadamard transform used for generating an invariant representation of the image. The recognition stage is a multilayered self-organizing neural network that learns to recognize the representation of the input image generated by the invariance stage. The network can successfully self-organize to recognize objects without regard to the location of the object in the image field and has some resistance to noise and distortions.

16.
Learning identity with radial basis function networks
Radial basis function (RBF) networks are compared with other neural network techniques on a face recognition task for applications involving identification of individuals using low-resolution video information. The RBF networks are shown to exhibit useful shift, scale and pose (y-axis head rotation) invariance after training when the input representation is made to mimic the receptive field functions found in early stages of the human vision system. In particular, representations based on difference of Gaussian (DoG) filtering and Gabor wavelet analysis are compared. Extensions of the techniques to the case of image sequence analysis are described and a time delay (TD) RBF network is used for recognising simple movement-based gestures. Finally, we discuss how these techniques can be used in real-life applications that require recognition of faces and gestures using low-resolution video images.

17.
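In intelligent analysis models for electrocardiogram (ECG) signals, feature extraction from complex waveforms is difficult, hand-designed features lose information from the source signal, and labeled samples are scarce. To address these problems, an ECG feature extraction method based on deep sparse auto-encoders (DSAEs) is proposed. During greedy layer-wise training of the DSAEs, the method uses adaptive moment estimation (Adam) to optimize the network weights and obtain the best parameter combination, and extracts the output of the higher hidden layers as a highly abstract, low-dimensional feature of the ECG. Finally, a support vector machine (SVM) is used to build a classification model that classifies the ECG features. Simulation experiments on ECG data from the MIT-BIH arrhythmia database show that the proposed ECG feature extraction method can effectively extract features layer by layer and improve classification accuracy.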

18.
Lü Fei  Xia Xiuyu 《自动化学报》2017,43(4):634-644
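Classical computational models of auditory attention mainly study primary auditory features such as sound intensity, frequency, and time. These features cannot simulate the directivity of auditory attention well, so higher-level auditory features must be sought to distinguish different sounds. Following the mechanisms of auditory perception, this paper proposes a bottom-up computational model of auditory selective attention with dual-pathway information processing, based on sound source azimuth features and neural networks. The model first preprocesses the binaural signals and analyzes their spectra, and then sends them to a where pathway and a what pathway. The where pathway extracts azimuth feature parameters and uses a neural network to extract local azimuth features of the sound source; an azimuth saliency map is then obtained through local feature aggregation and global optimization. Finally, the dominant azimuth is extracted from the azimuth saliency map and applied to the what pathway, where time-frequency masking separates the corresponding dominant sound. Simulation results show that the model introduces azimuth features as clustering cues, uses a multi-level neural network to automatically select the sound objects worth attending to, and extracts the dominant sound in complex acoustic environments in real time; it simulates the azimuth classification, attentional selection, and attention shifting mechanisms of human hearing well.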

19.
It was confirmed that a real mobile robot with a simple visual sensor could learn appropriate motions to reach a target object by direct-vision-based reinforcement learning (RL). In direct-vision-based RL, raw visual sensory signals are put directly into a layered neural network, and then the neural network is trained using back propagation, with the training signal being generated by reinforcement learning. Because of the time-delay in transmitting the visual sensory signals, the actor outputs are trained by the critic output at two time-steps ahead. It was shown that a robot with a simple monochrome visual sensor can learn to reach a target object from scratch without any advance knowledge of this task by direct-vision-based RL. This work was presented in part at the 7th International Symposium on Artificial Life and Robotics, Oita, Japan, January 16–18, 2002.

20.
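Traditional machine learning methods for sleep apnea (SA) detection require extensive feature engineering, which makes them inefficient, and most models extract features from a single-channel signal, which gives poor recognition performance. To solve these problems, a multimodal feature fusion model based on a temporal convolutional network (TCN) and stacked sparse denoising auto-encoders (SSDAEs) is proposed to extract features automatically. The model takes two signals, ECG and respiration, as input. It first uses the TCN to extract temporal features of the input signals and then uses the SSDAEs to extract shallow and deep high-dimensional features. A small neural network fuses the ECG and respiratory signal features, which lie in different feature spaces, and the model is combined with the random forest algorithm to detect SA segments. Experimental results show that the accuracy, sensitivity, and specificity of the method for SA segment detection are 91.5%, 88.9%, and 90.8%, respectively. Comparison with previous related studies verifies that the model detects SA with better performance and higher efficiency.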
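The three evaluation measures reported in this abstract follow from a binary confusion matrix over segments. The sketch below uses the standard definitions; the function name `sa_segment_metrics`, the label convention (1 for an apnea segment, 0 for a normal segment), and the toy labels are assumptions for illustration and do not reproduce the paper's data.

```python
def sa_segment_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    y_true, y_pred: sequences of 0/1 labels, 1 = apnea segment (assumed).
    Assumes both classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # apnea, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # apnea, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, accepted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, false alarm
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(sa_segment_metrics(toy_true, toy_pred))
```

Sensitivity measures how many apnea segments are caught, while specificity measures how many normal segments are correctly left alone, so the two are usually reported together with accuracy.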
