期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Extraction and classification of tempo stimuli from electroencephalography recordings using convolutional recurrent attention model

Gi Yong Lee Min-Soo Kim Hyoung-Gook Kim 《ETRI Journal》2021,43(6):1081-1092

Electroencephalography (EEG) recordings taken during the perception of music tempo contain information that estimates the tempo of a music piece. If information about this tempo stimulus in EEG recordings can be extracted and classified, it can be effectively used to construct a music-based brain–computer interface. This study proposes a novel convolutional recurrent attention model (CRAM) to extract and classify features corresponding to tempo stimuli from EEG recordings of listeners who listened with concentration to the tempo of musics. The proposed CRAM is composed of six modules, namely, network inputs, two-dimensional convolutional bidirectional gated recurrent unit-based sample encoder, sample-level intuitive attention, segment encoder, segment-level intuitive attention, and softmax layer, to effectively model spatiotemporal features and improve the classification accuracy of tempo stimuli. To evaluate the proposed method's performance, we conducted experiments on two benchmark datasets. The proposed method achieves promising results, outperforming recent methods. 相似文献

2.

Design,analysis and application of a volumetric convolutional neural network

《Journal of Visual Communication and Image Representation》2017

The design, analysis and application of a volumetric convolutional neural network (VCNN) are studied in this work. Although many CNNs have been proposed in the literature, their design is empirical. In the design of the VCNN, we propose a feed-forward K-means clustering algorithm to determine the filter number and size at each convolutional layer systematically. For the analysis of the VCNN, the cause of confusing classes in the output of the VCNN is explained by analyzing the relationship between the filter weights (also known as anchor vectors) from the last fully-connected layer to the output. Furthermore, a hierarchical clustering method followed by a random forest classification method is proposed to boost the classification performance among confusing classes. For the application of the VCNN, we examine the 3D shape classification problem and conduct experiments on a popular ModelNet40 dataset. The proposed VCNN offers the state-of-the-art performance among all volume-based CNN methods. 相似文献

3.

Detecting facial manipulated videos based on set convolutional neural networks

《Journal of Visual Communication and Image Representation》2021

With the boom of artificial intelligence, facial manipulation technology is becoming more simple and more numerous. At the same time, the technology also has a large and profound negative impact on face forensics, such as Deepfakes. In this paper, in order to aggregate multiframe features to detect facial manipulation videos, we solve facial manipulated video detection from set perspective and propose a novel framework based on set, which is called set convolutional neural network (SCNN). Three instances of the proposed framework SCNN are implemented and evaluated on the Deepfake TIMIT dataset, FaceForensics++ dataset and DFDC Preview datset. The results show that the method outperforms previous methods and can achieve state-of-the-art performance on both datasets. As a perspective, the proposed method is a fusion promotion of single-frame digital video forensics network. 相似文献

4.

Multi-layered attentional peephole convolutional LSTM for abstractive text summarization

Md. Motiur Rahman Fazlul Hasan Siddiqui 《ETRI Journal》2021,43(2):288-298

Abstractive text summarization is a process of making a summary of a given text by paraphrasing the facts of the text while keeping the meaning intact. The manmade summary generation process is laborious and time-consuming. We present here a summary generation model that is based on multilayered attentional peephole convolutional long short-term memory (MAPCoL; LSTM) in order to extract abstractive summaries of large text in an automated manner. We added the concept of attention in a peephole convolutional LSTM to improve the overall quality of a summary by giving weights to important parts of the source text during training. We evaluated the performance with regard to semantic coherence of our MAPCoL model over a popular dataset named CNN/Daily Mail, and found that MAPCoL outperformed other traditional LSTM-based models. We found improvements in the performance of MAPCoL in different internal settings when compared to state-of-the-art models of abstractive text summarization. 相似文献

5.

Robust image retrieval by cascading a deep quality assessment network

《Signal Processing: Image Communication》2020

The performance of computer vision algorithms can severely degrade in the presence of a variety of distortions. While image enhancement algorithms have evolved to optimize image quality as measured according to human visual perception, their relevance in maximizing the success of computer vision algorithms operating on the enhanced image has been much less investigated. We consider the problem of image enhancement to combat Gaussian noise and low resolution with respect to the specific application of image retrieval from a dataset. We define the notion of image quality as determined by the success of image retrieval and design a deep convolutional neural network (CNN) to predict this quality. This network is then cascaded with a deep CNN designed for image denoising or super resolution, allowing for optimization of the enhancement CNN to maximize retrieval performance. This framework allows us to couple enhancement to the retrieval problem. We also consider the problem of adapting image features for robust retrieval performance in the presence of distortions. We show through experiments on distorted images of the Oxford and Paris buildings datasets that our algorithms yield improved mean average precision when compared to using enhancement methods that are oblivious to the task of image retrieval. ¹ 相似文献

6.

Learning to predict the quality of distorted-then-compressed images via a deep neural network

《Journal of Visual Communication and Image Representation》2021

Being captured by amateur photographers, reciprocally propagated through multimedia pipelines, and compressed with different levels, real-world images usually suffer from a wide variety of hybrid distortions. Faced with this scenario, full-reference (FR) image quality assessment (IQA) algorithms can not deliver promising predictions due to the inferior references. Meanwhile, existing no-reference (NR) IQA algorithms remain limited in their efficacy to deal with different distortion types. To address this obstacle, we explore a NR-IQA metric by predicting the perceptual quality of distorted-then-compressed images using a deep neural network (DNN). First, we propose a novel two-stream DNN to handle both authentic distortions and synthetic compressions and adopt effective strategies to pre-train the two branches of the network. Specifically, we transfer the knowledge learned from in-the-wild images to account for authentic distortions by utilizing a pre-trained deep convolutional neural network (CNN) to provide meaningful initializations. Meanwhile, we build a CNN for synthetic compressions and pre-train it on a dataset including synthetic compressed images. Subsequently, we bilinearly pool these two sets of features as the image representation. The overall network is fine-tuned on an elaborately-designed auxiliary dataset, which is annotated by a reliable objective quality metric. Furthermore, we integrate the output of the authentic-distortion-aware branch with that of the overall network following a two-step prediction manner to boost the prediction performance, which can be applied in the distorted-then-compressed scenario when the reference image is available. Extensive experimental results on several databases especially on the LIVE Wild Compressed Picture Quality Database show that the proposed method achieves state-of-the-art performance with good generalizability and moderate computational complexity. 相似文献

7.

通道门控Res2Net卷积神经网络自动调制识别

陈昊郭文普康凯《电讯技术》2023,63(12):1869-1875

针对低信噪比条件下自动调制识别准确率不高的问题,提出了通道门控Res2Net卷积神经网络自动调制识别模型。该模型主要由二维卷积神经(Two-dimensional Convolutional Neural Network, 2D-CNN)网络、多尺度残差网络(Residual 2-network, Res2Net)、压缩与激励网络(Squeeze-and-Excitation Network, SENet)和长短期记忆(Long Short-Term Memory, LSTM)网络组成,通过卷积从原始I/Q数据中提取多尺度特征,结合门控机制对特征通道进行权重调整,并利用LSTM对卷积所得特征进行序列建模,确保数据特征被有效挖掘,从而提升自动调制识别的准确率。在基准数据集RML2016.10a下的调制识别实验表明,所提模型在信噪比为12 dB时识别精度为92.68%,在信噪比2 dB以上时平均识别精度大于91%,较经典CLDNN模型、LSTM模型和同类型PET-CGDNN模型、CGDNet模型能取得更高的调制类型识别准确率。相似文献

8.

基于CNN的SAR图像目标和场景分类算法

下载免费PDF全文

陈立福武鸿崔先亮郭正华贾智伟《雷达科学与技术》2018,16(6):627-632

随着合成孔径雷达（Synthetic Aperture Radar, SAR）成像技术的发展和SAR图像数据的急剧增加,SAR图像解译技术成为了当前的研究热点。针对SAR图像的目标和场景分类问题,提出了一种改进的基于卷积神经网络的图像分类算法。为克服卷积神经网络训练过程中因数据量不足而出现的过拟合问题,采用数据增强人工增加训练样本的大小;针对高层卷积层参数过多的问题,采用一种多尺度卷积模块替代高层的卷积层;在输出层采用卷积和全局均值池化的组合替代传统的全连接层,大幅度减少了网络参数。网络训练阶段,通过误差反向传播来更新网络参数。针对MSTAR数据集和高分辨率的机载SAR图像分别进行目标及场景分类,实验结果表明该算法实现了较好的分类性能。相似文献

9.

A novel convolutional neural network architecture of multispectral remote sensing images for automatic material classification

《Signal Processing: Image Communication》2021

For real-world simulation, terrain models must combine various types of information on material and texture in terrain reconstruction for the three-dimensional numerical simulation of terrain. However, the construction of such models using the conventional method often involves high costs in both manpower and time. Therefore, this study used a convolutional neural network (CNN) architecture to classify material in multispectral remote sensing images to simplify the construction of future models. Visible light (i.e., RGB), near infrared (NIR), normalized difference vegetation index (NDVI), and digital surface model (DSM) images were examined.This paper proposes the use of the robust U-Net (RUNet) model, which integrates multiple CNN architectures, for material classification. This model, which is based on an improved U-Net architecture combined with the shortcut connections in the ResNet model, preserves the features of shallow network extraction. The architecture is divided into an encoding layer and a decoding layer. The encoding layer comprises 10 convolutional layers and 4 pooling layers. The decoding layer contains four upsampling layers, eight convolutional layers, and one classification convolutional layer. The material classification process in this study involved the training and testing of the RUNet model. Because of the large size of remote sensing images, the training process randomly cuts subimages of the same size from the training set and then inputs them into the RUNet model for training. To consider the spatial information of the material, the test process cuts multiple test subimages from the test set through mirror padding and overlapping cropping; RUNet then classifies the subimages. Finally, it merges the subimage classification results back into the original test image.The aerial image labeling dataset of the National Institute for Research in Digital Science and Technology (Inria, abbreviated from the French Institut national de recherche en sciences et technologies du numérique) was used as well as its configured dataset (called Inria-2) and a dataset from the International Society for Photogrammetry and Remote Sensing (ISPRS). Material classification was performed with RUNet. Moreover, the effects of the mirror padding and overlapping cropping were analyzed, as were the impacts of subimage size on classification performance. The Inria dataset achieved the optimal results; after the morphological optimization of RUNet, the overall intersection over union (IoU) and classification accuracy reached 70.82% and 95.66%, respectively. Regarding the Inria-2 dataset, the IoU and accuracy were 75.5% and 95.71%, respectively, after classification refinement. Although the overall IoU and accuracy were 0.46% and 0.04% lower than those of the improved fully convolutional network, the training time of the RUNet model was approximately 10.6 h shorter. In the ISPRS dataset experiment, the overall accuracy of the combined multispectral, NDVI, and DSM images reached 89.71%, surpassing that of the RGB images. NIR and DSM provide more information on material features, reducing the likelihood of misclassification caused by similar features (e.g., in color, shape, or texture) in RGB images. Overall, RUNet outperformed the other models in the material classification of remote sensing images. The present findings indicate that it has potential for application in land use monitoring and disaster assessment as well as in model construction for simulation systems. 相似文献

10.

基于卷积神经网络的Android流量分类方法

郭益民张爱新《通信技术》2020,(2):432-437

深度学习就是机器学习研究的过程,主要通过模拟人脑分析学习的过程对数据进行分析。目前,深度学习技术已经在计算机视觉、语音识别、自然语言处理等领域获得了较大发展,并且随着该技术的不断发展,为网络流量分类和异常检测带来了新的发展方向。移动智能手机与大家的生活息息相关,但是其存在的安全问题也日益凸显。针对传统机器学习算法对于流量分类需要人工提取特征、计算量大的问题,提出了基于卷积神经网络模型的应用程序流量分类算法。首先,将网络流量数据集进行数据预处理,去除无关数据字段,并使数据满足卷积神经网络的输入特性。其次,设计了一种新的卷积神经网络模型,从网络结构、超参数空间以及参数优化方面入手,构造了最优分类模型。该模型通过卷积层自主学习数据特征,解决了传统基于机器学习的流量分类算法中的特征选择问题。最后,通过CICAndmal2017网络公开数据集进行模型测试,相比于传统的机器学习流量分类模型,设计的卷积神经网络模型的查准率和查全率分别提高了2.93%和11.87%,同时在类精度、召回率以及F1分数方面都有较好的提升。相似文献

11.

移动信道图像传输的神经网络译码性能

刘华章王新政张立民《通信技术》2000,(2):83-85

采用（２,１,３）卷积码,仿真和讨论了标准图像通过具有四种不同调制方式和两种不同车速的八个移动信道的神经网络译码器性能,并且研究了在移动信道中卷积码快衰落特征对图像传输可靠性的影响和纠错性能。得到一些重要的结论,并提出了些移动通信系统中图像舆的建设。相似文献

12.

基于脑电信号深度迁移学习的驾驶疲劳检测

王斐吴仕超刘少林张亚徽魏颖《电子与信息学报》2019,41(9):2264-2272

脑电信号一直被誉为疲劳检测的“金标准”,驾驶者的精神状态可通过对脑电信号的分析得到。但由于脑电信号具有非线性、非平稳性和空间分辨率低等特点,传统的机器学习方法在运用脑电信号进行疲劳检测时还存在识别率低,特征提取操作繁琐等不足。为此,该文基于脑电信号的电极-频率分布图,提出运用深度迁移学习实现的驾驶疲劳检测方法,即搭建深度卷积神经网络,并利用SEED脑电情绪数据集对其进行预训练,然后通过迁移学习方法将其用于驾驶疲劳检测。实验结果表明,卷积神经网络模型能够很好地从电极-频率分布图中获得与疲劳状态相关的特征信息,达到较好的识别效果。此外,基于迁移学习策略可以将训练好的深度网络模型迁移到其他识别任务上,有助于推动脑电信号在驾驶疲劳检测系统中的应用。相似文献

13.

Robust face anti-spoofing with depth information

《Journal of Visual Communication and Image Representation》2017

With the prevalence of face authentication applications, the prevention of malicious attack from fake faces such as photos or videos, i.e., face anti-spoofing, has attracted much attention recently. However, while an increasing number of works on the face anti-spoofing have been reported based on 2D RGB cameras, most of them cannot handle various attacking methods. In this paper we propose a robust representation jointly modeling 2D textual information and depth information for face anti-spoofing. The textual feature is learned from 2D facial image regions using a convolutional neural network (CNN), and the depth representation is extracted from images captured by a Kinect. A face in front of the camera is classified as live if it is categorized as live using both cues. We collected a face anti-spoofing experimental dataset with depth information, and reported extensive experimental results to validate the robustness of the proposed method. 相似文献

14.

FaceHunter: A multi-task convolutional neural network based face detector

《Signal Processing: Image Communication》2016

In this paper, we propose a new multi-task Convolutional Neural Network (CNN) based face detector, which is named FaceHunter for simplicity. The main idea is to make the face detector achieve a high detection accuracy and obtain much reliable face boxes. Reliable face boxes output will be much helpful for further face image analysis. To reach this goal, we design a deep CNN network with a multi-task loss, i.e., one is for discriminating face and non-face, and another is for face box regression. An adaptive pooling layer is added before full connection to make the network adaptive to variable candidate proposals, and the truncated SVD is applied to compress the parameters of the fully connected layers. To further speed up the detector, the convolutional feature map is directly used to generate the candidate proposals by using Region Proposal Network (RPN). The proposed FaceHunter is evaluated on the AFW dataset, FDDB dataset and Pascal Faces respectively, and extensive experiments demonstrate its powerful performance against several state-of-the-art detectors. 相似文献

15.

A new hardware Trojan detection technique using deep convolutional neural network

《Integration, the VLSI Journal》2021

The involvement of external vendors in semiconductor industries increases the chance of hardware Trojan (HT) insertion in different phases of the integrated circuit (IC) design. Recently, several partial reverse engineering (RE) based HT detection techniques are reported, which attempt to reduce the time and complexity involved in the full RE process by applying machine learning or image processing techniques in IC images. However, these techniques fail to extract the relevant image features, not robust to image variations, complicated, less generalizable, and possess a low detection rate. Therefore, to overcome the above limitations, this paper proposes a new partial RE based HT detection technique that detects Trojans from IC layout images using Deep Convolutional Neural Network (DCNN). The proposed DCNN model consists of stacking several convolutional and pooling layers. It layer-wise extracts and selects the most relevant and robust features automatically from the IC images and eliminates the need to apply the feature extraction algorithm separately. To prevent the over-training of the DCNN model, a new stopping condition method and two new metrics, namely Accuracy difference measure (ADM) and Loss difference measure (LDM), are proposed that halts the training only when the performance of our model genuinely drops. Further, to combat the issue of process variations and fabrication noise generated during the RE process, we include noisy images with varying parameters in the training process of the model. We also apply the data augmentation and regularization techniques in the model to address the issues of underfitting and overfitting. Experimental evaluation shows that the proposed technique provides 99% and 97.4% accuracy on Trust-Hub and synthetic ISCAS dataset, respectively, which is on-an-average 15.83% and 21.69% higher than the existing partial RE based techniques. 相似文献

16.

基于特征融合卷积神经网络的FMCW雷达人体动作识别

张丽丽刘博屈乐乐刘雨轩《电讯技术》2022,62(2):147-154

针对微多普勒特征识别人体动作的局限性,基于调频连续波( Frequency Modulated Continuous Wave,FMCW)雷达采用深度学习方法对人体动作识别,提出了一种特征融合卷积神经网络结构.利用FMCW雷达采样的人体动作回波数据分别构建出时间-距离特征和微多普勒特征图,将这两种特征图作为输入数据分别... 相似文献

17.

一种基于集成卷积神经网络的SAR图像目标识别算法

下载免费PDF全文

李汪华张贞凯《电讯技术》2023,63(12):1918-1924

针对合成孔径雷达(Synthetic Aperture Radar, SAR)图像目标识别问题，提出了一种基于集成卷积神经网络(Convolutional Neural Network, CNN)的SAR图像目标识别方法。首先对原始数据集进行数据增强的预处理操作，以扩充训练样本；接着通过重采样的方法从训练样本中获取不同的训练子集，并在训练各基分类器时引入Dropout和Padding操作，有效增强了网络泛化能力；然后采用Adadelta算法与Nesterov动量法结合的思想来优化网络，提高了网络的收敛速度和识别精度；最后采用相对多数投票法对基分类器的分类结果进行集成。在MSTAR数据集上进行的实验结果表明，集成后的模型识别准确率达到99.30%,识别性能优于单个卷积神经网络，具有较强的泛化能力和较好的稳健性。相似文献

18.

基于全卷积神经网络的SAR图像目标分类

下载免费PDF全文

陈永生喻玲娟谢晓春《雷达科学与技术》2018,16(3):242-248

近年来,卷积神经网络(Convolutional Neural Network,CNN)在合成孔径雷达(Synthetic Aperture Radar,SAR)图像目标分类中取得了较好的分类结果。CNN结构中,前面若干层由交替的卷积层、池化层堆叠而成,后面若干层为全连接层。全卷积神经网络（All Convolutional Neural Network, A-CNN）是对CNN结构的一种改进,其中池化层和全连接层都用卷积层代替,该结构已在计算机视觉领域被应用。针对公布的MSTAR数据集,提出了基于A-CNN的SAR图像目标分类方法,并与基于CNN的SAR图像分类方法进行对比。实验结果表明,基于A-CNN的SAR图像目标分类正确率要高于基于CNN的分类正确率。相似文献

19.

基于图卷积神经网络的胸部放射影像疾病分类方法北大核心CSCD

赵佳雷黄青松刘利军黄冕《光电子．激光》2022,(6):667-672

医学X射线作为胸部疾病的常规检查手段,可以对早期不明显的胸部疾病进行诊断,并且观察出病变部位。但是,同一张放射影像上呈现出多种疾病特征,对分类任务而言是一个挑战。此外,疾病标签之间存在着不同的对应关系,进一步导致了分类任务的困难。针对以上问题,本文将图卷积神经网络(graph convolutional neural network,GCN)与传统卷积神经网络(convolutional neural network,CNN)相结合,提出了一种将标签特征与图像特征融合的多标签胸部放射影像疾病分类方法。该方法利用图卷积神经网络对标签的全局相关性进行建模,即在疾病标签上构建有向关系图,有向图中每个节点表示一种标签类别,再将该图输入图卷积神经网络以提取标签特征,最后与图像特征融合以进行分类。本文所提出的方法在ChestX-ray14数据集上的实验结果显示对14种胸部疾病的平均AUC达到了0.843,与目前3种经典方法以及先进方法进行比较,本文方法能够有效提高分类性能。相似文献

20.

基于无预训练卷积神经网络的红外车辆目标检测

下载免费PDF全文

陈皋王卫华林丹丹《红外技术》2021,43(4):342-348

为解决基于卷积神经网络的目标检测算法对预训练权重的过度依赖,特别是数据稀缺条件下的红外场景目标检测,提出了融入注意力模块来缓解不进行预训练所带来的检测性能下降的方法。本文基于YOLO v3算法,在网络结构中融入模仿人类注意力机制的SE和CBAM模块,对提取的特征进行通道层面和空间层面的重标定。根据特征的重要程度,自适应地赋予不同权重,最终提升检测精度。在构建的红外车辆目标数据集上,注意力模块能够显著提升无预训练卷积神经网络的检测精度,融入了CBAM模块的网络检测精度为86.3 mAP。实验结果证明了注意力模块能够提升网络的特征提取能力,使网络摆脱对预训练权重的过度依赖。相似文献