Similar Documents
 20 similar documents found (search time: 15 ms)
1.
Traditional sign-language classification methods suffer from low recognition rates because they extract only a single type of feature or use an overly simple classifier. To address this, this paper combines a deep convolutional neural network classifier with a multi-feature fusion algorithm, achieving effective recognition by fusing texture and shape features. The texture features are obtained via LBP, a convolutional neural network, and the gray-level co-occurrence matrix, while the shape feature vector consists of Hu invariant moments and Fourier descriptors. To avoid overfitting, dropout is used when training the deep convolutional neural network. On the "hand" database, this multi-feature fusion method achieves a recognition rate of 97.73% over 32 hand gestures. Compared with typical sign-language recognition methods, it is more robust and achieves a higher recognition rate.
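The dropout regularization mentioned above can be sketched in a few lines. This is a minimal "inverted dropout" illustration in numpy, not the paper's implementation; the function name and rate are illustrative.

```python
import numpy as np

def dropout(x, p=0.5, train=True, rng=None):
    """Inverted dropout sketch: zero each unit with probability p at training
    time and scale survivors by 1/(1-p), so the expected activation is
    unchanged and no rescaling is needed at test time."""
    if not train or p == 0.0:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p   # keep a unit with probability 1-p
    return x * mask / (1.0 - p)
```

At test time (`train=False`) the input passes through unchanged, which is exactly why the inverted scaling is applied during training.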

2.
3.
This work addresses two fundamental questions about the structure of convolutional neural networks (CNNs): (1) why is a nonlinear activation function essential at the filter output of every intermediate layer? (2) what is the advantage of a two-layer cascade over a one-layer system? A mathematical model called the "REctified-COrrelations on a Sphere" (RECOS) model is proposed to answer these questions. After CNN training, the converged filter weights define a set of anchor vectors in the RECOS model; anchor vectors represent the frequently occurring patterns (or spectral components). The necessity of rectification is explained using the RECOS model. Then, the behavior of a two-layer RECOS system is analyzed and compared with its one-layer counterpart. LeNet-5 and the MNIST dataset are used to illustrate the discussion. Finally, the RECOS model is generalized to a multilayer system, with AlexNet as an example.
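The core RECOS operation described above (correlate a unit-normalized input with anchor vectors, then rectify) can be sketched as follows; this is an interpretation of the abstract's description, with illustrative names, not the authors' code.

```python
import numpy as np

def recos_response(x, anchors):
    """RECOS sketch: project the input onto the unit sphere, correlate with
    unit-normalized anchor vectors (the converged filter weights), and
    rectify negative correlations to zero, as a ReLU does."""
    x = x / np.linalg.norm(x)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return np.maximum(a @ x, 0.0)
```

The rectification step is the point: without it, an input strongly anti-correlated with an anchor would produce a large negative response that later layers could confuse with a strong positive match.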

4.
We consider the prediction of a driver's cognitive states related to driving performance from EEG signals. We propose a novel channel-wise convolutional neural network (CCNN) whose architecture accounts for the unique characteristics of EEG data, and we also discuss CCNN-R, a CCNN variant that replaces the convolutional filter with a Restricted Boltzmann Machine, deriving the detailed algorithm. To test the performance of CCNN and CCNN-R, we assembled a large EEG dataset from three driver-fatigue studies covering 37 subjects. Using this dataset, we investigated CCNN and CCNN-R on both raw EEG data and its Independent Component Analysis (ICA) decomposition. We tested both within-subject and cross-subject predictions, and the results show that CCNN and CCNN-R achieve robust, improved performance over conventional DNNs and CNNs as well as other non-deep-learning algorithms.
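The "channel-wise" idea above can be illustrated with a toy 1-D convolution applied to each EEG channel independently along time. This is a sketch of the concept only; the actual CCNN architecture, kernels, and shapes are not specified in the abstract.

```python
import numpy as np

def channelwise_conv(eeg, kernel):
    """Convolve every EEG channel independently along the time axis
    ('valid' mode), so no filter ever mixes samples across channels --
    the defining property of a channel-wise convolutional layer."""
    return np.stack([np.convolve(ch, kernel, mode="valid") for ch in eeg])
```

A per-channel moving-average kernel, for example, smooths each electrode's signal without leaking information between electrodes.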

5.
In this paper, we propose a new multi-task Convolutional Neural Network (CNN) based face detector, named FaceHunter for simplicity. The main idea is to achieve high detection accuracy while producing highly reliable face boxes; reliable box output is very helpful for further face image analysis. To reach this goal, we design a deep CNN with a multi-task loss: one term discriminates face from non-face, and the other performs face-box regression. An adaptive pooling layer is added before the fully connected layers to make the network adaptive to variable candidate proposals, and truncated SVD is applied to compress the parameters of the fully connected layers. To further speed up the detector, candidate proposals are generated directly from the convolutional feature map using a Region Proposal Network (RPN). The proposed FaceHunter is evaluated on the AFW, FDDB, and Pascal Faces datasets, and extensive experiments demonstrate its strong performance against several state-of-the-art detectors.
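The multi-task loss described above (classification plus box regression) is commonly written as cross-entropy plus smooth-L1. The sketch below assumes that standard form; the paper's exact loss weights and box parameterization are not given in the abstract.

```python
import numpy as np

def smooth_l1(pred, target):
    """Smooth-L1 (Huber-style) box regression loss, summed over coordinates."""
    d = np.abs(pred - target)
    return float(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5).sum())

def multitask_loss(cls_logits, label, box_pred, box_target, lam=1.0):
    """Joint detector loss sketch: numerically stable softmax cross-entropy
    for face vs. non-face, plus a lambda-weighted smooth-L1 box term."""
    z = cls_logits - cls_logits.max()          # stabilize the softmax
    ce = float(np.log(np.exp(z).sum()) - z[label])
    return ce + lam * smooth_l1(box_pred, box_target)
```

With a perfectly predicted box the regression term vanishes and only the classification cross-entropy remains.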

6.
Convolutional neural networks (CNNs) with large model sizes and heavy computation are difficult to deploy on embedded systems such as smartphones or AI cameras. In this paper, we propose a novel structured pruning method, termed structured feature sparsity training (SFST), to speed up inference and reduce the memory usage of CNNs. Unlike existing pruning methods that require multiple iterations of pruning and retraining to ensure stable performance, SFST only needs to fine-tune the pretrained model with additional regularization on the less important features and then prune them; no repeated pruning and retraining is needed. SFST can be applied to a variety of modern CNN architectures, including VGGNet, ResNet, and MobileNetV2. Experimental results on the CIFAR, SVHN, ImageNet, and MSTAR benchmark datasets demonstrate the effectiveness of our scheme, which achieves superior performance over state-of-the-art methods.
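Structured pruning removes whole filters rather than individual weights. A minimal sketch of the final pruning step, ranking filters by L1 norm, is shown below; the SFST paper's actual importance criterion (sparsity-trained feature regularization) is more involved, so this is illustrative only.

```python
import numpy as np

def prune_filters(weights, keep_ratio=0.5):
    """Structured pruning sketch: rank whole conv filters (axis 0) by their
    L1 norm and keep only the top fraction, returning the pruned weight
    tensor and the kept filter indices."""
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    k = max(1, int(round(keep_ratio * len(norms))))
    keep = np.sort(np.argsort(norms)[-k:])   # indices of the k largest norms
    return weights[keep], keep
```

Because entire filters are dropped, the pruned layer stays a dense tensor and needs no sparse-matrix support at inference time, which is what makes structured pruning hardware-friendly.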

7.
Computer-empowered detection of possible faults in Heating, Ventilation and Air-Conditioning (HVAC) subsystems, e.g., chillers, is one of the most important applications of Artificial Intelligence (AI) integrated with the Internet of Things (IoT). Such a cyber-physical system greatly enhances the safety and security of working facilities, saving time and energy and protecting human health. Under the current trends of smart building design and energy management optimization, Automated Fault Detection and Diagnosis (AFDD) of chillers integrated with IoT is in high demand. Recent studies show that standard machine learning techniques, such as Principal Component Analysis (PCA), Support Vector Machines (SVMs) and tree-structured algorithms, capture various chiller faults with high accuracy. With the fast development of deep learning, Convolutional Neural Networks (CNNs) have been widely and successfully applied to many fields; for chiller AFDD, however, few existing works adopt CNNs and their extensions in the feature extraction and classification processes. In this study, we propose to perform chiller AFDD using a CNN-based approach, which has two distinct advantages over existing machine learning-based chiller AFDD methods. First, it does not require a separate feature selection/extraction process: since CNNs are known for their feature extraction capability, feature extraction and classification are merged, leading to a neater AFDD framework than traditional approaches. Second, classification accuracy is significantly improved compared with traditional methods.

8.
Image source identification is important for verifying the origin and authenticity of digital images. However, when images are altered by post-processing, the performance of existing source verification methods may degrade. In this paper, we propose a convolutional neural network (CNN) to address this problem. Specifically, we present a theoretical framework for different tampering operations to determine whether a single operation affects the photo response non-uniformity (PRNU) contained in images, and we divide these operations into two categories: non-influential and influential. Moreover, images altered by a combination of non-influential and influential operations are equivalent to images that have undergone only a single influential operation. To make the introduced CNN robust to both categories of operation, we define a multi-kernel noise extractor that consists of a high-pass filter and three parallel convolution filters of different sizes. The features generated by the parallel convolution layers are then fed to subsequent convolutional layers for further feature extraction. The experimental results demonstrate the effectiveness of our method.
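The high-pass filtering step of a noise extractor can be sketched with a small residual kernel. The kernel below is a common second-order high-pass kernel used in forensic noise extraction, chosen here for illustration; the paper's actual filter and the three parallel convolution sizes are not specified in the abstract.

```python
import numpy as np

# An illustrative second-order high-pass kernel: its coefficients sum to
# zero, so smooth image content is suppressed and noise residue remains.
HIGH_PASS = np.array([[-1.0,  2.0, -1.0],
                      [ 2.0, -4.0,  2.0],
                      [-1.0,  2.0, -1.0]])

def conv2d_valid(img, kernel):
    """Plain 2-D valid correlation, enough to apply the high-pass filter."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out
```

Because the kernel annihilates constant and linear intensity patterns, the filter response on smooth regions is zero, which is exactly the property that isolates sensor noise such as PRNU from scene content.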

9.
Although convolutional neural networks (CNNs) achieve better denoising performance than traditional image denoising methods, an important issue has not been well resolved: the residual image, learned as the difference between noisy and clean image pairs, contains abundant image detail, resulting in a serious loss of detail in the denoised image. In this paper, in order to relearn the lost detail, a mathematical model is derived from a minimization problem and an end-to-end detail-retaining CNN (DRCNN) is proposed. Unlike most CNN-based denoising methods, DRCNN focuses not only on image denoising but also on the integrity of high-frequency image content. DRCNN needs fewer parameters and less storage space, and therefore has better generalization ability. Moreover, DRCNN can adapt to different image restoration tasks such as blind image denoising, single image super-resolution (SISR), blind deblurring, and image inpainting. Extensive experiments show that DRCNN outperforms several classic and recent methods.
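The residual-learning formulation underlying such denoisers is simple: the network predicts the noise map, and the denoised image is the noisy input minus that prediction. The sketch below assumes a given residual predictor; `residual_net` is a placeholder, not DRCNN itself.

```python
import numpy as np

def residual_denoise(noisy, residual_net):
    """Residual learning sketch: the network predicts the noise residual,
    which is subtracted from the noisy input to give the clean estimate.
    residual_net is any callable mapping an image to a residual map."""
    return noisy - residual_net(noisy)
```

With an oracle predictor that returns the true noise, the clean image is recovered exactly, which is the ideal the residual network is trained toward.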

10.
To address the difficulty of feature extraction and the narrow applicability of existing network steganalysis algorithms, this paper proposes a network steganalysis method based on a convolutional neural network. The network data stream is preprocessed so that all packets become matrices of identical size, preserving feature integrity as much as possible; heterogeneous convolutions are used for feature extraction, reducing the model's computation and parameter count and speeding up convergence; and pooling layers are removed to improve training efficiency. Compared with traditional network steganalysis methods, the model extracts data features automatically and can recognize multiple network steganography algorithms.

11.
Automatic image annotation is one of the most important challenges in computer vision and is critical to many real-world research efforts and applications. In this paper, we focus on large-scale image annotation with deep learning. First, for existing image data, especially web images, many of the labels are inaccurate or imprecise. We propose a Multitask Voting (MV) method that improves the accuracy of the original annotations to a certain extent, thereby enhancing the training of the model. Second, the MV method adaptively determines the number of labels, whereas most existing methods pre-specify the number of tags to select. Additionally, a large-scale image annotation model, MVAIACNN, is constructed on a convolutional neural network. Finally, we evaluate performance on the MIRFlickr25K and NUS-WIDE datasets and compare with other methods, demonstrating the effectiveness of MVAIACNN.

12.
Previous text sentiment analysis models ignore edge information in the text, let pooling layers destroy the sequential features of the text, and fall short in feature extraction and in identifying key information. To further improve sentiment analysis, this paper proposes DCNN-BiGRU-Att, a text sentiment analysis model that combines an attention mechanism with a Dynamic Convolutional Neural Network (DCNN) and a Bi-directional Gated Recurrent Unit (BiGRU). First, wide convolution kernels extract the edge features of the text, and dynamic k-max pooling preserves the relative positional order of features. Second, a parallel hybrid DCNN-BiGRU structure avoids partial feature loss and retains both local features and global contextual information, improving the model's feature extraction ability. Finally, an attention mechanism introduced after feature fusion acts globally and improves the model's ability to identify key information. Compared with several deep learning models on the public MR and SST-2 datasets, accuracy improves by 1.27% and 1.07% respectively, demonstrating the model's effectiveness.
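The k-max pooling step mentioned above can be sketched directly: keep the k largest activations per feature row while preserving their original order, which is what lets the pooled output retain relative positional information.

```python
import numpy as np

def kmax_pooling(seq, k):
    """k-max pooling sketch: along the last axis, keep the k largest values
    but emit them in their original sequence order (not sorted by value)."""
    idx = np.sort(np.argsort(seq, axis=-1)[..., -k:], axis=-1)
    return np.take_along_axis(seq, idx, axis=-1)
```

In the dynamic variant used by DCNNs, k is computed per layer from the sentence length; the fixed-k function above shows only the core selection step.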

13.
With the rapid development of the mobile Internet and digital technology, people increasingly share pictures on social networks, and the number of online images has exploded. Retrieving similar images from large-scale collections has long been a central problem in image retrieval, and the choice of image features largely determines retrieval performance. Compared with traditional feature extraction methods, convolutional neural networks (CNNs), with their many hidden layers, have more complex structures and stronger feature learning and expression abilities. Global CNN features, however, cannot effectively describe local details in retrieval tasks, so a strategy of aggregating low-level CNN feature maps to generate local features is proposed: high-level CNN features capture semantic information, low-level features capture local detail, and the model's increasingly abstract representations from low to high layers are exploited together. This paper presents a probabilistic semantic retrieval algorithm, proposes a CNN-based probabilistic semantic hash retrieval method, and designs a new end-to-end supervised learning framework that learns semantic features and hash features simultaneously to achieve fast image retrieval. With the convolutional network, the error rate on the test set is reduced to 14.41%. On three open image libraries, Oxford, Holidays, and ImageNet, the method is compared with traditional SIFT-based retrieval algorithms and other CNN-based image retrieval algorithms; the experimental results show that the proposed algorithm is superior to the compared algorithms in both overall retrieval quality and retrieval time.
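The hash-retrieval step common to such methods binarizes real-valued deep features by sign and compares codes by Hamming distance; the sketch below shows that generic step, not this paper's specific probabilistic semantic hashing.

```python
import numpy as np

def hash_code(features):
    """Binarize real-valued deep features into hash bits by sign, a common
    final step in deep hashing pipelines."""
    return (np.asarray(features) > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary codes: count of differing bits."""
    return int(np.count_nonzero(a != b))
```

Small perturbations of the features that do not flip any sign leave the code, and thus the Hamming distance, unchanged, which is what makes bitwise retrieval both fast and fairly stable.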

14.
Hinton et al.'s capsule network model is improved by modifying the dynamic routing and the squash function. Data augmentation is applied to enlarge the dataset, which to some extent prevents overfitting. Experiments show that the improved capsule network is structurally simpler and clearly more efficient than the unmodified model. Building on the improved capsule network, a model combining it with a convolutional neural network is proposed; this model reaches a training accuracy of 97.56% and an evaluation accuracy of 88%.
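The squash function at the heart of capsule networks maps a vector's norm into [0, 1) while keeping its direction, so the norm can be read as an existence probability. Below is the standard form from the original capsule paper as a reference point; the improved variant this entry proposes is not specified in the abstract.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    """Standard capsule squash: scale vector v by ||v||^2/(1+||v||^2) of its
    unit direction, so short vectors shrink toward zero and long vectors
    approach (but never reach) unit length."""
    sq = (v ** 2).sum(axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)
```

During dynamic routing this nonlinearity is applied after each weighted sum of prediction vectors, and the resulting norms drive the routing coefficient updates.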

15.
For real-world simulation, terrain models must combine various types of material and texture information when reconstructing terrain for three-dimensional numerical simulation. However, constructing such models with conventional methods often involves high costs in both manpower and time. This study therefore used a convolutional neural network (CNN) architecture to classify materials in multispectral remote sensing images and so simplify the construction of future models. Visible light (RGB), near-infrared (NIR), normalized difference vegetation index (NDVI), and digital surface model (DSM) images were examined. This paper proposes the robust U-Net (RUNet) model, which integrates multiple CNN architectures, for material classification. Based on an improved U-Net architecture combined with the shortcut connections of the ResNet model, RUNet preserves the features extracted by shallow layers. The architecture is divided into an encoding layer and a decoding layer: the encoding layer comprises 10 convolutional layers and 4 pooling layers; the decoding layer contains 4 upsampling layers, 8 convolutional layers, and 1 classification convolutional layer. The material classification process involved training and testing the RUNet model. Because remote sensing images are large, the training process randomly cuts same-sized subimages from the training set and feeds them to RUNet. To account for the spatial information of the material, the test process cuts multiple test subimages from the test set through mirror padding and overlapping cropping; RUNet then classifies the subimages, and the subimage classification results are finally merged back into the original test image. The aerial image labeling dataset of the National Institute for Research in Digital Science and Technology (Inria, abbreviated from the French Institut national de recherche en sciences et technologies du numérique) was used, together with its configured variant (Inria-2) and a dataset from the International Society for Photogrammetry and Remote Sensing (ISPRS). Material classification was performed with RUNet, the effects of mirror padding and overlapping cropping were analyzed, and the impact of subimage size on classification performance was assessed. The Inria dataset achieved the best results: after morphological optimization of RUNet, the overall intersection over union (IoU) and classification accuracy reached 70.82% and 95.66%, respectively. On Inria-2, the IoU and accuracy were 75.5% and 95.71% after classification refinement. Although the overall IoU and accuracy were 0.46% and 0.04% lower than those of the improved fully convolutional network, the training time of the RUNet model was approximately 10.6 h shorter. In the ISPRS experiment, the overall accuracy of the combined multispectral, NDVI, and DSM images reached 89.71%, surpassing that of RGB images alone: NIR and DSM provide more information on material features, reducing the likelihood of misclassification caused by similar color, shape, or texture in RGB images. Overall, RUNet outperformed the other models in the material classification of remote sensing images. The present findings indicate its potential for land use monitoring and disaster assessment, as well as for model construction in simulation systems.
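The mirror-padding and overlapping-cropping step used at test time can be sketched as follows. The tile size, overlap, and padding policy here are illustrative; the study's actual subimage sizes were one of the variables it analyzed.

```python
import numpy as np

def mirror_tiles(img, tile, overlap):
    """Test-time cropping sketch: mirror-pad the image so border pixels get
    real (reflected) context, then cut fixed-size tiles whose interiors
    overlap by `overlap` pixels, ready for per-tile classification."""
    pad = overlap // 2
    padded = np.pad(img, pad, mode="reflect")
    step = tile - overlap
    return [padded[i:i + tile, j:j + tile]
            for i in range(0, padded.shape[0] - tile + 1, step)
            for j in range(0, padded.shape[1] - tile + 1, step)]
```

Overlapping tiles mean every interior pixel is classified more than once; merging those overlapping predictions back into the full image is what smooths seams at tile boundaries.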

16.
The design, analysis, and application of a volumetric convolutional neural network (VCNN) are studied in this work. Although many CNNs have been proposed in the literature, their design is largely empirical. For the design of the VCNN, we propose a feed-forward k-means clustering algorithm to determine the filter number and size at each convolutional layer systematically. For the analysis of the VCNN, the cause of confusing classes in the output is explained by analyzing the relationship between the filter weights (also known as anchor vectors) of the last fully connected layer and the output. Furthermore, a hierarchical clustering method followed by a random forest classifier is proposed to boost classification performance among confusing classes. For the application of the VCNN, we examine the 3D shape classification problem and conduct experiments on the popular ModelNet40 dataset. The proposed VCNN offers state-of-the-art performance among all volume-based CNN methods.
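The "feed-forward k-means" design idea, clustering data patches and reading filters off the centroids, can be sketched with plain Lloyd iterations. This is a generic k-means on patch vectors for illustration; the paper's procedure for choosing filter counts and sizes is more elaborate.

```python
import numpy as np

def kmeans_filters(patches, k, iters=10, seed=0):
    """Filter-design sketch: run k-means (Lloyd's algorithm) on flattened
    training patches; the resulting centroids serve as data-driven
    candidate filters for a convolutional layer."""
    rng = np.random.default_rng(seed)
    centers = patches[rng.choice(len(patches), k, replace=False)].copy()
    for _ in range(iters):
        dists = ((patches[:, None, :] - centers[None]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = patches[assign == j].mean(axis=0)
    return centers
```

Because the centroids are computed in a single forward pass over the data, no backpropagation is needed to obtain the layer's filters, which is the sense in which the design is "feed-forward".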

17.
In recent years, convolutional neural networks have been widely applied to image super-resolution. To address the insufficient feature extraction, large parameter counts, and training difficulty of CNN-based super-resolution algorithms, this paper proposes a lightweight image super-resolution reconstruction algorithm based on a gated convolutional neural network (GCNN). First, shallow features are extracted from the original low-resolution image by convolution. Then, gated residual blocks (GRB) together with long and short residual connections extract image features thoroughly; this efficient structure also accelerates network training. The gated unit (GU) inside a GRB uses a regional self-attention mechanism to compute a weight for every point of the input feature map, and the gating weights are then multiplied element-wise with the input features to form the GU output. Finally, sub-pixel convolution and convolutional modules reconstruct the high-resolution image. Experiments on the Set14, BSD100, Urban100, and Manga109 datasets, compared against classic methods, show that the proposed algorithm achieves higher peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), and that the reconstructed images have sharper contour edges and richer detail.
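The gated-unit idea above reduces to: compute a per-position score, map it through a sigmoid into a (0, 1) gate, and multiply the gate element-wise with the input. The sketch below uses a single hypothetical scale `w` in place of the paper's regional self-attention scoring, so it shows only the gating mechanics.

```python
import numpy as np

def gated_unit(feat, w=1.0):
    """GU mechanics sketch: a hypothetical scalar w scores each feature value
    (standing in for the paper's regional self-attention), a sigmoid maps
    the scores to (0,1) gates, and gates multiply the input element-wise."""
    gate = 1.0 / (1.0 + np.exp(-feat * w))   # per-position gate in (0, 1)
    return feat * gate
```

Because the gate never reaches 0 or 1 exactly, every position keeps a nonzero gradient path, which helps such gated blocks train stably.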

18.
Image steganalysis based on convolutional neural networks (CNNs) has attracted great attention. However, existing networks pay little attention to regional features with complex texture, which costs the network discriminative learning ability. In this paper, we describe a new CNN designed to focus on useful features and improve detection accuracy for spatial-domain steganalysis. The proposed model consists of three modules: a noise extraction module, a noise analysis module, and a classification module. A channel attention mechanism, realized by embedding the SE (Squeeze-and-Excitation) module into the residual block, is used in the noise extraction and analysis modules. We then use convolutional pooling instead of average pooling to aggregate features. Experimental results show that the detection accuracy of the proposed model is significantly better than that of existing models such as SRNet, Zhu-Net, and GBRAS-Net. Compared with these models, our model also has better generalization ability, which is critical for practical application.
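The SE channel-attention block named above follows a standard recipe: global-average-pool each channel, pass the channel vector through two small fully connected layers (ReLU then sigmoid), and rescale each channel by the resulting weight. The weight shapes below are illustrative.

```python
import numpy as np

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation sketch for a (C, H, W) feature map:
    squeeze   -- global average pool per channel -> (C,)
    excite    -- FC1 + ReLU, then FC2 + sigmoid -> per-channel weights
    rescale   -- multiply each channel by its weight."""
    z = feat.mean(axis=(1, 2))                  # squeeze: (C,)
    s = np.maximum(z @ w1, 0.0)                 # excitation FC1 + ReLU
    s = 1.0 / (1.0 + np.exp(-(s @ w2)))         # FC2 + sigmoid, in (0, 1)
    return feat * s[:, None, None]
```

The bottleneck dimension of `w1`/`w2` (here 2 for 4 channels) is a hyperparameter; in the original SE design it is the channel count divided by a reduction ratio.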

19.
Super-resolution reconstruction plays an important role in video transmission and display. To guarantee the quality of the reconstructed video while still displaying it to users in real time, a fast video super-resolution reconstruction method using a compact convolutional neural network is proposed. The network's compactness shows in three respects. First, since input size directly affects the network's running speed, the network omits the pre-interpolation step of traditional methods, extracts features directly from multiple low-resolution input frames, and fuses the multi-dimensional feature channels. Second, to avoid zero gradients that would discard important video information, the Parametric Rectified Linear Unit (PReLU) is adopted as the activation function, and smaller filters are used to restructure the network for multi-layer mapping. Finally, a deconvolution layer added at the end of the network upsamples the features to reconstruct the video. Experimental results show that, compared with representative methods, the proposed method improves PSNR and SSIM by 0.32 dB and 0.016 on average, respectively, while reaching an average reconstruction speed of 41 frames per second on a GPU. The results indicate that the method can quickly reconstruct video of higher quality.
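Two building blocks mentioned in this and the neighboring super-resolution entries are PReLU and the sub-pixel (pixel-shuffle) upsampling that avoids pre-interpolation. Both are sketched below in numpy; this entry itself uses a deconvolution layer for upsampling, so the pixel-shuffle function illustrates the alternative used by sub-pixel methods.

```python
import numpy as np

def prelu(x, alpha=0.25):
    """PReLU: identity for positive inputs, learned slope alpha for
    negatives, so negative activations keep a nonzero gradient."""
    return np.where(x > 0, x, alpha * x)

def pixel_shuffle(x, r):
    """Sub-pixel upsampling: rearrange a (C*r*r, H, W) tensor into
    (C, H*r, W*r) by interleaving channel groups into spatial positions."""
    crr, h, w = x.shape
    c = crr // (r * r)
    return (x.reshape(c, r, r, h, w)
             .transpose(0, 3, 1, 4, 2)   # (c, H, r, W, r)
             .reshape(c, h * r, w * r))
```

Because all convolutions run at low resolution and the rearrangement itself costs nothing, sub-pixel upsampling is what lets such networks skip the expensive pre-interpolation step.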

20.
Relocated I-frames are a key type of abnormal inter-coded frame in double-compressed videos with shifted GOP structures. In this work, a frame-wise method for detecting relocated I-frames is proposed based on a convolutional neural network (CNN). The proposed detection framework uses a novel network architecture that begins with a preprocessing layer followed by a well-designed CNN. In the preprocessing layer, high-frequency component extraction is applied to eliminate the influence of diverse video content. To mitigate overfitting, several advanced structures, such as 1 × 1 convolutional filters and a global average-pooling layer, are carefully introduced into the CNN architecture. Publicly available YUV sequences were collected to construct a dataset of double-compressed videos with various coding parameters. According to the experiments, the proposed framework detects relocated I-frames more accurately than a well-known CNN structure (AlexNet) and a method based on average prediction residual.

