Underwater target recognition is a key technology for underwater acoustic countermeasures. Classifying and recognizing underwater targets from their noise information has long been a hot topic in the field of underwater acoustic signals. In this paper, a deep learning model is applied to underwater target recognition. Improved anti-noise Power-Normalized Cepstral Coefficients (ia-PNCC) are proposed, based on PNCC applied to underwater noise. Multitaper and normalized Gammatone filter banks are applied to improve the anti-noise capacity. The method is combined with a convolutional neural network to recognize underwater targets. Experimental results show that the acoustic features produced by ia-PNCC have lower noise and are well suited to underwater target recognition using a convolutional neural network. Compared with combining a convolutional neural network with a single acoustic feature, such as MFCC (Mel-scale Frequency Cepstral Coefficients) or LPCC (Linear Prediction Cepstral Coefficients), combining ia-PNCC with a convolutional neural network offers better accuracy for underwater target recognition.

Heart disease is common and high-risk, and it seriously threatens people's health. In the era of the Internet of Things (IoT), smart medical devices are of practical value to medical workers and patients because they can assist in diagnosing disease. Research on real-time diagnosis and classification algorithms for arrhythmia can therefore help improve diagnostic efficiency. In this paper, we design an automatic arrhythmia classification model based on a Convolutional Neural Network (CNN) and an Encoder-Decoder model. The model uses Long Short-Term Memory (LSTM) to account for the influence of time-series features on the classification results, and it is trained and tested on the MIT-BIH arrhythmia database. In addition, a Generative Adversarial Network (GAN) is adopted for data balancing to address the class imbalance problem. Simulation results show that, for inter-patient arrhythmia classification, the hybrid model combining the CNN and the Encoder-Decoder model has the best classification accuracy, reaching 94.05%. Its advantage is most pronounced for supraventricular ectopic beats (class S) and fusion beats (class F).

张立峰  王智  吴思橙 《计量学报》2022,43(10):1306-1312
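A flow-regime identification method for gas-liquid two-phase flow in vertical pipes is proposed, based on a convolutional neural network (CNN) and a gated recurrent unit (GRU). The method takes images reconstructed by an electrical resistance tomography (ERT) system, pads them, applies the discrete cosine transform (DCT), computes the difference between the maximum and minimum DCT coefficients, and uses data spanning a fixed number of frames as the network input to identify the flow regime. The influence of input sequence length on the classification accuracy of the CNN-GRU, CNN, and GRU networks is analyzed, and the optimal input vector dimensions are determined to be 60, 65, and 50, respectively. The three networks are trained and tested on experimental data. The results show that the CNN-GRU network has the highest classification accuracy, with an average flow-regime identification accuracy of up to 99.40%.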
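The per-frame feature described in this abstract (pad the reconstructed image, take its DCT, and keep the difference between the largest and smallest coefficient) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the orthonormal DCT-II, the zero padding on the right and bottom, and the function names are all assumptions.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (normalization is an assumption)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(img):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def pad_to_square(img, size, fill=0.0):
    """Pad on the right and bottom to size x size (hypothetical padding scheme)."""
    padded = [list(r) + [fill] * (size - len(r)) for r in img]
    padded += [[fill] * size for _ in range(size - len(img))]
    return padded


def dct_range_feature(img, size):
    """Difference between the maximum and minimum DCT coefficient of one frame."""
    coeffs = dct_2d(pad_to_square(img, size))
    flat = [c for row in coeffs for c in row]
    return max(flat) - min(flat)


def sequence_features(frames, size):
    """One scalar per frame; a window of these scalars would form the network input."""
    return [dct_range_feature(f, size) for f in frames]
```

Under these assumptions, a window of 60 consecutive frames would give a 60-element input vector, matching the input dimension the abstract reports for the CNN-GRU network.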

This paper presents a handwritten document recognition system based on convolutional neural networks. Handwritten document recognition is rapidly attracting researchers' attention because of its promise as an assistive technology for visually impaired users. It is also useful for automatic data entry systems. For the proposed system, a dataset of English handwritten character images was prepared. The system was trained on a large set of sample data and tested on sample images of user-defined handwritten documents. In this research, multiple experiments yielded good recognition results. The proposed system first performs image pre-processing to prepare the data for training a convolutional neural network. The input document is then segmented into lines, words, and characters. The system achieves an accuracy of up to 86% for character segmentation. The segmented characters are then passed to a convolutional neural network for recognition. The recognition and segmentation technique proposed in this paper gives acceptably accurate results on the given dataset. The proposed approach reaches an accuracy of up to 93% during convolutional neural network training and a slightly lower accuracy of 90.42% on validation.

Handwritten character recognition systems are now used in many areas of daily life, including shopping malls, banks, and educational institutes. Urdu is the national language of Pakistan and the fourth most spoken language in the world. However, recognizing Urdu handwritten characters remains challenging owing to their cursive nature. Our paper presents a Convolutional Neural Network (CNN) model for Urdu handwritten alphabet recognition (UHAR) of offline and online characters. Our research contributes an Urdu handwritten dataset (UHDS) to support future work in this field. For offline systems, optical readers are used to extract the alphabets, while diagonal-based extraction methods are implemented in online systems. Our research also addresses the lack of comprehensive, standard Urdu alphabet datasets for research on Urdu text recognition. To this end, we collected 1000 handwritten samples for each alphabet, for a total of 38000 samples from writers aged 12 to 25, to train our CNN model using online and offline mediums. We then carried out detailed character recognition experiments, as reported in the results. The proposed CNN model outperformed previously published approaches.

Vehicle type classification is considered a central part of an intelligent traffic system. In recent years, deep learning has played a vital role in object detection and many other computer vision tasks. By learning high-level deep features and semantics, deep learning offers powerful tools for addressing the problems of traditional architectures built on handcrafted feature extraction. Unlike algorithms that use handcrafted visual features, a convolutional neural network can automatically learn good features for vehicle type classification. This study develops an optimized automatic surveillance and auditing system to detect and classify vehicles of different categories. Transfer learning is used to learn features quickly from a small number of training images of vehicle frontal views. The proposed system employs extensive data augmentation for effective training while avoiding the problem of data shortage. To capture rich and discriminative information about vehicles, the convolutional neural network is fine-tuned for vehicle type classification using the augmented data. The network extracts feature maps from the entire dataset and generates a label for each object (vehicle) in an image, which supports vehicle type detection and classification. Experimental results on a public dataset and our own dataset demonstrate that the proposed method is effective in detecting and classifying different types of vehicles. The proposed model achieves 96.04% accuracy on vehicle type classification.

With the development of deep learning and Convolutional Neural Networks (CNNs), the accuracy of automatic food recognition based on visual data has improved significantly. Some studies have shown that the deeper the model, the higher the accuracy. However, very deep neural networks are prone to overfitting and consume huge computing resources. In this paper, a new classification scheme is proposed for automatic food-ingredient recognition based on deep learning. We construct a combinational convolutional neural network (CBNet) with a subnet merging technique. First, two different neural networks are used to learn features of interest. Then, a feature fusion component aggregates the features from the subnetworks and extracts richer and more precise features for image classification. To learn more complementary features, corresponding fusion strategies are also proposed, including auxiliary classifiers and hyperparameter settings. Finally, CBNet based on the well-known VGGNet, ResNet, and DenseNet is evaluated on a dataset of 41 major categories of food ingredients with 100 images per category. Theoretical analysis and experimental results demonstrate that CBNet achieves promising accuracy for multi-class classification and improves the performance of convolutional neural networks.

To counter presentation attacks on iris recognition systems, an iris liveness detection scheme based on a batch-normalized convolutional neural network (BNCNN) is proposed to improve the reliability of iris authentication. A BNCNN architecture with eighteen layers is constructed to distinguish genuine irises from fake ones; it comprises convolutional layers, batch-normalization (BN) layers, ReLU layers, pooling layers, and fully connected layers. The iris image is first preprocessed by iris segmentation and normalized to 256×256 pixels, and the iris features are then extracted by the BNCNN. With these features, the decision-making layer determines whether the iris is genuine or fake. Batch normalization is used in the BNCNN to avoid overfitting and vanishing gradients during training. Extensive experiments are conducted on three classical databases: the CASIA Iris Lamp database, the CASIA Iris Syn database, and the Ndcontact database. The results show that the proposed method can effectively extract micro-texture features of the iris and achieves higher detection accuracy than some typical iris liveness detection methods.

The COVID-19 pandemic poses an additional serious public health threat due to little or no pre-existing human immunity, and developing a system to identify COVID-19 in its early stages will save millions of lives. This study applied support vector machine (SVM), k-nearest neighbor (K-NN) and deep learning convolutional neural network (CNN) algorithms to classify and detect COVID-19 using chest X-ray radiographs. To test the proposed system, chest X-ray radiographs and CT images were collected from different standard databases, which contained 95 normal images, 140 COVID-19 images and 10 SARS images. Two scenarios were considered to develop a system for predicting COVID-19. In the first scenario, the Gaussian filter was applied to remove noise from the chest X-ray radiograph images, and then the adaptive region growing technique was used to segment the region of interest from the chest X-ray radiographs. After segmentation, a hybrid feature extraction composed of 2D-DWT and gray level co-occurrence matrix was utilized to extract the features significant for detecting COVID-19. These features were processed using SVM and K-NN. In the second scenario, a CNN transfer model (ResNet 50) was used to detect COVID-19. The system was examined and evaluated through multiclass statistical analysis, and the empirical results of the analysis found significant values of 97.14%, 99.34%, 99.26%, 99.26% and 99.40% for accuracy, specificity, sensitivity, recall and AUC, respectively. Thus, the CNN model showed significant success; it achieved optimal accuracy, effectiveness and robustness for detecting COVID-19.
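The gray level co-occurrence matrix mentioned in the first scenario can be illustrated with a small pure-Python sketch. The pixel offset, the number of gray levels, and the two statistics (contrast and energy) are assumptions chosen for illustration, since the abstract does not say which GLCM settings or texture statistics were used.

```python
def glcm(img, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of gray levels at pixel offset (dx, dy).

    img is a 2-D list of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[img[y][x]][img[y2][x2]] += 1
                total += 1
    # Convert pair counts into joint probabilities.
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p[i][j]; larger when neighboring pixels differ more."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Sum of squared probabilities; larger for more uniform textures."""
    return sum(v * v for row in p for v in row)
```

In a pipeline like the one described, such statistics would be computed on the segmented region of interest and concatenated with the 2D-DWT features before being passed to the SVM or K-NN classifier.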

王胜  吕林涛  杨宏才 《包装工程》2019,40(11):203-211
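Objective: To address the high false-rejection rate of traditional machine inspection of defects in printed products. Methods: A printed-product defect detection system with a convolutional neural network as its control core is proposed. A convolutional neural network suitable for practical inspection is designed, along with the hardware structure of an online print quality inspection system. Results: The defect detection performance of convolutional neural networks with the same structure but different numbers of training iterations and learning rates was compared. The network achieved good recognition when the learning rate was below 0.01 and did not converge easily when the learning rate was above 0.05. More training iterations gave higher accuracy at the cost of longer training time. Subject to the speed and accuracy requirements, the defect inspection network for a particular printed product was set to 50 training iterations and a learning rate of 0.005, at which the recognition rate was 90%. Conclusion: Experiments show that the detection system has good defect recognition capability and relatively high accuracy in classifying defect types. The system has practical value.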

Images are a common medium in daily life and an important source of information for most people. Some people, however, edit or even tamper with images to deliberately deliver false information for various purposes. In digital forensics, it is therefore necessary to understand the manipulation history of an image, which requires verifying all manipulations that may have been applied to it. Among image editing manipulations, recoloring is widely used to adjust or repaint the colors in images. Color is an important part of the visual information an image delivers, so guaranteeing the correctness of color matters in digital forensics. Moreover, many image retouching and editing applications are equipped with a recoloring function, which enables ordinary people without image processing expertise to recolor images. Hence, to secure the color information of images, this paper proposes a recoloring detection method. The method is based on convolutional neural networks, which have become popular in recent years. Unlike a traditional linear classifier, the proposed method can be used for binary classification as well as multi-label classification. The classification performance of different structures of the proposed architecture is also investigated.

舒忠  郑波儿 《包装工程》2024,45(7):222-233
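Objective: To address problems in super-resolution image reconstruction models, namely poor coupling between functional units, incomplete extraction of image chrominance features, and weak control of super-resolution reconstruction distortion and of residuals in the sampling process. Methods: Dual activation functions are introduced into the convolutional neural network model to improve the compatibility of the connections between its functional units. A densely connected convolutional neural network is used to build the super-resolution distortion control unit, which performs convolutional compensation on each of four chrominance components. A residual interpolation function is applied in the upsampling unit, and super-resolution chrominance feature interpolation is carried out using the rules of the deep back-projection network. Results: The designed model cascades multiple internal convolution kernels, compensates for super-resolution chrominance distortion, and uses unified processing weights, ensuring that the model's internal units are well integrated. Conclusion: Experimental results verify that the proposed image reconstruction model is reliable, stable, and efficient.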

Calculating the semantic similarity of two sentences is an extremely challenging problem. We propose a solution based on convolutional neural networks (CNN) that uses semantic and syntactic features of sentences. The similarity score between two sentences is computed as follows. First, for a given sentence, two matrices are constructed, called the syntax model input matrix and the semantic model input matrix; one records syntactic features, and the other records semantic features. After experimenting with different arrangements for representing the syntactic and semantic features in the matrices, we adopt the most effective construction. Second, these two matrices are fed to two neural networks, called the sentence model and the semantic model, respectively. The convolution in both models is carried out from multiple perspectives. The outputs of the two models are combined into a vector, which is the representation of the sentence. Third, given the representation vectors of two sentences, their similarity score is computed by a layer in the CNN. Experimental results show that our algorithm (SSCNN) surpasses the performance of MPCNN, which is notably the best recent work using CNNs for sentence similarity computation. Compared with MPCNN, the convolution computation in SSCNN is considerably simpler. Based on these results, we suggest that further use of semantic and syntactic features has considerable potential to improve sentence similarity measurement in the future.

Existing segmentation and augmentation techniques for convolutional neural networks (CNNs) have produced remarkable progress in object detection. However, nominal accuracy and performance may degrade under photometric variations of images that are ignored in the training process, depending on the individual CNN algorithm. In this paper, we investigate the effect of photometric variations such as brightness and sharpness on different CNNs. We observe that random augmentation of images weakens performance unless the augmentation is restricted to weak limits of photometric variation. Our approach is supported by experimental results on the PASCAL VOC 2007 dataset with the object detection algorithms YOLOv3 (You Only Look Once), Faster R-CNN (Region-based CNN), and SSD (Single Shot Multibox Detector). Each CNN model shows performance loss as sharpness and brightness vary between −80% and 80%. Compared with random augmentation, a dataset augmented with weak photometric changes delivered high performance, but the suitable photometric augmentation range differs for each model. We also discuss some research questions that can guide further study. The results show the importance of adaptive augmentation for each individual CNN model in achieving robust object detection.

简川霞  陈鑫  林浩  张韬  王华明 《包装工程》2021,42(15):275-283
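Objective: Current print registration recognition methods rely on experience-based, hand-designed feature extraction. To address this, a convolutional neural network model that needs no manual extraction of image features is proposed for recognizing print registration status. Methods: Image augmentation is used to balance the imbalanced training set and increase the number of training images, improving the model's recognition accuracy. The structural parameters of a print registration recognition model based on the AlexNet architecture are designed, and the effects of batch size and base learning rate on model performance are analyzed. Results: The proposed method achieves an overall print registration recognition accuracy of 0.9860, a recall of 1.0000, and a geometric mean of classification accuracy of 0.9869. Conclusion: The proposed method extracts image features automatically and does not depend on hand-designed feature extraction. On the constructed dataset, its classification performance is better than that of the support vector machine method used in the experiments.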
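The three metrics reported here (overall accuracy, recall, and the geometric mean of classification accuracy) can be computed from a confusion matrix as in the sketch below. It assumes that the geometric mean is the common G-mean, that is, the geometric mean of per-class recalls, which is often used for imbalanced data; the abstract does not define it, and the counts in the usage note are made up for illustration.

```python
import math


def recall_per_class(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(sum(row) for row in confusion)
    return correct / total


def g_mean(confusion):
    """Geometric mean of the per-class recalls (assumed definition of G-mean)."""
    recalls = recall_per_class(confusion)
    return math.prod(recalls) ** (1.0 / len(recalls))
```

For a hypothetical two-class matrix `[[50, 0], [2, 48]]`, the first class has recall 1.0 and the second 0.96, so the G-mean is the square root of 0.96 while the overall accuracy is 0.98. A perfect recall on one class can therefore coexist with a G-mean below 1, as in the figures quoted in the abstract.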

Skin cancer is one of the most severe diseases, and medical imaging is among the main tools for cancer diagnosis. The images provide information on the evolutionary stage, size, and location of tumor lesions. This paper focuses on the classification of skin lesion images, using a framework of four experiments to analyze how well Convolutional Neural Networks (CNNs) distinguish different skin lesions. The CNNs are based on transfer learning and take advantage of ImageNet weights. In each experiment, different workflow stages are tested, including data augmentation and fine-tuning optimization. Three CNN models based on DenseNet-201, Inception-ResNet-V2, and Inception-V3 are proposed and compared using the HAM10000 dataset. The three models achieve accuracies of 98%, 97%, and 96%, respectively. Finally, the best model is tested on the ISIC 2019 dataset, showing an accuracy of 93%. The proposed CNN-based methodology is a helpful tool for accurately diagnosing skin cancer.

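To detect inkjet-printed code characters in real time during beverage can filling, a real-time detection method based on a convolutional neural network is proposed. The method first applies histogram equalization and Otsu thresholding to the captured image, then applies morphological dilation, extracts the code character region by connected-component area and corrects its rotation, and then splits the character region into individual characters using the projection method. A convolutional neural network is trained on the characters offline and then used for recognition during online detection. Experiments show that the method takes 46 ms on average per frame with an accuracy of 98.97%. Its speed and accuracy are high enough to meet the requirements of online real-time detection of code characters on industrial beverage cans.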
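The projection step of this pipeline, which splits a binarized and rotation-corrected character region into single characters, can be sketched as follows. It assumes a binary image stored as a 2-D list with 1 for ink and 0 for background, and it cuts at columns that contain no ink; the `min_width` filter is a hypothetical addition for suppressing specks.

```python
def vertical_projection(binary):
    """Number of foreground pixels in each column of a 0/1 image."""
    return [sum(col) for col in zip(*binary)]


def split_characters(binary, min_width=1):
    """Return (start, end) column spans of characters, end exclusive.

    A character is taken to be a maximal run of columns whose projection is
    non-zero; runs narrower than min_width are discarded as noise.
    """
    proj = vertical_projection(binary)
    spans = []
    start = None
    for x, value in enumerate(proj):
        if value > 0 and start is None:
            start = x  # entering a character
        elif value == 0 and start is not None:
            if x - start >= min_width:
                spans.append((start, x))
            start = None  # leaving a character
    # Close a character that touches the right edge of the image.
    if start is not None and len(proj) - start >= min_width:
        spans.append((start, len(proj)))
    return spans
```

Each span would then be cropped and resized before being passed to the convolutional neural network for recognition. Characters that touch each other after dilation would not be separated by this simple rule.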

The estimation of image resampling factors is an important problem in image forensics. Among resampling factor estimation methods, spectrum-based methods are among the most widely used and have attracted a lot of research interest. However, because of an inherent ambiguity, spectrum-based methods cannot distinguish upscaling from downscaling without prior information. In general, resampling leaves detectable traces in both the spatial domain and the frequency domain of a resampled image. First, the resampling process introduces correlations between neighboring pixels, so a set of periodic pixels that are correlated with their neighbors can be found in a resampled image. Second, the spectrum of a resampled image has distinct, strong peaks, while the spectrum of the original image has no clear peaks. Hence, in this paper, we propose a dual-stream convolutional neural network for estimating image resampling factors. One stream is the gray stream, which extracts resampling trace features directly from the rescaled images. The other is the frequency stream, which discovers the spectral differences between rescaled and original images. The features from the two streams are then fused into a feature representation that includes the resampling traces left in the spatial and frequency domains, which is fed into a softmax layer for resampling factor estimation. Experimental results show that the proposed method is effective for resampling factor estimation and outperforms some CNN-based methods.

Distributed Denial-of-Service (DDoS) attacks have caused great damage to networks in the big data environment. Existing detection methods are characterized by low computational efficiency, a high false alarm rate, and a high missed alarm rate. In this paper, we propose a DDoS attack detection method based on a network flow grayscale matrix feature and a multi-scale convolutional neural network (CNN). Based on the different characteristics of attack flows and normal flows in the IP protocol, a seven-tuple is defined to describe the network flow characteristics and is converted into a grayscale feature through binary encoding. Based on the network flow grayscale matrix feature (GMF), convolution kernels of different spatial scales are used to improve the accuracy of feature segmentation, and global and local features of the network flow are extracted. A DDoS attack classifier based on the multi-scale convolutional neural network is constructed. Experiments show that, compared with related methods, this method improves the robustness of the classifier and reduces the false alarm rate and the missed alarm rate.
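One way to read the conversion of the seven-tuple into a grayscale feature through binary encoding is sketched below. The abstract does not list the seven fields or the bit width, so the layout here is hypothetical: each of seven integer values becomes one row of fixed-width bits, and each bit becomes a black (0) or white (255) pixel.

```python
def to_bits(value, width):
    """Big-endian bit list of a non-negative integer, truncated to width bits."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


def flow_to_gray_rows(seven_tuple, width=16):
    """Map seven integer flow statistics to a 7 x width grayscale matrix.

    The field meanings and the default 16-bit width are assumptions made for
    illustration; they are not taken from the paper.
    """
    if len(seven_tuple) != 7:
        raise ValueError("expected exactly seven flow statistics")
    return [[255 * bit for bit in to_bits(v, width)] for v in seven_tuple]
```

Stacking such matrices over consecutive time windows would give a larger image for the multi-scale convolution kernels to process, although the actual construction used by the authors may differ.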

Deep learning techniques, particularly convolutional neural networks (CNNs), have exhibited remarkable performance in solving vision-related problems, especially in unpredictable, dynamic, and challenging environments. In autonomous vehicles, imitation-learning-based steering angle prediction is viable because CNNs can comprehend visual imagery. Researchers worldwide are currently focusing on the architectural design and hyperparameter optimization of CNNs to achieve the best results. The literature has shown the superiority of metaheuristic algorithms over manual tuning of CNNs. However, to the best of our knowledge, these techniques have yet to be applied to imitation-learning-based steering angle prediction. In this study, we therefore examine the bat algorithm and the particle swarm optimization algorithm for optimizing the CNN model and its hyperparameters for steering angle prediction. To validate the performance of each set of hyperparameters and architectural parameters, we used the Udacity steering angle dataset and obtained the best results with the following hyperparameter set: optimizer, Adagrad; learning rate, 0.0052; and nonlinear activation function, exponential linear unit. We found that deeper models show better results but require more training epochs and time than shallower ones. The results show the superiority of our approach of optimizing CNNs through metaheuristic algorithms over manual tuning. In-field testing was also performed using the model trained with the optimal architecture developed with our approach.

