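Intelligent recognition of underwater acoustic targets is an important part of making underwater acoustic equipment intelligent, and deep learning is one of the key techniques for achieving it. Such recognition often suffers from too few training samples because the available datasets are small. To address the poor generalization caused by overfitting on small datasets, and the inconsistent format of the two-dimensional spectrograms of the input acoustic signals, this paper proposes an underwater acoustic target recognition method based on the VGGish neural network model. The method uses the VGGish network as a feature extractor, adds a signal preprocessing module in front of the VGGish network, and designs a joint classifier based on traditional machine learning algorithms; together these measures resolve the overfitting problem and the inconsistent spectrogram format. Experiments show that the method achieves a recognition accuracy of 94.397% on the ShipsEar dataset, higher than the best accuracy of 90.977% obtained with the conventional pretrain-and-fine-tune approach, and that under the same conditions its model training time is only about 0.5% of that of the conventional pretrain-and-fine-tune approach, effectively improving both recognition accuracy and training speed.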

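To improve the accuracy of image-based object recognition, an RGB-D object recognition algorithm based on an improved two-stream convolutional recursive neural network (Re-CRNN) is proposed. RGB images are combined with depth optical information, and the two-stream convolutional neural network (CNN) is improved on the basis of residual learning: a top-level feature fusion unit is added, joint features are learned from the RGB and depth images, the extracted high-level features of the RGB and depth images are fused across channels, and then So...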

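In today's fiercely competitive market, with supermarket and store shelves crowded with goods, it has become very important for a product to jump out from the shelf into the consumer's field of view rather than drown in a colorful sea of merchandise. In other words, a product's packaging design should make it easy for consumers to identify the product quickly.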

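Based on the classification characteristics of different neural networks, this paper proposes combining a radial basis function network and a multilayer perceptron network into a composite basis network for the classification and recognition of underwater acoustic signals. Experiments show that the network's classification ability and its adaptability to targets not covered in training are better than those of the BP network and the RBF network.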

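Since the advent of intelligent transportation systems, seat belt detection for vehicle drivers and passengers has been a closely watched research topic. Using surveillance data from traffic checkpoints on urban roads, this paper studies a deep-learning-based seat belt detection algorithm for vehicle occupants that can accurately identify whether the driver is wearing a seat belt. Checkpoint images are manually annotated, and deep learning is used to train two detectors and one classifier, which together achieve fast seat belt localization and classification. The proposed method performs well on images collected at urban road checkpoints.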

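An improved hidden Markov model (HMM)/neural network hybrid classifier is proposed that combines the time-alignment ability of the HMM with the static discrimination ability of the neural network. It first uses a cyclic no-skip HMM to perform full-state segmentation of each test feature sequence, aligning the T-frame feature sequence, in order of temporal evolution, into an N-frame sequence of state averages. This sequence is then fed to an RBF network as the input vector for classification. Experimental results show that the classifier achieves better classification performance than a pure neural network classifier or a pure HMM classifier.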
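
The alignment step in the abstract above (T frames reduced to N state averages) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-frame state labels have already been produced by an HMM segmentation (for example Viterbi decoding), and the function name `state_average` and the toy data are hypothetical.

```python
from typing import List


def state_average(frames: List[List[float]], states: List[int],
                  n_states: int) -> List[List[float]]:
    """Average the feature frames assigned to each HMM state.

    frames:   T feature vectors, all of the same dimension.
    states:   T state labels in 0..n_states-1 (assumed to come from an
              HMM segmentation that is computed elsewhere).
    Returns n_states averaged vectors, one per state.
    """
    if len(frames) != len(states):
        raise ValueError("frames and states must have the same length")
    dim = len(frames[0])
    sums = [[0.0] * dim for _ in range(n_states)]
    counts = [0] * n_states
    for vec, s in zip(frames, states):
        counts[s] += 1
        for d, value in enumerate(vec):
            sums[s][d] += value
    averaged = []
    for s in range(n_states):
        if counts[s] == 0:
            raise ValueError(f"state {s} has no frames assigned")
        averaged.append([total / counts[s] for total in sums[s]])
    return averaged


# Toy example: T = 5 frames of dimension 2, segmented into N = 3 states.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
states = [0, 0, 1, 2, 2]
averages = state_average(frames, states, n_states=3)

# Concatenate the N averaged frames into one fixed-length input vector,
# which a static classifier such as an RBF network could consume.
rbf_input = [value for vec in averages for value in vec]
```

The point of the step is that sequences of any length T map to an input of fixed length N times the feature dimension, which is what a static classifier needs.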

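To reduce interference from noise and improve the accuracy of product image recognition, a product image recognition model based on a deep residual shrinkage network is proposed. The model incorporates a soft thresholding function and an attention mechanism into a deep residual network; the soft thresholding function sets to 0 the features that the attention mechanism identifies as unimportant, which reduces noise interference and improves recognition accuracy. For the experiments, a dataset covering 51 kinds of products was first collected with a web crawler and then expanded by image flipping, noise addition, and similar operations into a product database of 44066 images. The deep residual shrinkage network was then trained on the data and compared with a deep residual network and the SENet model, and some product images were tested. The results show that the deep residual shrinkage network not only improves product image recognition accuracy but also increases the running speed of the model.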
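
The soft thresholding function mentioned in the abstract above has a simple closed form, y = sign(x) · max(|x| − τ, 0). The sketch below illustrates it on plain Python floats. It assumes a fixed, hand-picked threshold `tau`; in a deep residual shrinkage network the threshold is produced by the attention sub-network, which is not modeled here. The function name and sample values are hypothetical.

```python
def soft_threshold(x: float, tau: float) -> float:
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Hypothetical feature values; tau is fixed here only for illustration.
features = [1.5, -0.25, 0.5, -2.0]
shrunk = [soft_threshold(v, tau=0.5) for v in features]
# Small-magnitude features (likely noise) are zeroed, while large ones
# are kept but reduced in magnitude by tau.
```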

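As the most direct medium connecting consumers and products, product packaging is responsible for conveying the emotion of the design and guiding consumers to buy. At any type of retail outlet, for a product to stand out on a crowded shelf, its packaging must have strong shelf impact.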

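The learning ability of single-hidden-layer feedforward neural networks is limited. In particular, as classifiers, they have difficulty learning and processing complex image information and the fine differences between images. Drawing on the idea of deep neural networks, this paper extends the single-hidden-layer neural network with matrix input to a multi-hidden-layer neural network, trains it with the conventional backpropagation algorithm, and gives the learning algorithm. Comparative experiments on several databases show that the proposed algorithm performs well.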

刘振  邱家兴  程玉胜 《声学技术》2019,38(4):459-463
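Structural features extracted from the harmonic clusters of the demodulation (Demodulation on Noise, DEMON) spectrum can be used to build templates for identifying the number of propeller blades. When a template matching algorithm is used for this task, it depends on a template library and confidence criteria, is subject to many constraints, and cannot discover missing templates. This paper proposes a method that applies a deep neural network (DNN) to propeller blade number recognition; the method uses the template library only when training the DNN, removing the dependence on the template library and confidence criteria during recognition. In addition, by extracting the misrecognized items, missing templates can be found and the template library can be supplemented. Tests on a large amount of measured data show that the deep neural network achieves a higher recognition accuracy and that the recognition process is simpler and more reliable.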


Lip reading is typically regarded as visually interpreting a speaker’s lip movements while they speak, that is, decoding text from the speaker’s mouth movements. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems to understand a speaker by capturing a video of the speaker and inputting it into the proposed model to obtain the corresponding subtitles. Using deep learning technologies makes it easier for users to extract a large number of different features, which can then be converted to probabilities of letters to obtain accurate results. Recently proposed methods for lip reading are based on sequence-to-sequence architectures that are designed for natural machine translation and audio speech recognition. However, in this paper, a deep convolutional neural network model called the hybrid lip-reading (HLR-Net) model is developed for lip reading from a video. The proposed model includes three stages, namely, pre-processing, encoder, and decoder stages, which produce the output subtitle. The inception, gradient, and bidirectional GRU layers are used to build the encoder, and the attention, fully-connected, and activation function layers are used to build the decoder, which performs the connectionist temporal classification (CTC). Compared with three recent models, namely, the LipNet model, the lip-reading model with cascaded attention (LCANet), and the attention-CTC (A-ACA) model, on the GRID corpus dataset, the proposed HLR-Net model achieves significant improvements, with a CER of 4.9%, WER of 9.7%, and Bleu score of 92% in the case of unseen speakers, and a CER of 1.4%, WER of 3.3%, and Bleu score of 99% in the case of overlapped speakers.
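
The CER and WER figures quoted above are conventionally computed as an edit distance divided by the reference length, over characters and words respectively. The sketch below shows that standard definition; it is not taken from the HLR-Net paper, the function names are hypothetical, and the sample sentence is only an illustrative command-style phrase.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence, hyp: Sequence) -> float:
    """Edit distance normalized by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance over whitespace-separated words."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over characters (spaces included)."""
    return error_rate(list(reference), list(hypothesis))


reference = "bin blue at f two now"
hypothesis = "bin blue at f too now"
sentence_wer = wer(reference, hypothesis)  # 1 wrong word out of 6
sentence_cer = cer(reference, hypothesis)  # 1 wrong character out of 21
```

Because one wrong character makes a whole word wrong, WER is typically higher than CER on the same output, which matches the pattern in the reported numbers.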


In general, we describe three different methods to select an appropriate distribution form: histograms, probability plots, and hypothesis tests. The life distribution is recognized by a neural network method. The relationship between the life distribution and the life data is described through the thresholds and weights of the neural network. The method is convenient to use. An example is presented to validate the method, and the results are satisfactory.

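To design products that meet consumer needs, a neural-network-based method for product image form design is adopted. The study first determines the perceptual (Kansei) vocabulary and the form design elements; on this basis, a BP neural network model is used to establish the relationship between the two. By analyzing and setting the encoding, input layer, output layer, hidden layer, activation function, and related parameters of the BP model, a simulation experiment of product perceptual image design is carried out, and finally the validity of the model is verified by testing. The method is studied in conjunction with the design of a folding bicycle, and the results show that it is correct and feasible.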

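To overcome the shortcomings of conventional template matching for character recognition, BP neural network technology is used to identify character features effectively and recognize characters quickly. C++ programming and the OpenCV computer vision library are used to reduce the complexity of implementing the system, achieving fast and accurate character recognition.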

As the amount of online video content is increasing, consumers are becoming increasingly interested in various product names appearing in videos, particularly in cosmetic-product names in videos related to fashion, beauty, and style. Thus, the identification of such products by using image recognition technology may aid in the identification of current commercial trends. In this paper, we propose a two-stage deep-learning detection and classification method for cosmetic products. Specifically, variants of the YOLO network are used for detection, where the bounding box for each given input product is predicted and subsequently cropped for classification. We use four state-of-the-art classification networks, namely ResNet, InceptionResNetV2, DenseNet, and EfficientNet, and compare their performance. Furthermore, we employ dilated convolution in these networks to obtain better feature representations and improve performance. Extensive experiments demonstrate that YOLOv3 and its tiny version achieve higher speed and accuracy. Moreover, the dilated networks marginally outperform the base models, or achieve similar performance in the worst case. We conclude that the proposed method can effectively detect and classify cosmetic products.

This paper presents a handwritten document recognition system based on convolutional neural networks. Handwritten document recognition is rapidly attracting researchers' attention because of its promise as an assistive technology for visually impaired users. The technology is also useful for automatic data entry systems. For the proposed system, a dataset of English handwritten character images was prepared. The system was trained on a large set of sample data and tested on sample images of user-defined handwritten documents. Multiple experiments in this research yield good recognition results. The system first performs image pre-processing to prepare data for training the convolutional neural network. The input document is then segmented at the line, word, and character levels; character segmentation reaches an accuracy of up to 86%. The segmented characters are then sent to the convolutional neural network for recognition. The proposed recognition and segmentation technique gives acceptably accurate results on the given dataset. The accuracy reaches up to 93% during convolutional neural network training and decreases slightly to 90.42% on validation.

The Convolutional Neural Network (CNN) is a widely used deep neural network. Compared with shallow neural networks, the CNN offers better performance and faster computation in some image recognition tasks. It can effectively avoid the problem of network training falling into local extrema. At present, CNNs have been applied in many different fields, including fault diagnosis, where they have improved the level and efficiency of diagnosis. In this paper, a two-stream convolutional neural network (TCNN) model is proposed. Based on the short-time Fourier transform (STFT) spectral and Mel Frequency Cepstrum Coefficient (MFCC) input characteristics of two-stream acoustic emission (AE) signals, an AE signal processing and classification system is constructed and compared with traditional recognition methods for AE signals and with traditional CNNs. The experimental results illustrate the effectiveness of the proposed model. Compared with a single-stream convolutional neural network and a simple Long Short-Term Memory (LSTM) network, the performance of the TCNN, which combines spatial and temporal features, is greatly improved, and the accuracy rate can reach 100% on the current database, which is 12% higher than that of the single-stream neural network.

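To address noise suppression in seismic exploration, a convolutional neural network model suited to classifying and recognizing seismic wavelets is constructed. First, the activation function, convolution kernel size, normalization layers, and other components of the model are designed; the constructed convolutional neural network is then used to extract features from time-frequency spectrograms of seismic signals; finally, different types of noisy seismic signals are classified and recognized. Experimental results show that the model has high classification and recognition rates and good anti-interference capability, ...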

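Objective: To establish a fast, accurate, and nondestructive method for examining and classifying plastic packing straps. Methods: Hyperspectral data of 52 plastic packing strap samples from different sources were collected at wavelengths of 350-990 nm; the sample data were smoothed with Savitzky-Golay filtering and reduced in dimension by principal component analysis. The extracted principal components were clustered by K-Means, and a radial basis function neural network (RBFNN) model and a BP neural network (BPNN) model were built on the basis of the clustering results. Results: The hyperspectral spectra of the strap samples differed markedly at 400-500 nm and 600-700 nm. Five principal components with initial eigenvalues greater than 1 were extracted, which together explain 96.633% of the original data. K-Means clustering divided the samples into 6 classes, with a Calinski-Harabasz index of 28.76. The RBFNN classification accuracy was 86.7% and the BPNN classification accuracy was 98.1%, so the BPNN classified better. Conclusion: The study shows that neural networks achieve high accuracy in classifying hyperspectral spectra and confirms the feasibility and scientific soundness of using hyperspectral imaging to distinguish and examine plastic packing strap evidence, providing public security agencies with a new examination method.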

安静  唐英杰  马鑫然 《包装工程》2021,42(3):246-251
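Objective: To address the incomplete coverage of defect types, low defect detection accuracy, and poor localization precision of current fabric inspection algorithms, a practical end-to-end algorithm for detecting defects in plain-colored fabric is proposed. Methods: The number of samples is first expanded through image augmentation; a Cascade-RCNN network with a Resnet50 backbone is then used, with deformable convolution, a feature fusion network, and an increased number of anchor boxes, to detect defects in plain-colored fabric. Results: Comparative experiments show that the algorithm can detect 20 kinds of fabric defects, with an accuracy of 97% in deciding whether a fabric is defective, a mean detection precision of 65% for defect localization, and an average time of 80 ms per sample. Conclusion: The algorithm effectively improves the accuracy and precision of fabric defect detection, covers defect classes more comprehensively, and provides defect locations and classes, meeting the needs of industrial production.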

