Insect sound feature recognition method based on three-dimensional convolutional neural network
Citation: WAN Yongjing, WANG Bowei, LOU Dingfeng. Insect sound feature recognition method based on three-dimensional convolutional neural network[J]. Journal of Computer Applications (计算机应用), 2019, 39(9): 2744-2748.
Authors: WAN Yongjing (万永菁), WANG Bowei (王博玮), LOU Dingfeng (娄定风)
Affiliations: 1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China;
2. Shenzhen Customs, Shenzhen, Guangdong 518045, China
Funding: National Natural Science Foundation of China (61872143); National Undergraduate Training Program for Innovation and Entrepreneurship (201810251064).
Received: 2019-03-22
Revised: 2019-05-24
Abstract: Quarantine inspection for wood-boring insects in imported timber is an important customs task, but existing insect sound detection algorithms suffer from low accuracy and poor robustness. To address these problems, an insect sound detection method based on a Three-Dimensional Convolutional Neural Network (3D CNN) was proposed to recognize insect sound features. Firstly, the original insect audio was pre-processed by overlapped framing, and the Short-Time Fourier Transform (STFT) was applied to obtain the spectrogram of the audio. Then, the spectrogram was fed into a 3D CNN consisting of three convolutional layers, which determined whether insect sound features were present in the audio. The network was trained and tested with inputs of different framing lengths. Finally, performance was analyzed using accuracy, F1 score and the ROC curve as evaluation metrics. The experiments showed that training and testing results were best when the overlapped framing length was 5 s. In that setting, the 3D CNN model achieved an accuracy of 96.0% and an F1 score of 0.96 on the test set, and its accuracy was nearly 18% higher than that of a Two-Dimensional Convolutional Neural Network (2D CNN) model. These results indicate that the proposed method can accurately extract insect sound features from audio signals and complete the wood-borer identification task, providing an engineering solution for customs inspection and quarantine.

Keywords: Three-Dimensional Convolutional Neural Network (3D CNN); Short-Time Fourier Transform (STFT); spectrogram; insect sound detection; acoustic signal processing
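
The pre-processing described in the abstract (overlapped framing followed by an STFT spectrogram) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `overlapped_segments` and `stft_magnitude`, the Hann window, and the window and hop lengths are all assumptions, since the abstract gives only the 5 s segment length and no sampling rate or STFT parameters. A naive DFT is used so that the example needs only the standard library; a real pipeline would use an FFT.

```python
import cmath
import math


def overlapped_segments(samples, segment_len, hop_len):
    """Split a sequence into fixed-length segments; neighbours overlap by segment_len - hop_len."""
    if segment_len <= 0 or hop_len <= 0:
        raise ValueError("segment_len and hop_len must be positive")
    last_start = len(samples) - segment_len
    return [samples[s:s + segment_len] for s in range(0, last_start + 1, hop_len)]


def stft_magnitude(samples, window_len, hop_len):
    """Magnitude spectrogram: one list of window_len // 2 + 1 bins per frame."""
    # Hann window (an assumption; the abstract does not name a window function).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_len)
              for n in range(window_len)]
    spectrogram = []
    for frame in overlapped_segments(samples, window_len, hop_len):
        tapered = [x * w for x, w in zip(frame, window)]
        bins = []
        for k in range(window_len // 2 + 1):
            # Naive DFT of the tapered frame at frequency bin k.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / window_len)
                      for n, x in enumerate(tapered))
            bins.append(abs(acc))
        spectrogram.append(bins)
    return spectrogram


# Toy usage: a pure tone that falls exactly on bin 4 of a 16-point window.
tone = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
spec = stft_magnitude(tone, window_len=16, hop_len=8)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)  # prints: 7 9 4
```

For the 5 s segments reported as best in the abstract, `segment_len` would be five times the sampling rate of the recording; the amount of overlap between segments is not stated in the abstract and would have to be chosen.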
This article is indexed in Wanfang Data (万方数据) and other databases.
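
The three evaluation metrics named in the abstract (accuracy, F1 score and the ROC curve) can be computed for a binary "insect sound present / absent" classifier roughly as follows. This is a minimal sketch with hypothetical helper names; the labels and scores below are made-up toy values, not data or results from the paper, and the ROC helper assumes both classes occur in `y_true`.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_points(y_true, scores):
    """(FPR, TPR) pairs from sweeping the decision threshold, highest score first."""
    pos = sum(y_true)
    neg = len(y_true) - pos  # assumes pos > 0 and neg > 0
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve, by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data: 1 = insect sound present, 0 = absent; scores are classifier outputs.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)            # 5 of 6 correct
f1 = f1_score(y_true, y_pred)             # tp=3, fp=1, fn=0 gives 6/7
roc_auc = auc(roc_points(y_true, scores)) # 8/9 for this toy data
print(round(acc, 3), round(f1, 3), round(roc_auc, 3))  # prints: 0.833 0.857 0.889
```

Accuracy and F1 depend on the chosen threshold (0.5 here, an arbitrary choice), whereas the ROC curve summarizes behaviour across all thresholds, which is one reason to report it alongside the other two.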