
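Music Genre Classification Based on Spectrogram Enhancement and Convolutional Broad Learning
Cite this article: LIU Wan-Jun, LI Yu-Meng, QU Hai-Cheng. Music Genre Classification Based on Spectrogram Enhancement and Convolutional Broad Learning[J]. Computer Systems & Applications, 2023, 32(10): 85-95.
Authors: LIU Wan-Jun (刘万军), LI Yu-Meng (李雨萌), QU Hai-Cheng (曲海成)
Affiliation: School of Software, Liaoning Technical University, Huludao 125105, China
Funding: General Program of the National Natural Science Foundation of China (42271409); Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LIKMZ20220699)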
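Abstract: To address the weak mining of music features from spectrograms and the complexity and long training time of deep learning classification models, a music genre classification model based on spectrogram enhancement and a convolutional broad learning system (CNNBLS) is designed. The model first enhances the Mel spectrogram using the SpecAugment method of randomly masking some frequency channels, then feeds the segmented Mel spectrograms to CNNBLS as input. The exponential linear unit (ELU) function is also incorporated into the convolutional layers of CNNBLS to improve classification accuracy. Compared with other machine learning network frameworks, CNNBLS achieves relatively high classification accuracy with little training time. In addition, CNNBLS can learn incremental data quickly. Experimental results show that the non-incremental CNNBLS model reaches a classification accuracy of 90.06% when trained on 400 music tracks, and the incremental model, Incremental-CNNBLS, reaches 91.53% after 400 more training tracks are added.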

Keywords: Mel spectrogram; broad learning; speech enhancement; music genre classification; exponential linear unit (ELU) function
Received: 2023-03-30
Revised: 2023-05-11

Music Genre Classification Based on Spectrogram Enhancement and CNNBLS
LIU Wan-Jun, LI Yu-Meng, QU Hai-Cheng. Music Genre Classification Based on Spectrogram Enhancement and CNNBLS[J]. Computer Systems & Applications, 2023, 32(10): 85-95.
Authors:LIU Wan-Jun  LI Yu-Meng  QU Hai-Cheng
Affiliation:School of Software, Liaoning Technical University, Huludao 125105, China
Abstract: To address the weak mining of music features from spectrograms, the complexity of deep learning classification models, and their long training time, a music genre classification model based on spectrogram enhancement and a convolutional neural network-based broad learning system (CNNBLS) is designed. The model first enhances the Mel spectrogram by randomly masking some frequency channels, as in SpecAugment, and then uses the segmented Mel spectrogram as the input of CNNBLS. At the same time, the exponential linear unit (ELU) function is fused into the convolutional layers of CNNBLS to improve classification accuracy. Compared with other machine learning network frameworks, CNNBLS achieves relatively high classification accuracy with little training time. In addition, CNNBLS can quickly learn incremental data. The experimental results show that the non-incremental CNNBLS model achieves a classification accuracy of 90.06% after training on 400 pieces of music, while the incremental model, Incremental-CNNBLS, achieves a classification accuracy of 91.53% after 400 more pieces of training data are added.
Keywords:Mel spectrogram|broad learning system (BLS)|speech enhancement|music genre classification (MGC)|exponential linear unit function (ELU)
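
The frequency-channel masking mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mask_frequency_channels`, its parameters, the fill value, and the toy 8-by-5 input are all assumptions made for illustration, and the spectrogram is represented as a plain list of rows (one row per Mel channel) to keep the example dependency-free.

```python
import random


def mask_frequency_channels(spec, max_width, num_masks=1, fill=0.0, rng=None):
    """Return a copy of `spec` with random bands of frequency channels masked.

    `spec` is a list of rows, one row per Mel frequency channel, each row
    holding one value per time frame. Names and defaults are hypothetical.
    """
    rng = rng or random.Random()
    n_channels = len(spec)
    masked = [row[:] for row in spec]  # copy so the input is left untouched
    for _ in range(num_masks):
        # Band width drawn uniformly from 0..max_width (0 means no masking).
        width = rng.randint(0, min(max_width, n_channels))
        # First masked channel, chosen so the band stays inside the spectrogram.
        start = rng.randint(0, n_channels - width)
        for ch in range(start, start + width):
            masked[ch] = [fill] * len(masked[ch])
    return masked


# Toy 8-channel x 5-frame "spectrogram" filled with ones (assumed data).
spec = [[1.0] * 5 for _ in range(8)]
augmented = mask_frequency_channels(spec, max_width=3, rng=random.Random(0))
```

Each mask blanks a contiguous band of channels across all time frames, so the model cannot rely on any single frequency range. In practice the same idea would be applied to real Mel spectrogram arrays before they are cut into segments.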
