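Time-frequency image features for acoustic scene classification
Cite this article: GAO Min, YIN Xue-fei, CHEN Ke-an. Time-frequency image features for acoustic scene classification [J]. Technical Acoustics, 2017, 36(5): 399-404.
Authors: GAO Min  YIN Xue-fei  CHEN Ke-an
Affiliations: School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China; School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China; School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
Funding: National Natural Science Foundation of China (11574249, 11074202)
Abstract: To address the problem of recognizing acoustic scenes from an audio stream, the audio signal is transformed with the constant-Q transform to obtain its time-frequency image, which is then filtered and smoothed. Two kinds of features are then extracted: histogram of gradients features, which describe the direction of change of the signal's spectral energy, and local binary pattern features, which capture the texture of the signal spectrum. These features are fed to a support vector machine classifier with a linear kernel, and classification experiments are carried out on data from different acoustic scenes. The results show that, compared with traditional time-frequency domain features and Mel-frequency cepstral coefficient features, the proposed features largely capture the discriminative information of a given acoustic scene and yield higher classification rates, and that the complementarity of the two features makes the combined feature perform best. The method offers a new approach to feature extraction for acoustic signals.

Keywords: acoustic scene  constant-Q transform  histogram of gradients  local binary pattern
Received: 2016/11/4
Revised: 2017/3/15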

Time-frequency representation based feature extraction for audio scene classification
GAO Min, YIN Xue-fei and CHEN Ke-an. Time-frequency representation based feature extraction for audio scene classification [J]. Technical Acoustics, 2017, 36(5): 399-404.
Authors: GAO Min, YIN Xue-fei and CHEN Ke-an
Affiliation: School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China; School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China; School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
Abstract: To recognize the audio scene of a complex environment from an audio stream, the constant-Q transform is used to obtain a time-frequency representation (TFR) of the signal. Because no prior knowledge of the signal and noise is available, mean filtering is used to smooth the TFR image. Histogram of gradients (HOG) features are then extracted from the TFR image; they reflect the local direction of variation (in both time and frequency) of the signal power spectrum. In addition, local binary pattern (LBP) features are extracted to capture the texture information of the signal. A support vector machine with a linear kernel is used as the classifier, and classification experiments are carried out on data from different acoustic scenes. Compared with classical audio features such as MFCCs, the proposed features capture the discriminative information of a given audio scene and perform well in classification, and the combined features achieve the best results. The method contributes a new approach to feature extraction for acoustic signals.
Keywords: acoustic scene classification  constant-Q transform  histogram of oriented gradients  local binary pattern
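
The feature-extraction steps named in the abstract (mean filtering, gradient histogram, local binary pattern) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes the time-frequency image is already available as a 2-D list of floats (rows as frequency bins, columns as time frames), uses a 3x3 mean filter, replaces full HOG with a single global histogram of unsigned gradient orientations (no cells or block normalization), and uses the basic 8-neighbour LBP. The function names, window size, and bin count are hypothetical choices made for illustration. The constant-Q transform and the linear-kernel SVM are left out; in practice they could come from libraries such as librosa and scikit-learn.

```python
import math


def mean_filter(img, k=3):
    """Smooth a 2-D list with a k x k mean filter; edge pixels average the neighbours that exist."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [img[y][x]
                    for y in range(max(0, i - r), min(h, i + r + 1))
                    for x in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out


def gradient_histogram(img, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations (simplified, global HOG)."""
    h, w = len(img), len(img[0])
    hist = [0.0] * n_bins
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = img[i][j + 1] - img[i][j - 1]  # change along the time axis
            gy = img[i + 1][j] - img[i - 1][j]  # change along the frequency axis
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # fold direction into [0, pi)
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            hist[b] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 8-neighbour local binary pattern codes."""
    h, w = len(img), len(img[0])
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0.0] * 256
    count = 0
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if img[i + dy][j + dx] >= img[i][j]:
                    code |= 1 << bit
            hist[code] += 1
            count += 1
    return [v / count for v in hist] if count else hist


def extract_features(tfr):
    """Smooth the image, then concatenate the gradient and LBP histograms (8 + 256 values)."""
    smooth = mean_filter(tfr)
    return gradient_histogram(smooth) + lbp_histogram(smooth)


# Toy input standing in for a time-frequency image: energy rises steadily over time.
ramp = [[float(j) for j in range(6)] for _ in range(6)]
feats = extract_features(ramp)
```

The concatenated vector `feats` plays the role of the combined feature described in the abstract; one such vector per audio clip would then be passed to a linear classifier.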
This article is indexed in CNKI and other databases.