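Research and realization of speech segmentation in MP3 compressed domain
Cite this article: CHANG Liao-yu, YU Xiao-qing, WAN Wang-gen, LI Chang-lian, XU Xue-qiong. Research and realization of speech segmentation in MP3 compressed domain[J]. Journal of Computer Applications, 2009, 29(4): 1188-1192
Authors: CHANG Liao-yu  YU Xiao-qing  WAN Wang-gen  LI Chang-lian  XU Xue-qiong
Affiliation: School of Communication and Information Engineering, Shanghai University, Shanghai 200072
Funding: National Natural Science Foundation of China; Shanghai International Cooperation Fund; Shanghai Municipal Education Commission Key Discipline Project in Circuits and Systems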
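Abstract: To address the problem of speaker change point detection, an improved BIC algorithm is used to detect change points among multiple speakers in MP3 format. Following the procedure for computing MFCC in the uncompressed domain, a new method is proposed for computing MFCC feature parameters from MDCT coefficients in the MP3 compressed domain. On this basis, the improved BIC change-point detection algorithm is used to detect speaker change points. Experiments show that, under the same BIC detection algorithm, speech segmentation using MFCC feature parameters extracted in the compressed domain achieves segmentation accuracy similar to that obtained in the uncompressed domain.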

Keywords: compressed domain  BIC detection  speech segmentation  MDCT coefficients
Received: 2008-10-10
Revised: 2008-12-15
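
The abstract describes computing MFCC features from MDCT coefficients but does not give the exact mapping. The following is a minimal sketch of one plausible pipeline, not the authors' method: it squares the MDCT coefficients of each granule, applies a triangular mel filterbank, takes the logarithm, and applies a DCT-II. It assumes dequantized long-block MDCT coefficients in frequency order (576 per granule); the function names, the 44.1 kHz sample rate, the 24 filters, and the 13 cepstral coefficients are illustrative assumptions.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel filters evaluated at the MDCT bin centre frequencies."""
    # MDCT bin k is centred at (k + 0.5) * (sample_rate / 2) / n_bins.
    freqs = (np.arange(n_bins) + 0.5) * (sample_rate / 2.0) / n_bins
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc_from_mdct(mdct, sample_rate=44100, n_filters=24, n_ceps=13):
    """Hypothetical helper: MFCC-like features from MDCT coefficients.

    mdct: array of shape (n_granules, n_bins), e.g. n_bins = 576
          for MP3 long blocks (an assumption of this sketch).
    """
    mdct = np.atleast_2d(np.asarray(mdct, dtype=float))
    power = mdct ** 2                                  # MDCT power spectrum
    bank = mel_filterbank(n_filters, mdct.shape[1], sample_rate)
    log_energy = np.log(power @ bank.T + 1e-10)        # floor avoids log(0)
    # Unnormalized DCT-II over the filterbank axis.
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (m + 0.5) / n_filters)
    return log_energy @ dct.T
```

Short-block granules (three windows of 192 coefficients) would need separate handling, which this sketch leaves out.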

Research and realization of speech segmentation in MP3 compressed domain
CHANG Liao-yu,YU Xiao-qing,WAN Wang-gen,LI Chang-lian,XU Xue-qiong. Research and realization of speech segmentation in MP3 compressed domain[J]. Journal of Computer Applications, 2009, 29(4): 1188-1192
Authors:CHANG Liao-yu  YU Xiao-qing  WAN Wang-gen  LI Chang-lian  XU Xue-qiong
Affiliation: School of Communication and Information Engineering, Shanghai University, Shanghai 200072, China
Abstract: This article proposes an approach for detecting speaker change points using an improved Bayesian Information Criterion (BIC) algorithm in the MPEG-1 Layer 3 (MP3) compressed domain. Following the procedure for computing Mel-Frequency Cepstral Coefficients (MFCC) from raw audio, a new method is presented that computes MFCC from Modified Discrete Cosine Transform (MDCT) coefficients in the MP3 domain. Using these features, the improved BIC algorithm decides which points are speaker change points. Experimental results show that speech segmentation using MFCC extracted in the MP3 domain achieves segmentation precision similar to that obtained in the uncompressed domain.
Keywords:compressed domain  Bayesian Information Criterion (BIC) detection  speech segmentation  Modified Discrete Cosine Transform (MDCT) coefficient
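
The abstract does not describe how the improved BIC algorithm differs from the baseline, so the sketch below shows only the standard single-change-point ΔBIC test with full-covariance Gaussian models, as a point of reference. The function names, the `margin` of 20 frames, and the `penalty_weight` parameter are assumptions made for illustration. A positive ΔBIC at frame `t` favors modeling the window as two speakers split at `t`.

```python
import numpy as np

def delta_bic(features, t, penalty_weight=1.0):
    """Standard delta-BIC for splitting features (n_frames x dim) at frame t."""
    n, d = features.shape

    def logdet_cov(x):
        # Maximum-likelihood covariance with a small ridge for stability.
        cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True)) + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    left, right = features[:t], features[t:]
    # Penalty for the extra mean vector and covariance matrix of the two-model case.
    penalty = 0.5 * penalty_weight * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * logdet_cov(features)
                  - len(left) * logdet_cov(left)
                  - len(right) * logdet_cov(right))
    return gain - penalty

def find_change_point(features, margin=20, penalty_weight=1.0):
    """Return the frame index with the largest positive delta-BIC, or None."""
    n = features.shape[0]
    candidates = range(margin, n - margin)
    scores = [delta_bic(features, t, penalty_weight) for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None
```

To segment a longer recording with several speakers, this test would typically be applied over a sliding or growing window of feature frames, restarting after each detected change point.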
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.