
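Automatic recognition and classification of chewing sound types based on bone conduction microphone data
Cite this article: KHYSRU Kuntharrgyal, ZHANG Xinyi, WEI Jianguo. Automatic recognition and classification of chewing sound types based on bone conduction microphone data[J]. Technical Acoustics, 2022, 41(4): 556-561.
Authors: KHYSRU Kuntharrgyal  ZHANG Xinyi  WEI Jianguo
Affiliation: Key Laboratory of Artificial Intelligence Application Technology, State Ethnic Affairs Commission, Qinghai Nationalities University, Xining, Qinghai 810007, China; College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
Funding: National Key Research and Development Program of China (2020YFC2004100); National Natural Science Foundation of China (61876131, U1936102); Tianjin Key Project on Artificial Intelligence (19ZXZNGX00030).
Abstract: Oral movement is closely tied to people's eating patterns. This paper monitors eating patterns by analyzing and recognizing oral movement states, with the aim of guiding people's eating habits. Drawing on the ideas and methods of speech recognition, it analyzes and recognizes the bone-conducted sounds produced by oral movement and, to improve recognition efficiency, adopts the traditional hidden Markov model (HMM). A bone-conducted sound recognition system is built on the HMM. Before recognition, the models are trained on Mel frequency cepstral coefficients (MFCC) extracted after framing and windowing. During recognition, the model in the template library that best matches the audio signal under test is found, and its output is taken as the final recognition result. The method reaches a recognition rate of 84%, and the experimental results indicate that it is feasible to a certain extent.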

Keywords: Mel cepstral coefficients  hidden Markov model  HTK toolkit  oral movement state
Received: 2021/4/7
Revised: 2021/8/19

Automatic recognition and classification of chewing sound types based on bone conduction microphone data
KHYSRU Kuntharrgyal, ZHANG Xinyi, WEI Jianguo. Automatic recognition and classification of chewing sound types based on bone conduction microphone data[J]. Technical Acoustics, 2022, 41(4): 556-561.
Authors:KHYSRU Kuntharrgyal  ZHANG Xinyi  WEI Jianguo
Affiliation: Key Laboratory of Artificial Intelligence Application Technology, State Ethnic Affairs Commission, Qinghai Nationalities University, Xining 810007, China; Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
Abstract: Oral movement is closely related to people's eating habits. This paper investigates eating habits by analyzing and identifying oral movement, with the aim of guiding people in regulating those habits. Speech recognition methods are used to analyze and recognize the bone conduction sound produced by oral movement. A bone conduction sound recognition system based on the hidden Markov model (HMM) is built with the HTK toolkit. Before recognition, the models are trained on Mel frequency cepstral coefficients (MFCC) extracted after framing and windowing; training refines the model parameters and so builds a template library. During recognition, the model in the template library that best matches the audio signal under test is found, and its output is taken as the final recognition result. The recognition accuracy reaches 84.7%, and the experimental results show that the method is feasible.
Keywords:Mel frequency cepstral coefficients (MFCC)  hidden Markov model (HMM)  HTK toolkit  recognition of oral movement
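
The pipeline summarized in the abstract (framing and windowing, MFCC extraction, then choosing the best-matching HMM from a template library) can be illustrated with a minimal NumPy sketch. This is not the authors' HTK setup: the frame length, hop size, filter count, and the function names `frame_signal`, `mfcc`, `hmm_log_likelihood`, and `recognize` are all assumptions chosen for illustration, the HMMs are assumed to have diagonal-covariance Gaussian emissions, and the forward algorithm is used as a stand-in scoring rule.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale, shape (n_filters, n_fft//2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(x, sr, frame_len=400, hop=160, n_fft=512, n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) MFCC matrix; all defaults are illustrative."""
    frames = frame_signal(x, frame_len, hop)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II basis, written out so the sketch needs only NumPy.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energy @ basis.T

def hmm_log_likelihood(obs, log_pi, log_A, means, variances):
    """Forward algorithm in the log domain for a diagonal-Gaussian HMM."""
    diff = obs[:, None, :] - means[None, :, :]
    log_b = -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)
    alpha = log_pi + log_b[0]
    for t in range(1, len(obs)):
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)

def recognize(obs, models):
    """Return the label of the model that scores the observation sequence highest."""
    scores = {label: hmm_log_likelihood(obs, *params) for label, params in models.items()}
    return max(scores, key=scores.get)
```

Here `models` is a hypothetical dictionary mapping each class label to a tuple `(log_pi, log_A, means, variances)` of already trained parameters; in practice those would come from a training step such as Baum-Welch re-estimation, which this sketch does not cover.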