Adding Phoneme Duration Information to Spectral Model in Speaker Identification

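Cite this article:	LIU Da-peng, Kazuhiko Ozeki, ZHU Qing-sheng. Adding Phoneme Duration Information to Spectral Model in Speaker Identification[J]. Microcomputer Development, 2007, 17(5): 156-159.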

Authors:	LIU Da-peng, Kazuhiko Ozeki, ZHU Qing-sheng

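Affiliations:	1. College of Computer Science, Chongqing University, Chongqing 400044; 2. Department of Information and Communication Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan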

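Abstract:	Conventional speaker recognition systems identify speakers from short-term spectral information, an approach that performs well under some conditions. However, because some speaker characteristics are hidden in longer speech segments, adding long-term information may further improve performance. In this paper, phoneme duration information is added to the conventional model to raise the speaker identification rate. Spectral information is obtained by short-term analysis, whereas extracting phoneme durations is a long-term analysis that requires more speech data. The effectiveness of phoneme duration information for speaker identification is first examined using a large amount of speech data, and two methods are then proposed to address the problems caused by a small amount of data. Experimental results show that, when the speaker models are built appropriately, phoneme duration information improves the identification rate even with a small amount of speech data.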
Keywords:	speaker identification; Gaussian mixture model; phoneme duration information
Article ID:	1673-629X(2007)05-0156-04
Revision date:	August 12, 2006
Adding Phoneme Duration Information to Spectral Model in Speaker Identification

LIU Da-peng, Kazuhiko Ozeki, ZHU Qing-sheng. Adding Phoneme Duration Information to Spectral Model in Speaker Identification[J]. Microcomputer Development, 2007, 17(5): 156-159.

Authors:	LIU Da-peng, Kazuhiko Ozeki, ZHU Qing-sheng

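Affiliation:	LIU Da-peng (1, 2), Kazuhiko Ozeki (2), ZHU Qing-sheng (1); 1. College of Computer Science, Chongqing University, Chongqing 400044; 2. Department of Information and Communication Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan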

Abstract:	Conventional speaker recognition systems use short-term spectral information to identify speakers. They perform well under some conditions. However, since some speaker characteristics are hidden in longer speech segments, performance may be further improved by adding this long-term information. In this paper, phoneme duration information is added to the conventional model to improve the recognition rate. While spectral information is extracted by short-term analysis, extracting phoneme duration information requires long-term analysis, so phoneme duration analysis usually needs more speech data than spectral analysis does. In the first part of this work, the effectiveness of phoneme duration information is investigated using a large amount of speech data. Two methods are then presented to solve the problems caused by using only a small amount of data. The experimental results show that phoneme duration information is effective in improving speaker identification performance even with a small amount of speech data, provided the speaker models are built appropriately.

Keywords:	speaker identification; GMM; phoneme duration information
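
The abstract does not say how the duration information is combined with the spectral model, so the following is only an illustrative sketch of one possible scheme: a weighted sum of a spectral log-likelihood and a per-phoneme duration log-likelihood. Everything in it is an assumption for illustration and not the authors' method: the function names, the `weight` parameter, the variance floor, the toy data, and the use of a single diagonal Gaussian per speaker as a stand-in for a full GMM.

```python
import math

VAR_FLOOR = 1e-4  # assumed floor so tiny training sets do not give zero variance


def fit_gaussian(values):
    """Return (mean, variance) of a list of floats, with the variance floored."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, max(var, VAR_FLOOR)


def log_pdf(x, mu, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def train_speaker(frames, durations):
    """frames: list of feature vectors; durations: {phoneme: [seconds, ...]}."""
    spectral = [fit_gaussian(list(dim)) for dim in zip(*frames)]
    duration = {ph: fit_gaussian(ds) for ph, ds in durations.items()}
    return {"spectral": spectral, "duration": duration}


def spectral_score(model, frames):
    """Average per-frame log-likelihood under a diagonal Gaussian
    (a one-component stand-in for a GMM)."""
    total = 0.0
    for frame in frames:
        for x, (mu, var) in zip(frame, model["spectral"]):
            total += log_pdf(x, mu, var)
    return total / len(frames)


def duration_score(model, segments):
    """Average log-likelihood of (phoneme, duration) pairs; phonemes the
    speaker model has never seen are skipped."""
    scores = [
        log_pdf(d, *model["duration"][ph])
        for ph, d in segments
        if ph in model["duration"]
    ]
    return sum(scores) / len(scores) if scores else 0.0


def identify(models, frames, segments, weight=0.5):
    """Fuse the two scores with a weighted sum and pick the best speaker."""
    fused = {
        name: (1.0 - weight) * spectral_score(m, frames)
        + weight * duration_score(m, segments)
        for name, m in models.items()
    }
    return max(fused, key=fused.get), fused


# Toy data: both speakers have identical spectral statistics and differ
# only in how long they hold each phoneme.
shared_frames = [[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2]]
models = {
    "A": train_speaker(shared_frames, {"a": [0.10, 0.12, 0.11], "s": [0.08, 0.09, 0.07]}),
    "B": train_speaker(shared_frames, {"a": [0.18, 0.20, 0.19], "s": [0.12, 0.13, 0.14]}),
}
test_frames = [[0.1, 0.9]]
test_segments = [("a", 0.11), ("s", 0.08)]

best, fused = identify(models, test_frames, test_segments, weight=0.5)
print(best, fused)
```

In this toy setup the spectral scores of the two speakers are identical by construction, so the decision rests entirely on the duration term. With real data the phoneme boundaries would have to come from an alignment step, and the fusion weight would need tuning on held-out speech.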
This article is indexed in CNKI and other databases.
