
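Identification of Pauses Between Chinese Phrases Based on a Maximum Entropy Model
Cite this article: QIAN Yi-li, XUN En-dong. Phrase break identification based on maximum entropy model[J]. Computer Engineering and Applications, 2008, 44(17): 18-20. DOI: 10.3778/j.issn.1002-8331.2008.17.005
Authors: QIAN Yi-li, XUN En-dong
Affiliations: College of Computer Science, Beijing University of Technology, Beijing 100022; College of Computer and Information Technology, Shanxi University, Taiyuan 030006; College of Information Science, Beijing Language and Culture University, Beijing 100083
Abstract: Correctly marking the pauses between phrases plays an important role in improving the naturalness of speech synthesized by text-to-speech systems. This paper introduces a method that uses a maximum entropy model to automatically identify pauses between Chinese phrases in real, natural speech streams. The model's feature set contains two types of features, acoustic and lexical, and is obtained semi-automatically. First, a candidate feature set is designed by hand on the basis of experience. A feature selection algorithm then filters the candidates and keeps the more effective ones as the final feature set, which is used to train a maximum entropy model for identifying pauses between Chinese phrases. The results of three groups of experiments show that the model achieves fairly satisfactory identification of pauses between phrases.

Keywords: maximum entropy; speech pauses; phrase boundary
Received: 2008-01-25
Revised: 2008-03-12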

Phrase break identification based on maximum entropy model
QIAN Yi-li, XUN En-dong. Phrase break identification based on maximum entropy model[J]. Computer Engineering and Applications, 2008, 44(17): 18-20. DOI: 10.3778/j.issn.1002-8331.2008.17.005
Authors: QIAN Yi-li, XUN En-dong
Affiliation: 1. College of Computer Science and Technology, Beijing University of Technology, Beijing 100022, China; 2. College of Computer Science and Information Technology, Shanxi University, Taiyuan 030006, China; 3. College of Information Sciences, Beijing Language and Culture University, Beijing 100083, China
Abstract: In a text-to-speech (TTS) system, marking phrase breaks correctly is very important for the naturalness and quality of the output speech. This paper presents a maximum entropy model for identifying phrase breaks in Chinese sentences. The model's features fall into two types, acoustic features and lexical features, and the feature set is acquired semi-automatically. First, candidate features are designed by hand on the basis of experience. An automatic feature selection algorithm then picks out the effective candidates to build the final feature set, and a maximum entropy model is trained on that set. The experimental results show that the maximum entropy model identifies phrase breaks with satisfactory accuracy.
Keywords:maximum entropy  speech breaks  phrase boundary
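
The semi-automatic pipeline described in the abstract (hand-designed candidate features, automatic selection, then maximum entropy training) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the three binary candidate features, the synthetic data, the greedy forward selection by log-likelihood gain, the `min_gain` threshold, and all function names are hypothetical. The sketch relies only on the fact that a two-class maximum entropy model is equivalent to logistic regression.

```python
import math

def predict_prob(w, x, feats):
    """P(break | x) under a binary maximum entropy (logistic) model."""
    z = w[-1] + sum(w[k] * x[j] for k, j in enumerate(feats))
    return 1.0 / (1.0 + math.exp(-z))

def train_maxent(rows, labels, feats, epochs=200, lr=0.5):
    """Fit weights for the chosen feature indices by batch gradient ascent."""
    w = [0.0] * (len(feats) + 1)          # last entry is the bias
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = y - predict_prob(w, x, feats)
            for k, j in enumerate(feats):
                grad[k] += err * x[j]
            grad[-1] += err
        w = [wk + lr * g / n for wk, g in zip(w, grad)]
    return w

def log_likelihood(w, rows, labels, feats):
    """Total log-likelihood of the labels under the model."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(rows, labels):
        p = predict_prob(w, x, feats)
        total += math.log(p + eps) if y == 1 else math.log(1.0 - p + eps)
    return total

def select_features(rows, labels, n_candidates, min_gain=2.0):
    """Greedy forward selection (an assumed criterion, not the paper's):
    repeatedly add the candidate that raises the log-likelihood the most,
    and stop once the best gain falls below min_gain."""
    chosen = []
    best_ll = log_likelihood(train_maxent(rows, labels, chosen),
                             rows, labels, chosen)
    while len(chosen) < n_candidates:
        trials = []
        for j in range(n_candidates):
            if j not in chosen:
                feats = chosen + [j]
                w = train_maxent(rows, labels, feats)
                trials.append((log_likelihood(w, rows, labels, feats), j))
        ll, j = max(trials)
        if ll - best_ll < min_gain:
            break
        chosen.append(j)
        best_ll = ll
    return chosen

# Synthetic data: three hypothetical binary candidate features per word boundary.
# Feature 0 stands in for an acoustic cue, feature 1 for a lexical cue, and
# feature 2 is an uninformative distractor. A break is labelled when either
# of the first two cues fires.
rows = [[f0, f1, f2] for f0 in (0, 1) for f1 in (0, 1) for f2 in (0, 1)] * 10
labels = [1 if r[0] or r[1] else 0 for r in rows]

chosen = select_features(rows, labels, n_candidates=3)
w = train_maxent(rows, labels, chosen)
print("selected feature indices:", sorted(chosen))
print("P(break | acoustic cue only):", round(predict_prob(w, [1, 0, 0], chosen), 3))
```

On this toy data the distractor feature adds almost nothing to the log-likelihood, so a gain threshold is enough to exclude it. With real speech data, the candidates would be the hand-designed acoustic and lexical features, and the selection criterion would be whichever one the authors actually used.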
This article is indexed in CNKI, Wanfang Data, and other databases.