
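An Effective Solution to the Polyphonic Character Problem in Chinese TTS Systems
Cite this article: Liu Jingyong, Chai Peiqi, Yao Qiuming. An Effective Solution to the Polyphonic Character Problem in Chinese TTS Systems[J]. 微型电脑应用 (Microcomputer Applications), 2005, 21(4): 52-55
Authors: Liu Jingyong, Chai Peiqi, Yao Qiuming
Affiliation: Department of Computer Science and Engineering, Tongji University
Abstract: Polyphonic characters make Chinese TTS (text-to-speech) systems more difficult to build. This paper proposes a unified scheme for deciding the pronunciation of polyphonic characters in Chinese TTS systems, based on statistical learning. A feature-based lexicon is constructed first; it can be updated dynamically from the training corpus. Of the weighted and unweighted methods for updating the lexicon, the unweighted method was selected after an experimental comparison. Rules are built as a supplement to the lexicon, and experiments were run with classification and regression trees (CART) and with extended stochastic complexity (ESC). Based on these experiments, local rules generated by CART were chosen to supplement the lexicon, which gave fairly satisfactory results.

Keywords: text-to-speech (TTS); feature lexicon; classification and regression tree (CART); extended stochastic complexity (ESC)
Article ID: 1007-757X(2005)04-0052-04
Revision date: 2004-12-30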

An Effective Solution to Polyphone Problem in Mandarin TTS
Liu Jingyong,Chai Peiqi,Yao Qiuming. An Effective Solution to Polyphone Problem in Mandarin TTS[J]. Microcomputer Applications, 2005, 21(4): 52-55
Authors: Liu Jingyong, Chai Peiqi, Yao Qiuming
Abstract: Polyphonic characters make Mandarin TTS (text-to-speech) systems harder to build. This paper proposes a unified approach to polyphone disambiguation in Mandarin TTS, based on statistical learning. First, we construct a multi-feature lexicon that is updated automatically from the training corpus. We compare weighted and unweighted methods for updating the lexicon and choose the unweighted one for its higher accuracy. To supplement the lexicon with rules, we experiment with classification and regression trees (CART) and extended stochastic complexity (ESC). Using CART-generated local rules as a complement to the lexicon gives a relatively satisfactory result.
Keywords: TTS; feature lexicon; CART; ESC
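
The abstract describes a lexicon keyed on features, updated from a corpus without weights, and backed up by rules. The abstract does not say which features, data layout, or rule format the authors used, so the following Python is only a minimal sketch under stated assumptions: the features are taken to be the left and right neighbouring characters, "unweighted" is read as "each observation adds a count of 1", and `FeatureLexicon` and `toy_rules` are hypothetical names, with `toy_rules` standing in for a CART-derived local rule.

```python
from collections import Counter, defaultdict


class FeatureLexicon:
    """Toy feature lexicon: (character, features) -> counts of readings."""

    def __init__(self):
        # (char, features) -> Counter of readings seen in that context
        self.by_context = defaultdict(Counter)
        # char -> Counter of readings seen in any context
        self.by_char = defaultdict(Counter)

    def update(self, char, features, reading):
        # Unweighted update (assumed meaning): every observation counts as 1.
        self.by_context[(char, features)][reading] += 1
        self.by_char[char][reading] += 1

    def predict(self, char, features, rules=None):
        # 1. Exact context match in the lexicon.
        seen = self.by_context.get((char, features))
        if seen:
            return seen.most_common(1)[0][0]
        # 2. Supplementary rules (stand-in for CART-generated local rules).
        if rules is not None:
            guess = rules(char, features)
            if guess is not None:
                return guess
        # 3. Fall back to the character's most frequent reading, if any.
        overall = self.by_char.get(char)
        return overall.most_common(1)[0][0] if overall else None


def toy_rules(char, features):
    # Hypothetical hand-written rule; a real system would learn these.
    left, _right = features
    if char == "行" and left == "银":
        return "hang2"
    return None


lexicon = FeatureLexicon()
corpus = [
    ("行", ("银", "的"), "hang2"),  # 银行的
    ("行", ("步", "街"), "xing2"),  # 步行街
    ("行", ("旅", "社"), "xing2"),  # 旅行社
    ("行", ("进", "中"), "xing2"),  # 进行中
]
for char, feats, reading in corpus:
    lexicon.update(char, feats, reading)
```

With this toy corpus, `lexicon.predict("行", ("银", "里"), rules=toy_rules)` has no exact context match and is resolved by the rule, while a context that neither the lexicon nor the rule covers falls back to the character's most frequent reading.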
This article is indexed by CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.