A NOVEL SPACE-COMPRESSED CHINESE WORD DIGRAM BASED ON BI-CHARACTER CO-ARTICULATION FREQUENCY
Authors: Zhao Yibao, Qiao Liyan, Tan Jianxun, Sun Shenghe
Abstract: Chinese Phonetic-Character Conversion (CPCC) is a key problem in Chinese speech recognition and in sentence-level Chinese keyboard input systems. Approaches based on Markov language models estimated from large corpora, such as bigrams and trigrams, are increasingly popular. This paper presents an improved Chinese word bigram, the space-compressed Chinese word bigram, which stores bi-word co-articulation frequencies in the form of bi-character co-articulation frequencies. Each bi-word co-articulation frequency is then estimated from the bi-character co-articulation frequency library. A CPCC experiment shows that the improved Chinese word bigram reaches a higher correct-conversion ratio while occupying less storage space.
Keywords: CPCC; Markov model; bigram; word frequency estimate
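
The abstract does not state how a bi-word frequency is recovered from the bi-character library, so the following is only a minimal sketch under one possible assumption: the count of a word pair is approximated by the count of the character pair at the word boundary (last character of the left word, first character of the right word). The function names, the estimator, and the toy corpus are all hypothetical illustrations, not the authors' method or data.

```python
from collections import Counter


def build_bichar_library(segmented_sentences):
    """Count character pairs that straddle a word boundary.

    Each sentence is a list of already-segmented words. Storing only the
    boundary characters keeps the table far smaller than a full word bigram.
    """
    counts = Counter()
    for words in segmented_sentences:
        for left, right in zip(words, words[1:]):
            counts[(left[-1], right[0])] += 1
    return counts


def estimate_biword_frequency(library, left_word, right_word):
    """Assumed estimator: the boundary character pair stands in for the word pair."""
    return library[(left_word[-1], right_word[0])]


def pick_candidate(library, previous_word, candidates):
    """Choose the homophone candidate with the highest estimated bi-word frequency."""
    return max(
        candidates,
        key=lambda word: estimate_biword_frequency(library, previous_word, word),
    )


# Toy, invented corpus purely for illustration.
toy_corpus = [
    ["数学", "公式"],
    ["数学", "公式"],
    ["发动", "攻势"],
]
library = build_bichar_library(toy_corpus)

# Two homophone candidates for the same pinyin, following the word "数学".
best = pick_candidate(library, "数学", ["攻势", "公式"])
print(best)
```

Under this assumed estimator, candidates that share a first character (for example 公式 and 公事) receive the same score after a given word, which is the accuracy cost that the space saving trades against; the paper's actual estimator may resolve such cases differently.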
This article has been indexed by CNKI and other databases.