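A Method for Annotating Chinese Characters in Bulk with Tone-Marked Pinyin
Cite this article: Ma Zhiqiang. A Method for Annotating Chinese Characters in Bulk with Tone-Marked Pinyin [J]. Microelectronics & Computer (微电子学与计算机), 2008, 25(4): 185-188.
Author: Ma Zhiqiang (马志强)
Affiliation: Inner Mongolia University of Technology, Hohhot, Inner Mongolia 010080
Abstract: Because many Chinese characters are polyphonic (one character, several readings), annotating Chinese text with tone-marked pinyin is difficult. To address this, an annotation method that combines single-character lookup with word lookup was designed. First, a character dictionary and a word lexicon, both carrying tone-marked pinyin, were built; in the character dictionary the readings of each polyphonic character are ordered by usage frequency, and the words in the lexicon are indexed by their last character. Next, a two-level index structure based on whole-word binary search was designed, and an improved reverse maximum matching segmentation algorithm was implemented. Finally, three experimental schemes were designed and compared. The results show an error rate of 11% before the method was applied, falling to 0.09% with it.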

Keywords: Hanyu Pinyin; polyphonic characters; pinyin lexicon; Chinese word segmentation
Article ID: 1000-7180(2008)04-0185-03
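
The combination of word lookup and single-character fallback described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the toy dictionaries, the names `CHAR_PINYIN`, `WORD_PINYIN`, `annotate`, and `to_pinyin`, and the choice to pass unknown characters through unchanged are all assumptions made for illustration. The sketch uses plain reverse maximum matching, not the paper's improved variant, which the abstract does not specify.

```python
# Hypothetical toy data. Readings in CHAR_PINYIN are listed most-frequent
# first, mirroring the frequency-ordered character dictionary in the abstract.
CHAR_PINYIN = {
    "银": ["yín"],
    "行": ["xíng", "háng"],
    "长": ["cháng", "zhǎng"],
    "不": ["bù"],
}
# Word lexicon: each word maps to one tone-marked syllable per character.
WORD_PINYIN = {
    "银行": ["yín", "háng"],
    "行长": ["háng", "zhǎng"],
}
MAX_WORD_LEN = max(len(word) for word in WORD_PINYIN)


def annotate(text):
    """Segment text from right to left and attach pinyin to each piece."""
    pieces = []
    end = len(text)
    while end > 0:
        # Try the longest candidate word ending at `end` first.
        for size in range(min(MAX_WORD_LEN, end), 1, -1):
            candidate = text[end - size:end]
            if candidate in WORD_PINYIN:
                pieces.append((candidate, WORD_PINYIN[candidate]))
                end -= size
                break
        else:
            # No word matched: fall back to the most frequent reading of
            # the single character, or keep the character if it is unknown.
            char = text[end - 1]
            readings = CHAR_PINYIN.get(char)
            pieces.append((char, [readings[0]] if readings else [char]))
            end -= 1
    pieces.reverse()  # pieces were collected right to left
    return pieces


def to_pinyin(text):
    """Flatten the annotated pieces into one list of syllables."""
    return [syllable for _, syllables in annotate(text) for syllable in syllables]


print(" ".join(to_pinyin("银行行长")))
print(" ".join(to_pinyin("不行")))
```

In this sketch the polyphonic character 行 receives the reading háng when it is covered by a lexicon word and the first-listed reading xíng when it is handled alone, which is the kind of disambiguation the abstract attributes to combining characters with words.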

The Method of Chinese Marked with Pinyin with Tonality
MA Zhi-qiang. The Method of Chinese Marked with Pinyin with Tonality [J]. Microelectronics & Computer, 2008, 25(4): 185-188.
Authors: MA Zhi-qiang
Abstract: Because many Chinese characters are polyphonic, marking Chinese text with tone-marked pinyin is difficult. The method presented here addresses the problem by combining character lookup with word lookup. First, a character dictionary with tone-marked pinyin and a word lexicon with pinyin were created. The readings of each polyphonic character in the dictionary are ordered by usage frequency, and the lexicon is indexed by the last character of each word. Second, an improved reverse maximum matching algorithm was implemented on a two-layer index structure that uses whole-word binary search. Finally, three experimental schemes were tested, and the results indicate that the method reduces the error rate from 11% to 0.09%.
Keywords: Chinese phonetic alphabet; polyphony; lexicon with Pinyin; Chinese word segmentation
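
The abstract does not spell out the layout of the two-layer index, so the following is only one plausible reading, offered as a sketch: the first layer maps a word's last character to a bucket, and the second layer is a sorted list of whole words searched by binary search. The function names `build_index`, `contains`, and `longest_word_ending_at`, and the sample words, are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(words):
    """First layer: last character -> sorted list of words ending in it."""
    buckets = defaultdict(list)
    for word in words:
        buckets[word[-1]].append(word)
    # Second layer: each bucket is deduplicated and sorted for binary search.
    return {last: sorted(set(bucket)) for last, bucket in buckets.items()}


def contains(index, word):
    """Binary-search the bucket selected by the word's last character."""
    if not word:
        return False
    bucket = index.get(word[-1])
    if not bucket:
        return False
    pos = bisect_left(bucket, word)
    return pos < len(bucket) and bucket[pos] == word


def longest_word_ending_at(index, text, end, max_len):
    """Return the longest indexed word that ends at position `end`, or None."""
    for size in range(min(max_len, end), 1, -1):
        candidate = text[end - size:end]
        if contains(index, candidate):
            return candidate
    return None


index = build_index(["银行", "行长", "不行", "科长"])
print(longest_word_ending_at(index, "银行行长", 4, 2))
```

Indexing by the last character suits a right-to-left scan: the character at the current end position selects one bucket, and only words in that bucket need to be compared against the candidate.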
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.