Mining atomic Chinese abbreviations with a probabilistic single character recovery model
Authors:Jing-Shin Chang  Wei-Lun Teng
Affiliation:(1) Department of Computer Science & Information Engineering, National Chi-Nan University, Puli, Nantou, Taiwan, ROC
Abstract:An HMM-based single character recovery (SCR) model is proposed in this paper to extract a large set of atomic abbreviations and their full forms from a text corpus. An “atomic abbreviation” here refers to an abbreviated word consisting of a single Chinese character. This task is important because Chinese abbreviations cannot be enumerated exhaustively, yet the abbreviation process for compound words appears to be compositional: one can often decode an abbreviated word character by character to its full form. With a large atomic abbreviation dictionary, multiple-character abbreviation problems may therefore be handled more easily by exploiting this compositional property.
Keywords:Abbreviation  Atomic abbreviation  Single character recovery model
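
The character-by-character decoding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' SCR model or their data: the candidate dictionary, the probabilities, and the names `CANDIDATES`, `TRANSITIONS`, `FLOOR`, and `recover` are all hypothetical, invented only to show how a Viterbi search over a small HMM might map each abbreviated character to a full-form word.

```python
import math

# Hypothetical toy parameters, invented for illustration only.
# CANDIDATES[c] lists full-form words that the single character c may
# abbreviate, with an assumed emission probability P(c | word).
CANDIDATES = {
    "台": {"台灣": 0.6, "台北": 0.4},
    "大": {"大學": 0.7, "大會": 0.3},
}

# Assumed bigram transition probabilities P(word | previous word).
TRANSITIONS = {
    ("<s>", "台灣"): 0.5,
    ("<s>", "台北"): 0.5,
    ("台灣", "大學"): 0.8,
    ("台灣", "大會"): 0.2,
    ("台北", "大學"): 0.5,
    ("台北", "大會"): 0.5,
}

FLOOR = 1e-9  # assumed floor probability for unseen transitions


def recover(abbreviation):
    """Return the most probable full-form word sequence (Viterbi search)."""
    # best maps each word ending the current prefix to (log score, path).
    best = {"<s>": (0.0, [])}
    for char in abbreviation:
        # A character with no dictionary entry is kept as its own full form.
        candidates = CANDIDATES.get(char, {char: 1.0})
        step = {}
        for word, emit in candidates.items():
            step[word] = max(
                (
                    prev_score
                    + math.log(TRANSITIONS.get((prev, word), FLOOR))
                    + math.log(emit),
                    prev_path + [word],
                )
                for prev, (prev_score, prev_path) in best.items()
            )
        best = step
    # Pick the highest-scoring path over all final words.
    return max(best.values())[1]


print(recover("台大"))
```

Under these made-up numbers, the two-character abbreviation 台大 is decoded as the word sequence 台灣 大學, because that path has the highest joint probability. Real parameters would have to be estimated from a corpus, which is the task the paper addresses.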
This article is indexed in SpringerLink and other databases.