
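An Integrated Word Segmentation and Part-of-Speech Tagging Algorithm for Text Proofreading
Affiliation: Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240
Abstract: Word segmentation and part-of-speech tagging are basic steps in Chinese text processing, and their performance largely determines the quality of downstream processing. Traditionally, dictionary-based mechanical segmentation has been used. In text proofreading, however, errors in the input text degrade the results of this method, so that subsequent error detection and correction rest on an incorrect foundation. This paper explores a segmentation and part-of-speech tagging algorithm suited to text proofreading and proposes the ideas of full segmentation and integrated tagging. Experiments show that, besides achieving high precision and recall, the algorithm effectively suppresses the impact of text errors on segmentation and part-of-speech tagging.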

Keywords: text proofreading  word segmentation  part-of-speech tagging  integrated algorithm

One Combined Approach of Chinese Segment and Tagging for Proofreading
WANG Yong-jing,LIU Gong-shen,LI Sheng-hong,JING Tao. One Combined Approach of Chinese Segment and Tagging for Proofreading[J]. Microcomputer Development, 2008, 0(8)
Authors:WANG Yong-jing  LIU Gong-shen  LI Sheng-hong  JING Tao
Abstract: Word segmentation and part-of-speech tagging are two important procedures in Chinese text processing. Dictionary-based mechanical segmentation is traditionally used, but during proofreading the errors in the input text degrade the segmentation and tagging results, so that error detection and correction are then built on inaccurate output. This paper seeks a method suitable for proofreading and proposes a combined approach to automatic segmentation and tagging, which is shown to minimize the influence of such errors while achieving high precision and recall.
Keywords: automatic proofreading  automatic segmentation  tagging  combined approach
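
The abstract names two ideas, full segmentation and integrated (joint) tagging, without giving details. The Python sketch below is only one illustrative reading of those ideas, not the authors' algorithm: it enumerates every dictionary-consistent segmentation of a sentence (keeping unknown single characters as a fallback so that erroneous text still yields candidates), then picks the segmentation and tag sequence jointly by a toy unigram score. The lexicon, its probabilities, the `UNKNOWN` fallback, and both function names are hypothetical.

```python
import math
from itertools import product

# Toy lexicon (hypothetical values): word -> {tag: joint probability of word and tag}
LEXICON = {
    "研究": {"v": 0.02, "n": 0.01},
    "研究生": {"n": 0.002},
    "生命": {"n": 0.01},
    "生": {"v": 0.002, "n": 0.001},
    "命": {"n": 0.001},
    "起源": {"n": 0.005},
}
# Fallback for single characters missing from the lexicon (an assumption).
UNKNOWN = {"x": 1e-6}


def full_segmentation(sentence, lexicon, max_word_len=4):
    """Return every split of `sentence` into lexicon words or single characters."""
    results = []

    def extend(start, path):
        if start == len(sentence):
            results.append(list(path))
            return
        last = min(len(sentence), start + max_word_len)
        for end in range(start + 1, last + 1):
            word = sentence[start:end]
            # Single characters are always allowed so no input is unsegmentable.
            if word in lexicon or end == start + 1:
                path.append(word)
                extend(end, path)
                path.pop()

    extend(0, [])
    return results


def best_joint_analysis(sentence, lexicon):
    """Choose segmentation and tags together by the highest summed log-probability."""
    best, best_score = None, -math.inf
    for words in full_segmentation(sentence, lexicon):
        options = [list(lexicon.get(w, UNKNOWN).items()) for w in words]
        for choice in product(*options):
            score = sum(math.log(p) for _, p in choice)
            if score > best_score:
                best_score = score
                best = [(w, tag) for w, (tag, _) in zip(words, choice)]
    return best


if __name__ == "__main__":
    for candidate in full_segmentation("研究生命起源", LEXICON):
        print(" / ".join(candidate))
    print(best_joint_analysis("研究生命起源", LEXICON))
```

Exhaustive enumeration grows exponentially with sentence length, so a practical system would more likely store the candidates in a word lattice and search it with dynamic programming; the brute-force form is kept here only to make the joint choice explicit.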
This article is indexed in CNKI and other databases.