
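Resolving combinational ambiguity in Chinese word segmentation based on rule mining and the Naive Bayes method
Cite this article: 张严虎, 潘璐璐, 彭子平, 张靖波, 于中华. Resolving combinational ambiguity in Chinese word segmentation based on rule mining and the Naive Bayes method [J]. 计算机应用 (Journal of Computer Applications), 2008, 28(7): 1686-1688
Authors: 张严虎, 潘璐璐, 彭子平, 张靖波, 于中华
Affiliation: College of Computer Science, Sichuan University, Chengdu 610064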
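Abstract: Segmenting combinationally ambiguous strings is one of the hard problems in automatic Chinese word segmentation. Building on an in-depth analysis of existing methods, this paper proposes a new segmentation algorithm. The algorithm automatically mines word collocation rules and grammar rules from a training corpus, and segments combinationally ambiguous strings through a joint decision based on these rules and a Naive Bayes model. Extensive experiments show that, compared with results reported in the literature, the algorithm improves segmentation accuracy on combinationally ambiguous strings by about 8%.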

Keywords: Chinese word segmentation; combinational ambiguity; word collocation rules; grammar rules
Received: 2008-01-14
Revised: 2008-03-07

Resolving combinational ambiguity in Chinese word segmentation based on rule mining and Naive Bayes method
Abstract: Combinational ambiguity is one of the most difficult problems in Chinese word segmentation. After an in-depth analysis of existing algorithms in the literature, this paper proposes a new segmentation algorithm. The algorithm automatically mines word collocation rules and grammar rules from a training corpus, then makes an integrated decision, based on the mined rules and a Naive Bayes model, to resolve each combinational ambiguity. Extensive experiments show that the proposed algorithm improves accuracy by about 8% over results reported in related work.
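
The abstract does not give the rule format or the decision procedure, so the following is only a minimal sketch of the general idea, under stated assumptions. It treats one ambiguous string (for example 才能, which can be a single noun or the two words 才 and 能) as a two-class problem, "merge" versus "split". The function names, the rule definition (a context word seen with only one label at least `min_support` times), the rule-first fallback order, and the toy training contexts are all hypothetical and are not taken from the paper. Grammar rules are left out.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Count labels and per-label context words from (context_words, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in examples:
        label_counts[label] += 1
        for w in words:
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab

def nb_log_scores(model, words):
    """Log prior plus Laplace-smoothed log likelihoods for each label."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / denom)
        scores[label] = score
    return scores

def mine_collocation_rules(examples, min_support=2):
    """Hypothetical rule miner: keep context words that occur with exactly
    one label in at least min_support training contexts."""
    per_word = defaultdict(Counter)
    for words, label in examples:
        for w in set(words):
            per_word[w][label] += 1
    rules = {}
    for w, counts in per_word.items():
        if len(counts) == 1:
            label, n = next(iter(counts.items()))
            if n >= min_support:
                rules[w] = label
    return rules

def decide(rules, model, words):
    """Use the rules when they agree on one label; otherwise fall back to Naive Bayes."""
    fired = {rules[w] for w in words if w in rules}
    if len(fired) == 1:
        return fired.pop()
    scores = nb_log_scores(model, words)
    return max(scores, key=scores.get)

# Invented toy contexts for the ambiguous string 才能 (illustration only).
EXAMPLES = [
    (["发挥", "他", "的"], "merge"),
    (["有", "的", "人"], "merge"),
    (["发挥", "艺术"], "merge"),
    (["只有", "努力", "成功"], "split"),
    (["这样", "成功"], "split"),
    (["只有", "坚持", "完成"], "split"),
]

model = train_naive_bayes(EXAMPLES)
rules = mine_collocation_rules(EXAMPLES)

print(decide(rules, model, ["只有", "这样"]))        # a mined rule fires
print(decide(rules, model, ["他", "坚持", "完成"]))  # no rule fires, Naive Bayes decides
```

Checking the rules first and consulting the Naive Bayes scores only when no rule fires, or when rules conflict, is one simple way to combine the two sources of evidence; the paper's integrated decision may weigh them differently.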
This article is indexed in 维普 (VIP) and other databases.