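Research on Chinese Domain Word Segmentation Based on Conditional Random Fields
Cite as: ZHU Yanhui, LIU Jing, XU Yeqiang, TIAN Hailong, MA Jin. Research on Chinese domain word segmentation based on conditional random fields[J]. Computer Engineering and Applications, 2016, 52(15): 97-100.
Authors: ZHU Yanhui  LIU Jing  XU Yeqiang  TIAN Hailong  MA Jin
Affiliation: School of Computer and Communication, Hunan University of Technology, Zhuzhou, Hunan 412007
Abstract: Word segmentation with conditional random fields (CRF) does not adapt well across domains. To address this, the paper proposes combining a CRF with a domain dictionary to improve domain adaptability and, based on word-formation rules, proposes three disambiguation methods: fixed word-string resolution, verb resolution, and word-probability resolution. Experimental results show that this segmentation pipeline and these methods improve segmentation accuracy and adaptability; the F value of the segmentation results rises by 7.6% in the computer domain and by 8.7% in the medical domain.

Keywords: Chinese word segmentation  conditional random field  domain adaptation  ambiguity resolution  domain word segmentation  reverse maximum matching algorithm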
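
The keyword list names the reverse maximum matching algorithm, a standard dictionary-based segmentation technique. The abstract does not say how the paper applies it, so the following is only a minimal sketch of the textbook algorithm, not the authors' implementation. The function name `reverse_max_match`, the `max_len` parameter, and the toy dictionary are all hypothetical.

```python
def reverse_max_match(text, dictionary, max_len=5):
    """Segment text right-to-left, taking the longest dictionary word at each step.

    `dictionary` is any container supporting `in` (a set of words here).
    `max_len` is an assumed upper bound on word length, not a value from the paper.
    """
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # words were collected from right to left
    return words


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"条件", "随机场", "条件随机场", "中文", "分词"}
segmented = reverse_max_match("条件随机场中文分词", toy_dictionary)
```

With this toy dictionary the longer entry "条件随机场" wins over "条件" + "随机场", which illustrates why adding domain terms to the dictionary changes the segmentation of domain text.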

Chinese word segmentation research based on Conditional Random Field
ZHU Yanhui, LIU Jing, XU Yeqiang, TIAN Hailong, MA Jin. Chinese word segmentation research based on Conditional Random Field[J]. Computer Engineering and Applications, 2016, 52(15): 97-100.
Authors:ZHU Yanhui  LIU Jing  XU Yeqiang  TIAN Hailong  MA Jin
Affiliation:School of Computer and Communication, Hunan University of Technology, Zhuzhou, Hunan 412007, China
Abstract:Conditional Random Field (CRF) models for Chinese word segmentation adapt poorly to new domains. This paper combines a CRF with a domain dictionary to improve domain adaptability and, following word-formation rules, resolves ambiguity using fixed word collocations, a verb dictionary, and word probabilities. Experimental results show that this approach improves the accuracy and adaptability of word segmentation: the F value of the segmentation results increases by 7.6% in the computer domain and by 8.7% in the medical domain.
Keywords:Chinese word segmentation  Conditional Random Field(CRF)  domain adaptation  ambiguity resolution  domain segmentation  reverse maximum matching method  
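
The abstract reports its gains as an F value but does not state how that value is computed. A common convention in word segmentation evaluation is the harmonic mean of word-level precision and recall, where a predicted word counts as correct only if both of its boundaries match the gold standard. The sketch below assumes that convention; the function names and the sample sentences are hypothetical and are not data from the paper.

```python
def to_spans(words):
    """Convert a word list into a set of (start, end) character offsets."""
    spans = set()
    start = 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_prf(gold, predicted):
    """Return word-level (precision, recall, F) for one segmented sentence."""
    gold_spans = to_spans(gold)
    pred_spans = to_spans(predicted)
    correct = len(gold_spans & pred_spans)  # words with both boundaries right
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


# Hypothetical example: the prediction wrongly splits one domain term.
gold = ["条件随机场", "中文", "分词"]
predicted = ["条件", "随机场", "中文", "分词"]
precision, recall, f_value = segmentation_prf(gold, predicted)
```

In this example two of the four predicted words match the three gold words, so a single missed domain term lowers both precision and recall.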