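Disambiguation of domain word segmentation based on unsupervised learning
Cite this article: XIU Chi, SONG Rou. Disambiguation of domain word segmentation based on unsupervised learning[J]. Journal of Computer Applications (计算机应用), 2013, 33(3): 780-783.
Authors: XIU Chi, SONG Rou
Affiliations: 1. College of Computer Science, Beijing University of Technology, Beijing 100022; 2. College of Information Science, Beijing Language and Culture University, Beijing 100083
Funding: National Natural Science Foundation of China (60872121).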
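Abstract: In Chinese natural language processing, word segmentation for specialized domains is far harder than for the general domain. Segmentation ambiguity in specialized domains in particular has had no effective solution. To address this problem, an unsupervised-learning method for resolving segmentation ambiguity in specialized domains is proposed. String frequency, mutual information, and boundary entropy, all computed from the test corpus itself, serve as the criteria for evaluating segmentation ambiguity, and the three kinds of information are used both individually and in combination to resolve it. Experimental results show that the method effectively resolves segmentation ambiguity in specialized domains and clearly improves segmentation performance.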

Keywords: domain word segmentation; segmentation ambiguity; string frequency; mutual information; boundary entropy
Received: 2012-09-26
Revised: 2012-10-31

Disambiguation of domain word segmentation based on unsupervised learning
XIU Chi, SONG Rou. Disambiguation of domain word segmentation based on unsupervised learning[J]. Journal of Computer Applications, 2013, 33(3): 780-783.
Authors: XIU Chi, SONG Rou
Affiliation: 1. College of Computer Science, Beijing University of Technology, Beijing 100022, China;
2. College of Information Science, Beijing Language and Culture University, Beijing 100083, China
Abstract: In Chinese natural language processing, domain word segmentation is much more difficult than general word segmentation. Segmentation ambiguity in particular has lacked an effective solution. To address this problem, an unsupervised learning method for resolving domain segmentation ambiguity is proposed. String frequency, mutual information, and boundary entropy are selected as the evaluation criteria for segmentation ambiguity, and these three kinds of information are used both individually and in combination to resolve it. The experimental results suggest that the proposed method can effectively resolve domain segmentation ambiguity and noticeably improve segmentation performance.
Keywords: domain word segmentation; segmentation ambiguity; string frequency; mutual information; boundary entropy
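
The three statistics named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas or how the statistics are combined, so everything below is an assumption for illustration only: the function names, the use of pointwise mutual information with a simple character-count normalization, the right-side boundary entropy, and the frequency-only rule for choosing between the two readings of an overlapping string A-B-C. In keeping with the unsupervised setting, all counts come from the raw text being segmented.

```python
import math
from collections import Counter


def ngram_counts(text, max_len):
    """Count every substring of length 1..max_len in the raw text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def pmi(counts, total_chars, left, right):
    """Pointwise mutual information of two adjacent strings.

    Probabilities are approximated as count / total_chars (an assumption).
    """
    joint = counts[left + right]
    if joint == 0 or counts[left] == 0 or counts[right] == 0:
        return float("-inf")
    p_joint = joint / total_chars
    p_left = counts[left] / total_chars
    p_right = counts[right] / total_chars
    return math.log2(p_joint / (p_left * p_right))


def right_boundary_entropy(text, s):
    """Entropy (in bits) of the character following each occurrence of s."""
    followers = Counter()
    start = text.find(s)
    while start != -1:
        end = start + len(s)
        if end < len(text):
            followers[text[end]] += 1
        start = text.find(s, start + 1)  # allow overlapping occurrences
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in followers.values())


def resolve_overlap(counts, a, b, c):
    """Hypothetical rule: split A-B-C as AB|C or A|BC by string frequency."""
    if counts[a + b] >= counts[b + c]:
        return [a + b, c]
    return [a, b + c]


if __name__ == "__main__":
    corpus = "abcabdabe"  # toy stand-in for an unsegmented domain corpus
    counts = ngram_counts(corpus, 2)
    print(counts["ab"], pmi(counts, len(corpus), "a", "b"))
    print(right_boundary_entropy(corpus, "ab"))
    print(resolve_overlap(counts, "a", "b", "c"))
```

A string that is frequent, has high mutual information between its parts, and is followed by many different characters is more likely to be a word. A combined criterion could weight the three scores, but the weighting used in the paper is not stated in the abstract.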