Unsupervised Dynamic Word Segmentation
Cite this article: Gao Jun, Chen Xixian. Unsupervised Dynamic Word Segmentation[J]. Journal of Beijing University of Posts and Telecommunications, 1997, 20(4): 66-69.
Authors: Gao Jun, Chen Xixian
Affiliation: School of Telecommunication Engineering, Beijing University of Posts and Telecommunications
Abstract: A variable-length automatic word segmentation method for Chinese corpora is presented. It is based on the concept of limiting entropy from information theory and uses the maximum likelihood between strings of Chinese characters to segment Chinese text automatically. A method for building an unsupervised dynamic word segmentation dictionary is studied in particular. The limitations of these methods are discussed, and some experimental results are reported.

Keywords: information processing; Chinese text corpus; automatic word segmentation
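
The abstract names the ingredients (statistics over character strings, no hand-built dictionary) but does not give the algorithm itself. As a rough illustration of the general idea only, the sketch below segments text by joining adjacent characters whose pointwise mutual information, estimated from raw unsegmented text, exceeds a threshold. This is a swapped-in, simplified technique and not the authors' method; the function names `train_counts`, `pmi`, and `segment`, the `threshold` parameter, and the toy corpus are all assumptions made for the example.

```python
import math
from collections import Counter


def train_counts(corpus):
    """Count single characters and adjacent character pairs in raw text."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        unigrams.update(line)
        bigrams.update(zip(line, line[1:]))
    return unigrams, bigrams


def pmi(a, b, unigrams, bigrams):
    """Pointwise mutual information of the adjacent pair (a, b).

    Returns -inf for a pair never seen in the training text.
    """
    if bigrams[(a, b)] == 0:
        return float("-inf")
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_ab = bigrams[(a, b)] / n_bi
    p_a = unigrams[a] / n_uni
    p_b = unigrams[b] / n_uni
    return math.log(p_ab / (p_a * p_b))


def segment(text, unigrams, bigrams, threshold=0.0):
    """Join adjacent characters when their PMI exceeds the threshold
    (hypothetical default 0.0); otherwise start a new word."""
    if not text:
        return []
    words = [text[0]]
    for a, b in zip(text, text[1:]):
        if pmi(a, b, unigrams, bigrams) > threshold:
            words[-1] += b
        else:
            words.append(b)
    return words


# Toy, made-up corpus purely for illustration.
uni, bi = train_counts(["北京北京北京", "大学大学大学"])
print(segment("北京大学", uni, bi))  # ['北京', '大学']
```

In this toy run the pair 京/大 never occurs in the training text, so a boundary is placed there, while 北/京 and 大/学 co-occur often enough to be merged. A real system would need far more text, smoothing for unseen pairs, and a way to handle words longer than the pairwise statistics capture.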
This article is indexed in CNKI, VIP, and other databases.