
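An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation
Cite this article: Sun Maosong, Zuo Zhengping, Huang Changning. An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation [J]. Journal of Chinese Information Processing, 2000, 14(1): 1-6.
Authors: Sun Maosong, Zuo Zhengping, Huang Changning
Affiliation: Department of Computer Science and Technology, Tsinghua University
Funding: This research was supported by the National Natural Science Foundation of China (contract no. 69433010).
Abstract: The segmentation dictionary is a basic component of a Chinese automatic word segmentation system, and its lookup speed directly affects the processing speed of the system. This paper designs and experimentally examines three typical dictionary mechanisms: binary seek by word, TRIE indexing tree, and binary seek by characters, with emphasis on comparing their time and space efficiency. The experiments show that the dictionary mechanism based on binary seek by characters is simple and efficient, and meets the needs of practical Chinese automatic word segmentation systems well.

Keywords: Chinese information processing; Chinese automatic word segmentation; dictionary mechanism for Chinese automatic word segmentation
Revised: April 6, 1999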

An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation
Sun Maosong, Zuo Zhengping, Huang Changning. An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation [J]. Journal of Chinese Information Processing, 2000, 14(1): 1-6.
Authors: Sun Maosong, Zuo Zhengping, Huang Changning
Affiliation: The State Key Laboratory of Intelligent Technology and Systems; Department of Computer Science and Technology, Tsinghua University
Abstract: The dictionary mechanism is one of the basic components of a Chinese word segmentation system, and its performance significantly influences segmentation speed. In this paper, we design and implement three typical dictionary mechanisms from the word segmentation point of view, namely binary seek by word, TRIE indexing tree, and binary seek by characters, and compare their space and time efficiency experimentally. The experiments show that binary seek by characters is the most appropriate of the three, as it best meets the speed requirements of practical Chinese word segmenters.
Keywords: Chinese information processing; Chinese word segmentation; dictionary mechanism for Chinese word segmentation
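
The abstract names three dictionary mechanisms without showing how they differ in use. The following is a minimal Python sketch of the general idea behind each, not the authors' implementation: the flat sorted word list, the nested-dict trie, the `max_len` cutoff, and all function names are assumptions made for illustration. Each function answers the question a segmenter typically asks, namely which dictionary words begin at a given position in the text.

```python
from bisect import bisect_left


def build_lexicon(words):
    """Sorted, de-duplicated word list shared by both binary-search variants."""
    return sorted(set(words))


def by_word(lexicon, text, start=0, max_len=4):
    """Binary seek by word: one full binary search per candidate substring."""
    found = []
    for n in range(1, min(max_len, len(text) - start) + 1):
        cand = text[start:start + n]
        i = bisect_left(lexicon, cand)
        if i < len(lexicon) and lexicon[i] == cand:
            found.append(cand)
    return found


def _char_at(word, k):
    # "" sorts before every character, so a word that ends early comes first.
    return word[k] if k < len(word) else ""


def _bound(lexicon, lo, hi, k, ch, upper):
    """First index in [lo, hi) whose k-th character is >= ch (or > ch if upper)."""
    while lo < hi:
        mid = (lo + hi) // 2
        c = _char_at(lexicon[mid], k)
        if c < ch or (upper and c == ch):
            lo = mid + 1
        else:
            hi = mid
    return lo


def by_characters(lexicon, text, start=0):
    """Binary seek by characters: narrow one range a character at a time."""
    lo, hi = 0, len(lexicon)
    found = []
    for k in range(len(text) - start):
        ch = text[start + k]
        lo = _bound(lexicon, lo, hi, k, ch, upper=False)
        hi = _bound(lexicon, lo, hi, k, ch, upper=True)
        if lo == hi:  # no entry continues with this character
            break
        # Entries in [lo, hi) share a prefix of length k + 1; if that prefix
        # is itself a word, it is the first entry of the range.
        if len(lexicon[lo]) == k + 1:
            found.append(lexicon[lo])
    return found


def build_trie(words):
    """Trie as nested dicts; the key "" marks the end of a word (assumption)."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node[""] = True
    return root


def by_trie(trie, text, start=0):
    """TRIE lookup: follow one child link per character."""
    node, found = trie, []
    for k in range(start, len(text)):
        node = node.get(text[k])
        if node is None:
            break
        if "" in node:
            found.append(text[start:k + 1])
    return found


lexicon = build_lexicon(["中", "中国", "中国人", "中华", "人民", "国人"])
print(by_characters(lexicon, "中国人民"))  # ['中', '中国', '中国人']
```

In this sketch, `by_word` restarts a search over the whole list for every candidate length, while `by_characters` reuses the range already narrowed by the preceding characters and collects all matching prefixes in a single pass over the same sorted list. The trie gives the same result by following links, at the cost of a separate node structure.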