
An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation

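Cite this article:	Sun Maosong, Zuo Zhengping, Huang Changning. An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation[J]. Journal of Chinese Information Processing (中文信息学报), 2000, 14(1): 1-6.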

Authors:	Sun Maosong (孙茂松), Zuo Zhengping (左正平), Huang Changning (黄昌宁)

Affiliation:	Department of Computer Science and Technology, Tsinghua University

Funding:	This research was supported by the National Natural Science Foundation of China (contract no. 69433010)

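Abstract:	The segmentation dictionary is a basic component of a Chinese automatic word segmentation system, and its lookup speed directly affects the processing speed of the system. This paper designs and experimentally examines three typical dictionary mechanisms: whole-word binary search, TRIE index tree, and character-by-character binary search, with emphasis on comparing their time and space efficiency. The experiments show that the dictionary mechanism based on character-by-character binary search is simple and efficient, and meets the needs of practical Chinese automatic word segmentation systems well.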
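Keywords:	Chinese information processing; Chinese automatic word segmentation; dictionary mechanism for Chinese automatic word segmentation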
Revision received:	April 6, 1999
An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation

Sun Maosong, Zuo Zhengping, Huang Changning. An Experimental Study on Dictionary Mechanism for Chinese Word Segmentation[J]. Journal of Chinese Information Processing, 2000, 14(1): 1-6.

Authors:	Sun Maosong, Zuo Zhengping, Huang Changning

Affiliation:	The State Key Laboratory of Intelligent Technology and Systems; Department of Computer Science and Technology, Tsinghua University

Abstract:	The dictionary mechanism is one of the basic components of a Chinese word segmentation system, and its performance significantly influences segmentation speed. In this paper, we design and implement three typical dictionary mechanisms from the word segmentation point of view, namely binary seek by word, TRIE indexing tree, and binary seek by characters, and compare their time and space efficiency experimentally. The results indicate that binary seek by characters is the most appropriate of the three: it is simple and efficient, and it meets the speed requirements of practical Chinese word segmenters well.

Keywords:	Chinese information processing; Chinese word segmentation; Dictionary mechanism for Chinese word segmentation
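
The abstract names the three mechanisms but this record gives no implementation details. As a rough illustration only, the Python sketch below contrasts two of them over a plain sorted word list: one binary search per candidate word ("binary seek by word") versus narrowing a single index range one character at a time ("binary seek by characters"). The function names, the flat sorted list, and the `max_len` cap are assumptions made for this example and are not taken from the paper; the TRIE variant and any first-character index are omitted.

```python
from bisect import bisect_left


def build_dictionary(entries):
    """Return the entries as a sorted, duplicate-free list (hypothetical layout)."""
    return sorted(set(entries))


def prefix_words_by_word(words, text, start=0, max_len=4):
    """Whole-word lookup: one full binary search for each candidate substring."""
    matches = []
    for end in range(start + 1, min(len(text), start + max_len) + 1):
        candidate = text[start:end]
        i = bisect_left(words, candidate)
        if i < len(words) and words[i] == candidate:
            matches.append(candidate)
    return matches


def _bound(words, lo, hi, depth, ch, strict):
    """First index in [lo, hi) whose character at `depth` is >= ch (> ch if strict)."""
    while lo < hi:
        mid = (lo + hi) // 2
        key = words[mid][depth:depth + 1]  # "" when the word is too short
        if key < ch or (strict and key == ch):
            lo = mid + 1
        else:
            hi = mid
    return lo


def prefix_words_by_char(words, text, start=0):
    """Character-by-character lookup: shrink one index range per input character."""
    lo, hi = 0, len(words)
    matches = []
    depth = 0
    while start + depth < len(text) and lo < hi:
        ch = text[start + depth]
        new_lo = _bound(words, lo, hi, depth, ch, strict=False)
        new_hi = _bound(words, new_lo, hi, depth, ch, strict=True)
        lo, hi = new_lo, new_hi
        depth += 1
        # Every word in [lo, hi) now shares the first `depth` characters of the
        # input; if that prefix is itself a word, it sorts first in the range.
        if lo < hi and len(words[lo]) == depth:
            matches.append(words[lo])
    return matches


words = build_dictionary(["中", "中国", "中国人", "中华", "人民", "国人"])
print(prefix_words_by_word(words, "中国人民"))
print(prefix_words_by_char(words, "中国人民"))
```

Both functions return every dictionary word that starts at the given position, which is the query a maximum-matching segmenter typically needs. The character-wise version reuses the range found for shorter prefixes and stops as soon as the range becomes empty, so it needs no cap on word length.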
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
