
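Research on and Improvement of the Chinese Automatic Word Segmentation Mechanism
Author affiliation: School of Software Engineering, Tongji University, Shanghai 201804
Abstract: This paper studies Chinese word segmentation techniques and improves the traditional whole-word binary-search segmentation mechanism. It designs a new dictionary structure in which words are organized by their number of characters, which makes the dictionary easier to update and extend, and proposes a corresponding fast segmentation algorithm based on this structure. Comparative experiments show that this method segments faster than the traditional whole-word binary-search, character-by-character binary-search, and TRIE index tree segmentation methods.

Keywords: natural language processing; Chinese word segmentation; dictionary-based word segmentation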

An Improved Mechanism on the Chinese Word Segmentation
GUO Yi. An Improved Mechanism on the Chinese Word Segmentation[J]. Digital Community & Smart Home, 2008, 0(7)
Authors: GUO Yi
Abstract: This paper studies Chinese word segmentation and improves the traditional whole-word binary-search segmentation mechanism. It proposes a new dictionary structure in which words are organized by their character count, which makes the dictionary easy to update and to extend with new words, and it puts forward a fast segmentation algorithm based on this structure. Comparative experiments show that the new algorithm segments faster than whole-word binary-search segmentation, character-by-character binary-search segmentation, and TRIE index tree segmentation.
Keywords: natural language processing; Chinese word segmentation; dictionary-based Chinese word segmentation
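
The abstract describes a dictionary organized by word length together with a fast matching algorithm, but this record does not give the details. The following is only a minimal sketch of how such a structure might look, under stated assumptions: the class name `LengthBucketDictionary`, the function `segment`, the use of sorted lists with binary search inside each length bucket, and the forward maximum matching strategy are all hypothetical choices for illustration and are not taken from the paper.

```python
from bisect import bisect_left
from collections import defaultdict


class LengthBucketDictionary:
    """Hypothetical dictionary that groups words by character count."""

    def __init__(self, words=()):
        # buckets[n] is a sorted list of all words with exactly n characters.
        self.buckets = defaultdict(list)
        for word in words:
            self.add(word)

    def add(self, word):
        # Adding a word only touches the bucket for its own length.
        bucket = self.buckets[len(word)]
        i = bisect_left(bucket, word)
        if i == len(bucket) or bucket[i] != word:
            bucket.insert(i, word)

    def __contains__(self, word):
        # Binary search restricted to words of the same length.
        bucket = self.buckets.get(len(word))
        if not bucket:
            return False
        i = bisect_left(bucket, word)
        return i < len(bucket) and bucket[i] == word

    @property
    def max_len(self):
        # Length of the longest word stored, or 1 if the dictionary is empty.
        return max((n for n, b in self.buckets.items() if b), default=1)


def segment(text, dictionary):
    """Forward maximum matching (an assumed strategy, not the paper's)."""
    result = []
    pos = 0
    while pos < len(text):
        longest = min(dictionary.max_len, len(text) - pos)
        # Try the longest candidate first, down to two characters.
        for n in range(longest, 1, -1):
            candidate = text[pos:pos + n]
            if candidate in dictionary:
                result.append(candidate)
                pos += n
                break
        else:
            # No multi-character word matched: emit a single character.
            result.append(text[pos])
            pos += 1
    return result


words = ["中文", "分词", "中文分词", "技术", "研究"]
print(segment("研究中文分词技术", LengthBucketDictionary(words)))
```

In this sketch, each lookup searches only the words that share the candidate's length, which is one plausible reading of why a length-organized dictionary could be quick to search and simple to extend. Whether the paper's algorithm works this way cannot be confirmed from the abstract alone.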
This article has been indexed by CNKI and other databases.