Research on Chinese Word Segmentation Technology Based on Lucene
Cite this article: TANG Guofeng, XU Zhenwei, ZHANG Huaxiong. Research on Chinese Word Segmentation Technology Based on Lucene [J]. Computer Programming Skills & Maintenance, 2012, 0(10): 4-5, 12
Authors: TANG Guofeng, XU Zhenwei, ZHANG Huaxiong
Affiliation: Zhejiang Sci-Tech University, Hangzhou 310018
Abstract: This paper analyzes several existing Chinese word segmentation algorithms and proposes a segmentation method that combines the reverse maximum matching algorithm with semantic understanding, using maximum-probability segmentation to resolve cases where multiple segmentations are possible. The method improves Lucene's [1] Chinese word segmentation algorithm, raising both segmentation speed and accuracy.

Keywords: Chinese word segmentation  Lucene  maximum matching  maximum probability

The Research of Chinese Segmentation Based on Lucene
TANG Guofeng , XU Zhenwei , ZHANG Huaxiong. The Research of Chinese Segmentation Based on Lucene[J]. Computer Programming Skills & Maintenance, 2012, 0(10): 4-5,12
Authors:TANG Guofeng    XU Zhenwei    ZHANG Huaxiong
Affiliation: Zhejiang Sci-Tech University, Hangzhou 310018
Abstract: This paper analyzes several existing Chinese word segmentation algorithms and proposes a method based on the reverse maximum matching algorithm combined with semantic understanding. A maximum-probability segmentation step resolves cases where multiple candidate segmentations exist. The approach improves Lucene's Chinese word segmentation algorithm, increasing both segmentation speed and accuracy.
Keywords:Chinese segmentation  Lucene  maximal matching  maximum probability
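The two techniques named in the abstract, reverse maximum matching and maximum-probability disambiguation, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tiny dictionary, its frequency counts, and the function names are hypothetical, and a real segmenter would use a large corpus-derived dictionary and a word-lattice dynamic program rather than comparing only two candidate segmentations.

```python
import math

# Hypothetical toy dictionary with unigram frequency counts; a real system
# would derive these from a large corpus.
FREQ = {"研究": 20, "研究生": 5, "生命": 15, "的": 50, "起源": 10, "命": 8}
TOTAL = sum(FREQ.values())

def reverse_max_match(text, max_len=3):
    """Reverse maximum matching: scan from the end of the sentence,
    taking the longest dictionary word at each position."""
    words, end = [], len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            if size == 1 or piece in FREQ:  # single chars always accepted
                words.append(piece)
                end -= size
                break
    words.reverse()
    return words

def forward_max_match(text, max_len=3):
    """Forward maximum matching, used here only to produce an
    alternative candidate segmentation."""
    words, start = [], 0
    while start < len(text):
        for size in range(min(max_len, len(text) - start), 0, -1):
            piece = text[start:start + size]
            if size == 1 or piece in FREQ:
                words.append(piece)
                start += size
                break
    return words

def log_prob(words):
    """Sum of unigram log-probabilities; unknown words get a small floor."""
    return sum(math.log(FREQ.get(w, 0.5) / TOTAL) for w in words)

def segment(text):
    """Maximum-probability disambiguation: when the candidate
    segmentations disagree, keep the most probable one."""
    candidates = {tuple(forward_max_match(text)),
                  tuple(reverse_max_match(text))}
    return list(max(candidates, key=log_prob))
```

On the classic ambiguous phrase 研究生命的起源 ("study the origin of life"), forward matching greedily takes 研究生 ("graduate student") and yields ["研究生", "命", "的", "起源"], while reverse matching yields ["研究", "生命", "的", "起源"]; the unigram probabilities favor the latter, which is the correct reading.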
This article has been indexed by CNKI, VIP, Wanfang Data, and other databases.