
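Improving the Segmenter Word Segmentation Algorithm for Full-Text Retrieval
Cite this article: Zhao Yuandong, Chen Kang, Chen Jianhua. Improving the Segmenter Word Segmentation Algorithm for Full-Text Retrieval[J]. Digital Community & Smart Home, 2009, 5(1): 202-205.
Authors: Zhao Yuandong  Chen Kang  Chen Jianhua
Affiliation: School of Information and Control, Nanjing University of Information Science & Technology, Nanjing, Jiangsu 210044
Funding: Natural Science Foundation of the Jiangsu Provincial Department of Education (07KJB510068)
Abstract: The widely used Segmenter word segmentation algorithm is open-source Java code and a good segmentation tool. A full-text retrieval system, however, places special requirements on segmentation, such as ambiguity tolerance and domain-specific dictionaries. Both the Segmenter dictionary and its algorithm were therefore reworked: a tree-structured dictionary was built, and the matching algorithm uses an ambiguity-tolerance rule, branch handling, dynamic programming, and dictionary preloading, so that it suits a full-text retrieval system for e-commerce cases.

Keywords: word segmentation  full-text retrieval  tree-structured dictionary  ambiguity tolerance  branch handling  dynamic programming  dictionary preloading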

The Improvement on the Arithmetic of Segementer for Word Split Based on Full Text Retrieves
ZHAO Yuan-dong,CHEN Kang,CHEN Jian-hua.The Improvement on the Arithmetic of Segementer for Word Split Based on Full Text Retrieves[J].Digital Community & Smart Home,2009,5(1):202-205.
Authors:ZHAO Yuan-dong  CHEN Kang  CHEN Jian-hua
Affiliation: Dept. of Information and Control, Nanjing University of Information Science & Technology, Nanjing 210044, China
Abstract: The popular Segmenter word segmentation algorithm is open-source Java code and a good tool for word segmentation. However, a full-text retrieval system has special requirements for word segmentation, such as ambiguity tolerance and a professional dictionary. To meet them, both the dictionary and the algorithm were modified: a tree-shaped dictionary was built, and the matching algorithm uses an ambiguity-tolerance rule, branch handling, dynamic programming, and dictionary pre-loading, so that it can serve our full-text retrieval system for e-commerce cases.
Keywords: word segmentation  full-text retrieval system  tree-shaped dictionary  ambiguity tolerance  branch handling  dynamic programming  dictionary pre-loading
This article is indexed in CNKI, VIP, and other databases.
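
The abstract mentions a tree-structured dictionary and a matching step that tolerates ambiguity. As a rough illustration only, the Java sketch below builds a character trie and returns every dictionary word that starts at a given position, so overlapping candidates are all kept for indexing. This is one plausible reading of those two ideas, not the authors' implementation: the class `TrieDictionary` and its methods `add`, `matchesAt`, and `allCandidates` are hypothetical names, and nothing here is taken from the Segmenter source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; these names do not come from Segmenter or the paper.
class TrieDictionary {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord; // true if the path from the root to this node spells a dictionary word
    }

    private final Node root = new Node();

    // Insert one dictionary word, creating child nodes as needed.
    void add(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    // Return every dictionary word that starts at position start of text, shortest first.
    List<String> matchesAt(String text, int start) {
        List<String> found = new ArrayList<>();
        Node node = root;
        for (int i = start; i < text.length(); i++) {
            node = node.children.get(text.charAt(i));
            if (node == null) {
                break; // no dictionary word continues with this character
            }
            if (node.isWord) {
                found.add(text.substring(start, i + 1));
            }
        }
        return found;
    }

    // Collect the dictionary words found at every position, keeping overlapping candidates.
    List<String> allCandidates(String text) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            out.addAll(matchesAt(text, i));
        }
        return out;
    }
}
```

With a dictionary containing 电子, 电子商务, 商务, and 案例, calling `allCandidates("电子商务案例")` keeps both 电子 and 电子商务 instead of committing to one of them, which is the behaviour a retrieval index would want if recall matters more than a single best segmentation. Choosing one best path among these candidates (for example by dynamic programming) would be a separate step that this sketch does not attempt.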