
Design and Implementation of a Lucene-Based Full-Text Retrieval System
Cite this article: ZHOU Jing-cai, HU Hua-ping, YUE Hong. Design and implementation of Lucene-based full-text retrieval system[J]. Computer Engineering & Science, 2015, 37(2): 252-256
Authors: ZHOU Jing-cai, HU Hua-ping, YUE Hong
Affiliations: 1. Troop 61070, Fuzhou, Fujian 350003, China
2. Troop 61070, Fuzhou, Fujian 350003, China; College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China
Funding: National 863 Program of China (2012AA7116048)
Abstract: As informatization advances, quickly locating the required content in massive volumes of information has become a research hotspot. Based on an analysis of the fundamentals of full-text retrieval and the architecture of Lucene, we propose an MVC-pattern full-text retrieval model and implement a full-text retrieval system built on the SSH framework and the Lucene search engine. The system makes three contributions. First, it extends the supported document types: in addition to TXT and MS Office documents, it can search PDF, HTML, and RTF documents. Second, it improves the Chinese word segmenter, raising both the efficiency and the accuracy of segmentation. Third, it improves human-computer interaction, providing a result display similar to that of Baidu and Google in which the search keywords are highlighted. Practical use of the system shows that it creates indexes efficiently, retrieves quickly, and returns fairly complete results.

Keywords: Lucene; document parsing; full-text retrieval; search engine
Received: 2013-06-24
Revised: 2013-09-29
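
The index, search, and highlight flow summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use Lucene: it is a plain-Python stand-in, and the names `tokenize`, `build_index`, `search`, and `highlight`, the sample documents, and the `<b>` markers are all assumptions made for illustration. The tokenizer splits on word characters only, so it does not perform the Chinese word segmentation the paper describes.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercased word tokens. A stand-in for a real analyzer:
    a run of Chinese characters would come out as one token here,
    since no word segmentation is performed."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Inverted index: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


def highlight(text, query, pre="<b>", post="</b>"):
    """Wrap each whole-word occurrence of a query term in pre/post markers."""
    terms = sorted(set(tokenize(query)))
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Lucene builds an inverted index",
    2: "Full-text retrieval with Lucene",
    3: "PDF and HTML parsing",
}
index = build_index(docs)

for doc_id in sorted(search(index, "lucene")):
    print(doc_id, highlight(docs[doc_id], "lucene"))
```

A production system of the kind described would delegate tokenization, indexing, and highlighting to Lucene's own analyzer, index writer, and highlighter components rather than to hand-written functions like these.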
This article is indexed in CNKI, Wanfang Data, and other databases.