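A Brief Analysis of Web Forum Information Analysis and Its Implementation Based on the Lucene Framework
Affiliation: School of Software Engineering, Tongji University
Abstract: Today's Internet forums, post bars (Tieba), personal spaces, and blogs contain a great deal of user-posted information about salaries, occupations, and related topics. On university BBS boards, for example, each year when new graduates enter the job market, a considerable number of students ask about and discuss the salaries and benefits offered by various companies, and some students who have received offers disclose, to some extent, what those companies pay. This kind of company salary information is a valuable reference for future student job seekers and for professionals considering a job change. Likewise, major forums and interactive communities carry a large number of salary threads. Unfortunately, these salary discussions are buried in the vastness of the Internet, and after nearly ten years of Internet growth in mainland China the volume of such information has become massive. Using the mature Lucene framework, text retrieval theory (Information Retrieval), and related word segmentation, classification, and indexing techniques, this information is extracted and analyzed according to keywords submitted by users. Career information with real reference value is retrieved according to users' needs, providing decision support for job hunting and job changes.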

Keywords: network  data  search  Chinese word segmentation  company benefits  company reviews  Lucene

Web Forum Data Analyzing Based on Lucene
LIU Fei. Web Forum Data Analyzing Based on Lucene[J]. Digital Community & Smart Home, 2008, 0(25)
Authors:LIU Fei
Abstract: Nowadays, forums, post bars, and blogs hold millions of records on salaries, pay, bonuses, and company reviews. On every university forum, for example, company salaries, bonuses, and reviews are always among the hottest topics. For undergraduates searching for jobs, this information is one of the most valuable references. In particular, students who accept a company's offer often post its details on their school forum or in a private space such as a blog or QQ Zone. Unfortunately, this information is usually buried as the volume of data grows, and over roughly the past decade hundreds of millions of such records have accumulated on the Internet. The study shows that the Lucene framework is fully capable of analyzing this data, segmenting the text, and returning relevant results for user requests such as salary and company inquiries.
Keywords: Internet  Search  Chinese word segmentation  Salary  Lucene
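
The pipeline outlined in the abstract (segment the text, build an index, answer keyword queries) can be illustrated with a minimal sketch. This is not Lucene's API and not the author's implementation: it is a plain-Python inverted index that uses overlapping character bigrams as a simple stand-in for Chinese word segmentation. The function names (`bigrams`, `build_index`, `search`) and the sample posts are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def bigrams(text):
    """Split text into overlapping character bigrams.

    Assumption: bigrams approximate Chinese word segmentation; a real
    system would plug in a proper analyzer instead.
    """
    # Drop whitespace and punctuation, keep letters, digits, and CJK characters.
    chars = [c for c in text if c.isalnum()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def build_index(posts):
    """Build an inverted index mapping each bigram to the ids of posts containing it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for token in bigrams(text):
            index[token].add(post_id)
    return index


def search(index, query):
    """Return post ids ranked by how many distinct query bigrams they contain."""
    scores = defaultdict(int)
    for token in set(bigrams(query)):
        for post_id in index.get(token, ()):
            scores[post_id] += 1
    # Higher score first; ties broken by post id for a stable order.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


# Hypothetical forum posts, invented for this sketch.
posts = {
    1: "某公司应届生薪水待遇如何",
    2: "周末一起去打球",
    3: "分享某公司的薪水和福利",
}
index = build_index(posts)
```

Under these assumptions, `search(index, "薪水")` returns the two salary-related posts and skips the unrelated one. Ranking by the count of matching bigrams is a deliberate simplification; Lucene itself applies a far richer scoring model.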
This article has been indexed by CNKI and other databases.