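Research on Evaluating Keyword Importance Based on the Fusion of Multiple Knowledge Sources
Cite this article: LIU Yuan-chao, WU Chong, WANG Xiao-long. Research on evaluating keyword importance based on the fusion of multiple knowledge sources[J]. Journal of Harbin Institute of Technology, 2007, 39(7): 1138-1141.
Authors: LIU Yuan-chao  WU Chong  WANG Xiao-long
Affiliations: 1. School of Management, Harbin Institute of Technology, Harbin 150001; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
2. School of Management, Harbin Institute of Technology, Harbin 150001
3. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001
Funding: National High Technology Research and Development Program of China (863 Program), National Natural Science Foundation of China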
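Abstract: Evaluating the importance of words is an important step in keyword extraction research. One commonly used method performs a weighted analysis of a word's relevant attributes and determines its degree of importance from the combined weight. A word's position, its frequency, its part of speech, and its co-occurrence with cue words are all important factors affecting keyword extraction. This paper first discusses and analyzes the factors that may affect keyword extraction, and then uses a genetic algorithm to optimize the parameters of each knowledge source. Test results on a manually annotated corpus verify the feasibility of the method.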

Keywords: keyword extraction  parameter optimization  genetic algorithm  knowledge source
Article ID: 0367-6234(2007)07-1138-04
Revision date: 2005-08-31

The evaluation of word importance in keyword extraction based on the fusion of multiple knowledge sources
LIU Yuan-chao, WU Chong, WANG Xiao-long. The evaluation of word importance in keyword extraction based on the fusion of multiple knowledge sources[J]. Journal of Harbin Institute of Technology, 2007, 39(7): 1138-1141.
Authors:LIU Yuan-chao  WU Chong  WANG Xiao-long
Affiliation:1. School of Management, Harbin Institute of Technology, Harbin 150001, China; 2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract: Evaluating word importance is one of the key steps in keyword extraction. A popular approach computes a comprehensive weight for every content word from its attributes; the chance that a content word is selected as a keyword is determined by this weight. Word location, word frequency, part of speech (POS), and co-occurrence with cue words are all key factors in computing the comprehensive weight. In this paper, the impact of these factors on keyword extraction is first analyzed from theoretical and statistical angles, and then a genetic algorithm (GA) is used to optimize the coefficients of these attributes. Tests on a human-tagged corpus verify that the method is feasible.
Keywords:keyword extraction  parameter optimization  genetic algorithm  knowledge source
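
The approach summarized in the abstract (a comprehensive weight per word, with attribute coefficients tuned by a genetic algorithm) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the weighting formula, the GA operators, or the fitness measure, so the linear weighted sum, the one-point crossover, the Gaussian mutation, the recall-style fitness, and the toy data `DOCS` below are all assumptions made for illustration only.

```python
import random

# Hypothetical attribute order: position, frequency, part of speech, cue-word co-occurrence.
FEATURES = ("position", "frequency", "pos", "cue")

# Toy "human-tagged" data (invented for this sketch): each document maps a word
# to its attribute scores in [0, 1] and is paired with its gold keyword set.
DOCS = [
    ({"genetic": (1.0, 0.8, 1.0, 0.5), "algorithm": (1.0, 0.9, 1.0, 0.0),
      "very": (0.2, 0.9, 0.0, 0.0), "result": (0.1, 0.3, 1.0, 0.0)},
     {"genetic", "algorithm"}),
    ({"keyword": (0.9, 1.0, 1.0, 1.0), "extraction": (0.8, 0.7, 1.0, 1.0),
      "also": (0.5, 0.8, 0.0, 0.0), "paper": (0.3, 0.9, 1.0, 0.0)},
     {"keyword", "extraction"}),
]

def score(weights, features):
    """Comprehensive weight of one word, assumed here to be a linear weighted sum."""
    return sum(w * f for w, f in zip(weights, features))

def fitness(weights, docs, top_k=2):
    """Fraction of gold keywords found among the top_k highest-scoring words."""
    hits = total = 0
    for words, gold in docs:
        ranked = sorted(words, key=lambda w: score(weights, words[w]), reverse=True)
        hits += len(set(ranked[:top_k]) & gold)
        total += len(gold)
    return hits / total

def optimize(docs, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    """Simple elitist GA over coefficient vectors in [0, 1]^4 (assumed operators)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in FEATURES] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half unchanged, then refill with mutated offspring.
        pop.sort(key=lambda w: fitness(w, docs), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, len(FEATURES))  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))  # clamped Gaussian mutation
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: fitness(w, docs))

best = optimize(DOCS)
print(dict(zip(FEATURES, best)), fitness(best, DOCS))
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. On a real corpus, the fitness function would be replaced by whatever evaluation measure the annotated data supports.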
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.