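A Survey of Feature-Driven Keyword Extraction Algorithms
Cite this article: CHANG Yao-Cheng, ZHANG Yu-Xiang, WANG Hong, WAN Huai-Yu, XIAO Chun-Jing. A Survey of Feature-Driven Keyword Extraction Algorithms[J]. Journal of Software, 2018, 29(7): 2046-2070
Authors: CHANG Yao-Cheng  ZHANG Yu-Xiang  WANG Hong  WAN Huai-Yu  XIAO Chun-Jing
Affiliations: School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300 (CHANG Yao-Cheng, ZHANG Yu-Xiang, WANG Hong, XIAO Chun-Jing); School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044 (WAN Huai-Yu)
Funding: National Natural Science Foundation of China (U1533104, U1633110, 61603028); Fundamental Research Funds for the Central Universities (ZXH2012P009)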
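Abstract: Automatic keyword extraction from text has long been a key fundamental problem and an active research topic in natural language processing. In particular, the growing demand for applications of text data has drawn further attention from researchers to keyword extraction techniques. Although these techniques have advanced considerably in recent years, extraction results are still far from satisfactory. To help advance work on the keyword extraction problem, this paper systematically summarizes the results achieved in this area in recent years by researchers in China and abroad, covering three main steps: candidate keyword generation, feature engineering, and keyword extraction. It also discusses possible directions for future research. Unlike surveys organized around extraction methods, this paper organizes existing work around the feature information that each method uses. Examining existing results from this feature-driven perspective helps in combining existing features or proposing new ones, and in turn in proposing more effective keyword extraction methods.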

Keywords: keyword extraction  candidate keyword generation  features  supervised methods  graph-based methods
Received: 2017-07-19
Revised: 2017-11-02

Features Oriented Survey of State-of-the-Art Keyphrase Extraction Algorithms
CHANG Yao-Cheng,ZHANG Yu-Xiang,WANG Hong,WAN Huai-Yu and XIAO Chun-Jing. Features Oriented Survey of State-of-the-Art Keyphrase Extraction Algorithms[J]. Journal of Software, 2018, 29(7): 2046-2070
Authors:CHANG Yao-Cheng  ZHANG Yu-Xiang  WANG Hong  WAN Huai-Yu  XIAO Chun-Jing
Affiliation: School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
Abstract: Keyphrases that efficiently represent the main topics discussed in a document are widely used in various document processing tasks, so automatic keyphrase extraction has been one of the fundamental problems and active research topics in the field of natural language processing (NLP). Although automatic keyphrase extraction has received a lot of attention and extraction techniques have developed quickly, state-of-the-art performance on this task is still far from satisfactory. To help address the keyphrase extraction problem, this paper presents a survey of the state of the art in keyphrase extraction, mainly covering candidate keyphrase generation, feature engineering, and keyphrase extraction models. In addition, some published datasets are listed, the evaluation approaches are analyzed, and the challenges and trends of automatic keyphrase extraction techniques are discussed. Unlike existing surveys, which mainly focus on the models of keyphrase extraction, this paper provides a feature-oriented survey of automatic keyphrase extraction. This perspective may help to combine existing features and to propose new, effective extraction approaches.
Keywords:keyphrase extraction  candidate keyphrase generation  features  supervised approaches  graph-based approaches
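
The three steps named in the abstract (candidate keyphrase generation, feature engineering, keyphrase extraction) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the surveyed paper: the tiny stopword list, the n-gram candidate rule, the three features (term frequency, relative first position, phrase length), and the hand-picked scoring formula are all hypothetical choices, and the function names are invented here.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "or", "to",
             "is", "are", "that", "this", "with", "as", "by", "from", "it"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic (optionally hyphenated) tokens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def generate_candidates(tokens, max_len=3):
    """Step 1: emit every n-gram (n <= max_len) that contains no stopword.

    Returns (phrase, start_index) pairs in order of start index.
    Sentence boundaries are ignored to keep the sketch short.
    """
    candidates = []
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            gram = tokens[i:i + n]
            if len(gram) < n or any(t in STOPWORDS for t in gram):
                break  # longer n-grams from this start would fail too
            candidates.append((" ".join(gram), i))
    return candidates


def compute_features(candidates, n_tokens):
    """Step 2: compute a few simple per-phrase features."""
    freq = Counter(phrase for phrase, _ in candidates)
    first = {}
    for phrase, pos in candidates:
        first.setdefault(phrase, pos)  # candidates arrive in position order
    return {
        phrase: {
            "tf": freq[phrase],
            "first_pos": first[phrase] / max(n_tokens, 1),
            "length": len(phrase.split()),
        }
        for phrase in freq
    }


def extract_keyphrases(text, top_k=5):
    """Step 3: rank candidates with a hand-picked (hypothetical) scoring rule."""
    tokens = tokenize(text)
    feats = compute_features(generate_candidates(tokens), len(tokens))
    scores = {p: f["tf"] * f["length"] * (1.0 - f["first_pos"])
              for p, f in feats.items()}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]


if __name__ == "__main__":
    sample = "Keyphrase extraction is a task. Keyphrase extraction uses features."
    print(extract_keyphrases(sample, top_k=3))
```

In this sketch the scoring rule stands in for the extraction model. A supervised approach would instead feed the feature dictionary to a trained classifier, and a graph-based approach would rank candidates on a word graph. The feature step is kept separate so that features can be added or swapped without touching the other two steps.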