
Review of Pre-training Models for Natural Language Processing
Citation: YU Tongrui, JIN Ran, HAN Xiaozhen, LI Jiahui, YU Ting. Review of Pre-training Models for Natural Language Processing[J]. Computer Engineering and Applications, 2020, 56(23): 12-22.
Authors: YU Tongrui  JIN Ran  HAN Xiaozhen  LI Jiahui  YU Ting
Affiliation: 1. College of Big Data and Software Engineering, Zhejiang Wanli University, Ningbo, Zhejiang 315100, China  2. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Funding: Ningbo Natural Science Foundation; National Undergraduate Innovation and Entrepreneurship Training Program; General Project of the Zhejiang Provincial Department of Education; Zhejiang Provincial Basic Public Welfare Research Program; Humanities and Social Sciences Research Project of the Ministry of Education; National Natural Science Foundation of China
Abstract: In recent years, deep learning has been widely applied across many fields, and deep-learning-based pre-training models have brought natural language processing into a new era. The goal of pre-training is to leave the model in a good initial state so that it achieves better performance on downstream tasks. This survey introduces pre-training techniques and their development history, and groups models by their characteristics into traditional models based on probability and statistics and newer models based on deep learning. It briefly analyzes the characteristics and limitations of the traditional pre-training models; focuses on the deep-learning-based pre-training models, comparing and evaluating their performance on downstream tasks; reviews a series of instructive new pre-training models, outlining their improvement mechanisms and the performance gains they achieve on downstream tasks; and finally summarizes the problems facing current pre-training models and discusses future trends.

Keywords: deep learning; natural language processing; pre-training; word embedding; language model

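Among the traditional probability-statistics-based models that the survey contrasts with deep-learning approaches, the n-gram language model is the canonical example. A minimal sketch of a bigram language model estimated by maximum likelihood (an illustration of the general technique, not code from the paper):

```python
from collections import Counter

def train_bigram_lm(corpus):
    """Estimate P(w_i | w_{i-1}) by maximum likelihood from raw sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])                  # history (conditioning word) counts
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent word-pair counts
    # MLE: P(w2 | w1) = count(w1 w2) / count(w1)
    return {pair: count / unigrams[pair[0]] for pair, count in bigrams.items()}

corpus = ["the cat sat", "the dog sat", "the cat ran"]
lm = train_bigram_lm(corpus)
print(lm[("the", "cat")])  # 2/3: "the cat" occurs twice, "the" occurs three times
```

Such count-based models assign zero probability to unseen bigrams and cannot share statistical strength across similar words; these are precisely the limitations that dense word embeddings and neural language models were introduced to address.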
This article is indexed by Wanfang Data and other databases. The original abstract and full text are available from Computer Engineering and Applications.