
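A Simple Shared Method for Multilayer Gradient Supply
Citation:DU Fei, YANG Yun, HU Yuan-Yuan, CAO Li-Juan. A Simple Shared Method for Multilayer Gradient Supply[J]. Journal of Software, 2020, 31(7): 2157-2168
Authors:DU Fei  YANG Yun  HU Yuan-Yuan  CAO Li-Juan
Affiliation:National Pilot School of Software, Yunnan University, Kunming 650504, China;National Pilot School of Software, Yunnan University, Kunming 650504, China;Kunming Key Laboratory of Data Science and Intelligent Computing, Kunming 650504, China;Yunnan Provincial University Key Laboratory of Data Science and Intelligent Computing, Kunming 650504, China
Funding:National Natural Science Foundation of China (61663046, 61876166);Applied Basic Research Program of Yunnan Province (2016FB104);Yunnan Province Reserve Talent Program for Young and Middle-Aged Academic and Technical Leaders (2017HB005);Yunnan Province Innovation Team Program (2017HC012);Yunnan Provincial University Key Laboratory Construction Program
Abstract:Through multilayer feature extraction, deep learning automatically represents raw, complex data as high-level abstract features. Such models have strong modeling capacity and are widely applied to highly complex problems such as image recognition, speech recognition, and natural language processing. However, because deep networks have many layers and a very large number of parameters, training often suffers from vanishing gradients, convergence to local optima, and overfitting. Drawing on ideas from ensemble learning, this study proposes a novel deep shared ensemble network. The network attaches multiple independent output layers to the hidden layers and trains them jointly, so that gradients are injected at every layer of the network. This supplies additional gradient to the lower hidden layers and thereby reduces gradient vanishing, while ensembling the multiple output layers gives the whole network better generalization performance.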

Keywords:deep learning  ensemble learning  stacked generalization  vanishing gradients  gradient injection
Received:2017-11-07
Revised:2018-03-11

Easy Way for Multilayer Gradient Supplies
DU Fei,YANG Yun,HU Yuan-Yuan,CAO Li-Juan. Easy Way for Multilayer Gradient Supplies[J]. Journal of Software, 2020, 31(7): 2157-2168
Authors:DU Fei  YANG Yun  HU Yuan-Yuan  CAO Li-Juan
Affiliation:National Pilot School of Software, Yunnan University, Kunming 650504, China;National Pilot School of Software, Yunnan University, Kunming 650504, China;Kunming Key Laboratory of Data Science and Intelligent Computing, Kunming 650504, China;Yunnan Provincial University Key Laboratory of Data Science and Intelligent Computing, Kunming 650504, China
Abstract:Deep learning allows computational models composed of multiple processing layers to learn representations of data with multiple levels of abstraction. Such models have dramatically improved the state of the art in speech recognition, visual object recognition, natural language processing, and many other domains. However, because of their many layers and large number of parameters, deep networks often suffer from vanishing gradients, convergence to local optima, and overfitting during training. Drawing on ensemble learning, this study proposes a novel deep shared ensemble network. By jointly training multiple independent output layers attached to the hidden layers, the network injects gradients at every layer, which reduces gradient vanishing in the lower hidden layers; by ensembling the multiple outputs, it achieves better generalization performance.
Keywords:deep learning  ensemble learning  stacked generalization  vanishing gradients  gradient injection
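
The architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give layer sizes, activation functions, loss weights, or the exact ensembling rule, so everything below is an assumption made for illustration. In particular, the sketch attaches one softmax output head to each hidden layer, sums the per-head cross-entropy losses with equal weight, and combines the heads by simple probability averaging (the keywords mention stacked generalization, which the paper may use instead). The names `build`, `forward`, `joint_loss`, and `ensemble` are hypothetical.

```python
import math
import random


def dense(x, w, b):
    """Affine map; w holds one row of weights per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init_layer(n_in, n_out, rng):
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def build(sizes, n_classes, seed=0):
    """sizes = [input_dim, hidden_1, ..., hidden_k].

    Assumption: exactly one output head per hidden layer.
    """
    rng = random.Random(seed)
    hidden = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    heads = [init_layer(h, n_classes, rng) for h in sizes[1:]]
    return hidden, heads


def forward(x, hidden, heads):
    """Run the shared trunk and return the class probabilities of every head."""
    probs = []
    h = x
    for (w, b), (hw, hb) in zip(hidden, heads):
        h = [math.tanh(v) for v in dense(h, w, b)]  # shared hidden layer
        probs.append(softmax(dense(h, hw, hb)))     # independent output head
    return probs


def joint_loss(probs, label):
    """Equal-weight sum of per-head cross-entropies (assumed weighting).

    Because each head reads its own hidden layer directly, the loss term of
    a head on a low layer reaches that layer without passing through the
    layers above it.
    """
    return sum(-math.log(p[label]) for p in probs)


def ensemble(probs):
    """Average the heads' probabilities (assumed ensembling rule)."""
    n = len(probs)
    return [sum(p[c] for p in probs) / n for c in range(len(probs[0]))]


hidden, heads = build([4, 8, 8, 8], n_classes=3)
head_probs = forward([0.2, -0.1, 0.7, 0.05], hidden, heads)
loss = joint_loss(head_probs, label=1)
avg_probs = ensemble(head_probs)
```

Training would minimize `joint_loss` over all trunk and head parameters at once; the sketch covers only the forward pass and the joint objective, and leaves out backpropagation and the optimizer.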