
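Deep Generative Crowdsourcing Learning with Worker Correlation Utilization
Cite this article: LI Shao-Yuan, WEI Meng-Long, HUANG Sheng-Jun. Deep Generative Crowdsourcing Learning with Worker Correlation Utilization[J]. Journal of Software, 2022, 33(4): 1274-1286.
Authors: LI Shao-Yuan  WEI Meng-Long  HUANG Sheng-Jun
Affiliation: College of Computer Science and Technology/College of Artificial Intelligence (Nanjing University of Aeronautics and Astronautics), Nanjing 211106, Jiangsu, China
Funding: National Natural Science Foundation of China (61906089); Jiangsu Province Basic Research Program (BK20190408); China Postdoctoral Science Foundation (2019TQ0152).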
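Abstract (translated from the Chinese): Traditional supervised learning requires ground-truth labels for the training samples, but in many cases such labels are not easy to collect. In contrast, crowdsourcing learning collects annotations from multiple error-prone non-experts and estimates the true labels of the samples through some aggregation scheme. Existing deep crowdsourcing methods model worker correlations insufficiently, whereas work on non-deep crowdsourcing learning shows that modeling and exploiting worker correlations helps improve learning performance. This paper proposes a deep generative crowdsourcing learning method that combines the strengths of deep neural networks with the use of worker correlations. The model consists of a deep neural network classifier prior and an annotation generation process, in which worker correlations are modeled by introducing a mixture model over worker abilities within each class. To adaptively match model complexity to the data, fully Bayesian inference is implemented. Building on natural-gradient stochastic variational inference for structured variational autoencoders, variational message passing for the conjugate parameters and stochastic gradient descent for the neural network parameters are combined in a unified framework, enabling efficient end-to-end optimization. Experimental results on 22 real-world crowdsourcing datasets verify the effectiveness of the method.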

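Keywords (translated from the Chinese): crowdsourcing learning  deep generative model  worker correlation  Bayesian  natural-gradient stochastic variational inference
Received: 2021/5/31
Revised: 2021/7/16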

Deep Generative Crowdsourcing Learning with Worker Correlation Utilization
LI Shao-Yuan,WEI Meng-Long,HUANG Sheng-Jun.Deep Generative Crowdsourcing Learning with Worker Correlation Utilization[J].Journal of Software,2022,33(4):1274-1286.
Authors:LI Shao-Yuan  WEI Meng-Long  HUANG Sheng-Jun
Affiliation:College of Computer Science and Technology/College of Artificial Intelligence(Nanjing University of Aeronautics and Astronautics), Nanjing 211106, China
Abstract: Traditional supervised learning requires ground-truth labels for the training data, which can be difficult to collect in many cases. In contrast, crowdsourcing learning collects noisy annotations from multiple non-expert workers and infers the latent true labels through some aggregation approach. Existing deep crowdsourcing methods do not sufficiently model worker correlations, even though previous non-deep approaches have shown that modeling these correlations helps learning. We propose a deep generative crowdsourcing learning model that combines the strength of deep neural networks (DNNs) with the exploitation of worker correlations. The model comprises a DNN classifier as a prior for the true labels and an annotation generation process in which a mixture model of workers' reliabilities within each class is introduced to capture inter-worker correlation. To automatically trade off model complexity against data fitting, we develop fully Bayesian inference. Based on the natural-gradient stochastic variational inference techniques developed for the structured variational autoencoder (SVAE), we combine variational message passing for the conjugate parameters and stochastic gradient descent for the DNN parameters under a unified framework to conduct efficient end-to-end optimization. Experimental results on 22 real-world crowdsourcing datasets demonstrate the effectiveness of the proposed approach.
Keywords:crowdsourcing  deep generative model  worker correlations  Bayesian  natural-gradient stochastic variational inference
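
Illustrative sketch (not from the paper): the abstract describes an annotation generation process in which workers share reliabilities through a per-class mixture model. The toy Python code below shows one possible reading of that idea. The true label is drawn from class probabilities (standing in for the DNN classifier output), and each worker's annotation depends on the reliability of the mixture component that the worker belongs to for that class. The names `sample_annotations`, `component_of`, `reliability`, and `majority_vote` are hypothetical, the uniform choice among wrong labels is a simplifying assumption, and the paper's fully Bayesian inference is not reproduced here. Majority voting is included only as a simple baseline aggregation, not as the proposed method.

```python
import random


def sample_annotations(class_probs, component_of, reliability, rng):
    """Sample one item's true label and every worker's noisy annotation.

    class_probs:  list of K class probabilities (assumed stand-in for the
                  DNN classifier prior; requires K >= 2).
    component_of: component_of[k][w] is the mixture component that worker w
                  belongs to within class k (hypothetical structure).
    reliability:  reliability[k][c] is the probability that a worker in
                  component c labels a class-k item correctly.
    """
    num_classes = len(class_probs)
    # Draw the latent true label from the classifier prior.
    true_label = rng.choices(range(num_classes), weights=class_probs)[0]
    annotations = []
    for component in component_of[true_label]:
        # Workers in the same component share a reliability, which is what
        # makes their annotations correlated in this toy model.
        if rng.random() < reliability[true_label][component]:
            annotations.append(true_label)
        else:
            # Simplifying assumption: errors are uniform over wrong classes.
            wrong = [k for k in range(num_classes) if k != true_label]
            annotations.append(rng.choice(wrong))
    return true_label, annotations


def majority_vote(annotations, num_classes):
    """Baseline aggregation: return the most frequent label (lowest index on ties)."""
    counts = [annotations.count(k) for k in range(num_classes)]
    return counts.index(max(counts))


if __name__ == "__main__":
    rng = random.Random(0)
    # Five workers, two classes, two reliability components per class.
    component_of = [[0, 0, 1, 1, 1], [0, 1, 1, 1, 1]]
    reliability = [[0.9, 0.6], [0.5, 0.8]]
    label, votes = sample_annotations([0.5, 0.5], component_of, reliability, rng)
    print(label, votes, majority_vote(votes, 2))
```

Because majority voting treats every worker as independent and equally reliable, it ignores exactly the shared-component structure that this toy generator introduces, which is the gap the abstract says the proposed model addresses.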