
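Research on Image Caption Generation Based on a Pivot Language
Cite this article:ZHANG Kai, LI Junhui, ZHOU Guodong. Image Caption via Pivot Language[J]. Journal of Chinese Information Processing (中文信息学报), 2019, 33(3):110-117.
Authors:ZHANG Kai  LI Junhui  ZHOU Guodong
Affiliation:School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006
Funding:National Natural Science Foundation of China (61401295)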
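Abstract:Current research on image caption generation is largely confined to a single language (such as English), owing to the availability of large-scale corpora of images with manually annotated English descriptions. This paper explores Chinese image caption generation with zero annotated resources, using English as the pivot language. Specifically, drawing on neural machine translation, the paper proposes and compares two methods for generating Chinese image captions: (1) the pipeline method, which first generates an English caption for the image and then translates that English caption into Chinese; (2) the pseudo-training-corpus method, which first translates the English captions of the training-set images into Chinese to obtain a pseudo-annotated image-Chinese caption corpus, and then trains a Chinese image caption generation model on it. For the second method in particular, the paper also compares word-based and character-based Chinese caption generation models. Experimental results show that the pseudo-training-corpus method outperforms the pipeline method, and that the character-based Chinese caption generation model also outperforms the word-based model, reaching a BLEU_4 score of 0.341.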
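The two methods compared in the abstract differ mainly in where translation happens: at inference time (pipeline) or once, over the training captions (pseudo corpus). The following Python sketch illustrates only that data flow. Every name in it (`pipeline_caption`, `build_pseudo_corpus`, `train_zh_captioner`, and the toy lookup tables) is a hypothetical stand-in and not taken from the paper; a real system would use a trained English captioning model, a neural machine translation model, and a trained Chinese captioning model in their place.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type aliases: an image is represented here by an ID string only.
Captioner = Callable[[str], str]   # image ID -> caption
Translator = Callable[[str], str]  # English sentence -> Chinese sentence


def pipeline_caption(image: str, en_captioner: Captioner, en2zh: Translator) -> str:
    """Method 1 (pipeline): caption in English, then translate at inference time."""
    return en2zh(en_captioner(image))


def build_pseudo_corpus(
    en_corpus: List[Tuple[str, str]], en2zh: Translator
) -> List[Tuple[str, str]]:
    """Method 2, step 1: translate every English training caption into Chinese,
    yielding pseudo-annotated (image, Chinese caption) pairs."""
    return [(image, en2zh(caption)) for image, caption in en_corpus]


def train_zh_captioner(pseudo_corpus: List[Tuple[str, str]]) -> Captioner:
    """Method 2, step 2 (placeholder): a real implementation would train a neural
    captioning model; this stub only memorises one caption per image."""
    table: Dict[str, str] = dict(pseudo_corpus)
    return lambda image: table[image]


# Toy stand-ins for the trained models (assumptions, for illustration only).
TOY_EN_CAPTIONS = {"img1": "a dog runs", "img2": "a cat sleeps"}
TOY_MT = {"a dog runs": "一只狗在跑", "a cat sleeps": "一只猫在睡觉"}


def toy_en_captioner(image: str) -> str:
    return TOY_EN_CAPTIONS[image]


def toy_en2zh(sentence: str) -> str:
    return TOY_MT[sentence]


if __name__ == "__main__":
    # Method 1: translation runs for every test image.
    print(pipeline_caption("img1", toy_en_captioner, toy_en2zh))

    # Method 2: translation runs once over the training captions.
    corpus = build_pseudo_corpus(list(TOY_EN_CAPTIONS.items()), toy_en2zh)
    zh_captioner = train_zh_captioner(corpus)
    print(zh_captioner("img2"))
```

In the pipeline, errors from the English captioner and from the translator can compound at test time, whereas the pseudo-corpus method needs no translator at inference.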

Keywords:image caption generation  machine translation  neural network  pivot language

Image Caption via Pivot Language
ZHANG Kai,LI Junhui,ZHOU Guodong.Image Caption via Pivot Language[J].Journal of Chinese Information Processing,2019,33(3):110-117.
Authors:ZHANG Kai  LI Junhui  ZHOU Guodong
Affiliation:School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
Abstract:Owing to publicly available large-scale image datasets with manually labeled English captions, most studies on image captioning aim at generating captions in a single language (e.g., English). In this paper, we explore zero-resource image captioning to generate Chinese captions with English as the pivot language. Specifically, we propose and compare two approaches that take advantage of recent advances in neural machine translation. The first, called the pipeline approach, first generates an English caption for a given image and then translates that caption into Chinese. The second, called the pseudo-training-set approach, first translates all English captions in the training and development sets into Chinese to obtain image-Chinese caption datasets, and then directly trains a model to generate a Chinese caption for a given image. Experimental results show that the second approach, i.e., the character-based Chinese caption generation model trained on the pseudo-training set, is superior to the pipeline approach.
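A character-based caption model emits one Chinese character per output step, while a word-based model emits segmented words. The short Python sketch below shows how the two target-side tokenizations of the same caption differ. It assumes the word-level input has already been segmented with spaces by some external word segmenter; the function names and the example sentence are illustrative and do not come from the paper.

```python
from typing import List


def char_tokens(caption: str) -> List[str]:
    """Character-level targets: every non-space character is one token."""
    return [ch for ch in caption if not ch.isspace()]


def word_tokens(segmented_caption: str) -> List[str]:
    """Word-level targets: assumes the caption was pre-segmented with spaces."""
    return segmented_caption.split()


if __name__ == "__main__":
    segmented = "一只 狗 在 草地 上 奔跑"  # hypothetical segmenter output
    print(word_tokens(segmented))  # word-level target sequence
    print(char_tokens(segmented))  # character-level target sequence
```

Character-level targets need no segmenter and give a much smaller output vocabulary, at the cost of longer target sequences.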
Keywords:image caption  machine translation  neural network  pivot language  
This article is indexed in VIP (维普) and other databases.