Entity Image Collection Based on Multi-Modality Pattern Transfer
Cite this article: JIANG Xueyao, LI Weichen, LIU Jingping, LI Zhixu, XIAO Yanghua. Entity Image Collection Based on Multi-Modality Pattern Transfer[J]. Computer Engineering, 2022, 48(8): 70-76.
Authors: JIANG Xueyao  LI Weichen  LIU Jingping  LI Zhixu  XIAO Yanghua
Affiliation: 1. School of Software, Fudan University, Shanghai 200433, China; 2. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Fund project: Shanghai Science and Technology Innovation Action Plan (19511120400).
Abstract: The core of constructing a multi-modality knowledge graph is matching the entities in the graph with correct and appropriate images. Existing entity image collection methods mainly use encyclopedia graphs and image search engines as sources of candidate entity images; however, they use these sources in a simplistic way, fail to exploit the characteristics of each image source, and scale poorly. This paper proposes an entity image collection method based on multi-modality pattern transfer: for each category, a semantic template and a visual pattern are extracted from head entities and transferred to the image collection process of non-head entities of the same category, where the semantic template is used to construct search-engine query keywords and the visual pattern is used to denoise the retrieved results. In total, the method collects 1.8×10⁶ images for 1.278×10⁵ entities across 25 categories in WikiData. Experimental results show that, compared with the four multi-modality knowledge graphs IMGpedia, VisualSem, Richpedia, and MMKG, the entity images in the knowledge graph constructed with this method are more accurate and more diverse. In the downstream task of link prediction, introducing the collected images significantly improves the model's prediction accuracy, achieving 59.74% on the Hits@10 metric, at least 12.7 percentage points higher than the comparison methods.

Keywords: multi-modality knowledge graph; symbol grounding; pattern transfer; link prediction; entity image collection
Received: 2022-02-25
Revised: 2022-04-05
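
To make the pipeline described in the abstract concrete, here is a minimal Python sketch of the two transferred patterns, assuming a hypothetical visual encoder: a semantic template mined from head entities builds search queries, and a visual pattern (here, the centroid of embeddings of verified head-entity images) filters noisy search results. The function names, the toy random embeddings, and the 0.6 threshold are illustrative assumptions, not the authors' implementation.

import numpy as np

def build_query(entity: str, template: str) -> str:
    # Semantic template mined from head entities of the same category,
    # e.g. "{entity} bird photo" for the category "bird".
    return template.format(entity=entity)

def visual_prototype(head_image_vecs):
    # Visual pattern of a category: centroid of the embeddings of images
    # already verified on head (popular) entities.
    return np.mean(head_image_vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_candidates(candidate_vecs, prototype, threshold=0.6):
    # Denoising step: keep only retrieved images whose embedding is close
    # to the category's visual pattern; the threshold is a tunable guess.
    return [i for i, v in enumerate(candidate_vecs)
            if cosine(v, prototype) >= threshold]

# Toy demo with random vectors standing in for image embeddings.
rng = np.random.default_rng(0)
prototype = visual_prototype([rng.normal(size=128) for _ in range(5)])
candidates = [rng.normal(size=128) for _ in range(10)] + [prototype]
print(build_query("Eurasian magpie", "{entity} bird photo"))
print(filter_candidates(candidates, prototype))  # likely keeps only index 10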

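The reported Hits@10 figure follows the standard link prediction convention: the fraction of test triples whose gold entity is ranked among the top ten predicted candidates. A minimal sketch of the metric:

def hits_at_k(ranks, k=10):
    # ranks: 1-based rank of the gold entity for each test triple.
    return sum(r <= k for r in ranks) / len(ranks)

print(hits_at_k([1, 3, 12, 7, 25]))  # 3 of 5 ranks are <= 10 -> 0.6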