Entity Image Collection Based on Multi-Modality Pattern Transfer
Cite this article: JIANG Xueyao, LI Weichen, LIU Jingping, LI Zhixu, XIAO Yanghua. Entity Image Collection Based on Multi-Modality Pattern Transfer[J]. Computer Engineering, 2022, 48(8): 70-76.
Authors: JIANG Xueyao  LI Weichen  LIU Jingping  LI Zhixu  XIAO Yanghua
Affiliation: 1. School of Software, Fudan University, Shanghai 200433, China; 2. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Fund project: Shanghai Science and Technology Innovation Action Plan (19511120400).
Abstract: The core of constructing a multi-modality knowledge graph is matching the entities in the graph with correct and appropriate images. Existing entity image collection methods mainly use encyclopedia graphs and image search engines as sources of candidate entity images; however, they use these sources in a simplistic way, fail to exploit the characteristics of each image source, and scale poorly. This paper proposes an entity image collection method based on multi-modality pattern transfer: for each category, a semantic template and a visual pattern are extracted from head entities and transferred to the image collection process of non-head entities of the same category, where the semantic template is used to construct search-engine query keywords and the visual pattern is used to denoise the retrieved results. In total, the method collects 1.8×10⁶ images for 1.278×10⁵ entities across 25 categories in WikiData. Experimental results show that, compared with the four multi-modality knowledge graphs IMGpedia, VisualSem, Richpedia, and MMKG, the entity images in the knowledge graph constructed with this method are more accurate and more diverse. In the downstream task of link prediction, introducing the collected images significantly improves the model's prediction accuracy, achieving 59.74% on the Hits@10 metric, at least 12.7 percentage points higher than the comparison methods.

Keywords: multi-modality knowledge graph; symbol grounding; pattern transfer; link prediction; entity image collection
Received: 2022-02-25
Revised: 2022-04-05
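
To make the pipeline described in the abstract concrete, here is a minimal Python sketch of the two transferred patterns, assuming a hypothetical visual encoder: a semantic template mined from head entities builds search queries, and a visual pattern (here, the centroid of embeddings of verified head-entity images) filters noisy search results. The function names, the toy random embeddings, and the 0.6 threshold are illustrative assumptions, not the authors' implementation.

import numpy as np

def build_query(entity: str, template: str) -> str:
    # Semantic template mined from head entities of the same category,
    # e.g. "{entity} bird photo" for the category "bird".
    return template.format(entity=entity)

def visual_prototype(head_image_vecs):
    # Visual pattern of a category: centroid of the embeddings of images
    # already verified on head (popular) entities.
    return np.mean(head_image_vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_candidates(candidate_vecs, prototype, threshold=0.6):
    # Denoising step: keep only retrieved images whose embedding is close
    # to the category's visual pattern; the threshold is a tunable guess.
    return [i for i, v in enumerate(candidate_vecs)
            if cosine(v, prototype) >= threshold]

# Toy demo with random vectors standing in for image embeddings.
rng = np.random.default_rng(0)
prototype = visual_prototype([rng.normal(size=128) for _ in range(5)])
candidates = [rng.normal(size=128) for _ in range(10)] + [prototype]
print(build_query("Eurasian magpie", "{entity} bird photo"))
print(filter_candidates(candidates, prototype))  # likely keeps only index 10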

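The reported Hits@10 figure follows the standard link prediction convention: the fraction of test triples whose gold entity is ranked among the top ten predicted candidates. A minimal sketch of the metric:

def hits_at_k(ranks, k=10):
    # ranks: 1-based rank of the gold entity for each test triple.
    return sum(r <= k for r in ranks) / len(ranks)

print(hits_at_k([1, 3, 12, 7, 25]))  # 3 of 5 ranks are <= 10 -> 0.6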