Attribute Alignment Based on Multi-Similarity Measure and Set Encoding

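Cite this article:	WU Jiahao, CHEN Bo, HAN Xianpei, SUN Le. Attribute Alignment Based on Multi-Similarity Measure and Set Encoding[J]. Journal of Chinese Information Processing (中文信息学报), 2021, 35(4): 35-43.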

Authors:	WU Jiahao (伍家豪), CHEN Bo (陈波), HAN Xianpei (韩先培), SUN Le (孙乐)

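Affiliation:	1. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China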

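Abstract:	Attribute alignment aims to discover correspondences between attributes that denote the same concept in heterogeneous knowledge graphs, and is one of the key techniques for cross-graph knowledge fusion. Existing models typically align attributes using rule-based and word-embedding-based methods, but these methods still have two problems: their similarity measures are not comprehensive, and attribute instance information is not fully exploited. To address these problems, this paper proposes an attribute alignment model based on multiple similarity measures, which designs similarity measures from several perspectives to obtain similarity features between attributes and aggregates these features with a machine learning model. In addition, to make full use of attribute instance information, the paper proposes, within the same framework, a representation learning algorithm for attribute instance sets, which encodes each attribute instance set as a vector to extract topical similarity between sets and thereby assist attribute alignment. Experiments on an attribute alignment dataset verify the effectiveness of the model, and further show that the set representation learning algorithm effectively captures the topical features of attribute instances and significantly improves attribute alignment results.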
Keywords:	attribute alignment; representation learning; multi-similarity measures; set encoding
Received:	2019-12-12
Attribute Alignment Based on Multi-Similarity Measure and Set Encoding

WU Jiahao, CHEN Bo, HAN Xianpei, SUN Le. Attribute Alignment Based on Multi-Similarity Measure and Set Encoding[J]. Journal of Chinese Information Processing, 2021, 35(4): 35-43.

Authors:	WU Jiahao, CHEN Bo, HAN Xianpei, SUN Le

Affiliation:	1. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China

Abstract:	The goal of attribute alignment is to find correspondences between attributes that represent the same concept in heterogeneous knowledge graphs. It is one of the key technologies for knowledge fusion. Existing models based on rules and word embeddings suffer from incomplete similarity measurement and insufficient use of attribute instance information. To address these issues, this paper proposes an attribute alignment model based on multiple similarity measures. We design similarity measures from multiple perspectives and use a machine learning model to aggregate the resulting features. In addition, this paper proposes a representation learning algorithm for attribute instance sets. We extract the topic similarity between sets by encoding each attribute instance set as a vector, so as to assist attribute alignment. Experiments demonstrate the effectiveness of the model and show that the set representation learning algorithm can effectively capture the topic features of attribute instances and significantly improve attribute alignment results.

Keywords:	attribute alignment; representation learning; multi-similarity measures; set encoding
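
The abstract describes two ideas: computing several similarity features for a pair of attributes and aggregating them with a machine learning model, and encoding each attribute's instance set as a vector to compare sets. The abstract does not say which measures, encoder, or classifier the authors use, so the following is only a minimal sketch under assumptions. The three features (name string similarity, Jaccard overlap, cosine of set encodings), the hashed character-trigram encoder standing in for a learned set encoder, the logistic aggregator, and all names, weights, and sample values are hypothetical and are not taken from the paper.

```python
import math
import zlib
from difflib import SequenceMatcher

DIM = 64  # size of the toy set-encoding vector (arbitrary, hypothetical)


def name_similarity(a, b):
    """String similarity of two attribute names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(xs, ys):
    """Overlap of two instance-value sets, in [0, 1]."""
    union = xs | ys
    return len(xs & ys) / len(union) if union else 0.0


def encode_set(values):
    """Encode a set of instance values as one fixed-size vector.

    Stand-in for a learned set encoder (an assumption): character trigrams
    of every value are hashed into DIM buckets and the counts are averaged
    over the set, so element order does not affect the result.
    """
    counts = [0] * DIM
    for value in values:
        text = f"  {value.lower()} "
        for i in range(len(text) - 2):
            counts[zlib.crc32(text[i:i + 3].encode("utf-8")) % DIM] += 1
    size = max(len(values), 1)
    return [c / size for c in counts]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_features(name_a, values_a, name_b, values_b):
    """Feature vector for one attribute pair, one entry per measure."""
    return [
        name_similarity(name_a, name_b),
        jaccard(values_a, values_b),
        cosine(encode_set(values_a), encode_set(values_b)),
    ]


def alignment_score(features, weights, bias):
    """Logistic aggregation of the features into a score in (0, 1).

    In practice the weights would be learned from labeled attribute pairs;
    here they are supplied by hand purely for illustration.
    """
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical attributes from two knowledge graphs.
    feats = similarity_features(
        "birthPlace", {"Beijing", "Shanghai", "Paris"},
        "place_of_birth", {"Beijing", "Paris", "London"},
    )
    print(feats, alignment_score(feats, weights=[1.0, 1.0, 1.0], bias=-1.5))
```

Keeping each measure as a separate feature lets the aggregator decide how much to trust name evidence versus instance evidence, which is the motivation the abstract gives for combining multiple similarity measures.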
