Entity alignment method based on adaptive attribute selection

Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue (1): 14-20. doi: 10.6040/j.issn.1672-3961.0.2019.415

Jialin SU¹,², Yuanzhuo WANG¹, Xiaolong JIN¹, Xueqi CHENG¹

1. CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
2. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 101408, China

Received: 2019-07-22 Online: 2020-02-20 Published: 2020-02-14
About the author: Jialin SU (1996-), female, born in Jinzhou, Liaoning; master's student; main research interest: knowledge graphs. E-mail: sujialin17g@ict.ac.cn
Supported by:
National Key Research and Development Program of China (2016YFB1000902); National Natural Science Foundation of China (61572469); National Natural Science Foundation of China (61772501); National Natural Science Foundation of China (61572473); National Natural Science Foundation of China (91646120)

Abstract:

Existing entity alignment methods share two weaknesses: traditional methods rely on external information and manually constructed features, while representation learning-based methods ignore the structural information in knowledge graphs. To address these problems, this paper proposes an entity alignment method based on adaptive attribute selection, which combines the semantic and structural information of entities to train an alignment model based on the joint embedding of the two knowledge graphs. It further proposes a strong attribute constraint model based on adaptive attribute selection, which automatically generates the optimal attribute types and weight constraints from the characteristics of the dataset to improve alignment performance. Experiments on two real-world datasets show that, compared with traditional representation learning methods, the proposed method improves precision by up to about 11%.

Key words: knowledge graph, entity alignment, adaptive attribute selection, joint embedding, strong attribute constraint

CLC number: TP391

Jialin SU, Yuanzhuo WANG, Xiaolong JIN, Xueqi CHENG. Entity alignment method based on adaptive attribute selection[J]. Journal of Shandong University (Engineering Science), 2020, 50(1): 14-20.


References

1	MENG Rui, CHEN Lei, TONG Yongxin, et al. Knowledge base semantic integration using crowdsourcing[J]. TKDE, 2017, 27 (5): 1087-1100.
2	CAI Pengshan, LI Wei, FENG Yansong, et al. Learning knowledge representation across knowledge graphs[C]// AAAI 2017 Workshop on Knowledge-Based Techniques for Problem Solving and Reasoning (KnowProS'17). Hawaii, USA: IEEE, 2017.
3	GUAN Saiping, JIN Xiaolong, WANG Yuanzhuo, et al. Self-learning and embedding based entity alignment[J]. Knowledge and Information Systems, 2018, (24): 1-26.
4	SU Jialin, WANG Yuanzhuo, JIN Xiaolong, et al. Knowledge graph entity alignment with semantic and structural information[J]. Journal of Shanxi University (Natural Science Edition), 2019, 42 (1): 23-30.
5	TRISEDYA B D, QI J, ZHANG R. Entity alignment between knowledge graphs using attribute embeddings[C]// AAAI Thirty-Third Conference on Artificial Intelligence. Hawaii, USA: IEEE, 2019.
6	NGOMO A, AUER S. Limes: a time-efficient approach for large-scale link discovery on the web of data[J]. International Joint Conference on Artificial Intelligence, 2011, 1 (15): 2312-2317.
7	SCHARFFE F, YANBIN F L, ZHOU C. Rdf-ai: an architecture for rdf datasets matching, fusion and interlink[C]// Proceedings of IJCAI 2009 Workshop on Identity and Knowledge Representation (IR-KR). Pasadena, CA, USA: ACM, 2009.
8	VOLZ J, BIZER C, GAEDKE M, et al. Discovering and maintaining links on the web of data[C]// Proceedings of the 8th International Semantic Web Conference. Washington, USA: IEEE, 2009.
9	RAIMOND Y, SUTTON C, SANDLER M. Automatic interlinking of music datasets on the semantic web[C]// Proceedings of the 1st Workshop about Linked Data on the Web. Beijing, China: IEEE, 2008.
10	NIU Xing, RONG Shu, WANG Haofen, et al. An effective rule miner for instance matching in a web of data[C]// Proceedings of the 21st ACM International Conference on Information and Knowledge Management. Maui, USA: ACM, 2012.
11	BORDES A, USUNIER N, GARCIA-DURAN A, et al. Translating embeddings for modeling multi-relational data[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS, 2013.
12	LIN Yankai, LIU Zhiyuan, SUN Maosong, et al. Learning entity and relation embeddings for knowledge graph completion[C]// Proceedings of AAAI Conference on Artificial Intelligence, 2015. Texas, USA: IEEE, 2015.
13	WANG Zhen, ZHANG Jianwen, FENG Jianlin, et al. Knowledge graph embedding by translating on hyperplanes[C]// Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014. Québec City, Canada: IEEE, 2014.
14	JI Guoliang, HE Shizhu, XU Liheng, et al. Knowledge graph embedding via dynamic mapping matrix[C]// Meeting of the Association for Computational Linguistics & the International Joint Conference on Natural Language Processing. Beijing, China: ACL, 2015.
15	CHEN Muhao, TIAN Yingtao, YANG Mohan, et al. Multilingual knowledge graph embeddings for cross-lingual knowledge alignment[C]// International Joint Conference on Artificial Intelligence. Vancouver, Canada: ACL, 2017.

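Table 1
Dataset	Source1 entities	Source2 entities	Training-set triples	Validation-set triples	Test-set triples	Total entities	Total relation types	Total attribute types	Total triples per single network
Cora	145	143	76	20	20	1 441	7	10	2 180
Baidu Douban M/TV	762	762	462	150	150	8 143	5	6	27 960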

Table 2
Method	Cora			Baidu Douban M/TV
	Precision/%	Recall/%	F₁/%	Precision/%	Recall/%	F₁/%
TransE	87.06	63.79	73.63	98.97	88.58	93.49
cross-KG	92.04	65.69	76.64	99.31	82.76	90.28
SEEA	90.72	75.86	82.64	99.61	90.23	94.69
Ref. [4]	96.97	72.41	82.91	99.80	94.43	97.04
Ref. [5]	85.03	85.03	85.03	88.21	88.21	88.21
This work	98.51	84.62	91.04	98.00	96.00	96.99
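
The F₁ column in the table above is the harmonic mean of the first two columns. As a minimal sketch (the function name `f1_score` and the dictionary `cora` are illustrative and not from the paper), the following recomputes F₁ for three of the Cora rows and the precision gap to TransE, which appears to be the source of the roughly 11% figure quoted in the abstract:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) on Cora, copied from the results table.
cora = {
    "TransE": (87.06, 63.79),
    "Ref. [4]": (96.97, 72.41),
    "This work": (98.51, 84.62),
}

for method, (p, r) in cora.items():
    print(f"{method}: F1 = {f1_score(p, r):.2f}%")

# Absolute precision gain over the TransE baseline, in percentage points.
gain = cora["This work"][0] - cora["TransE"][0]
print(f"Precision gain over TransE on Cora: {gain:.2f} points")
```

For these three rows the recomputed values agree with the published F₁ column to two decimals. Since the published precision and recall are themselves rounded, other rows may differ in the last digit.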
