
Journal of Shandong University (Engineering Science), 2020, Vol. 50, Issue 1: 14-20. doi: 10.6040/j.issn.1672-3961.0.2019.415

• Machine Learning and Data Mining •

  • About the author: SU Jialin (1996-), female, born in Jinzhou, Liaoning; master's student; main research interest: knowledge graphs. E-mail: sujialin17g@ict.ac.cn

Entity alignment method based on adaptive attribute selection

Jialin SU1,2, Yuanzhuo WANG1, Xiaolong JIN1, Xueqi CHENG1

  1. CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
  2. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 101408, China
  • Received: 2019-07-22 Online: 2020-02-20 Published: 2020-02-14
  • Supported by:
    National Key Research and Development Program of China (2016YFB1000902); National Natural Science Foundation of China (61572469, 61772501, 61572473, 91646120)

Abstract:

Existing entity alignment methods have two main weaknesses: traditional methods rely on external information and manually constructed features, while representation learning-based methods ignore the structural information in knowledge graphs. To address these problems, this paper proposes an entity alignment method based on adaptive attribute selection, which trains an alignment model on the joint embedding of the two knowledge graphs and combines the semantic and structural information of entities. It also proposes a strong attribute constraint model based on adaptive attribute selection, which automatically generates the most effective attribute types and weight constraints from the characteristics of the dataset to improve alignment performance. Experiments on two real-world datasets show that, compared with traditional representation learning methods, the proposed method improves precision by up to about 11%.

Key words: knowledge graph, entity alignment, adaptive attribute selection, joint embedding, strong attribute constraint

CLC number:

  • TP391

Fig. 1

Entity alignment method based on adaptive attribute selection

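Table 1

Entity alignment datasets

| Dataset | Source1 entities | Source2 entities | Training triples | Validation triples | Test triples | Total entities | Relation types | Attribute types | Total triples per single network |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Cora | 145 | 143 | 76 | 20 | 20 | 1 441 | 7 | 10 | 2 180 |
| Baidu Douban M/TV | 762 | 762 | 462 | 150 | 150 | 8 143 | 5 | 6 | 27 960 |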

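Table 2

Experimental results of entity alignment with adaptive attribute selection

| Method | Cora precision/% | Cora recall/% | Cora F1/% | Baidu Douban M/TV precision/% | Baidu Douban M/TV recall/% | Baidu Douban M/TV F1/% |
| --- | --- | --- | --- | --- | --- | --- |
| TransE | 87.06 | 63.79 | 73.63 | 98.97 | 88.58 | 93.49 |
| cross-KG | 92.04 | 65.69 | 76.64 | 99.31 | 82.76 | 90.28 |
| SEEA | 90.72 | 75.86 | 82.64 | 99.61 | 90.23 | 94.69 |
| Ref. [4] | 96.97 | 72.41 | 82.91 | 99.80 | 94.43 | 97.04 |
| Ref. [5] | 85.03 | 85.03 | 85.03 | 88.21 | 88.21 | 88.21 |
| This study | 98.51 | 84.62 | 91.04 | 98.00 | 96.00 | 96.99 |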
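
The F1 values in the results table are the harmonic mean of precision and recall. As a quick consistency check, the following minimal sketch recomputes F1 for three rows of the table. The function name `f1_score` and the selection of rows are illustrative assumptions, not part of the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) pairs copied from the results table.
rows = {
    "TransE / Cora": (87.06, 63.79),
    "This study / Cora": (98.51, 84.62),
    "This study / Baidu Douban M/TV": (98.00, 96.00),
}

for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1_score(p, r):.2f}%")

# Precision gain of this study over TransE on Cora, in percentage points.
gain = rows["This study / Cora"][0] - rows["TransE / Cora"][0]
print(f"Precision gain over TransE on Cora: {gain:.2f} points")
```

The recomputed values agree with the reported 73.63, 91.04 and 96.99. The 11.45-point precision gain over TransE on Cora is in line with the roughly 11% improvement stated in the abstract if TransE is taken as the baseline; the page itself does not say which comparison that figure refers to.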
References:

[1] MENG Rui, CHEN Lei, TONG Yongxin, et al. Knowledge base semantic integration using crowdsourcing[J]. TKDE, 2017, 27(5): 1087-1100.
[2] CAI Pengshan, LI Wei, FENG Yansong, et al. Learning knowledge representation across knowledge graphs[C]// AAAI 2017 Workshop on Knowledge-Based Techniques for Problem Solving and Reasoning (KnowProS'17). Hawaii, USA: IEEE, 2017.
[3] GUAN Saiping, JIN Xiaolong, WANG Yuanzhuo, et al. Self-learning and embedding based entity alignment[J]. Knowledge and Information Systems, 2018, (24): 1-26.
[4] SU Jialin, WANG Yuanzhuo, JIN Xiaolong, et al. Knowledge graph entity alignment with semantic and structural information[J]. Journal of Shanxi University (Natural Science Edition), 2019, 42(1): 23-30. (in Chinese)
[5] TRISEDYA B D, QI J, ZHANG R. Entity alignment between knowledge graphs using attribute embeddings[C]// AAAI Thirty-Third Conference on Artificial Intelligence. Hawaii, USA: IEEE, 2019.
[6] NGOMO A, AUER S. Limes: a time-efficient approach for large-scale link discovery on the web of data[J]. International Joint Conference on Artificial Intelligence, 2011, 1(15): 2312-2317.
[7] SCHARFFE F, YANBIN F L, ZHOU C. Rdf-ai: an architecture for rdf datasets matching, fusion and interlink[C]// Proceedings of IJCAI 2009 Workshop on Identity and Knowledge Representation (IR-KR). Pasadena, CA, USA: ACM, 2009.
[8] VOLZ J, BIZER C, GAEDKE M, et al. Discovering and maintaining links on the web of data[C]// Proceedings of the 8th International Semantic Web Conference. Washington, USA: IEEE, 2009.
[9] RAIMOND Y, SUTTON C, SANDLER M. Automatic interlinking of music datasets on the semantic web[C]// Proceedings of the 1st Workshop about Linked Data on the Web. Beijing, China: IEEE, 2008.
[10] NIU Xing, RONG Shu, WANG Haofen, et al. An effective rule miner for instance matching in a web of data[C]// Proceedings of the 21st ACM International Conference on Information and Knowledge Management. Maui, USA: ACM, 2012.
[11] BORDES A, USUNIER N, GARCIA-DURAN A, et al. Translating embeddings for modeling multi-relational data[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS, 2013.
[12] LIN Yankai, LIU Zhiyuan, SUN Maosong, et al. Learning entity and relation embeddings for knowledge graph completion[C]// Proceedings of AAAI Conference on Artificial Intelligence, 2015. Texas, USA: IEEE, 2015.
[13] WANG Zhen, ZHANG Jianwen, FENG Jianlin, et al. Knowledge graph embedding by translating on hyperplanes[C]// Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014. Québec City, Canada: IEEE, 2014.
[14] JI Guoliang, HE Shizhu, XU Liheng, et al. Knowledge graph embedding via dynamic mapping matrix[C]// Meeting of the Association for Computational Linguistics & the International Joint Conference on Natural Language Processing. Beijing, China: ACL, 2015.
[15] CHEN Muhao, TIAN Yingtao, YANG Mohan, et al. Multilingual knowledge graph embeddings for cross-lingual knowledge alignment[C]// International Joint Conference on Artificial Intelligence. Vancouver, Canada: ACL, 2017.