JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2015, Vol. 45 ›› Issue (6): 7-15. doi: 10.6040/j.issn.1672-3961.2.2015.085

Chinese entity relation extraction based on entity semantic similarity

XU Qing1, DUAN Liguo1, LI Aiping1,2, YIN Guimei3   

  1. College of Computer Science and Technology, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China;
    2. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, Hubei, China;
    3. Department of Computer Science and Technology, Taiyuan Normal University, Taiyuan 030600, Shanxi, China
  • Received:2015-05-18 Revised:2015-10-26 Online:2015-12-20 Published:2015-05-18

Abstract: To explore the impact of semantic similarity on Chinese entity relation extraction, two new features were proposed: the "TongYiCi Cilin" code tree, constructed from the entities' five-layer codes in "TongYiCi Cilin", and the entity semantic similarity tree, constructed from the average semantic similarity between the entity word in a relation instance and all entity words in each relation category. The effect of these two new features on relation extraction performance was examined together with that of the existing "TongYiCi Cilin" code feature and the entity type feature. Among single features, the entity type feature performed best, with F values of 84.9 and 83.2 for subtype and type, respectively. Among feature combinations, the entity type feature combined with the "TongYiCi Cilin" code tree feature performed best, with F values for both subtype and type 2.5 higher than those of the entity type feature alone. Combinations of three features, however, performed worse rather than better. The results showed that the "TongYiCi Cilin" code tree is an effective supplement to entity type information, but that excessive features may introduce information redundancy and degrade performance.
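The averaging step behind the entity semantic similarity feature can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the extended "TongYiCi Cilin" code layout of eight characters (for example `Aa01A01=`), where the first seven characters encode five layers and the last is a flag. The function names, the sample codes, the category names, and the similarity measure (the fraction of leading layers two codes share) are all hypothetical stand-ins chosen for illustration; the paper's actual similarity formula is not given in the abstract.

```python
from statistics import mean


def cilin_layers(code: str) -> list[str]:
    """Split a Cilin code such as 'Aa01A01=' into its five layer segments.

    Assumed layout: 1 letter, 1 letter, 2 digits, 1 letter, 2 digits, 1 flag.
    """
    return [code[0], code[1], code[2:4], code[4], code[5:7]]


def layer_similarity(code_a: str, code_b: str) -> float:
    """Toy similarity: fraction of leading layers shared by two codes.

    This is an illustrative stand-in, not the measure used in the paper.
    """
    shared = 0
    for a, b in zip(cilin_layers(code_a), cilin_layers(code_b)):
        if a != b:
            break
        shared += 1
    return shared / 5


def category_similarities(entity_code: str,
                          categories: dict[str, list[str]]) -> dict[str, float]:
    """Average the similarity between one entity code and every entity code
    collected for each relation category."""
    return {
        name: mean([layer_similarity(entity_code, c) for c in codes])
        for name, codes in categories.items()
    }


if __name__ == "__main__":
    # Hypothetical codes and category names, for illustration only.
    categories = {
        "CATEGORY_X": ["Aa01A01=", "Aa01B02="],
        "CATEGORY_Y": ["Bb02C03="],
    }
    print(cilin_layers("Aa01A01="))
    print(category_similarities("Aa01A01=", categories))
```

The resulting per-category averages are the kind of values that could label the nodes of a similarity tree; how the paper arranges them into a tree structure for the kernel is not specified in the abstract.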

Key words: syntax tree, semantic similarity, tree kernel, TongYiCi CiLin, Chinese entity relation extraction

CLC Number: TP391.1

[1] LIN Jianghao, ZHOU Yongmei, YANG Aimin, CHEN Jin. Building of domain sentiment lexicon based on word2vec [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, 48(3): 40-47.
[2] QIAN Suchi, PENG Furong, LU Jianfeng. Tag optimization based on semantic similarity [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2015, 45(2): 37-42.
[3] LIU Xiaoyong. A semi-supervised method based on tree kernel for relationship extraction [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2015, 45(2): 22-26.
[4] YIN Kun, YIN Hongfeng*, YANG Yan, JIA Zhen. Semantic similarity computation of Baidu encyclopedia entries based on SimRank [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2014, 44(3): 29-35.