JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2014, Vol. 44 ›› Issue (6): 32-37. doi: 10.6040/j.issn.1672-3961.1.2014.163

Chinese entity relation extraction based on entity disambiguation

SHAO Fa1, HUANG Yinge1, ZHOU Lanjiang1,2, GUO Jianyi1,2, YU Zhengtao1,2, ZHANG Jinpeng1   

  1. School of Information Engineering and Automation, Kunming 650500, Yunnan, China;
    2. Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
  • Received: 2013-12-29 Revised: 2014-11-19 Online: 2014-12-20 Published: 2013-12-29

Abstract: To address polysemy in Chinese entity relation extraction from open text, a Chinese entity relation extraction method based on entity disambiguation was proposed. First, entity relation pairs were mined from HowNet, and the entities were mapped from HowNet to Wikipedia using a disambiguation method based on Bayesian classification, so as to obtain high-quality relation instances. Then, the sentence instances containing these relation instances were extracted from the corresponding context to construct basic extraction patterns. Finally, new instances were extracted using the new patterns. The experimental results showed that the accuracy of the proposed method was higher than that of methods without semantic disambiguation and pattern merging.
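
The disambiguation step described in the abstract can be illustrated with a minimal sketch of naive Bayes classification with add-one smoothing. This is not the authors' implementation: the function names `train` and `disambiguate`, the candidate senses, and the toy context words are all hypothetical, and real HowNet or Wikipedia data would have to be supplied in their place.

```python
import math
from collections import Counter


def train(sense_contexts):
    """Build per-sense word counts and log priors.

    sense_contexts maps each candidate sense (here, a hypothetical
    Wikipedia article title) to a list of context-word lists.
    """
    vocab = {w for docs in sense_contexts.values() for doc in docs for w in doc}
    total_docs = sum(len(docs) for docs in sense_contexts.values())
    model = {}
    for sense, docs in sense_contexts.items():
        counts = Counter(w for doc in docs for w in doc)
        model[sense] = {
            "log_prior": math.log(len(docs) / total_docs),
            "counts": counts,
            "total": sum(counts.values()),
        }
    return model, vocab


def disambiguate(model, vocab, context):
    """Return the sense with the highest naive Bayes log score."""
    best_sense, best_score = None, float("-inf")
    for sense, m in model.items():
        score = m["log_prior"]
        for w in context:
            if w not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing over the vocabulary.
            score += math.log((m["counts"][w] + 1) / (m["total"] + len(vocab)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Toy, invented training contexts for an ambiguous entity name.
training = {
    "Apple (fruit)": [["fruit", "tree", "eat"], ["fruit", "juice", "sweet"]],
    "Apple Inc.": [["company", "iphone", "computer"], ["company", "software", "phone"]],
}
model, vocab = train(training)
chosen = disambiguate(model, vocab, ["company", "computer", "product"])
print(chosen)
```

In this sketch the context of an entity taken from the knowledge base is scored against each candidate article, and the best-scoring candidate is taken as the mapping target. How the paper builds its contexts and features is not stated in the abstract.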

Key words: Bayesian classification, entity disambiguation, pattern merging, entity relation extraction, Wikipedia

CLC Number: 

  • TP391