Journal of Shandong University (Engineering Science), 2015, Vol. 45, Issue (2): 22-26. doi: 10.6040/j.issn.1672-3961.1.2014.259
LIU Xiaoyong (刘晓勇)
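Abstract: Traditional semi-supervised relation extraction algorithms are prone to "semantic drift". To address this problem, a new semi-supervised relation extraction algorithm based on tree kernel functions is proposed. The algorithm relies on two strategies, a tree kernel function and constrained expansion of the seed set, to reduce the loss of extraction accuracy caused by semantic drift and to raise the rate of correctly recognized relations. Experiments on the benchmark dataset PopBank show that the proposed semi-supervised learning method, which expands the seed set under a constraint mechanism, outperforms two commonly used relation extraction methods on all four evaluation metrics (Precision, Recall, F-measure, Accuracy). This indicates that the algorithm extracts relations better than the methods it was compared against.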
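The abstract names the two strategies but does not spell out their details, so the following is only an illustrative sketch, not the paper's implementation. It assumes a Collins-Duffy style subset tree kernel over parse trees written as nested tuples, and a hypothetical constraint under which an unlabeled candidate joins the seed set only if its best normalized similarity passes `threshold` and beats the runner-up relation type by `margin`. The function names, the tuple encoding, the decay factor `lam`, and both cut-off values are assumptions made for illustration.

```python
import math


def nodes(tree):
    """Yield every internal node of a tree written as nested tuples."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)


def production(node):
    """Grammar production at a node. Words are tagged so they never
    collide with a non-terminal that happens to share their spelling."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("word", c) for c in node[1:]
    )


def common(n1, n2, lam):
    """Decayed count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # equal productions imply c2 is a tuple too
            score *= 1.0 + common(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=0.5):
    """Subset tree kernel: sum of common fragments over all node pairs."""
    return sum(common(a, b, lam) for a in nodes(t1) for b in nodes(t2))


def normalized_kernel(t1, t2, lam=0.5):
    """Kernel rescaled to [0, 1] so tree size does not dominate."""
    denom = math.sqrt(tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam))
    return tree_kernel(t1, t2, lam) / denom if denom else 0.0


def expand_seeds(seeds, candidates, threshold=0.45, margin=0.1, lam=0.5):
    """One round of constrained seed expansion (assumed constraint).

    seeds maps a relation label to a list of seed trees. A candidate is
    accepted only when it is similar enough to one relation type and
    clearly less similar to every other type.
    """
    accepted, rejected = [], []
    for tree in candidates:
        scores = sorted(
            (
                (max(normalized_kernel(tree, s, lam) for s in trees), label)
                for label, trees in seeds.items()
                if trees
            ),
            reverse=True,
        )
        if not scores:
            rejected.append(tree)
            continue
        best, label = scores[0]
        runner_up = scores[1][0] if len(scores) > 1 else 0.0
        if best >= threshold and best - runner_up >= margin:
            accepted.append((tree, label))
        else:
            rejected.append(tree)
    return accepted, rejected
```

In a bootstrapping loop, the accepted pairs would be added to `seeds` before the next round. Rejecting ambiguous candidates, rather than assigning every candidate to its nearest relation type, is one way to limit the accumulation of mislabeled seeds that drives semantic drift.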