Journal of Shandong University (Engineering Science) ›› 2015, Vol. 45 ›› Issue (1): 19-23. doi: 10.6040/j.issn.1672-3961.1.2014.250

• Machine Learning and Data Mining •

A sentiment analysis method based on dynamic lexicon and three-way decision

ZHOU Zhe1,2, SHANG Lin1,2

  1. Department of Computer Science and Technology, Nanjing University, Nanjing 210046, Jiangsu, China;
  2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210046, Jiangsu, China
  • Received: 2014-05-23 Revised: 2014-10-15 Published: 2014-05-23
  • Corresponding author: SHANG Lin (1973-), female, from Nanjing, Jiangsu, Ph.D., associate professor; main research interests: data mining, artificial intelligence, and rough sets. E-mail: shanglin@nju.edu.cn
  • About the author: ZHOU Zhe (1991-), male, from Nanjing, Jiangsu, master's student; main research interest: text sentiment analysis. E-mail: mahirunozuki1@gmail.com
  • Supported by:
    General Program of the National Natural Science Foundation of China (61170180)

Abstract: A new feature extraction method was combined with the concept of three-way decision and applied to text sentiment analysis in order to improve classification accuracy. A dynamic sentiment lexicon was built from the training set and then used to extract abstract features from each text, which together formed the feature matrix. During classification, if the classifier's confidence in the sentiment label of a target text was not high enough, the three-way decision rule placed the text in the boundary region to await further processing. Experimental results on an English movie-review data set showed that feature extraction based on the dynamic lexicon achieved better classification accuracy, and that deferring some samples to the boundary region with the three-way decision rule raised the accuracy further.

Key words: text data mining, three-way decision, sentiment analysis, opinion mining, feature extraction

CLC number:

  • TP391
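
The pipeline summarized in the abstract (build a lexicon from the training set, score a text, then accept, reject, or defer) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the log-ratio word scoring, the logistic mapping, the function names, and the thresholds `alpha = 0.7` and `beta = 0.3` are all assumptions chosen for illustration, since this page does not state the classifier or the threshold values used in the paper.

```python
import math
from collections import Counter


def build_lexicon(texts, labels, min_count=2):
    """Score each training-set word by a smoothed log-ratio of the number of
    positive vs. negative documents containing it (an assumed scoring rule)."""
    pos, neg = Counter(), Counter()
    for text, label in zip(texts, labels):
        words = set(text.lower().split())  # count each word once per document
        (pos if label == 1 else neg).update(words)
    lexicon = {}
    for word in set(pos) | set(neg):
        if pos[word] + neg[word] >= min_count:  # drop very rare words
            lexicon[word] = math.log((pos[word] + 1) / (neg[word] + 1))
    return lexicon


def positive_probability(text, lexicon):
    """Map the summed lexicon score of a text into (0, 1) with a logistic
    function; a stand-in for a trained classifier's confidence."""
    score = sum(lexicon.get(word, 0.0) for word in text.lower().split())
    return 1.0 / (1.0 + math.exp(-score))


def three_way_decide(p_positive, alpha=0.7, beta=0.3):
    """Three-way decision with hypothetical thresholds beta < alpha:
    accept as positive, reject as negative, or defer to the boundary region."""
    if p_positive >= alpha:
        return "positive"
    if p_positive <= beta:
        return "negative"
    return "boundary"


# Toy training set (1 = positive, 0 = negative), invented for this example.
train_texts = ["great great film", "great acting", "awful plot", "awful awful film"]
train_labels = [1, 1, 0, 0]
lexicon = build_lexicon(train_texts, train_labels)

for review in ["great film", "awful film", "great but awful"]:
    print(review, "->", three_way_decide(positive_probability(review, lexicon)))
```

Texts that land in the boundary region are excluded from the immediate two-way decision, which is why accuracy on the remaining texts can rise; how wide to make the gap between the two thresholds is a trade-off between accuracy and the number of deferred texts.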