Journal of Shandong University (Engineering Science)

• Machine Learning and Data Mining •

A method of building Chinese microblog sentiment lexicon

ZHOU Yongmei1, YANG Aimin1, LIN Jianghao2

  1. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China;
  2. School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China
  • Received: 2013-04-30 Online: 2014-06-20 Published: 2013-04-30
  • About the author: ZHOU Yongmei (1971- ), female, born in Yongzhou, Hunan, professor; her main research interest is text sentiment analysis. E-mail: yongmeizhou@163.com
  • Funding:
    National Social Science Foundation of China (12BYY045); Ministry of Education Humanities and Social Sciences Research Youth Project (10YJCZH247); Guangdong Province Science and Technology Plan Project (2010B031000014)

Abstract:

A method of building a Chinese microblog sentiment lexicon is proposed. Candidate network terms (Internet slang) are discovered with a context-entropy strategy and then filtered a second time by TF-IDF (term frequency-inverse document frequency). The SO-PMI (semantic orientation pointwise mutual information) algorithm is then used to compute the sentiment orientation value of each network term on a labeled microblog corpus, which yields a sentiment lexicon of network terms. The lexicon is applied to microblog sentiment classification experiments, and its classification performance is compared with that of a naive Bayes classifier. The experimental results show that classifying directly with the microblog sentiment lexicon performs better than the naive Bayes classifier, and that the classification process is simpler and faster.

Key words: sentiment analysis, context entropy, microblog sentiment lexicon, network language, naive Bayes
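
The steps summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption rather than the authors' implementation: context entropy is taken as the Shannon entropy of the tokens adjacent to a candidate term, SO-PMI is taken as PMI(term, positive label) minus PMI(term, negative label) with add-one smoothing on document frequencies, and a post is classified by the sign of its summed lexicon scores. The function names and the toy corpus are hypothetical, and the TF-IDF filtering step is omitted.

```python
import math
from collections import Counter


def context_entropy(neighbors):
    """Shannon entropy (in bits) of the tokens seen next to a candidate term.

    A candidate with high entropy on both its left and right side appears
    in varied contexts, which suggests it is a standalone term.
    """
    counts = Counter(neighbors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def so_pmi(term, pos_docs, neg_docs, smoothing=1.0):
    """Assumed label-based SO-PMI: PMI(term, pos) - PMI(term, neg).

    The difference reduces to log2(P(term | pos) / P(term | neg)).
    Document frequencies are smoothed so unseen terms do not divide by zero.
    """
    df_pos = sum(term in doc for doc in pos_docs)
    df_neg = sum(term in doc for doc in neg_docs)
    p_pos = (df_pos + smoothing) / (len(pos_docs) + 2 * smoothing)
    p_neg = (df_neg + smoothing) / (len(neg_docs) + 2 * smoothing)
    return math.log2(p_pos / p_neg)


def classify(tokens, lexicon):
    """Label a segmented post by the sign of its summed lexicon scores."""
    score = sum(lexicon.get(tok, 0.0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy labeled corpus (hypothetical): each post is a list of segmented tokens.
pos_docs = [["给力", "开心"], ["给力"], ["不错"]]
neg_docs = [["坑爹", "无语"], ["坑爹"], ["给力", "坑爹"]]

# Build a tiny lexicon for two slang terms and classify new posts.
lexicon = {t: so_pmi(t, pos_docs, neg_docs) for t in ["给力", "坑爹"]}
label_a = classify(["今天", "给力"], lexicon)
label_b = classify(["真", "坑爹"], lexicon)
```

In this sketch a term that occurs mostly in positive posts receives a positive score and one that occurs mostly in negative posts receives a negative score, so classification needs only a dictionary lookup and a sum. That is consistent with the abstract's remark that the lexicon-based approach is simple and fast, although the real system's thresholds and scoring details are not stated in the source.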
