Journal of Shandong University (Engineering Science) ›› 2014, Vol. 44 ›› Issue (6): 15-18. doi: 10.6040/j.issn.1672-3961.1.2014.108
XU Xiaodan (徐晓丹), DUAN Zhengjie (段正杰), CHEN Zhongyu (陈中育)
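Abstract: To address the low accuracy that results from relying on a single feature in sentiment classification, a multi-feature weighted classification algorithm is proposed. The sentiment orientation of each word is computed from an extended sentiment lexicon. After CHI feature selection, the positive and negative posterior probabilities of each sentiment word in the Bayesian classification model are adjusted according to that word's polarity strength, by adding a polarity-strength influence value to the original value. Experiments compared the classification accuracy of this method with that of three other single-feature selection methods on corpora including hotel and film reviews, and the proposed method achieved higher accuracy. The results indicate that incorporating word-level sentiment orientation into the classifier improves the accuracy of sentiment polarity classification while also reducing the feature dimensionality.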
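The pipeline described in the abstract (CHI feature selection followed by a polarity-adjusted Bayesian classifier) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption: the function names, the toy corpus, the lexicon scores in [-1, 1], the weight `alpha`, the class labels "pos" and "neg", and the additive rule that adds `alpha * |score|` to the word's conditional probability for the matching class. The CHI statistic uses the standard 2x2 contingency form.

```python
import math
from collections import Counter

def chi_square(docs, labels, word, cls):
    """Standard CHI statistic for (word, cls) from a 2x2 contingency table."""
    n = len(docs)
    n11 = sum(1 for d, y in zip(docs, labels) if word in d and y == cls)
    n10 = sum(1 for d, y in zip(docs, labels) if word in d and y != cls)
    n01 = sum(1 for d, y in zip(docs, labels) if word not in d and y == cls)
    n00 = n - n11 - n10 - n01
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / denom if denom else 0.0

def select_features(docs, labels, k):
    """Keep the k words with the highest CHI score over all classes."""
    classes = sorted(set(labels))
    words = {w for d in docs for w in d}
    score = {w: max(chi_square(docs, labels, w, c) for c in classes) for w in words}
    # Ties are broken alphabetically so the result is deterministic.
    return set(sorted(words, key=lambda w: (-score[w], w))[:k])

def train(docs, labels, vocab, lexicon, alpha=0.1):
    """Multinomial naive Bayes with Laplace smoothing, then a polarity boost.

    The boost is a hypothetical stand-in for the paper's adjustment:
    a word with lexicon score s > 0 gets alpha * s added to P(w | "pos"),
    a word with s < 0 gets alpha * |s| added to P(w | "neg").
    """
    classes = sorted(set(labels))
    priors = {c: labels.count(c) / len(labels) for c in classes}
    cond = {}
    for c in classes:
        counts = Counter(
            w for d, y in zip(docs, labels) if y == c for w in d if w in vocab
        )
        total = sum(counts.values())
        cond[c] = {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}
    for w in vocab:
        s = lexicon.get(w, 0.0)
        if s > 0:
            cond["pos"][w] += alpha * s
        elif s < 0:
            cond["neg"][w] += alpha * -s
    return priors, cond

def classify(tokens, priors, cond):
    """Return the class with the highest log score; unknown words are skipped."""
    scores = {
        c: math.log(priors[c])
        + sum(math.log(cond[c][w]) for w in tokens if w in cond[c])
        for c in priors
    }
    return max(scores, key=scores.get)

# Toy data, invented for illustration only.
docs = [
    ["good", "clean", "room"],
    ["great", "good", "service"],
    ["bad", "dirty", "room"],
    ["terrible", "bad", "service"],
]
labels = ["pos", "pos", "neg", "neg"]
lexicon = {"good": 0.8, "great": 0.9, "bad": -0.8, "terrible": -0.9}

vocab = select_features(docs, labels, k=4)
priors, cond = train(docs, labels, vocab, lexicon)
print(sorted(vocab))
print(classify(["good", "room"], priors, cond))
```

In this sketch the adjusted values are not renormalized, so they are scores rather than strict probabilities. Whether the paper renormalizes after adding the influence value is not stated in the abstract.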