Journal of Shandong University (Engineering Science), 2019, Vol. 49, Issue (1): 41-46. doi: 10.6040/j.issn.1672-3961.0.2018.341

• Machine Learning & Data Mining •

Features analysis for Chinese irony detection

Rongxiang ZHOU, Xiuyi JIA*

  1. School of Computer Science and Engineering, Nanjing University of Science & Technology, Nanjing 210094, Jiangsu, China
  • Received: 2018-08-13  Online: 2019-02-20  Published: 2019-03-01
  • Contact: Xiuyi JIA
  • Supported by:


This study examined features for irony detection in microblog data. Based on the characteristics of microblogs and of irony, a variety of features were constructed, such as emotional phrases and emoticons. Experiments showed that, compared with existing features, the proposed irony features improved recognition accuracy by 0.34%, recall by 0.74% and F-measure by 0.18% on the imbalanced datasets. On the balanced datasets, they improved recognition accuracy by 0.44%, recall by 2.54% and F-measure by 0.14%.

Key words: irony detection, sentiment analysis, features construction, imbalanced dataset, balanced dataset

CLC Number: 

  • TP391

Table 1  Categories of degree adverbs

Category | Degree adverbs
Extreme (极量) | 分外,十分,备加,万分,倍加,异常,尤,尤为,尤其,最,最为,无比,极,极为,极其,极度,极端,格外,殊,深为,特,特别,甚,甚为,绝,绝伦,绝对,绝顶,至,至为,透,透顶,顶,顶顶,非常
High (高量) | 大,大为,大大,太,好,好不,好生,很,忒,怪,挺,愈,愈为,愈加,愈发,愈益,更,更为,更其,更加,多,多么,比较,益,益发,相当,较,较为,较比,越,越加,越发,过,过于,过分,颇,颇为,蛮,老
Medium-high (较中量) | 何其,何等,够,尽,全然,满,还,真
Medium (中量) | 几,几乎,差不多,差点儿
Medium-low (较低量) | 不大,不太,不很,不甚,不胜
Low (低量) | 多少,微微,丝毫,些微,有些,有点,有点儿,略,略为,略微,略略,毫,稍,稍为,稍微,稍稍,稍许
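
As an illustration of how a lexicon such as Table 1 might be turned into a feature, the following minimal sketch maps a word to its degree category. The function names, the numeric ranks and the shortened word lists are illustrative assumptions, not the authors' implementation; only a few adverbs per category are copied from the table.

```python
# Hypothetical sketch: look up the degree category of an adverb from Table 1.
# Only a handful of entries per category are included, and the numeric ranks
# are an assumption made for illustration, not values from the paper.
DEGREE_LEXICON = {
    "extreme": ["非常", "极其", "格外", "无比"],
    "high": ["很", "太", "相当", "更加"],
    "medium-high": ["何其", "何等", "真"],
    "medium": ["几乎", "差不多", "差点儿"],
    "medium-low": ["不大", "不太", "不很"],
    "low": ["有点", "稍微", "略微"],
}

# Rank categories from strongest (6) to weakest (1), following table order.
RANK = {name: rank for rank, name in zip(range(6, 0, -1), DEGREE_LEXICON)}

# Invert the lexicon once so that each lookup is a single dict access.
WORD_TO_CATEGORY = {
    word: category
    for category, words in DEGREE_LEXICON.items()
    for word in words
}


def degree_level(word):
    """Return (category, rank) for a listed degree adverb, else None."""
    category = WORD_TO_CATEGORY.get(word)
    return None if category is None else (category, RANK[category])


def strongest_degree(tokens):
    """Return the highest degree rank among the tokens, or 0 if none match."""
    levels = [degree_level(token) for token in tokens]
    return max((level[1] for level in levels if level), default=0)
```

A call such as `strongest_degree(["今天", "真", "有点", "冷"])` returns the rank of the strongest listed adverb in an already segmented sentence; word segmentation itself is outside the scope of this sketch.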

Table 2  Models of emotional phrases

Pattern | Examples
n+q | 凯歌阵阵
q+n | 阵阵凯歌
a+(uj)+n | 美丽(的)花,最好(的)恭维
b+n | 中等身材
n+u | 暴风雨般,绅士似的
ad+(uj)+vn | 迅速发展,认真(的)咨询
eng+n | AA制
d+n | 真是棒极了,真是调皮
d+v | 互相支援,没法接受,强烈反对
a+(ul)+n | 活跃气氛,学好功课,热爱家乡,拉仇恨,支持魅族,抛弃(了)魅族,没牛人
v+a | 洗刷干净,喜欢清净
a+(ud)+v | 关心(得)不够,油价上升,经济复苏
v+(ud)+d | 高兴(得)很
v+r | 鼓励他
ad+v | 正确领导
v+(ud)+l | 吃(得)很饱
v+m | 有一套
a+(ud)+d | 痛快极了,漂亮(的)很,好(得)很,标致极了
d+a | 非常漂亮,比较合适,非常谦虚,比较空,有点特殊,过分良好
a+v | 富裕起来
n+(v)+a | 态度和蔼,情节较重,外形好,胆子(可)真大
v+(ul)+b | 看(了)高兴
m+a | 十分壮丽
m+f | 100以上
n+f | 天亮以后,尖子生里面
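
The patterns in Table 2 read as sequences of part-of-speech tags, and a tag in parentheses appears to be optional (for example, 美丽(的)花 fits a+(uj)+n with or without 的). A minimal sketch of matching such a pattern against an already tagged sentence is given below. It assumes the input is a list of (word, tag) pairs produced by some POS tagger; the function names and the reading of parentheses as "optional" are inferences from the examples, not the authors' code.

```python
def compile_pattern(pattern):
    """Turn a pattern such as 'a+(uj)+n' into (tag, optional) pairs."""
    parts = []
    for piece in pattern.split("+"):
        optional = piece.startswith("(") and piece.endswith(")")
        parts.append((piece.strip("()"), optional))
    return parts


def match_at(tagged, start, parts):
    """Return the end index if parts match tagged[start:], else None."""
    i = start
    for tag, optional in parts:
        if i < len(tagged) and tagged[i][1] == tag:
            i += 1            # tag matched, consume one token
        elif not optional:
            return None       # a required tag is missing
    return i


def find_phrases(tagged, pattern):
    """Collect every phrase in a tagged sentence that fits the pattern."""
    parts = compile_pattern(pattern)
    found = []
    for start in range(len(tagged)):
        end = match_at(tagged, start, parts)
        if end is not None and end > start:
            found.append("".join(word for word, _ in tagged[start:end]))
    return found


# Hypothetical tagger output for the phrase 最好的恭维.
tagged_sentence = [("最好", "a"), ("的", "uj"), ("恭维", "n")]
print(find_phrases(tagged_sentence, "a+(uj)+n"))
```

The matcher is greedy and does not backtrack, which is enough for the short patterns listed in the table but would need refinement if an optional tag could also satisfy the following required slot.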

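Table 3  Precision of three feature systems on imbalanced datasets

Features | Naive Bayes | Logistic regression | Support vector machine | Decision tree | Random forest
Unigram features | 0.445 | 0.404 | 0.690 | 0.642 | 0.610
Irony features of Ref. [10] | 0.494 | 0.257 | 0.728 | 0.687 | 0.798
9 irony features | 0.505 | 0.244 | 0.729 | 0.680 | 0.823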

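Table 4  Recall of three feature systems on imbalanced datasets

Features | Naive Bayes | Logistic regression | Support vector machine | Decision tree | Random forest
Unigram features | 0.509 | 0.467 | 0.393 | 0.233 | 0.300
Irony features of Ref. [10] | 0.560 | 0.513 | 0.489 | 0.339 | 0.321
9 irony features | 0.586 | 0.521 | 0.502 | 0.339 | 0.311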

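Table 5  F-measure of three feature systems on imbalanced datasets

Features | Naive Bayes | Logistic regression | Support vector machine | Decision tree | Random forest
Unigram features | 0.472 | 0.434 | 0.513 | 0.348 | 0.431
Irony features of Ref. [10] | 0.525 | 0.342 | 0.585 | 0.453 | 0.472
9 irony features | 0.543 | 0.332 | 0.595 | 0.452 | 0.464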
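
Tables 3 to 5 report precision, recall and F-measure. For reference, a minimal sketch of the standard definitions for the positive (ironic) class is given below. It assumes the F-measure is the balanced F1 score; the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 50 true positives, 25 false positives, 50 false negatives.
p, r, f = precision_recall_f1(tp=50, fp=25, fn=50)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

If the reported scores were averaged over several runs or folds, the averaged F-measure need not equal the harmonic mean of the averaged precision and recall, which may explain why entries in Table 5 do not always coincide with values recomputed from Tables 3 and 4.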

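Table 6  Precision of three feature systems on balanced datasets

Features | Naive Bayes | Logistic regression | Support vector machine | Decision tree | Random forest
Unigram features | 0.768 | 0.699 | 0.732 | 0.660 | 0.739
Irony features of Ref. [10] | 0.803 | 0.710 | 0.758 | 0.691 | 0.785
9 irony features | 0.803 | 0.734 | 0.766 | 0.698 | 0.768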

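Table 7  Recall of three feature systems on balanced datasets

Features | Naive Bayes | Logistic regression | Support vector machine | Decision tree | Random forest
Unigram features | 0.780 | 0.775 | 0.873 | 0.785 | 0.802
Irony features of Ref. [10] | 0.819 | 0.779 | 0.861 | 0.756 | 0.870
9 irony features | 0.827 | 0.808 | 0.881 | 0.811 | 0.885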

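Table 8  F-measure of three feature systems on balanced datasets

Features | Naive Bayes | Logistic regression | Support vector machine | Decision tree | Random forest
Unigram features | 0.773 | 0.734 | 0.795 | 0.714 | 0.768
Irony features of Ref. [10] | 0.811 | 0.742 | 0.805 | 0.721 | 0.824
9 irony features | 0.815 | 0.768 | 0.819 | 0.750 | 0.821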
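
The gains quoted in the abstract appear to correspond to the difference between the 9 irony features and the features of Ref. [10], averaged over the five classifiers. The sketch below recomputes those averages from Tables 3 to 8. The averaging rule is an inference from the numbers rather than something the page states, and the variable and function names are illustrative.

```python
# Rows copied from Tables 3-8 as (Ref. [10] features, 9 irony features).
# Column order: Naive Bayes, logistic regression, SVM, decision tree,
# random forest.
TABLES = {
    ("imbalanced", "precision"): ([0.494, 0.257, 0.728, 0.687, 0.798],
                                  [0.505, 0.244, 0.729, 0.680, 0.823]),
    ("imbalanced", "recall"): ([0.560, 0.513, 0.489, 0.339, 0.321],
                               [0.586, 0.521, 0.502, 0.339, 0.311]),
    ("imbalanced", "f-measure"): ([0.525, 0.342, 0.585, 0.453, 0.472],
                                  [0.543, 0.332, 0.595, 0.452, 0.464]),
    ("balanced", "precision"): ([0.803, 0.710, 0.758, 0.691, 0.785],
                                [0.803, 0.734, 0.766, 0.698, 0.768]),
    ("balanced", "recall"): ([0.819, 0.779, 0.861, 0.756, 0.870],
                             [0.827, 0.808, 0.881, 0.811, 0.885]),
    ("balanced", "f-measure"): ([0.811, 0.742, 0.805, 0.721, 0.824],
                                [0.815, 0.768, 0.819, 0.750, 0.821]),
}


def mean_gain(baseline, proposed):
    """Difference of classifier-averaged scores, in percentage points."""
    average = lambda scores: sum(scores) / len(scores)
    return round((average(proposed) - average(baseline)) * 100, 2)


GAINS = {key: mean_gain(*rows) for key, rows in TABLES.items()}

for (dataset, metric), gain in GAINS.items():
    print(f"{dataset:<10} {metric:<9} {gain:+.2f}")
```

Under this assumption the recomputed gains are 0.34, 0.74 and 0.18 points on the imbalanced datasets and 0.44, 2.54 and 1.40 points on the balanced datasets. Five of the six values match the abstract; the balanced F-measure gain computed from Table 8 differs from the 0.14% quoted there.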


[Figure] Precision of three feature systems on datasets of different sizes

[Figure] Recall of three feature systems on datasets of different sizes

[Figure] F-measure of three feature systems on datasets of different sizes

1 WANG M, CAO D, LI L, et al. Microblog sentiment analysis based on cross-media bag-of-words model[C]//International Conference on Internet Multimedia Computing and Service. Xiamen: ACM, 2014: 76.
2 JIANG F, LIU Y, LUAN H, et al. Microblog sentiment analysis with emoticon space model[C]//Chinese National Conference on Social Media Processing. Berlin, Germany: Springer, 2014: 76-87.
3 OU G, CHEN W, LI B, et al. Clusm: an unsupervised model for microblog sentiment analysis incorporating link information[C]//International Conference on Database Systems for Advanced Applications. Bali, Indonesia: Springer, 2014: 481-494.
4 KAROUI J, BENAMARA F, MORICEAU V, et al. Towards a contextual pragmatic model to detect irony in tweets[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Beijing: ACL, 2015: 644-650.
5 BRUNTSCH R, RUCH W. Studying irony detection beyond ironic criticism: let's include ironic praise[J]. Frontiers in Psychology, 2017, 8: 606.
doi: 10.3389/fpsyg.2017.00606
6 TASLIOGLU H, KARAGOZ P. Irony detection on microposts with limited set of features[C]//Proceedings of the Symposium on Applied Computing. Marrakech, Morocco: ACM, 2017: 1076-1081.
7 SAVOV P, NIELEK R. Ridiculously expensive watches and surprisingly many reviewers: a study of irony[C]//International Conference on Web Intelligence. Omaha, USA: IEEE, 2017: 725-729.
8 LIU Zhengguang. A critique of irony theories[J]. Journal of PLA University of Foreign Languages, 2002, 22(4): 16-18.
doi: 10.3969/j.issn.1002-722X.2002.04.004
9 TANG Y J, CHEN H. Chinese irony corpus construction and ironic structure analysis[C]//Proceedings of the 25th International Conference on Computational Linguistics. Dublin, Ireland: ACL, 2014: 1269-1278.
10 DENG Zhao, JIA Xiuyi, CHEN Jiajun. A survey on Chinese ironic detection in microblog[J]. Computer Engineering and Science, 2015, 37(12): 2312-2317.
doi: 10.3969/j.issn.1007-130X.2015.12.018
11 WU C H, WU F Z, WU S X, et al. THU_NGN at SemEval-2018 task 3: tweet irony detection with densely connected LSTM and multi-task learning[C]//Proceedings of the 12th International Workshop on Semantic Evaluation. New Orleans, USA: ACL, 2018: 51-56.
12 XING Zhutian, XU Yang. A study of Chinese sarcasm detection methods on internet texts[J]. Journal of Shanxi University (Natural Science Edition), 2015, 38(3): 385-391.
13 CHARALAMPAKIS B, SPATHIS D, KOUSLIS E, et al. Detecting irony on Greek political tweets: a text mining approach[C]//Proceedings of the 16th International Conference on Engineering Applications of Neural Networks. New York, USA: ACM, 2015, 17: 1-5.
14 REYES A, ROSSO P. Mining subjective knowledge from customer reviews: a specific case of irony detection[C]//Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis. Portland, USA: ACL, 2011: 118-124.
15 BOUAZIZI M, OHTSUKI T. Sarcasm detection in Twitter[C]//Global Communications Conference. San Diego, USA: IEEE, 2016: 1-6.
16 RAVI K, RAVI V. A novel automatic satire and irony detection using ensembled feature selection and data mining[J]. Knowledge-Based Systems, 2016, 120: 15-33.
17 CARVALHO P. Clues for detecting irony in user-generated contents: oh...!! it's "so easy" ;-)[C]//International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion. New York, USA: ACM, 2009: 53-56.
18 LÜ Shuxiang. Essentials of Chinese grammar (中国文法要略)[M]. Beijing: The Commercial Press, 1982.