Journal of Shandong University (Engineering Science), 2018, Vol. 48, Issue (5): 47-54. doi: 10.6040/j.issn.1672-3961.0.2018.207

• Machine Learning & Data Mining •

Suggestion sentence classification model based on feature fusion and ensemble learning

Pu ZHANG1, Chang LIU1, Yong WANG2

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  2. Key Laboratory of Electronic Commerce and Logistics, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received:2018-05-31 Online:2018-10-01 Published:2018-05-31
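  • Supported by:
    Youth Foundation Project of Humanities and Social Sciences Research, Ministry of Education (17YJCZH247); Humanities and Social Sciences Research Project of Chongqing Municipal Education Commission (17SKG055); National Natural Science Foundation of China (61472464); Doctoral Start-up Foundation of Chongqing University of Posts and Telecommunications (A2016-02)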

Abstract:

As an emerging research task, suggestion mining has attracted growing attention from researchers in recent years. Compared with English, Chinese offers more varied ways of expressing suggestions, with many distinct characteristics, so suggestion mining needs to be studied in the Chinese setting. Because suggestion sentence detection is the core task of suggestion mining, this study proposes an ensemble learning model that integrates the Stacking and Bagging methods to classify reviews and detect suggestion sentences. The model first uses Stacking to combine classifiers and construct a probabilistic feature space. A convolutional neural network (CNN) and a paragraph vector (PV) model are then used to construct the CNN feature space and the paragraph vector feature space of the reviews, respectively. Finally, these features are fused and a Bagging classifier is trained to classify suggestion sentences. Experimental results on a Chinese dataset verify the effectiveness of the model.

Key words: suggestion mining, suggestion sentence classification, convolutional neural network, ensemble learning, feature fusion

CLC Number: 

  • TP391.1
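
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the data are synthetic, the base classifiers are a subset chosen for convenience (the FM model is omitted), the CNN and PV feature blocks are hypothetical stand-ins taken from slices of the raw features, and the decision-tree base learner for Bagging is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption); the paper works on Chinese reviews.
X, y = make_classification(n_samples=400, n_features=30, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Base classifiers for the Stacking stage (illustrative subset).
base_models = [
    GaussianNB(),
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
    ExtraTreesClassifier(n_estimators=50, random_state=0),
]


def probabilistic_features(models, X_tr, y_tr, X_te):
    """Build one probability column per base model.

    Training rows get out-of-fold probabilities so that no model scores
    a sample it was fitted on; test rows are scored by a model refitted
    on the full training set.
    """
    train_cols, test_cols = [], []
    for model in models:
        oof = cross_val_predict(model, X_tr, y_tr, cv=5,
                                method="predict_proba")[:, 1]
        train_cols.append(oof)
        test_cols.append(model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
    return np.column_stack(train_cols), np.column_stack(test_cols)


P_train, P_test = probabilistic_features(base_models, X_train, y_train, X_test)

# Hypothetical stand-ins for the CNN and paragraph-vector feature spaces.
cnn_train, cnn_test = X_train[:, :10], X_test[:, :10]
pv_train, pv_test = X_train[:, 10:20], X_test[:, 10:20]

# Feature fusion: concatenate the three feature spaces column-wise.
fused_train = np.hstack([P_train, cnn_train, pv_train])
fused_test = np.hstack([P_test, cnn_test, pv_test])

# Bagging over the fused features (base learner is an assumption).
bagging = BaggingClassifier(DecisionTreeClassifier(random_state=0),
                            n_estimators=25, random_state=0)
bagging.fit(fused_train, y_train)
accuracy = bagging.score(fused_test, y_test)
```

In a real setting the two stand-in blocks would be replaced by activations taken from a trained CNN and by vectors from a paragraph-vector model, each computed per review.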

Fig.1 Framework of the model

Fig.2 Construction of probabilistic feature space

Fig.3 Structure of Pos-TextCNN model

Table 1 Experimental results (%)

Model                     Precision  Recall  F      Accuracy
NB                        83.42      81.11   82.25  82.17
FM                        87.03      81.17   84.00  84.23
LR                        88.58      82.05   85.19  85.20
RF                        87.36      82.55   84.89  85.01
ET                        86.71      83.02   84.82  84.87
TextCNN                   88.14      72.10   79.32  79.70
Pos-TextCNN               84.37      79.62   81.93  81.49
Stacking+Bagging          87.48      85.20   86.32  86.25
Stacking+Bagging+CNN      87.92      84.90   86.38  86.47
Stacking+Bagging+PV       88.27      85.14   86.68  86.66
CNN+PV+Bagging            86.86      82.97   84.87  84.87
Stacking+Bagging+CNN+PV   88.63      86.06   87.33  87.28
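
The F column in Table 1 appears to be the harmonic mean of precision and recall, F = 2PR / (P + R). That reading is an assumption, since the page does not define F, but the short check below reproduces the reported values for three rows to within rounding. The helper name `f_measure` is illustrative.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from Table 1.
rows = {
    "NB": (83.42, 81.11, 82.25),
    "LR": (88.58, 82.05, 85.19),
    "Stacking+Bagging+CNN+PV": (88.63, 86.06, 87.33),
}

# Largest gap between the recomputed and the reported F value.
max_gap = max(abs(f_measure(p, r) - f) for p, r, f in rows.values())
```

A gap below 0.01 for every row is consistent with the reported values having been rounded to two decimals.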

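Table 2 Classification results of confusing reviews

No.  Review text  NB FM LR RF ET S-B P T
1  "Two days after I upgraded my 9350 phone, a red line appeared on the right side of the screen. How do I deal with it? Hoping an expert will reply."  0 0 0 1 1 0 0 0
2  "The company sx from that tiny country bullies Chinese people too much. I will never buy any of their products again, and I suggest that all Chinese people stop buying them. Love China, support domestic products."  0 0 1 1 1 0 0 0
3  "My Samsung c7 turned into a brick after the system update. I bought the Samsung c7 a month ago and the battery life was quite good then, but after the recent system update it no longer lives up to its name: it used to charge fully in one hour, and now two hours is not enough and the battery drains very fast. I asked customer service, who suggested restoring factory settings or charging with the phone off."  0 0 1 1 1 0 0 0
4  "Which Samsung models are there? I plan to spend between 4 000 and 5 500 on a Samsung. Asking for suggestions!"  1 1 1 0 0 1 0 0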

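Table 3 Classification results of models

No.  Review text  NB FM LR RF ET S-B P T
1  "About the unresponsive fingerprint unlock: the fingerprint unlock is really far too unresponsive, and I normally don't dare enable double-tap to launch the camera. It is several tiers below the ip6s and seriously at odds with the n5's positioning as a flagship."  1 1 1 0 0 1 0 0
2  "Features the next-generation Samsung phone should have (not one should be missing): 1. stylus, 2. infrared remote control, 3. waterproofing (swimming grade), 4. high-definition screen (4K; you won't notice it without VR, but you will once you try it)"  1 1 0 0 0 0 1 1
3  "Please update the C7 to the Grace UX system. I really like this new custom system, it is clean and easy to use. The C7 has not been out for long, so it should not be neglected."  0 0 0 0 0 0 1 1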