JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2011, Vol. 41 ›› Issue (3): 7-11.

• Articles •

Ensemble learning based feature selection for imbalanced problems

LI Xia1, WANG Lian-xi2, JIANG Sheng-yi1   

  1. School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, China;
    2. Department of Business and Trade, Guangdong Vocational College of Science and Trade, Guangzhou 510640, China
  • Received: 2011-02-01 Online: 2011-06-16 Published: 2011-02-01

Abstract:

Traditional feature selection methods mainly aim for optimal accuracy without fully considering the data distribution, so they cannot achieve promising results on imbalanced datasets. A new feature selection method for imbalanced datasets was proposed, based on modifying the data distribution. The approach modified the data distribution many times by sampling with replacement, so that in each new dataset the number of instances from the large classes was equal to the number of minority class samples. The final selected features were then generated by an ensemble learning voting mechanism, which kept the features that received votes from more than half of the new training datasets. Experimental results on several UCI datasets showed that the proposed method was an effective feature selection approach for imbalanced problems.
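The procedure outlined in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the base feature selector, so `top_k_features` below is a hypothetical stand-in (a simple class-mean-gap score for two-class data), and the function names, the number of rounds, the value of `k`, and the choice to keep every minority instance in each resampled dataset are all assumptions made for the example.

```python
import random
from collections import Counter
from statistics import fmean, pstdev


def balanced_resample(X, y, rng):
    """Build one balanced dataset. Assumption: every instance of the smallest
    class is kept, and each larger class is sampled with replacement down to
    the same size."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        chosen = rows if len(rows) == n_min else rng.choices(rows, k=n_min)
        Xb.extend(chosen)
        yb.extend([label] * n_min)
    return Xb, yb


def top_k_features(X, y, k):
    """Hypothetical base selector for two-class data: score each feature by
    the gap between the class means, scaled by the feature's overall standard
    deviation, and return the indices of the k best features."""
    a, b = sorted(set(y))  # assumes exactly two class labels
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        col_a = [row[j] for row, lab in zip(X, y) if lab == a]
        col_b = [row[j] for row, lab in zip(X, y) if lab == b]
        spread = pstdev(col) or 1.0  # avoid dividing by zero on constant features
        scores.append(abs(fmean(col_a) - fmean(col_b)) / spread)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def ensemble_select(X, y, n_rounds=25, k=1, seed=0):
    """Run the base selector on n_rounds balanced resamples and keep the
    features that receive votes from more than half of them."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_rounds):
        Xb, yb = balanced_resample(X, y, rng)
        votes.update(top_k_features(Xb, yb, k))
    return sorted(j for j, v in votes.items() if v > n_rounds / 2)


# Toy data: 40 majority instances (label 0) and 8 minority instances (label 1).
# Feature 0 separates the classes; features 1 and 2 are unrelated to the label.
y = [0] * 40 + [1] * 8
X = [[10.0 * y[i], float(i % 3), float((i * 7) % 5)] for i in range(48)]
selected = ensemble_select(X, y)
print(selected)
```

On this toy data only feature 0 separates the two classes, so it is the feature that collects the majority of votes. In practice the base selector would be replaced by whichever feature selection method is being evaluated.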
 

Key words: imbalanced data, feature selection, ensemble learning, sampling
