JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2010, Vol. 40 ›› Issue (4): 8-11.

A new feature selection method for text categorization

WANG Fa-bo, XU Xin-shun   

  1. School of Computer Science and Technology, Shandong University, Jinan 250101, China
  • Received: 2009-11-11 Online: 2010-08-16 Published: 2009-11-11

Abstract:

Reducing the feature dimension while maintaining categorization accuracy is a key issue in text categorization. A new method based on information theory is proposed to address this problem. The approach aims to eliminate sparsely distributed features and to retain features that are useful for categorization. Used together with existing feature reduction methods, it can further reduce the feature dimension. The proposed method was tested on benchmark text classification problems. The results show that it not only reduces the feature dimension to the order of hundreds but also improves categorization performance.

Key words: text categorization, feature selection, entropy, mutual information, information gain, chi-square statistic
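
The abstract does not state the proposed scoring formula, so the following is only a minimal sketch of one of the standard information-theoretic scores named in the key words, information gain, and not the authors' method. It assumes each document is represented as a set of terms; the function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """IG(t) = H(C) - [P(t) H(C | t) + P(not t) H(C | not t)]."""
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    n = len(labels)
    conditional = 0.0
    for subset in (with_term, without_term):
        if subset:  # skip an empty split to avoid dividing by zero
            conditional += len(subset) / n * entropy(Counter(subset).values())
    return entropy(Counter(labels).values()) - conditional


# Hypothetical toy corpus: each document is a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "team"}]
labels = ["sport", "sport", "finance", "finance"]

# Rank the vocabulary by score; low-scoring terms are candidates for removal.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

In this toy corpus a term such as "goal" that occurs in exactly one category receives the maximum score, while "team", which is spread evenly across both categories, scores zero and would be dropped first.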
