Journal of Shandong University (Engineering Science), 2025, Vol. 55, Issue (6): 21-34. doi: 10.6040/j.issn.1672-3961.0.2024.288

• Machine Learning & Data Mining •

Fast multi-label feature selection method based on global redundancy minimization

TANG Jiefeng, ZHANG Jia*, LONG Jinyi   

  1. College of Information Science and Technology, Jinan University, Guangzhou 510632, Guangdong, China
  • Published: 2025-12-22

Abstract: To address the curse of dimensionality in multi-label learning and the tendency of filter-based feature selection methods to fall into local optima, a fast multi-label feature selection method based on global redundancy minimization is proposed. First, candidate labels and candidate feature subsets are selected from the original label space and feature space through K-means clustering and mutual information calculation. Second, the local-optimum problem is addressed by minimizing global redundancy, which yields the feature weights with minimum feature redundancy so that the best feature subset is output. Third, an ensemble learning strategy is used to improve the stability of feature selection. Experimental results on 14 multi-label datasets show that the proposed method outperforms the compared methods on all classification metrics.
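The global redundancy minimization step can be illustrated with a minimal sketch. The formulation below is an assumption, not the paper's objective: it uses a quadratic-programming-style criterion, minimizing alpha * w^T R w - (1 - alpha) * r^T w over the probability simplex with projected gradient descent, where r holds each feature's mutual information summed over the labels and R holds the pairwise mutual information between features. The function names, the parameter alpha, and the toy data are hypothetical; the K-means label selection and the ensemble step are omitted.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    xs, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    ys, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    joint = np.zeros((xs.size, ys.size))
    np.add.at(joint, (xi, yi), 1.0)          # contingency counts
    joint /= joint.sum()                     # joint distribution
    px = joint.sum(axis=1, keepdims=True)    # marginal of x, shape (k, 1)
    py = joint.sum(axis=0, keepdims=True)    # marginal of y, shape (1, m)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())


def relevance_and_redundancy(X, Y):
    """Relevance vector r (feature vs. labels) and redundancy matrix R."""
    d, q = X.shape[1], Y.shape[1]
    r = np.array([sum(mutual_information(X[:, j], Y[:, l]) for l in range(q))
                  for j in range(d)])
    R = np.array([[mutual_information(X[:, i], X[:, j]) for j in range(d)]
                  for i in range(d)])
    return r, R


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def global_redundancy_weights(r, R, alpha=0.5, steps=500):
    """Assumed objective: min_w alpha * w^T R w - (1 - alpha) * r^T w on the simplex."""
    w = np.full(r.size, 1.0 / r.size)
    # Step size 1/L, with L the Lipschitz constant of the gradient.
    lip = 2.0 * alpha * np.linalg.norm(R, 2) + 1e-12
    for _ in range(steps):
        grad = 2.0 * alpha * R @ w - (1.0 - alpha) * r
        w = project_to_simplex(w - grad / lip)
    return w


# Toy data (hypothetical): two independent labels; features 0 and 1 are
# identical copies of label 1, feature 2 is a copy of label 2.
y1 = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y2 = np.array([0, 1, 0, 1, 0, 1, 0, 1])
Y = np.column_stack([y1, y2])
X = np.column_stack([y1, y1, y2])

r, R = relevance_and_redundancy(X, Y)
w = global_redundancy_weights(r, R)
ranking = np.argsort(-w)  # feature indices, highest weight first
```

In this toy case all three features are equally relevant, but the two duplicated features split their weight while the non-redundant third feature receives the largest weight. All weights are solved for jointly here, in contrast to a greedy filter that adds one feature at a time. Mutual information matrices are not guaranteed to be positive semidefinite, so in general projected gradient descent may reach only a stationary point of this assumed objective.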

Key words: multi-label learning, feature selection, curse of dimensionality, fast global redundancy minimization

CLC Number: TP181