
Journal of Shandong University (Engineering Science) ›› 2022, Vol. 52 ›› Issue (2): 80-88. doi: 10.6040/j.issn.1672-3961.0.2021.316


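An online active learning algorithm for multi-label classification

GONG Kailun, ZHAI Tingting*, TANG Hongcheng

  1. College of Information Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
  • Published: 2022-04-20
  • About the authors: GONG Kailun (born 1995), female, from Jingjiang, Jiangsu; master's student; main research interest: multi-label online active learning. E-mail: 1019467856@qq.com. *Corresponding author: ZHAI Tingting (born 1988), female, from Jiyuan, Henan; lecturer, Ph.D.; main research interests: online machine learning, data stream mining, and stochastic optimization. E-mail: zhtt@yzu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61906165); Natural Science Research Project of Jiangsu Higher Education Institutions (19KJB520064)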


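Abstract: Existing multi-label online active learning algorithms suffer from two drawbacks: their multi-label classifiers converge inefficiently, and their label query strategies ignore the discriminative ability of features. To address both, this paper proposes a multi-label online active learning algorithm based on discrimination sampling and the mirror descent rule (multi-label active mirror descent by discrimination sampling, MLAMD_D). MLAMD_D uses the binary relevance strategy to decompose a multi-label classification problem with C labels into C mutually independent binary classification problems, updates each binary classifier with the mirror descent rule, and selects labels to query with a discrimination-based sampling strategy. MLAMD_D is compared on six multi-label classification datasets with existing algorithms and with MLAMD_R (multi-label active mirror descent by random sampling), a multi-label online active learning algorithm based on random sampling and the mirror descent rule. The experimental results show that MLAMD_D achieves better multi-label classification performance than the other multi-label online active learning algorithms. MLAMD_D is therefore feasible and effective for multi-label online active learning tasks.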

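Keywords: online active learning, multi-label classification, weakly supervised learning, discrimination-based sampling strategy, binary relevance strategy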

CLC number:

  • TP181
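
The binary relevance decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MLAMD_D implementation: the abstract does not specify the mirror descent regularizer or the discrimination-based sampling rule, so two stand-ins are assumed here. The update is a plain hinge-loss gradient step, which is what mirror descent reduces to when the distance-generating function is the squared Euclidean norm. The query rule is a standard margin-based Bernoulli test, substituted for the paper's discrimination sampling. The class name, `eta`, `delta`, and `oracle` are hypothetical names chosen for illustration.

```python
import random


class BinaryRelevanceActiveLearner:
    """C independent linear classifiers, one per label (binary relevance)."""

    def __init__(self, n_features, n_labels, eta=0.1, delta=1.0, seed=0):
        self.weights = [[0.0] * n_features for _ in range(n_labels)]
        self.eta = eta      # step size (assumed hyperparameter)
        self.delta = delta  # controls how often labels are queried (assumed)
        self.rng = random.Random(seed)
        self.queries = 0    # number of labels requested from the oracle

    def scores(self, x):
        """Return the signed score w_c . x of every per-label classifier."""
        return [sum(wi * xi for wi, xi in zip(w, x)) for w in self.weights]

    def predict(self, x):
        """Predict +1 / -1 for each of the C labels."""
        return [1 if s >= 0 else -1 for s in self.scores(x)]

    def step(self, x, oracle):
        """Process one streamed instance.

        oracle(c) returns the true value (+1 or -1) of label c and is only
        called for the labels the learner decides to query.
        """
        for c, s in enumerate(self.scores(x)):
            # Stand-in query rule: a small |score| means an uncertain
            # prediction, so the label is queried with higher probability.
            p = self.delta / (self.delta + abs(s))
            if self.rng.random() >= p:
                continue  # label c is not queried for this instance
            y = oracle(c)
            self.queries += 1
            if y * s < 1.0:  # hinge loss is positive, so take a gradient step
                w = self.weights[c]
                for i, xi in enumerate(x):
                    w[i] += self.eta * y * xi
```

To use it, call `step(x, oracle)` once per arriving instance and `predict(x)` whenever a label vector is needed. Comparing `queries` against the total number of instance-label pairs seen gives the fraction of labels actually requested.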