Journal of Shandong University(Engineering Science), 2025, Vol. 55, Issue (4): 56-71. doi: 10.6040/j.issn.1672-3961.0.2024.055
• Machine Learning & Data Mining •
YANG Jucheng, LU Kaikui, WANG Yuan