A few-shot imitation learning method by improving generalization with meta-learning

doi:10.6040/j.issn.1672-3961.0.2025.105

Abstract

Most classical imitation learning methods train poorly and generalize insufficiently in few-shot scenarios because demonstration data are scarce. To address this, a meta-learning based generative adversarial imitation learning (Meta-GAIL) method was proposed. Through meta-learning, the policy network first accumulated experience from diverse tasks that share characteristics with the target task. The generative adversarial imitation learning (GAIL) algorithm was then used to fine-tune the network on the limited demonstration data provided by the target task, enabling rapid adaptation to new tasks. To validate the method, systematic experiments were conducted on the MuJoCo physics simulation platform, where Meta-GAIL was compared against baseline algorithms. The results showed that, by integrating the cross-task knowledge acquired during the meta-learning phase, Meta-GAIL adapted more rapidly to unseen similar tasks and consistently outperformed the baseline algorithms under few-shot settings.
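The two-phase idea (pre-train across similar tasks, then fine-tune on a few target demonstrations) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not say which meta-learning algorithm is used, so the sketch assumes a Reptile-style first-order update, and it replaces GAIL's adversarial objective with a plain squared-error imitation loss to stay short. The 1-D linear "policy", the task slopes, and all function names are hypothetical.

```python
import random

def bc_grad(theta, demos):
    """Gradient of the mean squared error between policy action theta*s and expert action a.
    Assumption: this squared-error loss stands in for GAIL's adversarial objective."""
    return sum(2.0 * (theta * s - a) * s for s, a in demos) / len(demos)

def adapt(theta, demos, steps, lr):
    """Fine-tune: run a few gradient steps on the given demonstrations."""
    for _ in range(steps):
        theta -= lr * bc_grad(theta, demos)
    return theta

def sample_demos(w, n, rng):
    """Hypothetical expert for a task with slope w: action a = w * s, states s in [-1, 1]."""
    return [(s, w * s) for s in (rng.uniform(-1.0, 1.0) for _ in range(n))]

def meta_train(train_slopes, rng, meta_iters=200, inner_steps=5,
               inner_lr=0.1, meta_lr=0.1):
    """Reptile-style meta-training (an assumption, not taken from the paper):
    adapt to a sampled task, then move the shared initialization toward the result."""
    theta = 0.0
    for _ in range(meta_iters):
        w = rng.choice(train_slopes)
        demos = sample_demos(w, 20, rng)
        adapted = adapt(theta, demos, inner_steps, inner_lr)
        theta += meta_lr * (adapted - theta)
    return theta

rng = random.Random(0)

# Phase 1: accumulate experience on tasks similar to the target.
theta_meta = meta_train([1.8, 2.0, 2.2], rng)

# Phase 2: few-shot fine-tuning on an unseen but similar target task.
target_w = 2.1
few_demos = sample_demos(target_w, 3, rng)
theta_from_meta = adapt(theta_meta, few_demos, steps=3, lr=0.1)
theta_from_scratch = adapt(0.0, few_demos, steps=3, lr=0.1)

print("error with meta-learned init:", abs(theta_from_meta - target_w))
print("error from scratch:          ", abs(theta_from_scratch - target_w))
```

Both fine-tuning runs see the same three demonstrations and take the same three gradient steps, so any gap between the two printed errors comes only from the initialization learned in the first phase.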

Key words: few-shot learning, imitation learning, generative adversarial imitation learning, meta-learning, generalization

CLC Number: TP18

WEI Long, FENG Xiang, YU Huiqun. A few-shot imitation learning method by improving generalization with meta-learning[J]. Journal of Shandong University (Engineering Science), 2026, 56(3): 144-155.

