Journal of Shandong University(Engineering Science) ›› 2025, Vol. 55 ›› Issue (3): 34-45.doi: 10.6040/j.issn.1672-3961.0.2024.065

• Transportation Engineering—Special Issue for Intelligent Transportation •

Hierarchical multi-agent reinforcement learning based route guidance method combining personalization and signal control

GAO Junjian, LIAO Zhuhua*, LIU Yizhi, ZHAO Yijiang   

  1. School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, Hunan, China
  Published: 2025-06-05

Abstract: To further alleviate traffic congestion and improve road network efficiency, this study proposed an urban vehicle route guidance method that integrates personalized routing strategies and traffic signal control based on hierarchical multi-agent reinforcement learning (MARL). Route guidance agents and traffic signal control agents were deployed at intersections to provide personalized routing policies and optimize traffic light control, thereby balancing urban traffic flow. To overcome the limitations of predefined graph structures in representing dynamic traffic state features, the traffic signal control agents employed an adaptive graph convolutional network to autonomously capture spatial correlations among peer agents. Concurrently, the route guidance agents integrated a mean-field game to analyze aggregated vehicle actions, effectively capturing inter-vehicle interactions for coordinated decision-making while delivering destination-specific routing strategies. To prevent local congestion and severe traffic imbalance, a multi-agent proximal policy optimization (MAPPO) algorithm was adopted, enabling centralized training and decentralized execution so that cooperative signal control agents could implement directional flow restriction. A hierarchical reinforcement learning framework facilitated information sharing and collaboration among the heterogeneous agents. Extensive experiments were conducted on the SUMO simulation platform using multiple real-world open-source traffic datasets, with comparisons against baseline methods. Results demonstrated that the proposed method reduced average travel time by at least 11.05% and average delay time by at least 19.90%, significantly enhancing urban traffic efficiency.
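The mean-field step described in the abstract replaces the joint action of all neighbouring vehicles with the empirical mean of their actions, following the mean-field MARL formulation of Yang et al. (2018). A minimal sketch of that aggregation, with all names and the action count purely illustrative:

```python
import numpy as np

N_ACTIONS = 4  # e.g. candidate outgoing road segments at an intersection (assumed)

def mean_field_action(neighbor_actions: np.ndarray) -> np.ndarray:
    """Average the one-hot encoded actions of neighbouring vehicles.

    The agent's value network then conditions on (state, own_action,
    mean_action) instead of the full joint action, so its input size
    stays fixed as the number of neighbours grows.
    """
    one_hot = np.eye(N_ACTIONS)[neighbor_actions]  # shape (k, N_ACTIONS)
    return one_hot.mean(axis=0)                    # shape (N_ACTIONS,)

# A vehicle whose three neighbours chose segments 0, 0 and 2:
mf = mean_field_action(np.array([0, 0, 2]))
# mf is approximately [0.667, 0, 0.333, 0]
```

This keeps per-agent input dimensionality constant regardless of traffic density, which is what makes the inter-vehicle interaction tractable at scale.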
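The MAPPO component builds on the clipped surrogate objective of proximal policy optimization (Schulman et al., 2017), applied per agent with a centralized value function during training. A minimal sketch of that loss, not the paper's actual implementation:

```python
import numpy as np

def ppo_clip_loss(ratio: np.ndarray, advantage: np.ndarray, eps: float = 0.2) -> float:
    """Clipped surrogate policy loss.

    ratio     : pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage : advantage estimates from the centralized critic
    eps       : clipping range; 0.2 is the commonly used default
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Minimize the negative of the pessimistic (element-wise minimum) bound.
    return float(-np.minimum(unclipped, clipped).mean())

# A step whose probability grew 1.5x with positive advantage is clipped at 1.2:
loss = ppo_clip_loss(np.array([1.5]), np.array([1.0]))
# loss == -1.2
```

Clipping bounds how far each decentralized policy can move per update, which stabilizes cooperative training when many signal-control agents learn simultaneously.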

Key words: reinforcement learning, route guidance, signal control, mean field game, adaptive graph convolution

CLC Number: U121