Traffic Signal Control: a Double Q-learning Approach

Anton Agafonov; Vladislav Myasnikov

DOI: http://dx.doi.org/10.15439/2021F109

Citation: Proceedings of the 16th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 25, pages 365–369 (2021)

Abstract. The use of information and communication technologies for solving economic, social, transportation, and other problems in the urban environment is usually considered within the “smart city” concept. Optimal traffic management is one of the key components of smart cities. In this paper, we investigate a reinforcement learning approach to the traffic signal control problem. Both the initial data on the distribution of connected vehicles and the aggregated characteristics of traffic flows are used to describe the state of the reinforcement learning agent. Experimental studies of the proposed model were carried out on synthetic and real data using the CityFlow simulator.
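For readers unfamiliar with the technique named in the title, the following is a minimal sketch of the generic tabular Double Q-learning update. It is not the authors' model, whose details are not given on this page. The state key (for example, a tuple of discretized queue lengths per approach), the action indices (signal phases), the reward, and the helper names `make_q_table`, `double_q_update`, and `select_action` are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    # Hypothetical helper: maps any hashable state to a list of action values.
    return defaultdict(lambda: [0.0] * n_actions)


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9, rng=random):
    # Pick at random which of the two tables is updated on this step.
    if rng.random() < 0.5:
        q_upd, q_eval = q_a, q_b
    else:
        q_upd, q_eval = q_b, q_a
    # The updated table selects the greedy next action...
    next_values = q_upd[next_state]
    best_next = max(range(len(next_values)), key=next_values.__getitem__)
    # ...and the other table evaluates it, which reduces overestimation bias.
    target = reward + gamma * q_eval[next_state][best_next]
    q_upd[state][action] += alpha * (target - q_upd[state][action])


def select_action(q_a, q_b, state, epsilon=0.1, rng=random):
    # Epsilon-greedy over the sum of both tables (an assumed, common choice).
    n_actions = len(q_a[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    totals = [a + b for a, b in zip(q_a[state], q_b[state])]
    return max(range(n_actions), key=totals.__getitem__)
```

In a traffic signal setting, `action` would typically index a signal phase and `reward` would be some negative congestion measure such as total queue length; both are assumptions here rather than the paper's definitions.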
