Abstract
We consider a multi-agent network where agents interact with others in a dynamic environment, making decisions in real time based on coupled state dynamics and action policies. In general, multi-agent reinforcement learning faces formidable technical challenges due to 1) the curse of dimensionality, 2) the curse of partial observability and multi-agency, and 3) the curse of non-stationarity. To tackle these challenges, we propose a world-model-based distributed reinforcement learning (WM-DRL) framework. In particular, by leveraging world models’ low-dimensional latent representations, the proposed WM-DRL framework makes it feasible for agents to share their latent information with other agents of interest via lightweight communication and acquire essential information for decision making, thereby alleviating the challenges of high dimensionality and partial observability. Furthermore, the proposed WM-DRL framework exploits the generalization capability of world models to equip the agents with the power of foresight, enabling them to predict how the environment evolves and thus mitigating the challenge of non-stationarity. Aiming to build a theoretical foundation for WM-DRL, we characterize the latent representation error, quantify its impact on prediction capability, and analyze the effect of information sharing on the performance of WM-DRL. We also develop an open-source RL platform that integrates world models with CARLA for autonomous driving. Extensive experiments are carried out on challenging local trajectory planning tasks in complex vehicle networks to highlight the benefits of the WM-DRL framework.
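To make the information flow concrete, the sketch below illustrates the per-agent loop the abstract describes: each agent encodes its local observation into a low-dimensional latent, exchanges latents with agents of interest over a lightweight channel, feeds the aggregated latents to its policy, and rolls its world model forward one step to "foresee" the next latent state. All module names, dimensions, and the mean-aggregation scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a WM-DRL agent loop (PyTorch); sizes and architecture are assumed.
import torch
import torch.nn as nn

OBS_DIM, LATENT_DIM, ACT_DIM = 64, 8, 2  # illustrative dimensions

class Agent(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(OBS_DIM, LATENT_DIM)                     # observation -> latent
        self.world_model = nn.GRUCell(LATENT_DIM + ACT_DIM, LATENT_DIM)   # latent dynamics model
        self.policy = nn.Linear(2 * LATENT_DIM, ACT_DIM)                  # own + shared latents -> action

    def encode(self, obs):
        return torch.tanh(self.encoder(obs))

    def act(self, own_latent, neighbor_latents):
        # Lightweight communication: only low-dimensional latents are exchanged and aggregated.
        if neighbor_latents:
            shared = torch.stack(neighbor_latents).mean(dim=0)
        else:
            shared = torch.zeros_like(own_latent)
        return torch.tanh(self.policy(torch.cat([own_latent, shared], dim=-1)))

    def foresee(self, latent, action):
        # One-step latent rollout: predict the next latent state without a new observation.
        return self.world_model(torch.cat([latent, action], dim=-1), latent)

agents = [Agent() for _ in range(3)]
observations = [torch.randn(1, OBS_DIM) for _ in agents]  # batch size 1 per agent

latents = [a.encode(o) for a, o in zip(agents, observations)]
for i, agent in enumerate(agents):
    neighbors = [z for j, z in enumerate(latents) if j != i]  # assumed: all other agents are "of interest"
    action = agent.act(latents[i], neighbors)
    predicted_next = agent.foresee(latents[i], action)        # foresight used to cope with non-stationarity
```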
Bio
Junshan Zhang is a professor in the ECE Department and CS graduate faculty at the University of California, Davis. He received his Ph.D. degree from the School of ECE at Purdue University in August 2000 and was on the faculty of the School of ECEE at Arizona State University from 2000 to 2021. His research interests fall in the general field of information networks and data science, including edge AI, reinforcement learning, continual learning, network optimization and control, and game theory. He is a Fellow of the IEEE and a recipient of the ONR Young Investigator Award in 2005 and the NSF CAREER Award in 2003. His papers have won several awards, including the Best Student Paper Award at WiOPT 2018, the Kenneth C. Sevcik Outstanding Student Paper Award of ACM SIGMETRICS/IFIP Performance 2016, the Best Paper Runner-up Award of IEEE INFOCOM 2009 and IEEE INFOCOM 2014, and the Best Paper Award at IEEE ICC 2008 and ICC 2017. He is currently serving as Editor-in-Chief of IEEE/ACM Transactions on Networking.