Independent Learning in Stochastic Games - Where Strategic Decision-Making Meets Reinforcement Learning

Kaiqing Zhang (UMD)

Reinforcement learning (RL) has recently achieved great success in many sequential decision-making applications. Many of the forefront applications of RL involve the decision-making of multiple strategic agents, e.g., playing chess and Go, autonomous driving, and robotics. Unfortunately, the classical RL framework is inappropriate for multi-agent learning, as it assumes an agent’s environment is stationary and does not take into account the adaptive nature of other agents’ behavior. In this talk, I focus on stochastic games as a model for multi-agent reinforcement learning in dynamic environments, and develop independent learning dynamics for stochastic games: each agent is myopic and chooses best-response type actions to other agents’ strategies independently, that is, without any coordination with her opponents. I will present our independent learning dynamics that guarantee convergence in stochastic games, in both zero-sum and single-controller identical-interest settings. Time permitting, I will also discuss our other results along the lines of learning in stochastic games, including positive ones on the sample and iteration complexity of certain multi-agent RL algorithms, and negative ones on the computational complexity of general-sum stochastic games.

Bio: Kaiqing Zhang is currently an Assistant Professor in the Department of Electrical and Computer Engineering (ECE) and the Institute for Systems Research (ISR) at the University of Maryland, College Park. He is also affiliated with the Maryland Robotics Center (MRC). During the deferral time before joining Maryland, he was a postdoctoral scholar affiliated with LIDS and CSAIL at MIT, and a Research Fellow at the Simons Institute for the Theory of Computing at Berkeley. He received his Ph.D. from the Department of ECE and CSL at the University of Illinois at Urbana-Champaign (UIUC). He also received M.S. degrees in both ECE and Applied Math from UIUC, and a B.E. from Tsinghua University. His research interests lie broadly in Control and Decision Theory, Game Theory, Robotics, Reinforcement/Machine Learning, Computation, and their intersections. He is the recipient of several awards and fellowships, including Hong, McCully, and Allen Fellowship, Simons-Berkeley Research Fellowship, CSL Thesis Award, and ICML Outstanding Paper. See more details at

Recorded Talk