Special Theory Talk

— 1:00pm

Location:
In Person and Virtual - ET - Reddy Conference Room, Gates Hillman 4405 and Zoom

Speaker:
ZAIWEI CHEN, Postdoctoral Fellow, The Computing + Mathematical Sciences Department, California Institute of Technology
https://sites.google.com/view/zaiweichen/home

A Finite-Sample Analysis of Payoff-Based Independent Learning in Zero-Sum Stochastic Games

In this work, we study two-player zero-sum stochastic games and develop a natural variant of best-response learning dynamics called doubly smoothed best-response dynamics, which combines a discrete and smoothed variant of the best-response dynamics with temporal-difference (TD)-learning and minimax value iteration. The resulting learning dynamics is payoff-based, convergent, rational, and symmetric between players. Our theoretical results present the first last-iterate finite-sample analysis of such learning dynamics. Specifically, in the stateless setting (which corresponds to zero-sum matrix games), we establish a sample complexity of $O(\epsilon^{-1})$ to find the Nash distribution and a sample complexity of $O(\epsilon^{-8})$ to find a Nash equilibrium. For general stochastic games, our learning dynamics also enjoys a sample complexity of $O(\epsilon^{-8})$ to find a Nash equilibrium. To establish the results, we develop a coupled Lyapunov drift approach to capture the evolution of multiple sets of coupled and stochastic iterates, which might be of independent interest.

Dr. Zaiwei Chen is currently a CMI postdoctoral fellow in The Computing + Mathematical Sciences (CMS) Department at the California Institute of Technology, hosted by Dr. Adam Wierman and Dr. Eric Mazumdar. Zaiwei obtained a Ph.D. degree in Machine Learning, an M.S. degree in Mathematics, and an M.S. degree in Operations Research from Georgia Institute of Technology, where he was advised by Dr. Siva Theja Maguluri and Dr. John-Paul Clarke. Before that, Zaiwei obtained his B.S. degree in Electrical Engineering at Chu Kochen Honors College, Zhejiang University. Zaiwei was a recipient of the Simoudis Discovery Prize and was named a PIMCO Postdoctoral Fellow in Data Science in 2022. His Ph.D. thesis won the Sigma Xi Best Ph.D. Thesis Award and was selected as a runner-up for the 2022 SIGMETRICS Doctoral Dissertation Award. Earlier, Zaiwei received the ARC-TRIAD Student Fellowship in 2021, and a proposal based on his research received the IDEaS-TRIAD Research Scholarship in 2020.

