> [!WARNING]
> Under development!
This repository aims to provide tutorials on multi-agent reinforcement learning (MARL). Before diving into the modeling of multiple agents interacting through stochastic games (the multi-agent generalization of the Markov decision process, introduced by Shapley in 1953), we first cover the basics of game theory and other fundamental concepts.
> [!NOTE]
> Currently, the repository has a strong focus on game theory, which may eventually be moved to a separate GitHub repository.
When modeling a strategic interaction as a game, the following questions help classify the game and select appropriate solution concepts (a minimal worked example follows this list):

- How many players are involved? The number of players defines the game's structure (e.g., two-player vs. multi-player) and influences the complexity of the analysis.
- Are the available actions finite or infinite? Whether the action space is finite or infinite affects the strategies players can adopt and the mathematical tools used for analysis.
- Are moves simultaneous or sequential? Identifying whether players act simultaneously or in sequence helps classify the game (e.g., normal-form vs. extensive-form) and shapes strategy formulation.
- Is the game static or dynamic? A static game is played once, while a dynamic game unfolds over multiple periods. Dynamic games often call for more involved solution methods, such as backward induction.
- Is information complete or incomplete? Assessing whether players have full knowledge of the game's structure and payoffs is vital; incomplete information leads to Bayesian or signaling games.
- Do players have private information (asymmetric information)? If some players hold private information, the game involves asymmetric information, leading to scenarios such as adverse selection or moral hazard.
- Are players cooperating or competing? Whether players work together (cooperative games) or compete (non-cooperative games) determines the modeling approach and the applicable solution concepts.
- What is the payoff structure? How payoffs are determined (zero-sum, positive-sum, or involving externalities) is crucial for understanding incentives and outcomes.
- What level of rationality is assumed? Models typically assume players act in their own best interest; the assumed level of rationality (e.g., perfect or bounded) can significantly affect predictions.
- Is the game repeated, and how does repetition affect strategies? In repeated games, strategies may evolve based on past interactions, and repetition can sustain cooperation (e.g., the folk theorem).
- Which equilibrium concepts are applicable? Identifying the appropriate equilibrium concept (e.g., Nash, subgame perfect, Bayesian) is essential for predicting outcomes.
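To ground several of these questions, here is a minimal sketch (the payoffs are the textbook Prisoner's Dilemma; the code is written for this tutorial, not taken from any referenced paper) of a two-player, finite-action, simultaneous-move, static game of complete information, together with a brute-force search for pure-strategy Nash equilibria:

```python
import numpy as np

# Prisoner's Dilemma; actions: 0 = cooperate, 1 = defect.
# payoffs[i][a1, a2] is player i's payoff. The game is general-sum
# (not zero-sum): mutual cooperation beats mutual defection.
payoffs = [
    np.array([[-1, -3],    # player 1
              [ 0, -2]]),
    np.array([[-1,  0],    # player 2
              [-3, -2]]),
]

def pure_nash_equilibria(payoffs):
    """Profiles where no player gains by deviating unilaterally."""
    p1, p2 = payoffs
    return [
        (a1, a2)
        for a1 in range(p1.shape[0])
        for a2 in range(p1.shape[1])
        if p1[a1, a2] >= p1[:, a2].max() and p2[a1, a2] >= p2[a1, :].max()
    ]

print(pure_nash_equilibria(payoffs))  # [(1, 1)]: mutual defection
```

Each classification question above has a definite answer for this game: two players, finite actions, simultaneous moves, static, complete information, non-cooperative, general-sum payoffs, rational players, no repetition, and Nash equilibrium as the solution concept.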
Normal-form Games (Strategic-form Games)
- Von Neumann, J., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton University Press.
This book established the foundations of normal-form games, providing a structure for analyzing strategic interactions in which players make decisions simultaneously.
Bayesian Games
- Harsanyi, J. C. (1967). Games with incomplete information played by “Bayesian” players, I–III: Part I. The basic model. Management Science, 14(3), 159-182.
Harsanyi’s work introduced Bayesian games, which extend normal-form games to situations of incomplete information, allowing players to form beliefs about unknown factors.
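As a minimal sketch of the idea (all numbers are invented for illustration), a player facing an opponent of unknown type maximizes expected payoff under a prior belief over types:

```python
import numpy as np

# Player 1's payoff matrix u1[a1, a2]; player 2's action depends on a
# privately known type. All numbers are invented for illustration.
u1 = np.array([[3.0, 0.0],
               [2.0, 1.0]])

# Player 1's prior belief over player 2's types, and each type's play
# (fixed here for brevity; in a full Bayesian game each type also
# best-responds given its own beliefs).
belief = {"aggressive": 0.4, "passive": 0.6}
action_of_type = {"aggressive": 1, "passive": 0}

expected_u1 = np.array([
    sum(p * u1[a1, action_of_type[t]] for t, p in belief.items())
    for a1 in range(2)
])
print(expected_u1, "-> best response:", expected_u1.argmax())
# [1.8 1.6] -> best response: 0
```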
Potential Games
- Monderer, D., & Shapley, L. S. (1996). Potential games. Games and Economic Behavior, 14(1), 124-143.
This paper defines potential games, a class of games where the incentive structure can be captured by a potential function, simplifying the analysis of equilibria.
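A minimal numerical sketch of the defining property (matrices invented for illustration): in an exact potential game, every unilateral deviation changes the deviator's payoff by exactly the change in a single potential function. The simplest instance is an identical-interest game, where the common payoff itself is a potential:

```python
import numpy as np

# Identical-interest (team) game: both players get the same payoff,
# so that payoff itself serves as an exact potential. Values invented.
u1 = np.array([[2.0, 0.0],
               [0.0, 1.0]])
u2 = u1.copy()
phi = u1.copy()  # candidate potential

def is_exact_potential(u1, u2, phi):
    """Monderer-Shapley condition: every unilateral deviation changes
    the deviator's payoff by exactly the change in phi."""
    n, m = phi.shape
    for a1 in range(n):
        for a2 in range(m):
            for b1 in range(n):  # player 1 deviates: a1 -> b1
                if u1[b1, a2] - u1[a1, a2] != phi[b1, a2] - phi[a1, a2]:
                    return False
            for b2 in range(m):  # player 2 deviates: a2 -> b2
                if u2[a1, b2] - u2[a1, a2] != phi[a1, b2] - phi[a1, a2]:
                    return False
    return True

print(is_exact_potential(u1, u2, phi))  # True
# Local maxima of phi, here (0, 0) and (1, 1), are pure Nash equilibria.
```

Because the potential increases along every better-reply path, best-response dynamics in a finite potential game terminate at a pure Nash equilibrium.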
Robust Games
- Aghassi, M., & Bertsimas, D. (2006). Robust game theory. Mathematical Programming, 107(1), 231-273.
This work extends game theory to account for uncertainty and robustness, providing tools for decision-making in uncertain environments.
Stackelberg Games
- von Stackelberg, H. (1934). Marktform und Gleichgewicht [Market structure and equilibrium]. Springer, Vienna.
Stackelberg introduced a model where leaders move first and followers react, which is fundamental in analyzing competitive markets and leadership dynamics in sequential games.
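A minimal sketch of the leader-follower logic on an invented bimatrix game: the follower best-responds to the leader's committed action, and the leader optimizes by backward induction against that response:

```python
import numpy as np

# Invented bimatrix game: L[a, b] is the leader's payoff, F[a, b] the
# follower's, when the leader commits to a and the follower replies b.
L = np.array([[3.0, 1.0],
              [4.0, 0.0]])
F = np.array([[1.0, 2.0],
              [2.0, 1.0]])

def stackelberg(L, F):
    """Backward induction: anticipate the follower's best response to
    each commitment, then pick the leader's best induced outcome."""
    best = max(
        ((a, int(F[a].argmax())) for a in range(L.shape[0])),
        key=lambda ab: L[ab],
    )
    return best, L[best]

print(stackelberg(L, F))  # ((1, 0), 4.0)
```

If the follower is indifferent between replies, the tie-breaking rule matters; this sketch simply takes whatever argmax returns first.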
Extensive-form Games
- Von Neumann, J., & Morgenstern, O. (1944). Theory of games and economic behavior. Princeton University Press.
This book also introduced extensive-form games, providing a framework for analyzing games where the order of moves and the information available at each decision point are crucial.
Stochastic Games (Markov Games)
- Shapley, L. S. (1953). Stochastic games. Proceedings of the National Academy of Sciences, 39(10), 1095-1100.
Shapley’s work introduced stochastic games, which generalize Markov decision processes to multiple players: at each state the players act simultaneously, and their joint action determines both the stage payoffs and the stochastic transition to the next state.
- Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the 11th International Conference on Machine Learning (ICML) (pp. 157-163).
Littman introduced Minimax-Q learning and extended the concept of stochastic games into the realm of multi-agent reinforcement learning, emphasizing their application in AI and dynamic environments.
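The computational core of Minimax-Q is solving, at each state, the minimax value of a zero-sum stage game. Here is a minimal sketch using linear programming via scipy (the payoff matrix is illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def minimax_value(A):
    """Value and maximin strategy for the row player (maximizer) of the
    zero-sum matrix game A:  max v  s.t.  x @ A[:, j] >= v for all j,
    with x a probability vector over rows."""
    n, m = A.shape
    c = np.zeros(n + 1)
    c[-1] = -1.0                                 # linprog minimizes -v
    A_ub = np.hstack([-A.T, np.ones((m, 1))])    # v - x @ A[:, j] <= 0
    b_ub = np.zeros(m)
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * n + [(None, None)]
    res = linprog(c, A_ub, b_ub, A_eq, b_eq, bounds)
    return res.x[:n], res.x[-1]

# Matching pennies: maximin strategy (0.5, 0.5), value 0.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
print(minimax_value(A))
```

In Minimax-Q, a linear program of this form replaces the max operator of single-agent Q-learning at every visited state, with A holding the current Q-values of the stage game.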
Evolutionary Game Theory
- Maynard Smith, J., & Price, G. R. (1973). The logic of animal conflict. Nature, 246(5427), 15-18.
This paper introduced the concept of evolutionarily stable strategies (ESS), a cornerstone of evolutionary game theory.
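A minimal simulation sketch of the replicator dynamics in the Hawk-Dove game (parameter values V = 2 and C = 4 chosen for illustration), whose interior rest point is the evolutionarily stable strategy:

```python
import numpy as np

# Hawk-Dove: resource value V, fight cost C (V = 2, C = 4 chosen for
# illustration). A[i, j] is the payoff to strategy i against strategy j;
# 0 = Hawk, 1 = Dove.
V, C = 2.0, 4.0
A = np.array([[(V - C) / 2, V],
              [0.0,         V / 2]])

x = np.array([0.9, 0.1])                  # initial population shares
dt = 0.01
for _ in range(10_000):                   # Euler steps of the replicator ODE
    fitness = A @ x                       # payoff of each pure strategy
    x = x + dt * x * (fitness - x @ fitness)

print(x)  # approaches the ESS (V/C, 1 - V/C) = (0.5, 0.5)
```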
Repeated Games
- Fudenberg, D., & Maskin, E. (1986). The folk theorem in repeated games with discounting or with incomplete information. Econometrica, 54(3), 533–554.
This work is key in the theory of repeated games, explaining how cooperation can be sustained over time in strategic settings.
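To make the folk-theorem intuition concrete, here is a minimal worked check (stage payoffs invented, grim-trigger punishment assumed) of when cooperation is sustainable in the repeated Prisoner's Dilemma:

```python
# Repeated Prisoner's Dilemma with grim trigger (illustrative payoffs):
# R = mutual cooperation, T = temptation to defect, P = mutual punishment.
R, T, P = 3.0, 5.0, 1.0

def cooperation_sustainable(delta):
    """Cooperating forever beats a one-shot defection followed by
    permanent punishment iff R/(1-d) >= T + d*P/(1-d)."""
    return R / (1 - delta) >= T + delta * P / (1 - delta)

print((T - R) / (T - P))             # 0.5, the critical discount factor
print(cooperation_sustainable(0.4))  # False: players too impatient
print(cooperation_sustainable(0.6))  # True: future matters enough
```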
Cooperative Games
- Shapley, L. S. (1953). A value for n-person games. In Contributions to the Theory of Games, Volume II (pp. 307-317). Princeton University Press.
Shapley introduced the Shapley value, a solution concept in cooperative game theory that is widely used in economics and political science.
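A minimal sketch computing the Shapley value by averaging marginal contributions over all join orders; the characteristic function below is an illustrative three-player "glove game", not taken from the paper:

```python
from itertools import permutations

# Three-player "glove game" (illustrative): player 0 owns a left glove,
# players 1 and 2 each own a right glove; a coalition earns 1 iff it
# can form a matched pair.
players = [0, 1, 2]

def v(coalition):
    s = set(coalition)
    return 1.0 if 0 in s and (1 in s or 2 in s) else 0.0

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = []
        for p in order:
            totals[p] += v(coalition + [p]) - v(coalition)
            coalition.append(p)
    return {p: t / len(orders) for p, t in totals.items()}

print(shapley_values(players, v))  # {0: 0.667, 1: 0.167, 2: 0.167}
```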
Nash Equilibrium
- Nash, J. F. (1950). Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1), 48-49.
This foundational paper introduced the concept of Nash equilibrium and proved that every finite game (finitely many players, each with finitely many pure strategies) has at least one equilibrium in mixed strategies.
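Matching pennies shows why mixed strategies are needed: it has no pure-strategy equilibrium, yet uniform mixing by both players is an equilibrium. A quick numerical check of the indifference property (standard payoffs):

```python
import numpy as np

# Matching pennies (zero-sum): row player wins +1 on a match.
u1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
u2 = -u1

x = np.array([0.5, 0.5])   # row player's mixture
y = np.array([0.5, 0.5])   # column player's mixture

# At a mixed equilibrium, every pure strategy in the support earns the
# same expected payoff against the opponent's mixture.
print(u1 @ y)   # [0. 0.]: the row player is indifferent between rows
print(x @ u2)   # [0. 0.]: the column player is indifferent between columns
# No pure profile is stable, but neither player can gain by deviating
# from (x, y), so it is the Nash equilibrium.
```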
Correlated Equilibrium
- Aumann, R. J. (1974). Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1(1), 67-96.
This paper introduced the concept of correlated equilibrium, where players can coordinate their strategies based on shared random signals, generalizing Nash equilibrium.
- Aumann, R. J. (1987). Correlated equilibrium as an expression of Bayesian rationality. Econometrica, 55(1), 1-18.
This paper formalizes correlated equilibrium in the context of Bayesian rationality, showing how players' beliefs and strategies can be aligned through common signals.
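A sketch of the obedience constraints in Aumann's classic Chicken example (his payoffs; the mediator distribution puts probability 1/3 on each profile other than mutual daring):

```python
import numpy as np

# Chicken with Aumann's payoffs; actions: 0 = Dare, 1 = Swerve.
u1 = np.array([[0.0, 7.0],
               [2.0, 6.0]])
u2 = u1.T  # symmetric game

# The mediator draws a profile from mu and privately recommends each
# player their own action: probability 1/3 on each profile except (Dare, Dare).
mu = np.array([[0.0, 1/3],
               [1/3, 1/3]])

def is_correlated_equilibrium(u1, u2, mu, tol=1e-12):
    """Obedience: given any recommendation, following it is a best
    response to the conditional distribution of the opponent's action."""
    for a in range(2):          # player 1 recommended a, deviation b
        for b in range(2):
            if mu[a] @ (u1[b] - u1[a]) > tol:
                return False
    for a in range(2):          # player 2 recommended a, deviation b
        for b in range(2):
            if mu[:, a] @ (u2[:, b] - u2[:, a]) > tol:
                return False
    return True

print(is_correlated_equilibrium(u1, u2, mu))  # True
```

Under this distribution each player expects a payoff of 5, strictly better than the 14/3 each obtains in the symmetric mixed Nash equilibrium, which is the point of Aumann's example.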
Coarse Correlated Equilibrium
- Moulin, H., & Vial, J. P. (1978). Strategically zero-sum games: The class of games whose completely mixed equilibria cannot be improved upon. International Journal of Game Theory, 7, 201-221.
This paper introduced what is now called coarse correlated equilibrium, a generalization of correlated equilibrium in which players must commit to following the correlation device before observing their recommendation, highlighting strategic considerations in games.
Fixed-Point Theorems
- Brouwer, L. E. J. (1911). Über Abbildung von Mannigfaltigkeiten [On mappings of manifolds]. Mathematische Annalen, 71(1), 97-115.
This classic result in topology proves that any continuous function from a nonempty compact convex set to itself has at least one fixed point, forming the mathematical foundation for proving the existence of Nash equilibria.
- Kakutani, S. (1941). A generalization of Brouwer’s fixed point theorem. Duke Mathematical Journal, 8(3), 457-459.
Kakutani's theorem extends Brouwer’s theorem to set-valued (multi-valued) functions and is a crucial tool for proving the existence of Nash equilibria in games with mixed strategies.
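The route from these fixed-point theorems to Nash's existence result, sketched informally: consider the best-response correspondence on the product of the players' mixed-strategy simplices,

$$B(\sigma) = \prod_{i} \arg\max_{\sigma_i' \in \Delta(A_i)} u_i(\sigma_i', \sigma_{-i}).$$

The domain is nonempty, compact, and convex, and $B$ is nonempty- and convex-valued with a closed graph, so Kakutani's theorem yields a fixed point $\sigma^* \in B(\sigma^*)$, which is by definition a Nash equilibrium.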
Multi-Agent Reinforcement Learning
- Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2), 156-172.
- Zhang, K., Yang, Z., & Başar, T. (2021). Multi-agent reinforcement learning: A selective overview of theories and algorithms. In Handbook of Reinforcement Learning and Control (pp. 321-384). Springer, Cham.
- Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., & Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. Advances in Neural Information Processing Systems, 30.
This paper introduced MADDPG, a multi-agent actor-critic algorithm that trains a centralized critic per agent while keeping execution decentralized.
- Iqbal, S., & Sha, F. (2019). Actor-attention-critic for multi-agent reinforcement learning. In Proceedings of the 36th International Conference on Machine Learning (ICML) (pp. 2961-2970).
- Yu, C., Velu, A., Vinitsky, E., Gao, J., Wang, Y., Bayen, A., & Wu, Y. (2022). The surprising effectiveness of PPO in cooperative multi-agent games. Advances in Neural Information Processing Systems, 35, 24611-24624.
- Bansal, T., Pachocki, J., Sidor, S., Sutskever, I., & Mordatch, I. (2018). Emergent complexity via multi-agent competition. In Proceedings of the 6th International Conference on Learning Representations (ICLR). [Code]
- Al-Shedivat, M., Bansal, T., Burda, Y., Sutskever, I., Mordatch, I., & Abbeel, P. (2018). Continuous adaptation via meta-learning in nonstationary and competitive environments. In Proceedings of the 6th International Conference on Learning Representations (ICLR). [Code]
- Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. In Proceedings of the 26th International Conference on Machine Learning (ICML) (pp. 41-48).
- Loftin, R., Saha, A., Devlin, S., & Hofmann, K. (2021). Strategically efficient exploration in competitive multi-agent reinforcement learning. In Uncertainty in Artificial Intelligence (pp. 1587-1596).
- Chen, J., Xu, Z., Li, Y., Yu, C., Song, J., Yang, H., Fang, F., Wang, Y., & Wu, Y. (2024). Accelerate multi-agent reinforcement learning in zero-sum games with subgame curriculum learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11320-11328.
- Fudenberg, D., & Levine, D. K. (1998). The theory of learning in games (Vol. 2). MIT Press, Cambridge, MA.
- Hofbauer, J., & Sandholm, W. H. (2002). On the global convergence of stochastic fictitious play. Econometrica, 70(6), 2265-2294.
- Hu, J., & Wellman, M. P. (2003). Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research, 4, 1039-1069.
- Daskalakis, C., Foster, D. J., & Golowich, N. (2020). Independent policy gradient methods for competitive reinforcement learning. Advances in Neural Information Processing Systems, 33, 5527-5540.
- Leslie, D. S., Perkins, S., & Xu, Z. (2020). Best-response dynamics in zero-sum stochastic games. Journal of Economic Theory, 189, 105095.
- Sayin, M. O., Parise, F., & Ozdaglar, A. (2022). Fictitious play in zero-sum stochastic games. SIAM Journal on Control and Optimization, 60(4), 2095-2114.
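A minimal sketch of discrete-time fictitious play in matching pennies (standard payoffs; initialization and tie-breaking choices are arbitrary), where the empirical action frequencies converge to the mixed equilibrium, the zero-sum setting analyzed in the references above:

```python
import numpy as np

# Matching pennies, zero-sum: A is the row player's payoff matrix.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

counts1 = np.ones(2)   # row player's empirical action counts
counts2 = np.ones(2)   # column player's empirical action counts

for t in range(100_000):
    # Each player best-responds to the opponent's empirical mixture.
    a1 = int((A @ (counts2 / counts2.sum())).argmax())
    a2 = int(((counts1 / counts1.sum()) @ (-A)).argmax())
    counts1[a1] += 1
    counts2[a2] += 1

print(counts1 / counts1.sum(), counts2 / counts2.sum())
# Both empirical frequencies approach the mixed equilibrium (0.5, 0.5),
# even though the actual play keeps cycling.
```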
Several GitHub repositories curate literature resources focused purely on recent advances in MARL research.