Research

Proximal Policy Optimization (PPO) https://arxiv.org/abs/1707.06347

Multi-Agent Deep Deterministic Policy Gradient (MADDPG) https://github.com/openai/maddpg

A Survey of Monte Carlo Tree Search Methods https://gnunet.org/sites/default/files/Browne%20et%20al%20-%20A%20survey%20of%20MCTS%20methods.pdf

Monte Carlo Tree Search and Reinforcement Learning https://www.jair.org/media/5507/live-5507-10333-jair.pdf

Cooperative Multi-Agent Learning https://link.springer.com/article/10.1007/s10458-005-2631-2

Opponent Modeling in Deep Reinforcement Learning http://www.umiacs.umd.edu/~hal/docs/daume16opponent.pdf

Machine Theory of Mind https://arxiv.org/pdf/1802.07740.pdf

Coordinated Multi-Agent Imitation Learning https://arxiv.org/pdf/1703.03121.pdf

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games https://arxiv.org/pdf/1603.01121.pdf and http://proceedings.mlr.press/v37/heinrich15.pdf

Autonomous Agents Modelling Other Agents http://www.cs.utexas.edu/~pstone/Papers/bib2html-links/AIJ18-Albrecht.pdf