Landing : Athabasca University

OpenAI Hide and Seek


Started by Gaganpreet Jhajj, December 7, 2023 - 9:23pm


I wanted to share a paper published by OpenAI a few years ago that is highly relevant to multiagent systems (MAS).


Researchers at OpenAI trained multiple agents to play hide and seek in a controlled environment [1]. With a team of hiders competing against a team of seekers, the researchers observed that reinforcement learning led to new emergent behaviours [2]. The agents changed their strategies as tools such as boxes, ramps and other objects came into play, each opening up a new potential strategy for either the hiders or the seekers. Overall, the team observed six distinct emergent behaviours: 1) running and chasing, 2) fort building, 3) ramp use, 4) ramp defence, 5) box surfing, and 6) surf defence [3]. Two Minute Papers has a great video on this that is worth watching for a better visual sense of these behaviours [4].


The study also suggests that the agents develop more complex, almost human-like strategies through the self-supervised autocurriculum created by competition than they do through intrinsic motivation alone. Intrinsic motivation drives exploration toward new or unfamiliar states, but as the complexity of the environment scales, it results in less directed, and thus less meaningful, behaviour. The authors argue that multiagent competition could be more effective at producing advanced behaviours in increasingly complex settings.
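
To make the point about intrinsic motivation concrete, here is a minimal sketch of one common form of it, a count-based novelty bonus. It is not taken from the paper's code: the class name `CountBonus`, the `beta` scale and the toy grid states are all assumptions made for illustration.

```python
import math
from collections import Counter


class CountBonus:
    """Hypothetical count-based novelty bonus: beta / sqrt(N(s))."""

    def __init__(self, beta=1.0):
        self.beta = beta          # assumed scale of the bonus
        self.counts = Counter()   # visit count per (hashable) state

    def reward(self, state):
        # Record the visit, then pay a bonus that shrinks with familiarity.
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])


# Toy trajectory over made-up grid positions (x, y).
bonus = CountBonus(beta=1.0)
trajectory = [(0, 0), (0, 1), (0, 0), (0, 0), (1, 1)]
rewards = [bonus.reward(s) for s in trajectory]

for state, r in zip(trajectory, rewards):
    print(state, round(r, 3))
```

The bonus depends only on how often a state has been visited, not on whether visiting it is useful. That is one way to read the "less directed" behaviour described above: in a large environment there are many novel states that lead nowhere interesting, whereas an opponent who keeps adapting supplies pressure that is tied to the task.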



[1] Multi-Agent Hide and Seek. Accessed: Dec. 07, 2023. [Online Video]. Available:

[2] “Emergent tool use from multi-agent interaction.” Accessed: Dec. 07, 2023. [Online]. Available:

[3] B. Baker et al., “Emergent Tool Use From Multi-Agent Autocurricula,” 2019, doi: 10.48550/ARXIV.1909.07528.

[4] OpenAI Plays Hide and Seek…and Breaks The Game! Accessed: Dec. 07, 2023. [Online Video]. Available:



  • Oscar Lin December 10, 2023 - 10:23am

    Thanks Gagan.

    These are really interesting and point to the future direction of research and development!

    The best way to learn is learning from the best. 

    Dr Lin

COMP667 Multiagent Systems

This group was created to complement the COMP667 course in Moodle by providing additional course material and facilitating discussions.