
[TMP-055] Gesture-based Interactive Grounding for Mixed-Reality Human-AI Collaboration
A gesture-based augmented-reality game for studying human-AI collaboration.

The project addresses research on interactive grounding. It consists of the development of an Augmented Reality (AR) game for HoloLens in which a human player interacts with an AI virtual agent in a mixed-reality setting, using gestures as the main communicative act. The game provides collaborative tasks whose completion requires coordination, that is, a mutual understanding of the several elements of the task at hand. The players (human and AI) hold different information about each task and must communicate it to their partner through gestures in order to advance in the game. The game was developed as a tool that engages humans in interaction with AI agents, supporting research on human-centred collaborative AI.
In the game, players face a sequence of codebreaking challenges that require pressing buttons in a specific order; however, only one of the partners has access to the buttons, while the other has access to the solution code. The core gameplay is therefore centred on communication between the two partners (the human and the AI virtual agent), which must be performed exclusively through gestures.
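A minimal sketch of this asymmetric-information setup, assuming a simple challenge object (the class and method names are illustrative and not taken from the project code):

```python
from dataclasses import dataclass, field

@dataclass
class CodebreakingChallenge:
    """One round: a secret button sequence that only one partner can see."""
    solution_code: list[str]                       # e.g. ["red", "blue", "green"]
    pressed: list[str] = field(default_factory=list)

    def press(self, button: str) -> bool:
        """The partner with button access presses a button; True while the sequence is still correct."""
        self.pressed.append(button)
        return self.solution_code[:len(self.pressed)] == self.pressed

    def solved(self) -> bool:
        return self.pressed == self.solution_code

# The code-holder sees only `solution_code`; the button-presser can only call `press`,
# so the correct order must first be grounded through gestures.
challenge = CodebreakingChallenge(["red", "blue", "green"])
```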
In addition to the AR game itself, we developed sample AI agents that are able to play with a human player.
Gestures supported in the game are split into two distinct subtypes (see the sketch after this list):
- Taskwork gestures: Used for conveying information about the game’s tasks and environment (e.g., an object’s colour).
- Teamwork gestures: Used for giving feedback regarding communication (e.g., affirming that a gesture was understood).
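As an illustration, the two gesture subtypes could be represented along these lines; the type and gesture names here are hypothetical and not taken from the repositories:

```python
from enum import Enum
from dataclasses import dataclass

class GestureType(Enum):
    TASKWORK = "taskwork"   # conveys task/environment information (e.g., an object's colour)
    TEAMWORK = "teamwork"   # gives feedback about the communication itself (e.g., "understood")

@dataclass(frozen=True)
class Gesture:
    name: str               # e.g. "point_at_red_button"
    type: GestureType
    meaning: str            # the information or feedback the gesture conveys

NOD = Gesture("nod", GestureType.TEAMWORK, "previous gesture was understood")
POINT_RED = Gesture("point_red", GestureType.TASKWORK, "the referenced object is red")
```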
The gameplay loop therefore requires the partners to coordinate their performance and communicate throughout each challenge.
In the current version, the virtual agent plays reactively in response to the player's gestures, based on a gesture knowledge base that assigns a meaning and an action to each gesture. A version using an LLM was also developed to provide some reasoning behind gesture recognition and gesture performance by the AI virtual agent.
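A minimal sketch of such a reactive agent, assuming a dictionary-based gesture knowledge base (all names, including `GESTURE_KB` and `react`, are illustrative; the actual implementation in the repositories may differ):

```python
from typing import Callable, NamedTuple

class GestureEntry(NamedTuple):
    meaning: str                        # what the observed gesture is taken to mean
    action: Callable[[], str]           # gesture the agent performs in response

# Gesture knowledge base: maps a recognised gesture to its meaning and a reactive action.
GESTURE_KB: dict[str, GestureEntry] = {
    "point_red": GestureEntry("partner indicates the colour red", lambda: "press_red_button"),
    "thumbs_up": GestureEntry("partner confirms understanding",   lambda: "nod"),
    "wave":      GestureEntry("partner requests attention",       lambda: "look_at_partner"),
}

def react(observed_gesture: str) -> str:
    """Reactive policy: look the gesture up in the knowledge base and perform its action."""
    entry = GESTURE_KB.get(observed_gesture)
    if entry is None:
        return "shrug"                  # teamwork gesture signalling the input was not understood
    return entry.action()

print(react("point_red"))               # -> "press_red_button"
```

The LLM-based variant would replace the fixed lookup with a model that reasons about the observed gesture and the game state before choosing a response.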
Partners
Department of Computer Science, Instituto Superior Técnico
Department of Artificial Intelligence, Eötvös Loránd University
DFKI Lower Saxony, Interactive Machine Learning Lab (external)
Robotics Institute, Carnegie Mellon University (external)
Project code
The source code is available at: https://github.com/badomate/EscapeHololens and https://github.com/badomate/EscapeMain