This project concerns the online behavioral adaptation of robots during interactive learning with humans.
During human-robot interaction in a cooperative context, both the human and the robot may adapt and learn in response to their partner's behavior. The goal is to enable robots to adapt online to such dynamic, interactive situations. In particular, robots shall adapt to each human subject's specific way of giving feedback during the interaction. Feedback here includes reward, instruction, and demonstration, and can be grouped under the term "teaching signals".
For example, when exchanging objects reachable by only one partner (either the human or the robot) in order to place them in various boxes, some human subjects prefer a proactive robot while others prefer the robot to wait for their instructions; some humans only tell the robot when it performs a wrong action, remaining silent after correct ones, while other humans tend to reward each correct action.
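One concrete way to capture such per-subject feedback styles is to estimate, for each human teacher, how likely they are to give feedback after correct versus incorrect robot actions, so that silence itself becomes informative (e.g. implicit approval for teachers who only signal mistakes). The following is a minimal illustrative sketch of this idea, not the project's actual model; the class name `FeedbackProfile` and the Laplace-smoothed counting scheme are assumptions made here for illustration.

```python
class FeedbackProfile:
    """Illustrative per-teacher model of feedback strategy.

    Tracks how often a given human gives explicit feedback after
    correct vs. incorrect robot actions, so silence can later be
    interpreted probabilistically. Counts start at 1 (Laplace
    smoothing) so early estimates stay well-defined.
    """

    def __init__(self):
        # keys: True = robot action was correct, False = incorrect
        self.spoke = {True: 1, False: 1}    # teacher gave feedback
        self.silent = {True: 1, False: 1}   # teacher stayed silent

    def observe(self, action_correct, gave_feedback):
        """Record one interaction step (during a calibration phase
        where action correctness is known, e.g. from the task)."""
        table = self.spoke if gave_feedback else self.silent
        table[action_correct] += 1

    def p_correct_given_silence(self):
        """P(action was correct | teacher silent), via Bayes' rule
        with a uniform prior on correctness."""
        p_sil_c = self.silent[True] / (self.silent[True] + self.spoke[True])
        p_sil_w = self.silent[False] / (self.silent[False] + self.spoke[False])
        return p_sil_c / (p_sil_c + p_sil_w)
```

For a teacher who only signals mistakes, `p_correct_given_silence()` rises well above 0.5, letting the robot treat silence as implicit positive feedback; for a teacher who rewards every correct action, silence instead suggests an error.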
The proposed research strategy consists in endowing robots with a cognitive architecture composed of an ensemble of machine learning methods, understood as tools that the robot can autonomously select when it judges them appropriate during different phases of the human-robot interaction. In particular, we propose to combine model-based and model-free reinforcement learning, coordinated by a meta-controller that monitors their respective performance and arbitrates between them. Additionally, the architecture shall enable the robot to learn models of the various human feedback strategies and use them to tune reinforcement learning online, so that the robot can quickly adapt its behavioral policy.
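The arbitration idea can be sketched in a few lines: a meta-controller delegates action selection to whichever learner has recently predicted rewards most accurately. This is only a toy illustration of the principle under simplifying assumptions (tabular learners, a scalar reward-prediction-error reliability signal with exponential forgetting); the names `MetaController` and `QExpert`, and all parameter values, are invented here for the example and do not come from the project description.

```python
import random


class QExpert:
    """Minimal tabular learner standing in for one expert
    (e.g. a model-based or model-free RL module)."""

    def __init__(self, n_actions, alpha=0.2, epsilon=0.1):
        self.q = {}                 # (state, action) -> value estimate
        self.n_actions = n_actions
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration rate

    def predict_reward(self, state, action):
        return self.q.get((state, action), 0.0)

    def act(self, state):
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions),
                   key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward):
        key = (state, action)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.alpha * (reward - old)


class MetaController:
    """Arbitrates between experts by tracking a negated exponential
    moving average of each expert's absolute reward prediction error
    and delegating to the currently most reliable one."""

    def __init__(self, experts, tau=0.1):
        self.experts = experts                          # name -> expert
        self.reliability = {k: 0.0 for k in experts}    # higher is better
        self.tau = tau                                  # forgetting rate

    def select_expert(self):
        return max(self.reliability, key=self.reliability.get)

    def act(self, state):
        name = self.select_expert()
        return name, self.experts[name].act(state)

    def update(self, state, action, reward):
        # All experts observe the outcome; reliabilities track how
        # well each one predicted the reward actually received.
        for name, expert in self.experts.items():
            error = abs(reward - expert.predict_reward(state, action))
            self.reliability[name] = ((1 - self.tau) * self.reliability[name]
                                      - self.tau * error)
            expert.update(state, action, reward)
```

In a fuller architecture, the two experts would differ in kind (a planning module versus a habitual cached-value module) rather than merely in learning rate, and the reliability signal could also incorporate computation cost, so that the meta-controller trades off accuracy against deliberation time.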