[TMP-069] Learning from Richer Feedback Through the Integration of Prior Beliefs
Enhancing interactive machine learning by incorporating richer feedback to improve learning, and by selecting informed queries to reduce expert burden
Interactive Machine Learning (IML) has gained attention for enabling agents to learn from human feedback, but many existing solutions rely on sparse feedback, which places a heavy burden on the expert. This project addresses that limitation in two ways: by allowing the learner to exploit richer feedback, which accelerates learning, and by incorporating a model of the expert to select more informative queries, which reduces the expert's burden.
We have three objectives:
- Develop methods, grounded in psychology research, for incorporating causal and contrastive feedback into IML.
- Design a belief-based system for the learner to maintain beliefs about expert objectives, guiding query selection.
- Use feedback to generate a posterior that informs future queries, improving learning in the context of Inverse Reinforcement Learning (IRL); a minimal sketch of this query loop follows this list.
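To make the third objective concrete, the sketch below shows one common form of such a loop, as used in preference-based Bayesian IRL. It is illustrative only: it assumes a small discrete hypothesis space of reward-weight vectors, binary trajectory-preference feedback modeled with a Bradley-Terry/Boltzmann likelihood, and queries chosen by expected information gain. The candidate set, query pool, and all function names are invented for the example and are not taken from the project's actual method.

```python
import numpy as np

# Hypothetical discrete hypothesis space: each row is a candidate
# reward-weight vector over three task features.
CANDIDATES = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
])

def preference_likelihood(phi_a, phi_b, w, beta=2.0):
    """P(expert prefers trajectory a over b | reward weights w),
    under a Bradley-Terry / Boltzmann model on feature counts."""
    ra, rb = w @ phi_a, w @ phi_b
    return 1.0 / (1.0 + np.exp(-beta * (ra - rb)))

def update_posterior(prior, phi_a, phi_b, a_preferred):
    """Bayesian update of the belief over candidate rewards after
    one binary preference answer from the expert."""
    lik = np.array([preference_likelihood(phi_a, phi_b, w) for w in CANDIDATES])
    if not a_preferred:
        lik = 1.0 - lik
    post = prior * lik
    return post / post.sum()

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def expected_info_gain(prior, phi_a, phi_b):
    """Expected entropy reduction from asking the expert to compare
    trajectories a and b; used to pick the most informative query."""
    p_a = np.array([preference_likelihood(phi_a, phi_b, w) for w in CANDIDATES])
    p_answer_a = (prior * p_a).sum()
    post_a = update_posterior(prior, phi_a, phi_b, True)
    post_b = update_posterior(prior, phi_a, phi_b, False)
    return entropy(prior) - (p_answer_a * entropy(post_a)
                             + (1 - p_answer_a) * entropy(post_b))

# Toy query pool: pairs of trajectory feature counts.
queries = [(np.array([1.0, 0.2, 0.0]), np.array([0.1, 1.0, 0.0])),
           (np.array([0.3, 0.3, 1.0]), np.array([1.0, 0.0, 0.1]))]

belief = np.full(len(CANDIDATES), 1.0 / len(CANDIDATES))  # uniform prior
best = max(queries, key=lambda q: expected_info_gain(belief, *q))
belief = update_posterior(belief, *best, a_preferred=True)  # simulated answer
print("posterior over candidate rewards:", np.round(belief, 3))
```

Richer feedback types, such as causal or contrastive statements, would enter this loop through a different likelihood term, while the posterior-driven query selection stays the same.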
The project addresses key aspects of the Collaboration with AI Systems work package (W1-2), particularly AI systems that can communicate and understand descriptions of situations, goals, intentions, or operational plans in order to establish the shared understanding needed for collaboration. By explicitly maintaining beliefs about the expert's objectives and integrating causal and contrastive feedback, the system aims to establish common ground with its human partner. The project also aligns with the goal of creating AI systems that can explain their internal models by justifying statements and answering questions: the feedback the learner receives can be used to provide explanations, verify facts, and improve the shared understanding between the AI and the human expert. This promotes two-way interaction, enabling the construction and adaptation of shared representations. Through user-study evaluations and methods that leverage prior knowledge of the expert, the project seeks to make measurable progress toward collaborative AI.
This project resulted in an exchange period during which our collaborator spent a month at our lab. This allowed us to conceptualize and write a paper that we plan to submit to the IJCAI conference in December 2024. The paper addresses the challenge of learning from individuals who hold a different model of the task: we focus on identifying human bottleneck states, determining the maximal achievable set of these states given the robot's model of the task, and querying the expert about bottlenecks that cannot be achieved due to the constraints of the robot's model.
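The following sketch illustrates this partitioning step; it is not the paper's actual algorithm. It assumes the robot's model is a simple directed state graph and that the human's bottlenecks are given as a set of states; the graph, the state names, and the helper functions are all hypothetical.

```python
from collections import deque

# Hypothetical models: a directed state graph {state: [successors]}.
# The human expects every plan to pass through the bottleneck states.
ROBOT_MODEL = {"s0": ["s1"], "s1": ["s3"], "s3": ["goal"], "s2": [], "goal": []}
HUMAN_BOTTLENECKS = {"s1", "s2", "s3"}

def reachable(model, start):
    """States reachable from `start` under the given transition model (BFS)."""
    seen, frontier = {start}, deque([start])
    while frontier:
        s = frontier.popleft()
        for nxt in model.get(s, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def split_bottlenecks(model, start, goal, bottlenecks):
    """Partition the human's bottlenecks into those the robot can achieve
    on some path to the goal, and those that become query candidates."""
    from_start = reachable(model, start)
    achievable = {b for b in bottlenecks
                  if b in from_start and goal in reachable(model, b)}
    return achievable, bottlenecks - achievable

achievable, to_query = split_bottlenecks(ROBOT_MODEL, "s0", "goal", HUMAN_BOTTLENECKS)
print("achievable under robot model:", achievable)   # {'s1', 's3'}
print("query the expert about:", to_query)           # {'s2'}
```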
In addition, we have begun working on a survey paper on human modeling in sequential decision-making; this effort has already produced a workshop paper that we are currently extending for journal publication.
Tangible Outcomes
- [arXiv] Human-Modeling in Sequential Decision-Making: An Analysis through the Lens of Human-Aware AI. Silvia Tulli, Stylianos Loukas Vasileiou, Sarath Sreedharan. https://arxiv.org/abs/2405.07773
Partner
- ISIR, Sorbonne University, Silvia Tulli