[TMP-009] A tool for mitigating algorithmic biases through explanations
Our project integrates fair and explainable AI, developing a tool to reduce algorithmic bias through user feedback and iterative refinement.
Our project addresses fair Artificial Intelligence (AI), focusing on how decision-making algorithms in high-stakes domains such as hiring or lending can reinforce discriminatory patterns, disproportionately harming specific demographic groups. Initial efforts to tackle bias in AI relied on mathematical fairness definitions and algorithmic optimization, but were criticized for ignoring context and excluding the domain experts who could address such biases effectively.
Recognizing these limitations, policymakers now require human oversight in high-risk AI systems. Our project explores this human control by integrating fair AI with explainable AI (xAI), which seeks to clarify the decision-making of opaque algorithms. We developed a tool that explains algorithmic decisions, enabling users to provide fairness-related feedback and choose strategies to reduce bias. Users receive immediate feedback on their chosen strategies, fostering an iterative refinement process.
Since Human-AI collaboration in bias mitigation is underexplored, we adopted an exploratory approach, conducting a think-aloud study where potential users tested mitigation strategies. We analyzed their feedback to identify the tool’s strengths, weaknesses, and users’ mental models. Additionally, we compared algorithmic biases before and after intervention to evaluate the success of bias mitigation. This study aims to advance effective Human-AI collaboration in addressing algorithmic fairness.
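To make this evaluation step concrete, the sketch below shows one simple way a before/after bias comparison can be quantified, using the gap in positive decision rates between two demographic groups (one of the disparities our method targets). The data, metric choice, and function name are illustrative assumptions, not the exact measures used in the study.

```python
# Illustrative sketch of a before/after bias comparison; the metric
# (positive-decision-rate gap) and the toy data are assumptions.
import numpy as np

def positive_rate_gap(y_pred, group):
    """Absolute difference in positive-decision rates between the two
    demographic groups defined by the boolean mask `group`."""
    return abs(y_pred[group].mean() - y_pred[~group].mean())

# Hypothetical model decisions before and after a user's mitigation step.
y_before = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y_after  = np.array([1, 0, 1, 0, 1, 1, 1, 0])
group    = np.array([True, True, True, True, False, False, False, False])

print("gap before:", positive_rate_gap(y_before, group))  # 0.5
print("gap after: ", positive_rate_gap(y_after, group))   # 0.25
```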
We developed an algorithm that can reject predictions based on both their uncertainty and their unfairness. By rejecting potentially unfair predictions, our method reduces differences in error rates and positive decision rates across demographic groups on the non-rejected data. Because the unfairness-based rejections rely on an interpretable-by-design method, namely rule-based fairness checks and situation testing, the process is transparent and empowers human decision-makers to review the rejected predictions and make more just decisions for them. This explainability is especially important in light of recent AI regulations, which mandate that high-risk decision tasks be overseen by human experts to reduce discrimination risks. The methodology thus bridges the gap between classifiers with a reject option and interpretable-by-design methods, encouraging human intervention and comprehension. We produced functioning software, which is publicly available (see Tangible Outcomes), and a full publication with experiments on multiple datasets and multiple rejection strategies is in preparation.
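As a rough illustration of the rejection logic described above, the sketch below combines an uncertainty threshold with a simple situation-testing check that compares positive decision rates among an individual's nearest neighbours from the protected and unprotected groups. All names, thresholds, and the particular nearest-neighbour formulation are assumptions made for illustration; they do not reproduce the IFAC implementation.

```python
# Minimal sketch of a reject option that defers both uncertain and potentially
# unfair predictions to a human expert. Names and thresholds are illustrative.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def situation_testing_flags(X, y_pred, protected, k=10, tau=0.3):
    """Flag protected individuals with a negative prediction whose k nearest
    unprotected neighbours receive positive decisions much more often than
    their k nearest protected neighbours."""
    flags = np.zeros(len(X), dtype=bool)
    nn_prot = NearestNeighbors(n_neighbors=k).fit(X[protected])
    nn_unprot = NearestNeighbors(n_neighbors=k).fit(X[~protected])
    y_prot, y_unprot = y_pred[protected], y_pred[~protected]
    for i in np.where(protected)[0]:
        _, idx_p = nn_prot.kneighbors(X[i:i + 1])
        _, idx_u = nn_unprot.kneighbors(X[i:i + 1])
        gap = y_unprot[idx_u].mean() - y_prot[idx_p].mean()
        flags[i] = (y_pred[i] == 0) and (gap > tau)
    return flags


def predict_with_reject(clf, X, protected, p_min=0.75, k=10, tau=0.3):
    """Return class predictions, with None wherever the prediction is either
    too uncertain (low predicted probability) or flagged as potentially
    unfair by situation testing; rejected cases go to a human reviewer."""
    proba = clf.predict_proba(X)
    y_pred = proba.argmax(axis=1)
    uncertain = proba.max(axis=1) < p_min                           # uncertainty-based rejection
    unfair = situation_testing_flags(X, y_pred, protected, k, tau)  # fairness-based rejection
    decisions = y_pred.astype(object)
    decisions[uncertain | unfair] = None
    return decisions
```

In this sketch, clf can be any fitted probabilistic classifier; the rejected cases (returned as None) would be handed to a human expert together with the evidence that triggered the rejection.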
Tangible Outcomes
- The full software: https://github.com/calathea21/IFAC
Partners:
- University of Pisa – Department of CS, Dino Pedreschi (dino.pedreschi@unipi.it)
- University of Antwerp – Department of CS, Daphne Lenders (daphne.lenders@uantwerpen.be)
- Scuola Normale Superiore, Roberto Pellungrini (roberto.pellungrini@sns.it)