[TMP-009] A tool for mitigating algorithmic biases through explanations
Our project integrates fair and explainable AI, developing a tool to reduce algorithmic bias through user feedback and iterative refinement.
Our project addresses fair Artificial Intelligence (AI), focusing on how decision-making algorithms in high-stakes domains such as hiring or lending can reinforce discriminatory patterns, disproportionately harming specific demographic groups. Initial efforts to tackle bias in AI relied on mathematical fairness definitions and algorithmic optimization, but were criticized for ignoring context and excluding the domain experts who could address such biases effectively.
Recognizing these limitations, policymakers now require human oversight in high-risk AI systems. Our project explores this human control by integrating fair AI with explainable AI (xAI), which seeks to clarify the decision-making of opaque algorithms. We developed a tool that explains algorithmic decisions, enabling users to provide fairness-related feedback and choose strategies to reduce bias. Users receive immediate feedback on their chosen strategies, fostering an iterative refinement process.
Since Human-AI collaboration in bias mitigation is underexplored, we adopted an exploratory approach, conducting a think-aloud study where potential users tested mitigation strategies. We analyzed their feedback to identify the tool’s strengths, weaknesses, and users’ mental models. Additionally, we compared algorithmic biases before and after intervention to evaluate the success of bias mitigation. This study aims to advance effective Human-AI collaboration in addressing algorithmic fairness.
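To make this evaluation step concrete, the sketch below shows one simple way a before/after bias comparison can be quantified, using the gap in positive decision rates between two demographic groups (one of the disparities our method targets). The data, metric choice, and function name are illustrative assumptions, not the exact measures used in the study.

```python
# Illustrative sketch of a before/after bias comparison; the metric
# (positive-decision-rate gap) and the toy data are assumptions.
import numpy as np

def positive_rate_gap(y_pred, group):
    """Absolute difference in positive-decision rates between the two
    demographic groups defined by the boolean mask `group`."""
    return abs(y_pred[group].mean() - y_pred[~group].mean())

# Hypothetical model decisions before and after a user's mitigation step.
y_before = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y_after  = np.array([1, 0, 1, 0, 1, 1, 1, 0])
group    = np.array([True, True, True, True, False, False, False, False])

print("gap before:", positive_rate_gap(y_before, group))  # 0.5
print("gap after: ", positive_rate_gap(y_after, group))   # 0.25
```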
We developed an algorithm that can reject predictions based on both their uncertainty and their unfairness. By rejecting potentially unfair predictions, our method reduces differences in error rates and positive decision rates across demographic groups on the non-rejected data. Because the unfairness-based rejections rely on an interpretable-by-design method, namely rule-based fairness checks and situation testing, the process is transparent and empowers human decision-makers to review the rejected predictions and make more just decisions for them. This explainability is especially important in light of recent AI regulations, which mandate that high-risk decision tasks be overseen by human experts to reduce discrimination risks. The methodology thus bridges the gap between classifiers with a reject option and interpretable-by-design methods, encouraging human intervention and comprehension. We produced functioning software, which is publicly available (see Tangible Outcomes), and a full publication with experiments on multiple datasets and multiple rejection strategies is in preparation.
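As a rough illustration of the rejection logic described above, the sketch below combines an uncertainty threshold with a simple situation-testing check that compares positive decision rates among an individual's nearest neighbours from the protected and unprotected groups. All names, thresholds, and the particular nearest-neighbour formulation are assumptions made for illustration; they do not reproduce the IFAC implementation.

```python
# Minimal sketch of a reject option that defers both uncertain and potentially
# unfair predictions to a human expert. Names and thresholds are illustrative.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def situation_testing_flags(X, y_pred, protected, k=10, tau=0.3):
    """Flag protected individuals with a negative prediction whose k nearest
    unprotected neighbours receive positive decisions much more often than
    their k nearest protected neighbours."""
    flags = np.zeros(len(X), dtype=bool)
    nn_prot = NearestNeighbors(n_neighbors=k).fit(X[protected])
    nn_unprot = NearestNeighbors(n_neighbors=k).fit(X[~protected])
    y_prot, y_unprot = y_pred[protected], y_pred[~protected]
    for i in np.where(protected)[0]:
        _, idx_p = nn_prot.kneighbors(X[i:i + 1])
        _, idx_u = nn_unprot.kneighbors(X[i:i + 1])
        gap = y_unprot[idx_u].mean() - y_prot[idx_p].mean()
        flags[i] = (y_pred[i] == 0) and (gap > tau)
    return flags


def predict_with_reject(clf, X, protected, p_min=0.75, k=10, tau=0.3):
    """Return class predictions, with None wherever the prediction is either
    too uncertain (low predicted probability) or flagged as potentially
    unfair by situation testing; rejected cases go to a human reviewer."""
    proba = clf.predict_proba(X)
    y_pred = proba.argmax(axis=1)
    uncertain = proba.max(axis=1) < p_min                           # uncertainty-based rejection
    unfair = situation_testing_flags(X, y_pred, protected, k, tau)  # fairness-based rejection
    decisions = y_pred.astype(object)
    decisions[uncertain | unfair] = None
    return decisions
```

In this sketch, clf can be any fitted probabilistic classifier; the rejected cases (returned as None) would be handed to a human expert together with the evidence that triggered the rejection.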
Tangible Outcomes
- The full software: https://github.com/calathea21/IFAC
Partners:
- University of Pisa – Department of CS, Dino Pedreschi (dino.pedreschi@unipi.it)
- University of Antwerp – Department of CS, Daphne Lenders (daphne.lenders@uantwerpen.be)
- Scuola Normale Superiore, Roberto Pellungrini (roberto.pellungrini@sns.it)