FLAC
A method for fairness-aware representation learning by suppressing attribute-class associations.

FLAC is a methodology that minimizes the mutual information between the features extracted by a model and a protected attribute, without using attribute labels. To achieve this, FLAC proposes a sampling strategy that highlights underrepresented samples in the dataset, and casts the problem of learning fair representations as a probability matching problem that leverages representations extracted by a bias-capturing classifier.
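
As a rough illustration of the probability matching idea, the sketch below (in PyTorch) converts pairwise cosine similarities from the main model and the bias-capturing classifier into probability distributions, and penalizes a symmetric KL divergence between the main model's distribution and the negated bias distribution. The function names, the temperature `t`, and the exact divergence are assumptions for illustration, not the actual implementation.

```python
import torch
import torch.nn.functional as F

def _masked_softmax(sim: torch.Tensor) -> torch.Tensor:
    """Row-wise softmax over a pairwise similarity matrix with the
    diagonal (self-similarity) masked out."""
    eye = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    return F.softmax(sim.masked_fill(eye, float("-inf")), dim=1)

def probability_matching_loss(main_feats: torch.Tensor,
                              bias_feats: torch.Tensor,
                              t: float = 0.5) -> torch.Tensor:
    """Symmetric (Jeffreys-style) KL divergence between the similarity
    distribution of the main model and the *negated* similarity
    distribution of the bias-capturing classifier: pairs the bias model
    finds similar should look dissimilar to the main model, and vice versa.
    This is an illustrative formulation, not the paper's exact loss."""
    zm = F.normalize(main_feats, dim=1)
    zb = F.normalize(bias_feats, dim=1)
    p_main = _masked_softmax(zm @ zm.T / t)
    p_bias = _masked_softmax(-(zb @ zb.T) / t)  # negated bias similarities
    eps = 1e-8
    kl = lambda p, q: (p * ((p + eps).log() - (q + eps).log())).sum(dim=1)
    return (kl(p_main, p_bias) + kl(p_bias, p_main)).mean()
```

In training, a term like this would be added to the task loss, restricted to the bias-prone pairs described below.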
FLAC is a bias-label-unaware approach that leverages the representations of a bias-capturing classifier to force a potentially biased model to learn fairer representations. In particular, we reduce the mutual information minimization problem to a simpler probability matching problem between the similarity distributions of the main model and the bias-capturing classifier. This turns out to be an effective means of disassociating the target representations from the bias-capturing model and, as a result, from the protected attributes. To this end, FLAC leverages the pairs of samples on which a typical task-specific loss is prone to bias, namely samples sharing either only targets or only protected attributes. Previous bias-label-unaware methods ignore the importance of such a selection process, which limits their bias mitigation effectiveness. Furthermore, it is theoretically justified that FLAC can minimize the mutual information between the representation of the main model and the protected attribute.
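
To make the pair selection concrete, here is a minimal sketch of how the two kinds of bias-prone pairs could be picked out of a batch. Since FLAC does not use attribute labels, the predictions of the bias-capturing classifier stand in for the protected attribute; the function and argument names below are hypothetical.

```python
import torch

def select_bias_prone_pairs(targets: torch.Tensor, attr_preds: torch.Tensor):
    """Boolean (B, B) masks for the two pair types a task-specific loss
    tends to get wrong:
      * same target, different (predicted) protected attribute
      * different target, same (predicted) protected attribute
    `attr_preds` are predictions of the bias-capturing classifier, since
    ground-truth attribute labels are assumed unavailable."""
    same_y = targets.unsqueeze(0) == targets.unsqueeze(1)
    same_a = attr_preds.unsqueeze(0) == attr_preds.unsqueeze(1)
    not_self = ~torch.eye(len(targets), dtype=torch.bool,
                          device=targets.device)
    same_target_only = same_y & ~same_a & not_self
    same_attr_only = ~same_y & same_a & not_self
    return same_target_only, same_attr_only
```

The resulting masks can then restrict the probability matching loss above to exactly the pairs where spurious attribute-class associations are strongest.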