Pluralistic Recommendation in News

Perform unbiased news recommendation in the European domain. To achieve this, build a political leaning classifier on an EU-News dataset.

Micro-project in Humane-AI-Net

Network of citations between different political leanings in EU news

Short description

The tangible objective of this micro-project was to develop two datasets of European news labelled with political leaning. These datasets were needed for the next step of the project: building a bias-minimizing recommender system for European news.

The first dataset comprises millions of European news articles, enriched with metadata from Eurotopics.net. Each entry contains the main text, title, publication date, language, and news source, together with news source metadata: the political leaning of the source and its country.
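To make the shape of a dataset entry concrete, the fields listed above can be sketched as a pair of Python dataclasses. This is only an illustration: the class names, field names, and example values are assumptions and are not taken from the released dataset, whose actual column names may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NewsSource:
    """Source-level metadata (hypothetical field names)."""
    name: str
    country: str
    political_leaning: str


@dataclass(frozen=True)
class Article:
    """One dataset entry (hypothetical field names)."""
    title: str
    maintext: str
    publication_date: date
    language: str
    source: NewsSource


# Made-up example entry, for illustration only.
example = Article(
    title="Example headline",
    maintext="Example body text.",
    publication_date=date(2021, 1, 1),
    language="en",
    source=NewsSource(
        name="Example Daily",
        country="Italy",
        political_leaning="centre",
    ),
)
```

Keeping the source metadata in its own record reflects the fact that the political leaning and the country describe the news source, not the individual article.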

We then built an article bias classifier, attempting to predict the political label of individual articles from the labels obtained through distant supervision. Applying explainable AI techniques to the classifier, we concluded that it was effectively predicting the news source rather than the political leaning.
To overcome this issue, we built a second dataset with the same features as the first one, plus a topic for each article, chosen among 7 macro-topics.
The immediate plan is to perform political-bias classification on the new dataset after filtering out all articles that do not carry political bias, such as those dealing with sports or gossip.
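The labelling and filtering steps described above can be sketched as follows. This is a minimal sketch, not the project's code: it assumes that distant supervision here means each article inherits the leaning of its news source, and the source names, topic names, and dictionary keys are all hypothetical.

```python
from typing import Optional

# Hypothetical source-level leanings (in the project these come from
# the news source metadata).
SOURCE_LEANING = {"Example Daily": "left", "Sample Post": "right"}

# Hypothetical topic names; the text only mentions sports and gossip
# as examples of topics without political bias.
NON_POLITICAL_TOPICS = {"sports", "gossip"}


def distant_label(article: dict) -> Optional[str]:
    """Return the leaning of the article's source, or None if unknown."""
    return SOURCE_LEANING.get(article["source"])


def build_training_set(articles: list) -> list:
    """Drop non-political articles, then label the rest by source."""
    labelled = []
    for article in articles:
        if article["topic"] in NON_POLITICAL_TOPICS:
            continue  # no political bias expected for these topics
        label = distant_label(article)
        if label is None:
            continue  # source has no known leaning
        labelled.append((article["maintext"], label))
    return labelled


articles = [
    {"source": "Example Daily", "topic": "politics",
     "maintext": "Parliament votes on budget."},
    {"source": "Example Daily", "topic": "sports",
     "maintext": "Local team wins the cup."},
    {"source": "Unknown Gazette", "topic": "politics",
     "maintext": "Minister resigns."},
    {"source": "Sample Post", "topic": "politics",
     "maintext": "New tax plan announced."},
]
training_set = build_training_set(articles)
```

Because every article from a given source receives the same label under this scheme, a classifier can score well simply by recognizing the source, which is consistent with the issue reported above.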

Additional information

This micro-project was jointly carried out by Lorenzo Bellomo, Virginia Morini, Paolo Ferragina, Giulio Rossetti, and Dino Pedreschi. The involved institutions are University of Pisa (PF, DP, VM), Scuola Normale Superiore (LB), and ISTI-CNR (GR).

The project is part of the Humane-AI-Net network of excellent research centers in AI. It contributes to this network in the following aspects:

Task 4.3. "Societal impact of AI-STS"

Tangible outcomes:

A dataset that can be used to analyze biased behaviour in European media, providing the first stepping stones toward an unbiased recommender system.
Report of the project. The report also contains the links to the two datasets. Before accessing them, you have to fill in the request field, as we have not yet obtained clearance to publish some metadata.
Code: https://github.com/LorenzoBellomo/EU-NewsDataset, https://github.com/LorenzoBellomo/BiasClassification