Learning to quantify: LeQua 2024 dataset

Datasets of the LeQua 2024 Learning to Quantify Data Challenge

Dataset

Developed by

License

Creative Commons Attribution 4.0 International License

Main Characteristic

The aim of LeQua 2024 (the 2nd data challenge on Learning to Quantify) is to allow the comparative evaluation of methods for “learning to quantify” in textual datasets, i.e., methods for training predictors of the relative frequencies of the classes of interest in sets of unlabelled textual documents. These predictors (called “quantifiers”) are required to issue predictions for several such sets, some of which have class frequencies radically different from those of the training set.
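To make the notion of a quantifier concrete, the sketch below shows the simplest baseline, often called "classify and count": classify every document in a set and report the fraction assigned to each class. It is an illustration only, not a method prescribed by the challenge; the function names, the toy classifier, and the sample documents are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Iterable, Sequence


def classify_and_count(
    classify: Callable[[str], Hashable],
    documents: Iterable[str],
    classes: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Estimate class prevalences by classifying each document and counting."""
    counts = Counter(classify(doc) for doc in documents)
    total = sum(counts[c] for c in classes)
    if total == 0:
        raise ValueError("no documents were assigned to any of the given classes")
    # Relative frequency of each class in the set, not a per-document label.
    return {c: counts[c] / total for c in classes}


def toy_classifier(doc: str) -> str:
    # Hypothetical stand-in for a trained text classifier.
    return "positive" if "good" in doc.lower() else "negative"


# Hypothetical unlabelled sample, for illustration only.
sample = ["Good value", "Broke after a week", "Really good", "Not for me"]
prevalences = classify_and_count(toy_classifier, sample, ["negative", "positive"])
```

The output is one distribution per set of documents rather than one label per document. Classify and count is known to be biased when the class frequencies of the set differ from those seen in training, which is the situation the challenge is designed to test.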

Tasks

Task T1: This task is concerned with evaluating binary quantifiers, i.e., quantifiers that must only predict the relative frequencies of a class and its complement; the data used are affected by prior probability shift (a.k.a. “label shift”). This task is akin to Task T1A of LeQua 2022.
Task T2: This task is concerned with evaluating single-label multi-class quantifiers, i.e., quantifiers that operate on datapoints each belonging to exactly one among a set of n>2 classes; here too, the data used are affected by prior probability shift. This task is akin to Task T1B of LeQua 2022.
Task T3: This task is concerned with evaluating ordinal quantifiers, i.e., quantifiers that operate on a set of n>2 totally ordered classes; here too, the data used are affected by prior probability shift. This task is new to LeQua 2024.
Task T4: Like Task T1, this task is concerned with evaluating binary quantifiers; unlike in Task T1, the data used are affected by covariate shift. This task is new to LeQua 2024.
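Under prior probability shift, the class frequencies change between training and test data while the documents within each class are drawn from the same distribution. One way to picture this for the binary case is to draw a sample with a prescribed positive prevalence from a labelled pool, as in the minimal sketch below. This is an assumption made for illustration: the page does not describe how the challenge samples were generated, and the function name, pool, and parameters are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def sample_with_prevalence(
    pool: Sequence[Tuple[str, int]],
    positive_prevalence: float,
    size: int,
    rng: random.Random,
) -> List[Tuple[str, int]]:
    """Draw `size` labelled items so that a fixed fraction has label 1."""
    if not 0.0 <= positive_prevalence <= 1.0:
        raise ValueError("positive_prevalence must lie in [0, 1]")
    positives = [item for item in pool if item[1] == 1]
    negatives = [item for item in pool if item[1] == 0]
    n_pos = round(positive_prevalence * size)
    n_neg = size - n_pos
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("pool is too small for the requested sample")
    # Sample within each class uniformly, so only the class mix changes.
    drawn = rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)
    rng.shuffle(drawn)
    return drawn


# Toy labelled pool: 50 positive and 50 negative documents (balanced).
pool = [(f"doc-{i}", i % 2) for i in range(100)]
rng = random.Random(0)
shifted = sample_with_prevalence(pool, positive_prevalence=0.9, size=20, rng=rng)
observed = sum(label for _, label in shifted) / len(shifted)
```

Here the pool is balanced while the drawn sample is 90% positive, so a quantifier trained on the pool would face a very different class distribution at prediction time. Covariate shift, as in Task T4, is a different condition and is not modelled by this sketch.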

Technical Categories

Machine learning

Keywords

Last updated

26.02.2024 - 10:31

Related Projects

AI4Media