Learning to quantify: LeQua 2024 dataset
Datasets of the LeQua 2024 Learning to Quantify Data Challenge
Developed by
License
Creative Commons Attribution 4.0 International License
Main Characteristic
The aim of LeQua 2024 (the 2nd data challenge on Learning to Quantify) is to allow the comparative evaluation of methods for “learning to quantify” in textual datasets, i.e., methods for training predictors of the relative frequencies of the classes of interest in sets of unlabelled textual documents. These predictors (called “quantifiers”) are required to issue predictions for several such sets, some of which are characterized by class frequencies radically different from those of the training set.
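To make the notion of a quantifier concrete, the following minimal sketch shows the simplest baseline, usually called "classify and count": label every document in a set with a classifier, then report the fraction assigned to each class. The function names, the toy predictions, and the error measure are illustrative assumptions and are not taken from the LeQua 2024 evaluation tooling.

```python
from collections import Counter


def classify_and_count(predicted_labels, classes):
    """Estimate class prevalences as the fraction of items assigned to each class."""
    counts = Counter(predicted_labels)
    n = len(predicted_labels)
    return {c: counts.get(c, 0) / n for c in classes}


def absolute_error(true_prev, estimated_prev):
    """Mean absolute difference between true and estimated prevalences."""
    return sum(abs(true_prev[c] - estimated_prev[c]) for c in true_prev) / len(true_prev)


# Hypothetical classifier output for one unlabelled set of 10 documents.
predicted = ["pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg", "neg"]
estimate = classify_and_count(predicted, classes=["neg", "pos"])

# Hypothetical true prevalences of the same set, used only to score the estimate.
true_prev = {"neg": 0.6, "pos": 0.4}
error = absolute_error(true_prev, estimate)
```

A quantifier is scored on how close its estimated prevalences are to the true ones, not on whether individual documents are labelled correctly. That is why this baseline tends to degrade when the class frequencies of a test set differ from those seen in training.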
Tasks
- Task T1: This task is concerned with evaluating binary quantifiers, i.e., quantifiers that must only predict the relative frequencies of a class and its complement; the data used are affected by prior probability shift (a.k.a. “label shift”). This task is akin to Task T1A of LeQua 2022.
- Task T2: This task is concerned with evaluating single-label multi-class quantifiers, i.e., quantifiers that operate on datapoints each belonging to exactly one among a set of n>2 classes; here too, the data used are affected by prior probability shift. This task is akin to Task T1B of LeQua 2022.
- Task T3: This task is concerned with evaluating ordinal quantifiers, i.e., quantifiers that operate on a set of n>2 totally ordered classes; here too, the data used are affected by prior probability shift. This task is new to LeQua 2024.
- Task T4: Like Task T1, this task is concerned with evaluating binary quantifiers; unlike in Task T1, the data used are affected by covariate shift. This task is new to LeQua 2024.
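The prior probability shift mentioned in Tasks T1 to T3 can be illustrated with a small sampling sketch: the class proportions of a sample are changed while documents are still drawn uniformly within each class, so the distribution of documents given the class is left untouched. This is only an illustration under assumed names and toy data, not the protocol used to generate the LeQua 2024 samples.

```python
import random


def sample_with_prevalence(pool, target_pos_prev, size, rng):
    """Draw `size` items (with replacement) so that a fraction
    `target_pos_prev` of them has label 1.

    Items are drawn uniformly within each class, so only the class
    proportions change, not the documents typical of each class.
    """
    positives = [item for item in pool if item[1] == 1]
    negatives = [item for item in pool if item[1] == 0]
    n_pos = round(target_pos_prev * size)
    n_neg = size - n_pos
    sample = rng.choices(positives, k=n_pos) + rng.choices(negatives, k=n_neg)
    rng.shuffle(sample)
    return sample


rng = random.Random(0)
# Toy labelled pool (hypothetical): 1000 (document, label) pairs, 50% positive.
pool = [(f"doc{i}", i % 2) for i in range(1000)]

# A test sample whose positive prevalence (90%) differs sharply from the pool's.
sample = sample_with_prevalence(pool, target_pos_prev=0.9, size=200, rng=rng)
sample_prev = sum(label for _, label in sample) / len(sample)
```

Covariate shift, as in Task T4, is the complementary situation: the distribution of the documents themselves changes while the relationship between a document and its class is assumed to stay the same. It is not covered by this sketch.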
Technical Categories
Machine learning
Last updated
26.02.2024 - 10:31