General Claim dataset

The General Claim dataset is a diverse, harmonized dataset for check-worthy claim detection, built to address the limitations of the narrow, specialized datasets currently used in the field. It is constructed from five pre-existing datasets and emphasizes variability across topics, languages, and writing styles, with content in 10 languages. The dataset supports broader applicability and greater consistency when training and evaluating models that identify claims meriting fact-checking.