Evaluating segmentation in automatic captioning systems
Owing to progress in the underlying NLP technologies (speech-to-text, text normalization and compression, machine translation), automatic captioning technologies (ACTs), both intra- and inter-lingual, are improving rapidly. ACTs are useful for many types of content and contexts: from talks and lectures to news, fiction, and other entertainment content.
While historical systems are built as complex NLP pipelines, recent proposals rely on integrated (end-to-end) systems, which call into question standard evaluation schemes in which each module is assessed independently of the others.
We focused on evaluating the quality of the output segmentation, where decisions regarding the length, layout, and display duration of each caption need to be made, all of which have a direct impact on acceptability and readability. In particular, we studied ways to perform reference-free evaluations of automatic caption segmentation, sketched below. We also correlated these "technology-oriented" metrics with user-oriented evaluations in typical use cases: post-editing and live broadcasting.
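To make the idea of a reference-free evaluation concrete, here is a minimal sketch: it scores a set of captions against standard readability guidelines (characters per line, lines per caption, reading speed). The `Caption` class and the threshold values (42 characters per line, 2 lines, 21 characters per second) are illustrative assumptions drawn from common industry guidelines, not values taken from the project itself.

```python
"""Reference-free readability checks for caption segmentation (sketch)."""
from dataclasses import dataclass

MAX_CHARS_PER_LINE = 42    # assumed guideline, not a project value
MAX_LINES = 2              # assumed guideline
MAX_CHARS_PER_SECOND = 21  # assumed reading-speed guideline

@dataclass
class Caption:
    lines: list[str]  # text lines displayed together on screen
    start: float      # display start time, in seconds
    end: float        # display end time, in seconds

def conformity_rate(captions: list[Caption]) -> float:
    """Fraction of captions that satisfy all three readability constraints."""
    def ok(c: Caption) -> bool:
        duration = max(c.end - c.start, 1e-6)  # guard against zero duration
        cps = sum(len(line) for line in c.lines) / duration
        return (len(c.lines) <= MAX_LINES
                and all(len(line) <= MAX_CHARS_PER_LINE for line in c.lines)
                and cps <= MAX_CHARS_PER_SECOND)
    return sum(ok(c) for c in captions) / len(captions)

if __name__ == "__main__":
    demo = [Caption(["Hello, and welcome", "to tonight's programme."], 0.0, 2.5)]
    print(f"conformity: {conformity_rate(demo):.2f}")
```

Such checks need no human reference, which is what makes them usable for direct, on-the-fly quality monitoring.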
Outputs
- Survey of existing segmentation metrics
- Design of a contrastive evaluation set
- Implementation and comparison of metrics on multiple languages and tasks (see the boundary-level sketch after this list)
- Publication: Alina Karakanta, François Buet, Mauro Cettolo, François Yvon (2022). "Evaluating Subtitle Segmentation for End-to-end Generation Systems", to appear in the Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), Marseille, France, June 2022.
- https://cirrus.universite-paris-saclay.fr/s/22D9e8RLQ3crXYo
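As an illustration of the reference-based family of metrics covered by the survey, the sketch below computes a boundary-level F1 between a system segmentation and a reference one, both encoded as token streams interleaved with break markers. The `<eob>` tag and the helper names are illustrative; this is one common way to score segmentation, not necessarily the metric adopted in the publication.

```python
"""Boundary-level precision/recall/F1 for subtitle segmentation (sketch).

System and reference are compared on where they place break markers in
an otherwise identical token stream; token-level alignment issues that
arise when the two texts differ are ignored here.
"""
BREAK = "<eob>"  # end-of-block marker; the tag name is illustrative

def boundary_positions(tokens: list[str]) -> set[int]:
    """Indices of the words that a break immediately follows."""
    positions, word_idx = set(), 0
    for tok in tokens:
        if tok == BREAK:
            positions.add(word_idx)
        else:
            word_idx += 1
    return positions

def boundary_f1(system: list[str], reference: list[str]) -> float:
    sys_b, ref_b = boundary_positions(system), boundary_positions(reference)
    if not sys_b or not ref_b:
        return 0.0
    tp = len(sys_b & ref_b)  # breaks placed at the same word index
    if tp == 0:
        return 0.0
    precision, recall = tp / len(sys_b), tp / len(ref_b)
    return 2 * precision * recall / (precision + recall)

sys_out = "good evening <eob> here is the news".split()
ref_out = "good evening here is <eob> the news".split()
print(f"boundary F1: {boundary_f1(sys_out, ref_out):.2f}")
```

A contrastive evaluation set, as in the second output above, can then probe how such scores react to controlled perturbations of the break positions.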
This Humane-AI-Net micro-project was carried out by the Centre national de la recherche scientifique (CNRS, François Yvon) and the Fondazione Bruno Kessler (FBK, Marco Turchi).