10.02.2021 | 15:00 - 16:00 (CET)

AI-Cafe presents: Employing AI for the semantic analysis of conventional and immersive video

AI-based methods are currently the best choice for the automatic extraction of semantic metadata from archive content. This session will describe our experience with deep learning algorithms for face detection and recognition, as well as for general object detection and tracking.

We will discuss the current state of these methods and highlight issues that remain, such as the ethnic bias occurring in all face recognition methods trained on public datasets. We will then present use cases showing how these methods can help archives annotate and exploit their content more conveniently. We address not only conventional video content but also emerging content types such as immersive video, which pose new challenges for archives. We will show how face recognition and scene object extraction can be used for the semi-automatic annotation of video content and for automatic cinematography / editing of a 360° video.


Hannes Fassold

Senior researcher at the Machine Vision Applications Group of the DIGITAL institute at JOANNEUM RESEARCH.

Hannes Fassold received an MSc degree in Applied Mathematics from Graz University of Technology in 2004. Since then he has worked at JOANNEUM RESEARCH, where he is currently a senior researcher at the Machine Vision Applications Group of the DIGITAL institute.

His main research interests are the automatic analysis and enhancement of video (e.g. object detection & tracking, optical flow, speaker recognition, defect detection, superresolution, denoising) with deep learning methods. He has published several papers in these fields and coordinates the machine learning workflow / infrastructure at DIGITAL.