4K Video Super-Resolution Detection
This model detects upscaled 4K videos and identifies the upscaling model used.
The model accurately detects upscaled 4K videos and predicts which upscaling model was applied. Trained on a dataset that covers both traditional interpolation methods and modern deep learning-based super-resolution techniques, it discerns the subtle differences these enhancements leave in video quality. The model has been trained on 4K videos, which are processed as sequences of frames. Two checkpoints are available: a lightweight version based on ResNet-18 and a more robust version based on ConvNeXt.
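The internal structure of the checkpoints is not spelled out here; as a minimal sketch, assuming standard PyTorch/torchvision backbones with a single classification head, the two variants might be built along these lines. The ConvNeXt size (`convnext_tiny`), the number of classes, and the checkpoint file names are assumptions, not the released configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # hypothetical: "original" plus four upscaling methods


def build_detector(backbone: str = "resnet18", num_classes: int = NUM_CLASSES) -> nn.Module:
    """Build either checkpoint variant with a classification head on top."""
    if backbone == "resnet18":
        model = models.resnet18(weights=None)
        model.fc = nn.Linear(model.fc.in_features, num_classes)
    elif backbone == "convnext_tiny":
        model = models.convnext_tiny(weights=None)
        model.classifier[2] = nn.Linear(model.classifier[2].in_features, num_classes)
    else:
        raise ValueError(f"unknown backbone: {backbone}")
    return model


# Loading a checkpoint might then look like this (file name is illustrative):
# model = build_detector("convnext_tiny")
# model.load_state_dict(torch.load("detector_convnext.pth", map_location="cpu"))
# model.eval()
```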
The upscaling detection model is a convolutional neural network (CNN) that processes frame patches individually. It has been trained on a dataset of around 1000 videos, generated by upscaling the original set of 200 BVI-DVC videos with a range of super-resolution techniques, from traditional approaches such as bicubic interpolation to state-of-the-art deep learning (DL) methods. The DL-based techniques include both "classical" and "real" variants, which are distinguished by their characteristic output patterns as documented in the literature.
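The following sketch illustrates one way the per-patch processing and whole-video aggregation could be wired together, reusing the hypothetical `build_detector` model above. The patch size, input normalisation, and mean-logit aggregation are assumptions for illustration, not the documented pipeline.

```python
import torch

PATCH_SIZE = 224  # assumed patch size; the actual training patch size is not stated here


def frame_to_patches(frame: torch.Tensor, patch: int = PATCH_SIZE) -> torch.Tensor:
    """Split a (C, H, W) frame into non-overlapping (N, C, patch, patch) tiles.

    Incomplete tiles at the right/bottom borders are dropped by unfold().
    """
    c = frame.shape[0]
    tiles = frame.unfold(1, patch, patch).unfold(2, patch, patch)  # (C, nH, nW, patch, patch)
    return tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, patch, patch)


@torch.no_grad()
def predict_video(model: torch.nn.Module, frames) -> torch.Tensor:
    """Classify a video by averaging per-patch logits over all of its frames."""
    per_frame = []
    for frame in frames:  # frames: iterable of normalised (C, H, W) tensors
        patches = frame_to_patches(frame)
        per_frame.append(model(patches).mean(dim=0))  # average logits over patches
    return torch.stack(per_frame).mean(dim=0).softmax(dim=-1)  # class probabilities
```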
The model has been trained with one label per enhancement method, enabling it to identify the specific method applied to a video. During development, several architectures were considered, including Generative Adversarial Networks (GANs), Transformers, and recurrent networks, each offering distinct advantages for the task.
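For concreteness, turning the prediction into a method name could look like the snippet below. The label names and their ordering are placeholders; the released checkpoints define their own class set.

```python
# Hypothetical class-index-to-label mapping, for illustration only.
LABELS = ["original", "bicubic", "classical_sr", "real_sr", "other"]

# probs = predict_video(model, frames)          # from the sketch above
# method = LABELS[int(probs.argmax())]          # predicted enhancement method
# print(f"predicted method: {method} ({probs.max().item():.2%} confidence)")
```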
A primary limitation is the model's sensitivity to input variations. While it performs well on systematically upscaled videos, its accuracy may degrade when the original videos already contain artefacts, particularly those that have been compressed or otherwise slightly altered.