Playable Video Generation

A novel framework for Playable Video Generation, trained in a self-supervised manner on a large dataset of unlabelled videos. We employ an encoder-decoder architecture in which the predicted action labels act as a bottleneck. The network is constrained to learn a rich action space, with a reconstruction loss on the generated video as the main driving loss.
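
The action-bottleneck idea described above can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes that frames are 2-D points rather than images, that the encoder is a simple nearest-prototype assignment rather than a neural network, and that the decoder just adds a learned per-action displacement to the current frame. All names (`encode_action`, `decode`, `reconstruction_loss`, `train`) are hypothetical and chosen for illustration only.

```python
# Toy sketch, NOT the authors' implementation: frames are 2-D points, not images.


def encode_action(frame, next_frame, displacements):
    """Bottleneck: compress a frame transition into one discrete action label.

    Assumption: the label is the index of the action whose displacement is
    closest to the observed frame difference (nearest-prototype assignment).
    """
    diff = [n - f for f, n in zip(frame, next_frame)]

    def squared_distance(displacement):
        return sum((d - x) ** 2 for d, x in zip(displacement, diff))

    return min(range(len(displacements)),
               key=lambda a: squared_distance(displacements[a]))


def decode(frame, action, displacements):
    """Decoder: predict the next frame from the current frame and an action label."""
    return [f + d for f, d in zip(frame, displacements[action])]


def reconstruction_loss(predicted, target):
    """Mean squared error between the generated frame and the real one."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def train(transitions, initial_displacements, epochs=50, lr=0.1):
    """Fit the per-action displacements using only the reconstruction loss.

    No action labels are given: the only supervision is how well the decoder
    reproduces the next frame from the action chosen by the encoder.
    """
    displacements = [list(d) for d in initial_displacements]
    for _ in range(epochs):
        for frame, next_frame in transitions:
            action = encode_action(frame, next_frame, displacements)
            predicted = decode(frame, action, displacements)
            dim = len(next_frame)
            for i in range(dim):
                # Gradient of the MSE with respect to the chosen displacement.
                grad = 2.0 * (predicted[i] - next_frame[i]) / dim
                displacements[action][i] -= lr * grad
    return displacements


# Unlabelled toy "video" transitions: the point moves right or left by one unit.
transitions = [
    ([0.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [0.0, 0.0]),
    ([2.0, 1.0], [3.0, 1.0]),
    ([3.0, 1.0], [2.0, 1.0]),
]
learned = train(transitions, initial_displacements=[[0.5, 0.0], [-0.5, 0.0]])

# "Playing" the model: the user picks an action label and the decoder
# generates the next frame from it.
frame = [5.0, 5.0]
frame = decode(frame, 1, learned)
```

Because the decoder only sees the current frame and a discrete label, the reconstruction loss can only be lowered if the labels capture what changed between frames. That is the role the bottleneck plays in the description above, here reduced to its simplest form.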

ML Model

Developed by

University of Trento

License

MIT license (MIT)

Main Characteristic

- Playable video generation model trained in a self-supervised manner on a large dataset of unlabelled videos.

Technical Categories

Computer vision, Machine learning

Keywords