Explainable Video Summarization
This library lists the outcomes of our research on video summarization and explainable AI, which include:
- A literature survey on video summarization using deep neural networks, published in the Proceedings of the IEEE, vol. 109, no. 11, pp. 1838-1863, Nov. 2021. DOI: 10.1109/JPROC.2021.3117472.
- An attention-based unsupervised method for video summarization (called CA-SUM), which estimates the frames' importance by integrating a concentrated attention mechanism and utilizing information about the frames' uniqueness and diversity. This method was published in the Proc. of the ACM Int. Conf. on Multimedia Retrieval (ICMR'22), Newark, NJ, USA, pp. 407-415, Jun. 2022. DOI: 10.1145/3512527.3531404. The code for reproducing our experiments is publicly available on the CA-SUM GitHub page, and pretrained models are publicly available on Zenodo.
- An unsupervised method for video thumbnail selection (called RL-DiVTS), which quantifies the representativeness and aesthetic quality of the selected thumbnails using deterministic reward functions, and integrates a frame-picking mechanism that takes the frames' diversity into account. This method was published in the Proc. of the IEEE Int. Conf. on Image Processing (ICIP 2023), Kuala Lumpur, Malaysia, Oct. 2023. DOI: 10.1109/ICIP49359.2023.10222743. The code for reproducing our experiments is publicly available on the RL-DiVTS GitHub page.
- An attention-based method for explaining video summarization (called XAI-SUM), and an extended study that builds on this method and considers various explanation signals and network architectures for video summarization. The original method was published in the Proc. of the IEEE Int. Symposium on Multimedia (ISM), Naples, Italy, pp. 146-150, Dec. 2022. DOI: 10.1109/ISM55400.2022.00029. The extended study was published in the Proc. of the NarSUM workshop at ACM Multimedia 2023 (ACM MM), Ottawa, Canada, Oct.-Nov. 2023. DOI: 10.1145/3607540.3617138. The code for reproducing our experiments in both works is publicly available on the XAI-SUM GitHub page.
- A book chapter discussing the use of explainable video summarization for advancing media content production, which builds on our methods for attention-based video summarization (CA-SUM) and explanation (XAI-SUM). This chapter was published in the Encyclopedia of Information Science and Technology, Sixth Edition, IGI Global, 2023. DOI: 10.4018/978-1-6684-7366-5.ch065.
- A web-based tool that facilitates the production of well-tailored video summaries for sharing on social media. This technology is presented in a paper accepted for publication in the 30th Int. Conf. on Multimedia Modeling (MMM), Amsterdam, The Netherlands, Jan.-Feb. 2024. The tool is accessible at: https://idt.iti.gr/summarizer
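To illustrate the general idea behind CA-SUM's importance estimation — concentrated (locally restricted) attention combined with frame uniqueness — here is a minimal, self-contained sketch. It is not the published CA-SUM implementation; the function names, the window-based attention restriction, and the cosine-based uniqueness score are all illustrative simplifications.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def frame_importance(features, block=2):
    """Toy importance scoring: each frame attends only to a local window
    ("concentrated" attention), and frames that are dissimilar to their
    neighbourhood (i.e., more unique) receive higher importance."""
    n = len(features)
    scores = []
    for i in range(n):
        lo, hi = max(0, i - block), min(n, i + block + 1)
        sims = [cosine(features[i], features[j]) for j in range(lo, hi) if j != i]
        avg_sim = sum(sims) / len(sims) if sims else 0.0
        scores.append(1.0 - avg_sim)  # uniqueness w.r.t. the local window
    return scores
```

With this toy scoring, a frame whose features differ from all of its neighbours (e.g., a scene change) receives the highest importance score.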
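The RL-DiVTS bullet above mentions deterministic reward functions for representativeness and diversity. As a hedged sketch under simplified assumptions (cosine similarity over per-frame feature vectors; the function names and exact formulas are hypothetical, not the published rewards), such rewards could look like:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def diversity_reward(selected, features):
    """Mean pairwise dissimilarity among the selected thumbnail frames."""
    pairs = [(i, j) for i in selected for j in selected if i < j]
    if not pairs:
        return 0.0
    return sum(1.0 - cosine(features[i], features[j]) for i, j in pairs) / len(pairs)

def representativeness_reward(selected, features):
    """How well the selection covers the video: every frame should be
    similar to at least one selected frame."""
    total = sum(max(cosine(f, features[s]) for s in selected) for f in features)
    return total / len(features)
```

In a reinforcement-learning setup of this kind, a frame-picking agent would be trained to choose selections that maximize a combination of such rewards.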
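The core premise of attention-based explanation, as in the XAI-SUM bullet above, is that the attention weights themselves can serve as an explanation signal. The following toy illustration (not the published method; the softmax attention and ranking rule are assumptions) computes a query frame's attention distribution over all frames and returns the most-attended frames as the explanation:

```python
import math

def softmax(xs):
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_explanation(features, query_idx, top_k=2):
    """Rank frames by the attention the query frame pays to them;
    the top-attended frames (other than the query itself) act as
    the explanation for the query frame's score."""
    q = features[query_idx]
    energies = [sum(a * b for a, b in zip(q, k)) for k in features]
    weights = softmax(energies)
    ranked = sorted(range(len(features)), key=lambda j: weights[j], reverse=True)
    return [j for j in ranked if j != query_idx][:top_k]
```

Here, the frames most similar to the query dominate its attention distribution, so they are returned as the fragments that most influenced the model's decision.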