Video Captioning Lecture
Digital video is now everywhere and is revolutionizing many domains, notably:
-Digital Media (video/movie) Content Production and Broadcasting,
-Social Media Streaming and Analytics (e.g., YouTube),
-Mobile Computing and Streaming,
-Videoconferencing,
-Medical/Biological/Dental Imaging and Diagnosis,
-Big Visual Data Analytics,
-Internet and Communications (media broadcasting, streaming),
-Scientific Imaging of any sort, e.g., Physics.
Furthermore, Video Processing and Analysis, in combination with Computer Vision and Machine Learning, enables diverse applications:
-Autonomous Systems (cars, drones, vessels) Perception,
-Robotics Perception and Control,
-Intelligent Human-Machine Interaction,
-Anthropocentric (human-centered) Computing,
-Smart Cities/Buildings and Assisted Living.
Advances in Visual Computing, which encompasses Computer Vision and Video Processing and Analysis, coupled with advances in AI (notably Machine Learning and Deep Neural Networks), make the news almost every day.
This lecture presents an overview of Video Captioning, which has many applications in video description, search and retrieval. It covers the following topics in detail: Video Captioning definitions and datasets; Evaluation Metrics (SVO, BLEU, ROUGE, CIDEr, METEOR, F-score); and Video Captioning Methods: Template-based Captioning, Joint Embedding, and Encoder-Decoder Mechanisms (Attention Mechanism, Hierarchical Neural Encoder, Paragraph Description, Dense Captioning, Video Captioning by Adversarial LSTMs).
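To make the n-gram based metrics listed above more concrete, the following is a minimal sketch of the clipped (modified) n-gram precision that BLEU builds on. It is an illustration only and not the lecture's own implementation: the function names `ngrams` and `clipped_precision` and the example captions are assumptions made here, and full BLEU additionally combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate caption (hypothetical helper).

    Each candidate n-gram count is clipped to the maximum number of times
    that n-gram occurs in any single reference caption.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0  # candidate is shorter than n tokens

    # Maximum count of each n-gram over all reference captions.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)

    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# Illustrative captions (made up for this sketch).
candidate = "the the the the".split()
references = ["the cat is on the mat".split()]
unigram_precision = clipped_precision(candidate, references, 1)
bigram_precision = clipped_precision(candidate, references, 2)
```

The clipping step is what prevents a degenerate caption that repeats one frequent word from scoring highly: here only two of the four occurrences of "the" are credited, because the reference contains the word twice.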