Using Classification for Analysis of Multi-modal Video Summarization
Author | : Brendan Wells |
Total Pages | : 63 |
Release | : 2020 |
OCLC | : 1184039042 |
Book excerpt: "Video summarization refers to taking the important contents of a video and condensing them into an easily consumable piece of data, so that a viewer does not have to watch the entire video. Currently, millions of videos are being recorded and shared every day. These videos range from the consumer level, such as a birthday party or wedding video, all the way up to industry, such as film and television. We have constructed a model that seeks to address the problem of not being able to consume all the media that is being presented to you because of time constraints. To do this, we conduct two separate experiments. The first experiment examines the role of different parts of the summarization model, namely modality, sampling rate, and data scaling, so that we better understand how summaries are generated. The second experiment utilizes these findings to create a model based in classification. We use classification as a means of interpreting a wide variety of types of video for summarization. By using classification to generate the video and audio features used by the summarizer, both the granularity of the classifier and the maturity of classification problems are leveraged to accomplish a summarization task. We found that while scaling and sampling of the data have little effect on the overall summary, in each experiment the modality played a large role in the results. While many models exclude audio, we found that there are benefits to including this data when generating a video summary. We also found that the use of classification resulted in a separation of impacts for each modality, with video serving to construct the shape of the summary and audio determining importance score."--Abstract.