Lebron Casas, Luis ORCID: 0000-0002-3230-3589, Graham, Yvette, O'Connor, Noel E. ORCID: 0000-0002-4033-9135 and McGuinness, Kevin ORCID: 0000-0003-1336-6477 (2022) Evaluation of automatically generated video captions using vision and language models. In: IEEE International Conference on Image Processing (ICIP), 16-19 Oct 2022, Bordeaux, France. ISBN 978-1-6654-9620-9
Abstract
Vision and language models are easily transferred to other tasks. In particular, they have been demonstrated to work well in the evaluation of automatic image captioning. This has made it possible to evaluate systems without the need for references or additional information apart from the image and the caption. However, these models do not provide a straightforward way of evaluating videos. In this paper, we propose using these models for video captioning evaluation. We explore the use of both single image-based evaluation and different methods to include data from multiple frames. Experiments demonstrate that using clustering methods to select a few frames to compute the final score gives an excellent correlation with human judgment. The bias in the human annotations can also influence the metric, so we propose filtering the human assessments to discard outliers and improve the evaluation process.
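The frame-selection idea in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual procedure, so everything here is an assumption: frame and caption embeddings are taken to come from some CLIP-like vision and language model, frames are grouped with plain k-means, the frame nearest each centroid is kept, and the score is the mean cosine similarity between those frames and the caption. The z-score rule for discarding outlying human assessments is likewise only one possible filter. All function and parameter names are hypothetical.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means on the rows of x; returns the k centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct frames (fancy indexing copies).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids


def select_key_frames(frame_emb, k):
    """Indices of the frames closest to each k-means centroid (assumed rule)."""
    centroids = kmeans(frame_emb, k)
    dists = np.linalg.norm(frame_emb[:, None, :] - centroids[None, :, :], axis=2)
    return np.unique(dists.argmin(axis=0))


def caption_score(frame_emb, caption_emb, k=3):
    """Mean cosine similarity between the caption and the selected frames."""
    frames = frame_emb[select_key_frames(frame_emb, k)]
    frames = frames / np.linalg.norm(frames, axis=1, keepdims=True)
    caption = caption_emb / np.linalg.norm(caption_emb)
    return float(np.mean(frames @ caption))


def filter_ratings(ratings, z_max=2.0):
    """Drop human ratings more than z_max standard deviations from the mean."""
    ratings = np.asarray(ratings, dtype=float)
    std = ratings.std()
    if std == 0:
        return ratings
    return ratings[np.abs(ratings - ratings.mean()) <= z_max * std]
```

In this sketch, `frame_emb` is an array of shape `(num_frames, dim)` and `caption_emb` a vector of length `dim`; a call such as `caption_score(frame_emb, caption_emb, k=3)` returns a value in the range -1 to 1, which could then be correlated with the filtered human ratings.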
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Video Captioning Evaluation; Vision and Language Models; Clustering; Transformers; Video Captioning |
Subjects: | Computer Science > Algorithms; Computer Science > Artificial intelligence; Computer Science > Image processing; Computer Science > Machine learning; Computer Science > Multimedia systems |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering; Research Initiatives and Centres > INSIGHT Centre for Data Analytics |
Published in: | 2022 IEEE International Conference on Image Processing (ICIP). IEEE. ISBN 978-1-6654-9620-9 |
Publisher: | IEEE |
Official URL: | https://doi.org/10.1109/ICIP46576.2022.9897559 |
Funders: | Insight SFI Centre for Data Analytics, Irish Research Council Partnership Scheme, United Technologies Research Center Ireland, Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289_P2, co-funded by the European Regional Development Fund |
ID Code: | 27890 |
Deposited On: | 26 Oct 2022 12:23 by Kevin McGuinness. Last Modified 26 Oct 2022 14:26 |
Documents
Full text available as: PDF (162kB)