DORAS | DCU Research Repository

Evaluation of automatically generated video captions using vision and language models

Lebron Casas, Luis (ORCID: 0000-0002-3230-3589), Graham, Yvette, O'Connor, Noel E. (ORCID: 0000-0002-4033-9135) and McGuinness, Kevin (ORCID: 0000-0003-1336-6477) (2022) Evaluation of automatically generated video captions using vision and language models. In: IEEE International Conference on Image Processing (ICIP), 16-19 Oct 2022, Bordeaux, France. ISBN 978-1-6654-9620-9

Abstract
Vision and language models are easily transferred to other tasks. In particular, they have been demonstrated to work well in the evaluation of automatic image captioning. This has made it possible to evaluate systems without the need for references or additional information apart from the image and the caption. However, these models do not provide a straightforward way of evaluating videos. In this paper, we propose using these models for video captioning evaluation. We explore the use of both single image-based evaluation and different methods to include data from multiple frames. Experiments demonstrate that using clustering methods to select a few frames to compute the final score gives an excellent correlation with human judgment. The bias in the human annotations can also influence the metric, so we propose filtering the human assessments to discard outliers and improve the evaluation process.
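
As a rough illustration of the frame-selection idea described in the abstract, the sketch below clusters per-frame embeddings, keeps the frame closest to each cluster centre, and averages the similarity between those frames and the caption. It is not the authors' implementation: the abstract does not state which clustering algorithm, similarity function, or aggregation was used, so k-means, cosine similarity, and a plain mean are assumptions here. The names `select_key_frames` and `caption_score` are hypothetical, and the embeddings are assumed to come from the image and text encoders of some vision and language model.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def select_key_frames(frame_emb: np.ndarray, k: int, iters: int = 20,
                      seed: int = 0) -> np.ndarray:
    """Run plain k-means on frame embeddings (an assumed choice of clustering)
    and return the indices of the frames nearest to each centroid."""
    rng = np.random.default_rng(seed)
    k = min(k, len(frame_emb))
    # Initialise centroids from k distinct frames (fancy indexing copies).
    centroids = frame_emb[rng.choice(len(frame_emb), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(
            frame_emb[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = frame_emb[labels == j]
            if len(members) > 0:  # leave an empty cluster's centroid as is
                centroids[j] = members.mean(axis=0)
    dists = np.linalg.norm(
        frame_emb[:, None, :] - centroids[None, :, :], axis=-1)
    # One representative frame per centroid; duplicates are collapsed.
    return np.unique(dists.argmin(axis=0))


def caption_score(frame_emb, caption_emb, k: int = 3) -> float:
    """Mean cosine similarity between the caption and the selected frames
    (mean aggregation is an assumption, not taken from the paper)."""
    frames = l2_normalize(np.asarray(frame_emb, dtype=float))
    caption = l2_normalize(np.asarray(caption_emb, dtype=float))
    keep = select_key_frames(frames, k)
    return float((frames[keep] @ caption).mean())


if __name__ == "__main__":
    # Placeholder random embeddings stand in for real model outputs.
    rng = np.random.default_rng(42)
    video = rng.normal(size=(60, 16))   # 60 frames, 16-dim embeddings
    caption = rng.normal(size=16)
    print(caption_score(video, caption, k=4))
```

Because no references are involved, a score of this kind depends only on the video frames and the candidate caption, which matches the reference-free setting the abstract describes.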
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Video Captioning Evaluation; Vision and Language Models; Clustering; Transformers; Video Captioning
Subjects: Computer Science > Algorithms; Computer Science > Artificial intelligence; Computer Science > Image processing; Computer Science > Machine learning; Computer Science > Multimedia systems
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering; Research Initiatives and Centres > INSIGHT Centre for Data Analytics
Published in: 2022 IEEE International Conference on Image Processing (ICIP). IEEE. ISBN 978-1-6654-9620-9
Publisher: IEEE
Official URL: https://doi.org/10.1109/ICIP46576.2022.9897559
Funders: Insight SFI Centre for Data Analytics, Irish Research Council Partnership Scheme, United Technologies Research Center Ireland, Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289_P2, co-funded by the European Regional Development Fund
ID Code: 27890
Deposited On: 26 Oct 2022 12:23 by Kevin McGuinness. Last Modified: 26 Oct 2022 14:26
Documents

Full text available as:

ICIP.pdf (PDF, 162kB)