DORAS | DCU Research Repository

Semantic indexing of wearable camera images: kids’cam concepts

Smeaton, Alan F. (ORCID: 0000-0003-1028-8389), McGuinness, Kevin (ORCID: 0000-0003-1336-6477), Gurrin, Cathal (ORCID: 0000-0003-2903-3968), Zhou, Jiang (ORCID: 0000-0002-3067-8512), O'Connor, Noel E. (ORCID: 0000-0002-4033-9135), Wang, Peng, Davis, Brian (ORCID: 0000-0003-1336-6477), Azevedo, Lucas, Freitas, Andre, Signal, Louise N. (ORCID: 0000-0003-4715-6718), Smith, Moira (ORCID: 0000-0003-2085-7522), Stanley, James, Barr, Michelle, Chambers, Tim and Ní Mhurchú, Cliona (ORCID: 0000-0002-1144-9167) (2016) Semantic indexing of wearable camera images: kids’cam concepts. In: Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion, 16 Oct 2016, Amsterdam, the Netherlands. ISBN 978-1-4503-4519-4/16/10

Abstract
To support content-based search over visual media, including images and video, content is typically accessed through manually or automatically assigned concepts or tags, or sometimes through image-to-image similarity, depending on the use case. While great progress has been made in very recent years in automatic concept detection using machine learning, we are still left with a mismatch between the semantics of the concepts we can automatically detect and the semantics of the words used in a user’s query. In this paper we report on a large collection of images from wearable cameras gathered as part of the Kids’Cam project, which have been both manually annotated from a vocabulary of 83 concepts and automatically annotated from a vocabulary of 1,000 concepts. This collection allows us to explore how language, in the form of two distinct concept vocabularies or spaces, one of them manually assigned and thus forming a ground truth, is used to represent images, in our case images taken using wearable cameras. It also allows us to discuss, in general terms, the mismatch of concepts in visual media, which derives from language mismatches. We report the data processing we have completed on this collection and some of our initial experimentation in mapping across the two vocabularies.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Subjects: Computer Science > Lifelog; Computer Science > Machine learning; Business > Marketing; Computer Science > Image processing; Computer Science > Information retrieval
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering; Research Initiatives and Centres > INSIGHT Centre for Data Analytics; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Published in: iV&L-MM '16 Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion. Association for Computing Machinery. ISBN 978-1-4503-4519-4/16/10
Publisher: Association for Computing Machinery
Official URL: http://dx.doi.org/10.1145/2983563.2983566
Copyright Information: © ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in http://dx.doi.org/10.1145/2983563.2983566
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland, National Natural Science Foundation of China, Beijing Key Laboratory of Networked Multimedia, Health Research Council of New Zealand Programme Grant
ID Code: 21434
Deposited On: 18 Oct 2016 09:23 by Alan Smeaton. Last Modified: 07 Apr 2021 13:02
Documents

Full text available as:

p27-smeaton.pdf (PDF, 1MB)