DORAS | DCU Research Repository

A text recognition and retrieval system for e-business image management

Zhou, Jiang (ORCID: 0000-0002-3067-8512), O'Connor, Noel E. (ORCID: 0000-0002-4033-9135) and McGuinness, Kevin (ORCID: 0000-0003-1336-6477) (2018) A text recognition and retrieval system for e-business image management. In: The 24th International Conference on Multimedia Modeling (MMM2018), 5-7 Feb 2018, Bangkok, Thailand.

Abstract
The on-going growth of e-business has resulted in companies having to manage an ever-increasing number of product, packaging and promotional images. Systems for indexing and retrieving such images are required in order to ensure image libraries can be managed and fully exploited as valuable business resources. In this paper, we explore the power of text recognition for e-business image management and propose an innovative system based on photo OCR. Photo OCR has been actively studied for scene text recognition but has not been exploited for e-business digital image management. Besides the well-known difficulties in scene text recognition, such as variation in text size, location and orientation and cluttered backgrounds, e-business images typically feature text with extremely diverse fonts, and the characters are often artistically modified in shape, colour and arrangement. To address these challenges, our system takes advantage of the combinatorial power of deep neural networks and MSER processing. The cosine distance and n-gram vectors are used during retrieval for matching detected text to queries to provide tolerance to the inevitable transcription errors in text recognition. To evaluate our proposed system, we prepared a novel dataset designed specifically to reflect the challenges associated with text in e-business images. We compared our system with two other approaches for scene text recognition, and the results show that our system outperforms the other state-of-the-art approaches on the new, challenging dataset. Our system demonstrates that recognizing text embedded in images can be hugely beneficial for digital asset management.
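The retrieval step mentioned in the abstract (cosine distance over n-gram vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the n-gram order or whether the n-grams are character-level or word-level, so the character bigrams, the space padding, and the function names `char_ngrams`, `cosine_distance` and `rank_images` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Bag of character n-grams for a lowercased, space-padded string.

    Assumption: character-level bigrams; the paper may use a different scheme.
    """
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def rank_images(query, transcriptions, n=2):
    """Return image ids ordered from closest to farthest OCR transcription.

    `transcriptions` is a hypothetical mapping of image id -> recognised text.
    """
    q = char_ngrams(query, n)
    scored = [
        (cosine_distance(q, char_ngrams(text, n)), image_id)
        for image_id, text in transcriptions.items()
    ]
    return [image_id for _, image_id in sorted(scored)]


if __name__ == "__main__":
    # "ch0colate" simulates a single-character OCR error in "chocolate".
    ocr_output = {"img1": "ch0colate", "img2": "shampoo"}
    print(rank_images("chocolate", ocr_output))
```

Because a single mis-recognised character only disturbs the few n-grams that overlap it, most of the vector still matches the query, which is one way such a scheme can tolerate transcription errors where exact string matching would fail.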
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: image management; image retrieval; OCR
Subjects: UNSPECIFIED
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering; Research Initiatives and Centres > INSIGHT Centre for Data Analytics
Copyright Information: © 2018 The Authors
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland under grant number SFI/12/RC/2289, European Regional Development Fund
ID Code: 22133
Deposited On: 08 Feb 2018 09:57 by Jiang Zhou. Last Modified: 25 Jan 2019 10:00
Documents

Full text available as:

a_text_recognition_and_retrieval_system_for_e-business_image_management.pdf (PDF, 3MB)