DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

No padding please: efficient neural handwriting recognition

Maillette de Buy Wenniger, Gideon, Schomaker, Lambert (ORCID: 0000-0003-2351-930X) and Way, Andy (ORCID: 0000-0001-5736-5930) (2020) No padding please: efficient neural handwriting recognition. In: International Conference on Document Analysis and Recognition (ICDAR2019), 20-25 Sept 2019, Sydney, Australia.

Abstract
Neural handwriting recognition (NHR) is the recognition of handwritten text with deep learning models, such as multi-dimensional long short-term memory (MDLSTM) recurrent neural networks. Models with MDLSTM layers have achieved state-of-the-art results on handwritten text recognition tasks. While multi-directional MDLSTM layers have an unbeaten ability to capture the complete context in all directions, this strength limits the possibilities for parallelization, and therefore comes at a high computational cost. In this work we develop methods to create efficient MDLSTM-based models for NHR, particularly a method aimed at eliminating computation waste that results from padding. This proposed method, called example-packing, replaces wasteful stacking of padded examples with efficient tiling in a 2-dimensional grid. For word-based NHR this yields a speed improvement of factor 6.6 over an already efficient baseline of minimal padding for each batch separately. For line-based NHR the savings are more modest, but still significant. In addition to example-packing, we propose: 1) a technique to optimize parallelization for dynamic graph definition frameworks including PyTorch, using convolutions with grouping, 2) a method for parallelization across GPUs for variable-length example batches. All our techniques are thoroughly tested on our own PyTorch re-implementation of MDLSTM-based NHR models. A thorough evaluation on the IAM dataset shows that our models perform similarly to earlier implementations of state-of-the-art models. Our efficient NHR model and some of the reusable techniques discussed with it offer ways to realize relatively efficient models for the omnipresent scenario of variable-length inputs in deep learning.
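To make the padding argument concrete, the following is a minimal sketch of why tiling variable-size examples in a 2-dimensional grid can waste less computation than stacking padded examples. It is not the authors' example-packing algorithm: the greedy first-fit row packing, the `separator` gap between examples, and all function names (`padded_stack_cells`, `pack_rows`, `packed_cells`) are assumptions made purely for illustration.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (height, width) of one example image, in cells


def padded_stack_cells(sizes: List[Size]) -> int:
    """Cells processed when every example is padded to the batch maximum."""
    max_h = max(h for h, _ in sizes)
    max_w = max(w for _, w in sizes)
    return len(sizes) * max_h * max_w


def pack_rows(sizes: List[Size], row_width: int, separator: int = 1) -> List[List[Size]]:
    """Greedy first-fit packing of examples into rows of at most row_width.

    Hypothetical stand-in for example-packing: examples are sorted tallest
    first so that each row holds examples of similar height.
    """
    rows: List[List[Size]] = []
    used: List[int] = []  # width already occupied in each row
    for size in sorted(sizes, key=lambda s: s[0], reverse=True):
        _, width = size
        if width > row_width:
            raise ValueError("example is wider than the packing row")
        for i, occupied in enumerate(used):
            # A separator column is needed between neighbouring examples.
            if occupied + separator + width <= row_width:
                rows[i].append(size)
                used[i] += separator + width
                break
        else:
            # No existing row had room: open a new one.
            rows.append([size])
            used.append(width)
    return rows


def packed_cells(rows: List[List[Size]], row_width: int, separator: int = 1) -> int:
    """Cells processed for the packed grid, with rows stacked vertically."""
    row_heights = [max(h for h, _ in row) for row in rows]
    total_height = sum(row_heights) + separator * (len(rows) - 1)
    return total_height * row_width


if __name__ == "__main__":
    word_sizes = [(4, 10), (4, 3), (3, 5), (2, 2), (2, 6), (1, 4)]
    rows = pack_rows(word_sizes, row_width=10)
    print("padded stack:", padded_stack_cells(word_sizes))
    print("packed grid: ", packed_cells(rows, row_width=10))
```

The sketch only counts processed cells. The speed-up reported in the abstract depends on the MDLSTM implementation itself, which is not modelled here.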
Metadata
Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Additional Information:This is a pre-publication of a paper which has been accepted at the International Conference on Document Analysis and Recognition 2019 (ICDAR 2019, https://icdar2019.org/).
Uncontrolled Keywords:variable length input; example-packing; multi-dimensional long short-term memory; handwriting recognition; deep learning; fast deep learning
Subjects:Computer Science > Artificial intelligence
Computer Science > Machine learning
DCU Faculties and Centres:DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Initiatives and Centres > ADAPT
Published in: 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE.
Publisher:IEEE
Official URL:http://dx.doi.org/10.1109/ICDAR.2019.00064
Copyright Information:©2019 The Authors
Funders:European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 713567, ADAPT Centre under the SFI Research Centres Programme (Grant 13/RC/2106).
ID Code:23382
Deposited On:03 Jul 2019 12:06 by Gideon Maillette De buy. Last Modified 17 Feb 2020 15:50
Documents

Full text available as:

1902.11208.pdf
PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
796kB