DORAS | DCU Research Repository

Human evaluation of English–Irish transformer-based NMT

Lankford, Séamus, Afli, Haithem (ORCID: 0000-0002-7449-4707) and Way, Andy (ORCID: 0000-0001-5736-5930) (2022) Human evaluation of English–Irish transformer-based NMT. Information, 13 (7). ISSN 2078-2489

Abstract
In this study, a human evaluation is carried out on how hyperparameter settings impact the quality of Transformer-based Neural Machine Translation (NMT) for the low-resourced English–Irish pair. SentencePiece models using both Byte Pair Encoding (BPE) and unigram approaches were appraised. Variations in model architectures included modifying the number of layers, evaluating the optimal number of heads for attention and testing various regularisation techniques. The greatest performance improvement was recorded for a Transformer-optimized model with a 16k BPE subword model. Compared with a baseline Recurrent Neural Network (RNN) model, a Transformer-optimized model demonstrated a BLEU score improvement of 7.8 points. When benchmarked against Google Translate, our translation engines demonstrated significant improvements. Furthermore, a quantitative fine-grained manual evaluation was conducted which compared the performance of machine translation systems. Using the Multidimensional Quality Metrics (MQM) error taxonomy, a human evaluation of the error types generated by an RNN-based system and a Transformer-based system was explored. Our findings show that the best-performing Transformer system significantly reduces both accuracy and fluency errors when compared with an RNN-based model.
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: human evaluation; MQM; neural machine translation; Irish; low-resource languages
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Publisher: MDPI
Official URL: https://dx.doi.org/10.3390/info13070309
Copyright Information: © 2022 The Authors.
ID Code: 27455
Deposited On: 29 Jul 2022 15:01 by Thomas Murtagh. Last Modified 14 Mar 2023 15:54
Documents

Full text available as: information-13-00309-v2.pdf (PDF, 640kB)
Creative Commons: Attribution 4.0