A roadmap to neural automatic post-editing: an empirical approach

Shterionov, Dimitar ORCID: 0000-0001-6300-797X, do Carmo, Félix, Moorkens, Joss ORCID: 0000-0003-0766-0071, Hossari, Murhaf, Wagner, Joachim ORCID: 0000-0002-8290-3849, Paquin, Eric, Schmidtke, Dag, Groves, Declan and Way, Andy ORCID: 0000-0001-5736-5930 (2020) A roadmap to neural automatic post-editing: an empirical approach. Machine Translation (34). pp. 67-96. ISSN 0922-6567

Abstract

In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional statistical ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data, makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE “roadmap” to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT Centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT.

Metadata

Item Type:	Article (Published)
Refereed:	Yes
Uncontrolled Keywords:	Automatic post-editing; Neural post-editing; Multi-source; Deep learning; Empirical evaluation; Machine Translation
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; DCU Faculties and Schools > Faculty of Humanities and Social Science > School of Applied Language and Intercultural Studies; Research Initiatives and Centres > ADAPT
Publisher:	Springer
Official URL:	http://dx.doi.org/10.1007%2Fs10590-020-09249-7
Copyright Information:	© 2020 The Authors CC-BY-4.0 (Open Access)
Funders:	SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development, European Union’s Horizon 2020 research and innovation programme, under the EDGE COFUND Marie Skłodowska-Curie Grant Agreement No. 713567, Science Foundation Ireland (SFI) under Grant Number 13/RC/2077
ID Code:	25315
Deposited On:	06 Jan 2021 14:08 by Joss Moorkens. Last Modified 01 Mar 2023 13:33

Documents

Full text available as:

PDF (Shterionov et al 2020) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution 3.0
943kB

DORAS | DCU Research Repository