Poncelas, Alberto ORCID: 0000-0002-5089-1687, Maillette de Buy Wenniger, Gideon and Way, Andy ORCID: 0000-0001-5736-5930 (2018) Feature decay algorithms for neural machine translation. In: 21st Annual Conference of The European Association for Machine Translation, 28-30 May 2018, Alicante, Spain.
Abstract
Neural Machine Translation (NMT) systems require large amounts of data to be competitive. For this reason, data selection techniques are used only for fine-tuning systems that have been trained with larger amounts of data. In this work we aim to use Feature Decay Algorithms (FDA) data selection techniques not only to fine-tune a system but also to build a complete system with less data. Our findings reveal that it is possible to find a subset of sentence pairs that, when used to train a German-English NMT system, outperforms the full training corpus by 1.11 BLEU points.
Metadata
Item Type: | Conference or Workshop Item (Lecture) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Machine Translation; Statistical Machine Translation; Neural Machine Translation |
Subjects: | UNSPECIFIED |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT |
Published in: | Proceedings of the 21st Annual Conference of the European Association for Machine Translation. |
Copyright Information: | ©2018 The Authors |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 22882 |
Deposited On: | 19 Dec 2018 12:45 by Gideon Maillette De buy . Last Modified 22 Jan 2021 14:25 |
Documents
Full text available as:
- PDF (610kB)
- Other (Plain Text Bibliography) (8kB)