DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Lost in translation: loss and decay of linguistic richness in machine translation

Way, Andy (ORCID: 0000-0001-5736-5930), Shterionov, Dimitar (ORCID: 0000-0001-6300-797X) and Vanmassenhove, Eva (ORCID: 0000-0003-1162-820X) (2019) Lost in translation: loss and decay of linguistic richness in machine translation. In: MT Summit XVII, 19 - 23 Aug 2019, Dublin, Ireland.

Abstract
This work presents an empirical approach to quantifying the loss of lexical richness in Machine Translation (MT) systems compared to Human Translation (HT). Our experiments show that current MT systems indeed fail to render the lexical diversity of human-generated or human-translated text. The inability of MT systems to generate diverse outputs, and their tendency to exacerbate already frequent patterns while ignoring less frequent ones, might be the underlying cause of, among other things, the currently heavily debated issues related to gender-biased output. Can we indeed, aside from biased data, talk about an algorithm that exacerbates seen biases?
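The abstract does not name the metrics used, so the following is only a minimal sketch of one common lexical-richness measure, the type-token ratio (distinct words divided by total words). The function name `type_token_ratio`, the simple regex tokenizer, and both example sentences are assumptions made for illustration and are not taken from the paper.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct tokens / total tokens for a text (0.0 if empty)."""
    # Naive tokenizer (an assumption): lowercase, keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sentences, invented for illustration only.
human = "the clerk greeted her, and the doctor thanked him"
machine = "the clerk greeted him, and the doctor thanked him"

print(f"human TTR:   {type_token_ratio(human):.3f}")
print(f"machine TTR: {type_token_ratio(machine):.3f}")
```

In this toy pair, the second sentence reuses a more frequent word in place of a less frequent one, so its ratio is lower. Type-token ratio is sensitive to text length, so comparisons are only meaningful between texts of similar size.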
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Initiatives and Centres > ADAPT
Published in: Forcada, Mikel, Way, Andy, Haddow, Barry and Sennrich, Rico (eds.) Proceedings of MT Summit XVII. 1. European Association for Machine Translation.
Publisher: European Association for Machine Translation
Official URL: https://www.aclweb.org/anthology/W19-6622.pdf
Copyright Information: © 2019 The Authors.
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Dublin City University Faculty of Engineering & Computing under the Daniel O’Hare Research Scholarship, ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2106).
ID Code: 23865
Deposited On: 21 Oct 2019 13:08 by Andrew Way. Last Modified: 24 May 2023 10:05
Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution 4.0
329kB