Dowling, Meghan ORCID: 0000-0003-1637-4923, Lynn, Teresa and Way, Andy ORCID: 0000-0001-5736-5930 (2017) A crowd-sourcing approach for translations of minority language user-generated content (UGC). In: First workshop on Social Media and User Generated Content Machine Translation, 31 May 2017, Prague, Czech Republic.
Abstract
Data sparsity is a common problem for machine translation of minority and less-resourced languages. While data collection for standard, grammatical text can be challenging enough, efforts for collection of parallel user-generated content can be even more challenging. In this paper we describe an approach to collecting English↔Irish translations of user-generated content (tweets) that overcomes some of these hurdles. We show how a crowd-sourced data collection campaign, which was tailored to our target audience (the Irish language community), proved successful in gathering data for a niche domain. We also discuss the reliability of crowd-sourcing English↔Irish tweet translations in terms of quality by reporting on a self-rating approach along with qualified reviewer ratings.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Minority Languages |
Subjects: | Computer Science > Machine translating; Humanities > Irish language |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT |
Copyright Information: | © 2017 PBML. Distributed under CC BY-NC-ND |
Funders: | ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2016) and is co-funded by the European Regional Development Fund |
ID Code: | 23304 |
Deposited On: | 20 May 2019 15:50 by Thomas Murtagh. Last Modified 20 May 2019 15:50 |
Documents
Full text available as: PDF (221kB)