DORAS | DCU Research Repository

The ADAPT bilingual document alignment system at WMT16

Lohar, Pintu (ORCID: 0000-0002-5328-1585), Afli, Haithem (ORCID: 0000-0002-7449-4707), Liu, Chao-Hong (ORCID: 0000-0002-1235-6026) and Way, Andy (ORCID: 0000-0001-5736-5930) (2016) The ADAPT bilingual document alignment system at WMT16. In: First Conference on Machine Translation (WMT16), 11-12 Aug 2016, Berlin, Germany.

Abstract
Comparable corpora have been shown to be useful in several multilingual natural language processing (NLP) tasks. Many previous papers have focused on how to improve the extraction of parallel data from this kind of corpus at different levels. In this paper, we are interested in improving the quality of bilingual comparable corpora, as measured by an increased document alignment score. We describe our participation in the bilingual document alignment shared task of the First Conference on Machine Translation (WMT16). We propose a technique based on source-to-target sentence- and word-based scores and the fraction of matched source named entities. We performed our experiments on English-to-French document alignments for this bilingual task.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Published in: Proceedings of the First Conference on Machine Translation: Shared Task Papers. 2. Association for Computational Linguistics (ACL).
Publisher: Association for Computational Linguistics (ACL)
Official URL: http://dx.doi.org/10.18653/v1/W16-2372
Copyright Information: © 2016 Association for Computational Linguistics (ACL)
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland in the ADAPT Centre (Grant 13/RC/2106) (www.adaptcentre.ie) at Dublin City University
ID Code: 23374
Deposited On: 29 May 2019 09:24 by Thomas Murtagh. Last Modified: 05 May 2023 16:27