Robust sub-sentential alignment of phrase-structure trees

Groves, Declan, Hearne, Mary and Way, Andy (ORCID: 0000-0001-5736-5930) (2004) Robust sub-sentential alignment of phrase-structure trees. In: COLING 2004 - 20th International Conference on Computational Linguistics, 23-27 August 2004, Geneva, Switzerland.

Abstract
Data-Oriented Translation (DOT), based on Data-Oriented Parsing (DOP), is a language-independent MT engine which exploits parsed, aligned bitexts to produce very high-quality translations. However, data acquisition constitutes a serious bottleneck, as DOT requires parsed sentences aligned at both sentential and sub-structural levels. Manual sub-structural alignment is time-consuming, error-prone and requires considerable knowledge of both source and target languages and how they are related. Automating this process is essential in order to carry out the large-scale translation experiments necessary to assess the full potential of DOT. We present a novel algorithm which automatically induces sub-structural alignments between context-free phrase-structure trees in a fast and consistent fashion, requiring little or no knowledge of the language pair. We present results from a number of experiments which indicate that our method provides a serious alternative to manual alignment.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: data-oriented translation (DOT)
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Initiatives and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Publisher: Association for Computational Linguistics
Official URL: http://aclweb.org/anthology-new/C/C04/
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Irish Research Council for Science, Engineering and Technology
ID Code: 15307
Deposited On: 15 Mar 2010 14:24 by DORAS Administrator. Last Modified 16 Nov 2018 11:55
Documents

Full text available as:

GrovesEtAl_coling_04.pdf (PDF, 89kB)