Bryl, Anton and van Genabith, Josef ORCID: 0000-0003-1322-7944 (2010) Two approaches to automatic matching of atomic grammatical features in LFG. In: LFG10 Conference, 18-20 July 2010, Ottawa, Canada.
Abstract
The alignment of a bilingual corpus is an important step in data preparation for data-driven machine translation. LFG f-structures provide bilexical labelled dependencies in the form of lemmas and core grammatical functions linking those lemmas, but also important grammatical features (TENSE, NUMBER, CASE, etc.) representing morphological and semantic information. These grammatical features can often be translated independently from the lemmas or words. It is therefore of practical interest to develop methods that align grammatical features which can be considered translations of each other (e.g. the number features of the corresponding words in the source and target parts of the corpus) in data-driven LFG-based MT. In a parallel grammar development scenario, such as ParGram, this is to a large extent captured through manually hardcoding the correspondences in the hand-crafted grammars, using similar or identical feature names for similar phenomena across languages. However, for a completely automatic learning method it is desirable to establish these correspondences without human assistance. In this paper we present and evaluate two approaches to the automatic identification of correspondences between atomic features of LFG (and similar) grammars for different languages. The methods can be used to evaluate the correspondence between feature names in hand-crafted parallel grammars or find correspondences between features in grammars for different languages where feature alignments are not known.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | LFG Grammars |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL); Research Initiatives and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Published in: | Proceedings of the LFG10 Conference. CSLI Publications. |
Publisher: | CSLI Publications |
Official URL: | http://cslipublications.stanford.edu/LFG/15/papers... |
Copyright Information: | © 2010 CSLI Publications. |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16016 |
Deposited On: | 19 May 2011 09:36 by Shane Harper. Last Modified 21 Jan 2022 16:28 |