Li, Liangyou ORCID: 0000-0002-0279-003X (2016) Dependency graph-based statistical machine translation. PhD thesis, Dublin City University.
Abstract
Statistical Machine Translation has been shown to benefit from complex linguistic structures. However, previous work mainly focuses on sequences and trees. In this thesis, we build dependency graphs, constructed from dependency trees, that uniformly represent both dependency relations and sequential relations, including bigram relations and sibling relations. We propose translation models to translate these graphs into target strings and conduct experiments on Chinese–English and German–English translation tasks.
As motivation, we first present a pseudo forest-to-string model, which improves a dependency tree-to-string model through dependency decomposition. The decomposition takes sibling relations into consideration, which results in more rules being used and thus higher phrase coverage. Experiments show that such decomposition is beneficial to translation performance. Integrating phrasal rules further improves our model.
Then, we propose a segmentational graph-based translation model. It segments graphs into subgraphs and generates translations from left to right by combining the translations of these subgraphs. The graphs explicitly combine dependency relations and bigram relations. In experiments, the graph-based model outperforms both the phrase-based model and the treelet-based model. In addition, we improve this model by using a graph segmentation model to take source context into consideration.
Furthermore, inspired by the use of tree grammars to translate trees, we propose recursive graph-based translation models using graph grammars. An edge replacement grammar is used to translate dependency-edge graphs, which are converted from dependency trees by labeling edges so that sibling relations are naturally taken into consideration. A node replacement grammar is used to translate dependency-sibling graphs, which explicitly add sibling links to dependency trees. Experiments show that our models are significantly better than the hierarchical phrase-based model.
Metadata
Item Type: | Thesis (PhD) |
---|---|
Date of Award: | November 2016 |
Refereed: | No |
Supervisor(s): | Liu, Qun and Way, Andy |
Uncontrolled Keywords: | Graph-based Translation |
Subjects: | Computer Science > Computational linguistics; Computer Science > Machine translating |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License. |
ID Code: | 21751 |
Deposited On: | 10 Nov 2017 13:12 by Qun Liu. Last Modified 04 Dec 2019 13:40 |
Documents
Full text available as: PDF (750kB), Creative Commons: Attribution-Noncommercial-No Derivative Works 3.0