DORAS | DCU Research Repository

Treebank embedding vectors for out-of-domain dependency parsing

Wagner, Joachim (ORCID: 0000-0002-8290-3849), Barry, James (ORCID: 0000-0003-3051-585X) and Foster, Jennifer (ORCID: 0000-0002-7789-4853) (2020) Treebank embedding vectors for out-of-domain dependency parsing. In: 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), 05-10 Jul 2020, Online (virtual conference).

Abstract
A recent advance in monolingual dependency parsing is the idea of a treebank embedding vector, which allows all treebanks for a particular language to be used as training data while at the same time allowing the model to prefer training data from one treebank over others and to select the preferred treebank at test time. We build on this idea by 1) introducing a method to predict a treebank vector for sentences that do not come from a treebank used in training, and 2) exploring what happens when we move away from predefined treebank embedding vectors during test time and instead devise tailored interpolations. We show that 1) there are interpolated vectors that are superior to the predefined ones, and 2) treebank vectors can be predicted with sufficient accuracy, for nine out of ten test languages, to match the performance of an oracle approach that knows the most suitable predefined treebank embedding for the test set.
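As a rough illustration of the "tailored interpolations" mentioned in the abstract, the sketch below forms a weighted combination of predefined treebank embedding vectors. It is only a minimal sketch under stated assumptions: the function name `interpolate_treebank_vectors`, the treebank labels `tb_a` and `tb_b`, and the two-dimensional vectors are hypothetical, and the abstract does not specify how the weights are chosen or constrained in the paper.

```python
from typing import Dict, List


def interpolate_treebank_vectors(
    vectors: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the weighted sum of predefined treebank embedding vectors.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if set(vectors) != set(weights):
        raise ValueError("weights must cover exactly the same treebanks as vectors")
    dims = {len(vec) for vec in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all treebank vectors must have the same dimension")

    result = [0.0] * dims.pop()
    for name, vec in vectors.items():
        weight = weights[name]
        for i, value in enumerate(vec):
            result[i] += weight * value
    return result


# Assumed toy embeddings for two treebanks of the same language.
predefined = {"tb_a": [1.0, 0.0], "tb_b": [0.0, 1.0]}

# An interpolated vector that leans towards tb_b.
mixed = interpolate_treebank_vectors(predefined, {"tb_a": 0.25, "tb_b": 0.75})
```

In a parser that takes a treebank vector as input, a vector such as `mixed` would stand in for one of the predefined embeddings at test time. The weights are deliberately left unconstrained here, since whether to restrict them is a design choice the abstract leaves open.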
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: dependency parsing
Subjects: Computer Science > Computational linguistics
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Published in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics (ACL).
Publisher: Association for Computational Linguistics (ACL)
Official URL: http://dx.doi.org/10.18653/v1/2020.acl-main.778
Copyright Information: © 2020 The Authors CC-BY-4.0
Funders: Science Foundation Ireland (SFI) Research Centres Programme (Grant 13/RC/2106), European Regional Development Fund
ID Code: 24861
Deposited On: 27 Aug 2020 14:05 by Joachim Wagner. Last Modified: 27 Aug 2020 14:05