Rehbein, Ines and van Genabith, Josef (2007) Evaluating evaluation measures. In: NODALIDA 2007 - 16th Nordic Conference on Computational Linguistics, 25-26 May 2007, Tartu, Estonia.
Abstract
This paper presents a thorough examination of the validity of three evaluation measures on parser output. We assess the performance of an unlexicalised probabilistic parser trained on two German treebanks with different annotation schemes and evaluate the parsing results using the PARSEVAL metric, the Leaf-Ancestor metric and a dependency-based evaluation. We reject the claim that the TüBa-D/Z annotation scheme is more adequate than the TIGER scheme for PCFG parsing and show that PARSEVAL should not be used to compare the performance of parsers trained on treebanks with different annotation schemes. An analysis of specific error types indicates that the dependency-based evaluation is the most appropriate reflection of parse quality.
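As a rough illustration of the first of these measures, the following is a minimal sketch (not taken from the paper) of how PARSEVAL-style labelled bracketing precision, recall and F-score are commonly computed. The span tuples and the `parseval_scores` helper are illustrative assumptions, not the authors' implementation, and the extraction of bracket spans from actual treebank trees is omitted.

```python
# Sketch of PARSEVAL-style labelled bracketing evaluation, assuming
# each tree has already been reduced to a list of (label, start, end)
# constituent spans. Names here are hypothetical, for illustration only.
from collections import Counter

def parseval_scores(gold_brackets, test_brackets):
    """Labelled precision, recall and F1 over bracket multisets."""
    gold = Counter(gold_brackets)
    test = Counter(test_brackets)
    matched = sum((gold & test).values())  # multiset intersection
    precision = matched / sum(test.values()) if test else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: one gold tree vs. one parser output, spans as (label, i, j)
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5)]
test = [("S", 0, 5), ("NP", 0, 2), ("PP", 2, 5)]
print(parseval_scores(gold, test))  # roughly (0.67, 0.67, 0.67)
```

Because such bracket counts are tied directly to the shape of the annotation scheme, scores computed this way are not directly comparable across treebanks with different schemes, which is the point the abstract makes.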
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Initiatives and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL: http://dspace.utlib.ee/dspace/handle/10062/2606
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
ID Code: 15237
Deposited On: 19 Feb 2010 15:47 by DORAS Administrator. Last Modified 19 Jul 2018 14:50
Documents
Full text available as: PDF (633kB)