Reinforced NMT for sentiment and content preservation in low-resource scenario

Kumari, Divya, Ekbal, Asif ORCID: 0000-0003-3612-8834, Haque, Rejwanul ORCID: 0000-0003-1680-0099, Bhattacharyya, Pushpak ORCID: 0000-0001-5319-5508 and Way, Andy ORCID: 0000-0001-5736-5930 (2021) Reinforced NMT for sentiment and content preservation in low-resource scenario. ACM Transactions on Asian and Low-Resource Language Information Processing, 20 (4). ISSN 2375-4699

Abstract

The preservation of domain knowledge from the source to the target is crucial in any translation workflow. Hence, translation service providers that use machine translation (MT) in production could reasonably expect the translation process to transfer both the underlying pragmatics and the semantics of the source-side sentences into the target language. However, recent studies suggest that MT systems often fail to carry crucial information embedded in the source text (e.g., sentiment, emotion, gender traits) over to the target. In this context, raw automatic translations are often fed directly to other natural language processing (NLP) applications (e.g., a sentiment classifier) in a cross-lingual platform. Hence, the loss of such information during translation can negatively affect the performance of downstream NLP tasks that rely heavily on the output of MT systems. In our current research, we carefully balance both sides (i.e., sentiment and semantics) during translation by controlling a global-attention-based neural MT (NMT) system to generate translations that encode the underlying sentiment of a source sentence while preserving its non-opinionated semantic content. Toward this, we use a state-of-the-art reinforcement learning method, namely actor-critic, that includes a novel reward combination module, to fine-tune the NMT system so that it learns to generate translations that are best suited for a downstream task, viz. sentiment classification, while ensuring that the source-side semantics remain intact in the process. Experimental results for the Hindi–English language pair show that our proposed method significantly improves the performance of the sentiment classifier and also results in an improved NMT system.

Metadata

Item Type:	Article (Published)
Refereed:	Yes
Additional Information:	Article Number: 70
Uncontrolled Keywords:	neural machine translation; sentiment preservation; actor-critic; reinforcement learning; BERT
Subjects:	Computer Science > Machine translating
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > ADAPT
Publisher:	Association for Computing Machinery (ACM)
Official URL:	https://dx.doi.org/10.1145/3450970
Copyright Information:	© 2021 The Authors.
ID Code:	27449
Deposited On:	28 Jul 2022 16:15 by Thomas Murtagh. Last Modified 11 May 2023 13:43

Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution 4.0
695kB

DORAS | DCU Research Repository
