Courtney, Michael, Breen, Michael ORCID: 0000-0002-5857-9938, McMenamin, Iain ORCID: 0000-0002-1704-390X and McNulty, Gemma ORCID: 0000-0002-6909-6958 (2020) Automatic Translation, Context, and Supervised Learning in Comparative Politics. Journal of Information Technology and Politics. ISSN 1933-1681
Abstract
This paper proves that automatic translation of multilingual newspaper documents deters neither human nor computer classification of political concepts. We show how theory-driven coding of newspaper text can be automated in several languages by monolingual researchers. Supervised machine learning is successfully applied to text in English from British, Spanish and German sources. The paper has three main findings. First, results from human coding directly in a foreign language do not differ from coding computer-translated text. Second, humans can code translated text as well as they can code untranslated prose in their mother tongue. Third, machine learning based on translated Spanish and German training sets can reproduce human coding as accurately as a system learning from English training sets.
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | automatic translation; supervised learning; google translate; media; newspapers; comparative politics; text analysis; political text |
Subjects: | Computer Science > Machine learning; Computer Science > Machine translating; Humanities > Translating and interpreting; Social Sciences > International relations; Social Sciences > Mass media; Social Sciences > Political science |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Humanities and Social Science > School of Law and Government |
Publisher: | Taylor & Francis |
Official URL: | http://dx.doi.org/10.1080/19331681.2020.1731245 |
Copyright Information: | © 2020 Taylor & Francis |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Irish Research Council Grant Number GOIPD/2016/253. |
ID Code: | 24233 |
Deposited On: | 21 Feb 2020 10:20 by Michael Breen. Last Modified 28 Feb 2022 13:47 |
Documents
Full text available as: PDF (859kB)