Mahajani, Siddhant Jaydeep, Srivastava, Shashank and Smeaton, Alan F. ORCID: 0000-0003-1028-8389 (2023) A comparison of lexicon-based and ML-based sentiment analysis: are there outlier words? In: 31st Irish Conference on Artificial Intelligence and Cognitive Science, 7-8 Dec 2023, Letterkenny, Ireland. (In Press)
Abstract
Lexicon-based approaches to sentiment analysis of text are based on each word or lexical entry having a pre-defined weight indicating its sentiment polarity. These weights are usually manually assigned, but their accuracy when compared against machine learning based approaches to computing sentiment is not known. It may be that there are lexical entries whose sentiment values cause a lexicon-based approach to give results which are very different to a machine learning approach. In this paper we compute sentiment for more than 150,000 English language texts drawn from 4 domains using the Hedonometer, a lexicon-based technique, and Azure, a contemporary machine-learning based approach which is part of the Azure Cognitive Services family of APIs and is easy to use. We model differences in sentiment scores between approaches for documents in each domain using a regression and analyse the independent variables (Hedonometer lexical entries) as indicators of each word's importance and contribution to the score differences. Our findings are that the importance of a word depends on the domain and there are no standout lexical entries which systematically cause differences in sentiment scores.
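The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny `LEXICON`, its weights, the whitespace tokeniser, and the function names are all hypothetical, and the machine-learning scores are assumed to be supplied already on the same scale as the lexicon scores rather than fetched from any real API. The regression here is plain ordinary least squares via NumPy; the paper only says "a regression", so the exact model is an assumption.

```python
import numpy as np

# Hypothetical toy lexicon: word -> sentiment weight (illustrative values only).
LEXICON = {"love": 8.4, "good": 7.2, "bad": 2.6, "hate": 2.3, "day": 6.2}
VOCAB = sorted(LEXICON)
WEIGHTS = np.array([LEXICON[w] for w in VOCAB])


def word_counts(text):
    """Count occurrences of each lexicon word using a naive whitespace tokeniser."""
    tokens = text.lower().split()
    return np.array([tokens.count(w) for w in VOCAB], dtype=float)


def lexicon_score(text):
    """Frequency-weighted average of lexicon weights; None if no lexicon word occurs."""
    counts = word_counts(text)
    total = counts.sum()
    if total == 0:
        return None
    return float(counts @ WEIGHTS / total)


def fit_difference_model(texts, ml_scores):
    """Regress (lexicon score - ML score) on per-document lexicon word counts.

    Returns one coefficient per lexicon word; a large magnitude suggests the
    word is associated with disagreement between the two approaches.
    ml_scores are assumed to be pre-computed and on the lexicon's scale.
    """
    rows, diffs = [], []
    for text, ml in zip(texts, ml_scores):
        lex = lexicon_score(text)
        if lex is None:  # skip documents with no lexicon coverage
            continue
        rows.append(word_counts(text))
        diffs.append(lex - ml)
    # Prepend an intercept column, then solve by ordinary least squares.
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(X, np.array(diffs), rcond=None)
    return dict(zip(VOCAB, coef[1:]))
```

Fitting one such model per domain and comparing the coefficients across domains would mirror the per-domain analysis the abstract describes, under the assumptions stated above.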
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Sentiment Analysis |
Subjects: | Computer Science > Artificial intelligence; Computer Science > Computational linguistics; Computer Science > Machine learning |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > INSIGHT Centre for Data Analytics |
Published in: | Proceedings of the 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS. IEEE. |
Publisher: | IEEE |
Copyright Information: | © 2023 The Authors. |
Funders: | Science Foundation Ireland Grant Number SFI/12/RC/2289 P2, co-funded by the European Regional Development Fund |
ID Code: | 29204 |
Deposited On: | 07 Dec 2023 12:43 by Alan Smeaton. Last Modified 07 Dec 2023 12:43 |
Documents
Full text available as:
PDF (1MB). Creative Commons: Attribution-Noncommercial 4.0