
DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU


A comparison of lexicon-based and ML-based sentiment analysis: are there outlier words?

Mahajani, Siddhant Jaydeep, Srivastava, Shashank and Smeaton, Alan F. (ORCID: 0000-0003-1028-8389) (2023) A comparison of lexicon-based and ML-based sentiment analysis: are there outlier words? In: 31st Irish Conference on Artificial Intelligence and Cognitive Science, 7-8 Dec 2023, Letterkenny, Ireland. (In Press)

Abstract
Lexicon-based approaches to sentiment analysis of text are based on each word or lexical entry having a pre-defined weight indicating its sentiment polarity. These weights are usually manually assigned, but their accuracy when compared against machine learning based approaches to computing sentiment is not known. It may be that there are lexical entries whose sentiment values cause a lexicon-based approach to give results which are very different to those of a machine learning approach. In this paper we compute sentiment for more than 150,000 English language texts drawn from 4 domains using the Hedonometer, a lexicon-based technique, and Azure, a contemporary and easy-to-use machine learning based approach which is part of the Azure Cognitive Services family of APIs. We model differences in sentiment scores between the two approaches for documents in each domain using a regression and analyse the independent variables (Hedonometer lexical entries) as indicators of each word's importance and contribution to the score differences. Our findings are that the importance of a word depends on the domain and that there are no standout lexical entries which systematically cause differences in sentiment scores.
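The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a lexicon score is the frequency-weighted average of per-word weights on a 1 to 9 scale, that the machine learning score is already available as a number between 0 and 1, and that the regression target is the difference between the two after rescaling. The lexicon values, the function names and the rescaling step are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Toy lexicon: word -> sentiment weight on an assumed 1-9 scale.
# These values are invented for illustration; they are not real lexicon entries.
TOY_LEXICON = {"happy": 8.0, "good": 7.0, "bad": 3.0, "terrible": 2.0}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_counts(text, lexicon):
    """Count how often each lexicon word occurs in the text."""
    return Counter(tok for tok in tokenize(text) if tok in lexicon)


def lexicon_score(text, lexicon):
    """Frequency-weighted average of the weights of the lexicon words found.

    Returns None when the text contains no lexicon words.
    """
    counts = lexicon_counts(text, lexicon)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(lexicon[word] * n for word, n in counts.items()) / total


def rescale(score, low=1.0, high=9.0):
    """Map a score from the assumed [low, high] lexicon scale onto [0, 1]."""
    return (score - low) / (high - low)


def regression_row(text, ml_score, lexicon):
    """Build one (features, target) pair for a regression on score differences.

    features: counts of each lexicon word, in sorted word order
    target:   rescaled lexicon score minus the supplied ML score
    Returns None when the text has no lexicon words.
    """
    score = lexicon_score(text, lexicon)
    if score is None:
        return None
    counts = lexicon_counts(text, lexicon)
    features = [counts.get(word, 0) for word in sorted(lexicon)]
    return features, rescale(score) - ml_score
```

Rows built this way for every document in a domain could be passed to any linear regression routine. The fitted coefficient of each word would then indicate how strongly that word is associated with disagreement between the two scores, which is the kind of per-word importance the abstract refers to.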
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Sentiment Analysis
Subjects: Computer Science > Artificial intelligence
Computer Science > Computational linguistics
Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Initiatives and Centres > INSIGHT Centre for Data Analytics
Published in: Proceedings of the 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS. IEEE.
Publisher: IEEE
Copyright Information: © 2023 The Authors.
Funders: Science Foundation Ireland Grant Number SFI/12/RC/2289 P2, co-funded by the European Regional Development Fund
ID Code: 29204
Deposited On: 07 Dec 2023 12:43 by Alan Smeaton. Last Modified 07 Dec 2023 12:43
Documents

Full text available as:

2023309500.pdf (PDF, 1MB)
Creative Commons: Attribution-Noncommercial 4.0