Greevy, Edel and Smeaton, Alan F. ORCID: 0000-0003-1028-8389 (2004) Classifying racist texts using a support vector machine. In: SIGIR 2004 - the 27th Annual International ACM SIGIR Conference, 25-29 July 2004, Sheffield, UK.
Abstract
In this poster we present an overview of the techniques we used to develop and evaluate a text categorisation system that automatically classifies racist texts. Unlike some other text classification tasks, detecting racism is difficult because the presence of indicator words alone is insufficient to identify a racist text. Support Vector Machines (SVM) are used to automatically categorise web pages according to whether or not they are racist. We consider different interpretations of what constitutes a term, and in this poster we look at three representations of a web page within an SVM: bag-of-words, bigrams and part-of-speech tags.
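As an illustration of how the choice of "term" changes a page's representation, the following minimal sketch builds bag-of-words and bigram count vectors from a neutral toy sentence. It is not the authors' implementation: the function names (`tokenize`, `bag_of_words`, `bigrams`, `to_vector`), the regular-expression tokeniser and the sample text are all assumptions made for this example. The part-of-speech representation is omitted because it requires a tagger, and the SVM training step is not shown; the resulting vectors are the kind of input such a classifier would take.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(tokens):
    """Count each single word: every distinct word is one term."""
    return Counter(tokens)


def bigrams(tokens):
    """Count each pair of adjacent words: every distinct pair is one term."""
    return Counter(zip(tokens, tokens[1:]))


def to_vector(counts, vocabulary):
    """Map term counts onto a fixed, ordered vocabulary to get a feature vector."""
    return [counts.get(term, 0) for term in vocabulary]


# Hypothetical toy "page"; a real system would use the text of a web page.
page = "the cat sat on the mat"
tokens = tokenize(page)

bow = bag_of_words(tokens)
bi = bigrams(tokens)

# A sorted vocabulary gives every page the same feature ordering.
vocab = sorted(bow)
vec = to_vector(bow, vocab)
```

The same `to_vector` helper works for bigrams if the vocabulary is built from bigram tuples instead of single words, which is why the two representations can share one classifier pipeline.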
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Text Categorisation/Classification; Machine Learning; Support Vector Machines |
Subjects: | Computer Science > Information retrieval |
DCU Faculties and Centres: | Research Initiatives and Centres > Centre for Digital Video Processing (CDVP) |
Publisher: | Association for Computing Machinery |
Official URL: | http://dx.doi.org/10.1145/1008992.1009074 |
Copyright Information: | © ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution |
ID Code: | 368 |
Deposited On: | 28 Mar 2008 by DORAS Administrator. Last Modified 08 Nov 2018 11:11 |
Documents
Full text available as: PDF (56kB)