Ganguly, Debasis ORCID: 0000-0003-0050-7138 and Jones, Gareth J.F. ORCID: 0000-0002-4033-9135 (2014) DCU@FIRE-2014: an information retrieval approach for source code plagiarism detection. In: Forum for Information Retrieval Evaluation (FIRE 2014) workshop, 5-7 Dec 2014, Bangalore, India.
Abstract
This paper investigates an information retrieval (IR) based approach for source code plagiarism detection. Exhaustively checking pairwise similarities between documents does not scale to large collections of source code documents. To make source code plagiarism detection fast and scalable in practice, we propose an IR-based approach in which each document is treated as a pseudo-query to retrieve a list of candidate documents in decreasing order of similarity. A threshold is then applied to the relative similarity decrement ratios to report a set of documents as potential cases of source code reuse. Instead of treating source code as an unstructured text document, we explore term extraction from the annotated parse tree of the source code and use a field-based language model for indexing and retrieval of source code documents. Results confirm that source code parsing plays a vital role in improving plagiarism prediction accuracy.
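The pseudo-query and thresholding steps described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: TF-IDF cosine similarity stands in for the field-based language model, a regular-expression tokenizer stands in for parse-tree term extraction, and the "relative similarity decrement ratio" is read as `(s_i - s_{i+1}) / s_i` between consecutive ranks. All function names and the `max_drop` parameter are hypothetical. The sketch scores every document against each pseudo-query for simplicity, whereas a real system would use an inverted index to avoid that cost.

```python
import math
import re
from collections import Counter


def tokenize(source):
    """Split source text into identifier-like tokens (assumption: no parsing)."""
    return re.findall(r"[A-Za-z_]\w*", source)


def tfidf_vectors(docs):
    """Map each document name to a sparse TF-IDF vector (term -> weight)."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    df = Counter()
    for c in counts.values():
        df.update(c.keys())  # document frequency of each term
    n = len(docs)
    return {
        name: {t: tf * math.log(1 + n / df[t]) for t, tf in c.items()}
        for name, c in counts.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def rank(query, vectors):
    """Use one document as a pseudo-query; rank the others by similarity."""
    scored = [
        (name, cosine(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def cut_at_drop(ranked, max_drop=0.5):
    """Report documents ranked above the first large relative similarity drop.

    Assumption: if no consecutive pair drops by more than max_drop, nothing
    separates suspicious documents from the rest, so nothing is reported.
    """
    for i in range(len(ranked) - 1):
        current, following = ranked[i][1], ranked[i + 1][1]
        if current <= 0:
            break
        if (current - following) / current > max_drop:
            return [name for name, _ in ranked[: i + 1]]
    return []


def find_reuse(docs, max_drop=0.5):
    """Run every document as a pseudo-query and collect its reported matches."""
    vectors = tfidf_vectors(docs)
    return {name: cut_at_drop(rank(name, vectors), max_drop) for name in docs}
```

Calling `find_reuse` with a dictionary that maps file names to source text returns, for each file, the names of the files ranked above the first sharp drop in similarity. The value of `max_drop` is an illustrative default and would need tuning on real data.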
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Workshop |
Refereed: | Yes |
Uncontrolled Keywords: | Source Code Plagiarism Detection; Field Search |
Subjects: | Computer Science > Information retrieval |
DCU Faculties and Centres: | Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Published in: | Proceedings of FIRE 2014. |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Science Foundation Ireland |
ID Code: | 20382 |
Deposited On: | 15 Jan 2015 14:59 by Gareth Jones. Last Modified 25 Oct 2018 08:54 |
Documents
Full text available as: PDF (219kB)