
DORAS | DCU Research Repository


user2code2vec: embeddings for profiling students based on distributional representations of source code

Azcona, David (ORCID: 0000-0003-3693-7906), Arora, Piyush (ORCID: 0000-0002-4261-2860), Hsiao, I-Han (ORCID: 0000-0002-1888-3951) and Smeaton, Alan F. (ORCID: 0000-0003-1028-8389) (2019) user2code2vec: embeddings for profiling students based on distributional representations of source code. In: The 9th International Learning Analytics & Knowledge Conference, LAK 2019, 4-8 Mar, 2019, Tempe, AZ, USA. ISBN 978-1-4503-6256-6/19/03

Abstract
In this work, we propose a new methodology to profile individual students of computer science based on their programming design using a technique called embeddings. We investigate different approaches to analyze user source code submissions in the Python language. We compare the performances of different source code vectorization techniques to predict the correctness of a code submission. In addition, we propose a new mechanism to represent students based on their code submissions for a given set of laboratory tasks on a particular course. This way, we can make deeper recommendations for programming solutions and pathways to support student learning and progression in computer programming modules effectively at a Higher Education Institution. Recent work using Deep Learning tends to work better when more and more data is provided. However, in Learning Analytics, the number of students in a course is an unavoidable limit. Thus we cannot simply generate more data as is done in other domains such as FinTech or Social Network Analysis. Our findings indicate there is a need to learn and develop better mechanisms to extract and learn effective data features from students so as to analyze the students' progression and performance effectively.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: user2code2vec; code2vec; Code Embeddings; Distributed Representations; Representation Learning for Source Code; Machine Learning; Computer Science Education
Subjects: Computer Science > Artificial intelligence; Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Initiatives and Centres > INSIGHT Centre for Data Analytics; Research Initiatives and Centres > ADAPT
Published in: The 9th International Learning Analytics & Knowledge Conference (LAK19). ACM. ISBN 978-1-4503-6256-6/19/03
Publisher: ACM
Official URL: https://doi.org/10.1145/3303772.3303813
Copyright Information: © 2019 ACM
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Irish Research Council under project number GOIPG/2015/3497, Science Foundation Ireland 12/RC/2289, Science Foundation Ireland 13/RC/2106, Fulbright Ireland
ID Code: 22895
Deposited On: 09 Jan 2019 12:25 by David Azcona. Last Modified: 04 Feb 2020 14:12
Documents

Full text available as:

PDF: Camera_ready_copyright_David_DCU_LAK_2019.pdf (779kB)