Scriney, Michael ORCID: 0000-0001-6813-2630 (2018) Constructing data marts from web sources using a graph common model. PhD thesis, Dublin City University.
Abstract
At a time when humans and devices are generating more information than ever, activities such as data mining and machine learning become crucial. These activities enable us to understand and interpret the information we have and to predict, or better prepare ourselves for, future events. However, activities such as data mining cannot be performed without a layer of data management to clean, integrate, process and make available the necessary datasets. To that end, large and costly data flow processes such as Extract-Transform-Load are necessary to extract data from disparate information sources and generate ready-for-analysis datasets. These datasets are generally in the form of multi-dimensional cubes from which different data views can be extracted for the purpose of different analyses. The process of creating a multi-dimensional cube from integrated data sources is a significant undertaking. In this research, we present a methodology to generate these cubes automatically or, in some cases, close to automatically, requiring very little user interaction. A construct called a StarGraph acts as a canonical model for our system, to which imported data sources are transformed. An ontology-driven process controls the integration of StarGraph schemas, and simple OLAP-style functions generate the cubes or datasets. An extensive evaluation is carried out using a large number of agri data sources, with user-defined case studies to identify sources for integration and the types of analyses required for the final data cubes.
Metadata
Item Type: | Thesis (PhD) |
---|---|
Date of Award: | November 2018 |
Refereed: | No |
Supervisor(s): | Roantree, Mark |
Uncontrolled Keywords: | Data Analytics; Data Warehousing; Data Integration; ETL |
Subjects: | Computer Science > Software engineering |
DCU Faculties and Centres: | Research Initiatives and Centres > INSIGHT Centre for Data Analytics; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License. |
Funders: | This work is supported by Science Foundation Ireland under grant number [SFI/12/RC/2289] |
ID Code: | 22387 |
Deposited On: | 21 Nov 2018 10:08 by Michael John Scriney. Last Modified 23 Aug 2019 08:57 |