DORAS | DCU Research Repository

Imbal-OL: Online Machine Learning from Imbalanced Data Streams in Real-world IoT

Sudharsan, Bharath (ORCID: 0000-0001-5906-113X), Breslin, John G. (ORCID: 0000-0001-5790-050X) and Ali, Muhammad Intizar (ORCID: 0000-0002-0674-2131) (2021) Imbal-OL: Online Machine Learning from Imbalanced Data Streams in Real-world IoT. In: 2021 IEEE International Conference on Big Data (Big Data), 15-18 December 2021, Orlando, FL, USA.

Abstract
Typically, a Neural Network (NN) is trained in data centers using historical datasets, then a C source file of the trained model (the model as a char array) is generated and flashed onto IoT devices. This standard process limits the flexibility of billions of deployed ML-powered devices: their intelligence is static, so they cannot learn unseen or fresh data patterns and cannot adapt to dynamic scenarios. To address this issue, Online Machine Learning (OL) algorithms are currently deployed on IoT devices, giving them the ability to re-train themselves locally by continuously updating the last few NN layers using unseen data patterns encountered after deployment. In OL, catastrophic forgetting is common when NNs are trained on non-stationary data distributions. The majority of recent work in the OL domain implicitly assumes that the distribution of local training data is balanced. In practice, however, sensor data streams in real-world IoT are severely imbalanced and temporally correlated. This paper introduces Imbal-OL, a resource-friendly technique that can be used as an OL plugin to balance the size of classes in a range of data streams. When an Imbal-OL processed stream is used for OL, the models can adapt faster to changes in the stream while also preventing catastrophic forgetting. Experimental evaluation of Imbal-OL using CIFAR datasets over ResNet-18 demonstrates its ability to deal with imperfect data streams, as it manages to produce high-quality models even under challenging learning settings.
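The abstract does not spell out how Imbal-OL balances classes, so the following is only a generic illustration of the idea of balancing class sizes in a stream, not the paper's algorithm. It is a minimal sketch of a fixed-size buffer that evicts from the largest stored class when an under-represented class arrives; the class name `BalancedStreamBuffer`, its methods, and the toy stream are all hypothetical.

```python
import random
from collections import defaultdict


class BalancedStreamBuffer:
    """Fixed-size buffer that keeps per-class counts roughly equal.

    Illustrative assumption only; this is NOT the Imbal-OL algorithm.
    """

    def __init__(self, capacity, seed=0):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.by_class = defaultdict(list)  # label -> stored samples
        self.seen = defaultdict(int)       # label -> samples seen so far
        self.rng = random.Random(seed)

    def __len__(self):
        return sum(len(v) for v in self.by_class.values())

    def counts(self):
        return {c: len(v) for c, v in self.by_class.items() if v}

    def add(self, sample, label):
        self.seen[label] += 1
        if len(self) < self.capacity:
            # Buffer not full yet: store everything.
            self.by_class[label].append(sample)
            return
        largest = max(self.by_class, key=lambda c: len(self.by_class[c]))
        if len(self.by_class[label]) < len(self.by_class[largest]):
            # Incoming class is under-represented: evict a random sample
            # from the largest class to make room for it.
            victims = self.by_class[largest]
            victims.pop(self.rng.randrange(len(victims)))
            self.by_class[label].append(sample)
        else:
            # Incoming class is already among the largest: replace one of
            # its own samples with reservoir-style probability.
            stored = self.by_class[label]
            if self.rng.random() < len(stored) / self.seen[label]:
                stored[self.rng.randrange(len(stored))] = sample


# Toy stream (hypothetical): severely imbalanced and temporally correlated,
# i.e. all samples of one class arrive before the next class starts.
stream = [(f"x{i}", 0) for i in range(900)]
stream += [(f"y{i}", 1) for i in range(90)]
stream += [(f"z{i}", 2) for i in range(10)]

buf = BalancedStreamBuffer(capacity=30)
for sample, label in stream:
    buf.add(sample, label)
print(buf.counts())
```

Although class 0 outnumbers class 2 by 90 to 1 in this toy stream, the buffer holds the three classes in equal proportion at the end, which is the kind of balanced input the abstract says an OL method would then train on.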
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: IoT Devices; TinyML; Online Learning; Imbalanced Data; Class Balancing
Subjects: Engineering > Electronic engineering
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering
Published in: 2021 IEEE International Conference on Big Data (Big Data). IEEE.
Publisher: IEEE
Official URL: https://doi.org/10.1109/BigData52589.2021.9671765
Copyright Information: © 2021 IEEE
ID Code: 27222
Deposited On: 18 Apr 2023 15:32 by Muhammad Intizar Ali. Last Modified: 18 Apr 2023 15:32
Documents

Full text available as:

Imbal-ol.pdf (PDF, 478kB)
Creative Commons: Attribution-Noncommercial-Share Alike 4.0