In the era of data-intensive edge computing, the orchestration of Data Distributed Inferencing (DDI) tasks poses a formidable challenge, demanding real-time adaptability to varying network conditions and compute resources. This study introduces an innovative approach to address this challenge, leveraging Gradient Boosting Regression (GBR) as the core predictive modeling technique. The primary objective is to estimate inferencing time based on crucial factors, including bandwidth, compute device type, and the number of compute nodes, allowing for dynamic task placement and optimization in a DDI environment. Our model employs an online learning framework, continuously updating itself as new data streams in, enabling it to swiftly adapt to changing conditions and consistently deliver accurate inferencing time predictions. This research marks a significant step forward in enhancing the efficiency and performance of DDI systems, with implications for real-world applications across various domains, including IoT, edge computing, and distributed machine learning.
KEYWORDS: Data modeling, Data compression, Internet of things, Machine learning, Performance modeling, Random forests, Instrument modeling, Cross validation, Image compression, Mathematical optimization
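The abstract above does not include code, so the following is only a minimal sketch of the online-updating GBR idea it describes, assuming a scikit-learn workflow. The feature set (bandwidth, encoded device type, node count), the synthetic data, and the warm-start update strategy are illustrative assumptions, not the authors' implementation.

```python
# Sketch: gradient-boosted prediction of inferencing time with
# incremental updates as new telemetry streams in (assumed setup).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def make_batch(n):
    """Synthetic DDI telemetry: bandwidth (Mbps), device type id, node count."""
    bw = rng.uniform(1, 100, n)          # available bandwidth
    dev = rng.integers(0, 3, n)          # encoded compute device type
    nodes = rng.integers(1, 9, n)        # number of compute nodes
    X = np.column_stack([bw, dev, nodes])
    # Hypothetical ground-truth inferencing time (seconds)
    y = 5.0 / bw * (1 + dev) + 2.0 / nodes + rng.normal(0, 0.05, n)
    return X, y

# warm_start lets the ensemble grow as data arrives, one way to
# approximate the continuous-update behaviour the abstract describes.
model = GradientBoostingRegressor(n_estimators=50, warm_start=True)
X, y = make_batch(500)
model.fit(X, y)

for _ in range(3):                       # simulated data stream
    X_new, y_new = make_batch(200)
    X = np.vstack([X, X_new])
    y = np.concatenate([y, y_new])
    model.set_params(n_estimators=model.n_estimators + 25)
    model.fit(X, y)                      # refit: new trees added to old ones

print("predicted inferencing time:", model.predict([[50.0, 1, 4]])[0])
```

A periodic warm-start refit is only one of several ways to realize online learning with tree ensembles; a sliding-window retrain or a streaming learner would serve the same role.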
Internet of Things (IoT) devices have communication, computation, size, weight, battery life, and power consumption limitations. Machine learning (ML) algorithms on IoT devices suffer from limited computational resources, resulting in poor performance. Techniques such as model pruning, offloading, and data compression can reduce computational, network, and storage costs and improve performance, at the expense of inference quality. This study examines the performance of ML inferencing tasks and adaptations on IoT devices using statistical learning methods such as ridge regression and random forests. We aim to understand the trade-offs of inference adaptations across a range of operational regimes, including constraints on available bandwidth, distance from the data, and available compute. Our results indicate that the task configuration (offload or on-device) and the available bandwidth are the most critical factors in determining inference performance, while the percentage of the model pruned is the least important. These findings demonstrate how statistical learning can be used to better understand the effects of inference adaptations on IoT task performance, and they offer insights into which adaptations provide the largest improvements in ML inference tasks to support real-time requirements (a random-forest sketch follows below).
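Again, no code accompanies the abstract, so the following is a hedged sketch of the kind of feature-importance analysis it describes, using scikit-learn's RandomForestRegressor. The synthetic features and response mirror the factors named above (offload vs. on-device, bandwidth, pruning percentage); the data and functional form are assumptions, not the study's dataset.

```python
# Sketch: ranking which inference adaptations matter most via
# random-forest feature importances (assumed data and setup).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 1000
offload = rng.integers(0, 2, n)          # 1 = offloaded, 0 = on-device
bandwidth = rng.uniform(1, 100, n)       # available bandwidth (Mbps)
pruned_pct = rng.uniform(0, 0.9, n)      # fraction of model pruned

# Hypothetical latency: dominated by task placement and bandwidth,
# barely affected by pruning (matching the ranking the abstract reports).
latency = (offload * 30.0 / bandwidth
           + (1 - offload) * 2.0
           - 0.1 * pruned_pct
           + rng.normal(0, 0.1, n))

X = np.column_stack([offload, bandwidth, pruned_pct])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, latency)

for name, imp in zip(["offload", "bandwidth", "pruned_pct"],
                     rf.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

Impurity-based importances like these are a common first pass; permutation importance on a held-out set is a natural cross-check when features are correlated.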