Abstract

DNN inference powers applications ranging from real-time speech recognition and language translation to autonomous vehicle navigation and personalized content recommendation, enabling quick, data-driven decisions. Consequently, these models are increasingly deployed on edge devices, facilitating ubiquitous computing and closer interaction with end users. Edge devices, however, are typically constrained in compute resources and often run in battery-operated environments, which makes sustainable deployment of deep learning models essential. This work evaluates the energy impact of device and workload parameters for DNN inference on edge devices. We study the energy and power consumption of inference for six different deep-learning workloads on two different Jetson-class edge devices. We identify device and workload parameters that affect the energy consumption of DNN inference and show how tuning these parameters can yield energy savings without compromising accuracy, providing valuable insights for researchers and practitioners working on sustainable edge deployments of DNN workloads.

The experiments were conducted on the Jetson Nano and Jetson Orin, commonly used System-on-Chip edge devices with onboard GPUs for DNN inference. Both devices offer a large number of operating configurations, which complicates the problem of finding the optimal one. We vary device parameters such as CPU and GPU frequency, and workload parameters such as batch size, number of layers, model initialization, and quantization. We also compare against Dynamic Voltage and Frequency Scaling (DVFS) governors, the default power management technique in modern systems, which dynamically adjusts a processor's voltage and frequency based on workload demand to conserve energy. Our results indicate that tuning the CPU and GPU frequency can lead to significant energy savings, to the tune of 19% over DVFS. We also find that workload parameters such as batch size and model settings such as graph initialization can impact inference energy.
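As context for the frequency-tuning experiments summarized above, the sketch below shows one way CPU and GPU frequencies can be pinned on a Jetson through the standard Linux cpufreq and devfreq sysfs interfaces, bypassing the DVFS governor. This is a minimal illustration under stated assumptions, not the dissertation's measurement harness: the GPU devfreq path shown (the 57000000.gpu node) comes from a Jetson Nano image and differs across JetPack releases and between Nano and Orin, and the frequency values are example operating points.

```python
#!/usr/bin/env python3
"""Minimal sketch: pinning CPU/GPU frequencies on a Jetson via sysfs.

Assumptions: the userspace cpufreq governor is available in the kernel,
and the GPU devfreq node path matches a Jetson Nano image. Verify paths
under /sys/class/devfreq/ on your device. Requires root.
"""

CPU_FREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"
# Illustrative Jetson Nano GPU devfreq node; path differs on Orin.
GPU_FREQ_DIR = "/sys/devices/57000000.gpu/devfreq/57000000.gpu"


def write_sysfs(path: str, value: str) -> None:
    with open(path, "w") as f:
        f.write(value)


def pin_cpu_freq(khz: int) -> None:
    # The userspace governor accepts an explicit frequency instead of
    # letting the DVFS governor (e.g., schedutil) choose one.
    write_sysfs(f"{CPU_FREQ_DIR}/scaling_governor", "userspace")
    write_sysfs(f"{CPU_FREQ_DIR}/scaling_setspeed", str(khz))


def pin_gpu_freq(hz: int) -> None:
    # Clamping min_freq and max_freq to the same value effectively
    # disables GPU DVFS for the duration of an experiment.
    write_sysfs(f"{GPU_FREQ_DIR}/min_freq", str(hz))
    write_sysfs(f"{GPU_FREQ_DIR}/max_freq", str(hz))


if __name__ == "__main__":
    pin_cpu_freq(1_428_000)    # example CPU operating point (kHz)
    pin_gpu_freq(921_600_000)  # example GPU operating point (Hz)
```

Sweeping such pinned operating points and measuring energy per inference is the kind of exploration the abstract's frequency-tuning results refer to; restoring the default governor afterwards returns the device to stock DVFS behavior.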

Year

12-4-2023

Document Type

Dissertation

Keywords

Energy efficiency, edge computing, DNN inference, measurements

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

Advisor

Anshul Gandhi
