Abstract
Deep neural networks (DNNs) have been widely applied across many application domains. However, running inference on large DNN models in data center servers consumes substantial energy, so resource-constrained edge devices often offload model inference to remote data center servers to accelerate its execution. Such offloading also increases security threats because of untrustworthy network connections. To mitigate these problems, Edge AI aims to fit DNN models onto resource-constrained edge devices and achieve Green AI computing by lowering the energy consumption of model inference through small DNN models. Unlike desktop computers and servers, edge devices often limit their memory capacity to reduce hardware cost and energy consumption. This limitation raises significant challenges when deploying DNN models on edge devices. Consequently, this talk will introduce our recent research on memory-efficient model compilation for edge AI inference, which reduces memory usage and improves the data reuse rate of the DNN model to accelerate inference while lowering energy consumption on edge devices. Finally, the talk will discuss future work and open challenges for edge AI systems and hardware.
Bio
Tsung Tai Yeh is an associate professor of computer science at National Yang Ming Chiao Tung University, Taiwan. He obtained his Ph.D. from the School of Electrical and Computer Engineering at Purdue University, USA. His research spans computer architecture, computer systems, and programming languages. He received the Lynn Fellowship at Purdue University and worked at AMD Research. His compiler research was nominated for the Best Paper Award at the PPoPP conference and has been published in multiple top-tier conference proceedings (ISCA, ASPLOS, HPCA, PPoPP, NeurIPS).