Towards Deployment of Deep Neural Networks on Resource-constrained Embedded Systems

Author: Boyu Zhang
Release: 2019
ISBN-10: OCLC:1126812113

Book Synopsis Towards Deployment of Deep Neural Networks on Resource-constrained Embedded Systems by: Boyu Zhang

Download or read book Towards Deployment of Deep Neural Networks on Resource-constrained Embedded Systems, written by Boyu Zhang and released in 2019. Available in PDF, EPUB and Kindle. Book excerpt: Deep Neural Networks (DNNs) have emerged as an important computational structure that facilitates important tasks such as speech recognition, image recognition, and autonomous vehicles. To achieve better performance, such as higher classification accuracy, modern DNN models are designed to be more complex in network structure and larger in number of weights. This poses a great challenge for realizing DNN models on computation devices, especially resource-constrained devices such as embedded and mobile systems. The challenge arises from three aspects: computation, memory, and energy consumption. First, the number of computations per inference required by modern large and complex DNN models is huge, whereas the computation capability of the target system may be far below that of a modern GPU or a dedicated processing unit, so accomplishing the required computation within a given latency budget is an open challenge. Second, the conflict between the limited on-board memory and the static/run-time memory requirements of large DNN models also needs to be resolved. Third, the energy-intensive inference process places a heavy burden on an edge device's battery life. Since the majority of the total energy is consumed by data movement, the goal is not only to fit the DNN model into the system but also to optimize off-chip memory access in order to minimize energy consumption during inference. This dissertation makes contributions towards efficient realizations of DNN models on resource-constrained systems, in three aspects.
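To make the computation and memory challenge concrete, the per-inference multiply-accumulate (MAC) count and weight footprint of a network can be tallied layer by layer. The sketch below uses hypothetical layer shapes chosen for illustration, not any model from the dissertation:

```python
# Illustrative per-inference cost estimate for a small CNN.
# All layer shapes here are hypothetical placeholders.

def conv_cost(h, w, c_in, c_out, k):
    """MACs and weight count for a k x k convolution producing an h x w map."""
    macs = h * w * c_out * (k * k * c_in)  # one MAC per kernel element per output pixel
    weights = c_out * c_in * k * k
    return macs, weights

def fc_cost(n_in, n_out):
    """MACs and weight count for a fully connected layer."""
    return n_in * n_out, n_in * n_out

layers = [
    conv_cost(32, 32, 3, 64, 3),    # conv1: 32x32 output, 3 -> 64 channels
    conv_cost(16, 16, 64, 128, 3),  # conv2: 16x16 output, 64 -> 128 channels
    fc_cost(128 * 8 * 8, 10),       # classifier over a flattened 8x8 map
]

total_macs = sum(m for m, _ in layers)
total_weights = sum(w for _, w in layers)
print(f"MACs per inference: {total_macs:,}")
print(f"Weight memory (float32): {total_weights * 4 / 1024:.1f} KiB")
```

Even this toy network needs tens of millions of MACs and hundreds of kibibytes of weight storage per inference; real models such as VGG-style CNNs are orders of magnitude larger, which is the gap the dissertation's pruning techniques target.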
First, we propose a structure-simplification procedure that identifies and eliminates redundant neurons in any layer of a trained DNN model. Once the redundant neurons are identified and removed, the edges connected to those neurons are eliminated as well. The new weight matrix is then computed directly by our procedure, and retraining may be applied to further recover the lost accuracy if necessary. We also propose a high-level energy model to better explore the design-space tradeoffs during neuron elimination. Since both the neurons and their edges are eliminated, the memory and energy requirements are alleviated as well. Furthermore, the procedure allows exploring the tradeoff between model performance and implementation cost.

Second, since the convolutional layer is the most energy-consuming and computation-heavy layer in Convolutional Neural Networks (CNNs), we propose a structural pruning technique that prunes the input channels of convolutional layers. Once the redundant channels are identified and removed, the corresponding convolutional filters are pruned as well, so significant reductions in static/run-time memory, computation, and energy consumption can be achieved. Moreover, the resulting pruned model is more efficient by virtue of its network architecture rather than specific weight values, which makes the theoretical reductions in implementation cost much easier to harvest with existing hardware and software.

Third, instead of blindly sending data to the cloud and relying on the cloud to perform inference, we propose to utilize the computation power of IoT devices to accomplish deep learning tasks while achieving a higher degree of customization and privacy. Specifically, we propose to incorporate a small local customized DNN model alongside a large general DNN model using a "Mixture of Experts" architecture.
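The structural channel pruning described for convolutional layers can be sketched with numpy. The layer shapes and the L1-norm selection heuristic below are assumptions for illustration, not the dissertation's actual criterion; the point is the structural coupling: removing an input channel of one layer also removes the filter of the preceding layer that produced it, so both weight tensors shrink.

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights of two consecutive conv layers, laid out (out_ch, in_ch, k, k).
w_prev = rng.normal(size=(16, 8, 3, 3))   # produces 16 feature maps
w_next = rng.normal(size=(32, 16, 3, 3))  # consumes those 16 maps

# Score each input channel of w_next by the L1 norm of the filter
# slices that read from it (a simple illustrative heuristic).
scores = np.abs(w_next).sum(axis=(0, 2, 3))  # one score per input channel
keep = np.sort(np.argsort(scores)[-12:])     # keep the strongest 12 of 16

# Pruning an input channel of w_next also removes the filter in w_prev
# that produced the corresponding feature map.
w_next_pruned = w_next[:, keep, :, :]
w_prev_pruned = w_prev[keep, :, :, :]

print(w_prev_pruned.shape)  # (12, 8, 3, 3)
print(w_next_pruned.shape)  # (32, 12, 3, 3)
```

Because whole channels and filters disappear, the smaller tensors can be used directly by any existing convolution kernel, which is why structural pruning is easier to exploit on stock hardware and software than unstructured per-weight sparsity.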
Therefore, with minimal implementation overhead, customized data can be handled by the small DNN to achieve better performance without compromising performance on general data. Our experiments show that the MoE architecture outperforms popular alternatives such as fine-tuning, bagging, independent ensembles, and multiple-choice learning.
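A minimal sketch of the "Mixture of Experts" combination described above: a gating network produces per-input mixture weights that blend the class distributions of a large general expert and a small customized expert. The random weight matrices below are hypothetical placeholders standing in for trained models.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Placeholder linear "experts"; in practice these are trained DNNs.
W_general = rng.normal(size=(64, 10))  # large general expert
W_custom = rng.normal(size=(64, 10))   # small on-device customized expert
W_gate = rng.normal(size=(64, 2))      # gating network over the two experts

x = rng.normal(size=(5, 64))           # batch of 5 input feature vectors

p_general = softmax(x @ W_general)     # general expert's class distribution
p_custom = softmax(x @ W_custom)       # customized expert's class distribution
g = softmax(x @ W_gate)                # per-input mixture weights, shape (5, 2)

# Final prediction: per-input convex blend of the two experts' outputs,
# so customized inputs can lean on the small expert without degrading
# the general expert's behavior on ordinary inputs.
y = g[:, :1] * p_general + g[:, 1:] * p_custom
print(y.shape)  # (5, 10)
```

Since the gate's weights are convex and each expert outputs a probability distribution, every row of `y` still sums to one; only the small expert and the gate need to live on the device.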


Towards Deployment of Deep Neural Networks on Resource-constrained Embedded Systems Related Books

Embedded Deep Learning
Language: en
Pages: 206
Authors: Bert Moons
Categories: Technology & Engineering
Type: BOOK - Published: 2018-10-23 - Publisher: Springer


This book covers algorithmic and hardware implementation techniques to enable embedded deep learning. The authors describe synergetic design approaches on the …
Embedded Artificial Intelligence
Language: en
Pages: 143
Authors: Ovidiu Vermesan
Categories: Computers
Type: BOOK - Published: 2023-05-05 - Publisher: CRC Press


Recent technological developments in sensors, edge computing, connectivity, and artificial intelligence (AI) technologies have accelerated the integration of data …
Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions
Language: en
Pages: 872
Authors: Igor V. Tetko
Categories: Computers
Type: BOOK - Published: 2019-09-10 - Publisher: Springer Nature


The proceedings set LNCS 11727, 11728, 11729, 11730, and 11731 constitute the proceedings of the 28th International Conference on Artificial Neural Networks, ICANN 2019 …
Advances in Signal Processing and Intelligent Recognition Systems
Language: en
Pages: 384
Authors: Sabu M. Thampi
Categories: Computers
Type: BOOK - Published: 2021-02-06 - Publisher: Springer Nature


This book constitutes the refereed proceedings of the 6th International Symposium on Advances in Signal Processing and Intelligent Recognition Systems, SIRS 2020 …