Efficient Layer Optimizations for Deep Neural Networks
Shafayat Mowla Anik1, Kevyn Kelso2, Byeong Kil Lee3

1,2,3Department of Electrical and Computer Engineering, University of Colorado Colorado Springs, 1420 Austin Bluffs Parkway, Colorado Springs, USA.

Manuscript received on 25 September 2024 | Revised Manuscript received on 14 October 2024 | Manuscript Accepted on 15 November 2024 | Manuscript published on 30 November 2024 | PP: 20-29 | Retrieval Number: 100.1/ijsce.E365014051124 | DOI: 10.35940/ijsce.E3650.14051124

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Deep neural networks (DNNs) suffer from long training times as network size grows, and their parameters require significant memory, which complicates deployment on embedded devices. Various pruning techniques have been applied to DNNs to reduce network size, but many problems remain when applying them. Among neural networks, autoencoders are widely used for reconstruction and dimensionality reduction; however, network size is a particular disadvantage of autoencoders, since their architecture carries a double workload from the encoding and decoding processes. In this research, we select autoencoders and two deep neural networks, AlexNet and VGG16, to which we apply out-of-order layer pruning. We perform a sensitivity analysis to explore how performance varies with network architecture and network complexity under the out-of-order layer pruning mechanism. By applying the proposed layer pruning scheme to the autoencoder, we develop the accordion autoencoder (A2E) and apply it to credit card fraud detection and MNIST classification. Our results show performance drops of 4.9% and 13.6%, respectively, but with significant reductions in network complexity of 85.1% and 94.5% for the two applications. We then extend out-of-order layer pruning to deeper networks. We propose a simple yet efficient scheme: accuracy-aware structured filter pruning based on the characterization of each convolutional layer, combined with quantization of the fully connected layers. We investigate the accuracy and compression rate of each layer under a fixed pruning ratio, and then rearrange the pruning priority according to each layer's accuracy. Our layer characterization shows that the order in which layers are pruned does affect the final accuracy of the deep neural network. In our experiments with the proposed pruning scheme, the parameter size of AlexNet can be made up to 47.28x smaller than the original model, and we obtain comparable results for VGG16, achieving a maximum compression rate of 35.21x.
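The accuracy-aware pruning procedure summarized above lends itself to a short sketch. The following is a minimal, illustrative PyTorch realization, not the authors' exact implementation: each convolutional layer is pruned in isolation at a fixed ratio, the resulting accuracy is recorded, and the pruning priority is re-ordered so the least accuracy-sensitive layers are pruned first. The evaluate callable and the pruning ratio of 0.5 are illustrative assumptions.

    import copy
    import torch
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def layer_sensitivity(model: nn.Module,
                          evaluate,            # callable: model -> top-1 accuracy (assumed helper)
                          ratio: float = 0.5): # fixed per-layer pruning ratio (assumed)
        """Characterize each conv layer by the accuracy after pruning it alone."""
        results = []
        conv_names = [n for n, m in model.named_modules() if isinstance(m, nn.Conv2d)]
        for name in conv_names:
            trial = copy.deepcopy(model)              # prune one layer at a time
            module = dict(trial.named_modules())[name]
            # L2-norm structured pruning over output filters (dim=0)
            prune.ln_structured(module, name="weight", amount=ratio, n=2, dim=0)
            results.append((name, evaluate(trial)))
        # Higher post-pruning accuracy = less sensitive = pruned earlier
        return sorted(results, key=lambda t: t[1], reverse=True)

The fully connected layers could then be compressed further, for example with PyTorch's dynamic INT8 quantization, one possible realization of the "quantization of fully connected layers" mentioned above:

    quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)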

Keywords: Deep Neural Network; Machine Learning; Filter Pruning; Network Compression; Layer Pruning.
Scope of the Article: Machine and Knowledge Learning