Energy decay and conservation in deep convolutional neural networks
Authors: Philipp Grohs, Thomas Wiatowski, and Helmut Bölcskei
Reference: Proc. of IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, pp. 1356-1360, June 2017.
Abstract: Many practical machine learning tasks employ very deep convolutional neural networks. Such large depths pose formidable computational challenges in training and operating the network. It is therefore important to understand how many layers are actually needed to have most of the input signal's features be contained in the feature vector generated by the network. This question can be formalized by asking how quickly the energy contained in the feature maps decays across layers. In addition, it is desirable that none of the input signal's features be "lost" in the feature extraction network or, more formally, we want energy conservation in the sense of the energy contained in the feature vector being proportional to that of the corresponding input signal. This paper establishes conditions for energy conservation for a wide class of deep convolutional neural networks and characterizes corresponding feature map energy decay rates. Specifically, we consider general scattering networks, and find that under mild analyticity and high-pass conditions on the filters (which encompass, inter alia, various constructions of Weyl-Heisenberg filters, wavelets, ridgelets, alpha-curvelets, and shearlets) the feature map energy decays at least polynomially. For broad families of wavelets and Weyl-Heisenberg filters, the guaranteed decay rate is shown to be exponential. Our results yield handy estimates of the number of layers needed to have at least ((1 - ε) · 100)% of the input signal energy be contained in the feature vector.
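To illustrate how a decay bound can translate into a layer count, here is a minimal sketch. It is not taken from the paper: it assumes energy conservation and a hypothetical upper bound on the fraction of input energy still propagating after N layers, of the form a^(-N) (exponential) or N^(-alpha) (polynomial). The values of `a` and `alpha`, and the helper name `layers_needed`, are placeholders chosen for illustration; the actual constants and rates are those derived in the paper.

```python
from typing import Callable


def layers_needed(bound: Callable[[int], float],
                  epsilon: float,
                  max_layers: int = 10_000) -> int:
    """Return the smallest N >= 1 with bound(N) <= epsilon.

    `bound(N)` is an assumed upper bound on the fraction of input energy
    that has not yet reached the feature vector after N layers. Under
    energy conservation, the feature vector then holds at least
    (1 - epsilon) of the input energy.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    for n in range(1, max_layers + 1):
        if bound(n) <= epsilon:
            return n
    raise RuntimeError("bound did not drop below epsilon within max_layers")


# Hypothetical decay parameters, for illustration only.
a = 1.5       # assumed exponential decay factor
alpha = 2.0   # assumed polynomial decay exponent

n_exp = layers_needed(lambda n: a ** (-n), epsilon=0.05)
n_poly = layers_needed(lambda n: n ** (-alpha), epsilon=0.05)

print("exponential decay:", n_exp, "layers")
print("polynomial decay:", n_poly, "layers")
```

The linear search avoids rounding issues that a closed-form logarithm could introduce at boundary values, and it works for any monotonically decreasing bound that is supplied.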
Keywords: Machine learning, energy decay and conservation, deep convolutional neural networks, scattering networks, frame theory
Copyright Notice: © 2017 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.