Structured Convolutions for Efficient Neural Network Design

Yash Bhalgat, Yizhe Zhang, Jamie Lin, Fatih Porikli (or view on SciRate)

Main idea

We introduce a neat trick to enable the execution of convolution operations in the form of efficient, scaled, sum-pooling components. We present a Structural Regularization loss that enables this decomposition with negligible performance loss. Our method is competitive with other tensor decomposition and structured pruning methods.


Deep neural networks deliver outstanding performance across a variety of use-cases but quite often fail to meet the computational budget requirements of mainstream devices. Hence, model efficiency plays a key role in bridging deep learning research into practice. Various model compression techniques rely on a key assumption that the deep networks are over-parameterized, meaning that a significant proportion of the parameters are redundant. This redundancy can appear either explicitly or implicitly. In the former case, several structured, as well as unstructured, pruning methods have been proposed to systematically remove redundant components in the network and improve run-time efficiency. On the other hand, tensor-decomposition methods based on singular values of the weight tensors, such as spatial SVD or weight SVD, remove somewhat implicit elements of the weight tensor to construct low-rank decompositions for efficient inference.

In this work, we explore restricting the degrees of freedom of convolutional kernels by imposing a structure on them. This structure can be thought of as constructing the convolutional kernel by super-imposing several constant-height kernels. A few examples are shown in Fig. 1 of the paper, where a kernel is constructed via superimposition of M linearly independent masks with associated constant scalars αm, hence leading to M degrees of freedom for the kernel. The very nature of the basis elements as binary masks enables efficient execution of the convolution operation as explained in Sec. 3.1 of the paper.


By applying our method to a wide range of CNN architectures, we demonstrate ‘structured’ versions of the ResNets that are up to 2× smaller and a new Structured-MobileNetV2 that is more efficient while staying within an accuracy loss of 1% on ImageNet and CIFAR-10 datasets. We also show similar structured versions of EfficientNet on ImageNet and HRNet architecture for semantic segmentation on the Cityscapes dataset. Our method performs equally well or superior in terms of the complexity reduction in comparison to the existing tensor decomposition and channel pruning methods.

tags: Efficient Deep Learning  Model Compression