Convolutional Squeeze-and-Excitation Networks

Squeeze-and-Excitation (SE) networks improve CNN feature learning through channel-wise attention, but their global average pooling step discards spatial context. In this work, we reinterpret the SE block's excitation mechanism as a convolution operation, which leads to a novel patched pooling design. Instead of pooling globally, we divide feature maps into patches and pool within each patch, preserving local spatial information for the attention computation. The excitation step is implemented with 1×1 convolutions (replacing the fully-connected layers of the original SE block), enabling the model to learn adaptive channel reweighting efficiently across those patches. This Convolutional Squeeze-and-Excitation (CSE) approach yields spatially aware feature recalibration with minimal overhead. We evaluate CSE across multiple CNN architectures (including a custom ConvNet and ResNet) on image classification tasks (Fashion-MNIST, CIFAR-10), and the results show consistent accuracy improvements over standard SE blocks. Moreover, we demonstrate the generality of patched pooling by integrating it with other attention modules such as Efficient Channel Attention (ECA) and Global Context (GC), achieving further gains. Our findings highlight that incorporating localized pooling in SE-style attention consistently enhances representation learning across diverse settings.
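To make the mechanism concrete, the following is a minimal PyTorch sketch of a patched-pooling SE block as described above: per-patch average pooling for the squeeze, 1×1 convolutions for the excitation, and upsampling of the resulting attention map. The class name `CSEBlock` and the parameters `patch_size` and `reduction` are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CSEBlock(nn.Module):
    """Sketch of a Convolutional Squeeze-and-Excitation block: pool within
    patches instead of globally, then excite with 1x1 convolutions instead
    of fully-connected layers. Hyperparameter names are hypothetical."""

    def __init__(self, channels, patch_size=4, reduction=16):
        super().__init__()
        self.patch_size = patch_size
        # 1x1 convolutions replace the SE block's FC layers; applied to a
        # pooled grid, they reweight channels independently at each patch.
        self.excite = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # Squeeze: average-pool within each patch, keeping a coarse spatial
        # grid of channel descriptors rather than a single global vector.
        pooled = F.adaptive_avg_pool2d(
            x, (h // self.patch_size, w // self.patch_size)
        )
        # Excitation: per-patch channel attention via 1x1 convolutions.
        attn = self.excite(pooled)
        # Broadcast the attention map back to input resolution and rescale.
        attn = F.interpolate(attn, size=(h, w), mode="nearest")
        return x * attn

# Usage: recalibrate a feature map from a convolutional stage.
x = torch.randn(8, 64, 32, 32)
out = CSEBlock(64, patch_size=4)(x)
print(out.shape)  # torch.Size([8, 64, 32, 32])
```

With `patch_size=1` the block reduces to per-pixel attention, while `patch_size` equal to the feature-map side recovers the standard SE block, so the patch size interpolates between local and global recalibration.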
