How do CNNs achieve translational equivariance and why is it not full invariance?
Updated May 15, 2026
Short answer
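Convolution layers are translation-equivariant: shifting the input shifts the feature maps by the same amount. Invariance, where the output does not change at all, is a separate property that pooling and other architectural choices only approximate.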
Deep explanation
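A convolution layer applies the same kernel weights at every spatial position. As a result, if an object moves in the input image, its response in the feature map moves by the corresponding amount: this is translational equivariance. Invariance is a stronger property, in which the output stays the same wherever the object appears. Convolution alone does not provide it, because feature maps preserve spatial structure. Pooling layers add partial invariance by making the output less sensitive to small shifts, and global average pooling removes most of the remaining position dependence by collapsing each feature map to a single value. Complete invariance in every layer would be undesirable, since it would discard spatial information that later layers rely on, such as the relative arrangement of parts. Strided convolution and pooling also weaken exact equivariance, which then holds only for shifts that are multiples of the stride.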
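The difference between the two properties can be checked numerically. The following is a minimal sketch in plain Python, not a real CNN layer: it uses a 1D signal, and it assumes circular (wrap-around) padding so that equivariance holds exactly. The zero padding common in practice breaks it near the borders. The helper names (`shift`, `conv1d_circular`, `max_pool`, `global_avg_pool`) and the kernel weights are made up for illustration.

```python
def shift(x, k):
    """Circularly shift a list k positions to the right."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


def conv1d_circular(x, w):
    """Cross-correlate x with kernel w, wrapping around at the ends.

    Circular padding is an assumption made here so that equivariance is exact.
    """
    n = len(x)
    return [sum(w[j] * x[(i + j) % n] for j in range(len(w))) for i in range(n)]


def max_pool(x, size=2):
    """Non-overlapping max pooling (window and stride both equal to size)."""
    return [max(x[i:i + size]) for i in range(0, len(x), size)]


def global_avg_pool(x):
    """Collapse the whole feature map to one number."""
    return sum(x) / len(x)


signal = [0, 0, 1, 3, 1, 0, 0, 0]   # a small "object" in an empty signal
kernel = [1, -1, 2]                 # arbitrary illustrative weights

features = conv1d_circular(signal, kernel)
features_of_shifted = conv1d_circular(shift(signal, 1), kernel)

# Equivariance: convolving the shifted input equals shifting the feature map.
print(features_of_shifted == shift(features, 1))   # True

# Not invariance: the feature map itself has changed.
print(features_of_shifted == features)             # False

# Global average pooling gives the same value for both positions.
print(global_avg_pool(features_of_shifted) == global_avg_pool(features))  # True

# Local max pooling is only partial: a one-step shift changes its output.
print(max_pool(features_of_shifted) == max_pool(features))  # False
```

In this sketch the local max pool output changes under a one-position shift, while a shift by the pooling stride (2) simply shifts the pooled output by one position, which matches the point that pooling reduces spatial sensitivity without removing it.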