For example, in face recognition, a CNN does not account for the spatial arrangement of the eyes, nose, mouth, lips, etc. Even if the lips are next to the eyes or the eyes are below the mouth, it will still classify the image as a face. As long as all the components of a face are present, the CNN considers it a face, regardless of the orientation and placement of those components. Capsule networks address this problem.
I have written a separate post on CNNs; please go through it for detailed information.
Pooling layer problem in CNN: The pooling layer down-samples the data, which discards a lot of information. Pooling layers reduce the spatial resolution, so their outputs are invariant to small changes in the inputs. This is a problem when detailed information must be preserved throughout the network. With CapsNets, detailed pose information (such as precise object position, rotation, thickness, skew, and size) is preserved throughout the network: small changes to the inputs result in small changes to the outputs. This property is called "equivariance."
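To see why pooling loses positional detail, here is a minimal sketch (using NumPy, with toy 4x4 feature maps I made up for illustration): two inputs whose active pixel sits at different positions produce the identical pooled output, because max pooling only keeps the largest value in each window.

```python
import numpy as np

def max_pool_2x2(x):
    """Down-sample a 2D array by taking the max over non-overlapping 2x2 blocks."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Two 4x4 feature maps: the active pixel shifts within its 2x2 window,
# yet max pooling yields the same output -- the precise location is lost.
a = np.zeros((4, 4)); a[0, 0] = 1.0
b = np.zeros((4, 4)); b[1, 1] = 1.0

print(max_pool_2x2(a))
print(max_pool_2x2(b))
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))  # True
```

This invariance is exactly what CapsNets avoid: capsule outputs change (equivariantly) when the input pose changes.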
Capsule: The human brain is organized into modules that can be thought of as capsules. Motivated by this, the concept of the capsule was put forward by Hinton. A capsule can be considered a group of neurons. We can add as many neurons to a capsule as needed to capture different dimensions of an image, such as scale, stroke thickness, width, skew, translation, etc. A capsule can maintain pose information equivariantly, including the hue, albedo, texture, deformation, velocity, and location of the object.
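The idea can be sketched numerically: a capsule's output is a vector of neuron activations rather than a single scalar. Its length (once squashed into [0, 1)) can encode the probability that an entity exists, while its orientation encodes the instantiation parameters. The dimension values below are made-up illustrations, not learned weights.

```python
import numpy as np

# An 8-dimensional capsule: each component is one neuron's activation,
# standing in for a pose dimension (scale, skew, translation, ...).
capsule_output = np.array([0.2, -0.1, 0.4, 0.05, 0.3, -0.2, 0.1, 0.15])

# Vector length ~ evidence that the entity is present
# (in a real CapsNet this is squashed into [0, 1)).
presence = np.linalg.norm(capsule_output)

# Unit direction ~ the entity's pose, independent of how confident we are.
pose_direction = capsule_output / presence

print(presence)
print(pose_direction)
```

Contrast this with a single CNN neuron, whose scalar activation can only say "feature present or not," with no room for pose.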
Dynamic Routing Algorithm: The human brain has a mechanism to route information among capsules. Based on a similar idea, the dynamic routing algorithm was proposed by Hinton. This algorithm allows capsules to communicate with each other. For more details, please visit this article:
Dynamic Routing Between Capsules
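A minimal NumPy sketch of routing-by-agreement, following the scheme in the paper above (the toy shapes and random prediction vectors are my own illustration): each lower-level capsule sends its prediction to every higher-level capsule, and coupling coefficients are iteratively increased for the capsules whose outputs agree with those predictions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink a vector's length into [0, 1) while preserving its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iterations=3):
    """Routing-by-agreement.

    u_hat: prediction vectors from lower to upper capsules,
           shape (num_lower, num_upper, dim_upper).
    Returns the upper-capsule output vectors, shape (num_upper, dim_upper).
    """
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))              # routing logits
    for _ in range(num_iterations):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # coupling coeffs
        s = (c[:, :, None] * u_hat).sum(axis=0)       # weighted sum per upper capsule
        v = squash(s)                                 # upper capsule outputs
        b += (u_hat * v[None, :, :]).sum(axis=-1)     # agreement raises the logits
    return v

# Toy example: 6 lower-level capsules route to 3 upper capsules of dimension 4.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v = dynamic_routing(u_hat)
print(v.shape)  # (3, 4)
```

Note how no gradient step is involved: the coupling coefficients are recomputed at inference time, which is also why CapsNets are slower than plain CNNs.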
Squashing Function: Instead of ReLU, a novel squashing function was proposed by Hinton. It normalizes the magnitude of a capsule's output vector so that it falls between 0 and 1, while preserving the vector's direction. The outputs of the squash function tell us how to route data through the various capsules that are trained to learn different concepts.
Limitations of Capsule Neural Networks
1. Compared to a CNN, training a capsule network is slower because of its computational complexity (the iterative routing in particular).
2. It has been tested on the MNIST dataset, but how it will behave on more complex datasets is still unknown.
3. The concept is still under research, so it has a lot of scope for improvement.
I would suggest going through this PDF for more details on Capsule Neural Networks.