Structure of Convolutional Neural Networks
The basic structure of a convolutional neural network consists of the following parts: an input layer, a convolutional layer, a pooling layer, an activation function layer and a fully connected layer.
Convolutional Neural Networks (CNNs) are a class of deep feedforward neural networks that include convolutional computation, and are one of the representative algorithms of deep learning.
Convolutional neural networks have the ability of representation learning and can perform shift-invariant classification of input information according to their hierarchical structure, so they are also known as "Shift-Invariant Artificial Neural Networks" (SIANN).
Research on convolutional neural networks began in the 1980s and 1990s, and time-delay neural networks and LeNet-5 were the first convolutional neural networks to appear. In the twenty-first century, with the introduction of deep learning theory and improvements in numerical computing hardware, convolutional neural networks developed rapidly and have been applied to computer vision, natural language processing, and other fields.
Convolutional neural networks are modeled after the visual perception mechanism of living creatures, and can be used for both supervised and unsupervised learning. The sharing of convolution-kernel parameters within the hidden layers and the sparsity of inter-layer connections enable convolutional neural networks to learn features with a grid-like topology using a small amount of computation.
Connections between convolutional layers in a convolutional neural network are known as sparse connections, i.e., neurons in a convolutional layer are connected to only some, but not all, of the neurons in their neighboring layers, as opposed to the full connectivity of a feedforward neural network.
Specifically, any pixel in the feature map of layer l of a convolutional neural network is only a linear combination of pixels within the receptive field defined by the convolutional kernel in layer l-1. The sparse connections of the convolutional neural network have a regularization effect, which improves the stability and generalization of the network structure and avoids overfitting.
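The linear combination described above can be sketched in a few lines of pure Python. This is a minimal illustration with hypothetical values: each output pixel depends only on the input pixels inside the kernel's receptive field, not on the whole input.

```python
# A minimal sketch (pure Python, hypothetical values) showing that one output
# pixel of a convolutional layer is a linear combination of only the input
# pixels inside the kernel's receptive field.

def conv2d_pixel(image, kernel, row, col):
    """Compute a single output pixel: the dot product of the kernel with
    the receptive field of the input anchored at (row, col)."""
    k = len(kernel)
    return sum(
        image[row + i][col + j] * kernel[i][j]
        for i in range(k)
        for j in range(k)
    )

# 4x4 input and a 2x2 kernel: each output pixel depends on only 4 of the
# 16 input pixels -- this is the sparse connectivity described above.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, 1]]

print(conv2d_pixel(image, kernel, 0, 0))  # 1*1 + 6*1 = 7
```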
An Intuitive Understanding of Convolutional Neural Networks
Convolutional Neural Networks are commonly understood as follows:
Convolutional Neural Networks (CNNs) – Structure
① A CNN structure generally contains the following layers:
Input Layer: receives the raw input data
Convolutional Layer: uses convolution kernels for feature extraction and feature mapping
Excitation Layer: since convolution is a linear operation, a nonlinear mapping needs to be added
Pooling Layer: performs downsampling to sparsify the feature maps and reduce the amount of computation
Fully Connected Layer: usually placed at the tail of the CNN to re-fit the features and reduce the loss of feature information
Output Layer: outputs the final result
② Some other functional layers can also be used in between:
Normalization Layer (Batch Normalization): normalizes the features in the CNN
Slice/Split Layer: learns certain regions of the (image) data separately
Fusion Layer: fuses branches that learn features independently
Convolutional Neural Networks (CNNs) – Input Layer
① The input format of the input layer of a CNN preserves the structure of the image itself.
② For a black-and-white 28×28 picture, the input to the CNN is a 28×28 two-dimensional array of neurons.
③ For a 28×28 picture in RGB format, the input to the CNN is a 3×28×28 three-dimensional array of neurons (each color channel in RGB has a 28×28 matrix).
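These input shapes can be made concrete with plain nested lists. The sketch below (pure Python, zero-filled placeholder pixels) shows the grayscale 28×28 matrix and the 3×28×28 RGB array with its leading channel dimension.

```python
# A minimal sketch (pure Python) of the input-layer shapes described above:
# a grayscale image is a single 28x28 matrix; an RGB image adds a leading
# channel dimension, giving 3x28x28.

gray = [[0.0] * 28 for _ in range(28)]                      # 28x28
rgb = [[[0.0] * 28 for _ in range(28)] for _ in range(3)]   # 3x28x28

print(len(gray), len(gray[0]))                 # 28 28
print(len(rgb), len(rgb[0]), len(rgb[0][0]))   # 3 28 28
```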
Convolutional Neural Networks (CNNs) – Convolutional Layer
① There are several important concepts in the convolutional layer.
② Assuming that the input is a 28×28 two-dimensional array of neurons, we define 5×5 local receptive fields, i.e., each neuron in the hidden layer is connected to a 5×5 region of neurons in the input layer; this 5×5 region is called the Local Receptive Field.
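Sliding that 5×5 local receptive field over the 28×28 input (stride 1, no padding) yields one hidden neuron per valid anchor position, giving a 24×24 hidden layer. A minimal sketch in pure Python counting those positions:

```python
# A minimal sketch (pure Python): count the hidden-layer neurons produced by
# sliding a 5x5 local receptive field over a 28x28 input with stride 1 and
# no padding -- each valid anchor position yields one hidden neuron.

input_size, field = 28, 5
positions = [(r, c)
             for r in range(input_size - field + 1)
             for c in range(input_size - field + 1)]
hidden_side = input_size - field + 1

print(hidden_side, len(positions))  # 24 576
```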