What network components are included in a convolutional neural network

Structure of Convolutional Neural Networks

The basic structure of a convolutional neural network consists of the following parts: input layer, convolutional layer, pooling layer, activation function layer and fully connected layer.

Convolutional Neural Networks (CNNs) are a class of feedforward neural networks that include convolutional computation and have a deep structure; they are one of the representative algorithms of deep learning.

Convolutional neural networks are capable of representation learning and, thanks to their hierarchical structure, can perform shift-invariant classification of their input, so they are also known as “Shift-Invariant Artificial Neural Networks” (SIANN).

Research on convolutional neural networks began in the 1980s and 1990s; time-delay neural networks and LeNet-5 were among the earliest convolutional neural networks. In the twenty-first century, with the rise of deep learning theory and improvements in numerical computing hardware, convolutional neural networks developed rapidly and have been applied in computer vision, natural language processing, and other fields.

Convolutional neural networks are modeled after the visual perception mechanism of living creatures and can be used for both supervised and unsupervised learning. The sharing of convolutional kernel parameters within the hidden layers and the sparsity of inter-layer connections allow convolutional neural networks to learn features with a grid-like topology (for example, image pixels) at a small computational cost.


Connections between convolutional layers in a convolutional neural network are called sparse connections: a neuron in a convolutional layer is connected to only some, not all, of the neurons in the adjacent layer, in contrast to the full connectivity of a conventional feedforward neural network.

Specifically, any pixel in the feature map of layer l of a convolutional neural network is a linear combination of only those pixels in layer l-1 that fall within the receptive field defined by the convolutional kernel. The sparse connections of the convolutional neural network have a regularization effect, which improves the stability and generalization of the network and helps reduce overfitting.
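To make the effect of sparse connections and shared kernel parameters concrete, the short sketch below counts the weights of one layer under two designs. The layer sizes (a 32*32 single-channel input, a 5*5 kernel, a 28*28 output) and the function names are illustrative assumptions, not values taken from a particular network, and bias terms are ignored.

```python
def dense_weight_count(in_h, in_w, out_h, out_w):
    """Weights needed if every output unit is connected to every input pixel."""
    return (in_h * in_w) * (out_h * out_w)


def conv_weight_count(kernel_h, kernel_w, in_channels=1, out_channels=1):
    """Weights of a convolutional layer: one shared kernel per output channel."""
    return kernel_h * kernel_w * in_channels * out_channels


def conv_fan_in(kernel_h, kernel_w, in_channels=1):
    """Number of inputs a single output pixel depends on (its receptive field)."""
    return kernel_h * kernel_w * in_channels


# Hypothetical layer: 32x32 input mapped to a 28x28 output.
dense = dense_weight_count(32, 32, 28, 28)  # full connectivity
conv = conv_weight_count(5, 5)              # one shared 5x5 kernel
fan_in = conv_fan_in(5, 5)                  # sparse connectivity

print(dense, conv, fan_in)
```

Under these assumptions the fully connected mapping needs hundreds of thousands of weights, while the convolutional layer reuses the same 25 weights at every position.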

(7) Basic Structure of Convolutional Neural Network

The main structures of a convolutional neural network are the convolutional layer, the pooling layer, and the fully connected layer; a convolutional neural network is formed by stacking these layers. The network transforms the original image into category scores. The convolutional and fully connected layers have learnable parameters, while the activation and pooling layers have none. Parameters are updated through backpropagation.
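As a rough illustration of which layers carry parameters, the sketch below tallies the learnable parameters of a small stack. The architecture (one 5*5 convolution with 6 output channels on a 32*32 single-channel image, a 2*2 pooling step, and a 10-class fully connected layer) is a made-up example chosen only for the arithmetic.

```python
# Hypothetical stack: (layer description, number of learnable parameters).
layers = [
    # 5x5 kernel, 1 input channel, 6 output channels, plus 6 biases.
    ("conv 5x5, 1 -> 6 channels", 5 * 5 * 1 * 6 + 6),
    # Activation and pooling layers have nothing to learn.
    ("ReLU activation", 0),
    ("max pool 2x2 (28x28 -> 14x14)", 0),
    # 6 maps of 14x14 flattened, connected to 10 class scores, plus 10 biases.
    ("fully connected, 6*14*14 -> 10", 6 * 14 * 14 * 10 + 10),
]

for name, count in layers:
    print(f"{name}: {count} parameters")

total_params = sum(count for _, count in layers)
print("total:", total_params)
```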

(1) Convolutional Layer

Convolutional kernels are a set of filters, each of which is used to extract a certain kind of feature.

When a filter is applied to an image, the convolution operation yields a comparatively large value where the image features are similar to those represented by the filter.

Where the image features are not similar to the filter, the convolution operation yields a smaller value. In effect, the feature map produced by the convolution shows how the feature represented by the corresponding kernel is distributed over the input feature map.
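The following minimal sketch shows this behavior at a single position. The 3*3 vertical-edge filter and the two image patches are toy values assumed for illustration; the response is the element-wise product of patch and filter, summed, which is what one step of the sliding convolution computes.

```python
def response(patch, kernel):
    """Sum of element-wise products between an image patch and a filter."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


# Toy filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Patch containing such an edge (similar to the filter).
edge_patch = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]

# Uniform patch with no edge (not similar to the filter).
flat_patch = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]

edge_response = response(edge_patch, vertical_edge)
flat_response = response(flat_patch, vertical_edge)
print(edge_response, flat_response)
```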

Each filter is small spatially (in width and height), but its depth matches that of the input data (the number of channels in the input feature map). As the kernel slides over the input image, it generates a two-dimensional activation map, and each spatial location on that map represents the input's response to the kernel at that position. Each convolutional layer has a whole set of kernels, and the output has as many channels as there are kernels: each kernel generates one feature map, and these maps are stacked to form the complete output.
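The relationship between kernels and output channels can be sketched in plain Python. This is a deliberately simple, unoptimized version written for readability, assuming no padding, square kernels, and nested lists laid out as [channel][row][column]; the function name and the toy input are not from the original text.

```python
def conv2d(image, kernels, stride=1):
    """Convolve image [C][H][W] with kernels [K][C][F][F]; returns [K][H_out][W_out]."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    size = len(kernels[0][0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    output = []
    for kernel in kernels:  # one feature map per kernel
        feature_map = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                total = 0.0
                # The kernel spans every input channel (full depth).
                for c in range(channels):
                    for di in range(size):
                        for dj in range(size):
                            pixel = image[c][i * stride + di][j * stride + dj]
                            total += pixel * kernel[c][di][dj]
                row.append(total)
            feature_map.append(row)
        output.append(feature_map)
    return output


# Toy input: 3 channels of 4x4 ones, and two 3x3 kernels of depth 3.
image = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
kernels = [
    [[[1.0] * 3 for _ in range(3)] for _ in range(3)],
    [[[0.5] * 3 for _ in range(3)] for _ in range(3)],
]
out = conv2d(image, kernels)
print(len(out), len(out[0]), len(out[0][0]))  # channels, height, width
```

Two kernels give two output channels, regardless of how many channels the input has.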

Convolutional kernels embody a pattern of parameter sharing and local connectivity. The size of each convolutional kernel represents the size of a receptive field.

The size of the feature map after convolution is (W - F + 2P)/S + 1, where W is the input size, F is the kernel size, P is the padding, and S is the stride (step size).
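The formula is easy to check with a small helper. The function name is hypothetical, and the use of floor division when the stride does not divide evenly is an assumption (the original formula does not say how that case is handled).

```python
def conv_output_size(w, f, p=0, s=1):
    """Output width (or height) of a convolution: (W - F + 2P) / S + 1."""
    if s < 1:
        raise ValueError("stride must be at least 1")
    span = w - f + 2 * p
    if span < 0:
        raise ValueError("kernel is larger than the padded input")
    # Assumption: round down when the stride does not divide the span evenly.
    return span // s + 1


print(conv_output_size(32, 5))            # no padding, stride 1
print(conv_output_size(32, 5, p=2))       # padding keeps the size unchanged
print(conv_output_size(32, 3, p=1, s=2))  # stride 2 roughly halves the size
```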

(2) Pooling Layer

The pooling layer is essentially downsampling. It exploits the local correlation in an image (the maximum or the mean of a region is taken to represent the features of that region) to subsample the feature map, which reduces the amount of data to process while retaining useful information.

Common variants are average pooling, L2-norm pooling, and max pooling. In practice, max pooling generally works better than average pooling (average pooling is usually placed in the last layer of the convolutional neural network). Max pooling helps preserve texture information, while average pooling helps preserve background information.

Because pooling discards information, the feature map can instead be shrunk by using a larger stride in the convolution, so pooling is not strictly necessary, and some practitioners recommend against using pooling layers. For example, a 32*32 input convolved with a 5*5 kernel at a stride of 1 (no padding) yields a 28*28 feature map.

Pooling gradually reduces the spatial size of the data representation, which reduces the number of parameters in the network. This lowers the computational resources consumed and also helps control overfitting.
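A minimal sketch of max and average pooling on a single feature map follows. The function name, the 2*2 window with stride 2, and the toy 4*4 input are assumptions made for the example.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2D feature map with max or average pooling."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values inside the current pooling window.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="avg")
print(max_pooled)
print(avg_pooled)
```

Both variants turn the 4*4 map into a 2*2 map; they differ only in how each window is summarized.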

(3) Fully Connected Layer

The feature maps are converted into category outputs by the fully connected layer. There is often more than one fully connected layer, and Dropout is introduced between them to prevent overfitting. Recent research has shown that applying global average pooling before the fully connected layer can effectively reduce overfitting.
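The path from feature maps to category scores can be sketched as global average pooling followed by one fully connected layer. The helper names, the two toy feature maps, and the weights are all assumed values for illustration.

```python
def global_average_pool(feature_maps):
    """Reduce feature maps [C][H][W] to one average value per channel."""
    return [
        sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
        for fm in feature_maps
    ]


def fully_connected(features, weights, biases):
    """Compute one score per category: weights[k] . features + biases[k]."""
    return [
        sum(w * x for w, x in zip(weight_row, features)) + b
        for weight_row, b in zip(weights, biases)
    ]


# Toy input: 2 channels of 2x2 activations.
feature_maps = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 2.0], [2.0, 4.0]],
]
pooled = global_average_pool(feature_maps)

# Hypothetical weights mapping 2 pooled features to 3 category scores.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, -1.0]
scores = fully_connected(pooled, weights, biases)
print(pooled, scores)
```

Because global average pooling leaves only one number per channel, the fully connected layer that follows needs far fewer weights than one applied to the flattened feature maps.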

(4) Batch Normalization (BN)

As training proceeds, the parameter changes in each hidden layer change the input to the following layer, so the distribution of each layer's inputs shifts from batch to batch. The network therefore has to fit a different data distribution in each iteration, which increases training complexity and the risk of overfitting; without normalization, this is usually handled by using a smaller learning rate. Batch normalization addresses the problem by normalizing each layer's inputs over the mini-batch.

Usually a convolutional layer is followed by a BN layer and then ReLU. BN has become a standard technique in convolutional neural networks. Because the normalization is differentiable, BN can be applied at every layer in both forward and backward propagation; it is generally placed after a convolutional or fully connected layer and before the nonlinearity. It makes the network more robust to poor initialization and can also speed up convergence.
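The normalization step itself can be sketched as follows. This covers only the training-time forward pass for a batch of feature vectors; the running statistics used at inference time and the backward pass are left out, and the names `gamma`, `beta`, and `eps` follow common convention rather than anything stated in the text above.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for f in range(num_features):
        column = [sample[f] for sample in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        for i, x in enumerate(column):
            normalized = (x - mean) / math.sqrt(var + eps)
            out[i][f] = gamma[f] * normalized + beta[f]
    return out


# Two samples whose features live on very different scales.
batch = [[1.0, 10.0], [3.0, 30.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
print(normalized)
```

After normalization both features have roughly zero mean and unit variance over the batch, whatever their original scale.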

(5) DropOut

Dropout is applied to a given layer of neurons: with a predefined probability, some of its neurons are randomly removed (their outputs are set to zero) while the numbers of neurons in the input and output layers stay unchanged, and the parameters are then updated with the network's usual learning method. In the next iteration a new random subset of neurons is removed, and this repeats until training ends.
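A minimal sketch of the random removal step is shown below. It uses the "inverted dropout" variant, in which surviving activations are scaled up during training so that nothing needs to change at inference time; that scaling is a common convention assumed here, not something described in the text above, and the function name and `rng` parameter are illustrative.

```python
import random


def dropout(activations, drop_prob, training=True, rng=random):
    """Zero each activation with probability drop_prob; rescale the survivors."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # At inference time every neuron is kept.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # A fresh random mask is drawn on every call, i.e. every training iteration.
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)  # seeded only to make the example repeatable
layer_output = [1.0] * 1000
dropped = dropout(layer_output, drop_prob=0.5, rng=rng)
num_removed = sum(1 for v in dropped if v == 0.0)
print(num_removed, "of", len(dropped), "activations removed")
```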

(6) Softmax Layer

The Softmax layer is also not really a separate layer of a CNN. When a CNN is used for classification, the usual practice is to turn the neuron outputs into probabilities, and Softmax does exactly this: softmax(z_i) = exp(z_i) / sum_j exp(z_j), where z_i is the output score for category i. All outputs of the Softmax layer sum to 1, and the category with the largest probability is taken as the prediction.
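Softmax can be sketched in a few lines. Subtracting the largest score before exponentiating is a standard numerical-stability trick that does not change the result; the example scores are made up.

```python
import math


def softmax(scores):
    """Turn raw category scores into probabilities that sum to 1."""
    shift = max(scores)  # avoids overflow in exp() for large scores
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]  # hypothetical outputs for three categories
probs = softmax(scores)
predicted = probs.index(max(probs))
print(probs, "predicted category:", predicted)
```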

What are the types of CNN

CNN stands for Convolutional Neural Network, an important algorithm in the field of artificial intelligence. It has been used in fields such as computer vision, speech recognition, and natural language processing. So, what kinds of CNN are there? This article gives a detailed introduction.

1. Conventional Convolutional Neural Network

A conventional convolutional neural network consists of several convolutional layers, pooling layers, and fully connected layers. The convolutional layers mainly extract features from the image, the pooling layers reduce the size of the feature maps, and the fully connected layers classify the features. Conventional convolutional neural networks can be used in applications such as image classification, object detection, and image segmentation.

2. Residual Network

The Residual Neural Network (ResNet) was proposed by Kaiming He et al. at Microsoft Research. Its main idea is the “residual block”, which adds a direct (shortcut) connection from the block's input to its output, so that the layers only need to learn the residual; this eases the vanishing-gradient problem in some deep networks. Residual networks can greatly improve the accuracy of deep neural networks and have been widely used in various applications.
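The idea of a residual block can be sketched as y = relu(F(x) + x). For brevity, this sketch uses two small fully connected layers for F instead of the convolutional layers used in real residual networks; that substitution, the helper names, and the toy values are assumptions made for illustration.

```python
def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in vector]


def linear(vector, weights):
    """Matrix-vector product: one output per weight row (no bias)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """y = relu(F(x) + x), with F(x) = W2 * relu(W1 * x)."""
    fx = linear(relu(linear(x, w1)), w2)
    # Shortcut connection: add the block's input to its output.
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0]
zero_weights = [[0.0, 0.0], [0.0, 0.0]]
identity_out = residual_block(x, zero_weights, zero_weights)
print(identity_out)
```

With all weights at zero, F(x) is zero and the block simply passes its (non-negative) input through, which illustrates why stacking such blocks does not have to make the signal harder to propagate.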

3. Interpretability Methods for Convolutional Neural Networks

The interpretability of convolutional neural networks has been one of the hotspots of research. In many practical applications, people need to know how the network makes its decisions in order to better understand and interpret the results. Currently there are two main kinds of interpretability methods: gradient-based methods, such as Grad-CAM, and methods based on the internal features of the network, such as Activation Atlas. These methods have been widely used in computer vision, medical image processing, and other fields.

4. Convolutional Neural Networks in Object Detection

Object detection is one of the important research areas for convolutional neural networks. It is a central problem in computer vision, and its main task is to locate and recognize objects in images. Commonly used detection methods fall into two main types: region-based methods, which first propose candidate regions and then classify them, and methods that predict bounding boxes directly. In recent years, the development of deep learning has led to the widespread use of detectors built on convolutional neural networks, such as Faster R-CNN (region-based) and YOLO (direct bounding-box prediction).

5. Convolutional Neural Networks in Natural Language Processing

Beyond computer vision, convolutional neural networks are also widely used in natural language processing. By performing convolution operations over text, they can extract local features, and pooling layers then reduce the dimensionality of those features. Convolutional neural networks have been used in tasks such as sentiment classification, text categorization, and machine translation.
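For text, the convolution slides along the sequence of token vectors rather than over image pixels. The sketch below applies one filter spanning two consecutive tokens and then keeps the strongest response (max-over-time pooling); the tiny two-dimensional "embeddings" and the filter values are invented for the example.

```python
def conv1d_text(embeddings, kernel):
    """Slide a filter [F][D] over token vectors [T][D]; returns T - F + 1 features."""
    width = len(kernel)
    features = []
    for start in range(len(embeddings) - width + 1):
        total = 0.0
        # Each feature looks only at `width` consecutive tokens (a local n-gram).
        for offset in range(width):
            for e, k in zip(embeddings[start + offset], kernel[offset]):
                total += e * k
        features.append(total)
    return features


def max_over_time(features):
    """Pool a variable-length feature sequence into a single value."""
    return max(features)


# Toy sentence of 4 tokens, each a 2-dimensional vector.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Hypothetical filter covering 2 consecutive tokens.
kernel = [[1.0, 0.0], [0.0, 1.0]]

local_features = conv1d_text(embeddings, kernel)
pooled_feature = max_over_time(local_features)
print(local_features, pooled_feature)
```

Pooling over the whole sequence yields a fixed-size output however long the sentence is, which is what lets a classifier be attached afterwards.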

6. Summary

In summary, the convolutional neural network is an important algorithm in the field of artificial intelligence and has been widely used in various fields. In addition to conventional convolutional neural networks, there are residual networks, interpretability methods for convolutional neural networks, and more. In computer vision, convolutional neural networks are widely used for object detection, and in natural language processing their application is attracting more and more attention.