What are the architecture and characteristics of convolutional neural networks?

Convolutional neural networks

A convolutional neural network is generally built by interleaving convolutional, pooling, and fully connected layers, and is trained with the backpropagation algorithm (worth reviewing again).

Convolutional neural networks have three structural properties: local connectivity, weight-sharing, and sub-sampling

A filter is also called a convolution kernel.

Local connectivity means that each neuron is connected only to a nearby region of the input: the weights decrease with distance and eventually become 0, so the connections do not reach far.

Applying a filter to each local region yields a feature map.

Cross-correlation is a function that measures the correlation of two sequences.

The only difference between cross-correlation and convolution is whether the convolution kernel is flipped, so cross-correlation can also be called non-flipped convolution.

The use of a convolution is for the purpose of feature extraction, and whether or not the convolution kernel is flipped is irrelevant to its ability to extract features.

When the convolution kernel is a learnable parameter, convolution and cross-correlation are equivalent, so in practice they are pretty much the same.
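The relationship between the two operations can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a stride of 1 and no padding; the function names `cross_correlate2d`, `flip`, and `convolve2d` and the sample values are made up for the example.

```python
def cross_correlate2d(image, kernel):
    """Slide the kernel over the image without flipping it (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flip(kernel):
    """Rotate the kernel by 180 degrees (reverse the rows and the columns)."""
    return [row[::-1] for row in kernel[::-1]]


def convolve2d(image, kernel):
    """Convolution is cross-correlation with the flipped kernel."""
    return cross_correlate2d(image, flip(kernel))


# Hypothetical 3x3 image and 2x2 kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]

corr = cross_correlate2d(image, kernel)  # [[-4, -4], [-4, -4]]
conv = convolve2d(image, kernel)         # [[4, 4], [4, 4]]
```

The two results differ only because the kernel is fixed here. A learned kernel can simply take on the flipped values, which is why the distinction does not affect feature extraction.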

Tip: P denotes the feature map

34 – Convolutional Neural Networks (Conv)

Difference between Deep Learning Networks and Ordinary Neural Networks

Disadvantages of Fully Connected Neural Networks

Error Rates of Convolutional Neural Networks

The Evolution of Convolutional Neural Networks

Structure of Convolutional Neural Networks

Structural Characteristics:

The basic composition of a neural network includes an input layer, hidden layers, and an output layer. A convolutional neural network is characterized by hidden layers that are divided into convolutional layers and pooling layers (also called downsampling layers).

Convolution process

Note: a filter in the convolutional layer is a matrix whose elements are the weights applied to the corresponding pixels in each scanned window

That is, each filter produces one feature map

Two ways of zero padding

The convolution kernel is used to extract the feature map. Because the stride does not necessarily divide the pixel width of the image evenly, the window may run past the edge of the image; filling the border with zeros in that case is called padding (zero padding). There are two ways to do this, SAME and VALID
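The effect of the two modes on the output size can be sketched as follows. This assumes the TensorFlow-style meaning of SAME and VALID, which the text does not state explicitly; the helper names `output_size` and `same_total_padding` are hypothetical.

```python
import math


def output_size(input_size, kernel_size, stride, padding):
    """Output length along one axis (assumed TensorFlow-style SAME/VALID rules)."""
    if padding == "SAME":
        # Zeros are added at the border so that every input position is covered.
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # No zeros are added; a window that does not fit is dropped.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def same_total_padding(input_size, kernel_size, stride):
    """Total number of zeros added along one axis in SAME mode."""
    out = output_size(input_size, kernel_size, stride, "SAME")
    return max((out - 1) * stride + kernel_size - input_size, 0)


same_out = output_size(28, 5, 1, "SAME")    # 28
valid_out = output_size(28, 5, 1, "VALID")  # 24
pad = same_total_padding(28, 5, 1)          # 4, i.e. 2 zeros on each side
```

With a stride of 1, SAME keeps the width at 28 while VALID shrinks it to 24, because the 5-wide window only fits in 24 positions without padding.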

Convolution process for color images

Since a color image has 3 channels, i.e., 3 tables, the filter has to scan each of the 3 channels separately, and the per-channel results are added together to give the final result
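The per-channel scan followed by a sum can be sketched like this. It is a minimal example, assuming a stride of 1 and no padding; `correlate`, `conv_multichannel`, and the tiny 2×2 image are hypothetical.

```python
def correlate(channel, kernel):
    """Stride-1, no-padding sliding window over a single channel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_multichannel(channels, kernels):
    """Apply one kernel slice per channel, then add the per-channel results."""
    per_channel = [correlate(c, k) for c, k in zip(channels, kernels)]
    out_h, out_w = len(per_channel[0]), len(per_channel[0][0])
    return [
        [sum(m[i][j] for m in per_channel) for j in range(out_w)]
        for i in range(out_h)
    ]


# Hypothetical 2x2 image with three channels (R, G, B).
red = [[1, 2], [3, 4]]
green = [[0, 1], [1, 0]]
blue = [[2, 2], [2, 2]]

# One filter with a separate 2x2 slice for each channel.
filter_slices = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
    [[0, 0], [0, 1]],
]

feature_map = conv_multichannel([red, green, blue], filter_slices)  # [[9]]
```

The three channels contribute 5, 2, and 2, so one filter still yields a single feature map, not three.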

Number of filters

However many filters there are, that many tables (feature maps) are generated. For example:

For a picture of shape [28,28,1], if there are 32 filters, the result of the convolution (with SAME padding and a stride of 1) is [28,28,32], which is equivalent to the picture being “stretched”
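The shape change in this example can be checked with a small sketch. It assumes a stride of 1, zero padding that keeps the size unchanged, and 32 made-up 3×3 filters; `conv2d_same` is a hypothetical helper, not a library function.

```python
def conv2d_same(image, kernel):
    """Stride-1 sliding window with zero padding, so the output matches the input size.

    Assumes an odd kernel size so the window can be centered on each pixel.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2

    def pixel(r, c):
        # Positions outside the image count as zero (zero padding).
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0

    return [
        [
            sum(pixel(i + u - pad_h, j + v - pad_w) * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(w)
        ]
        for i in range(h)
    ]


# A 28x28 single-channel image filled with ones.
image = [[1] * 28 for _ in range(28)]

# 32 hypothetical 3x3 filters; filter k has every weight equal to k.
filters = [[[k] * 3 for _ in range(3)] for k in range(32)]

# One feature map per filter.
feature_maps = [conv2d_same(image, f) for f in filters]

# Reported as [height, width, number of filters].
output_shape = (len(feature_maps[0]), len(feature_maps[0][0]), len(feature_maps))
```

`output_shape` comes out as (28, 28, 32): the height and width are unchanged and the depth equals the number of filters.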

Calculating the size of the output

This may come up in interviews

Note: if the calculated result is a decimal, handle it according to the specific situation rather than rounding it directly
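A commonly used form of this calculation can be written as follows. It is given as a sketch: the symbols H and W (input height and width), F (filter size), P (number of zeros padded on each side), and S (stride) are assumptions introduced for the example, and the floor reflects the usual convention of dropping a window that does not fit.

```latex
H_{\text{out}} = \left\lfloor \frac{H - F + 2P}{S} \right\rfloor + 1,
\qquad
W_{\text{out}} = \left\lfloor \frac{W - F + 2P}{S} \right\rfloor + 1
```

For example, with H = 28, F = 3, P = 0, and S = 2 the fraction is 12.5, which is floored to 12 and gives an output height of 13.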

Convolution API

In convolutional neural networks, ReLU is the main activation function

That is, in this case, the ReLU function sets every value less than 0 in the feature map to 0
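A minimal sketch of this step, applied element by element to a small made-up feature map (the name `relu` and the values are only for illustration):

```python
def relu(feature_map):
    """Replace every negative value with 0 and leave the rest unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical 2x2 feature map produced by a convolution.
feature_map = [[-4, 2],
               [0, -1]]

activated = relu(feature_map)  # [[0, 2], [0, 0]]
```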

Why do we need an activation function in a neural network?

Why use ReLU rather than the sigmoid function?


Convolution performs feature extraction and observes the image more carefully. However, observing carefully means more data and more computation, so a pooling layer is used to reduce the amount of computation

The pooling layer’s main role is feature extraction: it further reduces the number of parameters by removing unimportant samples from the feature map. There are many pooling methods; the most commonly used is max pooling.

The Pooling layer also has a window size (filter)

That is, the pooling process makes the image “narrower”

In other words, the convolutional layer makes the image longer and the pooling layer makes it narrower, so after repeated convolution and pooling the picture becomes more and more “slender”


SAME is calculated in pooling the same way as in convolution. For example:

Data of shape [None,28,28,32], after 2×2 pooling with a stride of 2 and SAME padding, becomes [None,14,14,32]
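The pooling step in this example can be sketched on a single feature map. This is an illustration with a hypothetical `max_pool2d` helper; it adds no padding, which gives the same result as SAME whenever the window and stride divide the input size evenly, as 2 divides 28.

```python
def max_pool2d(feature_map, window=2, stride=2):
    """Keep the largest value in each window (no padding)."""
    out_h = (len(feature_map) - window) // stride + 1
    out_w = (len(feature_map[0]) - window) // stride + 1
    return [
        [
            max(feature_map[i * stride + u][j * stride + v]
                for u in range(window) for v in range(window))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 feature map.
feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]]

pooled = max_pool2d(feature_map)  # [[6, 4], [8, 9]]

# A 28x28 map shrinks to 14x14, matching the example above.
large = max_pool2d([[0] * 28 for _ in range(28)])
large_shape = (len(large), len(large[0]))  # (14, 14)
```

Pooling is applied to each of the 32 feature maps independently, so the depth stays at 32 while the height and width are halved.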

Analysis: the earlier convolution and pooling layers are equivalent to feature engineering, and the later fully connected layers are equivalent to feature weighting. The last fully connected layer plays the role of “classifier” in the whole convolutional neural network.

So the neural network is also equivalent to a feature selection method


Characteristics of Convolutional Neural Networks

1. Convolutional layers are characterized by sparse interactions, parameter sharing, and equivariant representations.

2. This structure allows convolutional neural networks to exploit the two-dimensional structure of the input data. Compared with other deep learning structures, convolutional neural networks give better results in image and speech recognition. The model can also be trained with the backpropagation algorithm. Convolutional neural networks have fewer parameters to estimate than other deep feed-forward neural networks, which makes them an attractive deep learning structure.
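The saving from parameter sharing can be made concrete with a rough count. The sizes here are assumptions chosen for illustration (a 28×28 single-channel input, a 24×24 hidden layer, 5×5 filters), and `dense_params` and `conv_params` are hypothetical helpers.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    """Weights plus one bias per filter; the same weights are reused at every position."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters


dense = dense_params(28 * 28, 24 * 24)  # 452160
one_filter = conv_params(5, 5, 1, 1)    # 26
many_filters = conv_params(5, 5, 1, 32) # 832
```

Under these assumptions, even 32 shared filters need far fewer parameters than a single fully connected layer of comparable size.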

Application areas:

1. Image recognition: convolutional neural networks are commonly used in image analysis and image processing. The two fields are closely related and overlap to some degree, but they are different. Image processing focuses on signal-processing research, such as image contrast adjustment, image coding, denoising, and various kinds of filtering.

2. Image analysis focuses more on the content of the image, including but not limited to the use of image processing techniques; it leans toward analyzing, interpreting, and identifying image content. Image analysis is therefore more closely related to pattern recognition and computer vision within computer science.

Structure of Convolutional Neural Network

1. The most common convolutional neural network structure is as follows: INPUT-[[CONV-RELU]*N-POOL?]*M-[FC-RELU]*K-FC, where * denotes the number of repetitions and POOL? denotes an optional pooling layer.

2. Current convolutional neural networks are generally feed-forward neural networks built by interleaving convolutional, pooling, and fully connected layers, and they are trained with the backpropagation algorithm. Convolutional neural networks have three structural properties: local connectivity, weight sharing, and pooling. These properties give the convolutional neural network some degree of translation, scaling, and rotation invariance.

3. Convolutional neural networks (CNNs) are feed-forward neural networks. They were inspired by the biological mechanism of the receptive field. The receptive field mainly refers to certain properties of neurons in the auditory, proprioceptive, and visual systems.
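The layer pattern INPUT-[[CONV-RELU]*N-POOL?]*M-[FC-RELU]*K-FC from item 1 can be expanded mechanically. The sketch below only builds the list of layer names; `build_layer_sequence` is a hypothetical helper, and it simplifies the optional POOL? to a single flag applied to every block.

```python
def build_layer_sequence(n, m, k, use_pool=True):
    """Expand the pattern into an ordered list of layer names."""
    layers = ["INPUT"]
    for _ in range(m):
        # N repetitions of CONV followed by RELU.
        layers += ["CONV", "RELU"] * n
        # The pooling layer after each block is optional.
        if use_pool:
            layers.append("POOL")
    # K repetitions of FC followed by RELU, then the final FC layer.
    layers += ["FC", "RELU"] * k
    layers.append("FC")
    return layers


sequence = build_layer_sequence(n=2, m=1, k=1)
text = "-".join(sequence)  # "INPUT-CONV-RELU-CONV-RELU-POOL-FC-RELU-FC"
```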

Convolutional Neural Networks Commonly Understood

Convolutional Neural Networks are commonly understood as follows:

Convolutional Neural Networks (CNNs)-Structure

① A CNN structure generally contains these layers:

Input Layer: used for the input of the data

Convolutional Layer: uses convolution kernel for feature extraction and feature mapping

Activation Layer (excitation layer): since convolution is a linear operation, a nonlinear mapping needs to be added

Pooling Layer: performs downsampling to make the feature map sparser and reduce the amount of computation

Fully Connected Layer: usually placed at the tail of the CNN to re-fit the features and reduce the loss of feature information

Output Layer: Used for outputting the result

②There are also some other functional layers in the middle that can be used:

Normalization Layer (Batch Normalization): normalizes the features inside the CNN

Slice Split Layer: separate learning of certain (image) data in separate regions

Fusion Layer: fusion of branches that learn features independently


Convolutional Neural Networks (CNNs) – Input Layer

① The input format of the input layer of a CNN preserves the structure of the image itself.

② For a black-and-white 28×28 picture, the input to the CNN is a 28×28 two-dimensional array of neurons.

③ For a 28×28 picture in RGB format, the input to the CNN is a 3×28×28 three-dimensional array of neurons (each color channel in RGB has a 28×28 matrix)
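The two input layouts described above can be pictured as nested lists. This is only a sketch of the shapes, using placeholder pixel values of 0.0 and a hypothetical `shape` helper.

```python
def shape(tensor):
    """Return the size of each nesting level of a regular nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


# Black-and-white picture: one 28x28 matrix.
gray_input = [[0.0] * 28 for _ in range(28)]

# RGB picture: three 28x28 matrices, one per color channel.
rgb_input = [[[0.0] * 28 for _ in range(28)] for _ in range(3)]

gray_shape = shape(gray_input)  # (28, 28)
rgb_shape = shape(rgb_input)    # (3, 28, 28)
```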

Convolutional Neural Networks (CNNs) – Convolutional Layer

Receptive Field

① The convolutional layer involves several important concepts:



② Assume the input is a 28×28 two-dimensional array of neurons and we define 5×5 local receptive fields, i.e., each neuron in the hidden layer is connected to a 5×5 region of neurons in the input layer; this 5×5 region is called a local receptive field,