### What is the normal distribution?

The density function of the standard normal distribution is f(x) = (1/√(2π))·exp(-x^2/2), where exp(-x^2/2) means e raised to the power -x^2/2. Its domain is (-∞, +∞), and the expression shows that f(x) is an even function, i.e. the graph of f(x) is symmetric about the y-axis.

Φ(x) denotes the distribution function of a random variable X that follows the standard normal distribution. Its value is the integral of f(t) from -∞ to x. On the graph of f(x), Φ(x) is the area under the curve and above the x-axis over the interval (-∞, x). Since f(x) is an even function and the distribution function satisfies Φ(+∞) = 1, it follows that Φ(0) = 0.5.
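The relationship between f(x) and Φ(x) can be checked numerically. The following is a minimal sketch that uses only Python's standard library; the function names `standard_normal_pdf` and `standard_normal_cdf` are illustrative choices, and the CDF is written with the error function via the identity Φ(x) = 0.5·(1 + erf(x/√2)).

```python
import math

def standard_normal_pdf(x: float) -> float:
    """Density f(x) = (1/sqrt(2*pi)) * exp(-x^2/2)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def standard_normal_cdf(x: float) -> float:
    """Phi(x), written with the error function: 0.5 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Phi(0) should be 0.5 because f is even and the total area is 1.
phi_at_zero = standard_normal_cdf(0.0)

# f is even, so f(1.5) and f(-1.5) should agree.
symmetry_gap = standard_normal_pdf(1.5) - standard_normal_pdf(-1.5)

print(phi_at_zero, symmetry_gap)
```

The same symmetry gives Φ(-x) = 1 - Φ(x), which is why tables of Φ usually list only non-negative x.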

Properties of Probability Density Function of Normal Distribution

Concentration: the peak of the normal curve is located in the center, where the mean is located.

Symmetry: the normal curve is symmetric about the mean, and the ends of the curve never intersect the horizontal axis.

Uniform variability: starting from the mean, the normal curve falls away smoothly and symmetrically to the left and to the right. The area between the curve and the horizontal axis is always equal to 1, because the probability density function integrates to 1 from negative infinity to positive infinity. That is, the frequencies sum to 100%.

### What is the nature of the probability density function?

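For a random variable X with density f(x) = x/2 on (0, 2) (the density implied by the integrals below): EX = 4/3, DX = 2/9, P{|X-EX| < DX} = 8/27.

Calculation:

EX = ∫(0,2) x*(x/2) dx

= ∫(0,2) x^2/2 dx

= x^3/6 |(0,2)

= 4/3

DX = E(X^2) - (EX)^2 = E(X^2) - (4/3)*(4/3)

= ∫(0,2) x^3/2 dx - 16/9

= x^4/8 |(0,2) - 16/9 = 2/9

P{|X-4/3| < 2/9} = ∫(10/9,14/9) x/2 dx = 8/27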
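The calculation can be double-checked with exact fractions. This is a small sketch that assumes the density f(x) = x/2 on (0, 2), which is what the integrands in the calculation imply; the helper name `moment_integral` is hypothetical and simply evaluates the antiderivative of x^k · (x/2).

```python
from fractions import Fraction

def moment_integral(k: int, a, b) -> Fraction:
    """Exact integral of x**k * (x/2) over [a, b], using x**(k+2) / (2*(k+2))."""
    a, b = Fraction(a), Fraction(b)
    n = k + 2
    return (b**n - a**n) / (2 * n)

mean = moment_integral(1, 0, 2)            # EX
second_moment = moment_integral(2, 0, 2)   # E(X^2)
variance = second_moment - mean**2         # DX = E(X^2) - (EX)^2

# P{|X - EX| < DX} is the integral of the density over (EX - DX, EX + DX).
prob = moment_integral(0, mean - variance, mean + variance)

print(mean, variance, prob)
```

Working with `Fraction` avoids floating-point rounding, so the printed values can be compared directly with the hand calculation.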

Extended information:

Properties of probability densities:

Nonnegativity: f(x) ≥ 0 for all x ∈ (-∞, +∞).

Normality: ∫f(x)dx = 1 over (-∞, +∞).

These two basic properties can be used to determine whether a function is the probability density function of some continuous random variable.

### Properties of Probability Density Functions

Nonnegativity: f(x) ≥ 0, x ∈ (-∞, +∞). Normality: ∫f(x)dx=1. These two basic properties can be used to determine whether a function is a probability density function of some continuous random variable.
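The two properties can be turned into a rough numerical check. The sketch below assumes the candidate function is zero outside a known interval [a, b] and uses a simple midpoint rule; the name `looks_like_density` and the tolerance are illustrative choices, and a grid check of this kind can only suggest, not prove, that a function is a density.

```python
def looks_like_density(f, a, b, n=100_000, tol=1e-6):
    """Check f >= 0 on a grid over [a, b] and that its midpoint-rule integral is close to 1."""
    h = (b - a) / n
    values = [f(a + (i + 0.5) * h) for i in range(n)]
    nonnegative = all(v >= 0.0 for v in values)   # nonnegativity
    total_area = h * sum(values)                  # normality
    return nonnegative and abs(total_area - 1.0) < tol

is_density = looks_like_density(lambda x: x / 2.0, 0.0, 2.0)      # integrates to 1
not_normalized = looks_like_density(lambda x: x, 0.0, 2.0)        # integrates to 2
goes_negative = looks_like_density(lambda x: 1.5 - x, 0.0, 2.0)   # negative for x > 1.5

print(is_density, not_normalized, goes_negative)
```

The third candidate integrates to 1 but still fails, which shows that both properties are needed.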

In mathematics, the probability density function of a continuous random variable (which can be abbreviated to density function when not confusing) is a function that describes the likelihood that the output value of this random variable will be in the vicinity of some definite point of value.

The probability that the value of the random variable falls within a certain region is the integral of the probability density function over that region. When a probability density function exists, the cumulative distribution function is the integral of the probability density function. The probability density function is usually denoted by a lowercase letter.

There is no practical significance in speaking of the probability density alone; it must be predicated on a definite bounded interval. You can think of the probability density as a vertical coordinate, the interval as a horizontal coordinate, the integral of the probability density over the interval is the area, and this area is the probability that an event will occur in this interval, and the sum of all the areas is 1. So analyzing the probability density of a point in isolation doesn’t make any sense, and it has to have the interval as a reference and comparison.

### Why the Normal Distribution From High School Is So Important

We have been learning about the normal distribution since high school, and now we can’t do data analytics or machine learning without it. Have you ever wondered what is so special about it, and why so many articles about data science and machine learning are centered on it? This article tries to clarify the concept of the normal distribution in an easy-to-understand way.

The world of machine learning is centered around probability distributions, and at the heart of probability distributions is the normal distribution. This article explains what a normal distribution is and why it is so widely used, especially for data scientists and machine learning experts.

We’ll start by explaining the basics so that readers can understand why the normal distribution is so important.


Let’s start with a little background:

1. First, the most important thing to note is that the normal distribution is also known as the Gaussian distribution.

2. It is named after the genius Carl Friedrich Gauss.

3. Finally, it is important to note that simple predictive models are generally the most commonly used models because they are easy to explain and easy to understand. Add to this the fact that the normal distribution is popular because of its simplicity.

Thus, the normal probability distribution is well worth the time it takes to understand.

What is a probability distribution?

Imagine we’re building predictive models of interest in our own data science projects:

The higher the probability, the more likely it is that the event will occur.


As an example, we could repeat an experiment a large number of times and record the values of the variables we retrieve so that the probability distributions slowly unfold in front of us.

Each experiment produces a value, and these values can be assigned to categories/buckets. For each bucket, we record the number of times a value falls into it. For example, we could throw a die 10,000 times. Each throw produces one of 6 possible values, so we create 6 buckets and record the number of times each value appears.
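The dice experiment is easy to simulate. The following is a minimal sketch using Python's standard library; the seed and the variable names are arbitrary choices made for reproducibility and readability.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is reproducible

# Throw a fair six-sided die 10,000 times.
throws = [rng.randint(1, 6) for _ in range(10_000)]

# One bucket per face: count how often each value appears.
bucket_counts = Counter(throws)

# Relative frequencies approximate the probability of each face (1/6 for a fair die).
relative_freq = {face: bucket_counts[face] / len(throws) for face in range(1, 7)}

for face in range(1, 7):
    print(face, bucket_counts[face], round(relative_freq[face], 3))
```

As the number of throws grows, the relative frequencies settle closer to 1/6, which is the sense in which the distribution "unfolds" from repeated experiments.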

We can make a graph based on these counts. The resulting curve is a probability distribution curve: it shows the probability of each value that the target variable can take.

After understanding how the values are distributed, we can start estimating the probability of an event, even by using a formula (the probability distribution function). As a result, we can better understand the variable's behavior. A probability distribution is characterized by its moments, such as the mean, standard deviation, skewness, and kurtosis. If all the probabilities are summed, the total is 100%.

There are many probability distributions in the real world, the most commonly used being the ‘normal distribution’.

What is a Normal Probability Distribution

If you graph a probability distribution and get a bell-shaped curve where the mean, the mode, and the median of the samples are equal, then the variable is approximately normally distributed.

This is an example of a normally distributed bell curve:

The graph above shows the Gaussian distribution of a single variable. A model with millions of parameters, such as a neural network, has a separate distribution shape for each parameter and an extremely complicated joint distribution shape. This high-dimensional joint distribution then dominates the performance on different tasks, so it’s important to understand and estimate the probability distribution of the target variable.

The following variables are very close to a normal distribution:

1. the height of the population

2. the blood pressure of the adults

3. the position of the diffused particles

4. the measurement error

5. the shoe size of the population

6. the time it takes for the employees to get home

Additionally, most of the variables around us are approximately normally distributed at some confidence level x% (x < 100). So a great many of the variables that occur frequently in everyday life can be described by a Gaussian distribution.

The normal distribution is easy to understand

The normal distribution is a distribution that relies on only two parameters in the data set: the mean and the standard deviation of the sample.

This property of the distribution saves statisticians a lot of work, and predictions for a variable that is normally distributed are usually very accurate. It’s worth noting that once you’ve studied the probability distributions of many variables in nature, you’ll find that a lot of them roughly follow a normal distribution.

The normal distribution is easy to explain because:

1. the mean, the mode, and the median of the distribution are equal;

2. we can explain the entire distribution simply by using the mean and the standard deviation.

Why do so many variables approximate a normal distribution?

Why is it that, when there are many samples, a large share of them cluster around the same common values? There is a theorem behind this idea: when you add up a large number of independent random variables and repeat the experiment many times, the distribution of the sum is very close to normal (normality).

Human height is a random variable that depends on other random variables (such as the amount of nutrients a person consumes, the environment they live in, and their genes), and the combined effect of these random variables ends up being very close to normal. This is the central limit theorem.
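The theorem can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions: it sums 30 independent uniform(0, 1) draws, repeats that 20,000 times, and checks that the sums behave like a normal variable. The numbers 30 and 20,000 and the seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed for reproducibility
terms_per_sum = 30       # how many uniform variables are added together
trials = 20_000          # how many times the experiment is repeated

# Each trial adds up 30 independent uniform(0, 1) values.
sums = [sum(rng.random() for _ in range(terms_per_sum)) for _ in range(trials)]

mean_sum = statistics.fmean(sums)   # theory: 30 * 0.5 = 15
sd_sum = statistics.pstdev(sums)    # theory: sqrt(30 / 12), about 1.58

# For a normal variable, roughly 68% of values lie within one standard deviation.
within_one_sd = sum(abs(s - mean_sum) < sd_sum for s in sums) / trials

print(round(mean_sum, 3), round(sd_sum, 3), round(within_one_sd, 3))
```

A single uniform variable is flat, not bell-shaped, yet the sum of many of them already shows the one-standard-deviation share expected of a normal distribution.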

We learned from the previous section that the normal distribution arises as the distribution of the sum of many random variables. If we graph the density function of a normal distribution, we get a bell curve.

The bell curve considered here has a mean of 100 and a standard deviation of 1.

It illustrates the very well known 3-σ principle, which is:

1. about 68.3% of the values fall within one standard deviation of the mean;

2. about 95.4% of the values fall within two standard deviations of the mean;

3. about 99.7% of the values fall within three standard deviations of the mean.

This way we can easily estimate the volatility of a variable and state a level of confidence in the values it is likely to take. For example, for this bell curve, the probability that the variable takes a value between 99 and 101 is about 68.3%. Imagine how much more confident you can be when you make a decision based on information like this.
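These percentages can be reproduced with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This is a small sketch for a normal distribution with mean 100 and standard deviation 1; the helper name `prob_within` is an illustrative choice.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=1)

def prob_within(k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for the distribution above."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")

# Probability that the variable falls between 99 and 101.
p_99_101 = dist.cdf(101) - dist.cdf(99)
```

The shares depend only on the number of standard deviations, so the same three figures apply to any mean and standard deviation.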

Probability Density Function

The probability density function of a normal distribution with mean μ and standard deviation σ is:

f(x) = (1/(σ√(2π)))·exp(-(x-μ)^2/(2σ^2))

The probability density function describes how likely a continuous random variable is to take values near a given point; integrating it over an interval gives the probability that the variable falls in that interval. For example, if you want to know the probability that the variable lies between 0 and 1, you integrate the probability density function from 0 to 1.
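To make the "between 0 and 1" example concrete, the sketch below integrates a normal density numerically and compares the result with the difference of two CDF values. It assumes a standard normal variable (mean 0, standard deviation 1), which the text does not specify; `normal_pdf` and `integrate_midpoint` are hypothetical helper names.

```python
import math
from statistics import NormalDist

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate_midpoint(f, a: float, b: float, n: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# P(0 < X < 1) as the area under the density between 0 and 1.
area = integrate_midpoint(normal_pdf, 0.0, 1.0)

# The same probability from the cumulative distribution function.
exact = NormalDist().cdf(1.0) - NormalDist().cdf(0.0)

print(round(area, 4), round(exact, 4))
```

Note that the density value at a single point, such as `normal_pdf(0.0)`, is not itself a probability; only the area over an interval is.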

How can I find the distribution of a feature in Python?

The easiest way I’ve used is to load all the features into a Pandas DataFrame and then call its methods directly to get a histogram of each feature’s distribution.
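The original code for this step is not shown, so the following is only a sketch of one way to do it. It builds a DataFrame from synthetic data (the column names `normal_feature` and `skewed_feature` are made up for illustration) and counts values per bin with `Series.value_counts(bins=...)`; with matplotlib installed, `df.hist(bins=10)` would draw the same counts as bar charts.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic example features; in practice these would be loaded from your data set.
df = pd.DataFrame({
    "normal_feature": rng.normal(loc=0.0, scale=1.0, size=1000),
    "skewed_feature": rng.exponential(scale=1.0, size=1000),
})

# Count how many values of each feature fall into each of 10 equal-width bins.
bin_counts = {col: df[col].value_counts(bins=10, sort=False) for col in df.columns}

# Sample skewness per feature: near 0 for a symmetric feature, clearly positive for a right-skewed one.
skewness = df.skew()

for col, counts in bin_counts.items():
    print(col)
    print(counts)
print(skewness)
```

Comparing the bin counts of the two columns shows the difference between a roughly bell-shaped feature and a skewed one.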

The bins argument sets the number of bars in the histogram. Of course, the distribution plotted above is not a normal distribution, so what does it mean when a variable follows a normal distribution?

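It means that if you add together a large number of independent random variables, even ones with different distributions, the new variable ends up approximately following a normal distribution, which is the beauty of the central limit theorem. Furthermore, sums and linear transformations of normally distributed variables remain normally distributed. For example, if A and B are two independent variables that each follow a normal distribution, then A + B is also normally distributed, with mean equal to the sum of the two means and variance equal to the sum of the two variances.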
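The behavior of the sum of two normal variables can be checked by simulation. The sketch below assumes A and B are independent, with arbitrarily chosen parameters A ~ N(10, 3^2) and B ~ N(-2, 4^2), so the sum should have mean 8 and standard deviation 5.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed for reproducibility
n = 50_000

# Independent draws from two normal distributions (parameters are illustrative).
a = [rng.gauss(10.0, 3.0) for _ in range(n)]
b = [rng.gauss(-2.0, 4.0) for _ in range(n)]
total = [x + y for x, y in zip(a, b)]

mean_total = statistics.fmean(total)     # theory: 10 + (-2) = 8
stdev_total = statistics.pstdev(total)   # theory: sqrt(3**2 + 4**2) = 5

# If the sum is normal, about 68.3% of values lie within one standard deviation of the mean.
share_within_one_sd = sum(abs(t - 8.0) < 5.0 for t in total) / n

print(round(mean_total, 2), round(stdev_total, 2), round(share_within_one_sd, 3))
```

The variances add, not the standard deviations, which is why the sum has standard deviation 5 and not 7.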

Transforming variables so that they become approximately normal

If a sample follows some unknown distribution, it can often be brought close to a normal distribution through a series of transformations. Conversely, combining and transforming standard normal variables can produce a wide range of other distributions. Transforming from the standard normal to an unknown distribution is what many machine learning models aim to do, whether it’s a VAE or a GAN in vision, or models in other domains.

But in traditional statistics, we prefer to convert the distribution of features to a normal distribution, because normal distributions are simple and easy to compute with. Below are a few ways to convert features toward a standard normal distribution, some of which you may have learned in high school.

1. Linear transformation

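After we collect the samples of a variable, we can apply a linear transformation to them. Compute the sample mean and the sample standard deviation, then use the following formula to calculate Z for each value of x:

Z = (x - mean) / standard deviation

The standardized Z has a mean of 0 and a standard deviation of 1. Note that a linear transformation does not change the shape of the distribution: Z follows the standard normal distribution only if x was normally distributed to begin with. Putting features on a common scale in this way is also part of the benefit of batch normalization and other normalization methods.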
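The linear transformation is the familiar z-score, Z = (x - mean) / standard deviation. The sketch below applies it to synthetic data using only the standard library; the sample of "heights" and its parameters are invented for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for reproducibility

# Synthetic sample: 5,000 values with mean about 170 and standard deviation about 8.
x = [rng.gauss(170.0, 8.0) for _ in range(5_000)]

mean_x = statistics.fmean(x)
sd_x = statistics.pstdev(x)  # population standard deviation of the sample

# Z = (x - mean) / standard deviation, applied to every value.
z = [(value - mean_x) / sd_x for value in x]

print(round(statistics.fmean(z), 6), round(statistics.pstdev(z), 6))
```

After the transformation the values have mean 0 and standard deviation 1, but the shape of the histogram is unchanged, so a skewed input stays skewed.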

2. Box-Cox transform

You can use Python’s SciPy package to transform the data toward a normal distribution with the Box-Cox transform, which requires strictly positive data.
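A minimal sketch of the SciPy call, assuming strictly positive input data: `scipy.stats.boxcox` returns the transformed values together with the fitted λ when no λ is supplied. The log-normal sample is synthetic and chosen only because it is clearly right-skewed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic right-skewed, strictly positive data (Box-Cox requires positive values).
skewed = rng.lognormal(mean=0.0, sigma=0.8, size=2_000)

# With no lambda given, boxcox estimates it by maximum likelihood and returns it.
transformed, fitted_lambda = stats.boxcox(skewed)

skew_before = stats.skew(skewed)
skew_after = stats.skew(transformed)

print(f"lambda={fitted_lambda:.3f}, skew before={skew_before:.3f}, after={skew_after:.3f}")
```

A skewness close to 0 after the transform suggests the data are more symmetric, though it does not by itself prove normality.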

3. Yeo-Johnson transform

Additionally, you can use the powerful Yeo-Johnson transform, which also works for zero and negative values. Python’s scikit-learn provides a suitable function for it.
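The original snippet is not shown, so this is a sketch of one way to do it with scikit-learn's `PowerTransformer`, which supports `method="yeo-johnson"`. The input data are synthetic and deliberately include negative values.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)

# Synthetic skewed feature with some negative values; shape (n_samples, n_features).
feature = rng.exponential(scale=2.0, size=(2_000, 1)) - 1.0

# standardize=True also rescales the output to zero mean and unit variance.
transformer = PowerTransformer(method="yeo-johnson", standardize=True)
transformed = transformer.fit_transform(feature)

fitted_lambda = transformer.lambdas_[0]  # one fitted lambda per feature

print(f"lambda={fitted_lambda:.3f}, mean={transformed.mean():.3f}, std={transformed.std():.3f}")
```

The fitted transformer can be reused on new data with `transformer.transform(...)`, so the same λ is applied at training and prediction time.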

Finally, it is very important to note that it is unwise to assume that variables follow a normal distribution without doing any analysis.

Take samples that follow a Poisson distribution, a t-distribution (Student's t-distribution), or a binomial distribution, for example: incorrectly assuming that such variables follow a normal distribution may give incorrect results.
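One way to do such an analysis is a formal normality test. The sketch below uses `scipy.stats.normaltest` (the D'Agostino-Pearson test) on a synthetic Poisson sample and on a normal sample with the same mean and variance; the choice of test and the parameters are assumptions made for illustration, and other checks such as a histogram or a Q-Q plot are equally reasonable.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# A Poisson sample with a small rate is discrete and right-skewed.
poisson_sample = rng.poisson(lam=2.0, size=1_000)

# A normal sample with the same mean (2) and variance (2), for comparison.
normal_sample = rng.normal(loc=2.0, scale=np.sqrt(2.0), size=1_000)

# normaltest combines skewness and kurtosis; a small p-value is evidence against normality.
poisson_stat, poisson_p = stats.normaltest(poisson_sample)
normal_stat, normal_p = stats.normaltest(normal_sample)

print(f"Poisson: statistic={poisson_stat:.1f}, p={poisson_p:.3g}")
print(f"Normal:  statistic={normal_stat:.1f}, p={normal_p:.3g}")
```

A very small p-value for the Poisson sample indicates that treating it as normal would be unjustified, even though its mean and variance match the normal sample.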

These are some of the discussions of the normal distribution.


### I want to know the meaning of the probability density

I. Meaning

A probability density is only meaningful with respect to a definite bounded interval. You can think of the probability density as the vertical coordinate and the interval as the horizontal coordinate: the integral of the probability density over the interval is an area, and this area is the probability that the event occurs in this interval. The sum of all the areas is 1. So analyzing the probability density at a single point on its own has no significance; it must have an interval as a reference and comparison.

II. Definition

For a random variable X with distribution function F(x), if there exists a non-negative integrable function f(x) such that for any real number x,

F(x) = ∫(-∞,x) f(t) dt

then X is called a continuous random variable, and f(x) is called the probability density function of X, or simply the probability density.

Expansion

Because the density functions of the random variables X and Y may be non-zero only on part of the real line, the integration limits in the formula above become more complicated, and they are usually functions of z. Taking the formula that integrates over x as an example, the specific approach to determining the integration limits for x is as follows.