### What are the problems addressed by statistical regression analysis

The main problems studied in regression analysis are:

(1) determining the expression of the quantitative relationship between Y and X. This expression is called the regression equation;

(2) testing the credibility of the regression equation that is derived;

(3) determining whether or not the independent variable, X, has an effect on the dependent variable, Y;

(4) using the obtained regression equation for prediction and control.

The main elements of regression analysis are:

(1) Starting from a set of data, determining the quantitative equation relating certain variables, that is, establishing a mathematical model and estimating its unknown parameters. The most common method of estimating the parameters is least squares.

(2) Testing the degree of confidence that can be placed in these relational equations.

(3) In a relationship where many independent variables jointly influence a dependent variable, determining which independent variables have a significant effect and which do not, keeping the significant ones in the model and eliminating the insignificant ones. This is usually done with stepwise, forward, or backward regression, among other methods.

(4) Using the resulting relational equation to predict or control a production process. Regression analysis is applied very widely, and statistical software packages make it easy to compute the various regression methods.

In regression analysis, variables are divided into two categories. One category is the dependent variables, which are usually the indicators of interest in the actual problem and are denoted by Y; the other category, the variables that affect the value of the dependent variable, are called independent variables and are denoted by X.
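As a sketch of the least-squares estimation mentioned above, the following minimal example (made-up data, NumPy only) fits a simple regression equation of Y on a single X:

```python
import numpy as np

# Hypothetical data: y depends linearly on one predictor x plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Least squares: find the intercept b0 and slope b1 minimizing
# the sum of squared residuals sum((y - b0 - b1*x)^2).
X = np.column_stack([np.ones_like(x), x])      # design matrix [1, x]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves (X'X) b = X'y
b0, b1 = beta

print(f"estimated regression equation: y = {b0:.2f} + {b1:.2f} x")
```

The estimated coefficients should land close to the true values 2 and 3 used to generate the data; the same normal-equations idea extends to multiple independent variables.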

### A question about independent-variable selection and elimination in stepwise regression analysis, please help.

### How to interpret the results of stepwise regression analysis done by SPSS?

1. Dividing each independent variable's standardized B by the sum of the standardized Bs of all independent variables gives a percentage that can be read as that variable's contribution to the dependent variable.
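A rough sketch of this contribution heuristic (note: this is the percentage heuristic described above, not a rigorous variance decomposition; the data and variable names are made up):

```python
import numpy as np

# Hypothetical data: y depends more strongly on x1 than on x2.
rng = np.random.default_rng(3)
n = 500
x1, x2 = rng.normal(size=(2, n))
y = 0.5 + 3.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Ordinary least squares fit with an intercept.
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Standardized B: unstandardized coefficient scaled by sd(x)/sd(y),
# i.e. the coefficient the fit would have on z-scored variables.
std_b = beta[1:] * np.array([x1.std(), x2.std()]) / y.std()

# The heuristic: each |standardized B| as a share of their total.
share = np.abs(std_b) / np.abs(std_b).sum()
print(dict(zip(["x1", "x2"], share.round(2))))
```

With these made-up coefficients, x1 accounts for roughly three quarters of the total standardized effect, matching the 3:1 ratio of the true slopes.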

2. The basic idea of stepwise regression is to introduce the variables into the model one at a time. After each explanatory variable is introduced, an F-test is performed, and t-tests are carried out one by one on the explanatory variables already selected. When an explanatory variable that was introduced earlier is no longer significant because of the introduction of later explanatory variables, it is deleted.

This ensures that, before each new variable is introduced, the regression equation contains only significant explanatory variables. The process iterates until no significant explanatory variable remains to be added and no insignificant explanatory variable remains to be removed, which ensures that the final set of explanatory variables is optimal.

Expanded information:

SPSS stepwise regression analysis:

When there are many independent variables, some of them may have little influence on the dependent variable, and the x variables may not be completely independent of one another; there may be various interactions among them. In this case, stepwise regression analysis can be used to screen the x factors, so that the resulting multiple regression model predicts better.

Stepwise regression analysis first establishes the overall regression equation between the dependent variable y and the independent variables x, and then performs hypothesis tests on the overall equation and on each individual independent variable. When the overall equation is not significant, the linear relationship of the multiple regression equation does not hold; when the effect of an individual independent variable on y is not significant, it should be eliminated and the multiple regression equation re-established without that factor. Screening out the factors with significant effects as independent variables yields the “optimal” regression equation.

The more independent variables are included in the regression equation, the larger the regression sum of squares, the smaller the residual sum of squares and residual mean square, the smaller the error in the predicted values, and the better the fit. However, the more variables in the equation, the greater the forecasting workload, and forecasting factors with insignificant correlation will hurt the forecasting effect. Therefore, choosing an appropriate number of variables in the multiple regression model is especially important.
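This trade-off can be seen numerically: R² never decreases when variables are added, even pure noise, whereas a penalized measure such as adjusted R² accounts for the extra parameters. A minimal sketch with made-up data:

```python
import numpy as np

def r2_adj(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with an intercept."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    ss_res = np.sum((y - Xd @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    # Penalize for the number of fitted parameters (k predictors + intercept).
    adj = 1 - (1 - r2) * (n - 1) / (n - Xd.shape[1])
    return r2, adj

# Hypothetical data: y depends on one real predictor; the other ten
# candidate predictors are unrelated noise.
rng = np.random.default_rng(2)
n = 60
x_real = rng.normal(size=n)
y = 1.0 + 2.0 * x_real + rng.normal(size=n)
noise_vars = rng.normal(size=(n, 10))

r2_small, adj_small = r2_adj(x_real.reshape(-1, 1), y)
r2_big, adj_big = r2_adj(np.column_stack([x_real.reshape(-1, 1), noise_vars]), y)

print(f"1 predictor : R2={r2_small:.3f}  adj R2={adj_small:.3f}")
print(f"11 predictors: R2={r2_big:.3f}  adj R2={adj_big:.3f}")
```

Adding the ten noise predictors inflates R² mechanically, which is why stepwise methods use significance tests (or penalized criteria) rather than raw R² to decide how many variables to keep.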

### How to analyze the results of EViews stepwise regression

EViews supports performing stepwise regression automatically.

Suppose the dependent variable is Y, the constant is C, and the explanatory variables are X1, X2, X3, and X4.

The detailed steps are:

1. Quick → Estimate Equation, then select Method: STEPLS;

2. In Dependent Variable, enter Y; in List of search regressors, enter C X1 X2 X3 X4;

3. Pay special attention to setting the stopping criteria (Stopping Criteria) in Options: choose the p-value significance level as the basis for inclusion and removal; assuming the test level is 5%, set the two values to 0.05 and 0.051;

4. Under Stepwise, choose forward or backward according to your own needs.

Click OK to execute.

I compared the automatically executed stepwise regression with a manual procedure of running a one-way regression for each explanatory variable and adding explanatory variables in order of goodness of fit. The resulting regression equations differ slightly, but both still effectively avoid the problem of multicollinearity.