Fundamentals of Statistics 15: Time Series Analysis
Time series analysis breaks an original series, such as sales, into four components (trend, cycle, period, and instability) and then recombines these factors to produce a forecast. In remote sensing, the emphasis is on extracting image-related features and analyzing the process of change and the scale of development through continuous observation of an area over a certain period of time. The monitoring period must first be chosen according to how the observed object changes over time, so that appropriate remote sensing data can be selected.
For example, rainfall:
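If a time series $\{\varepsilon_t\}$ meets the following conditions:
$$E[\varepsilon_t] = 0, \qquad \operatorname{Var}(\varepsilon_t) = \sigma^2 < \infty, \qquad \operatorname{Cov}(\varepsilon_t, \varepsilon_s) = 0 \ \text{ for } t \neq s,$$
then it is called white noise.
In particular, it is Gaussian white noise if each value also follows a normal distribution with zero mean and finite variance.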
In time series models, we generally decompose the data into a part that can be fitted (or predicted) plus a random part caused by white noise.
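The white noise properties can be checked numerically. The following is a minimal sketch, not a formal test: it draws Gaussian white noise with Python's standard library and prints the sample mean, the sample variance, and the lag-1 autocorrelation. The function names, the sample size of 5000, and the seed are illustrative assumptions, not part of the original text.

```python
import random
import statistics


def gaussian_white_noise(n, sigma=1.0, seed=0):
    """Draw n independent N(0, sigma^2) values (Gaussian white noise)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def lag_autocorrelation(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    mean = statistics.fmean(x)
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, len(x)))
    return num / denom


noise = gaussian_white_noise(5000)
print("sample mean:", round(statistics.fmean(noise), 3))
print("sample variance:", round(statistics.pvariance(noise), 3))
print("lag-1 autocorrelation:", round(lag_autocorrelation(noise, 1), 3))
```

For a series that really is white noise, the mean and the lag-1 autocorrelation should both be close to zero, and the variance close to sigma squared.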
Number of objects under study:
Statistical properties of the series:
Influences on the time series:
Superposition of influences:
What are the elements of time series analysis?
A time series lends itself to graphical representation, with a value axis and a time axis.
The forecast object, the forecast target, and the factors that influence the forecast are all treated as ordered in time, that is, as functions of time. The time series method studies the forecast object's own process of change and its development trend. The future can also be projected from the causal relationships between the forecast object and its influencing factors, and from how strong those influences are. Many factors are related to the target, so only those with a strong causal relationship should be selected for prediction.
A time series has four components: long-term trend, seasonal variation, cyclical variation, and irregular variation.
1. Long-term trend (T): the general direction of change that a phenomenon follows over a long period, shaped by fundamental factors.
2. Seasonal variation (S): regular, periodic changes within a year that follow the change of seasons.
3. Cyclical variation (C): regular, wave-like fluctuations whose cycle spans a number of years.
4. Irregular variation (I): the remaining fluctuations, caused by chance factors, that follow no regular pattern.
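The components listed above can be separated with a classical decomposition. The sketch below is only an illustration under stated assumptions: the components are assumed to add up (Y = T + S + I), the cyclical component is folded into the trend, the period is 12, and the data are synthetic. All function names are hypothetical.

```python
import math
import random


def centered_moving_average(y, period):
    """Trend estimate: centered moving average over one full (even) period.

    Returns a list as long as y, with None where the window does not fit.
    """
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        # The two end points get half weight so the weights sum to `period`.
        total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def seasonal_indices(y, trend, period):
    """Average detrended value per position in the period, centered on zero."""
    buckets = [[] for _ in range(period)]
    for t, (value, level) in enumerate(zip(y, trend)):
        if level is not None:
            buckets[t % period].append(value - level)
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period
    return [r - offset for r in raw]


# Synthetic monthly data: linear trend + yearly sine pattern + small noise.
rng = random.Random(1)
period = 12
y = [50 + 0.5 * t + 10 * math.sin(2 * math.pi * t / period) + rng.gauss(0, 1)
     for t in range(120)]

trend = centered_moving_average(y, period)
season = seasonal_indices(y, trend, period)
# Irregular part: what is left after removing trend and seasonal pattern.
residual = [y[t] - trend[t] - season[t % period]
            for t in range(len(y)) if trend[t] is not None]

print("seasonal pattern:", [round(s, 1) for s in season])
```

A forecast is then obtained by extrapolating the trend and adding back the seasonal index for the target month. If the influences multiply rather than add, the same steps apply with ratios in place of differences.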
1. Definition: time series analysis decomposes an original series, such as sales, into four parts (trend, cycle, period, and instability) and then combines these factors to produce a forecast. In remote sensing, it emphasizes extracting image-related features and analyzing the process of change and the scale of development through continuous observation of an area over a certain period of time, with the monitoring period chosen to match how the observed object changes over time.
2. Characteristics: simple, easy to learn and apply, but of limited accuracy; generally suitable only for short-term forecasting.
3. Basic principle: first, it recognizes the continuity of how things develop, so past data can be used to infer the development trend. Second, it takes randomness into account: any development may be affected by chance, so historical data are processed statistically, for example with weighted averages.
4. Basic idea: from a finite-length record of the system's operation (the observed data), build a mathematical model that accurately reflects the dynamic dependence contained in the sequence, and use it to forecast the system's future.
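The weighted-average treatment of historical data mentioned in the basic principle can be sketched with simple exponential smoothing, one common weighted-average scheme. This is an illustrative choice rather than the method the text prescribes; the function name, the smoothing factor `alpha=0.3`, and the sales figures are made up for the example.

```python
def exponential_smoothing_forecast(history, alpha=0.3):
    """One-step-ahead forecast as a weighted average of past values.

    Newer observations get more weight; the weights decay geometrically
    with age. `alpha` (between 0 and 1) is a hypothetical tuning choice.
    """
    if not history:
        raise ValueError("history must not be empty")
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


sales = [102, 98, 105, 110, 107, 111]  # made-up illustrative numbers
print(exponential_smoothing_forecast(sales, alpha=0.3))
```

A larger `alpha` reacts faster to recent changes, while a smaller one averages out more of the chance fluctuations. This fits the remark that such methods are mainly suited to short-term forecasting.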
When to use regression analysis and when to use time series
The core difference between the two lies in the assumptions made about the data. Regression analysis assumes that each data point is independent, while time series analysis exploits the correlation between data points to make predictions.
This article first explains how the two differ in their assumptions about the data, then explains why AR models are different even though they look like regression analysis, and finally describes a common problem that arises in finance when the two are confused.
Assumptions of regression analysis on the data: independence
In regression analysis, we assume that the data points are independent of each other. This independence shows up in two ways. First, the independent variable (X) is treated as fixed, already-observed values. Second, the error terms of the dependent variable (y) are independently and identically distributed; in the classical linear regression model they follow a normal distribution with mean 0 and constant variance.
This independence means that the order of the data can be changed arbitrarily in a regression analysis. When modeling, you can feed the data to the model in any order, and you can randomly select a portion of the data to split into a training set and a validation set. Because of this, the error of each prediction in the validation set is roughly constant: errors do not accumulate and make later predictions less and less accurate.
Time series assumptions about the data: correlation
For time series analysis, however, we must assume and exploit correlation in the data. The core reason is that we have no other external data and can only use the existing observations to predict the future. We therefore assume that the data points are correlated, find that correlation through modeling, and use it to predict the future direction of the data. This is why classical time series analysis (ARIMA) uses the ACF (autocorrelation function) and PACF (partial autocorrelation function) to examine the correlation between data points.
ACF and PACF measure the correlation between data points in two different ways.
The time series assumption of correlation directly contradicts the independence assumption of regression analysis. In multivariate time series forecasting, on the one hand, the independent variables may not actually be observable for the future periods being forecast; on the other hand, errors accumulate as the forecast horizon grows, so forecasts for the distant future should be more uncertain than near-term forecasts. Time series analysis therefore requires a completely different perspective and different models.
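To make the contrast between correlation and independence concrete, the sketch below computes a sample ACF for a correlated series and for the same values in shuffled order. It is a hedged illustration: the series is simulated, the coefficient 0.8 and the seed are arbitrary assumptions, and `sample_acf` is a hypothetical helper rather than a library function.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    return [sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]


rng = random.Random(42)
# Correlated series: each value keeps 80% of the previous one plus fresh noise.
series = [0.0]
for _ in range(1999):
    series.append(0.8 * series[-1] + rng.gauss(0, 1))

shuffled = series[:]
rng.shuffle(shuffled)  # same values, time order destroyed

print("ordered :", [round(r, 2) for r in sample_acf(series, 3)])
print("shuffled:", [round(r, 2) for r in sample_acf(shuffled, 3)])
```

Shuffling leaves the set of values unchanged but removes the autocorrelation. Reordering is harmless for independent regression data, but it discards exactly the information a time series model relies on.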
The “Similarities” and Differences Between AR and Linear Regression Models
One of the fundamental models in time series analysis is the AR (Auto-Regressive) model, which uses past data points to predict the future. For example, the AR(1) model uses the current data point to predict the next value. With $c$ a constant, $\phi$ the autoregressive coefficient, and $\varepsilon_t$ a white noise error, the relationship can be written as:
$$y_t = c + \phi\, y_{t-1} + \varepsilon_t$$
It is indeed very similar to the linear regression model, and even the general AR(n) model closely resembles linear regression. The only difference is that the independent variable (X) on the right side of the equation is replaced by past values of the dependent variable (y).
It is this small difference that leads to completely different solutions for the two. In an AR model, because the independent variable is a past value of the dependent variable, it is correlated with past errors. This correlation makes the AR solution obtained by linear regression biased. A full proof of this conclusion requires too many additional concepts, so here we only analyze the AR(1) model as a special case. Without loss of generality, we can shift the data (subtract its mean) so that the AR(1) model becomes:
$$y_t = \phi\, y_{t-1} + \varepsilon_t$$
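For this type of model, $y_t = \phi\, y_{t-1} + \varepsilon_t$ with no intercept, linear regression gives the following estimate:
$$\hat{\phi} = \frac{\sum_{t=2}^{n} y_t\, y_{t-1}}{\sum_{t=2}^{n} y_{t-1}^2} = \phi + \frac{\sum_{t=2}^{n} y_{t-1}\, \varepsilon_t}{\sum_{t=2}^{n} y_{t-1}^2}$$
In an ordinary linear regression model, all independent variables are treated as fixed values that have already been observed. So when we take the expectation, the denominator can be treated as known, and unbiasedness follows from the fact that past observations are uncorrelated with future errors. In the AR model, the denominator contains $y_{t-1}$ terms that themselves depend on earlier errors, so it cannot be treated as fixed, and the expectation of the second term is generally not zero.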
Fitting a regression model to data simulated from an AR model: the parameter estimates are biased.
In practice, we still use linear regression to approximate the solution of an AR model. Although the estimates are biased, they are consistent: when the amount of data is large enough, the estimates converge to the true values. We will not go into further detail here.
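The bias and consistency described above can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not a proof: it simulates a zero-mean AR(1) series with standard normal errors, fits the no-intercept least-squares slope of each value on its predecessor, and averages the estimate over many replications. The true coefficient 0.9, the sample sizes, and the function names are all arbitrary choices for the example.

```python
import random


def simulate_ar1(phi, n, rng):
    """Simulate n points of y_t = phi * y_{t-1} + e_t, e_t ~ N(0, 1)."""
    # Start from the stationary distribution so early values are typical.
    y = [rng.gauss(0, 1) / (1 - phi ** 2) ** 0.5]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0, 1))
    return y


def ols_phi(y):
    """Least-squares slope of y_t on y_{t-1}, with no intercept."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


def mean_estimate(phi, n, replications, seed=0):
    """Average the OLS estimate over independently simulated series."""
    rng = random.Random(seed)
    total = sum(ols_phi(simulate_ar1(phi, n, rng)) for _ in range(replications))
    return total / replications


true_phi = 0.9
for n in (20, 100, 1000):
    print(n, round(mean_estimate(true_phi, n, replications=500), 3))
```

With short series the average estimate tends to fall noticeably below the true coefficient, and the gap shrinks as the series gets longer, which is what biased but consistent means in practice.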
Consequences of Ignoring Independence: a Common Mistake in Finance
Hopefully, by this point you can see why it is important not to confuse the assumptions of a model, especially the assumptions of independence and correlation. Next I'll describe a mistake I have seen in finance that comes from confusing these assumptions.
With the growth of machine learning, many people want to combine machine learning with financial markets and use data modeling to predict stock prices. They follow the traditional machine learning workflow and randomly assign their data to a training set and a test set. The training set is used to train a model that predicts the probability of a stock going up or down (a binary classification problem). When they apply the model to the test set, they find that it performs very well, reaching 80 to 90% accuracy. But it does not perform nearly as well in real-world use.
The reason for this mistake is that they did not recognize that the data were highly correlated. For time series, we cannot randomly assign training and test sets, or we end up using "future data" to predict the "past". In that case, even if the model performs well on the test set, it does not mean that it can really predict the future direction of the stock price.
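One simple way to avoid this leak is to split the data by time instead of at random. The following is a minimal sketch, assuming the records are already sorted from oldest to newest; `chronological_split`, the 20% test fraction, and the price series are hypothetical and only for illustration.

```python
def chronological_split(records, test_fraction=0.2):
    """Split time-ordered records so every test point is later than
    every training point. Assumes `records` is sorted oldest first."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, round(len(records) * test_fraction))
    cut = len(records) - n_test
    return records[:cut], records[cut:]


# Hypothetical daily observations as (day_index, price) pairs, sorted by time.
prices = [(day, 100 + 0.1 * day) for day in range(250)]
train, test = chronological_split(prices, test_fraction=0.2)

print(len(train), len(test))
print("last training day:", train[-1][0], "first test day:", test[0][0])
```

Because the test set contains only observations that come after the training period, the evaluation mimics how the model would actually be used: trained on the past and judged on data it could not have seen.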
Summary
The main difference between time series and regression analysis is the assumptions made about the data: regression analysis assumes that each data point is independent, whereas time series analysis uses the correlation between data points to make predictions. Linear regression and AR models look very similar, but because independence no longer holds, the AR parameters obtained by linear regression are biased. Since this solution is still consistent, linear regression is nonetheless used in practice to approximate the AR model. Overlooking correlation, or wrongly assuming independence, is likely to make a model fail. Forecasting models for financial markets in particular require attention to this point.
What do I need to know about time series analysis? How can it be applied in public safety?
Time series analysis is a quantitative method: it applies mathematical statistics to a series of observations ordered in time in order to predict how things will develop. In public security work, it is applied to prediction and early warning. Crime analysis combines social demographics and spatial factors in the qualitative and quantitative study of crime and law enforcement information, in order to understand offenders, stop crime, reduce disorder in the community, and evaluate organizational processes.