# Time Series Analysis

## Summary

Time series data is an ordered sequence of observations of well-defined data items at regular time intervals. Examples include daily exchange rates, bank interest rates, monthly sales, heights of ocean tides, and humidity readings. Time Series Analysis (TSA) uncovers hidden patterns and extracts useful insights from time series data. TSA is useful for predicting future values or detecting anomalies across a variety of application areas.

Historically, TSA was divided into time domain and frequency domain approaches. The time domain approach uses the autocorrelation function, whereas the frequency domain approach uses the Fourier transform of the autocorrelation function. Likewise, there are Bayesian and non-Bayesian approaches. Today these distinctions are less important: analysts use whatever suits the problem.

While most TSA methods come from classical statistics, artificial neural networks have also been used since the 1990s. However, these excel only when sufficient data is available.

## Milestones

## Discussion

What are the main objectives of time series analysis? TSA has the following objectives:

- **Describe**: Describe the important features of the time series data. The first step is to plot the data to look for the possible presence of trends, seasonal variations, outliers and turning points.
- **Model**: Investigate and identify the process that generates the time series.
- **Predict**: Forecast future values of an observed time series. Applications include predicting stock prices or product sales.

What are some applications of time series analysis? TSA is used in numerous practical fields such as business, economics, finance, science, and engineering. Some typical use cases are economic forecasting, sales forecasting, budgetary analysis, stock market analysis, yield projections, process and quality control, inventory studies, workload projections, utility studies, and census analysis.

In TSA, we collect and study past observations of a time series. We then develop an appropriate model that describes the inherent structure of the series. This model is then used to generate future values for the series, that is, to make forecasts. Time series analysis can thus be seen as the act of predicting the future by understanding the past.

Forecasting is a common need in business and economics. Besides forecasting, TSA is also useful for seeing how a single event affects a time series. TSA also helps with quality control by pointing out data points that deviate too much from the norm. Control and monitoring applications of TSA are more common in science and industry.

What are the main components of time series data? There are many factors that cause variations in time series data. The effects of these factors are studied through the following four major components:

- **Trend**: A trend exists when there is a long-term increase or decrease in the data. It doesn't have to be linear. Sometimes we refer to a trend as "changing direction" when it goes from an increasing trend to a decreasing trend.
- **Seasonal**: A seasonal pattern exists when a series is influenced by seasonal factors (quarterly, monthly, half-yearly). Seasonality is always of a fixed and known period.
- **Cyclic Variation**: A cyclic pattern exists when data exhibits rises and falls that are not of fixed period. The duration of these cycles is more than a year. For example, stock prices cycle between periods of high and low values but there's no set amount of time between those fluctuations.
- **Irregular**: The variation in a time series that is unusual or unexpected. It's also termed *Random Variation* and is usually unpredictable. Floods, fires, revolutions, epidemics, and strikes are some examples.

What is a stationary series and how important is it? Given a series of data points, if the mean and variance of all the data points remain constant with time, then we call it a stationary series. If these vary with time, we call it a non-stationary series.

Most prices (such as stock prices or the price of Bitcoin) are not stationary. They drift upward or downward. Non-stationary data are hard to model or forecast directly. The results obtained by using non-stationary time series may be spurious in that they may indicate a relationship between two variables where none exists. In order to obtain consistent, reliable results, non-stationary data needs to be transformed into stationary data.

Given a non-stationary series, how can I make it stationary? The two most common ways to make a non-stationary time series curve stationary are:

- **Differencing**: In order to make a series stationary, we take differences between successive data points. Suppose the original time series is \(X_1, X_2, X_3, \ldots, X_n\). The series with a difference of degree 1 becomes \(X_2-X_1, X_3-X_2, X_4-X_3, \ldots, X_n-X_{n-1}\). If this transformation is done only once to a series, we say that the data has been **first differenced**. This process essentially eliminates the trend if the series is growing at a fairly constant rate. If it's growing at an increasing/decreasing rate, we can apply the same procedure and difference the data again. The data would then be **second differenced**.
- **Transformation**: If the series can't be made stationary by differencing, we can try transforming the variables. The log transform is probably the most commonly used transformation for a diverging time series. However, it's normally suggested to use transformation only when differencing is not working.
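The differencing step above can be sketched in a few lines of Python (a minimal illustration; in practice one would typically use `pandas.Series.diff` or `numpy.diff`):

```python
def difference(series, order=1):
    """Difference a series 'order' times to remove trend."""
    for _ in range(order):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series

# A series growing at a constant rate: one difference removes the trend.
print(difference([3, 5, 7, 9, 11]))            # [2, 2, 2, 2]

# A series growing at an increasing rate: difference twice.
print(difference([1, 4, 9, 16, 25], order=2))  # [2, 2, 2]
```

Note that each pass of differencing shortens the series by one observation.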

What are the different models used in Time Series Analysis? Some commonly used models for TSA are:

- **Auto-Regressive (AR)**: A regression model, such as linear regression, models an output value as a linear combination of input values: \(y = \beta_0 + \beta_1 x + \epsilon\). In TSA, the input variables are observations from previous time steps, called *lag variables*. For p=2, where p is the order of the AR model, *AR(p)* is \(x_t = \beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2}\).
- **Moving Average (MA)**: Uses past forecast errors in a regression-like model. For q=2, *MA(q)* is \(x_t = \theta_0 + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}\).
- **Auto-Regressive Moving Average (ARMA)**: Combines both AR and MA models. *ARMA(p,q)* is \(\begin{align}x_t = &\beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + \beta_p x_{t-p} + \\ &\theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \ldots + \theta_q \epsilon_{t-q} \end{align}\)
- **Auto-Regressive Integrated Moving Average (ARIMA)**: The above models can't handle non-stationary data. *ARIMA(p,d,q)* handles the conversion of non-stationary data to stationary: *I* refers to the use of differencing, *p* is the lag order, *d* is the degree of differencing, and *q* is the order of the moving average.
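To make the AR(p) formula concrete, here is a toy one-step-ahead forecast from an AR(2) model in Python. The coefficient values are made up for illustration; in practice they would be estimated from the data (e.g. by least squares or a library such as `statsmodels`):

```python
def ar2_forecast(series, beta0, beta1, beta2):
    """One-step-ahead AR(2) forecast:
    x_t = beta0 + beta1 * x_{t-1} + beta2 * x_{t-2}"""
    return beta0 + beta1 * series[-1] + beta2 * series[-2]

history = [10.0, 10.5, 10.2, 10.8]  # toy stationary-looking series
# Hypothetical coefficients, for illustration only.
print(ar2_forecast(history, beta0=0.5, beta1=0.6, beta2=0.3))  # about 10.04
```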

What are autocorrelations in the context of time series analysis? Autocorrelations are numerical values that indicate how a data series is related to itself over time. Autocorrelation measures how strongly data values separated by a specified number of periods (called the **lag**) are correlated with each other. The **Auto-Correlation Function (ACF)** defines the autocorrelation for a specific lag. Autocorrelations range from +1 to -1. A value close to +1 indicates a high positive correlation while a value close to -1 implies a high negative correlation. These measures are most often evaluated through a graphical plot called a **correlogram**, which plots the autocorrelation values against the lag. Such a plot helps us choose the order parameters for an ARIMA model. In addition to suggesting the order of differencing, ACF plots can help in determining the order of MA(q) models.
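The values behind a correlogram can be computed directly. The sketch below uses the standard sample-autocorrelation formula (libraries such as `statsmodels` provide this as `acf`):

```python
def acf(series, max_lag):
    """Sample autocorrelation r_k for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]

# A series repeating with period 6: lag 0 is always +1, lag 3 (half the
# period) is strongly negative, and lag 6 (the full period) is positive.
data = [1, 2, 3, 4, 3, 2, 1, 2, 3, 4, 3, 2]
for k, r in enumerate(acf(data, 6)):
    print(f"lag {k}: {r:+.2f}")
```

Plotting these values against the lag, with significance bounds, gives the correlogram.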

**Partial Auto-Correlation Function (PACF)** correlates a variable with its lags, conditioned on the values in between. PACF plots are useful when determining the order of AR(p) models.

How do I build a time series model? ARMA and ARIMA are standard statistical models for time series forecasting and analysis. Along with their development, the authors Box and Jenkins also suggested a process for identifying, estimating, and checking models. This process is now referred to as the **Box-Jenkins (BJ) Method**. It's an iterative approach consisting of three steps:

- **Identification**: Determine the order (p, d, q) of the model so as to capture the salient dynamic features of the data. This mainly relies on graphical procedures such as the time series plot, ACF and PACF.
- **Estimation**: Fit the model with the chosen p, d and q orders to the actual time series, minimizing the loss or error term.
- **Diagnostic checking**: Evaluate the fitted model in the context of the available data and check for areas where the model may be improved.

How do we handle random variations in data? Whenever we collect data over some period of time there's some form of random variation. Smoothing is a technique to reduce the effect of such variations and thereby bring out trends and cyclic components.

There are two distinct groups of smoothing methods:

- **Averaging Methods**: (a) *Moving average*: forecast the next value by averaging the 'p' previous values. (b) *Weighted average*: assign weights to each of the previous observations and then take the average; the weights should sum to 1.
- **Exponential Smoothing Methods**: Assign exponentially decreasing weights as the observations get older. In other words, recent observations are given relatively more weight in forecasting than older observations. There are several varieties of this method: (a) *Simple exponential smoothing* for series with no trend and no seasonality; the basic formula is \(S_{t+1} = \alpha y_t + (1-\alpha)S_t, \quad 0 < \alpha \le 1, \; t > 0\). (b) *Double exponential smoothing* for series with a trend but no seasonality. (c) *Triple exponential smoothing* for series with both trend and seasonality.
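The simple exponential smoothing recursion can be implemented directly. A minimal sketch, initializing the smoothed series with the first observation (a common convention; other initializations exist):

```python
def simple_exp_smoothing(y, alpha):
    """Apply S_{t+1} = alpha * y_t + (1 - alpha) * S_t, with S_1 = y_0."""
    s = y[0]
    smoothed = [s]
    for obs in y[1:]:
        s = alpha * obs + (1 - alpha) * s
        smoothed.append(s)
    return smoothed

# alpha close to 1 tracks the data closely; small alpha smooths heavily.
print(simple_exp_smoothing([10, 20, 30, 25], alpha=0.5))
# [10, 15.0, 22.5, 23.75]
```

Each smoothed value is a weighted blend of the newest observation and all earlier ones, with weights decaying geometrically by a factor of \(1-\alpha\).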

## References

- Bates, J. M. and C. W. J. Granger. 1969. "The Combination of Forecasts." Operational Research Quarterly, Operational Research Society, vol. 20, no. 4, pp. 451-468, December. Accessed 2020-08-19.
- Brownlee, Jason. 2017. "A Gentle Introduction to the Box-Jenkins Method for Time Series Forecasting." Machine Learning Mastery, January 13. Accessed 2018-07-28.
- Dahodwala, Murtuza. 2018. "Beginners Guide To Time Series Analysis with Implementation in R." Blog, Digital Vidya, February 20. Accessed 2018-07-28.
- Emmanuel, Joshua. 2015. "Forecasting: Exponential Smoothing, MSE." YouTube, July 9. Accessed 2020-08-19.
- Gavrilov, Viktor. 2015. "Time series analysis: smoothing." Accessed 2018-07-28.
- Gooijer, Jan G. De and Rob J. Hyndman. 2006. "25 Years of Time Series Forecasting." International Journal of Forecasting, vol. 22, no. 3, pp. 443-473. doi: 10.1016/j.ijforecast.2006.01.001. Accessed 2020-08-19.
- Holmes, E. E., M. D. Scheuerell, and E. J. Ward. 2020. "Correlation within and among time series." Section 4.4 in: Applied time series analysis for fisheries and environmental data, NOAA Fisheries, Northwest Fisheries Science Center, February 3. Accessed 2020-08-19.
- Hyndman, Rob J. 2018. "A brief history of time series forecasting competitions." Blog, Hyndsight, April 11. Accessed 2020-08-19.
- Hyndman, Rob J. and George Athanasopoulos. 2018. "Stationarity and differencing." Section 8.1 in: Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. Accessed 2020-08-19.
- Krishnan, Adithya. 2019. "Anomaly Detection with Time Series Forecasting." Towards Data Science, on Medium, March 3. Accessed 2020-08-19.
- MOFC. 2020. "The M5 Competition." MOFC. Accessed 2020-08-19.
- Mitrani, Alex. 2020. "Achieving Stationarity With Time Series Data." Towards Data Science, on Medium, January 10. Accessed 2020-08-19.
- Moore, Peter. 2015. "The science of weather forecasting: The pioneer who founded the Met Office." Independent, April 27. Accessed 2020-08-19.
- Morris, C., W. Petty, J. Graunt, and T. Birch. 1759. "A Collection of the yearly bills of mortality, from 1657 to 1758 inclusive: Together with several other bills of an earlier date." London: A. Millar. Accessed 2020-08-19.
- Morrison, Jeff. 2018. "Autoregressive Integrated Moving Average Models (ARIMA)." Accessed 2018-07-28.
- NIST. 2003a. "Definitions, Applications and Techniques." Section 6.4.1 in: Engineering Statistics Handbook, NIST/SEMATECH, June 1. Accessed 2018-07-28.
- NIST. 2003b. "What are Moving Average or Smoothing Techniques?" Section 6.4.2 in: Engineering Statistics Handbook, NIST/SEMATECH, June 1. Accessed 2018-07-28.
- NIST. 2003c. "What is Exponential Smoothing?" Section 6.4.3 in: Engineering Statistics Handbook, NIST/SEMATECH, June 1. Accessed 2018-07-28.
- Nielsen, Aileen. 2019. "Time Series: An Overview and a Quick History." Chapter 1 in: Practical Time Series Analysis, O'Reilly Media, Inc. Accessed 2020-08-19.
- PennState. 2020. "Partial Autocorrelation Function (PACF)." Section 2.2 in: Applied Time Series Analysis, STAT 510, Penn State Univ. Accessed 2020-08-19.
- San-Juan, Juan Félix, Montserrat San-Martín, and Ivan Perez. 2012. "An Economic Hybrid J2 Analytical Orbit Propagator Program Based on SARIMA Models." Mathematical Problems in Engineering, Hindawi Publishing Corporation, vol. 2012, article ID 207381, August. doi: 10.1155/2012/207381. Accessed 2020-08-19.
- Senter, Anne. 2008. "Time Series Analysis." BIOL 710, San Francisco State University, June 3. Accessed 2020-08-19.
- Shmueli, Galit. 2016. "Smoothing 3: Differencing." National Tsing Hua Univ, on YouTube, November 30. Accessed 2020-08-19.
- Sánchez-Sánchez, Paola Andrea, José Rafael García-González, and Leidy Haidy Perez Coronell. 2019. "Encountered Problems of Time Series with Neural Networks: Models and Architectures." IntechOpen Limited, November 27. Accessed 2020-08-19.
- Toloi, Clélia M.C., and Sergio R. Martins. 2006. "How to teach some basic concepts in time series analysis." Accessed 2018-07-28.
- Tsay, Ruey S. 2000. "Time Series and Forecasting: Brief History and Future Research." Journal of the American Statistical Association, vol. 95, no. 450, pp. 638-643, June. Accessed 2020-08-19.
- Waller, Augustus D. 1887. "A Demonstration on Man of Electromotive Changes accompanying the Heart's Beat." J Physiol., vol. 8, no. 5, pp. 229–234, October. Accessed 2020-08-19.
- Wikipedia. 2020. "Time series." Wikipedia, August 16. Accessed 2020-08-19.
- Wikipedia. 2020b. "Makridakis Competitions." Wikipedia, April 14. Accessed 2020-08-19.
- Yule, G. Udny. 1927. "On a Method of Investigating Periodicities in Disturbed Series, with Special Reference to Wolfer's Sunspot Numbers." Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, vol. 226, pp. 267-298. Accessed 2020-08-19.
- Zhang, Guoqiang, B. Eddy Patuwo, and Michael Y. Hu. 1998. "Forecasting with artificial neural networks: The state of the art." International Journal of Forecasting, Elsevier, vol. 14, pp. 35–62. Accessed 2020-08-19.
- Zhao, Yanchang. 2011. "Time Series Analysis and Mining with R." Blog, R Data Mining, August 23. Accessed 2020-08-19.
- Zhu, Wei. 2019. "ARCH/GARCH Models." AMS 586, Time Series Analysis, State University of New York at Stony Brook, November. Accessed 2020-08-19.
- Zoubir, Leila. 2017. "A brief history of time series analysis." Department of Statistics, Stockholms universitet, October 31. Accessed 2020-08-19.


## Tags

## See Also

- Predictive Analytics
- ARIMA Model
- Regression Modelling
- Exploratory Data Analysis
- Time Series Smoothing
- Time Series Database

## Further Reading

- ARIMA model
- Box-Jenkins methodology
- Hyndman, Rob J. and George Athanasopoulos. 2018. "Forecasting: principles and practice." 2nd edition, OTexts: Melbourne, Australia. Accessed 2020-08-19.
- Chatfield, Chris, Anne B. Koehler, J. Keith Ord and Ralph D. Snyder. 2001. "A New Look at Models for Exponential Smoothing." Journal of the Royal Statistical Society. Series D (The Statistician), vol. 50, no. 2, pp. 147-159. Accessed 2020-08-19.
- Time Series Forecast: A basic introduction using Python
- Using python to work with time series data