• Births and deaths for the year 1605-1606. Source: Morris et al. 1759.
• Possibly the first ECG of the human heart. Source: Waller 1887, fig. 1.
• Time series analysis of Wolfer's sunspot numbers. Source: Yule 1927, fig. 8.
• Combined forecast improves on individual forecasts. Source: Bates and Granger 1969, table 1.
• Time series analysis for anomaly detection. Source: Krishnan 2019.
• Components of time series data. Source: Zhao 2011.
• Stationary vs non-stationary series. Source: Mitrani 2020.
• Time series plot of a sine wave and its correlogram. Source: Holmes et al. 2020, fig. 4.13.
• Three-stage Box-Jenkins methodology. Source: San-Juan et al. 2012, fig. 4.

# Time Series Analysis

Authors: nandangk, arvindpdmn, hemanthSK
Created by nandangk on 2018-07-27 19:27:11
Last updated by arvindpdmn on 2020-08-20 11:14:03

## Summary

Time series data is an ordered sequence of observations of well-defined data items at regular time intervals. Examples include daily exchange rates, bank interest rates, monthly sales, heights of ocean tides, or humidity. Time Series Analysis (TSA) finds hidden patterns and obtains useful insights from time series data. TSA is useful in predicting future values or detecting anomalies across a variety of application areas.

Historically, TSA was divided into time domain and frequency domain approaches. The time domain approach used the autocorrelation function whereas the frequency domain approach used the Fourier transform of the autocorrelation function. Likewise, there are Bayesian and non-Bayesian approaches. Today these differences are of less importance. Analysts use whatever suits the problem.

While most methods of TSA are from classical statistics, since the 1990s artificial neural networks have been used. However, these can excel only when sufficient data is available.

## Milestones

1662

John Graunt publishes a book titled Natural and Political Observations … Made upon the Bills of Mortality. The book contains the number of births and deaths recorded weekly for many years starting from the early 17th century. It also includes the probability that a person dies by a certain age. Such tables of life expectancy later become known as actuarial tables. This is one of the earliest examples of a time series style of thinking applied to medicine.

1861

Robert FitzRoy coins the term "weather forecast". Such forecasts start appearing in The Times from August 1861. Atmospheric data collected from many parts of England are relayed by telegraph to London, where FitzRoy analyzes the data (along with past data) to make forecasts. His forecasts forewarn sailors of impending storms and directly contribute to reducing shipwrecks.

1887

Augustus D. Waller, a doctor by profession, records what is possibly the first electrocardiogram (ECG). As practical ECG machines arrive in the early 20th century, TSA is applied to estimate the risk of cardiac arrests. In the 1920s, electroencephalogram (EEG) is introduced to measure brain activity. This gives doctors more opportunities to apply TSA.

1927

Yule applies harmonic analysis and regression to determine the periodicity of sunspots. He separates periodicity from superposed fluctuations and disturbances. Yule's work starts the use of statistics in TSA. In general, application of autoregressive models is due to Yule and Walker in the 1920s and 1930s.

1960

Muth establishes a statistical foundation for Simple Exponential Smoothing (SES) by showing that it's optimal for a random walk plus noise. Further advances to exponential smoothing happen in 1985: Gardner gives a comprehensive review of the topic; Snyder links SES to innovation state space model, where innovation refers to the forecast error.

1969

Bates and Granger show that by combining forecasts from two independent models, we can achieve a lower mean squared error. They also propose how to derive the weights in which the two original forecasts are to be combined. The same year, David Reid publishes his PhD thesis that's probably the first non-trivial study of time series forecast accuracy.

1970

Box and Jenkins publish a book titled Time Series Analysis: Forecasting and Control. This work popularizes the ARIMA model with an iterative modelling procedure. Once a suitable model is built, forecasts are conditional expectations of the model under the mean squared error (MSE) criterion. In time, this model is called the Box-Jenkins Model.

1978

Through the 1970s, many statisticians continue to believe that there's a single model waiting to be discovered that can best fit any given time series data. However, empirical evidence shows that an ensemble of models gives better results. These debates cause George Box to famously remark, "All models are wrong but some are useful."

1979

Makridakis and Hibon use 111 time series to compare the performance of many forecasting methods. Their results suggest that a combination of simpler methods can outperform a sophisticated method. This causes a stir within the research community. To prove the point, Makridakis and Hibon organize a competition, called the M-Competition, starting from 1982: 1001 series (1982), 29 series (1993), 3003 series (2000), 100,000 series (2018), and 42,840 series (2020).

1980

Although Kalman filtering was invented in 1960, it's only in the 1980s that statisticians use state-space parameterization and Kalman filtering for TSA. The recursive form of the filter enables efficient forecasting. An ARIMA model can be put into a state-space model. Similarly, a state-space model suggests an ARIMA model.

1982

Robert Engle develops the Autoregressive Conditional Heteroskedasticity (ARCH) model to account for time-varying volatility observed in economic time series data. In 1986, his student Tim Bollerslev develops the Generalized ARCH (GARCH) model. In general, the variance of the error term depends on past error terms and their variance. ARCH and GARCH are non-linear generalizations of the Box-Jenkins model.

1987

Engle and Granger propose cointegration as a technique for multivariate TSA. Cointegration is a linear combination of marginally unit-root nonstationary series that yields a stationary series. This becomes a popular method in econometrics because it captures long-term relationships between variables. An earlier method of multivariate TSA is the Vector Autoregressive (VAR) model.

1998

Zhang et al. publish a survey of neural networks applied to forecasting. They note an early work by Lapedes and Farber (1987) who proposed multi-layer feedforward networks. However, the use of ANNs for forecasting happens mostly in the 1990s. In general, feedforward or recurrent networks are preferred. At most two hidden layers are used. The number of input nodes corresponds to the number of lagged observations needed to discover patterns in data. The number of output nodes corresponds to the forecasting horizon.

2019

Sánchez-Sánchez et al. highlight many issues in using neural networks for TSA. There's no clarity on how to select the number of input or hidden neurons. There's no guidance on how best to partition the data into training and validation sets. It's not clear if data needs to be preprocessed or if seasonal/trend components have to be removed before data goes into the model. In 2018, Hyndman commented that neural networks perform poorly due to insufficient data. This is likely to change as data becomes more easily available.

## Discussion

• What are the main objectives of time series analysis?

TSA has the following objectives:

• Describe: Describe the important features of the time series data. The first step is to plot the data to look for the possible presence of trends, seasonal variations, outliers and turning points.
• Model: Investigate and find out the generating process of the time series.
• Predict: Forecast future values of an observed time series. Applications are in predicting stock prices or product sales.
• What are some applications of time series analysis?

TSA is used in numerous practical fields such as business, economics, finance, science, and engineering. Some typical use cases are Economic Forecasting, Sales Forecasting, Budgetary Analysis, Stock Market Analysis, Yield Projections, Process and Quality Control, Inventory Studies, Workload Projections, Utility Studies, and Census Analysis.

In TSA, we collect and study past observations of a time series. We then develop an appropriate model that describes the inherent structure of the series. This model is then used to generate future values for the series, that is, to make forecasts. Time series analysis can be described as the act of predicting the future by understanding the past.

Forecasting is a common need in business and economics. Besides forecasting, TSA is also useful to see how a single event affects the time series. TSA can also help towards quality control by pointing out data points that are deviating too much from the norm. Control and monitoring applications of TSA are more common in science and industry.

• What are the main components of time series data?

Many factors result in variations in time series data. The effects of these factors are studied through the following four major components:

• Trends: A trend exists when there is a long-term increase or decrease in the data. It doesn't have to be linear. Sometimes we will refer to a trend as "changing direction" when it goes from an increasing trend to a decreasing trend.
• Seasonal: A seasonal pattern exists when a series is influenced by seasonal factors (quarterly, monthly, half-yearly). Seasonality is always of a fixed and known period.
• Cyclic Variation: A cyclic pattern exists when data exhibits rises and falls that are not of fixed period. The duration of these cycles is usually more than a year. For example, stock prices cycle between periods of high and low values but there's no set amount of time between those fluctuations.
• Irregular: The variation of observations in a time series which is unusual or unexpected. It's also termed as a Random Variation and is usually unpredictable. Floods, fires, revolutions, epidemics, and strikes are some examples.
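
The separation of components can be illustrated with a classical additive decomposition. The following is a minimal sketch in plain Python, assuming a synthetic series made of a linear trend plus a seasonal pattern of period 4 and no noise; the function name `decompose_additive` and the data are hypothetical. The cyclic component is not separated here: in this simple scheme it stays lumped with the trend.

```python
def decompose_additive(y, period):
    """Classical additive decomposition for an even period (sketch)."""
    half = period // 2
    trend = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        # 2 x period centred moving average: the two end points get half weight.
        total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    # Seasonal effect: average detrended value at each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t, tr in enumerate(trend):
        if tr is not None:
            buckets[t % period].append(y[t] - tr)
    seasonal = [sum(b) / len(b) for b in buckets]
    # Irregular part: whatever is left after removing trend and seasonality.
    irregular = [None if tr is None else y[t] - tr - seasonal[t % period]
                 for t, tr in enumerate(trend)]
    return trend, seasonal, irregular

pattern = [2.0, -1.0, -3.0, 2.0]                     # seasonal effects, sum to zero
y = [0.5 * t + pattern[t % 4] for t in range(24)]    # linear trend + seasonality
trend, seasonal, irregular = decompose_additive(y, period=4)
print(seasonal)   # [2.0, -1.0, -3.0, 2.0]
```

Because the synthetic data has no noise, the irregular component comes out as zero wherever the trend is defined. With real data it would hold the unexplained variation.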
• What is a stationary series and how important is it?

A series is stationary if its statistical properties do not change with time: its mean and variance remain constant, and the covariance between two observations depends only on the lag between them. If these vary with time, we call it a non-stationary series.

Most prices (such as stock prices or the price of Bitcoin) are not stationary. They drift either upward or downward. Non-stationary data are hard to model or forecast directly, because the mean, variance or correlation structure estimated from the past need not hold in the future. Results obtained from non-stationary time series may also be spurious in that they may indicate a relationship between two variables where none exists. To obtain consistent, reliable results, non-stationary data needs to be transformed into stationary data.
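
A rough way to see the difference is to split a series into two halves and compare their summary statistics. The sketch below uses only the Python standard library and assumes simulated data: Gaussian white noise (stationary) and a random walk built by accumulating that noise (non-stationary). The helper name `half_stats` is hypothetical, and this informal check is not a substitute for a formal test.

```python
import random
import statistics

def half_stats(series):
    """Return (mean, variance) for the first and second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return ((statistics.fmean(first), statistics.pvariance(first)),
            (statistics.fmean(second), statistics.pvariance(second)))

random.seed(42)  # fixed seed so that the run is repeatable
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]

# A random walk is the running sum of the noise terms.
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

print("white noise:", half_stats(noise))
print("random walk:", half_stats(walk))
```

For the white noise, both halves should show a mean near 0 and a variance near 1. For the random walk, the two halves typically disagree, which is the signature of non-stationarity.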

• Given a non-stationary series, how can I make it stationary?

The two most common ways to make a non-stationary time series stationary are:

• Differencing: To make a series stationary, we take the difference between consecutive data points. Suppose the original time series is $$X_1, X_2, X_3, \ldots, X_n$$. The series with difference of degree 1 becomes $$X_2-X_1, X_3-X_2, X_4-X_3, \ldots, X_n-X_{n-1}$$. If this transformation is done only once to a series, we say that the data has been first differenced. This process essentially eliminates the trend if the series is growing at a fairly constant rate. If it's growing at an increasing/decreasing rate, we can apply the same procedure and difference the data again. The data would then be second differenced.
• Transformation: If the series can't be made stationary, we can try transforming the variables. Log transform is probably the most commonly used transformation for a diverging time series. However, it's normally suggested to use transformation only when differencing is not working.
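
Both techniques can be tried on small synthetic series. This is a minimal sketch in plain Python; the function name `difference` and the three example series are assumptions made for illustration.

```python
import math

def difference(series, order=1):
    """Apply first differencing `order` times: X_t - X_{t-1}."""
    result = list(series)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result

# Linear trend: one round of differencing leaves a constant.
linear = [3 + 2 * t for t in range(6)]           # 3, 5, 7, 9, 11, 13
print(difference(linear))                         # [2, 2, 2, 2, 2]

# Quadratic trend: the growth rate itself grows, so difference twice.
quadratic = [t * t for t in range(6)]             # 0, 1, 4, 9, 16, 25
print(difference(quadratic, order=2))             # [2, 2, 2, 2]

# Exponential growth: take logs first, then difference once.
exponential = [100 * 1.05 ** t for t in range(6)]
log_diff = difference([math.log(x) for x in exponential])
print([round(d, 4) for d in log_diff])
```

Each round of differencing shortens the series by one observation. For the exponential series, the log-differenced values are all approximately log(1.05), the constant growth rate.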
• What are the different models used in Time Series Analysis?

Some commonly used models for TSA are:

• Auto-Regressive (AR): A regression model, such as linear regression, models an output value based on a linear combination of input values: $$y = \beta_0 + \beta_1x + \epsilon$$. In TSA, input variables are observations from previous time steps, called lag variables. For p=2, where p is the order of the AR model, AR(p) is $$x_t = \beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \epsilon_t$$ where $$\epsilon_t$$ is the error term at time t.
• Moving Average (MA): This uses past forecast errors in a regression-like model. For q=2, MA(q) is $$x_t = \theta_0 + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2}$$
• Auto-Regressive Moving Average (ARMA): This combines both AR and MA models, with a single constant $$\beta_0$$. ARMA(p,q) is \begin{align}x_t = &\beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + \beta_p x_{t-p} + \\ &\epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \ldots + \theta_q \epsilon_{t-q} \end{align}
• Auto-Regressive Integrated Moving Average (ARIMA): The above models can't handle non-stationary data. ARIMA(p,d,q) handles the conversion of non-stationary data to stationary: I refers to the use of differencing, p is the lag order of the AR part, d is the degree of differencing, and q is the order of the MA part (the number of lagged forecast errors).
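
To make the AR model concrete, the sketch below simulates an AR(2) series and recovers its coefficients by ordinary least squares. It is an illustration under stated assumptions, not a production estimator: the constant term is taken as zero, the noise is Gaussian, and the function names `simulate_ar2` and `fit_ar2` as well as the coefficient values 0.6 and -0.3 are hypothetical choices.

```python
import random

def simulate_ar2(b1, b2, n, seed=0):
    """Simulate x_t = b1*x_{t-1} + b2*x_{t-2} + e_t with Gaussian noise e_t."""
    rng = random.Random(seed)
    x = [0.0, 0.0]                       # two start-up values
    for _ in range(n):
        x.append(b1 * x[-1] + b2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]

def fit_ar2(x):
    """Least-squares estimate of (b1, b2), assuming a zero constant term."""
    idx = range(2, len(x))
    # Sums that make up the 2x2 normal equations.
    s11 = sum(x[t - 1] * x[t - 1] for t in idx)
    s22 = sum(x[t - 2] * x[t - 2] for t in idx)
    s12 = sum(x[t - 1] * x[t - 2] for t in idx)
    r1 = sum(x[t] * x[t - 1] for t in idx)
    r2 = sum(x[t] * x[t - 2] for t in idx)
    det = s11 * s22 - s12 * s12
    # Cramer's rule for the two unknown coefficients.
    return ((r1 * s22 - r2 * s12) / det, (r2 * s11 - r1 * s12) / det)

series = simulate_ar2(0.6, -0.3, n=5000)
b1_hat, b2_hat = fit_ar2(series)
print(round(b1_hat, 2), round(b2_hat, 2))   # should be close to 0.6 and -0.3
```

The estimates only approximate the true coefficients, and the approximation improves as the series gets longer. Fitting MA or ARMA terms is harder because the errors are not observed directly, which is why dedicated statistical libraries are normally used in practice.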
• What are autocorrelations in the context of time series analysis?

Autocorrelations are numerical values that indicate how a data series is related to itself over time. More precisely, an autocorrelation measures how strongly data values separated by a specified number of periods (called the lag) are correlated with each other. The Auto-Correlation Function (ACF) gives the autocorrelation as a function of the lag.

Autocorrelations range from -1 to +1. A value close to +1 indicates a high positive correlation while a value close to -1 implies a high negative correlation. These measures are most often evaluated through graphical plots called correlograms. A correlogram plots the autocorrelation values against lag. Such a plot helps us choose the order parameters for an ARIMA model.

In addition to suggesting the order of differencing, ACF plots can help in determining the order of MA(q) models. Partial Auto-Correlation Function (PACF) correlates a variable with its lags, conditioned on the values in between. PACF plots are useful when determining the order of AR(p) models.
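
The sample ACF is simple enough to compute by hand. The following sketch, in plain Python, assumes a sine wave with a period of 12 samples as input and prints a crude text correlogram; the function name `acf` is hypothetical and PACF is not covered.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)         # lag-0 sum of squares
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

# Sine wave with a period of 12 samples, 10 full cycles.
wave = [math.sin(2 * math.pi * t / 12) for t in range(120)]
r = acf(wave, max_lag=12)

# Crude text correlogram: one row per lag.
for k, rk in enumerate(r):
    print(f"lag {k:2d} {rk:+.2f} " + "#" * int(abs(rk) * 20))
```

The autocorrelation is 1 at lag 0, strongly negative at lag 6 (half a period) and strongly positive again at lag 12 (a full period). A correlogram therefore reveals the periodicity of the series.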

• How do I build a time series model?

ARMA and ARIMA are standard statistical models for time series forecasting and analysis. Along with popularizing these models, Box and Jenkins also suggested a process for identifying, estimating, and checking models. This process is now referred to as the Box-Jenkins (BJ) Method. It's an iterative approach that consists of the following three steps:

• Identification: Involves determining the order (p, d, q) of the model in order to capture the salient dynamic features of the data. This mainly relies on graphical tools such as the time series plot, ACF and PACF.
• Estimation: The estimation procedure involves using the model with p, d and q orders to fit the actual time series and minimize the loss or error term.
• Diagnostic checking: Evaluate the fitted model in the context of the available data and check for areas where the model may be improved.
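
The three steps can be walked through on simulated data. This is a simplified sketch in plain Python, not the full Box-Jenkins procedure: it assumes the data comes from an ARIMA(1,1,0) process with AR coefficient 0.5, fits only an AR(1) term by least squares, and checks residual autocorrelations against an approximate 95% band. The helper name `acf_at` is hypothetical.

```python
import math
import random

def acf_at(series, lag):
    """Sample autocorrelation of a series at one lag."""
    mean = sum(series) / len(series)
    dev = [x - mean for x in series]
    num = sum(dev[t] * dev[t + lag] for t in range(len(dev) - lag))
    return num / sum(d * d for d in dev)

# Simulated data (an assumption of this sketch): ARIMA(1,1,0) with phi = 0.5.
rng = random.Random(1)
w, y, level = 0.0, [], 0.0
for _ in range(3000):
    w = 0.5 * w + rng.gauss(0.0, 1.0)    # stationary AR(1) increments
    level += w                           # integrate once, so d = 1
    y.append(level)

# 1. Identification: the raw ACF decays very slowly, so difference once.
dy = [b - a for a, b in zip(y, y[1:])]
print("lag-1 ACF raw:", round(acf_at(y, 1), 3))
print("lag-1 ACF differenced:", round(acf_at(dy, 1), 3))

# 2. Estimation: least-squares fit of AR(1) on the differenced series.
phi_hat = sum(a * b for a, b in zip(dy, dy[1:])) / sum(a * a for a in dy[:-1])

# 3. Diagnostic checking: residuals should resemble white noise.
resid = [b - phi_hat * a for a, b in zip(dy, dy[1:])]
bound = 1.96 / math.sqrt(len(resid))     # approximate 95% band for white noise
flagged = [k for k in range(1, 11) if abs(acf_at(resid, k)) > bound]
print("phi_hat:", round(phi_hat, 3), "lags outside band:", flagged)
```

If many residual lags fall outside the band, the model is inadequate and the procedure returns to the identification step with a different order. An occasional flagged lag is expected by chance alone.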
• How do we handle random variations in data?

Whenever we collect data over some period of time, there's some form of random variation. Smoothing is a technique to reduce the effect of such variations and thereby bring out trends and cyclic components. There are two distinct groups of smoothing methods:

• Averaging Methods:
(a) Moving Average: we forecast the next value by averaging 'p' previous values.
(b) Weighted Average: we assign weights to each of the previous observations and then take the average. The sum of all the weights should be equal to 1.
• Exponential Smoothing Methods: These assign exponentially decreasing weights as the observations get older. In other words, recent observations are given relatively more weight in forecasting than older observations. There are several varieties of this method:
(a) Simple exponential smoothing for series with no trend and seasonality: the basic formula for simple exponential smoothing is $$S_{t+1} = \alpha y_t + (1-\alpha)S_t, \qquad 0 < \alpha \le 1, \; t > 0$$
(b) Double exponential smoothing for series with a trend and no seasonality.
(c) Triple exponential smoothing for series with both trend and seasonality.
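
The averaging methods and simple exponential smoothing can be sketched in a few lines of plain Python. The function names, the sample `sales` data and the initialisation of the smoothed value to the first observation are assumptions made for this illustration. Double and triple exponential smoothing are not shown.

```python
def moving_average_forecast(y, p):
    """Forecast the next value as the mean of the last p observations."""
    return sum(y[-p:]) / p

def weighted_average_forecast(y, weights):
    """Weights apply to the most recent len(weights) values, oldest first."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    recent = y[-len(weights):]
    return sum(w * v for w, v in zip(weights, recent))

def simple_exp_smoothing(y, alpha):
    """Apply S_{t+1} = alpha*y_t + (1 - alpha)*S_t and return each S_{t+1}."""
    s = y[0]             # starting value S_1 = y_1 is an assumption of this sketch
    smoothed = []
    for obs in y:
        s = alpha * obs + (1 - alpha) * s
        smoothed.append(s)
    return smoothed

sales = [10.0, 12.0, 11.0, 13.0, 12.0]
print(moving_average_forecast(sales, p=3))                 # 12.0
print(weighted_average_forecast(sales, [0.2, 0.3, 0.5]))
print(simple_exp_smoothing(sales, alpha=0.5)[-1])          # 12.0
```

The last smoothed value serves as the forecast for the next period. A larger alpha makes the forecast react faster to recent observations, while a smaller alpha smooths more heavily.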



## Cite As

Devopedia. 2020. "Time Series Analysis." Version 34, August 20. Accessed 2020-09-22. https://devopedia.org/time-series-analysis