# Time Series Analysis

## Summary

Time series data is an ordered sequence of observations of well-defined data items at regular time intervals. Examples include daily exchange rates, interest rates recorded by reserve banks, and monthly sales.

Time Series Analysis is the set of techniques used to analyze time series data and extract meaningful information and hidden patterns from it.

The objectives of time series analysis are:

- **Description**: to describe the important features of the time series pattern. The first step is to plot the data to look for the possible presence of trends, seasonal variation, outliers and turning points.
- **Modeling**: to investigate the process that generates the time series. For example, when analyzing a series of monthly automobile sales, we may want to know how these values were generated.
- **Prediction**: to forecast future values of an observed time series, such as predicting stock prices or forecasting sales.

## Milestones

## Discussion

What are some applications of time series analysis? The main aim of time series analysis is to collect and study the past observations of a time series to develop an appropriate model which describes the inherent structure of the series. This model is then used to generate future values for the series, i.e. to make forecasts. Time series analysis can thus be termed the act of predicting the future by understanding the past. It is used in numerous practical fields such as business, economics, finance, science and engineering. Some typical use cases are:

- Economic Forecasting
- Sales Forecasting
- Budgetary Analysis
- Stock Market Analysis
- Yield Projections
- Process and Quality Control
- Inventory Studies
- Workload Projections
- Utility Studies
- Census Analysis

What are the main components of time series data? Since time series data varies with time, there are many factors that cause this variation. The effects of these factors are studied through four major components:

- **Trend**: A trend exists when there is a long-term increase or decrease in the data. It does not have to be linear. Sometimes we refer to a trend as "changing direction" when it goes from an increasing trend to a decreasing one.
- **Seasonal**: A seasonal pattern exists when a series is influenced by seasonal factors (e.g. quarterly, monthly, half-yearly). Seasonality is always of a fixed and known period.
- **Cyclic Variation**: A cyclic pattern exists when data exhibit rises and falls that are not of a fixed period. The duration of these cycles is usually more than a year. For example, the stock market tends to cycle between periods of high and low values, but there is no set amount of time between those fluctuations.
- **Irregular**: Variation in a time series that is unusual or unexpected. It is also termed Random Variation and is usually unpredictable. Examples: floods, fires, earthquakes, revolutions, epidemics, strikes.
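The components above can be illustrated by building a toy additive series, observation = trend + seasonal + irregular. This is only an illustrative sketch; the variable names and parameter values are made up for the example, not taken from any dataset.

```python
import math
import random

random.seed(0)
n, period = 24, 12  # two years of monthly data (illustrative)

trend = [0.5 * t for t in range(n)]                                # long-term rise
seasonal = [2.0 * math.sin(2 * math.pi * t / period) for t in range(n)]  # fixed yearly cycle
irregular = [random.gauss(0, 0.3) for _ in range(n)]               # random variation

# An additive series combines the components point by point.
series = [trend[t] + seasonal[t] + irregular[t] for t in range(n)]
print(series[:3])
```

A classical additive decomposition works in the opposite direction, estimating roughly these pieces back out of an observed series.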

What is a stationary series and how important is it? Given a series of data points, if the mean and variance of the series remain constant over time, we call it a stationary series.

Unfortunately, most price series are not stationary. They drift either upward or downward. Examples: Bitcoin prices, stock prices.

Non-stationary data are unpredictable and cannot be reliably modeled or forecast. The results obtained by using non-stationary time series may be spurious, in that they may indicate a relationship between two variables where none exists. In order to obtain consistent, reliable results, non-stationary data needs to be transformed into stationary data.
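As a minimal sketch of the idea, the check below splits a series in half and compares the mean and variance of the two halves. The function name and tolerance are made up for illustration; in practice a formal test such as the Augmented Dickey-Fuller test is used instead.

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def looks_stationary(series, tolerance=0.5):
    # Rough heuristic: a stationary series should have a similar
    # mean and variance in its first and second half.
    half = len(series) // 2
    first, second = series[:half], series[half:]
    mean_shift = abs(mean(first) - mean(second))
    var_ratio = variance(first) / variance(second)
    return mean_shift < tolerance and 0.5 < var_ratio < 2.0

flat = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 0.9]      # stable mean and variance
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # drifting upward

print(looks_stationary(flat))      # True
print(looks_stationary(trending))  # False
```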

Given a non-stationary series, how can I make it stationary? The two most common ways to make a non-stationary time series stationary are:

- **Differencing**: Take the difference between successive data points. If your original time series is \(X_1, X_2, X_3, \ldots, X_n\), the series of first differences is \(X_2 - X_1, X_3 - X_2, X_4 - X_3, \ldots, X_n - X_{n-1}\). If this transformation is done only once, the data has been **first differenced**. This essentially eliminates the trend if the series is growing at a fairly constant rate. If it is growing at an increasing rate, apply the same procedure and difference the data again; the data is then **second differenced**.
- **Transformation**: If you can't make a time series stationary by differencing, you can try transforming the variables. The log transform is probably the most commonly used transformation when the series is diverging. However, it is normally suggested that you use transformation only when differencing is not working.
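Differencing is a one-liner in plain Python (pandas offers the same via `Series.diff()`). A minimal sketch with illustrative data:

```python
def difference(series):
    # First difference: each new point is X[t] - X[t-1].
    return [b - a for a, b in zip(series, series[1:])]

linear_trend = [3, 5, 7, 9, 11]   # grows by a constant 2 each step
print(difference(linear_trend))   # [2, 2, 2, 2] -- trend eliminated

accelerating = [1, 2, 4, 8, 16]   # grows at an increasing rate
first = difference(accelerating)  # [1, 2, 4, 8] -- still trending
second = difference(first)        # [1, 2, 4] -- second differenced
```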

What are the different models used in Time Series Analysis? Some commonly used models are:

- **Auto-Regressive (AR) model**: A regression model, such as linear regression, models an output value as a linear combination of input values: \( y = \beta_0 + \beta_1x + \epsilon \). This technique can be applied to time series by taking observations at previous time steps, called lag variables, as the inputs. For example, AR(2): \( x_t = \beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \epsilon_t \). In general, p is called the order of the AR model.
- **Moving Average (MA) model**: Rather than using past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model. For example, MA(2): \( x_t = \theta_0 + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} \). In general, q is called the order of the MA model.
- **ARMA model**: ARMA is a combination of both AR and MA. ARMA(p,q): \( x_t = \beta_0 + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \ldots + \beta_p x_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \ldots + \theta_q \epsilon_{t-q} \).
- **ARIMA model**: The above models can't handle non-stationary data. ARIMA handles the conversion of non-stationary data to stationary via differencing, and is generally represented as ARIMA(p,d,q), where d is the degree of differencing.
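To make the AR idea concrete, the sketch below simulates an AR(1) process and recovers its coefficient by ordinary least squares on (lag, value) pairs. The seed and coefficient values are arbitrary choices for the example.

```python
import random

random.seed(42)
beta0, beta1 = 0.5, 0.7  # illustrative true parameters

# Simulate x_t = beta0 + beta1 * x_{t-1} + noise
x = [0.0]
for _ in range(2000):
    x.append(beta0 + beta1 * x[-1] + random.gauss(0, 1))

# Least-squares slope of x_t against its lag-1 value
prev, curr = x[:-1], x[1:]
n = len(prev)
mp, mc = sum(prev) / n, sum(curr) / n
slope = sum((p - mp) * (c - mc) for p, c in zip(prev, curr)) / \
        sum((p - mp) ** 2 for p in prev)
print(round(slope, 2))  # close to the true beta1 = 0.7
```

In practice, libraries such as statsmodels fit AR, MA, ARMA and ARIMA models directly rather than by hand like this.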

What are autocorrelations in the context of time series analysis? Autocorrelations are numerical values that indicate how a data series is related to itself over time. An autocorrelation measures how strongly data values a specified number of periods apart are correlated over time. The number of periods apart is usually called the **lag**. For example, an autocorrelation at lag 1 measures how values one period apart are correlated throughout the series. An autocorrelation at lag 2 measures how data two periods apart are correlated throughout the series.

Autocorrelations may range from +1 to -1. A value close to +1 indicates a high positive correlation while a value close to -1 implies a high negative correlation. These measures are most often evaluated through a graphical plot called a **correlogram**. A correlogram plots the autocorrelation values for a given series at different lags. This is referred to as the **autocorrelation function** and is very important in the ARIMA method.

How to choose model order in time series analysis? Autocorrelation plots (of the ACF, or autocorrelation function) help to choose the order parameters for an ARIMA model.

ACF plots display correlation between a series and its lags. In addition to suggesting the order of differencing, ACF plots can help in determining the order of the MA(q) model. Partial autocorrelation plots (PACF), as the name suggests, display correlation between a variable and its lags that is not explained by previous lags. PACF plots are useful when determining the order of the AR(p) model.
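The quantity behind an ACF plot can be sketched in a few lines: the sample autocorrelation at lag k correlates the series with a copy of itself shifted by k steps. The function name and test series here are illustrative.

```python
def autocorrelation(series, lag):
    # Sample autocorrelation at the given lag.
    n = len(series)
    m = sum(series) / n
    var = sum((x - m) ** 2 for x in series)
    cov = sum((series[t] - m) * (series[t - lag] - m) for t in range(lag, n))
    return cov / var

# An alternating series: adjacent values move in opposite directions.
up_down = [1, -1, 1, -1, 1, -1, 1, -1]
print(autocorrelation(up_down, 1))  # strongly negative at lag 1
print(autocorrelation(up_down, 2))  # positive at lag 2
```

A correlogram simply plots these values for lags 1, 2, 3, and so on.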

How do I build a time series model? The ARMA or ARIMA model is a standard statistical model for time series forecasting and analysis. Along with its development, the authors Box and Jenkins also suggested a process for identifying, estimating, and checking models for a specific time series dataset. This process is now referred to as the Box-Jenkins (BJ) Method. It is an iterative approach that consists of the following three steps:

- **Identification**: Determine the order of the model (p, d, and q) required to capture the salient dynamic features of the data. This mainly relies on graphical procedures (plotting the series, the ACF and PACF, etc.).
- **Estimation**: Fit the model with the chosen p, d and q orders to the actual time series, minimizing the loss or error term.
- **Diagnostic checking**: Evaluate the fitted model in the context of the available data and check for areas where the model may be improved.

How do we handle random variations in data? Whenever we collect data over some period of time, there is some form of random variation. Smoothing is the technique used for reducing or canceling the effect of random variation.

There are two distinct groups of smoothing methods:

**Averaging Methods**:

(a) Moving Average: forecast the next value in a time series as the average of a fixed number 'p' of the previous values.

(b) Weighted Average: assign a weight to each of the previous observations and take the weighted average. The sum of all the weights should equal 1.

**Exponential Smoothing Methods**: These assign exponentially decreasing weights as the observations get older. In other words, recent observations are given relatively more weight in forecasting than older observations. There are several varieties of this method:

(a) Simple exponential smoothing for series with no trend and seasonality. The basic formula is \( S_{t+1} = \alpha y_t + (1-\alpha)S_t, \;\;\; 0 < \alpha \le 1,\; t > 0 \).

(b) Double exponential smoothing for series with a trend and no seasonality.

(c) Triple exponential smoothing for series with both trend and seasonality.
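The two groups above can be sketched in plain Python. The moving average uses a trailing window, and the simple exponential smoother follows the formula given earlier; seeding the smoothed series with the first observation is a common convention, assumed here rather than stated in the text.

```python
def moving_average(series, p):
    # Average of each window of p consecutive values.
    return [sum(series[i:i + p]) / p for i in range(len(series) - p + 1)]

def simple_exponential_smoothing(series, alpha):
    # S_{t+1} = alpha * y_t + (1 - alpha) * S_t
    smoothed = [series[0]]  # assumption: seed with the first observation
    for y in series[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed

noisy = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0]
print(moving_average(noisy, 3))                  # [4.0, 5.0, 5.0, 6.0]
print(simple_exponential_smoothing(noisy, 0.5))  # [3.0, 4.0, 4.0, 5.0, 5.0, 6.0]
```

Note how both outputs fluctuate less than the input: the random up-and-down movement is damped while the underlying rise is preserved.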

## References

- Brownlee, Jason. 2017. "A Gentle Introduction to the Box-Jenkins Method for Time Series Forecasting." Accessed 2018-07-28.
- Dahodwala, Murtuza. 2018. "Beginners Guide To Time Series Analysis with Implementation in R." Digital Vidya, February 20. Accessed 2018-07-28.
- Gavrilov, Viktor. 2015. "Time series analysis: smoothing." Accessed 2018-07-28.
- Martins, Sergio R. 2006. "How to Teach Some Basic Concepts in Time Series Analysis." Accessed 2018-07-28.
- Morrison, Jeff. 2018. "Autoregressive Integrated Moving Average Models (ARIMA)." Accessed 2018-07-28.
- NIST/SEMATECH. 2013. "Engineering Statistics Handbook." Accessed 2018-07-28.

## Tags

## See Also

- Regression Modelling
- Exploratory Data Analysis
- Data Science
- Time Series Smoothing
- Time Series Database

## Further Reading

- ARMA model
- ARIMA model
- Box-Jenkins methodology
- Time Series Forecast : A basic introduction using Python
- Using python to work with time series data