Hi there! In this blog post, let’s discuss time series analysis using ARIMA models.

First of all, what is a time series? A time series is simply a **sequence of numbers collected at regular intervals** over a period of time. Examples include the price of a commodity tracked over time or the GDP of a region. When we model such data, we typically try to decompose the series into trend, seasonal, and cyclical components.

Time series data are mainly used to build forecasting models. Time series forecasting is the use of a model to predict future values based on previously observed values. Models for time series data can take many forms and represent different processes.

When modeling variations in the level of a process, three broad classes of practical importance are:

- Autoregressive (AR) models
- Integrated (I) models
- Moving average (MA) models

These three classes depend linearly on previous data points.

Combining these ideas produces autoregressive moving average (**ARMA**) and autoregressive integrated moving average (**ARIMA**) models. ARIMA models are applied where the data show evidence of non-stationarity: the integrated part differences the series, one or more times, to remove it.
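To get a feel for these model classes, here is a minimal sketch that simulates each one with base R's `arima.sim()`. The coefficients (0.7 and 0.5), the series length and the seed are arbitrary values chosen for illustration, not estimates from the PPI data.

```r
set.seed(42)  # arbitrary seed so the simulation is reproducible

n <- 200  # illustrative series length

# AR(1): each value depends linearly on the previous value
ar1 <- arima.sim(model = list(ar = 0.7), n = n)

# MA(1): each value depends linearly on the previous random shock
ma1 <- arima.sim(model = list(ma = 0.5), n = n)

# ARIMA(1,1,1): an ARMA(1,1) process that has been integrated once
arima111 <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.7, ma = 0.5), n = n)

# Differencing the integrated series once gives back an ARMA(1,1)-type series
d.arima111 <- diff(arima111)

# Plot the three simulated series one above the other
par(mfrow = c(3, 1))
plot(ar1, main = "Simulated AR(1)")
plot(ma1, main = "Simulated MA(1)")
plot(arima111, main = "Simulated ARIMA(1,1,1)")
```

The first two series wander around a fixed level, while the integrated one drifts, which is the kind of behaviour the differencing step is meant to remove.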

**STATIONARITY:** Many time series techniques assume that the data are stationary, i.e. that the mean, variance and autocorrelation structure of the process do not change over time.
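As a rough illustration of the idea, the sketch below compares white noise (stationary) with a random walk built from the same noise (non-stationary). The data are simulated and the seed and length are arbitrary assumptions; nothing here comes from the PPI data set.

```r
set.seed(1)  # arbitrary seed for reproducibility

n  <- 500
e  <- rnorm(n)    # white noise: constant mean and variance
rw <- cumsum(e)   # random walk: the running sum of the noise

# Label the first and second half of the sample
half <- rep(c("first", "second"), each = n / 2)

# For white noise the two halves should look alike
tapply(e, half, mean)
tapply(e, half, var)

# For the random walk the level typically differs between the halves
tapply(rw, half, mean)

# First differences of the random walk recover the stationary noise
d.rw <- diff(rw)
```

This is only an informal check. A formal test such as the Dickey-Fuller test is still needed before drawing conclusions.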

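**MOVING AVERAGE:** A calculation that analyzes data points by creating a series of averages of different subsets of the full data set. Note that the MA part of an ARIMA model is a related but different idea: it models the current value as a linear combination of past forecast errors.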
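The averaging calculation can be sketched with base R's `stats::filter()`. The input vector and the window of 3 are made-up values for illustration.

```r
# Hypothetical data, chosen so the averages are easy to check by hand
x <- c(2, 4, 6, 8, 10, 12)

# Centered 3-point moving average: each output is the mean of a value
# and its two neighbours; the two ends have no full window, so they are NA
ma3 <- stats::filter(x, rep(1 / 3, 3), sides = 2)

as.numeric(ma3)
# NA 4 6 8 10 NA
```

Setting `sides = 1` instead would average each value with the ones before it, which is the form usually used for smoothing up to the present.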

Now let’s see how to implement an ARIMA model in R.

**STEP1**: Load the data.

mydata <- read.csv("timeseries_ppi.csv")

You can get the dataset from this link.

The data is about the producer price index (PPI).

The head of the data set is as shown

Defining the variables:

Y <- mydata$ppi  # dependent variable

d.Y <- diff(Y)  # diff() returns suitably lagged and iterated differences

t <- mydata$yearqrt  # time
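To see what `diff()` does, here is a small sketch on a made-up vector (the numbers are hypothetical, not PPI values).

```r
# Hypothetical index values
x <- c(100, 103, 107, 106, 110)

# First differences: x[t] - x[t-1]
d1 <- diff(x)
d1
# 3 4 -1 4

# Differences at lag 2: x[t] - x[t-2]
d.lag2 <- diff(x, lag = 2)
d.lag2
# 7 3 3

# Second-order differences: the differences of the first differences
d2 <- diff(x, differences = 2)
d2
# 1 -5 5
```

Each round of differencing shortens the series by one observation (or by `lag` observations), which is why `d.Y` has one value fewer than `Y`.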

**STEP2:** Plotting the variables.

plot(t,Y)

From the plot it is clear that the data is not stationary, so let’s run some formal tests of stationarity.

**STEP3:** Conducting various statistical tests

Test 1: Dickey-Fuller test

This test is used to check for stationarity. Its null hypothesis is that the series has a unit root, i.e. that it is non-stationary.

library(tseries)  # provides adf.test()

adf.test(Y, alternative = "stationary", k = 0)

The test statistic is -0.79 and the p-value is considerably high, so the null hypothesis of a unit root cannot be rejected.

The conclusion of this test is that the data is not stationary.

Running this test on the differenced variable we get

adf.test(d.Y, k=0)

The test statistic is -6.8398, which suggests that the differenced variable is stationary. So the **conclusion** of this test is that we need to use the differenced variable in the ARIMA model.

Test 2: Correlation tests

acf(Y)  # autocorrelation function

The graph suggests that the data is not stationary.

pacf(Y)  # partial autocorrelation function
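The same two functions can also return the correlations as numbers instead of a plot. The sketch below uses a simulated AR(1) series; the coefficient 0.8, the length and the seed are assumptions made for illustration.

```r
set.seed(123)  # arbitrary seed for reproducibility

# Simulated stationary AR(1) series
x <- arima.sim(model = list(ar = 0.8), n = 500)

# plot = FALSE returns the estimates without drawing anything
a <- acf(x, lag.max = 10, plot = FALSE)
p <- pacf(x, lag.max = 10, plot = FALSE)

# acf() starts at lag 0 (always 1), so lags 1 to 3 are elements 2 to 4
round(a$acf[2:4], 2)

# pacf() starts at lag 1
round(p$acf[1:3], 2)
```

For an AR(1) process the autocorrelations are expected to decay gradually while the partial autocorrelations drop towards zero after lag 1. A series whose autocorrelations decay very slowly is a common sign of non-stationarity.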

**STEP 4:** Implementing the ARIMA model

Have a look at the basic syntax by following the link

Estimate different ARIMA models.

arima(Y, order = c(1,0,0))

arima(Y, order = c(2,0,0))

arima(Y, order = c(0,0,1))

arima(Y, order = c(1,0,1))

# ARIMA on differenced variable

arima(d.Y, order = c(1,0,0))

arima(d.Y, order = c(0,0,1))

arima(d.Y, order = c(1,0,1))

arima(d.Y, order = c(1,0,3))

arima(d.Y, order = c(2,0,3))

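The best ARIMA model is selected based on its AIC value: the lower the AIC, the better the model. AIC values are only comparable between models fitted to the same series, so compare the models on Y with each other and the models on d.Y with each other.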
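Comparing AIC values by eye gets tedious with many candidates, so here is a minimal sketch that fits several orders in a loop and picks the lowest AIC. It uses a simulated series and an arbitrary list of candidate orders, both of which are assumptions made for illustration rather than part of the PPI analysis.

```r
set.seed(7)  # arbitrary seed for reproducibility

# Simulated ARIMA(1,1,1) series standing in for real data
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.6, ma = 0.3), n = 300)

# Candidate (p, d, q) orders; d is the same for all so the AICs are comparable
orders <- list(c(1, 1, 0), c(0, 1, 1), c(1, 1, 1), c(2, 1, 1))

# Fit each candidate and collect its AIC
fits <- lapply(orders, function(o) arima(y, order = o))
aics <- sapply(fits, AIC)
names(aics) <- sapply(orders, paste, collapse = ",")

# Show the AICs from best to worst and pick the lowest one
sort(aics)
best <- names(aics)[which.min(aics)]
best
```

A lower AIC only says a model is better relative to the other candidates, so it is still worth checking the residuals of the chosen model.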

**STEP 5:** Finally, predicting with the ARIMA(1, 0, 1) model and plotting the output.

mydata.arima101 <- arima(Y, order = c(1,0,1))

mydata.pred1 <- predict(mydata.arima101, n.ahead=100)

plot(Y, xlim = c(1, length(Y) + 100))  # extend the x-axis so the 100 forecast steps fit on the plot

lines(mydata.pred1$pred, col = "blue")

lines(mydata.pred1$pred + 2 * mydata.pred1$se, col = "red")

lines(mydata.pred1$pred - 2 * mydata.pred1$se, col = "red")

Now, predicting using the differenced variable, we get

mydata.arima111 <- arima(d.Y, order = c(1,0,1))

mydata.pred1 <- predict(mydata.arima111, n.ahead=100)

plot(d.Y, xlim = c(1, length(d.Y) + 100))  # extend the x-axis so the 100 forecast steps fit on the plot

lines(mydata.pred1$pred, col = "blue")

lines(mydata.pred1$pred + 2 * mydata.pred1$se, col = "red")

lines(mydata.pred1$pred - 2 * mydata.pred1$se, col = "red")
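Forecasts from a model fitted to the differenced variable are forecasts of changes, not of levels. One way to get level forecasts is to let `arima()` do the differencing by setting d = 1 on the original series. The sketch below shows this on simulated data; the series, the horizon of 20 steps and the seed are assumptions for illustration. Note that `arima()` drops the mean term when d > 0, so this is close to, but not exactly the same as, fitting ARIMA(1,0,1) to the differenced series.

```r
set.seed(99)  # arbitrary seed for reproducibility

# Simulated ARIMA(1,1,1) series standing in for the level variable
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.6, ma = 0.3), n = 200)

# Fit on the levels; arima() differences the series internally
fit <- arima(y, order = c(1, 1, 1))

# Forecast h steps ahead, in levels
h  <- 20
fc <- predict(fit, n.ahead = h)

# Approximate 95% band: forecast plus or minus 2 standard errors
upper <- fc$pred + 2 * fc$se
lower <- fc$pred - 2 * fc$se

# Widen both axes so the forecasts and the band are visible
plot(y, xlim = c(tsp(y)[1], tsp(y)[2] + h), ylim = range(y, upper, lower))
lines(fc$pred, col = "blue")
lines(upper, col = "red")
lines(lower, col = "red")
```

The band widens with the horizon because the uncertainty of an integrated series accumulates from one step to the next.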
