Hi all, this post is about regression. I will be writing a series of posts explaining different regression methods. Let's start with **what regression is**: a statistical method for finding the **relationship** between variables.

One regression method is **simple linear regression**, which is used when there is **only one** explanatory variable. Wondering what an “explanatory” variable is? It is just another name for an independent variable.

Simple linear regression is used to **predict the value of a dependent variable** based on an independent variable. A few statistical terms need to be understood before we get to the regression itself, so here we go.

- SUM OF SQUARED ERRORS (SSE): A measure of the deviation between the observed data and the values estimated by the model, computed by squaring each error and adding them up. The purpose of simple linear regression is to find the model that minimizes SSE.
- In simple linear regression we follow the equation Y = α + (β*X) + ε, where

X –> independent variable, Y –> dependent variable,

α –> the constant (intercept), β –> the coefficient of the independent variable (slope), and

ε is the error term.

- Correlation: A measure of the extent to which two or more variables vary together.
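To make these terms concrete, here is a minimal sketch that computes α, β, SSE and the correlation by hand. The numbers in `x` and `y` are made up for illustration and are not the data set used later in this post; the formulas are the standard least-squares ones.

```r
# Made-up illustrative values; not the data set used in this post.
x <- c(2.9, 3.5, 4.1, 5.0, 6.2, 7.1)   # independent variable (X)
y <- c(4.0, 4.4, 5.1, 5.9, 7.0, 8.2)   # dependent variable (Y)

# Least-squares estimates of the slope (beta) and the intercept (alpha)
beta  <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
alpha <- mean(y) - beta * mean(x)

# Errors of the fitted line, and SSE as the sum of their squares
errors <- y - (alpha + beta * x)
sse <- sum(errors^2)

# Pearson correlation between the two variables
r <- cor(x, y)

print(c(alpha = alpha, beta = beta, sse = sse, r = r))
```

Any other choice of α and β would give a larger SSE on the same data, which is what "minimizing SSE" means in practice.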

To apply simple linear regression using **R**, follow these **steps**.

**STEP 1:** Load the data into R

You can get the data from the link provided: *data set*

Data <- read.csv("Unemployment_rate.csv")

The first few records of the data can be viewed with:

head(Data)
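Before modelling, it can help to check that the columns loaded as numbers and that nothing is missing. The sketch below uses inline CSV text as a stand-in for `Unemployment_rate.csv`, since the file itself is not reproduced here. The column names are taken from the model call later in this post, and all values apart from the 2.9 / 4.0 pair are invented.

```r
# Inline CSV text stands in for Unemployment_rate.csv (values are made up).
csv_text <- "unemployment_male,unemployment_female
2.9,4.0
3.5,4.4
4.1,5.1
5.0,5.9
6.2,7.0
7.1,8.2"

Data <- read.csv(text = csv_text)

str(Data)               # column names and types
head(Data, 3)           # first three records
colSums(is.na(Data))    # number of missing values per column
```

With the real file, only the `read.csv()` line changes; the checks stay the same.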

For example, in a record where the male unemployment rate is 2.9, the female unemployment rate is 4.0.

Unemployment_rate_for_male –> **Independent** variable.

Unemployment_rate_for_female –> **Dependent** variable.

I am plotting the data for better understanding.

plot(Data)

**STEP 2:** Creating the **linear model**.

Now that we know which variable is independent and which is dependent, let's create the linear model and see the relationship that exists between them.

linearmodel <- lm(unemployment_female ~ unemployment_male, data = Data)

The summary of the linear model can be viewed using the command

summary(linearmodel)

For a clear understanding of summary(linearmodel), go to the following *link.*
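If you only need a few numbers from the summary, they can be pulled out directly. This is a small sketch on made-up data (the column names follow the model call above; the values are not the real data set), so the numbers it prints will differ from the ones you get with the actual file.

```r
# Made-up stand-in for the unemployment data set.
Data <- data.frame(
  unemployment_male   = c(2.9, 3.5, 4.1, 5.0, 6.2, 7.1),
  unemployment_female = c(4.0, 4.4, 5.1, 5.9, 7.0, 8.2)
)

# Dependent variable on the left of ~, independent variable on the right
linearmodel <- lm(unemployment_female ~ unemployment_male, data = Data)
model_summary <- summary(linearmodel)

print(coef(linearmodel))            # intercept (alpha) and slope (beta)
print(model_summary$coefficients)   # estimates, std. errors, t values, p-values
print(model_summary$r.squared)      # share of the variance explained by the model
print(model_summary$sigma)          # residual standard error
```

For simple linear regression, the R-squared value is the square of the correlation between the two variables.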

Now, abline(linearmodel) draws the linear model line on the plot, which visualizes the model that we just created.
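Here is a sketch of the scatter plot and the fitted line together, again on made-up values. The axis labels and colours are my own choices. Note that the line only matches the points when the plot has the independent variable on the x-axis and the dependent variable on the y-axis, the same way round as in the model.

```r
# Made-up stand-in for the unemployment data set.
Data <- data.frame(
  unemployment_male   = c(2.9, 3.5, 4.1, 5.0, 6.2, 7.1),
  unemployment_female = c(4.0, 4.4, 5.1, 5.9, 7.0, 8.2)
)
linearmodel <- lm(unemployment_female ~ unemployment_male, data = Data)

# Scatter plot: independent variable on x, dependent variable on y
plot(unemployment_female ~ unemployment_male, data = Data,
     xlab = "Unemployment rate (male)",
     ylab = "Unemployment rate (female)",
     main = "Simple linear regression")

# Add the fitted regression line on top of the points
abline(linearmodel, col = "red", lwd = 2)
```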

**STEP 3:** Finally, using the linear model, **predict** the dependent variable.

The value of the dependent variable can be predicted using the predict() function.

Hence, it can be inferred that, for a male unemployment rate of 2, the model predicts a female unemployment rate of 2.823 for the given data.
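A prediction call could look like the sketch below. The data is made up, so the predicted values will not match the 2.823 obtained from the real data set, and `new_data` is just a name I picked. The important detail is that the new data frame must use the same column name as the independent variable in the model.

```r
# Made-up stand-in for the unemployment data set.
Data <- data.frame(
  unemployment_male   = c(2.9, 3.5, 4.1, 5.0, 6.2, 7.1),
  unemployment_female = c(4.0, 4.4, 5.1, 5.9, 7.0, 8.2)
)
linearmodel <- lm(unemployment_female ~ unemployment_male, data = Data)

# New values of the independent variable to predict for
new_data <- data.frame(unemployment_male = c(2, 5))

# Point predictions of the dependent variable
predicted <- predict(linearmodel, newdata = new_data)
print(predicted)

# The same predictions with a 95% prediction interval
print(predict(linearmodel, newdata = new_data, interval = "prediction"))
```

This is also why the model formula uses bare column names with `data = Data` rather than `Data$...` inside the formula: with the `$` form, `predict()` cannot match the columns of `new_data` to the model.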
