Normal distribution functions in R

Hi everyone,

Here I explain what you can do with the normal distribution functions in R.

The normal distribution is one of the most important patterns in statistics and appears in many natural phenomena, such as heights, blood pressure, and the lengths of objects produced by machines. When such data are graphed as a histogram (data values on the horizontal axis, counts on the vertical axis), they form a bell-shaped curve known as a normal curve or normal distribution.

Normal distributions are symmetrical with a single central peak at the mean (average) of the data.  The shape of the curve is described as bell-shaped with the graph falling off evenly on either side of the mean.  Fifty percent of the distribution lies to the left of the mean and fifty percent lies to the right of the mean.

 


Graph 1: A sample normal distribution

 

Importance of Normal distribution.

  1. Many things are normally distributed, or very close to it. For example:
  • Heights of people
  • Sizes of things produced by machines
  • Errors in measurement
  • Blood pressure
  • Marks on a test
  2. The normal distribution is easy to work with mathematically. In many practical cases, methods developed using normal theory work quite well even when the distribution is not normal.
  3. There is a very strong connection between the sample size N and the extent to which a sampling distribution approaches the normal form. Many sampling distributions based on large N can be approximated by the normal distribution even though the population distribution itself is definitely not normal.
  4. The normal distribution is important because of the Central Limit Theorem, which tells us that the sampling distribution of the mean of samples drawn from a non-normal population (with finite variance) approaches a normal distribution as the sample size increases. This is what allows us to perform hypothesis tests on many kinds of data.
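
The Central Limit Theorem point can be checked with a small simulation. This is only a minimal sketch and not part of the original material: it assumes an exponential population with rate 1 (a clearly skewed, non-normal choice made purely for illustration), and the seed and the names n, reps and sample_means are hypothetical.

```r
set.seed(42)                      # assumed seed, for reproducibility only

n    <- 50                        # size of each sample (illustrative)
reps <- 5000                      # number of repeated samples (illustrative)

# Draw `reps` samples of size n from a skewed exponential population
# (mean 1, sd 1) and keep the mean of each sample.
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

# The CLT suggests the means are roughly normal with mean 1 and sd 1/sqrt(n).
mean(sample_means)
sd(sample_means)
1 / sqrt(n)

# Histogram of the sample means with the approximating normal curve on top.
hist(sample_means, freq = FALSE, main = "Means of exponential samples")
curve(dnorm(x, mean = 1, sd = 1 / sqrt(n)), add = TRUE, lwd = 2)
```

Even though the exponential population is strongly skewed, the histogram of the sample means should look close to the bell curve drawn over it.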

Normal distribution with R.

The following functions support Normal distribution in R.

Function  Purpose                                  Syntax              Example
rnorm     Random numbers from a normal             rnorm(n,mean,sd)    rnorm(500, 3, .25): generates 500 numbers from a normal with mean 3 and sd=.25
dnorm     Probability density function (PDF)       dnorm(x,mean,sd)    dnorm(0, 0, .5): gives the density (height of the PDF) at 0 of the normal with mean=0 and sd=0.5
pnorm     Cumulative distribution function (CDF)   pnorm(q,mean,sd)    pnorm(1.96, 0, 1): gives the area under the standard normal curve to the left of 1.96, i.e. ~0.975
qnorm     Quantile function (inverse of pnorm)     qnorm(p,mean,sd)    qnorm(0.975, 0, 1): gives the value at which the CDF of the standard normal is .975, i.e. ~1.96

Table 1: Representation of functions of Normal distribution
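
The examples in Table 1 can be run as they are. The sketch below simply evaluates them; the seed and the variable name draws are assumptions added here so the rnorm() call is repeatable.

```r
# Examples from Table 1, evaluated directly.
set.seed(1)                       # assumed seed so the rnorm() output is repeatable
draws <- rnorm(500, 3, 0.25)      # 500 random numbers, mean 3, sd 0.25
length(draws)                     # 500

dnorm(0, 0, 0.5)                  # height of the PDF at 0, about 0.798
pnorm(1.96, 0, 1)                 # area to the left of 1.96, about 0.975
qnorm(0.975, 0, 1)                # value with 97.5% of the area below it, about 1.96
```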

Real-life examples involving the above functions.

1. Cumulative distribution function (CDF)

Ex 1: Suppose the mean and standard deviation of the heights of class 10 students are as follows.
Mean = 172 cm
Standard deviation = 10 cm
A. Compute the probability of a student being no taller than 180 cm (i.e. less than or equal to 180 cm).
P(X<=180)

Solution:
R code: pnorm(180,mean=172,sd=10)
(Or) pnorm(180,mean=172,sd=10,lower.tail=TRUE)
[1] 0.7881446

Therefore, about 78.81% of students are no taller than 180 cm.

B. Compute the probability of a student being taller than 185 cm (i.e. greater than or equal to 185 cm; for a continuous distribution the two probabilities are the same).

P(X>=185)

Solution:

R code:       1-pnorm(185,mean=172,sd=10)

(or)              pnorm(185,mean=172,sd=10,lower.tail=FALSE)

[1] 0.09680048

Therefore, we can conclude that about 9.68% of students are taller than 185 cm.
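
Both answers in this example can also be obtained by standardising the height first and using the standard normal. The sketch below is an illustration added here, not part of the original solution; the variable names mu, sigma, z_a and z_b are hypothetical.

```r
mu    <- 172                      # mean height in cm (from the example)
sigma <- 10                       # standard deviation in cm (from the example)

# Part A: P(X <= 180) via the z-score and the standard normal CDF.
z_a <- (180 - mu) / sigma         # 0.8
pnorm(z_a)                        # same value as pnorm(180, mean = 172, sd = 10)

# Part B: P(X >= 185) in three equivalent ways.
z_b <- (185 - mu) / sigma         # 1.3
1 - pnorm(z_b)
pnorm(z_b, lower.tail = FALSE)
pnorm(185, mean = mu, sd = sigma, lower.tail = FALSE)
```

Using lower.tail = FALSE avoids the subtraction from 1, which can matter numerically when the upper-tail probability is very small.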

2. Quantile function

The function qnorm(), which comes standard with R, is the inverse of pnorm(): given a probability, it returns the value below which that proportion of the distribution lies.

Ex 1: Suppose you want to find the 85th percentile of a normal distribution whose mean is 70 and whose standard deviation is 3. Then you ask for:

qnorm(0.85,mean=70,sd=3)

[1] 73.1093

The value 73.1093 is indeed the 85th percentile, in the sense that 85% of the values in a population that is normally distributed with mean 70 and standard deviation 3 will lie below 73.1093. In other words, if you were to pick a random member X from such a population, then

P(X<73.1093) =0.85

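Ex 2: Let’s find the 25th percentile, i.e. the first quartile (Q1), of a normal distribution with mean 75 and standard deviation 5.

Q1: qnorm(p=0.25,mean=75,sd=5,lower.tail=TRUE)

[1] 71.62755

So 71.63 is the value of the first quartile: 25% of the values lie below it.

P(X<71.62755) =0.25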
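
The inverse relationship between qnorm() and pnorm() in Ex 1 can be checked directly. This is a small sketch added for illustration; the variable names p and q are hypothetical.

```r
p <- 0.85
q <- qnorm(p, mean = 70, sd = 3)  # 85th percentile, about 73.1093

# Feeding the quantile back into pnorm() recovers the probability.
pnorm(q, mean = 70, sd = 3)       # 0.85

# The same quantile built from the standard normal: mean + sd * z.
70 + 3 * qnorm(p)
```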

3. Density function 

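Let’s create a sequence of x values and compute the density at each one.

x <- seq(from=55,to=95,by=0.25)

density <- dnorm(x,mean=75,sd=5)

plot(x,density,type="l")

The second plot() call below draws the same curve with a title and axis labels, and abline(v=75) marks the mean.

plot(x,density,type="l",main="X Normal: Mean=75,SD=5",xlab="X",ylab="Probability Density",las=1)

abline(v=75)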


Graph 2: Plot of x and density function
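
To see what dnorm() computes, the density can be written out by hand from the standard formula for the normal PDF and compared with dnorm(). This is a sketch added for illustration; the names mu, sigma and manual are hypothetical.

```r
x     <- seq(from = 55, to = 95, by = 0.25)
mu    <- 75
sigma <- 5

# Normal PDF written out: exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
manual <- exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))

all.equal(manual, dnorm(x, mean = mu, sd = sigma))   # TRUE

# The curve peaks at the mean, where the height is 1 / (sigma * sqrt(2 * pi)).
x[which.max(manual)]              # 75
max(manual)                       # about 0.0798
```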

 

4. A random sample from a normally distributed population

rnorm is used to generate n normal random numbers with arguments mean and sd.

random<-rnorm(n=40,mean=75,sd=5)

hist(random)

The plot below shows a histogram of the random numbers.


Graph 3:  Histogram representing Normal distribution
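
Because rnorm() draws random numbers, each call gives different values unless a seed is set first. A minimal sketch, assuming an arbitrary seed of 123 and the hypothetical variable names first and second:

```r
set.seed(123)                     # arbitrary seed chosen for illustration
first <- rnorm(n = 40, mean = 75, sd = 5)

set.seed(123)                     # same seed again
second <- rnorm(n = 40, mean = 75, sd = 5)

identical(first, second)          # TRUE: same seed, same draws

# With only 40 draws the sample mean and sd are near, but not equal to, 75 and 5.
mean(first)
sd(first)
```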

 

The following code demonstrates the above functions.

set.seed(3000)      # Set the seed so the random draws are reproducible

xseq<-seq(-4,4,.01)    # Sequence of numbers from -4 to 4 in steps of .01

densities<-dnorm(xseq, 0,1)     # Probability density function at each value

cumulative<-pnorm(xseq, 0, 1)  # Cumulative distribution function at each value

randomdeviates<-rnorm(1000,0,1) # 1000 random numbers from the standard normal

par(mfrow=c(1,3), mar=c(3,4,4,2))   # Three plots side by side

# Plot of the sequence against the corresponding densities

plot(xseq, densities, col="darkgreen",xlab="", ylab="Density", type="l",lwd=2, cex=2, main="PDF of Standard Normal", cex.axis=.8)

# Plot of the sequence against the cumulative distribution function

plot(xseq, cumulative, col="darkorange", xlab="", ylab="Cumulative Probability",type="l",lwd=2, cex=2, main="CDF of Standard Normal", cex.axis=.8)

# Histogram of the random deviates

hist(randomdeviates, main="Random draws from Std Normal", cex.axis=.8, xlim=c(-4,4))


Graph 4: Plots showing normal distribution
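
The random draws can also be compared with the theoretical curve numerically and visually. The sketch below is an addition for illustration: it reuses the seed and the randomdeviates name from the code above, while the choice of 1.96 as the cut-off and the plot title are assumptions.

```r
set.seed(3000)                    # same seed as the code above
randomdeviates <- rnorm(1000, 0, 1)

# Share of draws at or below 1.96, compared with the theoretical CDF value.
mean(randomdeviates <= 1.96)
pnorm(1.96)                       # about 0.975

# Histogram on the density scale with the standard normal PDF drawn over it.
hist(randomdeviates, freq = FALSE, xlim = c(-4, 4),
     main = "Random draws vs. standard normal PDF")
curve(dnorm(x), from = -4, to = 4, add = TRUE, lwd = 2)
```

With freq = FALSE the histogram bars are scaled to total area 1, so they are directly comparable with the density curve.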


 
