This article explains how to work with the normal distribution in R.
A normal distribution is a very important statistical data distribution pattern that occurs in many natural phenomena, such as height, blood pressure, and the lengths of objects produced by machines. Certain data, when graphed as a histogram (data values on the horizontal axis, frequency on the vertical axis), create a bell-shaped curve known as a normal curve or normal distribution.
Normal distributions are symmetrical with a single central peak at the mean (average) of the data. The shape of the curve is described as bell-shaped with the graph falling off evenly on either side of the mean. Fifty percent of the distribution lies to the left of the mean and fifty percent lies to the right of the mean.
Graph 1: A sample normal distribution
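The symmetry described above can be checked numerically with pnorm, the cumulative distribution function covered later in this article. This is a minimal sketch: the mean, standard deviation, and offset d are illustrative values chosen here, not figures from the graph.

```r
# Illustrative values only; any mean and sd behave the same way.
mu <- 172
sigma <- 10

# Half of the distribution lies at or below the mean.
p_left <- pnorm(mu, mean = mu, sd = sigma)

# Symmetry: the area below (mu - d) equals the area above (mu + d).
d <- 7  # hypothetical offset from the mean
p_lower_tail <- pnorm(mu - d, mean = mu, sd = sigma)
p_upper_tail <- pnorm(mu + d, mean = mu, sd = sigma, lower.tail = FALSE)

print(p_left)
print(all.equal(p_lower_tail, p_upper_tail))
```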
Importance of Normal distribution.
- Many things actually are normally distributed or very close to it. For example:
  - Heights of people
  - Size of the things produced by machines
  - Errors in measurement
  - Blood pressure
  - Marks on a test
- The normal distribution is easy to work with mathematically. In many practical cases, the methods developed using normal theory work quite well even when the distribution is not normal.
- There is a very strong connection between the size of a sample N and the extent to which a sampling distribution approaches the normal form. Many sampling distributions based on large N can be approximated by the normal distribution even though the population distribution itself is definitely not normal.
- The normal distribution is important because of the Central Limit Theorem, which tells us that the sampling distribution of the mean approaches a normal distribution as the sample size increases, even when the underlying population is not normal. This allows us to perform hypothesis testing on all sorts of data.
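The Central Limit Theorem point can be illustrated with a small simulation. The sketch below assumes an exponential population (clearly non-normal) and arbitrary choices for the seed, the number of samples, and the sample size; none of these come from the article.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

n_samples <- 5000   # number of repeated samples (assumed value)
sample_size <- 50   # observations per sample (assumed value)

# Exponential population with rate 1: strongly right-skewed, mean 1, sd 1.
sample_means <- replicate(n_samples, mean(rexp(sample_size, rate = 1)))

# The CLT suggests the means are roughly normal with
# mean 1 and standard deviation 1 / sqrt(sample_size).
print(mean(sample_means))
print(sd(sample_means))
print(1 / sqrt(sample_size))

# Histogram of the sample means with the approximating normal curve overlaid.
hist(sample_means, breaks = 40, freq = FALSE,
     main = "Means of exponential samples", xlab = "Sample mean")
curve(dnorm(x, mean = 1, sd = 1 / sqrt(sample_size)), add = TRUE, lwd = 2)
```

Although each individual observation is skewed, the histogram of the means should look close to the bell-shaped curve drawn on top of it.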
Normal distribution with R.
R provides the following functions for working with the normal distribution.
| Function | Purpose | Syntax | Example |
|---|---|---|---|
| rnorm | Generates random numbers from a normal distribution | rnorm(n,mean,sd) | rnorm(500, 3, .25) generates 500 numbers from a normal with mean 3 and sd 0.25 |
| dnorm | Probability density function (PDF) | dnorm(x,mean,sd) | dnorm(0, 0, .5) gives the density (height of the PDF) at 0 of the normal with mean=0 and sd=0.5 |
| pnorm | Cumulative distribution function (CDF) | pnorm(q,mean,sd) | pnorm(1.96, 0, 1) gives the area under the standard normal curve to the left of 1.96, i.e. ~0.975 |
| qnorm | Quantile function (inverse of pnorm) | qnorm(p,mean,sd) | qnorm(0.975, 0, 1) gives the value at which the CDF of the standard normal is .975, i.e. ~1.96 |
Table 1: Representation of functions of Normal distribution
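The relationships between these four functions can be checked directly. The following is a minimal sketch using the example arguments from the table; the seed and the variable names are choices made here for illustration.

```r
# qnorm() inverts pnorm(): going there and back returns the original value.
p <- pnorm(1.96, mean = 0, sd = 1)
q <- qnorm(p, mean = 0, sd = 1)

# dnorm() matches the closed-form normal density formula.
x <- 0
mu <- 0
sigma <- 0.5
manual_density <- exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))

# rnorm() draws should have a sample mean and sd near the requested values.
set.seed(1)  # arbitrary seed for reproducibility
draws <- rnorm(500, mean = 3, sd = 0.25)

print(c(p = p, q = q))
print(c(dnorm = dnorm(x, mu, sigma), manual = manual_density))
print(c(mean = mean(draws), sd = sd(draws)))
```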
Worked examples involving the above functions.
1. Cumulative distribution function (CDF)
Ex 1: Suppose the mean and standard deviation of the heights of class 10 students are as follows.
Mean = 172 cm
Standard deviation = 10 cm
A. Compute the probability of a student being no taller than 180 cm. (i.e. less than or equal to 180 cm)
R code: pnorm(180,mean=172,sd=10)
The result is 0.7881446, so about 78.81% of students are no taller than 180 cm.
B. Compute the probability of a student being taller than 185 cm (i.e. greater than 185 cm).
R code: 1-pnorm(185,mean=172,sd=10)
(or) pnorm(185,mean=172,sd=10,lower.tail=FALSE)
0.09680048
Therefore, we can conclude that about 9.7% of students are taller than 185 cm.
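The same approach extends to the probability of falling between two heights, by subtracting one CDF value from another. In this sketch the lower bound of 165 cm is a hypothetical value added for illustration; the mean and standard deviation are those of the example above.

```r
mu <- 172
sigma <- 10

# P(165 < height <= 180) = P(height <= 180) - P(height <= 165)
# (165 cm is a hypothetical lower bound, not taken from the example.)
p_between <- pnorm(180, mean = mu, sd = sigma) - pnorm(165, mean = mu, sd = sigma)

# The same result via standardised z-scores on the standard normal.
z_low <- (165 - mu) / sigma
z_high <- (180 - mu) / sigma
p_between_z <- pnorm(z_high) - pnorm(z_low)

print(p_between)
print(p_between_z)
```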
2. Quantile function
The function qnorm(), which comes standard with R, is the inverse of pnorm: given a probability, it returns the value below which that proportion of the distribution lies.
Ex 1: Suppose you want to find the 85th percentile of a normal distribution whose mean is 70 and whose standard deviation is 3. Then you ask for:
R code: qnorm(0.85,mean=70,sd=3)
The value 73.1093 is indeed the 85th percentile, in the sense that 85% of the values in a population that is normally distributed with mean 70 and standard deviation 3 will lie below 73.1093. In other words, if you were to pick a random member X from such a population, then P(X ≤ 73.1093) = 0.85.
Ex 2: Let's find the 25th percentile, i.e. the first quartile (Q1), of a normal distribution with mean 75 and standard deviation 5.
R code: qnorm(p=0.25,mean=75,sd=5,lower.tail=TRUE)
71.62755
So the first quartile is approximately 71.63.
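Because qnorm is vectorised, several quantiles can be requested at once, and pnorm can confirm them. The sketch below reuses the mean of 75 and standard deviation of 5 from Ex 2; the third quartile and the interquartile range are additions made here for illustration.

```r
mu <- 75
sigma <- 5

# First and third quartiles in one vectorised call.
quartiles <- qnorm(c(0.25, 0.75), mean = mu, sd = sigma)
iqr <- diff(quartiles)

# Round trip: pnorm() of each quartile should give back 0.25 and 0.75.
recovered <- pnorm(quartiles, mean = mu, sd = sigma)

print(quartiles)
print(iqr)
print(recovered)
```

By symmetry, the two quartiles sit at equal distances on either side of the mean.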
3. Density function
Let's create a sequence of x values and compute the density at each one.
x <- seq(from=55,to=95,by=0.25)
density <- dnorm(x,mean=75,sd=5)
plot(x,density,type="l",main="X Normal: Mean=75,SD=5",xlab="X",ylab="Probability Density",las=1)
Graph 2: Plot of x and density function
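Two properties of the density curve can be checked numerically: it peaks at the mean, and the area under it is 1. This is a rough sketch on the grid from 55 to 95 in steps of 0.25, assuming mean 75 and standard deviation 5 as in the plot title; the area is a simple Riemann-sum approximation, so it is only approximately 1.

```r
x <- seq(from = 55, to = 95, by = 0.25)
density <- dnorm(x, mean = 75, sd = 5)

# The curve peaks at the mean, with height 1 / (sd * sqrt(2 * pi)).
peak_x <- x[which.max(density)]
peak_height <- max(density)

# Riemann-sum approximation of the area under the curve on this grid
# (the grid covers mean +/- 4 sd, so a tiny part of the tails is left out).
approx_area <- sum(density) * 0.25

print(peak_x)
print(peak_height)
print(approx_area)
```

Note that the density values themselves are heights of the curve, not probabilities; probabilities correspond to areas under it.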
4. A random sample from a normally distributed population
rnorm is used to generate n normal random numbers with arguments mean and sd.
random<-rnorm(n=40,mean=75,sd=5)
The plot below shows a histogram of these random numbers.
Graph 3: Histogram representing Normal distribution
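A histogram like this can be reproduced and compared with the theoretical curve. The sketch below is one possible way to do it; the seed and the plot labels are assumptions made here, and with only 40 draws the sample mean and standard deviation will be near, but not exactly, 75 and 5.

```r
set.seed(123)  # arbitrary seed so the draw is reproducible

random <- rnorm(n = 40, mean = 75, sd = 5)

# Sample statistics should be close to the population values 75 and 5.
sample_mean <- mean(random)
sample_sd <- sd(random)
print(c(mean = sample_mean, sd = sample_sd))

# Histogram on the density scale with the theoretical curve overlaid.
hist(random, freq = FALSE, main = "40 draws, mean 75, sd 5", xlab = "Value")
curve(dnorm(x, mean = 75, sd = 5), add = TRUE, lwd = 2)
```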
The following code demonstrates the above functions.
set.seed(3000) # Set the seed so the random draws are reproducible
xseq<-seq(-4,4,.01) # Sequence of numbers from -4 to 4 in steps of .01
densities<-dnorm(xseq, 0,1) # Probability density at each point of xseq
cumulative<-pnorm(xseq, 0, 1) # Cumulative probability at each point of xseq
randomdeviates<-rnorm(1000,0,1) # 1000 random numbers from the standard normal distribution
plot(xseq, densities, col="darkgreen",xlab="", ylab="Density", type="l",lwd=2, cex=2, main="PDF of Standard Normal", cex.axis=.8)
# Plot of the sequence against the corresponding densities
plot(xseq, cumulative, col="darkorange", xlab="", ylab="Cumulative Probability",type="l",lwd=2, cex=2, main="CDF of Standard Normal", cex.axis=.8)
# Plot of the sequence against the cumulative distribution function
hist(randomdeviates, main="Random draws from Std Normal", cex.axis=.8, xlim=c(-4,4))
# Histogram of the random deviates
Graph 4: Plots showing normal distribution