## Description

Correlation and
Types of Correlation

Introduction
• In statistics, correlation analysis quantifies the strength of the association between two numerical variables.

• In other words, correlation is a statistical measure that indicates the extent to which two or more variables fluctuate together.

• A correlation coefficient is a statistical measure of the degree to which changes in the value of one variable predict changes in the value of another.
www.DuloMix.com 2

Definitions

1) If two variables are inter-related in such a manner that a change in one variable brings about a change in the other variable, then this type of relation between the variables is known as correlation.

2) If a change in the value of one variable makes a corresponding change, on average, in the value of the other variable, then we can say the two variables are correlated. The value of the correlation coefficient varies from -1 to +1.

• A positive correlation indicates the extent to which variables increase or decrease in parallel.

• A negative correlation indicates the extent to which one variable increases as the other decreases.

• A zero correlation indicates no relation between variables.

Importance of Correlation:

• Correlation is very important in the field of Psychology and Education as a measure of the relationship between test scores and other measures of performance.

• With the help of correlation, it is possible to have a correct idea of the working capacity of a person.

• With its help, it is also possible to gain knowledge of the various qualities of an individual.

• Correlation is also helpful and necessary in providing educational guidance to a student in the selection of his subjects of study.

• It is also useful for understanding unknown variables and economic behavior.

Types of correlation
Correlation is classified as follows on the basis of degree of correlation, number of variables and linearity:

| Degree of correlation | Number of variables | Linearity |
|---|---|---|
| Positive | Simple | Linear |
| Negative | Partial | Non-linear |
| No correlation | Multiple | |
| Perfect | | |

Positive correlation

• If one variable increases and, under its impact, the other variable also increases, this is called positive correlation.

EXAMPLE: The more time you spend running on a treadmill, the more calories you will burn.

Negative correlation

• If one variable increases and, under its impact, the other variable decreases, this is called negative correlation.

EXAMPLE: If a train increases speed, the length of time to get to the final point decreases.

Perfect correlation

• A correlation of positive one (+1) indicates a perfect positive correlation, which means that both variables move together in the same direction.

Simple correlation

• A simple correlation is one which involves only 2 variables.

EXAMPLE: Correlation between demand and supply.

Partial Correlation

• When 3 or more variables are part of the analysis but only 2 are studied and the rest are kept constant, it is partial correlation.

EXAMPLE: Correlation between demand and supply, with income as a third variable that is kept constant.

Multiple Correlation

• When 3 or more variables are studied simultaneously, it is multiple correlation.

EXAMPLE: Rainfall, production of rice and cost of rice studied simultaneously.

Linear Correlation

• If changes in the amount of one variable tend to make changes in the amount of the other variable that bear a constant ratio, it is linear correlation.

EXAMPLE

| INCOME | 350 | 360 | 370 | 380 |
|---|---|---|---|---|
| WEIGHT | 30 | 40 | 50 | 60 |

Non Linear Correlation

• If changes in the amount of one variable tend to make changes in the amount of the other variable that do not bear a constant ratio, it is non linear correlation.

EXAMPLE

| INCOME | 350 | 360 | 370 | 380 |
|---|---|---|---|---|
| WEIGHT | 30 | 46 | 59 | 72 |
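The constant-ratio idea behind linear and non linear correlation can be sketched in a few lines of Python. The snippet below uses the income and weight figures from the two examples; the function name `has_constant_ratio` and its tolerance are assumptions made for illustration, not part of the slides.

```python
def has_constant_ratio(xs, ys, tol=1e-9):
    """Return True if every change in y divided by the matching change in x is the same."""
    ratios = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(zip(xs, ys), zip(xs[1:], ys[1:]))
    ]
    return all(abs(r - ratios[0]) <= tol for r in ratios)

income = [350, 360, 370, 380]
weight_linear = [30, 40, 50, 60]      # changes of 10 each time: ratio 1.0 throughout
weight_nonlinear = [30, 46, 59, 72]   # changes of 16, 13, 13: ratio is not constant

print(has_constant_ratio(income, weight_linear))      # True
print(has_constant_ratio(income, weight_nonlinear))   # False
```

A check like this only looks at successive differences, so it is meant for small, ordered tables such as the ones above rather than for noisy data.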

Coefficient of Correlation
• When two variables are correlated with each other, it is important to know the amount or extent of correlation between them.

• The numerical measure of correlation, or degree of relationship existing between two variables, is called the coefficient of correlation.

• It is denoted by r and it always lies between -1 and 1.

1. When r = 1, it represents perfect direct or positive correlation.

2. When r = -1, it represents perfect inverse or negative correlation.

3. When r = 0, there is no linear correlation, or it shows absence of correlation.

4. When the value of r is 0.9 or 0.8 etc., it shows a high degree of relationship between the variables, and when r is small, say 0.2 or 0.1 etc., it shows a low degree of correlation.

Method of Measuring A Correlation


Scatter Diagram

• The scatter diagram method is the simplest method to ascertain whether two variables are correlated and, if they are, what the direction of correlation is, i.e. positive or negative.

• In a scatter diagram, one variable is taken along the x axis and the other along the y axis.

• We then plot these points on graph paper (on the xy plane) and thus get the scattered points. This is called a scatter diagram.

• The way in which the points are scattered on the xy plane shows the degree and direction of the correlation between the two variables.

1) Perfect Positive Correlation

• If all the points lie on a straight line in the upward direction, the correlation is perfect positive.

2) Highly Positive Correlation

• If all the points are very near to a straight line in the upward direction, the scatter diagram shows highly positive correlation.

3) Positive Correlation

• If all the points are near to the straight line (but not very near), the correlation is positive.

4) Perfect Negative Correlation

• If all the points in a scatter diagram lie on a straight line in the downward direction, the correlation is perfect negative.

5) High Negative Correlation

• If the points are very close to a straight line in the downward direction, the correlation is high negative.

6) Negative Correlation

• If the points are close to a straight line (but not very close) in the downward direction, the correlation is negative.

7) Zero Correlation

• If the points are widely scattered in the graph, the correlation is said to be zero.

Correlation Graph

• In this method, we use the individual values of two variables, which are plotted on the graph sheet, and we obtain two different curves on the graph sheet.

• By examining the properties of the plotted points, we conclude whether or not the variables are correlated.

Example: Draw the diagram and examine the correlation between variables X and Y. The data are given in the following table:

| Year | 1990 | 1995 | 2000 | 2005 | 2010 |
|---|---|---|---|---|---|
| X | 5 | 7 | 6 | 6 | 8 |
| Y | 1 | 4 | 5 | 4 | 7 |

Solution: First we draw the graph between the variables.

Merits and Demerits of Graphical Method:

Merits:
a) It is a popular method of measuring the relationship between two variables.
b) It is a very easy method that does not involve any mathematical calculation.
c) Everyone can easily understand and examine it.

Demerits:
a) We cannot obtain the degree of correlation.
b) The graphical method is suitable only for a small number of data points.

Karl Pearson’s Coefficient of Correlation:

• Karl Pearson’s Coefficient of Correlation is used to measure the degree of linear relationship between two variables.

• It is also called the product moment correlation coefficient.

• This is the most widely used mathematical method of finding the magnitude of linear correlation.

• It gives not only the magnitude of the correlation but also its direction.

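• Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be n pairs of observations of two variables X and Y.

• The coefficient of correlation (r) between X and Y is defined by

$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \cdot \sigma_y}$$

where

$$\operatorname{Cov}(X, Y) = \text{covariance between X and Y} = \frac{1}{n} \sum (X - \bar{X})(Y - \bar{Y})$$

$\sigma_x$ = standard deviation of X
$\sigma_y$ = standard deviation of Y
n = number of pairs of observations $(x_i, y_i)$

The above formula can be written as

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n \, \sigma_x \cdot \sigma_y}$$

If we write $X - \bar{X} = x$ and $Y - \bar{Y} = y$, then the formula becomes

$$r = \frac{\sum xy}{n \, \sigma_x \cdot \sigma_y}$$

This formula is known as the product moment formula of the coefficient of correlation.

When deviations are taken from assumed means, as in the worked examples that follow, the formula takes the form

$$r = \frac{n \sum d_x d_y - (\sum d_x)(\sum d_y)}{\sqrt{n \sum d_x^2 - (\sum d_x)^2} \, \sqrt{n \sum d_y^2 - (\sum d_y)^2}}$$

where $d_x = x - A$ and $d_y = y - B$
A and B = assumed means
n = number of pairs of observations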
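As a minimal sketch of the definition r = Cov(X, Y) / (σx · σy), the Python function below computes each piece with divisor n, as in the formulas above. The function name `pearson_r` and the small data lists are illustrative assumptions and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r as Cov(X, Y) / (sigma_x * sigma_y), using divisor n throughout."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sigma_x * sigma_y)

# Illustrative values only (not taken from the worked examples).
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]   # falls exactly in step with x

print(round(pearson_r(x, y_up), 4))    # 1.0
print(round(pearson_r(x, y_down), 4))  # -1.0
```

The two calls reproduce the limiting cases r = 1 (perfect positive) and r = -1 (perfect negative); rounding hides the tiny floating-point error in the square roots.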

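Example: Compute the correlation coefficient between X and Y using the following data:

| X | 2 | 4 | 5 | 6 | 8 | 11 |
|---|---|---|---|---|---|---|
| Y | 18 | 12 | 10 | 8 | 7 | 5 |

Solution:

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 2 | 18 | 36 | 4 | 324 |
| 4 | 12 | 48 | 16 | 144 |
| 5 | 10 | 50 | 25 | 100 |
| 6 | 8 | 48 | 36 | 64 |
| 8 | 7 | 56 | 64 | 49 |
| 11 | 5 | 55 | 121 | 25 |
| Total: 36 | 60 | 293 | 266 | 706 |

$$r = \frac{n \sum XY - (\sum X)(\sum Y)}{\sqrt{n \sum X^2 - (\sum X)^2} \, \sqrt{n \sum Y^2 - (\sum Y)^2}} = \frac{6(293) - (36)(60)}{\sqrt{6(266) - (36)^2} \, \sqrt{6(706) - (60)^2}}$$

r = -0.9203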
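The arithmetic of this example can be re-checked with a short Python sketch that builds the same column totals and substitutes them into the sum-based formula. Variable names are chosen here for illustration only.

```python
import math

X = [2, 4, 5, 6, 8, 11]
Y = [18, 12, 10, 8, 7, 5]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)                 # 36 and 60
sum_xy = sum(x * y for x, y in zip(X, Y))     # 293
sum_x2 = sum(x * x for x in X)                # 266
sum_y2 = sum(y * y for y in Y)                # 706

numerator = n * sum_xy - sum_x * sum_y        # 6(293) - (36)(60) = -402
denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(round(r, 4))   # -0.9203
```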

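Example: The following data show the temperature (X) and the pulse rate (Y) of 8 patients. Compute the coefficient of correlation between X and Y.

| Patient No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| X | 98 | 97 | 102 | 100 | 99 | 101 | 99 | 101 |
| Y | 100 | 91 | 63 | 80 | 92 | 70 | 90 | 72 |

Solution: We construct the following table, taking A = 95 and B = 80 as assumed means.

| X | Y | dx = X - 95 | dy = Y - 80 | dxdy | dx² | dy² |
|---|---|---|---|---|---|---|
| 98 | 100 | 98 - 95 = 3 | 100 - 80 = 20 | 60 | 9 | 400 |
| 97 | 91 | 97 - 95 = 2 | 91 - 80 = 11 | 22 | 4 | 121 |
| 102 | 63 | 102 - 95 = 7 | 63 - 80 = -17 | -119 | 49 | 289 |
| 100 | 80 | 100 - 95 = 5 | 80 - 80 = 0 | 0 | 25 | 0 |
| 99 | 92 | 99 - 95 = 4 | 92 - 80 = 12 | 48 | 16 | 144 |
| 101 | 70 | 101 - 95 = 6 | 70 - 80 = -10 | -60 | 36 | 100 |
| 99 | 90 | 99 - 95 = 4 | 90 - 80 = 10 | 40 | 16 | 100 |
| 101 | 72 | 101 - 95 = 6 | 72 - 80 = -8 | -48 | 36 | 64 |
| Total | | 37 | 18 | -57 | 191 | 1218 |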
$$r = \frac{8(-57) - (37)(18)}{\sqrt{8(191) - (37)^2} \, \sqrt{8(1218) - (18)^2}}$$

$$= \frac{-1122}{12.6095 \times 97.0567}$$

$$= -0.9168$$
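A short Python sketch can confirm this result and also show why assumed means are safe to use: shifting X and Y by constants does not change r. The helper name `r_from_deviations` is hypothetical and introduced only for this illustration.

```python
import math

X = [98, 97, 102, 100, 99, 101, 99, 101]   # temperature
Y = [100, 91, 63, 80, 92, 70, 90, 72]      # pulse rate

def r_from_deviations(xs, ys, a, b):
    """Pearson's r computed from deviations dx = x - a and dy = y - b."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    num = n * sum(p * q for p, q in zip(dx, dy)) - sum(dx) * sum(dy)
    den = math.sqrt(n * sum(p * p for p in dx) - sum(dx) ** 2) * \
          math.sqrt(n * sum(q * q for q in dy) - sum(dy) ** 2)
    return num / den

print(round(r_from_deviations(X, Y, 95, 80), 4))   # -0.9168
print(round(r_from_deviations(X, Y, 0, 0), 4))     # -0.9168, same value without shifting
```

Choosing A and B close to the data simply keeps the hand arithmetic small; the value of r is the same for any choice.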

Example: The following table shows the diastolic blood pressure and cholesterol levels of 10 randomly selected men. Find the coefficient of correlation between diastolic blood pressure and cholesterol level.

| Person | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Diastolic B.P. | 80 | 75 | 90 | 74 | 75 | 110 | 70 | 85 | 88 | 78 |
| Cholesterol | 307 | 259 | 341 | 317 | 274 | 416 | 267 | 320 | 274 | 336 |

Solution: Let X denote the diastolic blood pressure and Y denote the cholesterol level of a man. We take dx = X - 85 and dy = Y - 300 and construct the table with columns X, Y, dx = X - 85, dy = Y - 300, dxdy, dx² and dy².

Ans: r = 0.8088
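The working table for this example is not shown in the slides. As a hedged sketch, the Python snippet below builds the columns dx, dy, dxdy, dx² and dy² from the given data, prints them with their totals, and substitutes the totals into the assumed-mean formula. The layout of the printed table is an assumption.

```python
import math

bp = [80, 75, 90, 74, 75, 110, 70, 85, 88, 78]               # X, diastolic B.P.
chol = [307, 259, 341, 317, 274, 416, 267, 320, 274, 336]    # Y, cholesterol

dx = [x - 85 for x in bp]
dy = [y - 300 for y in chol]

print("X    Y    dx   dy   dxdy  dx2  dy2")
for x, y, p, q in zip(bp, chol, dx, dy):
    print(f"{x:<4} {y:<4} {p:<4} {q:<4} {p * q:<5} {p * p:<4} {q * q}")

n = len(bp)
sum_dx, sum_dy = sum(dx), sum(dy)
sum_dxdy = sum(p * q for p, q in zip(dx, dy))
sum_dx2 = sum(p * p for p in dx)
sum_dy2 = sum(q * q for q in dy)
print("Total", sum_dx, sum_dy, sum_dxdy, sum_dx2, sum_dy2)

r = (n * sum_dxdy - sum_dx * sum_dy) / (
    math.sqrt(n * sum_dx2 - sum_dx ** 2) * math.sqrt(n * sum_dy2 - sum_dy ** 2)
)
print(round(r, 4))   # 0.8088
```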

Example: During a laboratory experiment, muscular contractions of frog muscle were measured against different doses of a given drug. The height of the curve was considered as the response to the drug. Calculate the correlation coefficient for the following data.

| Serial No. | Dose of the drug | Response to the drug |
|---|---|---|
| 1 | 0.3 | 54 |
| 2 | 0.4 | 59 |
| 3 | 0.6 | 60 |
| 4 | 0.8 | 65 |
| 5 | 0.9 | 70 |

Solution: Let X = dose of the drug
Y = response to the drug

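We construct the following table taking dx = 10(X - 0.6) and dy = Y - 60.

| X | Y | dx | dy | dxdy | dx² | dy² |
|---|---|---|---|---|---|---|
| 0.3 | 54 | -3 | -6 | 18 | 9 | 36 |
| 0.4 | 59 | -2 | -1 | 2 | 4 | 1 |
| 0.6 | 60 | 0 | 0 | 0 | 0 | 0 |
| 0.8 | 65 | 2 | 5 | 10 | 4 | 25 |
| 0.9 | 70 | 3 | 10 | 30 | 9 | 100 |
| Total | | 0 | 8 | 60 | 26 | 162 |

$$r = \frac{300}{\sqrt{130} \, \sqrt{746}}$$

r = 0.9633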

Example: Tablets were weighed and assayed for the drug content. The results are given below. Find the correlation coefficient between the weight of the tablet and the assay.

| Weight | 200 | 205 | 203 | 201 | 195 | 203 | 198 | 200 | 190 | 205 | 207 | 210 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Assay | 10.0 | 10.1 | 10.0 | 10.1 | 9.9 | 10.1 | 9.9 | 10.0 | 9.6 | 10.2 | 10.2 | 10.3 |

Solution: Let X = weight
Y = assay

The working table is constructed taking dx = X - 200 and dy = 10(Y - 10), with columns X, Y, dx, dy, dxdy, dx² and dy².

r = 0.9588
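This example changes both the origin (X - 200, Y - 10) and the scale (the factor 10 on Y). As a sketch under those stated substitutions, the Python snippet below computes r from the transformed values and from the raw values to show that both give the same coefficient. The helper name `r_from` is hypothetical.

```python
import math

weight = [200, 205, 203, 201, 195, 203, 198, 200, 190, 205, 207, 210]
assay = [10.0, 10.1, 10.0, 10.1, 9.9, 10.1, 9.9, 10.0, 9.6, 10.2, 10.2, 10.3]

dx = [w - 200 for w in weight]
dy = [round(10 * (a - 10)) for a in assay]   # round() removes floating-point noise

def r_from(us, vs):
    """Pearson's r from the sums of two equally long lists."""
    n = len(us)
    num = n * sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs)
    den = math.sqrt(n * sum(u * u for u in us) - sum(us) ** 2) * \
          math.sqrt(n * sum(v * v for v in vs) - sum(vs) ** 2)
    return num / den

print(round(r_from(dx, dy), 4))          # 0.9588
print(round(r_from(weight, assay), 4))   # 0.9588
```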

Example: Diclofenac sodium sustained release tablets were analyzed in-vitro and in-vivo. The results are summarised in the following table. Find out whether or not the two methods of evaluation are correlated.

| Time in minutes | Amount of drug released (%), in-vitro | Amount of drug released (%), in-vivo |
|---|---|---|
| 0 | 0 | 0 |
| 30 | 35.45 | 20.33 |
| 60 | 36.47 | 33.65 |
| 90 | 44.91 | 41.82 |
| 120 | 55.20 | 50.01 |
| 150 | 62.46 | 59.78 |

Let X = amount of drug released in-vitro
Y = amount of drug released in-vivo
n = 6

Amount of drug released (%)

| Time in minutes | In-vitro (X) | In-vivo (Y) | XY |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 30 | 35.45 | 20.33 | |
| 60 | 36.47 | 33.65 | |
| 90 | 44.91 | 41.82 | |
| 120 | 55.20 | 50.01 | |
| 150 | 62.46 | 59.78 | |
| Total | 224.89 | 205.59 | 10130.62 |

r = 0.9973

Spearman’s rank correlation coefficient

• The product moment correlation coefficient can be evaluated when both the variables X and Y are quantitative.

• But if one or both of the variables are qualitative, we cannot use the formula of the product moment correlation coefficient.

• In such a situation, we can assign ranks according to the particular characteristic under consideration and use Spearman’s rank correlation coefficient. Spearman’s formula of rank correlation is given by

$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where d = difference of ranks R1 and R2 given by two judges
n = number of pairs

The value of r lies between -1 and 1.

If r is positive, the two judges have the same line of thinking.

If r is negative, the two judges have opposite lines of thinking.
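Spearman's formula translates directly into code. The sketch below assumes two rankings of the same n items with no tied ranks; the function name `spearman_rho` and the five-item rankings are made up for illustration.

```python
def spearman_rho(rank1, rank2):
    """Spearman's rank correlation for two untied rankings of the same n items."""
    n = len(rank1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank1, rank2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judge_a = [1, 2, 3, 4, 5]                        # hypothetical ranks from one judge
print(spearman_rho(judge_a, [1, 2, 3, 4, 5]))    # 1.0  (identical rankings)
print(spearman_rho(judge_a, [5, 4, 3, 2, 1]))    # -1.0 (exactly reversed rankings)
```

The two calls show the extreme values of r: complete agreement between the judges gives +1 and complete disagreement gives -1.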

Merits

1. It is simpler to understand and easy to
calculate as compared to Karl
Pearson’s method.

2. It is a useful method when the actual
data is not given but only ranks are
given.

3. It is useful for qualitative data such as beauty, honesty, efficiency etc.

Demerits

1. It cannot be used for grouped
frequency distribution.

2. It is not as accurate as Karl Pearson’s
Coefficient of Correlation.

3. It cannot be used when continuous
series is given.

4. When the no. of items is more than 30
and if the ranks are not known, this
method consumes more time and
therefore can’t conveniently be used.


Example: A leading company engaged in the production of an antibiotic drug has called 15 persons for interview to fill 10 vacancies for salesmen. The interview board consists of the sales manager and a psychologist. The ranks given by the two to all 15 candidates who attended the interview, according to their serial number in the interview list, are given below. Find the rank correlation coefficient.

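| Sr no | Rank given by Sales manager (X) | Rank given by Psychologist (Y) | d_i = X - Y | d_i² |
|---|---|---|---|---|
| 1 | 1 | 2 | -1 | 1 |
| 2 | 3 | 3 | 0 | 0 |
| 3 | 2 | 1 | 1 | 1 |
| 4 | 4 | 5 | -1 | 1 |
| 5 | 6 | 4 | 2 | 4 |
| 6 | 5 | 6 | -1 | 1 |
| 7 | 7 | 8 | -1 | 1 |
| 8 | 9 | 7 | 2 | 4 |
| 9 | 8 | 9 | -1 | 1 |
| 10 | 11 | 10 | 1 | 1 |
| 11 | 10 | 12 | -2 | 4 |
| 12 | 12 | 11 | 1 | 1 |
| 13 | 14 | 13 | 1 | 1 |
| 14 | 13 | 14 | -1 | 1 |
| 15 | 15 | 15 | 0 | 0 |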
$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

$$= 1 - \frac{6(22)}{15(225 - 1)}$$

$$= 1 - \frac{132}{3360}$$

$$= 1 - 0.0393$$

$$= 0.961$$

• The positive value of r indicates that the Sales Manager and the Psychologist have the same line of thinking.

• Also, the value of r is very near to 1, which indicates that the judgements given by both are almost the same.

Example: Sixteen pharmacy industries of Gujarat have been ranked according to their profit in 2007-2008 and their working capital for the year. Calculate the rank correlation coefficient.

| Pharma industry | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rank (profit) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| Rank (working capital) | 13 | 16 | 14 | 15 | 10 | 12 | 4 | 11 | 5 | 9 | 8 | 3 | 1 | 6 | 7 | 2 |

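| R1 | R2 | d = R1 - R2 | d² |
|---|---|---|---|
| 1 | 13 | -12 | 144 |
| 2 | 16 | -14 | 196 |
| 3 | 14 | -11 | 121 |
| 4 | 15 | -11 | 121 |
| 5 | 10 | -5 | 25 |
| 6 | 12 | -6 | 36 |
| 7 | 4 | 3 | 9 |
| 8 | 11 | -3 | 9 |
| 9 | 5 | 4 | 16 |
| 10 | 9 | 1 | 1 |
| 11 | 8 | 3 | 9 |
| 12 | 3 | 9 | 81 |
| 13 | 1 | 12 | 144 |
| 14 | 6 | 8 | 64 |
| 15 | 7 | 8 | 64 |
| 16 | 2 | 14 | 196 |
| Total | | | Σd² = 1236 |

$$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(1236)}{16(256 - 1)} = 1 - \frac{7416}{4080}$$

r = -0.8176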

Example: The competitors in a beauty contest are ranked by three judges in the following order:

| 1st judge | 1 | 5 | 4 | 8 | 9 | 6 | 10 | 7 | 3 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2nd judge | 4 | 8 | 7 | 6 | 5 | 9 | 10 | 3 | 2 | 1 |
| 3rd judge | 6 | 7 | 8 | 1 | 5 | 10 | 9 | 2 | 3 | 4 |

Use the rank correlation coefficient to discuss which pair of judges has the nearest approach to beauty.

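Solution: Let R1, R2 and R3 indicate the ranks given by the three judges.

| R1 | R2 | R3 | d1 = R1 - R2 | d2 = R1 - R3 | d3 = R2 - R3 | d1² | d2² | d3² |
|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 6 | -3 | -5 | -2 | 9 | 25 | 4 |
| 5 | 8 | 7 | -3 | -2 | 1 | 9 | 4 | 1 |
| 4 | 7 | 8 | -3 | -4 | -1 | 9 | 16 | 1 |
| 8 | 6 | 1 | 2 | 7 | 5 | 4 | 49 | 25 |
| 9 | 5 | 5 | 4 | 4 | 0 | 16 | 16 | 0 |
| 6 | 9 | 10 | -3 | -4 | -1 | 9 | 16 | 1 |
| 10 | 10 | 9 | 0 | 1 | 1 | 0 | 1 | 1 |
| 7 | 3 | 2 | 4 | 5 | 1 | 16 | 25 | 1 |
| 3 | 2 | 3 | 1 | 0 | -1 | 1 | 0 | 1 |
| 2 | 1 | 4 | 1 | -2 | -3 | 1 | 4 | 9 |
| Total | | | | | | 74 | 156 | 44 |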

(1) The rank correlation coefficient between the first and second judges is

$$r_{12} = 1 - \frac{6 \sum d_1^2}{n(n^2 - 1)} = 1 - \frac{6(74)}{10(100 - 1)} = 0.55$$

(2) The rank correlation coefficient between the first and third judges is

$$r_{13} = 1 - \frac{6 \sum d_2^2}{n(n^2 - 1)} = 1 - \frac{6(156)}{10(100 - 1)} = 0.05$$

(3) The rank correlation coefficient between the second and third judges is

$$r_{23} = 1 - \frac{6 \sum d_3^2}{n(n^2 - 1)} = 1 - \frac{6(44)}{10(100 - 1)} = 0.73$$

Since r23 has the maximum positive value, we conclude that the second and third judges have the nearest approach in judging beauty.
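The pairwise comparison of the three judges can be sketched in Python as follows. The dictionary keys, the helper `rank_correlation` and the use of `itertools.combinations` to enumerate the pairs are choices made for this illustration.

```python
from itertools import combinations

ranks = {
    "judge 1": [1, 5, 4, 8, 9, 6, 10, 7, 3, 2],
    "judge 2": [4, 8, 7, 6, 5, 9, 10, 3, 2, 1],
    "judge 3": [6, 7, 8, 1, 5, 10, 9, 2, 3, 4],
}

def rank_correlation(r1, r2):
    """Spearman's formula for two untied rankings of the same items."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# One coefficient per pair of judges.
results = {
    (a, b): rank_correlation(ranks[a], ranks[b])
    for a, b in combinations(ranks, 2)
}
for pair, r in results.items():
    print(pair, round(r, 2))          # 0.55, 0.05 and 0.73 in turn

closest = max(results, key=results.get)
print(closest)                        # ('judge 2', 'judge 3')
```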

Multiple Correlation

• The coefficient of multiple correlation (R) is a measure of how well a particular variable can be predicted using a linear function of a set of other variables.

• It is the correlation between the variable’s values and the best predictions that can be computed linearly from the predictive variables.

• The coefficient of multiple correlation ranges between 0.00 and 1.00.

• A higher value indicates higher predictability of the dependent variable from the independent variables.

• A value of 1 indicates that the predictions are exactly correct, and a value of 0 indicates that no linear combination of the independent variables is a better predictor than the fixed mean of the dependent variable.

• The coefficient of multiple correlation is also known as the square root of the coefficient of determination, under the particular assumptions that an intercept is included and the best possible linear predictors are used.

• The coefficient of determination is defined for more general cases, such as for non-linear prediction and those in which the predicted values have not been derived from a model-fitting procedure.

• R is a scalar value that is defined as “the Pearson correlation coefficient (PCC) between the predicted and the actual values of the dependent variable in a linear regression model that includes an intercept”. Since these regressions use two or more predictor variables, the technique is called multiple regression.

• The multiple regression equation is presented as:

$$y = b_1 x_1 + b_2 x_2 + \ldots + b_n x_n + C$$

• where the $b_i$ (i = 1, 2, …, n) are the regression coefficients, which represent the amount by which the criterion variable changes when the corresponding predictor variable changes.

• For example, the hardness of a tablet depends on various factors such as the amount of binder, the properties of the drug and excipients, and the amount of force applied during compression. Using the hardness test, one can estimate the appropriate relationship among these factors.

Properties of Multiple Correlation

• When more than two variables are related to each other, the value of the coefficient of multiple correlation depends on the choice of dependent variable: a regression of y on x and z will in general have a different R than a regression of z on x and y.

• For example, suppose that in a particular sample the variable z is uncorrelated with both x and y, while x and y are linearly related to each other. Then a regression of z on y and x yields R = 0, while a regression of y on x and z yields a strictly positive R. This follows since the correlation of y with its best predictor based on x and z is in all cases at least as large as the correlation of y with its best predictor based on x alone, and in this case, with z providing no explanatory power, it will be exactly as large.
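The dependence of R on the choice of dependent variable can be illustrated with a small Python sketch. It relies on the standard two-predictor identity R² = (r1² + r2² - 2·r1·r2·r12) / (1 - r12²), which is not stated in the slides, and on made-up data in which z is exactly uncorrelated with x and y. The function names are assumptions for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of centred products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def multiple_R(dep, pred1, pred2):
    """R for a linear regression (with intercept) of dep on two predictors."""
    r1 = pearson_r(dep, pred1)
    r2 = pearson_r(dep, pred2)
    r12 = pearson_r(pred1, pred2)
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(max(r_squared, 0.0))

# Hypothetical sample: y is linearly related to x, z is uncorrelated with both.
x = [-2, -1, 0, 1, 2]
y = [-4, -1, 0, 1, 4]
z = [2, -1, -2, -1, 2]

print(multiple_R(z, x, y))             # 0.0: x and y tell us nothing about z
print(round(multiple_R(y, x, z), 4))   # 0.9762: equal to the simple r between y and x
```

The identity breaks down when the two predictors are perfectly correlated (the denominator becomes zero), so the sketch is only meant for cases like the one above.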