Statistics is the technique of analysing numerical data. These data represent
observations obtained through either statistical or non-statistical techniques. The data may
describe a universe (an entire population) or a sample drawn from it by one of various sampling
procedures. Statistics also includes techniques such as tabulation and measures of central
tendency (averages) and dispersion, which help in describing and summarising the
characteristics of sample data in medical research.
The word ‘statistics’ is derived from the Latin word ‘status’. In the plural sense it
means a set of numerical figures, called ‘data’, obtained by counting. In the singular
sense it means the collection, classification, analysis, comparison and meaningful interpretation
of ‘raw data’. According to Croxton and Cowden, statistics is “the science which deals
with the collection, analysis and interpretation of numerical data”.
Raw data : Data in its original form, as collected, is known as raw data.
Array : An array can be considered as a multiply subscripted collection of data entries.
1.1.1 DEFINITION OF STATISTICS
Statistics means a measured or counted fact or piece of information stated as a figure, such
as the height of a person or the birth of a baby. Statistics, though apparently plural, when used in
the singular sense is a “science of figures”. It is a field of study concerned with the various
techniques or methods of collection, classification, summarising and interpretation of data, and
with drawing inferences, testing hypotheses and making recommendations.
1.1.2 MEANING OF STATISTICS
The word ‘statistics’ is defined as :
i) A plural noun, meaning numerical data, such as the annual output of a machine, yearly
changes in incomes etc.
ii) A singular noun, referring to the techniques and procedures for collecting, describing,
analysing and interpreting numerical data.
iii) Statistics may be defined as the aggregate of facts affected to a marked extent by a
multiplicity of causes, numerically expressed or estimated according to a reasonable
standard of accuracy, collected in a systematic manner for a predetermined purpose
and placed in relation to each other. (Horace Secrist)
Figure (1) : Components of Statistics
Statistics is divided into two categories, descriptive statistics and inferential
statistics. Furthermore, it is also classified as shown below :
Figure (2) : Classification of Statistics
THE STUDY OF STATISTICS
Bio-Statistics is a term used when the tools of statistics are applied to data that are
derived from biological sciences such as medicine. The tools and theories of statistics are
very important in the field of medical sciences.
“Mathematical Biology” is a fast growing, well designed and recognized subject and
one of the most exciting modern applications of applied mathematics. The biostatistics is a science of
Application and Uses of Bio-Statistics :
i) To define what is normal or healthy in a population.
ii) To find the relative potency of a new drug with respect to a standard drug.
iii) To compare the efficiency of a particular drug.
iv) To find an association between two attributes such as disease and smoking, filariasis
and social class.
v) To identify signs and symptoms of a disease or syndrome.
1.1.6 FREQUENCY DISTRIBUTION
Suppose the value of a variable occurs twice or more in a given series of observations;
then the number of occurrences of the value is known as the frequency of that value. The way
of tabulating a pool of data of a variable and their respective frequencies side by side is called
a frequency distribution of these data.
Frequency distribution is usually the first method used to organize the data for
investigation. A systematic presentation of the different values taken by the variable together
with the corresponding frequencies is called a frequency distribution, and when it is presented
in tabular form it is called a frequency table.
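As a minimal sketch of how such a frequency table can be built, the Python snippet below counts how often each value occurs. The observations in `items_per_packet` and the helper name `frequency_table` are hypothetical and used only for illustration; they are not taken from the text.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, frequency) pairs sorted by value."""
    return sorted(Counter(values).items())


# Hypothetical observations (illustrative only, not from the text)
items_per_packet = [2, 3, 3, 4, 2, 3, 5, 4, 3, 2]

table = frequency_table(items_per_packet)
for value, frequency in table:
    print(value, frequency)
```

Each printed row pairs a value of the variable with its frequency, which is the layout of a discrete frequency table.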
If class intervals are not given, then it is called a discrete frequency distribution. For example :
S.No. | Number of items | Number of packing
If the class intervals are given, then the frequency distribution is called a continuous
frequency distribution.
Further, class intervals are divided into two categories : exclusive and inclusive.
i) A class interval that does not include the upper class limit is called an exclusive class
interval, e.g.
S.No. | Marks | Number of students
ii) A class interval that includes the upper class limit is called an inclusive class
interval.
Note : In the above table, the class ‘0-20’ does not include the value 20 (i.e. the upper limit) and
the class “1-20” includes the value 20 (i.e. the upper limit).
1.2 TYPES OF FREQUENCY DISTRIBUTION
Frequency distributions are of two types :
i) Grouped frequency distribution
ii) Ungrouped frequency distribution
i) Grouped frequency distribution :
Grouped frequency distributions are used when continuous variables, such as
salary etc., are examined. Many measures taken during data collection, including body
temperature, weight, score and time, are measured using a continuous scale. In this way the
data may be divided into a number of groups or classes, usually called class intervals.
For example : Grouped frequency distribution :
STATISTICS AND BIOSTATISTICS
Family income (variable) | Number of persons (N) | Percentage
Note : The above example is a good example of an open end class interval.
ii) Ungrouped frequency distribution :
Often we have categorical data that are presented in the form of an “ungrouped
frequency distribution”, in which a table is developed to display all the numerical
values obtained for a particular variable. This approach is used for discrete data rather than
continuous data. Examples of data commonly organized in this manner include gender,
ethnicity, marital status, diagnostic category of study subjects and values obtained from the
measurement of variables. Following table is the example of ungrouped frequency of sub-
1.3 PERCENTAGE DISTRIBUTION (OR CUMULATIVE FREQUENCY)
A percentage distribution indicates the percentage of the sample whose scores fall into a
specific group, along with the number of scores in that group. Percentage distributions are
particularly useful for comparing the present data with findings from other studies that have
different sample sizes. A cumulative frequency distribution is a type of percentage
distribution in which the percentages and frequencies of scores are summed as one moves from the
top of the table to the bottom. Thus the bottom category has a cumulative
frequency equal to the sample size and a cumulative percentage of 100.
BIOSTATISTICS AND RESEARCH METHODOLOGY
Example of a cumulative frequency table :
Score | Frequency | Percentage | Cumulative frequency | Cumulative percentage
1.4 BIVARIATE AND MULTIVARIATE FREQUENCY DISTRIBUTION
Bivariate frequency distribution : If the number of variables in a frequency distribution is
only two, then it is called a bivariate frequency distribution.
Multivariate frequency distribution : A frequency distribution of more than two
variables is known as a multivariate frequency distribution.
Note : A bivariate frequency distribution has two marginal distributions, i.e. the last
row and the last column in the frequency table. It also has ‘m+n’ conditional distributions,
i.e. all rows and columns of the frequency distribution table except the marginal distributions.
Table 3 : Conditional distribution of salary for particular age
Salary Age (30-40)
Note : In the above example, the number of rows is 4 and the number of columns is 3.
Then we have m+n = 4+3 = 7 conditional distributions.
1.5 GRAPHICAL PRESENTATION OF FREQUENCY DISTRIBUTION
The important forms of frequency distribution graphs are as follows :
i) Line frequency graph
ii) Histogram
iii) Frequency polygon
iv) Frequency curve
v) Cumulative frequency curve or ogive
i) Line frequency graph :
A line frequency graph is used to depict discrete data. In this graph the size of the item is shown
on the x-axis and the frequencies on the y-axis. Vertical lines are drawn with heights according to
the frequencies of the different sizes.
Example 1. The following table shows the distribution of marks of students. Depict the
data on a line frequency graph.
Figure : Line frequency graph
Fig. Line graph
ii) Histogram :
a) Histogram for equal class interval :
A histogram is a representation of a frequency distribution by a set of rectangular bars
with areas proportional to the class frequencies. In the graph, the classes are shown on the x-axis
and the frequencies on the y-axis.
Example 3. Prepare a histogram from the following table :
Marks | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60
Number of students | 4 | 7 | 10 | 15 | 8 | 6
b) Histogram for unequal class interval :
In case of unequal class intervals, use the following rules to prepare a histogram :
i) Divide each wider class interval into equal class intervals of the smallest width.
ii) Calculate the adjusted frequency by dividing the frequency of that class interval by the
number of equal intervals into which it has been divided (by 2 if the class is twice the smallest width).
Follow the same procedure for the other unequal class intervals. The classes are
shown on the x-axis and the frequencies on the y-axis.
Example 4. Prepare a histogram from the following data :
Class interval | 0-10 | 10-20 | 20-40 | 40-70
Frequency | 5 | 8 | 20 | 24
Solution. Here the smallest class width is 10. In the above example we have two unequal
class intervals; one class has width 20 (20-40) and another has width 30 (40-70).
Since the class interval 20-40 is twice the size of the smallest class interval, divide
it into two equal class intervals. The frequency of that class is 20, which is also divided by 2, i.e. 20/2 = 10.
Similarly, the width of the class 40-70 is thrice the size of the smallest class interval.
So the height of the rectangle, or frequency, is divided by 3, i.e. 24/3 = 8. Finally, the adjusted frequencies
and class intervals are given below :
Class intervals | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70
Frequencies | 5 | 8 | 10 | 10 | 8 | 8 | 8
Now draw the graph of the above adjusted table.
Figure : Unequal Histogram (Class interval)
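The adjustment used in Example 4 can be sketched in Python as follows. This is only an illustrative sketch: the function name `adjusted_frequencies` is hypothetical, and it assumes every class width is a whole multiple of the smallest class width, as is the case in Example 4.

```python
def adjusted_frequencies(classes, freqs):
    """Split each class into pieces of the smallest class width and
    share its frequency equally among the pieces.

    Assumes each class width is a whole multiple of the smallest width.
    """
    width = min(upper - lower for lower, upper in classes)
    result = []
    for (lower, upper), f in zip(classes, freqs):
        parts = (upper - lower) // width  # number of equal sub-intervals
        for k in range(parts):
            piece = (lower + k * width, lower + (k + 1) * width)
            result.append((piece, f / parts))
    return result


# Data of Example 4
classes = [(0, 10), (10, 20), (20, 40), (40, 70)]
freqs = [5, 8, 20, 24]

adjusted = adjusted_frequencies(classes, freqs)
for (lower, upper), f in adjusted:
    print(f"{lower}-{upper}: {f}")
```

The bar heights obtained this way are 5, 8, 10, 10, 8, 8, 8, the same adjusted frequencies as in the table of Example 4.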
c) Histogram for inclusive series :
To prepare a histogram for an inclusive series, first it has to be converted into an
exclusive series.
Example 5. Draw a histogram for the following data :
Marks 10-19 20-29 30-39 40-49 50-59 60-69 70-79
Students 1 5 7 10 12 6 :
Solution. First convert the given inclusive data into exclusive data.
Marks | 9.5-19.5 | 19.5-29.5 | 29.5-39.5 | 39.5-49.5 | 49.5-59.5 | 59.5-69.5 | 69.5-79.5
d) Histogram for mid-value series :
To prepare a histogram for a mid-value series, first it has to be converted into a continuous series.
Example 6. Construct a histogram with the help of the following data :
Mid values (salary) | 50 150_}_ 250 _ | 350 | 450
No. of workers 8 Z 10 14 12
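Solution. Here, we convert the given mid-value series into a continuous series. The difference
between consecutive mid values is 100, so each class extends 50 on either side of its mid value :
0-100, 100-200, 200-300, 300-400 and 400-500.
[Figure : histogram of the continuous series; x-axis : class interval]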
(iii) Frequency Polygon :
A closed figure bounded by straight sides is called a polygon. A frequency polygon is
a graph in which the values of the variable are taken on the x-axis and the frequencies on the y-axis.
It is obtained by joining the mid-points of the tops of the rectangles in a histogram
by straight lines.
Example 7. Draw a frequency polygon from the following data :
Age in years | 10-20 20-30 30-40 40-50 50-60
No. of patients 5 20 25 35 10
(iv) Frequency curve :
A frequency curve is a smooth curve. To obtain it, draw a histogram and frequency polygon for
the given distribution and join the points of the frequency polygon by a smoothed free hand curve.
Example 8. Draw a frequency curve from the following table :
Interval 10-20 20-30 30-40 40-50 50-60
(v) Commulative Frequency Curve or Ogive :
A graph can be used to depict a cumulative frequency distribution. For drawing an
ogive, an ordinary frequency distribution table is converted into a cumulative frequency
table. The cumulative frequencies are then plotted against the upper limits of the
classes. The points corresponding to the cumulative frequency at each upper limit of the classes
are joined by a free hand curve. The graph obtained is called an ogive.
The ogive is further classified as “less than ogive” and “more than ogive”.
a) Less than Ogive : For drawing the less than ogive, the frequencies are added
cumulatively in increasing order.
b) More than Ogive : For drawing the more than ogive, the cumulative frequencies of
the different classes are estimated in diminishing order. The upper limit of a class interv
Example 9. Construct the less than ogive and more than ogive from the following table :
Solution. To draw the ogives, we first construct the cumulative frequency table :
Less than Ogive | | More than Ogive |
Marks | c.f. | Marks | c.f.
Less than 10 | 4 | More than 0 | 87
Less than 20 | 4+7=11 | More than 10 | 87-4=83
Less than 30 | 11+9=20 | More than 20 | 83-7=76
Less than 40 | 20+10=30 | More than 30 | 76-9=67
Less than 50 | 30+15=45 | More than 40 | 67-10=57
Less than 60 | 45+22=67 | More than 50 | 57-15=42
Less than 70 | 67+14=81 | More than 60 | 42-22=20
Less than 80 | 81+6=87 | More than 70 | 20-14=6
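The two cumulative columns of Example 9 can be reproduced with a short Python sketch. The class frequencies (4, 7, 9, 10, 15, 22, 14, 6) are read off from the additions shown in the table; the variable names are illustrative only.

```python
from itertools import accumulate

# Class frequencies for marks 0-10, 10-20, ..., 70-80 (from Example 9)
upper_limits = [10, 20, 30, 40, 50, 60, 70, 80]
lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [4, 7, 9, 10, 15, 22, 14, 6]

# "Less than" c.f.: running total, read against the upper class limits
less_than = list(accumulate(freqs))

# "More than" c.f.: total minus everything below the lower class limit
total = sum(freqs)
more_than = [total - below for below in [0] + less_than[:-1]]

for u, c in zip(upper_limits, less_than):
    print(f"Less than {u}: {c}")
for l, c in zip(lower_limits, more_than):
    print(f"More than {l}: {c}")
```

Plotting `less_than` against the upper limits and `more_than` against the lower limits gives the points of the two ogives.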
A measure of central tendency, or an average, refers to the value which is used to represent
an entire series. The property of concentration of the values around a central value is known
as central tendency. The central value around which there is a concentration is called the
measure of central tendency.
By calculating the measure of central tendency, we can find a single value to represent
the whole data. It also helps us to compare the values of two or more groups.
Some important definitions of ‘average’ are as follows :
i) Clark defines ‘average’ as “Average is an attempt to find one single figure to describe
whole of figure”.
ii) A.E. Waugh observed that “an average is a single value selected from a group of
values to represent them in some way- a value which is supposed to stand for whole
group, of which it is a part, as typical of all the values in the group”.
iii) Leabo defines ‘average’ thus : the average is sometimes described as a number which is
typical of the whole group.
iv) According to Ya-Lun-Chou “An average is a typical value which is employed to
represent all the individual values in a series or of a variable”.
v) Lawrence J. Kaplan has defined average as one of the most widely used set of sum-
vi) Bowley describes the average as a statistical constant which helps to understand the
significance of the whole in a single effort.
2.1 OBJECTIVE OF MEASURE OF CENTRAL TENDENCY
i) To obtain a single value that describes the characteristics of the whole group of data.
ii) To help in comparison.
iii) To help establish quantitative relationships between the averages of different groups.
iv) To help in decision making.
2.2 TYPES OF MEASURE OF CENTRAL TENDENCY
Important types of measure of central tendency are given below :
Measures of Central Tendency
Mathematical average : (i) Arithmetic mean, (ii) Geometric mean, (iii) Harmonic mean
Arithmetic mean : (A) Simple arithmetic mean, (B) Weighted arithmetic mean
Positional average : Median, Mode
Figure : Measure of Central Tendency
2.2.1 MATHEMATICAL AVERAGE
(i) Arithmetic Mean
Arithmetic mean is the most commonly used measure of central tendency. Its value
is obtained by dividing the sum of the values of the various items in a series by the
total number of items. Arithmetic mean is further divided into two types :
A) Simple arithmetic mean
B) Weighted arithmetic mean
A) Simple arithmetic mean : The simple arithmetic mean of a set of values is obtained
by dividing the sum of the values by the number of values in the set. It is denoted by $\bar{x}$ or A.M.
Calculation of mean in an individual series :
There are two methods : Direct and Short cut method
a) Direct Method :
Let $x_1, x_2, \ldots, x_n$ be the ‘n’ values of the variable x. Then the arithmetic mean,
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$
Example 2. The heights of 10 students in cm are : 160, 162, 175, 158, 156, 169, 173, 192,
165, 167. Find the mean height of the students.
Solution. Here n = 10, then
$$\bar{x} = \frac{\sum x}{n} = \frac{160 + 162 + 175 + 158 + 156 + 169 + 173 + 192 + 165 + 167}{10} = \frac{1677}{10} = 167.7 \text{ cm}$$
Example 3. Find the mean of x : 10, 8, 15, 12, 2, 9
Solution. Here n = 6 and $\sum x$ = 10+8+15+12+2+9 = 56, so $\bar{x} = 56/6 = 9.33$
MEASURE OF CENTRAL TENDENCY 19
b) Short Cut Method :
$$\bar{x} = A + \frac{\sum d}{n}$$
Where
$\bar{x}$ = Arithmetic mean
d = Deviation = x - A = dx
A = Assumed mean
n = Number of data
$\sum d$ = Sum of deviations
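Example 4. Find the arithmetic mean, if
x : 101, 106, 125, 150, 110
Solution. Given n = 5, A = 110 (assumed mean)
x | d = x - A
101 | 101 - 110 = -9
106 | 106 - 110 = -4
125 | 125 - 110 = 15
150 | 150 - 110 = 40
110 | 110 - 110 = 0
Total | $\sum d$ = 42
$$\bar{x} = A + \frac{\sum d}{n} = 110 + \frac{42}{5} = 118.4$$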
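As a quick check of Example 4, the sketch below computes the mean by the short-cut (assumed mean) method and compares it with the direct method. The function name `mean_short_cut` is illustrative only.

```python
def mean_short_cut(values, assumed_mean):
    """Arithmetic mean via deviations from an assumed mean A:
    mean = A + (sum of deviations) / n."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


# Data of Example 4, with assumed mean A = 110
values = [101, 106, 125, 150, 110]

short_cut = mean_short_cut(values, 110)
direct = sum(values) / len(values)
print(round(short_cut, 2), round(direct, 2))
```

Both methods give the same mean (118.4 for these data); the choice of assumed mean affects only the size of the intermediate numbers, not the result.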
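(ii) Step Deviation Method :
$$\bar{x} = A + \frac{\sum d'}{n} \times i$$
Where
$\bar{x}$ = Arithmetic mean
A = Assumed mean
n = Number of data
$\sum d'$ = Sum of step deviations, where $d' = (x - A)/i$
i = Common factor of the deviations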
Example 5 : Find mean using step deviation method, from the following data :
Marks (x) : 20 40 60 80 100
x Deviation (dx) |Step deviation (dx/i)=d”x
Calculation of Mean in a Discrete Series/Continuous Series :
There are three methods to calculate mean :
(i) Direct method (ii) Short-cut method and (iii) Step deviation method
(i) Direct method : To calculate the mean, follow this procedure :
Step 1 : Multiply the value of each item (x) by its frequency (f) and take the total, say $\sum fx$
Step 2 : Take the sum of all the frequencies, i.e. $\sum f$
Step 3 : Use the formula $\bar{x} = \dfrac{\sum fx}{\sum f}$ to find the mean
Example 6. From the following data, calculate the mean by direct method :
Class (x) | 20 | 30 | 40 | 50 | 60 | 70
Frequency (f) | 5 | 7 | 8 | 10 | 11 | 8
x | f | fx
20 | 5 | 100
30 | 7 | 210
40 | 8 | 320
50 | 10 | 500
60 | 11 | 660
70 | 8 | 560
Total | $\sum f$ = 49 | $\sum fx$ = 2350
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{2350}{49} = 47.96$$
Example 7. Calculate the mean from the following data
Age 18-20 | 21-23 | 24-26 | 27-29 | 30-32 |33-35
(ii) Short Cut Method :
Follow these steps to calculate the mean :
Step 1 : Choose an assumed mean (A)
Step 2 : Calculate the deviation, dx = x - A
Step 3 : Multiply each deviation by its frequency and obtain the sum total, i.e. $\sum f\,dx$
Step 4 : Use the following formula to calculate the mean
$$\bar{x} = A + \frac{\sum f\,dx}{\sum f}$$
Example 8. From the following distribution of data, calculate the mean by using the short-cut method :
Class (x) 10 20 30 40 50 60
Frequencies (f) | 3 2 5 10 11 8
Solution. Let assumed mean = 40
Example 9. From the following distribution of marks obtained by 50 students in
quantitative methods, calculate the arithmetic mean.
Marks More than 10 | 20 30 40 50 | a :
No’s of students 50 46 40 20 10 B)
Solution. Here the given data is in cumulative form. First, we convert it into an ordinary frequency distribution.
We have i = 10, n = 50
Marks | Students | Mid value | Deviation | Step deviation | f.dx'
(iii) Step Deviation Method :
This is an improved form of the short cut method :
Step 1 : Select an assumed mean (A)
Step 2 : Calculate the deviation dx = x - A
Step 3 : Calculate the step deviation dx' = dx/i
Step 4 : Multiply each step deviation by its frequency, i.e. f dx'
Step 5 : Take the sum of f dx' to get $\sum f\,dx'$
Step 6 : Finally, use the following formula
$$\bar{x} = A + \frac{\sum f\,dx'}{\sum f} \times i$$
Where i is the class interval
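The three methods for a frequency distribution (direct, short-cut and step deviation) can be compared on the data of Example 6 with the sketch below. The function names, the assumed mean A = 40 and the common factor i = 10 are choices made here for illustration; the text itself works Example 6 only by the direct method.

```python
def mean_direct(values, freqs):
    """Direct method: sum(f*x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


def mean_short_cut(values, freqs, assumed_mean):
    """Short-cut method: A + sum(f*dx) / sum(f), with dx = x - A."""
    total_fdx = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    return assumed_mean + total_fdx / sum(freqs)


def mean_step_deviation(values, freqs, assumed_mean, i):
    """Step deviation method: A + (sum(f*dx') / sum(f)) * i, with dx' = (x - A) / i."""
    total_fdx_step = sum(f * (x - assumed_mean) / i for x, f in zip(values, freqs))
    return assumed_mean + total_fdx_step / sum(freqs) * i


# Data of Example 6
values = [20, 30, 40, 50, 60, 70]
freqs = [5, 7, 8, 10, 11, 8]

m_direct = mean_direct(values, freqs)
m_short = mean_short_cut(values, freqs, assumed_mean=40)
m_step = mean_step_deviation(values, freqs, assumed_mean=40, i=10)
print(round(m_direct, 2), round(m_short, 2), round(m_step, 2))
```

All three calls agree with the value 2350/49 = 47.96 obtained in Example 6, which is the point of the short-cut and step deviation methods: they simplify the arithmetic without changing the answer.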
(b) Weighted Arithmetic Mean
Weighted arithmetic mean is defined as the arithmetic mean calculated by assigning
different weights to the different items in a series according to their relative importance.
The formula for the weighted arithmetic mean is given below :
$$\bar{X}_w = \frac{\sum WX}{\sum W}$$
Where $\bar{X}_w$ = Weighted arithmetic mean
W = Weights assigned to the different items
In case of a frequency distribution, if $f_1, f_2, \ldots, f_n$ are the frequencies of the variable values
$x_1, x_2, \ldots, x_n$ respectively, then the weighted arithmetic mean is given by
$$\bar{X}_w = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum fx}{\sum f}$$
Example 10. Calculate the weighted arithmetic mean from the following distribution :
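(ii) Geometric Mean
Geometric mean is defined as the nth root of the product of n values. It is denoted by
G.M. and defined as
$$\text{G.M.} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$
Where $x_1, x_2, \ldots, x_n$ are the various values of the series and n = number of items.
In case of a large number of items, logarithms are used :
$$\log \text{G.M.} = \frac{\sum \log x}{n}, \qquad \text{G.M.} = \text{antilog}\left(\frac{\sum \log x}{n}\right)$$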
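(iii) Harmonic Mean :
The harmonic mean of n values is the reciprocal of the mean of the reciprocals of the
values. It is denoted by H.M. and defined as
$$\text{H.M.} = \frac{n}{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}} = \frac{n}{\sum \dfrac{1}{x}}$$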
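A small sketch of the geometric and harmonic means is given below. The geometric mean is computed through logarithms, which is the usual way of handling a large number of items. The data values are hypothetical and the function names are illustrative; both functions assume strictly positive values.

```python
import math


def geometric_mean(values):
    """nth root of the product, computed as antilog(mean of logs).
    Assumes all values are positive."""
    return math.exp(sum(math.log(x) for x in values) / len(values))


def harmonic_mean(values):
    """Reciprocal of the mean of the reciprocals: n / sum(1/x).
    Assumes all values are positive."""
    return len(values) / sum(1 / x for x in values)


# Hypothetical data (illustrative only)
data = [2, 4, 8]

gm = geometric_mean(data)   # cube root of 2*4*8 = 64, i.e. about 4
hm = harmonic_mean(data)    # 3 / (1/2 + 1/4 + 1/8)
am = sum(data) / len(data)
print(round(am, 3), round(gm, 3), round(hm, 3))
```

For any set of positive values that are not all equal, the arithmetic mean is larger than the geometric mean, which in turn is larger than the harmonic mean, as the printed figures illustrate.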
2.2.2 POSITIONAL AVERAGE
(i) Median and (ii) Mode
The median of a set of values is the middle-most value when the data are arranged in
ascending or descending order of magnitude. The middle value divides the whole data into
two equal parts. The median is denoted by M. It is also called a positional average.
Method to compute the median :
(a) Individual observations
Step 1 : Arrange the data in ascending or descending order.
Step 2 : Allot a serial number to each item.
Step 3 : Use the following formula to calculate the median.
$$M = \text{size of } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ item}$$
Where M = Median and N = Number of items
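The steps for individual observations can be sketched as follows. The heights used are hypothetical, and the handling of an even number of items (averaging the two middle items, since (N+1)/2 is then not a whole number) is the usual convention, assumed here rather than stated in the text.

```python
def median_individual(values):
    """Median of individual observations: the ((N+1)/2)th item of the sorted data.
    For even N the two middle items are averaged (a common convention)."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # the ((N+1)/2)th item, 1-based
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical heights in cm (illustrative only)
heights = [160, 158, 165, 162, 170]
median_height = median_individual(heights)
print(median_height)
```

With five items the median is the 3rd item of the sorted list, here 162.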
Example 13. Calculate the median of the following data, which gives the heights of five students :
Example 18. Given median = 50.4, N=60. Find the missing term.
Marks 40-44 | 44-48 | 48-52 | 52-56 | 56-60
(ii) Mode :
Mode is the value which has the highest frequency, that is, the item which occurs the
largest number of times in a frequency distribution.
“The value of variable which occurs most frequently in a distribution is called mode”.
- Kenney and Keeping
Mode is denoted by $M_0$ and mathematically defined as
$$\text{Mode} = M_0 = L + \frac{d_1}{d_1 + d_2} \times i$$
Where
L = Lower limit of modal class
i = Class interval of modal class
$d_1$ = Difference between the frequency of the modal class and premodal class
$d_2$ = Difference between the frequency of the modal class and post modal class
Or
$$\text{Mode} = M_0 = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$$
Where
$f_1$ = Frequency of modal class
$f_0$ = Frequency of pre modal class
$f_2$ = Frequency of post modal class
L = Lower limit of modal class
i = Class interval of modal class
Example 19. Find the mode of the given set of data :
46, 47, 48, 47, 40, 50, 97, 52
Solution. The value 47 occurs the highest number of times.
Hence mode = 47
Example 20. Calculate mode (for discrete data)
Wages 145 | 170 | 180 | 190 | 200 | 210
Employees 3 16 § 20 6 2
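Solution. Since the value 190 has the highest frequency (20), the value of the mode is 190.
Example 21. Calculate mode for the following data :
Data | 8-9 | 9-10 | 10-11 | 11-12 | 12-13 | 13-14 | 14-15
Frequency | 8 | 14 | 21 | 25 | 15 | 10 | 7
Solution. In the above table, the modal class is (11-12) with the highest frequency 25.
Here L = 11 and i = 1; the frequency of the modal class is 25, of the pre modal class 21 and of the post modal class 15, so
$$\text{Mode} = 11 + \frac{25 - 21}{2(25) - 21 - 15} \times 1 = 11 + \frac{4}{14} = 11.29$$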
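The grouped-data mode formula can be checked on Example 21 with the following sketch. The function name `grouped_mode` is illustrative, and the treatment of a modal class at either end of the table (taking the missing neighbouring frequency as 0) is an assumption made here, not something the text specifies.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * i for grouped data.

    f1 is the frequency of the modal class, f0 and f2 those of the
    classes before and after it (taken as 0 at the ends of the table).
    """
    m = max(range(len(freqs)), key=lambda k: freqs[k])  # index of modal class
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0
    return lower_limits[m] + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Data of Example 21: classes 8-9, 9-10, ..., 14-15
lower_limits = [8, 9, 10, 11, 12, 13, 14]
freqs = [8, 14, 21, 25, 15, 10, 7]

mode = grouped_mode(lower_limits, freqs, width=1)
print(round(mode, 2))
```

Here the modal class is 11-12, so the mode lies between 11 and 12, closer to the lower limit because the pre modal frequency (21) is nearer to the modal frequency than the post modal one (15).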
3.0 MEASURES OF DISPERSION (OR VARIATION)
Dispersion of the data is the degree to which the numerical data tend to spread
about an average value. The variability in the data can be analysed with the help of measures
of dispersion.
In sciences like physics and chemistry there is not so much variability as is found in
medicine and biology. We can say that the occurrence of variability is a biological phenomenon.
3.1 TYPES OF VARIABILITY
There are three main types of variability :
(i) Biological variability : Individuals in similar environments differ when compared
as regards sex, class and other properties, but the difference noted may be small and is said
to occur by chance. Such a difference or variability is called biological variability.
(ii) Real variability : When the difference between two readings, observations, or
values of classes or samples is more than the defined limits in the universe, it is known as real variability.
(iii) Experimental variability : Error, difference or variation may be due to the materials,
methods or procedures employed in the study, or to defects in the techniques involved in the
experiment.
These errors are of three further types :
1) Observer error
2) Instrumental error
3) Sampling error
1) Observer error :
Observer error may be subjective or objective.
a) Subjective observer error : Unless properly trained, an interviewer may alter some information
while noting human behaviour, thereby adding a number of errors. He may
also ask embarrassing questions which the person may not like to answer, such as those on
menstrual history, pregnancy, use of family planning methods etc. Some subjects are very keen,
while others do not wish to give any information.
b) Objective observer error : It may be added by an untrained observer while
recording measurements such as blood pressure, pulse rate etc.
2) Instrumental error : This may be negligible or gross. Defects in weighing machines, height
measures, sphygmomanometers and other tools may cause undesirable variability or error in
observation, leading to wrong conclusions.
Note : Observer and instrumental errors are sometimes called non-sampling errors.
3) Sampling error : A sample should not be biased or too small to draw conclusions
from. It should be representative and of sufficiently large size for statistical tests. Hospital
based studies are mostly biased because the samples of patients under study are drawn from the
poor, influential or nearby sections of society.
Error occurs when the sample of the study is not a true representative of the population,
and it may lead to wrong conclusions.
Experimental variability due to observer, instrument and sampling defects is not
unusual but a common occurrence, about which one must be careful in any scientific study so that the
bias may be minimised.
The main objectives of measuring dispersion are given below :
i) To judge the reliability of an average.
ii) To serve as a basis for the control of variability.
iii) To compare two or more samples with regard to their variability.
iv) To facilitate the use of other statistical measures.
3.2 TYPES OF MEASURES OF DISPERSION
3.2.1 Range
3.2.2 Interquartile Range (or Quartile deviation)
3.2.3 Mean Deviation
3.2.4 Variance and Coefficient of variation
3.2.5 Standard Deviation
3.2.1 RANGE
Range is defined as the difference between the highest and lowest values in the sample.
The relative measure of range is known as the “coefficient of range”.
Mathematically, if H is the highest value and L is the lowest value then
Range (R) = H - L
$$\text{and coefficient of range} = \frac{H - L}{H + L}$$
3.2.1.1 Advantages of Range
i) The range is very simple to understand.
ii) It is easy to calculate.
iii) It is very helpful in statistical quality control.
3.2.1.2 Disadvantages of Range
i) It is not suitable for thorough analysis.
ii) It is affected by the extreme values in a sample.
Example 1. Calculate the range and the coefficient of range from the following data
regarding Hb% of 10 patients
Solution. Here H = 45 and L = 5, so Range = H - L = 45 - 5 = 40
$$\text{and coefficient of range} = \frac{H - L}{H + L} = \frac{45 - 5}{45 + 5} = \frac{40}{50} = 0.8$$
3.2.2 INTER-QUARTILE RANGE (OR QUARTILE DEVIATION)
The inter-quartile range of a group of observations is the interval between the values of
the upper quartile and the lower quartile for that group. The upper quartile of a group is the
value above which 25% of the observations fall. The lower quartile is the value below which
25% of the observations fall. This measure gives us the range which covers the middle 50%
of the observations in the group. If the lower quartile is $Q_1$ and the upper quartile is $Q_3$, then
i) Inter-quartile range = $Q_3 - Q_1$, i.e. the difference between the third and first quartiles.
ii) Quartile deviation or semi-inter-quartile range is defined as
$$\text{Q.D.} = \frac{1}{2}\,(Q_3 - Q_1)$$
iii) Coefficient of semi-inter-quartile range (coefficient of quartile deviation) is defined as
$$\frac{Q_3 - Q_1}{Q_3 + Q_1}$$
3.2.2.1 Advantages of Quartile Deviation
i) It is easy to understand and to calculate
ii) It is unaffected by the extreme values
iii) It is quite satisfactory when only the middle half of the group is dealt with.
3.2.2.2 Disadvantages of Quartile Deviation
i) It ignores 50% of the values (the extreme ones)
ii) It is not suitable for algebraic treatment.
Example 3. Calculate the inter-quartile range, quartile deviation and coefficient of
quartile deviation from the following table giving the heights of students.
Height in inches | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66
Frequencies (No. of students) | 21 | 25 | 28 | 18 | 20 | 22 | 24 | 23 | 18
iii) Coefficient of quartile deviation
$$= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{64 - 60}{64 + 60} = \frac{4}{124} = 0.032$$
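The quartiles used in Example 3 can be located from the cumulative frequencies, as in the sketch below. It assumes the common textbook rule that, in a discrete series, $Q_1$ is the size of the ((N+1)/4)th item and $Q_3$ the size of the (3(N+1)/4)th item; that rule and the helper name `item_at_position` are assumptions of this sketch, since the worked solution itself is not reproduced in the text.

```python
from itertools import accumulate


def item_at_position(values, freqs, position):
    """Value of the item at a given 1-based position, found by scanning
    the cumulative frequencies of a discrete series sorted by value."""
    for value, cum in zip(values, accumulate(freqs)):
        if position <= cum:
            return value
    raise ValueError("position exceeds the total frequency")


# Data of Example 3
heights = [58, 59, 60, 61, 62, 63, 64, 65, 66]
freqs = [21, 25, 28, 18, 20, 22, 24, 23, 18]

n = sum(freqs)
q1 = item_at_position(heights, freqs, (n + 1) / 4)       # lower quartile
q3 = item_at_position(heights, freqs, 3 * (n + 1) / 4)   # upper quartile

iqr = q3 - q1
quartile_deviation = iqr / 2
coefficient_qd = (q3 - q1) / (q3 + q1)
print(q1, q3, iqr, quartile_deviation, round(coefficient_qd, 3))
```

Under this rule the quartiles come out as 60 and 64 inches, which matches the values substituted in the coefficient of quartile deviation of Example 3.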
3.2.3 MEAN DEVIATION
Mean deviation is defined as the average or mean of the absolute deviations of the values from
a measure of central tendency (i.e. mean, median or mode).
Step 1 : Define the data as x.
Step 2 : Calculate the arithmetic mean as $\bar{x} = \dfrac{\sum x}{N}$
Step 3 : Find the deviation of each observation from the mean, $dx = x - \bar{x}$
Step 4 : Ignore the negative signs of the deviations and take their sum, denoted by $\sum |dx|$
Step 5 : Apply the formula
$$\text{Mean Deviation} = \text{M.D.} = \frac{\sum |dx|}{N}$$
Where N = Total number of values
In case of a frequency distribution
$$\text{Mean deviation} = \frac{\sum f\,|dx|}{N} = \frac{\sum f\,|x - \bar{x}|}{N}, \qquad N = \sum f$$
Where x is the mid point of the class interval and f is the frequency.
3.2.3.1 Advantages of mean deviation :
i) Mean deviation is easy to understand and calculate.
ii) It can be calculated from any measure of central tendency.
iii) It is less affected by the extreme items.
iv) It is based on measurement, not on estimation.
3.2.3.2 Disadvantages of mean deviation :
i) It ignores the signs of the deviations.
ii) It is not suitable for accurate and further analysis.
Example 4. Find the mean deviation from the following data :
Example 5. Find the mean deviation and coefficient of mean deviation from the mean
for the following table :
3.2.4 VARIANCE AND COEFFICIENT OF VARIATION
The square of the standard deviation is called the variance. It has a significant role in
inferential statistics. It is denoted by $\sigma^2$ or (S.D.)² and defined as
(i) For raw data
$$\text{Variance} = \sigma^2 = \frac{\sum (x - \bar{x})^2}{N}$$
(ii) For a frequency distribution
$$\text{Variance} = \sigma^2 = \frac{\sum f\,(x - \bar{x})^2}{\sum f}$$
3.2.4.1 COEFFICIENT OF VARIATION
Coefficient of variation (CV) is used to compare the variability of one character in two
different groups having different magnitudes of the values, or of two characters in the same
group, by expressing the variability as a percentage.
It is calculated from the standard deviation and mean of the characteristic. The ratio of the
standard deviation to the mean is expressed as a percentage :
$$\text{CV} = \frac{\text{S.D.}}{\text{Mean}} \times 100$$
3.2.4.2 USES OF COEFFICIENT OF VARIATION
(i) The series to be compared are expressed in the same units and have equal or nearly
(ii) The two series may be expressed in the same units, but their standard deviations
and means may be different.
(iii) The two series to be compared are expressed in different units.
Example 8. The mean and standard deviation of the numbers of students of two
schools, A and B, are given below :
School | Mean | S.D.
A | 450 | 52
B | 470 | 55
Compare the variability of the numbers of students in the two schools.
Solution. We know that the coefficient of variation CV = (S.D./Mean) × 100
CV of school A = (52/450) × 100 = 11.56
CV of school B = (55/470) × 100 = 11.70
Here, the variability of both schools is nearly equal.
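The comparison in Example 8 amounts to two one-line calculations, sketched below; the function name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


# Data of Example 8
cv_a = coefficient_of_variation(sd=52, mean=450)
cv_b = coefficient_of_variation(sd=55, mean=470)
print(round(cv_a, 2), round(cv_b, 2))
```

Because the coefficient is a pure percentage, it allows the two schools to be compared even though their means and standard deviations differ.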
3.2.5 STANDARD DEVIATION (S.D.)
Standard deviation is the square root of the arithmetic mean of the squared deviations
of items taken from the arithmetic mean. It is used most commonly in statistical analysis.
Method to calculate S.D.
Step 1 : Calculate the mean $\bar{x}$
Step 2 : Find the deviation of each observation from the mean, i.e. $dx = x - \bar{x}$
Step 3 : Take the square of these deviations, i.e. $dx^2$
Step 4 : Take the summation of these squared deviations, i.e. $\sum dx^2$
Step 5 : Apply the formula
$$\text{S.D.}(\sigma) = \sqrt{\frac{\sum dx^2}{N}} \quad \text{or} \quad \text{S.D.}(\sigma) = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$$
In case of a frequency distribution, we have
$$\text{S.D.}(\sigma) = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{\sum f}}$$
Shortcut Method : To calculate S.D.
$$\text{S.D.}(\sigma) = H \times \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \qquad d = \frac{x - A}{H}$$
where H = Class interval, A = Assumed mean and N = $\sum f$
3.2.5.1 USES OF STANDARD DEVIATION :
(i) It describes the variation (deviation) of a large distribution from the mean, that is, it is
used as a unit of variation.
(ii) It indicates whether the difference of an individual from the mean is by
chance, i.e. natural, or real, i.e. due to some special reason.
(iii) It helps to find the standard error, which determines whether the difference between the means of
samples is by chance or real.
(iv) It also helps in finding a suitable sample size for valid conclusions.
Example 6. Find out the standard deviation from the following table :
$$\text{S.D.} = \sqrt{5.84} = 2.416$$
Example 7. In a survey of 150 families in a village, the following distribution of ages
of children was found :
Ages of children | 0-2 | 2-4 | 4-6 | 6-8 | 8-10
No. of families | 40 | 32 | 25 | 23 | 30
Find the mean and standard deviation of the given distribution.
Solution. Let the assumed mean (A) = 5 and class interval H = 2
Class interval | Mid value (x) | Frequency (f) | d = (x - A)/H | d² | fd | fd²
0-2 | 1 | 40 | -2 | 4 | -80 | 160
2-4 | 3 | 32 | -1 | 1 | -32 | 32
4-6 | 5 | 25 | 0 | 0 | 0 | 0
6-8 | 7 | 23 | 1 | 1 | 23 | 23
8-10 | 9 | 30 | 2 | 4 | 60 | 120
Total | | N = 150 | | | $\sum fd$ = -29 | $\sum fd^2$ = 335
$$\text{Mean} = A + H \times \frac{\sum fd}{N} = 5 + 2 \times \frac{-29}{150} = 4.61 \qquad (N = \sum f)$$
$$\text{S.D.} = \sigma = H \times \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \quad \text{(Short cut method)}$$
$$= 2 \times \sqrt{\frac{335}{150} - \left(\frac{-29}{150}\right)^2} = 2 \times \sqrt{2.1960} = 2 \times 1.4819 = 2.964$$
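The short-cut calculation of Example 7 can be recomputed directly from the mid values and frequencies with the sketch below (frequencies 40, 32, 25, 23, 30 as listed in the solution table; assumed mean A = 5 and class interval H = 2 as in the solution). The function name is illustrative. Summing the columns from these frequencies gives $\sum fd = -29$ and $\sum fd^2 = 335$.

```python
import math


def mean_and_sd_short_cut(mid_values, freqs, assumed_mean, width):
    """Mean and standard deviation of a frequency distribution using
    step deviations d = (x - A) / H."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in mid_values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    mean = assumed_mean + width * sum_fd / n
    sd = width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd, sum_fd, sum_fd2


# Data of Example 7
mid_values = [1, 3, 5, 7, 9]
freqs = [40, 32, 25, 23, 30]

mean, sd, sum_fd, sum_fd2 = mean_and_sd_short_cut(mid_values, freqs, 5, 2)
print(sum_fd, sum_fd2, round(mean, 2), round(sd, 2))
```

As a cross-check, the mean obtained this way equals the direct value $\sum fx / \sum f = 692/150$, so the result does not depend on the choice of assumed mean.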
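RELATIONSHIP AMONG MEASURES OF DISPERSION
For a normal (or nearly normal) distribution, the following approximate relations hold :
(i) Quartile deviation is (2/3) of the standard deviation
$\text{Q.D.} = \frac{2}{3}\sigma$ or $3\,\text{Q.D.} = 2\sigma$
(ii) Quartile deviation is (5/6) of the mean deviation
$\text{Q.D.} = \frac{5}{6}\,\text{M.D.}$ or $6\,\text{Q.D.} = 5\,\text{M.D.}$
(iii) Mean deviation is (4/5) of the standard deviation
$\text{M.D.} = \frac{4}{5}\sigma$ or $5\,\text{M.D.} = 4\sigma$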
3.3 CORRELATION
Correlation describes the relationship between two continuous (or measurement)
variables. The main goal of a correlation study is to understand the nature and strength
of the linear association between two quantitative variables.
3.3.1 DEFINITION OF CORRELATION
1) If two variables are inter-related in such a manner that a change in one variable brings
about a change in the other variable, then this type of relation between the variables is known as correlation.
2) If a change in the value of one variable produces, on an average, a corresponding change in the value
of the other variable, then the two variables are said to be correlated. The value
of the correlation coefficient varies from -1 to +1.
3.3.2 TYPES OF CORRELATION
Correlation can be classified into three categories :
1) Positive, negative and zero correlation.
2) Linear and non-linear correlation
3) Simple, Partial and Multiple Correlation
3.3.2.1 POSITIVE, NEGATIVE AND ZERO CORRELATION
If the values of two variables move in the same direction, i.e. if the value of one variable
increases (or decreases) and the value of the other variable also increases (or decreases) on an
average, then the correlation is said to be positive, e.g. height and weight (as height increases,
weight also increases).
If the value of one variable increases (or decreases) and the value of the other variable
decreases (or increases) on an average, or in simple terms, if the values of both variables
move in opposite directions, then the correlation is said to be negative.
If a change in the value of one variable does not affect the value of the other variable, then
the correlation is zero.
Example of negative correlation
X 1 2 3 4 5 6
Y 70 60 50 40 30 20
Example of Positive Correlation
3.3.2.2 LINEAR AND NON-LINEAR CORRELATION
If the change in the value of one variable bears a constant ratio to the change in the value of the
other variable, then the relation is known as linear correlation.
Example 1. In a scatter diagram, all the points lie on a straight line.
The correlation is said to be non-linear if the change in the value of one variable does not bear a
constant ratio to the change in the value of the other variable.
Example 2. Draw the scatter diagram of the following table.
3.3.2.3 SIMPLE, PARTIAL AND MULTIPLE CORRELATION
If we study the relationship between two variables X and Y (say), then it is called simple
correlation, e.g. Height and Weight.
If we study the relationship between two variables, keeping all the other variables
constant, then it is called partial correlation.
If we study the relationship between more than two variables, then it is said to be
multiple correlation. In multiple correlation we measure the degree of relationship between
one variable on one side and the combined effect of all the other variables on the other side.
Note : The value of the correlation coefficient lies between -1 and +1 (i.e. $-1 \le r \le +1$). Then
(i) If r > 0, we say that there is a positive correlation between the variables.
(ii) If r < 0, we say that there is a negative correlation between the variables.
(iii) If r = 0, we say that there is no correlation between the variables.
3.4 A CHART: METHODS OF MEASURING CORRELATION
Graphical Methods
1. Scatter Diagram
2. Correlation Graph
Algebraic Methods
1. Karl Pearson's Coefficient of Correlation
2. Spearman's Rank Difference Method
3. Concurrent Deviation Method
4. Two-way Frequency Table
The correlation between variables can be studied in the following ways :
(i) Scatter diagram
(ii) Correlation graph
(iii) Karl Pearson's Coefficient of Correlation
(iv) Spearman's Rank Difference Method
(v) Concurrent Deviation Method
(vi) Two-way frequency table method
(i) Scatter Diagram :
To study the correlation between two variables by the graphical method, we first
draw a scatter diagram, taking the values of one variable on the x-axis and the values of the
other variable on the y-axis. The resulting graph of scattered points or dots on a graph sheet is
known as a scatter diagram.
There are various types of scatter diagram :
(i) Perfect Positive Correlation : If all the points lie on a straight line in the upward direction (left bottom to
right top), the correlation is perfect positive (r = +1).
(ii) Highly Positive : If all the points are very near to a straight line in the upward direction, the correlation is
highly positive.
Fig. 3 Highly positive
(iii) Positive Correlation : If all the points are near to a straight line in the upward direction (but not very near),
the correlation is positive.
(iv) Perfect Negative : If all the points in a scatter diagram lie on a straight line in the
downward direction (left top to right bottom), the correlation is perfect negative (r = -1).
(v) High Negative : If the points are very close to a straight line in the downward direction, the correlation is high negative.
(vi) Negative : If the points are close to a straight line (not very close) in the downward
direction, the correlation is negative.
(vii) Zero correlation : If the points are widely scattered in the graph, the correlation is
said to be zero.
Example 3. The following table shows the values of the variables X and Y.
Draw the scatter diagram and find the correlation between the variables.
Sol. First we draw the scatter diagram.
From the above diagram, we can say that the correlation between the variables X and Y is positive,
because all the plotted points are near to a straight line.
(ii) Correlation Graph :
In this method, the individual values of the two variables are plotted on the
graph sheet, which gives two different curves on the same graph. By examining the
plotted points, we conclude whether the variables are correlated or not.
Example 4. Draw the diagram and examine the correlation between the variables X and Y.
The data are given in the following table :
Year | 1990 | 1995 | 2000 | 2005 | 2010
X | 5 | 7 | 6 | 6 | 8
Y | 1 | 4 | 5 | 4 | 7
Sol. First we draw the graph of the two variables.
From the graph, we can say that the variables are closely related to each other.
Merits and Demerits of Graphical Method :
Merits :
a) It is a popular method of studying the relationship between two variables.
b) It is a very easy method that does not involve any mathematical calculation.
c) Everyone can easily understand and examine it.
Demerits :
a) We cannot obtain the degree of correlation.
b) The graphical method is suitable only for a small number of observations.
(iii) Karl Pearson's Coefficient of Correlation :
Karl Pearson's Coefficient of Correlation is used to measure the degree of linear relationship
between two variables. It is also called the product moment correlation coefficient. It is denoted
by 'r' and defined as
$$r = \frac{\sum XY}{N \sigma_x \sigma_y}$$
where $X = x - \bar{x}$ and $Y = y - \bar{y}$
N = number of pairs of values of the variables
σ = Standard deviation
Another form of the correlation coefficient is
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}$$
or
$$r = \frac{\text{Cov}(X, Y)}{\text{S.D.}(x) \times \text{S.D.}(y)}$$
where $\text{Cov}(X, Y) = \frac{\sum XY}{N}$
S.D.(x) = Standard deviation for x series
S.D.(y) = Standard deviation for y series.
Merits and Demerits of Karl Pearson's Coefficient of Correlation.
Merits :
i) It is an important method that gives a precise, quantitative and meaningful result.
ii) It also gives the direction (i.e. positive or negative) as well as the degree of the correlation
between the variables.
Demerits :
i) This method is time consuming.
ii) The value of the correlation coefficient is limited to the range $-1 \le r \le +1$.
Example 5. The following data give the heights of fathers and sons in inches. Find Karl
Pearson's Coefficient of Correlation.
Height of Father X | 65 | 66 | 67 | 67 | 69 | 71
Height of Son Y | 67 | 68 | 64 | 68 | 70 | 69
Sol. We know that
Karl Pearson's Coefficient of Correlation $r = \frac{\sum XY}{N \sigma_x \sigma_y}$
... correlation between X and Y.
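Karl Pearson's coefficient can be computed directly from the deviations X = x - x̄ and Y = y - ȳ. The sketch below is illustrative only: the function name `pearson_r` is an assumption, and the data are the six pairs as they appear in the Example 5 table. It uses the form r = ΣXY / √(ΣX² ΣY²), which is algebraically the same as ΣXY / (N σx σy).

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r = sum(XY) / sqrt(sum(X^2) * sum(Y^2)),
    where X and Y are deviations from the respective means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Heights as they appear in the Example 5 table.
father = [65, 66, 67, 67, 69, 71]
son = [67, 68, 64, 68, 70, 69]
print(round(pearson_r(father, son), 4))
```

The same function returns -1 for the negative correlation example given earlier (X = 1 to 6, Y = 70 down to 20), since those points lie exactly on a falling straight line.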
(iv) Spearman's Rank Coefficient of Correlation
It is a method of finding the correlation between two variables by using their ranks.
This method is especially useful in dealing with qualitative data. We use
it when the relative positions or ranks of the magnitudes are given, but the actual magnitudes of the variables
are not given. It is denoted by ρ (rho) and defined as
$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \sum d^2}{n^3 - n}$$
where
n = The number of pairs of observations
$\sum d^2$ = Sum of squares of differences of corresponding ranks
There are two cases to calculate the rank correlation :
A) There is no tie
B) There is a tie
Case (A) : Rank correlation with no tie.
In this case, no value in the x-series or y-series is repeated. So we can use
the following steps for finding the rank correlation with no tie.
Step 1 : Give rank one to the highest value, rank two to the next highest and so on.
Step 2 : Rank the x-series values and the y-series values separately.
Step 3 : Calculate the difference of ranks in each pair of values, i.e. $d = R_x - R_y$.
Step 4 : Calculate the sum of squares of all d's, i.e. $\sum d^2$.
Step 5 : Use the formula $\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$.
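Example 6. Find the rank correlation of the following data of marks in two subjects of
seven students :
Research methodology | 90 | 82 | 81 | 71 | 63 | 49 | 38
Quantitative techniques | 75 | 71 | 72 | 70 | 40 | 50 | 43
Sol. Let the subject "Research Methodology" be denoted by X and "Quantitative Techniques"
be denoted by Y.
X | Y | Rank of X ($R_x$) | Rank of Y ($R_y$) | $d = R_x - R_y$ | $d^2$
90 | 75 | 1 | 1 | 0 | 0
82 | 71 | 2 | 3 | -1 | 1
81 | 72 | 3 | 2 | 1 | 1
71 | 70 | 4 | 4 | 0 | 0
63 | 40 | 5 | 7 | -2 | 4
49 | 50 | 6 | 5 | 1 | 1
38 | 43 | 7 | 6 | 1 | 1
Total | | | | | $\sum d^2 = 8$
Now, use the rank correlation formula
$$\rho = 1 - \frac{6 \times 8}{7(7^2 - 1)} = 1 - \frac{48}{336} = 0.86$$
Since $-1 \le \rho = 0.86 \le 1$, the marks in the two subjects are positively correlated.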
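The steps of Case (A) can be sketched in code as follows. This is an illustrative sketch, not the book's procedure verbatim: the helper names `ranks_descending` and `spearman_rho` are assumptions, and the second mark in the Y series is hard to read in the scanned source and is read here as 71.

```python
def ranks_descending(values):
    """Rank 1 for the highest value; assumes no value is repeated (no tie)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for untied data."""
    n = len(xs)
    rank_x = ranks_descending(xs)                                   # Steps 1-2
    rank_y = ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))      # Steps 3-4
    return 1 - 6 * sum_d2 / (n * (n * n - 1))                       # Step 5

# Marks from Example 6 (second Y value read as 71, see the note above).
research = [90, 82, 81, 71, 63, 49, 38]
quantitative = [75, 71, 72, 70, 40, 50, 43]
print(round(spearman_rho(research, quantitative), 2))
```

The ranking helper is only valid when every value is distinct; repeated values need the average rank treatment described under Case (B).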
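Case (B) : Rank correlation with ties :
In this case, one or more values in the x-series or y-series are repeated. Each repeated value is
given the average of the ranks it would otherwise occupy, and we have to apply
the correction factor (C.F.) :
$$\rho = 1 - \frac{6\left[\sum d^2 + \text{C.F.}\right]}{n(n^2 - 1)}, \qquad \text{C.F.} = \sum \frac{R_i (R_i^2 - 1)}{12}$$
where $R_i$ = number of repetitions of the i-th repeated rank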
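A tie-corrected rank correlation can be sketched as below. The sketch assumes the commonly used correction, in which m(m² - 1)/12 is added to Σd² for every value repeated m times and tied values receive the average of the ranks they span. The function names and the scores are hypothetical and are not taken from the text.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = highest; tied values share the average of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, v in enumerate(ordered, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value that is repeated m times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def spearman_rho_ties(xs, ys):
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    cf = tie_correction(xs) + tie_correction(ys)   # correction from both series
    return 1 - 6 * (sum_d2 + cf) / (n * (n * n - 1))

# Hypothetical scores with one tie in each series (not from the text).
x = [10, 20, 20, 30, 40]
y = [1, 2, 3, 3, 5]
print(spearman_rho_ties(x, y))
```

When the ranks are already supplied, as in a judging exercise, they can be passed in place of raw scores only if rank 1 is meant to be the best; otherwise the ranking step should be skipped and the given ranks used directly.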
Example 7. Two teachers rank five medical students based on their intelligence and
the data are given below :
Students 1 2 3 P ai
Teacher A 5 4 25 i 3 8
Teacher B 5 4 > ; 2.5
Do you agree that the two teachers A and B have the same degree of agreement on judging
students based on their intelligence?
Sol. Here the ranks of intelligence are given, so re-ranking the values is not required. The
rank given by teacher A is considered as the x-series ($R_x$) and the rank given by teacher B is
considered as the y-series ($R_y$).
Since ρ = 0.65 < 1, we can say that the two teachers differ in their assessment of the students.
(v) Concurrent Deviation Method :
This method is based on the direction of change in the two paired variables. The coefficient
of correlation obtained from the direction of change between two series X and Y is
called the coefficient of concurrent deviation. It is denoted by $r_c$ and calculated by the
following formula :
$$r_c = \pm\sqrt{\pm\frac{2c - n}{n}}$$
where
c = Number of positive signs after multiplying the direction of change of the x-series
and the y-series
n = Number of pairs of deviations compared (one less than the number of pairs of observations)
The sign used inside and outside the square root is the sign of (2c - n).
Advantages :
i) It is very simple to understand and easy to calculate.
ii) It is also suitable for a large number of observations.
Example 8. Calculate the coefficient of concurrent deviation from the following data :
Sol. To calculate $r_c$, construct the table of change of direction.
X | Direction of change (DX) | Y | Direction of change (DY) | DX.DY
(vi) Two-way Table Method
This method is used to examine the relationship between two categorical variables.
The entries in the cells of a two-way table can be displayed as frequency counts or as
relative frequencies, or they can be displayed graphically as a segmented bar chart.
3.5.1 PROBABLE ERROR (P.E.)
The probable error is used to interpret the value of the coefficient of correlation.
It is calculated by the following formula
$$\text{P.E.} = 0.6745 \times \frac{1 - r^2}{\sqrt{n}}$$
where
r = Coefficient of Correlation
n = Number of pairs of observations
Example 9. Find the probable error of the coefficient of correlation obtained in
Example 6.
Sol. We have r = 0.86 and n = 7
$$\text{P.E.} = 0.6745 \times \frac{1 - (0.86)^2}{\sqrt{7}} = 0.6745 \times \frac{0.2604}{2.6458} = 0.0664$$
Since r = 0.86 is greater than 6 P.E. = 0.398, the value of r is highly significant.
3.5.2 STANDARD ERROR
It is calculated by the following formula
$$\text{S.E.} = \frac{1 - r^2}{\sqrt{n}}$$
3.6 MULTIPLE CORRELATION
In multiple correlation, we study the relationship between three or more variables.
Suppose the dependent variable is z and x and y are both independent variables. Then the
multiple correlation coefficient is defined as
$$R_{z.xy} = \sqrt{\frac{r_{zx}^2 + r_{zy}^2 - 2 r_{zx} r_{zy} r_{xy}}{1 - r_{xy}^2}}$$
The coefficient of multiple correlation lies between 0 and 1. If the value of the multiple correlation
is one, i.e. '1', then the correlation of the variables is perfect, while if the value of the multiple
correlation is zero, i.e. '0', then there is no correlation between the variables.
Remark : Sometimes the multiple correlation is defined as
$$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}}$$
Question. Consider that $r_{12} = 0.86$, $r_{13} = 0.71$ and $r_{23} = 0.66$ are the zero order correlation
coefficients. Then find the multiple correlation coefficient.
Sol. Putting the values $r_{12} = 0.86$, $r_{13} = 0.71$ and $r_{23} = 0.66$ in the above formula,
$$R_{1.23} = \sqrt{\frac{(0.86)^2 + (0.71)^2 - 2(0.86)(0.71)(0.66)}{1 - (0.66)^2}} = \sqrt{\frac{0.4377}{0.5644}} = 0.8806$$
Hence, the multiple correlation is 0.8806.
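The probable error, standard error and multiple correlation calculations can be checked numerically. The sketch below assumes the usual textbook forms S.E. = (1 - r²)/√n, P.E. = 0.6745 × S.E. and R₁.₂₃ = √[(r₁₂² + r₁₃² - 2 r₁₂ r₁₃ r₂₃)/(1 - r₂₃²)]. The function names are hypothetical and the subscripts follow the numbering used in the Question above.

```python
import math

def standard_error(r, n):
    """S.E. of the correlation coefficient: (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / math.sqrt(n)

def probable_error(r, n):
    """P.E. of the correlation coefficient: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)

def multiple_correlation(r12, r13, r23):
    """R_1.23 from the three zero order correlation coefficients."""
    numerator = r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23
    return math.sqrt(numerator / (1 - r23 ** 2))

pe = probable_error(0.86, 7)            # Example 9: r = 0.86, n = 7
print(round(pe, 4), 0.86 > 6 * pe)      # r is judged significant if r > 6 P.E.
print(round(multiple_correlation(0.86, 0.71, 0.66), 4))
```

Keeping the standard error as its own function makes the relation P.E. = 0.6745 × S.E. explicit, so either quantity can be reported from the same inputs.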
Q.1. What is the impact of Biostatistics on pharmacy practice?
(i) It creates awareness of the standard of medical practice in the present scenario.
(ii) It helps to find the barriers in order to improve it.
(iii) It indicates what kind of services pharmacists should focus on.
(iv) It tests whether there is any significant difference between methods, interventions or procedures.
(v) It helps in identifying determinants of a disease, drug related problem, condition etc.
(vi) It examines any observed association between drug and disease.
(vii) It helps to identify more cost effective medicines.
(viii) It supports making informed decisions.
Q.2. Find the mean value of the following data :
Size (x) | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16
Frequency (f) | 1 | 3 | 5 | 2 | 2 | 6 | 4 | 2
Now, coefficient of correlation (r) = 2
Q.5. Using Karl Pearson's method, calculate the coefficient of correlation between the data
and their frequency.
Q.6. Discuss the advantages and disadvantages of Karl Pearson's method of studying correlation.
Advantages :
(i) This method represents the presence or absence of correlation between the variables.
Disadvantages :
(i) Comparatively, it is a difficult method.
(ii) This method is affected by the values of extreme items.
(iii) It is based on many assumptions.
LONG ANSWER QUESTIONS
Q.1 Explain frequency distribution and their classification.
Solution : Hint (See section 1.1.6)
SHORT ANSWER QUESTIONS
Q.1. Define Statistics and state its importance.
Solution. Definition : Statistics is the science of collection, presentation, analysis and interpretation
of numerical data for logical analysis.
Importance :
(i) Numerical information is available everywhere.
(ii) The knowledge of statistical methods will help you to understand how decisions are
made and give you a better understanding of how they affect you.
Q.2. Define frequency distribution.
Solution : Suppose the value of a variable occurs twice or more in a given series of observations,
then the number of occurrences of the value is known as the frequency of that value.
The way of tabulating a pool of data of a variable and their respective frequencies side by
side is called a frequency distribution.
Q.3. What do you mean by histogram?
Hint : See section 1.5.
Q.4. Define Ogive.
Hint : See section 1.5.
Q.5. Explain the cumulative frequency curve with a suitable example.
Hint : See section 1.5.
Q.6. Draw the frequency polygon of the following data.
Age of workers 20-30 30-40 40-50 50-60 60-70
No. of workers 15 20 26 30 6
Q.7. Draw Ogive for the following distribution
Marks obtained 10-20 20-30 30-40 40-50 50-60
Q.8. Explain the following terms :
a) Weighted A.M.
b) Geometric Mean
c) Harmonic Mean
d) Mean, Median and Mode
Hint : See section 2.2.1
Q.9. Define a measure of central tendency and explain its objectives.
Hint : See section 2.0
Q.10. Find the mean, median and mode for the following data :
X | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50
f | 7 | 10 | 15 | 20 | 16 | 0 | 3 | 2
Q.11. Find the S.D. and variance for the following data :
X | 4 | 8 | 12 | 16 | 20
Y | 5 | 6 | 10 | 12 | 9
Q.12. Obtain median and mode, from the frequency distribution.
Q.13. The following distribution is given of 100 students in the 10th class examination. Obtain
the mean and S.D.
Marks : | 1-10 | 11-20 | 21-30 | 31-40 | 41-50
No. of students : | 16 | 29 | 30 | 17 | 8
Q.14. Define Range and its advantages.
Range : It is defined as the difference between the highest and lowest value in the
sample. Also the relative measure of range is known as the "coefficient of range".
Mathematically, if H is the highest value and L is the lowest value in any sample,
then Range = H - L
Advantages :
i) It is easy to calculate.
ii) It is easy to understand.
iii) It is helpful in statistical quality control and weather forecasting.
Q.15. Define Mean deviation.
Ans. Mean Deviation : It is defined as an average or mean of the deviations of the values
from central tendency.
Q.16. What do you mean by standard deviation?
Ans. Standard deviation (S.D.) is the square root of the arithmetic mean of the squared
deviations of items taken from the arithmetic mean. It is used most commonly in statistical analysis.
Q.17. Discuss the coefficient of variance.
Q.18. Explain the following terms :
a) Measure of dispersion
Hint : See section 3.0
b) Quartile deviation
Hint : See section 3.1.2
c) Method to obtain mean
Hint : See section 3.1.3.
Q.19. Find the range for the following data regarding Hb% of 10 patients-
Q.20. The following data gives the height of 5 students in a class in centimeter: 164, 182, 161,
153, 160
Find the standard deviation (Ans. σ = 9.6 cm)
Q.21. Calculate the s...
Weight 60-62 62-64 ...
students 8 2 ...
(Ans. σ = 3.22)
Q.22. In case of continuous frequency table of Hb%, calculate the standard deviation by the following
Hb% 8-9 9...
Q.24. In two series of adults aged 21 years and children 3 months old, the following values
were obtained for the height. Find which series shows the greater variation.
Ans. Ratio of greater variation = 1.3 : 1.0
Q.25. In a series of boys, the mean blood pressure was 120 and the standard deviation was 10.
In the same series, the mean height and standard deviation are 160 cm and 5 cm respectively.
Calculate the character which shows the greater variation.
Ans. Blood pressure shows greater variation, about 2.7 times as high.
Q.26. Define correlation and discuss its types.
Q.27. Explain Karl Pearson's method to find the coefficient of correlation.
Q.28. Find Karl Pearson's correlation coefficient for the following data :
f 10 15 25 35 26
Q.29. Calculate the correlation coefficient from the given table
X | 10 | 15 | 30 | 25 | 40 | 38 | 29 | 45
Y | 28 | 20 | 38 | 41 | 33 | 27 | 51 | 39
Q.30. If $r_{12} = 0.99$, $r_{13} = 0.60$ and $r_{23} = 0.55$ are the zero order correlation coefficients, then
find the multiple correlation coefficient.
EXERCISE 1.2
SHORT ANSWER QUESTIONS
Q.1. Define a measure of Central Tendency.
Sol. According to A.E. Waugh, "An average is a single value selected from a group of
values to represent them in some way- a value which is supposed to stand for the
whole group, of which it is a part, as typical of all the values in the group".
Q.2. What are the objectives of the measure of central tendency?
Sol. Objectives :
i) To help in decision making.
ii) To obtain a single value that describes the characteristics of the whole group of data.
iii) To help in comparison.
iv) To help to establish a quantitative relationship between different group averages.
Q.3. Define the weighted arithmetic mean.
Sol. Weighted arithmetic mean is defined as the calculation of the arithmetic mean
by putting the weights to different items. The formula of the weighted arithmetic mean is given below :
$$\bar{X}_w = \frac{\sum WX}{\sum W}$$
Where
$\bar{X}_w$ = Weighted arithmetic mean
W = Weights assigned to different items
Explain the following terms :
a) Geometric mean
Hint : See section 2.2.1
b) Harmonic mean
Hint : See section 2.2.2
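Q.25 in the exercise above compares two characters measured in different units, which is usually done with the coefficient of variation, CV = (S.D. / mean) × 100. The sketch below is illustrative only. The function name `coefficient_of_variation` is an assumption, and the figures are those stated in Q.25.

```python
def coefficient_of_variation(mean, sd):
    """Coefficient of variation as a percentage: (S.D. / mean) * 100."""
    return sd / mean * 100

cv_bp = coefficient_of_variation(120, 10)      # blood pressure: mean 120, S.D. 10
cv_height = coefficient_of_variation(160, 5)   # height: mean 160 cm, S.D. 5 cm
print(round(cv_bp, 2), round(cv_height, 2), round(cv_bp / cv_height, 2))
```

The character with the larger coefficient of variation shows the greater relative variation, and the ratio of the two coefficients says how many times greater it is.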
NUMERICAL QUESTIONS
Find the mean, median and mode for the following :
The following is the distribution of 100 students in the 10th class examination. Find the
median.
Marks | 1-10 | 11-20 | 21-30 | 31-40 | 41-50
No. of students | 16 | 29 | 30 | 17 | 8
EXERCISE 1.3
SHORT ANSWER QUESTIONS
r = coefficient of correlation
n = number of pairs of observations
Q.4. Define Standard Error.
Ans. Standard error is denoted by S.E. and defined by the following formula-
$$\text{S.E.} = \frac{1 - r^2}{\sqrt{n}}$$
LONG ANSWER QUESTIONS
Q.1. Discuss the coefficient of correlation and its application.
Hint : See article no. 3.4
Q.2. Explain the Spearman's rank method.
Hint : See article no. 3.4
Q.3. Explain the following terms :
a) Karl Pearson's Coefficient of Correlation
Hint : See article no. 3.4
b) Correlation Graph
Hint : See article no. 3.4
c) Scatter Diagram
Hint : See article no. 3.4
MULTIPLE CHOICE QUESTIONS
1. Probable error is-
a) 0.6475 S.E. b) 0.6745 S.E.
c) 0.6754 S.E. d) 0.6547 S.E.
2. Covariance varies from
a) -1 to +1 b) -1 to 0
c) 0 to +1 d) -∞ to +∞
3. Karl Pearson's coefficient is defined for-
a) Ungrouped data b) Grouped data
c) Both (a) & (b) d) None of these
4. Rank correlation was developed by-
a) Karl Pearson b) R.A. Fisher
c) Spearman d) None of these
5. If r = 0.9 and P.E. = 0.032, then the value of n will be
a) 14 b) 15
c) 16 d) 17
6. If r = 0, then Cov(x, y) is equal to
a) +1 b) -1
c) 0 d) None of these
7. A scatter diagram is-
a) A relation between x and y values b) A statistical test
c) Both (a) and (b) d) None
8. If there exists any relation between the sets of variables, it is called-
a) Skewness b) Correlation
c) Linear d) None of these
9. Which of the following is the highest range of r?
a) 0 and 1 b) -1 and 0
c) -1 and +1 d) None of these
10. Which of the following is a formula of Karl Pearson's Coefficient of Correlation?
a) $\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$ b) $r = \frac{\sum XY}{N \sigma_x \sigma_y}$
c) r = ... d) None of these
11. Coefficient of correlation has maximum value-
a) +1 b) -1
c) 0 d) None of these
12. If N = 10, $\sigma_x = 3$, $\sigma_y = 2$ and $\sum XY = 0.8$, then the value of r will be-
a) 0.133 b) 1.333
c) 0.013 d) None of these
13. The Spearman Correlation is used with-
a) Ordinal data b) Nominal data
c) Interval data d) None of these
14. If two variables are absolutely independent of each other, the correlation between them is-
a) -1 b) +1
c) 0 d) None of these
ANSWERS (MULTIPLE CHOICE QUESTIONS)
1. b) 2. d) 3. b) 4. c) 5. c) 6. c) 7. a)
8. b) 9. c) 10. b) 11. a) 12. c) 13. a) 14. c)