A systematic record of facts or different values of a quantity is called data.
Features of the data
Statistics deals with collection, presentation, analysis and interpretation of numerical data.
Arranging data in an order to study their salient features is called presentation of data.
Data arranged in ascending or descending order is called arrayed data or an array.
The range of the data is the difference between the maximum and the minimum values of the observations.
A table that shows the frequency of different values in the given data is called a frequency distribution table.
A frequency distribution table that shows the frequency of each individual value in the given data is called an ungrouped frequency distribution table.
A table that shows the frequency of groups of values in the given data is called a grouped frequency distribution table.
The groupings used to group the values in given data are called classes or class-intervals. The number of values that each class contains is called the class size or class width. The lower value in a class is called the lower class limit. The higher value in a class is called the upper class limit.
Class mark of a class is the mid value of the two limits of that class.
A frequency distribution in which the upper limit of one class differs from the lower limit of the succeeding class is called an inclusive or discontinuous frequency distribution.
A frequency distribution in which the upper limit of one class coincides with the lower limit of the succeeding class is called an exclusive or continuous frequency distribution.
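The definitions above can be tried out on a small data set. The following is only a sketch: the list `marks` and the class boundaries are hypothetical values invented for the illustration, and the classes are assumed to be exclusive (a value equal to an upper limit is counted in the next class).

```python
from collections import Counter

# Hypothetical raw marks, invented only for this illustration.
marks = [12, 25, 25, 31, 38, 12, 25, 44, 31, 19]

# Ungrouped frequency distribution: frequency of each individual value.
ungrouped = Counter(marks)

# Grouped frequency distribution with exclusive classes 10-20, 20-30, ...
# (assumed boundaries): lower limit included, upper limit excluded.
classes = [(10, 20), (20, 30), (30, 40), (40, 50)]
grouped = {
    (low, high): sum(1 for m in marks if low <= m < high)
    for low, high in classes
}

# Range = maximum value - minimum value.
data_range = max(marks) - min(marks)

# Class mark = mid value of the two limits of a class.
class_marks = [(low + high) / 2 for low, high in classes]

print(sorted(ungrouped.items()))
print(grouped)
print(data_range, class_marks)
```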
Bar Graph:
A bar graph is a pictorial representation of data in which rectangular bars of uniform width are drawn with equal spacing between them on one axis, usually the x axis. The value of the variable is shown on the other axis, that is, the y axis.
Histogram:
A histogram is a set of adjacent rectangles whose areas are proportional to the frequencies of a given continuous frequency distribution.
Mean
The mean value of a variable is defined as the sum of all the values of the variable divided by the number of values.
Median
The median of a set of data values is the middle value of the data set when it has been arranged in ascending order, that is, from the smallest value to the largest value.
When the number of values is odd, the median is calculated as
Median = value of the ((n + 1)/2)th observation
where n is the number of values in the data. If the number of values in the data set is even, then the median is the average of the two middle values, that is, of the (n/2)th and the (n/2 + 1)th observations.
Mode
Mode of a statistical data is the value of that variable which has the maximum frequency
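As a quick check of the three definitions, here is a minimal Python sketch. The list `values` is hypothetical data invented for the illustration. The median is computed by hand from the sorted data and compared with the standard library's `statistics.median`.

```python
import statistics

# Hypothetical observations, invented only for this illustration.
values = [7, 3, 9, 3, 5, 3, 8]

# Mean: sum of all values divided by the number of values.
mean = sum(values) / len(values)

# Median: middle value of the data arranged in ascending order.
arrayed = sorted(values)
n = len(arrayed)
if n % 2 == 1:
    # Odd n: the ((n + 1)/2)th observation (list indices start at 0).
    median = arrayed[(n + 1) // 2 - 1]
else:
    # Even n: average of the two middle observations.
    median = (arrayed[n // 2 - 1] + arrayed[n // 2]) / 2

# Mode: the value with the maximum frequency.
mode = statistics.mode(values)

print(round(mean, 2), median, mode)
```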
Mean for Ungrouped Frequency Table
Here is an ungrouped frequency table:

| Marks obtained (x_{i}) | 25 | 35 | 45 | 65 |
|---|---|---|---|---|
| No. of students (f_{i}) | 4 | 10 | 23 | 34 |

Mean is given by
Mean = ∑f_{i} x_{i} / ∑f_{i}
The Greek letter ∑ (capital sigma) means summation.
Example
A survey was conducted by a group of students as a part of their environment awareness programmes, in which they collected the following data regarding the number of plants in 30 houses in a locality. Find the mean number of plants per house.
| Number of plants | 1 | 3 | 5 | 7 | 9 | 11 | 13 |
|---|---|---|---|---|---|---|---|
| Number of houses | 11 | 2 | 1 | 5 | 6 | 2 | 3 |
Solution
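| Number of plants (x_{i}) | No. of houses (f_{i}) | f_{i} x_{i} |
|---|---|---|
| 1 | 11 | 11 |
| 3 | 2 | 6 |
| 5 | 1 | 5 |
| 7 | 5 | 35 |
| 9 | 6 | 54 |
| 11 | 2 | 22 |
| 13 | 3 | 39 |
| | ∑f_{i} = 30 | ∑f_{i} x_{i} = 172 |

Mean = ∑f_{i} x_{i} / ∑f_{i} = 172/30 = 5.73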
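The same calculation can be reproduced in a few lines of Python. This is only a sketch; the variable names are chosen for the illustration and the numbers are the ones from the table above.

```python
# Data from the example: number of plants (x_i) and number of houses (f_i).
plants = [1, 3, 5, 7, 9, 11, 13]
houses = [11, 2, 1, 5, 6, 2, 3]

sum_f = sum(houses)                                   # sum of f_i
sum_fx = sum(f * x for f, x in zip(houses, plants))   # sum of f_i * x_i
mean = sum_fx / sum_f

print(sum_f, sum_fx, round(mean, 2))  # 30 172 5.73
```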
Mean for Grouped Frequency Table

| Class interval | 10-25 | 25-45 | 45-65 | 65-85 |
|---|---|---|---|---|
| No. of students (f_{i}) | 4 | 10 | 23 | 34 |

In such a distribution, it is assumed that the frequency of each class interval is centered around its mid-point, i.e. its class mark.
The mean can be calculated using three methods.
a) Direct method
Mean = ∑f_{i} x_{i} / ∑f_{i}
where x_{i} is the class mark of the i-th class and f_{i} is its frequency.
This method can be very calculation intensive if the values of f and x are large: the calculations become big and the chance of making a mistake is quite high.
Steps involved in finding the mean using the Direct Method
1) Prepare a frequency table with the help of class marks.
2) Multiply f_{i} by x_{i} and find the sum of the products.
3) Use the above formula and find the mean.
Example
The following table shows the weights of 10 children:
| Weight (in kg) | 66-68 | 68-70 | 70-72 | 72-74 | 74-76 |
|---|---|---|---|---|---|
| Number of students | 3 | 3 | 2 | 1 | 1 |

Find the mean by using the direct method.
Solution:
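| Weight (in kg) | x_{i} | No. of students (f_{i}) | f_{i} x_{i} |
|---|---|---|---|
| 66-68 | 67 | 3 | 201 |
| 68-70 | 69 | 3 | 207 |
| 70-72 | 71 | 2 | 142 |
| 72-74 | 73 | 1 | 73 |
| 74-76 | 75 | 1 | 75 |
| | | ∑f_{i} = 10 | ∑f_{i} x_{i} = 698 |

So, the mean would be
Mean = ∑f_{i} x_{i} / ∑f_{i} = 698/10 = 69.8 kg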
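The direct method can be sketched in Python as follows. The function name `direct_mean` is a hypothetical helper introduced for the illustration; it takes each class as a (lower limit, upper limit) pair and uses the class mark as x_{i}.

```python
# Weights example: class intervals (kg) and number of students.
classes = [(66, 68), (68, 70), (70, 72), (72, 74), (74, 76)]
freqs = [3, 3, 2, 1, 1]


def direct_mean(classes, freqs):
    """Mean of grouped data by the direct method."""
    # Class mark x_i = mid value of the two class limits.
    marks = [(low + high) / 2 for low, high in classes]
    sum_fx = sum(f * x for f, x in zip(freqs, marks))
    return sum_fx / sum(freqs)


mean = direct_mean(classes, freqs)
print(round(mean, 1))  # 69.8
```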
b) Assumed mean method
Mean = a + ∑f_{i} d_{i} / ∑f_{i}
where
a = assumed mean (also written A)
d_{i} = x_{i} - a
This method is quite useful when the values of f and x are large, because it makes the calculation easier. In this method we take some assumed mean, calculate the deviations of the class marks from it, and then calculate the mean using the above formula.
Steps involved in finding the mean using the Assumed Mean Method
1) Prepare a frequency table.
2) Choose A and take the deviations d_{i} = x_{i} - A of the values of x_{i}.
3) Multiply f_{i} by d_{i} and find the sum of the products.
4) Use the above formula and find the mean.
Example
The following table shows the weights of 10 children:
| Weight (in kg) | 66-68 | 68-70 | 70-72 | 72-74 | 74-76 |
|---|---|---|---|---|---|
| Number of students | 3 | 3 | 2 | 1 | 1 |

Find the mean by using the Assumed Mean method.
Solution:
Let the assumed mean = A = 71
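| Weight (in kg) | x_{i} | No. of students (f_{i}) | d_{i} = x_{i} - 71 | f_{i} d_{i} |
|---|---|---|---|---|
| 66-68 | 67 | 3 | -4 | -12 |
| 68-70 | 69 | 3 | -2 | -6 |
| 70-72 | 71 | 2 | 0 | 0 |
| 72-74 | 73 | 1 | 2 | 2 |
| 74-76 | 75 | 1 | 4 | 4 |
| | | ∑f_{i} = 10 | | ∑f_{i} d_{i} = -12 |

So, the mean would be
Mean = A + ∑f_{i} d_{i} / ∑f_{i} = 71 + (-12/10) = 71 - 1.2 = 69.8 kg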
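A short Python sketch of the assumed mean method for the same data is given below. The variable names are chosen for the illustration, and the assumed mean 71 is the one used in the solution above.

```python
# Weights example: class intervals (kg) and number of students.
classes = [(66, 68), (68, 70), (70, 72), (72, 74), (74, 76)]
freqs = [3, 3, 2, 1, 1]
a = 71  # assumed mean, as chosen in the worked solution

# Class marks x_i and deviations d_i = x_i - a.
marks = [(low + high) / 2 for low, high in classes]
deviations = [x - a for x in marks]

sum_f = sum(freqs)
sum_fd = sum(f * d for f, d in zip(freqs, deviations))

# Mean = a + (sum of f_i * d_i) / (sum of f_i)
mean = a + sum_fd / sum_f

print(deviations, sum_fd, round(mean, 1))
```

Any class mark could be taken as the assumed mean. A value near the middle of the data keeps the deviations small.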
c) Step deviation Method
Mean = a + (∑f_{i} u_{i} / ∑f_{i}) × h
where
a = assumed mean (also written A)
h = class size (a common factor of the deviations)
u_{i} = (x_{i} - a)/h
This method is quite useful when the values of f and x are large. It makes the calculation easier still by dividing the deviations by a common factor h.
Steps involved in finding the mean using the Step Deviation Method
1) Prepare a frequency table.
2) Choose A and h and take u_{i} = (x_{i} - A)/h of the values of x_{i}.
3) Multiply f_{i} by u_{i} and find the sum of the products.
4) Use the above formula and find the mean.
Example
The following table shows the weights of 10 children:
| Weight (in kg) | 66-68 | 68-70 | 70-72 | 72-74 | 74-76 |
|---|---|---|---|---|---|
| Number of students | 3 | 3 | 2 | 1 | 1 |

Find the mean by using the Step Deviation method.
Solution:
Let the assumed mean = A = 71 and h=2
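| Weight (in kg) | x_{i} | No. of students (f_{i}) | d_{i} = x_{i} - 71 | u_{i} = d_{i}/h | f_{i} u_{i} |
|---|---|---|---|---|---|
| 66-68 | 67 | 3 | -4 | -2 | -6 |
| 68-70 | 69 | 3 | -2 | -1 | -3 |
| 70-72 | 71 | 2 | 0 | 0 | 0 |
| 72-74 | 73 | 1 | 2 | 1 | 1 |
| 74-76 | 75 | 1 | 4 | 2 | 2 |
| | | ∑f_{i} = 10 | | | ∑f_{i} u_{i} = -6 |

So, the mean would be
Mean = A + (∑f_{i} u_{i} / ∑f_{i}) × h = 71 + (-6/10) × 2 = 71 - 1.2 = 69.8 kg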
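The step deviation calculation can be sketched in Python in the same way. The names are chosen for the illustration. The last lines also compute the direct-method mean so that the two results can be compared.

```python
# Weights example: class intervals (kg) and number of students.
classes = [(66, 68), (68, 70), (70, 72), (72, 74), (74, 76)]
freqs = [3, 3, 2, 1, 1]
a = 71  # assumed mean, as in the worked solution
h = 2   # class size, as in the worked solution

marks = [(low + high) / 2 for low, high in classes]

# Step deviations u_i = (x_i - a) / h.
u = [(x - a) / h for x in marks]

sum_f = sum(freqs)
sum_fu = sum(f * ui for f, ui in zip(freqs, u))

# Mean = a + (sum of f_i * u_i / sum of f_i) * h
step_mean = a + (sum_fu / sum_f) * h

# Direct method on the same data, for comparison.
direct = sum(f * x for f, x in zip(freqs, marks)) / sum_f

print(u, sum_fu, round(step_mean, 1), round(direct, 1))
```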
Important points
1) The means obtained by all three methods are the same.
2) The assumed mean method and step-deviation method are just simplified forms of the direct method.
Mode for grouped frequency table
Modal class: The class interval having the highest frequency is called the modal class, and the mode is obtained using the modal class:
Mode = l + [(f_{1} - f_{0}) / (2f_{1} - f_{0} - f_{2})] × h
where
l = lower limit of the modal class,
h = size of the class interval (assuming all class sizes to be equal),
f_{1} = frequency of the modal class,
f_{0} = frequency of the class preceding the modal class,
f_{2} = frequency of the class succeeding the modal class.
Example
The following table shows the ages of the patients admitted in a hospital during a year.

| Age (in years) | 5-15 | 15-25 | 25-35 | 35-45 | 45-55 | 55-65 |
|---|---|---|---|---|---|---|
| Number of patients | 6 | 11 | 21 | 23 | 14 | 5 |

Find the mode.
Solution
Modal class = 35-45, l = 35, class width (h) = 10, f_{1} = 23, f_{0} = 21 and f_{2} = 14.
Substituting the values in the formula given above, we get
Mode = 35 + [(23 - 21) / (2 × 23 - 21 - 14)] × 10 = 35 + (2/11) × 10 ≈ 36.8 years
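The substitution above can be checked with a small Python sketch. `grouped_mode` is a hypothetical helper written for the illustration. It assumes equal class sizes and, as an extra assumption not discussed in the text, treats a missing neighbouring class as having frequency 0.

```python
# Patients example: age classes (years) and number of patients.
classes = [(5, 15), (15, 25), (25, 35), (35, 45), (45, 55), (55, 65)]
freqs = [6, 11, 21, 23, 14, 5]


def grouped_mode(classes, freqs):
    """Mode of grouped data using the modal class formula."""
    i = freqs.index(max(freqs))   # position of the modal class
    l, upper = classes[i]
    h = upper - l                 # class size
    f1 = freqs[i]
    # Assumption: if the modal class is first or last, the missing
    # neighbour is treated as having frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    return l + (f1 - f0) / (2 * f1 - f0 - f2) * h


mode = grouped_mode(classes, freqs)
print(round(mode, 1))  # 36.8
```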
Cumulative Frequency chart
The cumulative frequency of a class is the frequency obtained by adding the frequency of that class to the frequencies of all the classes preceding it.
| Class interval (Age) | No. of insurance policies |
|---|---|
| 15-20 | 2 |
| 20-25 | 4 |
| 25-30 | 16 |
| 30-35 | 20 |
| 35-40 | 20 |
| 40-45 | 12 |
The cumulative frequency chart will look like this:

| Age in years | Cumulative no. of insurance policies |
|---|---|
| Less than 20 years | 2 |
| Less than 25 years | 6 |
| Less than 30 years | 22 |
| Less than 35 years | 42 |
| Less than 40 years | 62 |
| Less than 45 years | 74 |
The above table is a cumulative frequency distribution of the less than type. We can similarly build one of the more than type, as below:

| Age in years | Cumulative no. of insurance policies |
|---|---|
| More than or equal to 15 years | 74 |
| More than or equal to 20 years | 74 - 2 = 72 |
| More than or equal to 25 years | 72 - 4 = 68 |
| More than or equal to 30 years | 68 - 16 = 52 |
| More than or equal to 35 years | 52 - 20 = 32 |
| More than or equal to 40 years | 32 - 20 = 12 |
The table above is called a cumulative frequency distribution of the more than type.
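Both cumulative tables can be generated from the class frequencies. The sketch below uses `itertools.accumulate` from the Python standard library; the variable names are chosen for the illustration.

```python
from itertools import accumulate

# Insurance example: class limits (age in years) and number of policies.
lower_limits = [15, 20, 25, 30, 35, 40]
upper_limits = [20, 25, 30, 35, 40, 45]
policies = [2, 4, 16, 20, 20, 12]

# Less than type: running total up to each upper limit.
less_than = list(accumulate(policies))

# More than type: total minus everything below each lower limit.
total = sum(policies)
below_lower = [0] + less_than[:-1]
more_than = [total - c for c in below_lower]

for limit, c in zip(upper_limits, less_than):
    print(f"Less than {limit} years: {c}")
for limit, c in zip(lower_limits, more_than):
    print(f"More than or equal to {limit} years: {c}")
```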
Median of a grouped data frequency table
Steps involved in finding the median of a grouped data frequency table
1) For the given data, we need the class intervals, the frequency distribution and the cumulative frequency distribution.
2) Then we need to find the median class.
How to find the median class
a) We find the cumulative frequencies of all the classes and n/2.
b) We now locate the class whose cumulative frequency is greater than (and nearest to) n/2.
c) That class is called the median class.
3) Median is calculated as
Median = l + [(n/2 - cf) / f] × h
where
l = lower limit of median class,
n = number of observations,
cf = cumulative frequency of class preceding the median class,
f = frequency of median class,
h = class size (assuming class size to be equal).
Example
A survey regarding the heights (in cm) of 60 girls of a school was conducted and the following data was obtained:
| Height (in cm) | Number of girls |
|---|---|
| Less than 140 | 4 |
| Less than 145 | 11 |
| Less than 150 | 29 |
| Less than 155 | 40 |
| Less than 160 | 46 |
| Less than 165 | 60 |

Find the median height.
Solution
To calculate the median height, we need to find the class intervals and their corresponding frequencies.
The given distribution being of the less than type, 140, 145, 150, ..., 165 give the upper limits of the corresponding class intervals.
So, the classes should be below 140, 140-145, 145-150, ..., 160-165. From the given distribution, there are 4 girls with height less than 140, i.e. the frequency of the class interval below 140 is 4. Now, there are 11 girls with heights less than 145 and 4 girls with height less than 140. Therefore, the number of girls with height in the interval 140-145 is 11 - 4 = 7. The other frequencies can be calculated similarly.
| Class interval | Frequency | Cumulative frequency |
|---|---|---|
| Below 140 | 4 | 4 |
| 140-145 | 7 | 11 |
| 145-150 | 18 | 29 |
| 150-155 | 11 | 40 |
| 155-160 | 6 | 46 |
| 160-165 | 14 | 60 |
So, n = 60 and n/2 = 30. The cumulative frequency which is greater than and nearest to 30 is 40, so the median class is 150-155.
l (the lower limit) = 150,
cf (the cumulative frequency of the class preceding 150-155) = 29,
f (the frequency of the median class 150-155) = 11,
h (the class size) = 5.
Now
Median = l + [(n/2 - cf) / f] × h
= 150 + [(30 - 29)/11] × 5
= 150.45 cm
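The steps for the median can be sketched in Python as below. `grouped_median` is a hypothetical helper written for the illustration. The open class "Below 140" is entered as 135-140, which is an assumption made only so that every class has two limits; it does not affect the result, because the median class is 150-155.

```python
from itertools import accumulate

# Heights example. "Below 140" is written as (135, 140): an assumed
# lower limit, chosen to keep the class size equal to 5.
classes = [(135, 140), (140, 145), (145, 150),
           (150, 155), (155, 160), (160, 165)]
freqs = [4, 7, 18, 11, 6, 14]


def grouped_median(classes, freqs):
    """Median of grouped data using the median class formula."""
    n = sum(freqs)
    cumulative = list(accumulate(freqs))
    # Median class: first class whose cumulative frequency exceeds n/2.
    i = next(k for k, c in enumerate(cumulative) if c > n / 2)
    l, upper = classes[i]
    h = upper - l
    cf = cumulative[i - 1] if i > 0 else 0   # cf of the preceding class
    f = freqs[i]
    return l + (n / 2 - cf) / f * h


median = grouped_median(classes, freqs)
print(round(median, 2))  # 150.45
```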
Empirical Formula between Mode, Mean and Median
3 Median = Mode + 2 Mean
Graphical representation of cumulative frequency distribution
We can also represent a cumulative frequency distribution on a graph. To represent the data in the table graphically, we mark the upper limits of the class intervals on the horizontal axis (x-axis) and their corresponding cumulative frequencies on the vertical axis (y-axis), choosing a convenient scale.
When we draw the graph for the cumulative frequency distribution of the less than type, the curve we get is called a cumulative frequency curve, or an ogive (of the less than type).
When we draw the graph for the cumulative frequency distribution of the more than type, the curve we get is called a cumulative frequency curve, or an ogive (of the more than type).
When we plot both these curves on the same axes, the two ogives will intersect each other at a point. From this point, if we draw a perpendicular on the x-axis, the point at which it cuts the x-axis gives us the median.
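The intersection of the two ogives can also be located numerically. The sketch below uses the heights data from the median example and makes two assumptions for the illustration: neighbouring points of each ogive are joined by straight line segments, and both ogives are evaluated at the same class limits 140, 145, ..., 165. `ogive_intersection_x` is a hypothetical helper name.

```python
# Heights example: class limits and "less than" cumulative frequencies.
limits = [140, 145, 150, 155, 160, 165]
less_than = [4, 11, 29, 40, 46, 60]

n = less_than[-1]
# "More than or equal to" counts at the same limits.
more_than = [n - c for c in less_than]


def ogive_intersection_x(xs, rising, falling):
    """x-coordinate where two piecewise-linear ogives cross."""
    for k in range(len(xs) - 1):
        d0 = rising[k] - falling[k]          # gap at the left end
        d1 = rising[k + 1] - falling[k + 1]  # gap at the right end
        if d0 <= 0 <= d1 and d0 != d1:
            # Linear interpolation to the point where the gap is zero.
            t = -d0 / (d1 - d0)
            return xs[k] + t * (xs[k + 1] - xs[k])
    return None


median_from_ogives = ogive_intersection_x(limits, less_than, more_than)
print(round(median_from_ogives, 2))  # 150.45
```

Under these assumptions the crossing point agrees with the value obtained from the median formula for the same data.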
Crossword Puzzle
Across
3. The ........ frequency of a class is the frequency obtained by adding the frequencies of all the classes preceding the given class.
5. It is the difference between the maximum and minimum values in data set.
7. The highest frequency class interval is .....class
8. Middle value of the dataset
Down
1. Terms for number of times the event occurred in an experiment or study
2. It is a set of adjacent rectangles whose areas are proportional to the frequencies of a given continuous frequency distribution
4. Maximum frequency data in data set
6. Average value of a data set