1331.0 - Statistics - A Powerful Edge!, 1996  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 31/07/1998   

CUMULATIVE FREQUENCY AND PERCENTAGE

Numerical variables can be represented in a variety of ways, including stem and leaf plots and frequency distribution, cumulative frequency or cumulative percentage tables. As you will see, the graphs of cumulative frequency and cumulative percentage are very useful in finding the centres of large data sets.


CUMULATIVE FREQUENCY

Cumulative frequency is used to determine the number of observations that lie above (or below) a particular value.

The cumulative frequency is found from a stem and leaf table or a frequency distribution table by adding each frequency to the running total of the frequencies before it.

The last value will always equal the total for all observations, as all frequencies will have been added.

For continuous or discrete variables:

  • cumulative frequency is calculated from a frequency distribution table. A stem and leaf plot can be used to construct a frequency distribution table.
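The running total described above can be sketched in a few lines of Python. This is only an illustration: the variable names are assumptions, and the frequencies are those of the discrete example that follows.

```python
from itertools import accumulate

# Frequencies for stems 0 to 6 in the discrete example that follows.
frequencies = [1, 2, 3, 5, 6, 9, 4]

# Each cumulative frequency is the previous running total plus the next frequency.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [1, 3, 6, 11, 17, 26, 30]

# The last value equals the total number of observations.
print(cumulative[-1] == sum(frequencies))  # True
```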

DISCRETE VARIABLES

EXAMPLE

1. The number of people who climbed Ayers Rock each day over a thirty-day period was counted and recorded as follows:
31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42.
a) Set up a stem and leaf plot, and hence calculate the cumulative frequency by adding appropriate columns.
b) Plot a graph of cumulative frequency against number of people.

Answers

a) The data range from 4 to 65, so they are grouped in class intervals of 10 to produce the following table:

Stem   Leaf                Frequency (f)   Upper value   Cumulative frequency
0      4                   1               4             1
1      8 9                 2               19            1 + 2 = 3
2      3 4 6               3               26            3 + 3 = 6
3      1 5 5 7 9           5               39            6 + 5 = 11
4      0 1 2 3 5 9         6               49            11 + 6 = 17
5      0 1 1 2 4 4 5 6 7   9               57            17 + 9 = 26
6      0 2 3 5             4               65            26 + 4 = 30

b) Because the variable is discrete, the actual upper value recorded in each class interval is used in plotting the graph. Even though the variable is discrete, the plotted points are joined to form a continuous cumulative frequency polygon or curve, known as an ogive.

The cumulative frequency is always labelled on the vertical axis and any other variable, in this case the number of people, is labelled on the horizontal axis as shown below:
Graph: cumulative frequency (vertical axis) against number of people (horizontal axis).
Some information that can be gained from either graph or table:
  • On 11 of the 30 days, not more than 39 people climbed Ayers Rock on a given day.
  • On 13 of the 30 days, 50 or more people climbed Ayers Rock.
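As a rough check on this example, the sketch below rebuilds the stem and leaf table from the raw counts and reads off the two statements above. The variable names are illustrative assumptions, and the stem is taken to be the tens digit, as in the table.

```python
from collections import defaultdict
from itertools import accumulate

climbers = [31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57,
            37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42]

# Group the sorted values by stem (tens digit); the leaf is the units digit.
leaves = defaultdict(list)
for value in sorted(climbers):
    leaves[value // 10].append(value % 10)

stems = sorted(leaves)
frequencies = [len(leaves[s]) for s in stems]
# Discrete variable: use the largest value actually recorded in each class.
upper_values = [10 * s + leaves[s][-1] for s in stems]
cumulative = list(accumulate(frequencies))

for s, f, u, c in zip(stems, frequencies, upper_values, cumulative):
    print(s, "|", " ".join(map(str, leaves[s])), "|", f, u, c)

# Points to plot for the ogive: (upper value, cumulative frequency).
ogive_points = list(zip(upper_values, cumulative))

# The two statements read from the table, recomputed from the raw data.
at_most_39 = sum(1 for v in climbers if v <= 39)
at_least_50 = sum(1 for v in climbers if v >= 50)
print(at_most_39, at_least_50)  # 11 13
```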


CONTINUOUS VARIABLES

When a continuous variable or variable taking a large number of values is used, plotting the graph requires a different approach to that for a discrete variable.


EXAMPLE

1. The snow depth at Thredbo in the Snowy Mountains was measured (to the nearest centimetre) for twenty-five days and recorded as follows:
242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246, 260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257.
a) Set up a frequency distribution table and hence calculate the cumulative frequency by adding appropriate columns.
b) Plot the graph of cumulative frequency against snow depth.

Answers

a) The data range from 209 cm to 266 cm, so they are grouped in class intervals of 10 cm to produce the following table:

Snow depth (x), cm   Tally     Frequency (f)   End-point   Cumulative frequency
                                               200         0
200-<210             I         1               210         1
210-<220             II        2               220         3
220-<230             III       3               230         6
230-<240             IIII      5               240         11
240-<250             IIII II   7               250         18
250-<260             IIII      5               260         23
260-<270             II        2               270         25

(In the Tally column, IIII stands for a bundled group of five marks.)

b) Because the variable is continuous, the end-points of each class interval are used in plotting the graph. The plotted points are joined to form an ogive.

Remember that the cumulative frequency is always labelled on the vertical axis and any other variable, in this case snow depth, is labelled on the horizontal axis as shown below:


Graph: cumulative frequency (vertical axis) against snow depth in centimetres (horizontal axis).


Information that can be gained from either the graph or the table:
  • None of the 25 days had snow depth less than 200 centimetres.
  • On 1 of the 25 days snow depth was less than 210 centimetres.
  • On 2 of the 25 days snow depth was 260 centimetres or more.
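The same table can be reproduced from the raw measurements. The sketch below is a minimal illustration, assuming classes of width 10 cm starting at 200 cm as in the table; the variable names are not part of the original material.

```python
from itertools import accumulate

snow_depths = [242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246,
               260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257]

start, width, n_classes = 200, 10, 7  # classes 200-<210 up to 260-<270

# Count how many measurements fall in each class interval.
frequencies = [0] * n_classes
for depth in snow_depths:
    frequencies[(depth - start) // width] += 1

# Continuous variable: plot against the class end-points, starting from
# the lower end-point of the first class with a cumulative frequency of 0.
end_points = [start + width * i for i in range(n_classes + 1)]
cumulative = [0] + list(accumulate(frequencies))
ogive_points = list(zip(end_points, cumulative))
print(ogive_points)

# Reading the table: days below 210 cm, and days at 260 cm or more.
lookup = dict(ogive_points)
below_210 = lookup[210]
at_least_260 = len(snow_depths) - lookup[260]
print(below_210, at_least_260)  # 1 2
```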


CUMULATIVE PERCENTAGE

The main advantage of using cumulative percentage rather than cumulative frequency is that it provides an easier way to compare different sets of data.

The cumulative frequency and cumulative percentage graphs are exactly the same, the only difference being the vertical axis scale. In fact, it is possible to have the two vertical axes, cumulative frequency and cumulative percentage, on the same graph.

Cumulative percentage is calculated by dividing the cumulative frequency by the number of observations, n, then multiplying by 100 (the last value will always be equal to 100%). Thus:

CUMULATIVE PERCENTAGE = (CUMULATIVE FREQUENCY ÷ n) × 100
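As a small illustration of this formula, the sketch below converts the cumulative frequencies from the snow depth example into cumulative percentages. The variable names are assumptions; the multiplication is done before the division only to keep the floating-point results exact.

```python
# Cumulative frequencies at the end-points 200, 210, ..., 270 (snow depth example).
cumulative_frequency = [0, 1, 3, 6, 11, 18, 23, 25]

# n is the number of observations, which equals the last cumulative frequency.
n = cumulative_frequency[-1]

# Cumulative percentage = cumulative frequency / n * 100.
cumulative_percentage = [100 * cf / n for cf in cumulative_frequency]

print(cumulative_percentage)
# [0.0, 4.0, 12.0, 24.0, 44.0, 72.0, 92.0, 100.0]
```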


EXAMPLE

1. From the previous example, calculate the cumulative percentage and hence draw a graph with two different vertical axes: one for cumulative frequency and one for cumulative percentage.

Snow depth (x), cm   Tally     Frequency (f)   End-point   Cumulative frequency   Cumulative percentage
                                               200         0                      0/25 x 100 = 0
200-<210             I         1               210         1                      1/25 x 100 = 4
210-<220             II        2               220         3                      3/25 x 100 = 12
220-<230             III       3               230         6                      6/25 x 100 = 24
230-<240             IIII      5               240         11                     11/25 x 100 = 44
240-<250             IIII II   7               250         18                     18/25 x 100 = 72
250-<260             IIII      5               260         23                     23/25 x 100 = 92
260-<270             II        2               270         25                     25/25 x 100 = 100

Apart from the extra axis, the graph will be exactly the same as that drawn in the previous example:
Graph: cumulative frequency and cumulative percentage (two vertical axes) against snow depth.
Information that can be gained from either the graph or the table:
  • On 24% of days, snow depth was less than 230 centimetres.
  • On 7 of the 25 days, snow depth was at least 250 centimetres.
In summary, most ogives look similar to a stretched ‘S’. They are used to determine the number, or percentage, of observations that lie above (or below) a specified value.
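Reading a value off an ogive can be imitated numerically. The sketch below is one possible approach, not a method prescribed here: it assumes the plotted points are joined by straight lines, and the function name `value_at_percentage` is hypothetical. It estimates the snow depth below which 50% of the days fall.

```python
def value_at_percentage(points, target):
    """Estimate the x-value at a given cumulative percentage.

    points: (end_point, cumulative_percentage) pairs in ascending order.
    Assumes straight-line segments between neighbouring points.
    """
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 <= target <= p1 and p1 > p0:
            # Move along the segment in proportion to the percentage covered.
            return x0 + (x1 - x0) * (target - p0) / (p1 - p0)
    raise ValueError("target percentage is outside the range of the ogive")


# (End-point, cumulative percentage) pairs from the snow depth example.
snow_ogive = [(200, 0), (210, 4), (220, 12), (230, 24),
              (240, 44), (250, 72), (260, 92), (270, 100)]

centre = value_at_percentage(snow_ogive, 50)
print(round(centre, 1))  # 242.1
```

The estimate falls in the 240-<250 class, which is where the cumulative percentage passes 50 in the table.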


EXERCISES

1. The following set of data gives the length of reign (to the nearest year) of various Kings and Queens of England since the Battle of Hastings in 1066.
21, 13, 35, 19, 35, 10, 17, 56, 35, 20, 50, 22, 13, 9, 39, 22, 0, 2, 24, 38, 6, 5, 44, 22, 24, 25, 3, 13, 6, 12, 13, 33, 59, 10, 7, 63, 9, 25, 1, 15.
a) Present the data in the form of an ordered stem and leaf plot.
b) Do any outliers exist? If so, can you explain the reason for their presence?
c) Describe the main features of the distribution such as:
i) number of peaks,
ii) general shape, and
iii) approximate value at the centre of the distribution.
d) Calculate cumulative frequency and cumulative percentage.
e) Draw the ogive with two different vertical axes: one for cumulative frequency and one for cumulative percentage.
f) How many rulers reigned for less than 10 years?
g) How many rulers reigned for 50 years or more?
h) The current Queen of England is Queen Elizabeth II. She has reigned since 1952, and her reign has not been included in the data set. Calculate her length of reign, and briefly comment on this in comparison with the other rulers.
2. At a fast food outlet, Hungry Stats, a student often buys a small bag of french fries. Curious to know whether she was getting value for money and how consistent the store was with each bag, she counted and recorded the fries in each bag. The results from 30 different visits were as follows:
44, 46, 54, 38, 49, 46, 45, 31, 55, 37, 42, 43, 47, 51, 48, 40, 59, 35, 47, 21, 43, 37, 45, 38, 40, 32, 50, 34, 43, 54.
a) Present the data in an ordered stem and leaf plot. Split the stems if necessary.
b) Do any outliers exist? If so, can you explain the reason for their presence?
c) Describe the main features of the distribution such as:
i) number of peaks,
ii) general shape, and
iii) approximate value at the centre of the distribution.
d) Calculate the cumulative frequency and cumulative percentage.
e) Draw the ogive with two different vertical axes: one for cumulative frequency and one for cumulative percentage.
f) How many bags had fewer than 40 fries in them?
g) What percentage of bags had 45 or more fries in them?
h) Copy and complete Hungry Stats’ promotional saying: ‘Fifty per cent of our small bags of french fries contain at least... french fries.’
3. The table below is from the 1996 Census for Darwin. It shows numbers of unemployed females looking for full-time work, by age group.

Age group (a)   Number of females
15-24           339
25-34           273
35-44           147
45-54           121
55-64           22

(a) Age is collected in completed number of years. Thus, the interval 15-24 has an upper end-point of 25 (refer to section Stem and Leaf Plots).

a) Is the variable discrete or continuous?
b) Copy the table and calculate cumulative frequency and cumulative percentage.
c) Draw the ogive with two different vertical axes: one for cumulative frequency and one for cumulative percentage.
d) Why is there no data for females less than 15 years old?
e) In what age group does the cumulative percentage value ‘50’ lie?
f) What percentage of unemployed females looking for full-time work is less than 25 years old?
g) What percentage of unemployed females looking for full-time work is 55 years old or older?
h) How would Australian governments use this sort of information?
4. A survey was taken of 50 ABS employees in Brisbane to determine how long it takes them to travel to work. The results, to the nearest minute, were recorded as follows:
33, 63, 49, 65, 56, 45, 52, 63, 38, 66, 43, 98, 60, 58, 68, 29, 59, 87, 22, 64, 73, 56, 71, 67, 44, 31, 83, 50, 75, 65, 60, 51, 89, 69, 41, 76, 58, 62, 25, 52, 64, 77, 61, 55, 80, 45, 12, 69, 40, 37.
a) What type of variable is this?
b) Present the data in a frequency table, using appropriate intervals, and include relative and percentage frequencies.
c) Draw a histogram to represent the data and mark in the frequency polygon.
d) Prepare an ordered stem and leaf plot for the data. Do any outliers exist? If so, can you explain the reason for their presence?
e) Describe the main features of the distribution such as:
i) number of peaks,
ii) general shape, and
iii) approximate value at the centre of the distribution.
f) Calculate the cumulative frequency and cumulative percentage. Put in the end-points.
g) Draw the ogive with two different vertical axes: one for cumulative frequency and one for cumulative percentage.
h) What was the most common time interval taken for ABS staff to travel to work?
i) What percentage of people took longer than 90 minutes to travel to work?
j) How many staff took less than 40 minutes to travel to work?


CLASS ACTIVITY

1. Survey teachers in your school to find out how long they have been teaching (to the nearest year). What type of variable is this? Present the data in a frequency table, using appropriate intervals, and include relative and percentage frequencies.
For how many years have the majority of teachers taught? By what percentage is this more than the second most common length of service?
Draw a histogram to represent the data and mark in the frequency polygon. Prepare an ordered stem and leaf plot for the data.
Do any outliers exist? If so, can you explain the reason for their presence?
Describe the main features of distribution such as:
i) number of peaks,
ii) general shape, and
iii) approximate value at the centre of the distribution.
Calculate cumulative frequency and cumulative percentage.
Draw the ogive with two different vertical axes: one for cumulative frequency and one for cumulative percentage.
How many teachers have taught for more than ten years?
What percentage of teachers has taught for more than ten years?
What percentage of teachers has taught for less than ten years?
What is the number of years below which half the teachers have taught?
Present your analysis and report in a neat project form.

