1331.0 - Statistics - A Powerful Edge!, 1996  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 31/07/1998   

ORGANISING DATA

FREQUENCY DISTRIBUTION TABLES

The frequency (f) of a particular observation is the number of times the observation occurs in the data. The distribution of a variable is the pattern of values of the observations.

Frequency distribution tables can be used for both nominal and numeric variables. (For continuous variables they should only be used with class intervals, explained further down.)

EXAMPLE

1. Twenty people were asked how many cars were registered to their households. The results were recorded as follows:

1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0.

Present this data in a frequency distribution table.


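Number of cars (x)    Tally       Frequency (f)
0                     ||||        4
1                     ||||/ |     6
2                     ||||/       5
3                     |||         3
4                     ||          2

(In the tally column, ||||/ stands for a group of five: four marks with the fifth drawn through them.)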


A tally mark is placed in the appropriate row in the table as the data are read from left to right.

The first result is a ‘1’, so a tally mark is placed in the row where 1 appears in the ‘number of cars’ column in the table.

The next result is a ‘2’, so a tally mark is placed in the row where 2 appears in the ‘number of cars’ column, and so on.

The fifth tally mark is drawn through the preceding four marks to make final calculations of frequency easier.

Thus, it can be seen that the number of households with no car is 4, the number of households with 1 car is 6 and so on.
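The tallying procedure above can also be sketched in code. The following is a minimal illustration in Python, not part of the original material; the function name `frequency_table` is a hypothetical helper chosen for this example, and `collections.Counter` does the counting that the tally marks do by hand.

```python
from collections import Counter

# Survey results from the example: cars registered to each of 20 households.
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]


def frequency_table(observations):
    """Return (value, frequency) pairs, sorted by value.

    Hypothetical helper for illustration: Counter counts how many
    times each distinct observation occurs in the data.
    """
    counts = Counter(observations)
    return sorted(counts.items())


table = frequency_table(cars)

print("Number of cars (x)    Frequency (f)")
for value, freq in table:
    print(f"{value:<22}{freq}")
```

Summing the frequency column should give back the number of observations (20 here), which is a quick check that no result was missed.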


CLASS INTERVALS

When a variable takes a large number of values, it is easier to present and handle the data by grouping the values into class intervals. Continuous variables are always presented in class intervals; discrete variables can also be grouped and presented this way. The illustration that follows uses age ranges for a study of young people, but allows that some older people may fall within the scope of the study.

The frequency of a class interval is the number of observations that occur in a particular pre-defined interval. If 20 people aged 5-9 appear in our result, the frequency is 20 for this interval.

The end-points of a class interval are the lowest and highest values that a variable can take within that interval. Therefore, if the intervals are 0-4 years, 5-9, 10-14, 15-19, 20-24 and 25+, the end-points of the first interval are 0 and 4 if the variable is discrete, and 0 and 4.999... (any value below 5) if it is continuous.

Class interval width is the difference between the lower end-point of an interval and the lower end-point of the next interval. If the continuous intervals are 0-4, 5-9, and so on, the width of each of the first five intervals is 5, and the last interval is open-ended. The intervals could also be written as 0-<5, 5-<10, 10-<15, 15-<20, 20-<25 and 25+.
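To make the definitions of end-points and width concrete, here is a small Python sketch, offered as an illustration only. It assumes the age intervals listed above; the names `LOWER_END_POINTS`, `interval_label` and `interval_width` are hypothetical and chosen for this example.

```python
from bisect import bisect_right

# Lower end-points of the intervals 0-<5, 5-<10, 10-<15, 15-<20, 20-<25 and 25+.
LOWER_END_POINTS = [0, 5, 10, 15, 20, 25]


def interval_label(age):
    """Return the label of the class interval that contains `age`."""
    if age < LOWER_END_POINTS[0]:
        raise ValueError("age is below the first interval")
    # Index of the last lower end-point that is <= age.
    i = bisect_right(LOWER_END_POINTS, age) - 1
    lower = LOWER_END_POINTS[i]
    if i == len(LOWER_END_POINTS) - 1:
        return f"{lower}+"  # the last interval is open-ended
    return f"{lower}-<{LOWER_END_POINTS[i + 1]}"


def interval_width(index):
    """Width = lower end-point of the next interval minus this one.

    Returns None for the open-ended last interval, which has no width.
    """
    if index == len(LOWER_END_POINTS) - 1:
        return None
    return LOWER_END_POINTS[index + 1] - LOWER_END_POINTS[index]


print(interval_label(4.999))  # a continuous value just under 5
print(interval_label(5))
print(interval_label(37))
print(interval_width(0))
```

Working from lower end-points alone means a continuous value such as 4.999 needs no special handling: it simply falls below the next lower end-point.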

The basic rules to follow when constructing a frequency distribution table for a data set containing a large number of observations are:
  • find the lowest and highest values of the variable,
  • decide on the width of the class intervals, and
  • make sure that all possible values of the variable are included.

EXAMPLE

1. Thirty AA size batteries were tested to determine how long they lasted. The results, to the nearest minute, were recorded as follows:

423, 369, 387, 411, 393, 394, 371, 377, 389, 409, 392, 408, 431, 401, 363,
391, 405, 382, 400, 381, 399, 415, 428, 422, 396, 372, 410, 419, 386, 390.


Construct a frequency distribution table.

The lowest value is 363 and the highest value is 431.

For the given data, and choosing a class interval width of 10, the first class interval should be 360-369 so that it includes 363 (the lowest value). Class intervals are then added until the highest value has been included, giving the following table:

Class interval (x)
(Battery life, mins)    Tally        Frequency (f)
360-369                 ||           2
370-379                 |||          3
380-389                 ||||/        5
390-399                 ||||/ ||     7
400-409                 ||||/        5
410-419                 ||||         4
420-429                 |||          3
430-439                 |            1

Total                                30
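The three basic rules (find the lowest and highest values, decide on a width, cover every possible value) can be followed mechanically. The Python sketch below is one possible way to do so for whole-number data such as these battery lives; the function name `class_interval_table` is a hypothetical helper, and rounding the lowest value down to a multiple of the width is an assumption that happens to reproduce the 360-369 starting interval used here.

```python
# Battery lives from the example, recorded to the nearest minute.
battery_life = [
    423, 369, 387, 411, 393, 394, 371, 377, 389, 409,
    392, 408, 431, 401, 363, 391, 405, 382, 400, 381,
    399, 415, 428, 422, 396, 372, 410, 419, 386, 390,
]


def class_interval_table(values, width):
    """Return (label, frequency) rows for whole-number data.

    Assumption for this sketch: the first interval starts at the
    lowest value rounded down to a multiple of `width`.
    """
    low, high = min(values), max(values)   # rule 1: lowest and highest
    lower = (low // width) * width         # 363 -> 360 when width is 10
    rows = []
    while lower <= high:                   # rule 3: cover every value
        upper = lower + width - 1          # e.g. 360-369 for whole minutes
        freq = sum(1 for v in values if lower <= v <= upper)
        rows.append((f"{lower}-{upper}", freq))
        lower += width                     # rule 2: fixed width
    return rows


rows = class_interval_table(battery_life, width=10)
for label, freq in rows:
    print(f"{label:<12}{freq}")
print(f"{'Total':<12}{sum(freq for _, freq in rows)}")
```

As with the tally method, the frequencies should add up to the number of observations (30).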



RELATIVE AND PERCENTAGE FREQUENCY

Analysts studying this data may be interested not only in how long batteries last, but also in what proportion fall in each class interval.

The relative frequency of a particular observation or class interval is found by dividing the frequency (f) by the number of observations (n): that is, (f/n). Thus:

RELATIVE FREQUENCY = FREQUENCY ÷ NUMBER OF OBSERVATIONS

The percentage frequency is found by multiplying each relative frequency value by 100. Thus:

PERCENTAGE FREQUENCY = (f ÷ n) × 100


EXAMPLE

1. Using the previous example of battery life, set up a table giving the relative frequency and percentage frequency of each interval.

Class interval (x)                       Relative     Percentage
(Battery life, mins)    Frequency (f)    frequency    frequency
360-369                 2                0.07         7
370-379                 3                0.10         10
380-389                 5                0.17         17
390-399                 7                0.23         23
400-409                 5                0.17         17
410-419                 4                0.13         13
420-429                 3                0.10         10
430-439                 1                0.03         3

Total                   30               1.00         100
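The relative and percentage frequencies in this table follow directly from the two formulas (f ÷ n, and f ÷ n multiplied by 100). A minimal Python sketch of the arithmetic is given below as an illustration; the function name `relative_and_percentage` is hypothetical, and the rounding to two decimal places and to whole per cent is chosen to match the table.

```python
# Frequencies from the battery life table.
frequencies = {
    "360-369": 2, "370-379": 3, "380-389": 5, "390-399": 7,
    "400-409": 5, "410-419": 4, "420-429": 3, "430-439": 1,
}


def relative_and_percentage(freqs):
    """Map each interval to (relative frequency, percentage frequency)."""
    n = sum(freqs.values())  # number of observations
    return {
        label: (round(f / n, 2), round(100 * f / n))
        for label, f in freqs.items()
    }


result = relative_and_percentage(frequencies)
for label, (relative, percentage) in result.items():
    print(f"{label:<12}{relative:<8.2f}{percentage}")
```

The rounded values happen to sum to exactly 1.00 and 100 for these data. In general, rounded relative frequencies can add up to slightly more or less than 1, so a small discrepancy in the total row is not necessarily an error.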


The analyst might now be able to say that:
  • 7 per cent of AA batteries have a life of at least 360 minutes but less than 370 minutes; or that
  • the probability of any randomly selected AA battery having a life in this range is approximately 0.07.

Note: these statements assume that a representative sample has been drawn. For completeness, an estimate of variability should also be given (see section Measures of Spread - Range).
Nevertheless, in summary, frequency distribution tables are important in providing information about the population from which the sample is drawn.


