1331.0 - Statistics - A Powerful Edge!, 1996  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 31/07/1998   

INFORMATION - PROBLEMS WITH USING


The previous section should have given you an idea of just how important statistical information is in modern society. Decisions that affect the lives of all Australians are often made by taking statistics into account. This places a large responsibility on people who make decisions. They should be aware of the traps one can fall into when using statistics.

This section will outline some of the problems you may encounter if you are not careful in using statistics. The quotation below from H.G. Wells was made at the beginning of the 20th century, and few would disagree that it is relevant today. The modern citizen needs to have an awareness of the problems with using statistical information.

“STATISTICAL THINKING WILL ONE DAY BE AS NECESSARY FOR EFFICIENT CITIZENSHIP AS THE ABILITY TO READ AND WRITE.”
H. G. Wells

MISINTERPRETATION OF STATISTICS

Misinterpretation is a good example of a common problem in the use of statistical information. It may be caused by a number of factors such as:
  • Ignoring definitions. You should always familiarise yourself with the definition of concepts surrounding statistical information you are using. If you are examining labour force issues, you should familiarise yourself with the definition of unemployment, participation rate, etc. If you are examining environmental issues, you should consider the definition of forest, woodland, extinct or endangered species, or even the definition of a National Park (which differs between States). An example of how ignoring a definition can lead to misinterpretation of data follows:

The ABS released a labour force publication in November 1992 with the following main feature:
    “AN ESTIMATED 25 PER CENT OF ALL FAMILIES HAD NO FAMILY MEMBER EMPLOYED.”

    Based on the above, a headline in a leading Australian newspaper read:
      “UNEMPLOYMENT AFFLICTS ONE IN FOUR FAMILIES.”

      This headline does not logically follow from the main feature above it. It reflected a lack of understanding of the definition of ‘unemployed’. If you are not employed you may be unemployed: that is, in the labour force and actively seeking a job. Alternatively, you may not be in the labour force at all: for example, you may be a student, retired or not actively looking for work.

      Just because a family has no member employed does not necessarily mean that those members are unemployed, because to be unemployed you have to be in the labour force: that is, you have to be actively seeking work. The headline showed a misinterpretation based on lack of understanding of an underlying definition.
      • Comparing statistics inappropriately. A great advantage of using statistics is that one can compare information to assess beliefs, ideas or thoughts about issues and topics. For example, you can compare: Sydney’s weather with Melbourne’s, past sporting results with the present, or whether males and females do the same amount of unpaid household work.

      However, there can be real problems in comparing statistics when the definitions, classifications or methods of collection underpinning them are different. Nowhere is this more apparent than with environmental statistics. Consider the table below.

      FOREST COVER, AUSTRALIA, 1980

      Source          Per cent of Australia
      CSIRO           4
      WORLD BANK      14

      The definitions of forest used by CSIRO and World Bank in the above table are very different. The World Bank has included ‘woodland’ in their estimate, and this explains the large difference in figures. Therefore, it is wrong to compare the two figures in any way! It would be worse still to compare a World Bank estimate for 1980 with a CSIRO estimate for 1981, and conclude that Australia had logged most of its forests!
      • Deliberate misrepresentation. In the modern information age, it is important to recognise that information can have integrity, and be objective, accurate and factual. However, it must also be recognised that information can sometimes be flawed: it may be subjective, inaccurate or fictional. Consider the quotation below:
      “POLITICAL TACTICIANS ARE NOT IN SEARCH OF SCHOLARLY TRUTH OR EVEN SIMPLE ACCURACY. THEY ARE LOOKING FOR AMMUNITION TO USE IN THE INFORMATION WARS. DATA, INFORMATION, AND KNOWLEDGE DO NOT HAVE TO BE TRUE TO BLAST AN OPPONENT OUT OF THE WATER.”
      Alvin Toffler

      You might say this is an overly cynical quotation, but one does need to realise that information is open to manipulation by various forces, for example:
        According to a United Nations report, in 1986 the South African authorities ceased publishing information on South Africa’s imports and exports classified by countries that supplied and received them. This was an attempt to head off trade sanctions being imposed because of apartheid policy.
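
Returning to the labour force example earlier in this section, the distinction between ‘not employed’ and ‘unemployed’ can be sketched in a few lines of Python. This is only an illustration: the `Person` class, the function name and the sample family are hypothetical, and the classification is simplified to the two conditions mentioned in the text (official definitions carry further conditions that are not modelled here).

```python
from dataclasses import dataclass


@dataclass
class Person:
    employed: bool
    actively_seeking_work: bool


def labour_force_status(person: Person) -> str:
    """Simplified classification based only on the wording in the text."""
    if person.employed:
        return "employed"
    if person.actively_seeking_work:
        return "unemployed"  # in the labour force, but without a job
    return "not in the labour force"  # e.g. student, retired


# Hypothetical family: two retirees and a full-time student, none employed
family = [Person(False, False), Person(False, False), Person(False, False)]
statuses = [labour_force_status(p) for p in family]

no_member_employed = all(s != "employed" for s in statuses)
any_member_unemployed = any(s == "unemployed" for s in statuses)
print(no_member_employed, any_member_unemployed)
```

In this invented family no member is employed, yet no member is unemployed either, which is why the newspaper headline does not follow from the ABS statement.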

        The discussion above outlined some problems you may encounter when trying to understand and compare statistical information. Of course, you also have to be careful about how accurately statistics were collected in the first place. This means being aware of sampling and non-sampling error, concepts outlined below.


        SAMPLING ERROR

        In any sample survey that you undertake you will experience sampling error. Sampling error refers to:

        THE DIFFERENCE BETWEEN AN ESTIMATE DERIVED FROM A SAMPLE SURVEY AND THE ‘TRUE’ VALUE THAT WOULD RESULT IF A CENSUS OF THE WHOLE POPULATION WAS TAKEN.

        Sampling error can be measured mathematically and is influenced by:

        Size of sample. In general, the larger the sample size (the number of people being surveyed) the smaller the sampling error.

        Many people are surprised by the small size of well-known sample surveys. Opinion polls about which party people will vote for are taken with sample sizes ranging from 600 to 2,000 people, with samples of about 1,000 the most likely. Television ratings of different programs and channels are taken from a sample survey of about 1,900 homes, out of an Australian total population of 6.5 million homes. Despite a perception that such polls are accurate, some statisticians would question their accuracy due to the small sample sizes.

        Design of sample. The method of sampling can also affect the size of sampling error. This concept is looked at in detail in the sections Random Sampling and Non-Random Sampling.
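
To give a rough feel for how sampling error shrinks as the sample grows, the sketch below uses the standard textbook approximation for the margin of error of a proportion. It assumes simple random sampling and a 95 per cent confidence level (z = 1.96); these assumptions, and the function name, are not taken from this publication, and real polls with more complex designs will differ.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p estimated from a
    simple random sample of size n (assumed design, not from the text)."""
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes mentioned in the text for opinion polls
for n in (600, 1000, 2000):
    print(n, round(100 * margin_of_error(n), 1), "percentage points")
```

Under these assumptions the margin of error is about 4.0 percentage points for 600 people, 3.1 for 1,000 and 2.2 for 2,000. Note that the formula depends on the sample size rather than on the size of the population, and that it says nothing about non-sampling error.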


        NON-SAMPLING ERROR

        This concept refers to error apart from sampling error. Non-sampling error can occur at any stage of a sample survey or census and, unlike sampling error, it is generally not easy or inexpensive to measure. There are two main types of non-sampling error: systematic error and variable error. Variable error is less serious than systematic error because, on average, it tends to balance out. Systematic error does lead to distortion of survey results, so it is important to be aware of how it occurs.
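
The difference between the two types of error can be illustrated with a small simulation. Everything in it is invented for illustration: the ‘true’ value of 50, the size of the random scatter and the constant bias of 2 are assumptions, not figures from any survey.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 50.0  # hypothetical true value being measured
BIAS = 2.0         # hypothetical systematic error added to every response
N = 10_000         # number of simulated responses

# Variable error: random scatter around the true value
variable = [TRUE_VALUE + random.gauss(0, 5) for _ in range(N)]

# Systematic error: the same scatter plus a constant bias
systematic = [TRUE_VALUE + BIAS + random.gauss(0, 5) for _ in range(N)]

mean_variable = statistics.mean(variable)
mean_systematic = statistics.mean(systematic)
print(round(mean_variable, 2), round(mean_systematic, 2))
```

The first average should land very close to 50, because the random errors largely cancel. The second should stay close to 52 however many responses are collected, because a constant bias does not balance out.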


        SYSTEMATIC ERROR (BIAS)

        Later in this publication you will come across a technical definition of bias (see section Non-Random Sampling). For the purposes of this section, bias is defined as any influence that unreasonably affects or sways the results of a sample survey or census. There are a number of different sources of bias:
        • Inappropriate estimation. The ABS and other data collection agencies spend much time designing and monitoring sample surveys to ensure that non-sampling error is kept to an absolute minimum. However, even after sample survey results are finalised, bias can be introduced. This occurs if estimation is inappropriate (see section Estimation).
          An example of inappropriate estimation relates to the issue of global warming (the greenhouse effect). The graph below is the most common portrayal of global temperature change. In general, it shows an average increase in global temperatures of between 0.3 and 0.6 degrees Celsius over the last 160 years.
          However, some scientists have questioned the accuracy of this chart because they feel that the estimates from the sample survey are biased.
        CHANGES IN GLOBAL TEMPERATURE
        (Degrees Celsius)
        Graph: changes in global temperature



        The measurements that make up the graph have been taken at various weather stations around the world. You can regard the world’s surface as the population from which a sample survey can be taken.

        Scientists argue, therefore, that measurements should be taken to reflect the ratio of the world’s land mass to its sea mass. For example, if the land mass is half the sea mass, then twice as many measurements should come from the world’s seas as opposed to the land.

        In fact, in the graph above, there have been very few measurements taken from the world’s sea surfaces, whereas the great majority of measurements were taken from weather stations on land.

        But why might this bias the estimates from the sample survey? The reason is that temperatures on land tend to be naturally higher than on sea surfaces. This is due to a phenomenon known as the urban heat island effect. Hence, if the sample is too heavily weighted towards land-based temperatures, and the estimates do not take account of this (as some scientists claim), the results may not reflect a true global average.
        • Poor questionnaire design. You should always be careful with the questions you ask in a sample survey or census. Otherwise, bias may be introduced. If questions are leading, misleading, ambiguous or difficult to understand, the survey or census results may be distorted.
        An example of poor question design is shown by a pilot test the ABS conducted for the 1986 Census of Population and Housing. A pilot test checks that questions in a forthcoming census will be easily understood.

        For the 1986 Census it was requested that the ABS gather data on ethnicity. Initially a question: ‘What is your cultural background’ was framed.

        One of the replies to this question was simply ‘none’. When the respondent was contacted he was asked what he meant. He replied, ‘Look, leave me alone, I’m a regular sort of a bloke, I go to the footy every now and then, but I’ve never been to the opera and I’ve never taken up a musical instrument in my life.’

        This example shows that people may interpret broad concepts such as ‘cultural background’ quite differently. The question was reframed as: ‘What is each person’s ancestry?’, with examples such as Greek, Armenian and English.

        • Non-response bias. If a significant number of people do not respond to a mail-out for a sample survey, then results may be biased. This is because the characteristics of non-respondents may differ from those of people who have responded. For example, some questions may be difficult for certain people to understand.
          To reduce this form of bias, care should be taken in the design and testing of questionnaires, and non-respondents to a survey should be followed up.

        • Interviewer bias. An interviewer can unfairly influence the way a respondent answers questions. This may occur if the interviewer is too friendly, aloof or prompts the respondent. Interviewers therefore need to be trained correctly (see the section Data Collection).

        • Processing errors. These can arise through miscoding, mispunching, incorrect computer programming and inadequate checking (see the section Data Processing).
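
Returning to the land and sea example under inappropriate estimation, the sketch below shows how a sample dominated by land readings can distort a simple average, and how weighting each group by its share of the surface changes the estimate. All readings are invented, and the one-third land to two-thirds sea split simply follows the hypothetical in the text (‘if the land mass is half the sea mass’); none of these numbers are real measurements.

```python
from statistics import mean

# Invented readings of temperature change (degrees Celsius); not real data
readings = {
    "land": [0.6, 0.7, 0.5, 0.6, 0.6, 0.7, 0.5, 0.6],  # 8 land stations
    "sea": [0.3, 0.3],                                  # 2 sea readings
}

# Assumed shares of the world's surface, following the hypothetical in the text
surface_share = {"land": 1 / 3, "sea": 2 / 3}

# Simple average: every reading counts equally, so land dominates
unweighted = mean(r for group in readings.values() for r in group)

# Weighted average: each group's mean counts in proportion to its surface share
weighted = sum(surface_share[k] * mean(v) for k, v in readings.items())

print(round(unweighted, 2), round(weighted, 2))
```

With these invented figures the simple average is about 0.54, while the weighted estimate is about 0.40. The gap arises only because the sample over-represents land, which is the kind of distortion the critics of the graph had in mind.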


        SUMMARY

        It is useful to have a checklist of questions ready for whenever you are presented with statistical information. This is not because there are always going to be problems with the statistics, but rather because it will give you confidence in judging their reliability. Some questions you might ask include:
        • What is the source of the information? Is it from a primary source (organisation that collected the data) or a secondary source?

        • If the information is from a secondary source, is it possible it may have been altered for whatever reason?

        • Has the primary source of information got a possible reason for misrepresenting the information?

        • Do you need to find out the method of data collection, sampling technique or response rate to the survey? Were the questions asked easy to understand?

        • If the information is from a sample survey, was the sample size adequate?

        • Do you understand the definitions of variables or topics talked about in the survey or census? Are definitions consistent?
        These are just some questions you may consider when presented with statistical information. You may feel that some of them would be difficult to answer, but if the source cannot provide you with answers, then the information’s reliability should be questioned!


        EXERCISES

        1. Can you list some possible problems with the statistics in the following statements?
        a) The average income of Australians is $83,000 according to a survey carried out in the Sydney suburb of Double Bay.
        b) A large majority of rural people oppose dropping the wool floor price according to a television phone-in poll carried out by a regional television station.
        c) Tests reveal that half (50%) of our nation’s school leavers are below average in reading and writing.
        d) Youth unemployment is over 30%; therefore, 30% of Australia’s 15-19 year olds are unemployed.
        e) A leading environmental group recently claimed that only 3% of Australia’s land mass was covered by forest, whereas a leading business organisation claimed the figure was 7%.
        2. Examine the statistics you gathered for Exercise 2 in the previous section, and list some problems you think might exist with the information.
        3. When it comes to presenting and debating ideas, statistical information can have limitations: it often needs to be explained or interpreted with words. Conduct a class debate about the above sentence or about the quotation below.

        “ORATORY IS DYING, A CALCULATING AGE HAS STABBED IT IN THE HEART WITH INNUMERABLE DAGGER-THRUSTS OF STATISTICS.”

        Sir Keith Hancock
