1307.6 - Tasmanian State and Regional Indicators, Mar 2009  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 30/04/2009   

What is statistical literacy and why is it important to be statistically literate?


What is statistical literacy?
Why is it important to be statistically literate?
Are you statistically literate?
Statistical literacy criteria:
1. Data awareness
2. The ability to understand statistical concepts
3. The ability to analyse, interpret and evaluate statistical information
4. The ability to communicate statistical information and understandings
Conclusion
References


Australians regularly provide the Australian Bureau of Statistics (ABS) with information about their lives: how and where they live, their family structure and activities, how they earn their money and what they spend it on. This wealth of information enables us to put together a picture of the nation. One of the ABS' corporate objectives is to assist and encourage the 'informed and increased use of statistics'. By promoting access and improving understanding and use of these statistics, the ABS aims to improve statistical literacy in the community.


What is statistical literacy?

According to H.G. Wells, statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write... and that day has arrived! Statistics are collected on most aspects of Australian life. They capture vital information about our economic performance, the well-being of our population and the condition of our environment. They help form the basis of our democracy and provide us with essential knowledge to assess the health and progress of our society. We rely on those statistics being visible, accessible and robust, and we rely on statistically literate people making best use of the information to determine our future action, by presenting clear and convincing arguments and developing 'evidence-based policy' to guide our decision making.

We are surrounded by facts and figures every day. News headlines regularly frame statistical stories:
  • "Traffic offences have risen by 25% over the last five years."
  • "One in five of Australia's part-time workers want and are available to work more hours than they currently do."
  • "Cat Stevens was the unmistakable voice of a generation. An incredible one in two households in Australia had a Cat Stevens album in the seventies."

Statistics tell interesting stories and enable us to make sense of the world. They are indicators of change and allow meaningful comparisons to be made. In order to make sound judgements, it is essential that we are equipped with the very best knowledge for research, planning and decision-making purposes. While it may be the issues rather than the statistics that grab people's attention, it should be recognised that it is the statistics that inform the issues. Statistical literacy, then, is the ability to accurately understand, interpret and evaluate the data that inform these issues.



Why is it important to be statistically literate?

The provision of accurate and authoritative statistical information strengthens our society. It provides a basis for decisions to be made on public policy, such as determining electoral boundaries and where to locate schools and hospitals. It also allows businesses to know their market, grow their business, and improve their marketing strategies by targeting their activities appropriately.

In today's information-rich society, being statistically literate will give you an edge. It will make you more attractive to future employers and put you ahead of your competitors in the workplace. Broadening your statistical knowledge will enable you to engage in discussions and decision-making processes with authority, accuracy and integrity.



Are you statistically literate?

If you are uncomfortable with using statistics, you are not alone. Many people shy away from using statistics because of their perceived complexity. People may:
  • not know where to look to find the information they need;
  • be unfamiliar with the terminology; or
  • lack confidence in their ability to make sense of the numbers.

You do not have to be an expert at maths to work with statistics. Numeracy implies a basic competence in mathematics, a basic understanding of numbers and figures. It is certainly a prerequisite to being statistically literate, but statistical literacy is not about being adept at formulating or understanding the methodology behind the numbers. Rather, it is the ability to interpret the numbers and communicate the information contained therein effectively. Statistics simply help to tell a story. They may be presented in different ways, such as tables, graphs, maps or text, but they are not scary or boring if you know what they mean.

Increased use of statistics does not automatically lead to an increased understanding of statistics. In this information-rich age, it is important for individuals to be independent, critical thinkers, and statistical literacy is fundamental to achieving this. Be sceptical. Consider what spin may have been put on the data. What has really been said and what has been left out? Be aware. Ignoring definitions or comparing statistics inappropriately can result in misinterpretation of the data.



Statistical literacy criteria

To be statistically literate, there are four critical areas in which you need to build skills:

1. Data awareness

Are the data relevant and appropriate?

Data are the basis of statistics. Data are observations, which when organised and evaluated become information or knowledge. The amount of data available can be overwhelming. Interpreting data accurately requires a systematic approach. Think about the questions you need the data to answer. Look behind the data and consider:
  • are the data from a reliable/credible source?
  • are the data truly representative?
  • why have the data been presented in this way and what other data might be needed to fully answer a question or describe a situation?

An important aspect of statistical literacy is understanding what makes data trustworthy and reliable. Understanding how data are produced ensures that informed judgements can be made about the quality of the data.


Where did the data come from?

Data can come from a variety of sources. Beware of:
  • Pre-existing data
These may have been produced for a specific purpose. The population that the data are based on may differ from the population now under scrutiny, or the sampling method may not necessarily be appropriate for the current study.
  • Secondary data
These may have been used in a selective way to suit the purpose of a particular study or report. As such, they may not be a reliable data source, and may not present the data in a manner consistent with the intent of the original data. As a general rule, consult the original or primary data source wherever possible.
  • Data generated from observation and/or experimentation
The type of questions asked and the manner in which they are asked can influence the answers received. Data can be collected from a population as a whole or from a sample, from which conclusions can be drawn about the broader population. Types of sampling can vary, but the main thing to keep in mind is that any sample should be representative of the population. If there are limitations with the sampling procedure, it is important that these limitations are acknowledged because they can influence the validity and reliability of the results.
    Example
    In a street poll the people used in a sample are generally chosen because they are readily available and willing to participate. As a result, bias may be introduced because the sample is not truly representative of the population and the survey findings may be misleading.
  • Anecdotal evidence
This often relates to a specific event and is generally not representative. While it may be useful when describing a particular case study, care should be taken when making conclusions about the broader population.
  • Biased data
Bias can be deliberately or inadvertently introduced into survey samples. Sources of bias include:
    • sample bias (was the size of the sample appropriate and how were the respondents selected?)
    • response errors (people may misinterpret the questions and not give accurate answers)
    • missing data (people may not respond at all or give incomplete information)
    • responses may be influenced by the wording of the questions
    • responses may be influenced by the interviewer
    • groups with a vested interest may generate data that are biased towards their organisation's position, while data found to contradict that position may not necessarily be forthcoming.
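
The street poll example above can be illustrated with a small simulation. This is only a sketch: the population sizes, the levels of support and the names city_centre, street_poll and random_sample are invented for illustration and are not ABS data.

```python
import random

rng = random.Random(2009)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 people (True = supports a proposal).
# 2,000 people frequent the city centre (70% support); 8,000 do not (40% support).
city_centre = [True] * 1400 + [False] * 600
elsewhere = [True] * 3200 + [False] * 4800
population = city_centre + elsewhere

def share(sample):
    """Proportion of a sample that supports the proposal."""
    return sum(sample) / len(sample)

true_share = share(population)               # 0.46 by construction
street_poll = rng.sample(city_centre, 500)   # convenience sample: only people on hand
random_sample = rng.sample(population, 500)  # every person equally likely to be chosen

print(f"Whole population: {true_share:.2f}")
print(f"Street poll:      {share(street_poll):.2f}")
print(f"Random sample:    {share(random_sample):.2f}")
```

Under these assumptions the street poll should land well above the true figure, because it only reaches one part of the population, while the random sample should land close to it.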


How were the data collected?

There are three main forms of data collection:
  • Self-enumeration
People fill in their own forms and can complete them in their own time. This collection method may place limitations on the number and complexity of questions that can be asked, while responses may lack detail or accuracy. The Census is an example of self-enumeration.
  • Interview-based surveys
An interviewer contacts the selected survey participant either in person or via telephone. This collection method generally results in higher response rates, but also introduces the risk of interviewer bias. More questions and more complex questions can be asked, with more accurate and more detailed responses usually given.
  • Administrative by-product
Data are available through administrative records generated from the administrative transactions carried out by government departments, agencies and businesses, such as birth and death statistics, and overseas arrivals and departures. Making use of this type of data helps to keep the number of surveys and censuses to a minimum, which in turn is more cost effective. However, bear in mind that the information has been collected for a different purpose and is often restricted to a set of items which are administratively determined. Comparability problems may arise when comparing data from different sources.


Are the data fit for purpose?

To make informed use of data, users need to understand what the data show, how the data should be interpreted, what pitfalls may arise when interpreting the data, and any limitations of the data. Consider the following to determine if the data are fit for purpose:
  • What was the intended purpose of the collection?
  • Is the information representative of the total population?
  • How high are the relative standard errors? Can the data be considered reliable if the relative standard error is high?
  • How recent are the data? Is this the latest information available?
  • Are you looking for a snapshot or a trend over time?
  • Are other data sources available for comparisons? Are the datasets comparable?
  • What metadata (e.g. quality statements or explanatory notes) accompany the data? Most ABS products have an Explanatory Notes tab containing useful information on scope, concepts and definitions, survey design and estimation.
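
The relative standard error (RSE) mentioned in the checklist above is the standard error expressed as a percentage of the estimate. The sketch below shows the calculation. The three estimates are hypothetical, and the 25% and 50% cut-offs are illustrative assumptions rather than figures taken from this article, so check the explanatory notes of the publication you are using.

```python
def relative_standard_error(estimate, standard_error):
    """Return the relative standard error (RSE) as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return abs(standard_error / estimate) * 100

def reliability_note(rse, caution=25.0, unreliable=50.0):
    """Classify an RSE using illustrative cut-offs (assumed, not from this article)."""
    if rse > unreliable:
        return "too unreliable for most purposes"
    if rse >= caution:
        return "use with caution"
    return "reasonably reliable"

# Hypothetical survey estimates: (label, estimate, standard error)
estimates = [
    ("Estimate A", 12000, 600),
    ("Estimate B", 800, 280),
    ("Estimate C", 150, 90),
]
for label, estimate, se in estimates:
    rse = relative_standard_error(estimate, se)
    print(f"{label}: RSE = {rse:.0f}% ({reliability_note(rse)})")
```

Small estimates tend to have high RSEs, which is one reason figures for small areas or small groups need extra care.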


2. The ability to understand statistical concepts

Basic forms of statistical representation
  • tables
  • graphs
  • maps

Different types of proportions
  • percentages
  • ratios
  • rates
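
The three types of proportion listed above answer slightly different questions. As a minimal sketch, the figures below are invented for a hypothetical town and are used only to contrast the calculations.

```python
# Hypothetical figures for a small town, used only to contrast the three measures.
population = 5000
females = 2600
males = population - females
births_in_year = 65

percentage_female = females / population * 100   # part of a whole, out of 100
sex_ratio = males / females * 100                # one group relative to another (males per 100 females)
birth_rate = births_in_year / population * 1000  # events relative to population (per 1,000 people)

print(f"Females make up {percentage_female:.1f}% of the population")
print(f"Sex ratio: {sex_ratio:.1f} males per 100 females")
print(f"Birth rate: {birth_rate:.1f} births per 1,000 people")
```

A percentage compares a part with its whole, a ratio compares two separate groups, and a rate relates a count of events to the population in which they occurred.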

More complex statistical concepts
  • difference between median, mean and mode
  • difference between original, trend and seasonally adjusted data
  • difference between censuses and surveys

Some of these terms are discussed in Section 3 of this article. For further explanation of terms see Statistical Language! (cat. no. 1332.0.55.002).



3. The ability to analyse, interpret and evaluate statistical information

Organise data, construct and display graphs and tables and work with different representations of the data

To be statistically literate, one must understand that how data are organised can contribute to how they are interpreted. Tables and graphs are commonly used to present results. Tables provide greater detail, showing the actual values, whereas graphs are more useful in showing relationships, concentrating on the form, shape and movement of the data. Graphs are particularly useful in representing change over a period of time.


Describe and summarise basic data

There is an extensive amount of data available. It can sometimes be difficult to get to the information you need. Careful analysis is a vital step in exposing the important story contained in the data. Poor quality analysis can lead to incorrect and inappropriate conclusions being drawn. Therefore, it is important to be vigilant. Be sure to:
  • gain an understanding of the topic and the associated data
  • be critical of the data you are using
  • investigate carefully before being satisfied that you have painted a true and accurate picture

Background knowledge helps to build up an expectation of what the data should look like, but beware of the constraints that those preconceived expectations could place on the outcome of your analysis. Results that differ from your expectations may sound legitimate alarm bells, but it is equally important to be open to what the data are showing you. Question the results. Investigate further until you are satisfied that you have got an accurate interpretation of the figures. Remember, the data may not necessarily be telling the story you want or expect them to.


Extract, understand and explain data that are presented in a variety of ways

Comparison pitfalls

Be wary when making comparisons. Comparisons cannot be made between 'apples and oranges', only between 'oranges and oranges'. Care must be taken when:
  • Comparing data from different sources
You need to consider whether the data sets are actually comparable.
    Example 1
    Results from the 2006 Census regarding unpaid child care cannot be directly compared with the results of the ABS Child Care Survey because the age of the children who were reported on is different. The Census question referred to care provided for children aged less than 15 years in the two weeks prior to the Census, while the Child Care Survey only included children aged less than 13 years during a single reference week.

    Example 2
    ABS and Centrelink both collect information about unemployed persons, but the data sets are not comparable. ABS unemployed are defined by activity. That is, they are people who are without work, but have been actively seeking work in the past four weeks, and were available to start work last week. Centrelink unemployed are defined by their eligibility to receive unemployment benefits.

  • Changes have occurred
Changes can occur to a data set over time, such as changes in classification, geography, sample size, methodology, etc.
    Example
    New industry classification codes, known as Australian and New Zealand Standard Industrial Classification (ANZSIC), were developed in 2006, replacing the 1993 edition, which was the first version produced. ANZSIC 2006 codes reflect the changes that have occurred in the structure and composition of industry since the previous edition, and enhance international comparability. However, direct comparisons with ANZSIC 1993 cannot be made.

  • Definitions differ
Definitions may differ depending on the context or the survey. Always check that you have the correct definition and are clear about what you are describing. Make sure you are aware of the data boundaries.
    Example
    The term 'child' can mean many different things. Depending on the context, a child could be someone:
    • aged under 13 years
    • aged under 15 years
    • aged under 18 years
    Check the Explanatory Notes to ascertain the definition of a 'child' used in that particular survey. Be wary of making comparisons with other data sources - be sure to check that you are comparing like age groups.

  • Correlating information
Correlation does not mean causation. The relationship between data and an event may be purely coincidental, or there may be multiple reasons behind an event taking place, with the data only reflecting one aspect of the relationship.
    Example
    The increased number of shark attacks along the eastern seaboard of Australia in January 2009 may have corresponded with booming retail sales of sunscreen products. This retail boom just happened to coincide with the peak shark attack period, but the number of attacks is unlikely to be related to the increased use of sunscreen.

  • Results lack variation
Variation in data is to be expected and is almost impossible to remove. A lack of variation in results over time should therefore be cause for suspicion.
    Example
    If the unemployment rate remained unchanged over many months, it would be worth further investigation as to why this was the case.
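
The point made above about correlation and causation can be sketched in code. In this hypothetical example, sunscreen sales and shark sightings are both constructed from the same seasonal pattern of beach visits, so they move together even though neither causes the other. All figures are invented, and pearson_r is a helper written for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures (not real data): beach visits peak in summer.
beach_visits = [90, 80, 60, 40, 25, 15, 10, 15, 30, 50, 70, 85]  # thousands, Jan-Dec
sunscreen_sales = [20 + 2 * v for v in beach_visits]              # built from beach visits
shark_sightings = [1 + v // 10 for v in beach_visits]             # also built from beach visits

r = pearson_r(sunscreen_sales, shark_sightings)
print(f"Correlation between sunscreen sales and shark sightings: {r:.2f}")
```

The correlation is very strong, yet the only link between the two series is the shared seasonal driver. A high correlation on its own says nothing about why two measures move together.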

Understand the context

Context is very important. A lot of data will be context dependent and it is important that you have a good grasp of what that context is.
    Example 1
    Many commentators will use various descriptors to captivate people's imagination. Be careful when assigning labels that you are clear about the group you are describing. Commentators may refer to the 'iGeneration' or 'Internet generation', but what exactly is the 'iGeneration'? Some people will know them as 'Generation Z'. Others will have heard them referred to as 'KIPPERS' (Kids in Parents' Pockets Eroding Retirement Savings). Some people will claim this generation covers the period 1986-2006, while others will argue that they don't come into being until after 1991. Be aware that different definitions exist.
    Example 2
    Information that is "cherry picked" to look interesting might mean something entirely different when placed in another context. In trend terms, labour force estimates indicated that Tasmania had the lowest participation rate of all the states and territories in Australia during the 2007-08 financial year. However, in October 2008, Tasmania's participation rate was at a record high (60.9%).

    Both claims are equally true, but selective reporting of these data could be misleading. Even reliable statistics can be distorted if only part of the story is told.


4. The ability to communicate statistical information and understandings

How are the data reported?

Turning data into information is an essential skill. Communicating statistical information accurately is vital for effective decision making. To ensure integrity, statistical literacy demands that we question how the data are reported and the reliability of conclusions that are drawn. Bad conclusions can still be drawn from good data. Some things to be aware of include:
  • Use of basic summary numbers
Using basic summary numbers, such as averages, can sometimes be misleading.
    Example
    If houses in Hobart were advertised for sale at $275,000, $295,000, $300,000, $325,000 and $850,000 respectively, using the mean to calculate the average house price would produce a figure of $409,000. This gives an over-inflated impression of house values in Hobart. In reality, the median value of $300,000 would give a much more accurate picture of average house prices.
  • Use of proportions
Using proportions can also produce misleading conclusions, especially if the numbers involved are small.
    Example
    According to reliable crime and justice statistics, there was a 50% increase in the number of murders in Tasmania from 2005-06 to 2006-07. While this is true, the actual number of murders increased from 4 to 6, which is not nearly as dramatic as the percentage increase would have us believe.

  • Seasonal variations
Seasonal variations can influence results.
    Example
    Retail sales for March may be lower in one year than in the previous year. At face value, it may be reasonable to conclude that business returns had suffered. However, the fall may simply reflect Easter shifting from March in the earlier year to April in the later year. To remove the effects of this type of seasonal variation, the ABS uses seasonal adjustment to standardise the data.
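
The first two examples above can be checked with a few lines of code. This sketch uses the house prices and murder counts quoted in the examples; the function name percent_change is introduced here for illustration.

```python
from statistics import mean, median

# Advertised Hobart house prices from the example above.
prices = [275_000, 295_000, 300_000, 325_000, 850_000]
print(f"Mean price:   ${mean(prices):,.0f}")    # pulled upward by the single $850,000 house
print(f"Median price: ${median(prices):,.0f}")  # the middle value, unaffected by the outlier

def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Murders in Tasmania from the example above: 4 in 2005-06, 6 in 2006-07.
before, after = 4, 6
print(f"Change: {percent_change(before, after):.0f}% (an increase of {after - before})")
```

Reporting the median alongside the mean, and the raw counts alongside the percentage, gives readers what they need to judge the size of an effect for themselves.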


Confidentiality of ABS data

Statistical literacy also includes recognition of ethical issues such as confidentiality. All information collected by the ABS is confidential. It is collected under the authority of the Census and Statistics Act 1905, which carries severe penalties for any person who breaches that confidentiality. In accordance with the Act, no information can be released which enables a person, household or business to be identified.
Tables containing cells with very small counts may potentially result in identifiable information. To avoid releasing identifiable information all tables are subjected to two confidentiality processes before release:
  • assessing the size of the table; and
  • applying introduced random error.

If the number of cells is the same as, close to, or exceeds the population size, then the table will not be released. This practice avoids the release of tables containing a large proportion of small cells containing identifiable data.

Introduced random error is a technique that was developed to avoid identification of individuals. Prior to the 2006 Census, the confidentiality technique applied by the ABS was to randomly adjust cells with very small values. For the 2006 Census, a new technique was developed which slightly adjusts all cells to prevent identifiable data being exposed. These adjustments result in small introduced random errors, but do not impair the value of the table as a whole.

Tables which have been randomly adjusted will be internally consistent; however, comparisons with other tables containing similar data may show minor discrepancies. This is the case for both customised tables and standard products. These small variations can, for the most part, be ignored.
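
To see why randomly adjusted tables can add up internally yet differ slightly from one another, consider the toy sketch below. It is not the ABS technique: the perturb function, the adjustment range of plus or minus 2 and the counts are all assumptions made purely for illustration.

```python
import random

def perturb(cells, rng, max_shift=2):
    """Shift every cell count by a small random amount, never below zero.

    Illustrative only: the +/-2 range is an assumption, not the ABS method.
    """
    return {key: max(0, count + rng.randint(-max_shift, max_shift))
            for key, count in cells.items()}

rng = random.Random(2006)  # fixed seed so the sketch is repeatable

# Hypothetical counts of people by age group for one small area.
true_counts = {"0-14": 3, "15-64": 41, "65+": 7}

table_a = perturb(true_counts, rng)
table_b = perturb(true_counts, rng)  # the same data requested in a second table

# Totals are derived from the adjusted cells, so each table adds up internally...
total_a = sum(table_a.values())
total_b = sum(table_b.values())
# ...but the two tables may differ slightly from each other and from the true total.
print(table_a, total_a)
print(table_b, total_b)
print("True total:", sum(true_counts.values()))
```

Each cell moves by only a small amount, so the overall picture is preserved while very small counts can no longer be relied on to identify anyone.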



Conclusion

Statistical literacy is essentially the ability to find, access, utilise, understand and communicate the story contained within the data. Sound understanding, interpretation and critical evaluation of statistical information can then contribute to decision making. The importance of statistical literacy in our information-rich society means that it has now become a core competency like reading and writing.

Statistics infiltrate and influence every aspect of our life, via the media and advertisements, persuading us to agree with a certain point of view or take some kind of action. Therefore it is in every Australian's interest to be statistically literate, to have a good understanding of statistics and the ability to use and interpret them effectively and appropriately.



ABS References:

Statistical Language! (cat. no. 1332.0.55.002)

Statistical Literacy Paper, ABS Education Services

Surviving Statistics (cat. no. 1332.0)

Trewin, D. (2005), Making Maths Vital, Keynote speech, AAMT conference

Working Together for an Informed Australia in the 21st Century, NatStats08 Conference Declaration, November 2008


Non-ABS References:

Ben-Zvi, D. & Garfield, J. (2004), 'Goals, Definitions, And Challenges', The Challenge of Developing Statistical Literacy, Reasoning and Thinking, edited by Ben-Zvi, D. and Garfield, J., Kluwer Academic Publishers, pp.3-15

Biggeri, Luigi & Zuliani, Alberto (1999), 'The Dissemination of statistical literacy among citizens and public administration directors', paper presented at the ISI 52nd Session, Helsinki, Finland http://www.stat.auckland.ac.nz/~iase/publications.php?show=5

Gal, Iddo (2002), 'Adults’ Statistical Literacy: Meanings, Components, and Responsibilities', International Statistical Review, Vol 70 (1)

Garfield, J. (1999), 'Thinking about Statistical Reasoning, Thinking, and Literacy', Paper presented at First Annual Roundtable on Statistical Thinking, Reasoning, and Literacy

Pfannkuch, M. and Wild, C. (2004), 'Towards an Understanding of Statistical Thinking', The Challenge of Developing Statistical Literacy, Reasoning and Thinking, edited by Dani Ben-Zvi and Joan Garfield, pp.17-43

Watson, J. M. (2005), 'Is Statistical Literacy Relevant for Middle School Students?', Vinculum Vol 42 (1)

Watson, J. and Kelly, B. (2003), The Vocabulary of Statistical Literacy, sourced: http://www.augsburg.edu/ppages/~schield/

Wells, H.G., Mankind in the Making, sourced: http://www.causeweb.org/resources/fun/db.php?id=105