3.1 Measures of quality

Report on the quality of 2021 Census data: Statistical Independent Assurance Panel to the Australian Statistician

An independent view of the quality of statistical outputs from the 2021 Census of Population and Housing

Released: 28/06/2022 10:00am AEST

Overview

There are three key measures of statistical quality used internationally that the Panel employed to assess the quality of the 2021 Census data:

undercount and overcount (see Sections 3.2 and 3.5);
response rates, including item non-response rates (see Section 3.3); and
consistency with other data sources (see Sections 3.4, 3.5, 3.6 and 3.7).

The undercount and overcount estimates produced by the Post Enumeration Survey were used to obtain information on the number of people missed, or counted more than once, by the Census.

Dwelling and person response rates and item non-response rates for the Census were reviewed to see if there were any unusual observations in response patterns.

Data from the 2021 Census was compared against the 2016 and 2011 Censuses, and against other data sources, to check for unexpected differences.

3.1.1 Undercount and overcount

Using the Post Enumeration Survey results, it is possible to estimate:

the number of people who either were counted more than once or in error in the Census (gross overcount);
the number of people who should have been counted in the Census but were missed (gross undercount); and
the net error for Census imputation into dwellings determined incorrectly to be occupied or unoccupied on Census night (net overcount for persons imputed).

The net result of these components is called the net undercount (or the net overcount, where the Census counted more people than it should have).
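
As a rough illustration of how the three components listed above might combine, the sketch below nets them off against each other. The function name, the sign convention (a positive result is a net undercount) and the figures are assumptions made for illustration only; they are not taken from the Post Enumeration Survey.

```python
def net_undercount(gross_undercount: float,
                   gross_overcount: float,
                   net_overcount_imputed: float) -> float:
    """Net the three coverage components (assumed sign convention).

    A positive result is a net undercount; a negative result is a net overcount.
    """
    return gross_undercount - gross_overcount - net_overcount_imputed


# Hypothetical figures, for illustration only.
result = net_undercount(gross_undercount=1_000,
                        gross_overcount=300,
                        net_overcount_imputed=200)
print(result)  # 500
```

Under this convention, a negative value would indicate that overcount and over-imputation outweighed the people missed.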

In 2021, the ABS also produced a new coverage measure called gross coverage, which sums the coverage errors. Conceptually, gross coverage aims to identify the population that the Census counted (or received a form for) as a proportion of the population that the Census should have counted. It is the Census count with the overcount and imputed persons removed, expressed as a percentage of the Post Enumeration Survey population estimate.
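
The definition of gross coverage can be sketched as a short calculation. This is a minimal sketch that follows the wording above; the function and parameter names are hypothetical and the figures are invented for illustration.

```python
def gross_coverage(census_count: float,
                   gross_overcount: float,
                   imputed_persons: float,
                   pes_population_estimate: float) -> float:
    """Census count less overcount and imputed persons, as a percentage
    of the Post Enumeration Survey population estimate."""
    if pes_population_estimate <= 0:
        raise ValueError("pes_population_estimate must be positive")
    counted = census_count - gross_overcount - imputed_persons
    return 100 * counted / pes_population_estimate


# Hypothetical figures, for illustration only.
print(gross_coverage(census_count=1_000,
                     gross_overcount=20,
                     imputed_persons=30,
                     pes_population_estimate=1_000))  # 95.0
```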

3.1.2 Response rates

Response rates are standard measures used internationally to assess data quality for population and housing censuses. The Panel considered three types of response rates: dwelling response, person response, and item non-response. These measures are defined below and are used in this report to assess overall data quality.

The dwelling response rate is used to measure how many occupied private dwellings in Australia completed the Census. Dwelling response occurs when a form is returned for a private dwelling identified as occupied on Census night. The dwelling response rate is calculated as a percentage by dividing the number of responding private dwellings by the number of private dwellings identified as occupied on Census night (including those where no form was returned). The dwelling response rate excludes non-private dwellings. Most people were counted in a private dwelling on Census night.
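
The dwelling response rate calculation described above can be expressed as follows. The names and figures are hypothetical; the only assumption beyond the text is that the denominator is supplied as a single count of private dwellings identified as occupied, including those that returned no form.

```python
def dwelling_response_rate(responding_private_dwellings: int,
                           occupied_private_dwellings: int) -> float:
    """Responding private dwellings as a percentage of private dwellings
    identified as occupied on Census night (non-private dwellings excluded)."""
    if occupied_private_dwellings <= 0:
        raise ValueError("occupied_private_dwellings must be positive")
    if responding_private_dwellings > occupied_private_dwellings:
        raise ValueError("responding dwellings cannot exceed occupied dwellings")
    return 100 * responding_private_dwellings / occupied_private_dwellings


# Hypothetical figures, for illustration only.
print(dwelling_response_rate(960, 1_000))  # 96.0
```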

The person response rate is used to measure how many people in Australia completed the Census. Person response occurs when a person is included on a form returned from either a private dwelling identified as occupied on Census night, or from a non-private dwelling. The person response rate is also calculated as a percentage by dividing the number of responding people by the total number of people in the Census counts (including those imputed into dwellings where no form was returned).

Unlike the dwelling response rate, the person response rate includes people in non-private dwellings. This takes into consideration responses on Census night from people who were in dwellings such as hotels, hospitals, nursing homes, boarding houses, mining camps and other staff quarters, among other non-private dwelling types.
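
A comparable sketch for the person response rate is given below. It assumes, for illustration, that the total Census count can be treated as responding people (in private and non-private dwellings) plus imputed people; the names and figures are hypothetical.

```python
def person_response_rate(responding_private: int,
                         responding_non_private: int,
                         imputed_persons: int) -> float:
    """Responding people as a percentage of all people in the Census counts.

    Assumption: total count = responding people + imputed people.
    """
    responding = responding_private + responding_non_private
    total_count = responding + imputed_persons
    if total_count <= 0:
        raise ValueError("total Census count must be positive")
    return 100 * responding / total_count


# Hypothetical figures, for illustration only.
print(person_response_rate(responding_private=900,
                           responding_non_private=50,
                           imputed_persons=50))  # 95.0
```

Unlike the dwelling measure, people responding from non-private dwellings contribute to both the numerator and the denominator here.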

It is important to note that the quality of these response measures depends on how well the ABS has determined the occupancy of private dwellings on Census night. If unoccupied dwellings are incorrectly classified as non-responding occupied dwellings, the response rates will appear lower than they actually were. Conversely, if non-responding occupied dwellings are mistakenly categorised as unoccupied, the response rates will appear higher than they actually were.
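
The direction of this effect can be seen in a small worked example. The figures are invented and the helper name is hypothetical; the sketch only shows how a change in the occupied-dwelling denominator moves the rate.

```python
def response_rate(responding: int, occupied: int) -> float:
    """Responding dwellings as a percentage of dwellings treated as occupied."""
    return 100 * responding / occupied


# Hypothetical area: 900 responding dwellings out of 1,000 truly occupied.
true_rate = response_rate(900, 1_000)

# 50 unoccupied dwellings wrongly treated as non-responding occupied dwellings:
# the denominator grows, so the rate looks lower.
understated = response_rate(900, 1_050)

# 50 non-responding occupied dwellings wrongly treated as unoccupied:
# the denominator shrinks, so the rate looks higher.
overstated = response_rate(900, 950)

print(f"{understated:.1f} < {true_rate:.1f} < {overstated:.1f}")  # 85.7 < 90.0 < 94.7
```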

Item non-response rates are important for understanding the quality of individual data items. Item non-response is calculated as a percentage by dividing the number of households or people who did not provide a response to a particular question (item) by the number of households or people (including imputed people) for whom the question (item) would have been applicable.
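
A minimal sketch of an item non-response calculation follows, reading the rate as the share of applicable households or people with no response to the item. The data layout (one entry per applicable record, with None marking a missing response) and the function name are assumptions for illustration.

```python
from typing import Optional, Sequence


def item_non_response_rate(responses: Sequence[Optional[str]]) -> float:
    """Percentage of applicable records with no response to an item.

    `responses` holds one entry per household or person for whom the
    item was applicable; None marks a missing response.
    """
    if not responses:
        raise ValueError("no applicable records")
    missing = sum(1 for answer in responses if answer is None)
    return 100 * missing / len(responses)


# Hypothetical answers to a single item, for illustration only.
print(item_non_response_rate(["A", None, "B", "C"]))  # 25.0
```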

3.1.3 Consistency with other data sources

To assess the quality of key indicators and particular data items, 2021 Census data was compared against the equivalent data from the 2016 and 2011 Censuses. Differences were anticipated (and found) given 10 years of population growth and societal change, and particularly given that the 2021 Census was conducted during the COVID-19 pandemic. Analysis was therefore focussed on identifying unexpected and inexplicable differences that might point to data quality issues.

Comparisons were also made against the Estimated Resident Population as at June 2021. The Estimated Resident Population used for comparison was based on the 2016 Census updated using birth, death and migration data for the intercensal period.
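
Updating a base population with births, deaths and migration can be sketched with the standard demographic balancing equation. This is an illustrative simplification, not the ABS method, which is not described in this section and may involve further adjustments; the names and figures are hypothetical.

```python
def roll_forward_population(base_population: int,
                            births: int,
                            deaths: int,
                            net_migration: int) -> int:
    """Demographic balancing equation (simplified sketch):
    population = base + births - deaths + net migration."""
    return base_population + births - deaths + net_migration


# Hypothetical intercensal figures, for illustration only.
estimate = roll_forward_population(base_population=1_000,
                                   births=60,
                                   deaths=30,
                                   net_migration=20)
print(estimate)  # 1050
```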

The Panel also used trends in other related data sources to identify any trends in the Census data that might require further investigation. For the two new Census variables (Long-term health conditions and Australian Defence Force service), external data was used to assess their validity.
