Microdata and TableBuilder: Australian Census Longitudinal Dataset

The Australian Census Longitudinal Dataset (ACLD) uses Census of Population and Housing data to build a rich longitudinal view of Australian society

Introduction

The Census of Population and Housing is conducted every five years to measure the number of people and dwellings in Australia on Census Night. The Census is the most comprehensive snapshot of the country and tells the story of how we are changing. Census data tells us about the economic, social and cultural make-up of the country. 

The Australian Census Longitudinal Dataset (ACLD) uses data from the Census of Population and Housing to build a rich longitudinal picture of Australian society. The ACLD can uncover new insights into the dynamics and transitions that drive social and economic change over time, and how these vary for diverse population groups and geographies.

Four waves of data have so far contributed to the ACLD from the 2006 Census (wave 1), 2011 Census (wave 2), 2016 Census (wave 3) and 2021 Census (wave 4).

There are three ACLD panels, each representing a 5% sample of records from the 2006, 2011 or 2016 Census. Seven ACLD datasets have been generated, each originating from one of these panels.

Available ACLD datasets
| Panel | Dataset Name | Dataset Description |
| --- | --- | --- |
| 2006 panel | Australian Census Longitudinal Dataset, 2006-2011 | Contains the original 2006-2011 linkage with additional experimental social security and related variables. |
| 2006 panel | Australian Census Longitudinal Dataset, 2006-2011 (with visa variables) | Contains the original 2006-2011 linkage with three additional visa variables from the Department of Social Services' Settlement Database. |
| 2006 panel | Australian Census Longitudinal Dataset, 2006-2011-2016 | Contains a 2006-2011-2016 linkage, including an updated 2006-2011 linkage to take advantage of an improved linkage methodology since the initial release. |
| 2006 panel | Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | Contains a 2006-2011-2016-2021 linkage, including an updated 2006-2011 linkage to take advantage of an improved linkage methodology since the initial release. |
| 2011 panel | Australian Census Longitudinal Dataset, 2011-2016 | Contains a 2011-2016 linkage. |
| 2011 panel | Australian Census Longitudinal Dataset, 2011-2016-2021 | Contains a 2011-2016-2021 linkage. |
| 2016 panel | Australian Census Longitudinal Dataset, 2016-2021 | Contains a 2016-2021 linkage. |

 

This publication provides information about longitudinal microdata from the Census made available via different methods for analytical research. Microdata products contain the most detailed information available from the Census. They contain data which is either the response to individual questions on the Census form or derived from answers to two or more questions. 

This publication includes information on: 

  • the microdata products available 
  • the methodology 
  • the quality of the microdata 
  • how to apply for and use the microdata. 

Privacy

The ABS is given the authority to collect, hold and use personal information for Census and statistical purposes as legislated by the Australian Bureau of Statistics Act 1975 and the Census and Statistics Act 1905. Data are released under the Census and Statistics Act 1905, which has provision for the release of individual level records (unit records) where the information is not likely to enable the identification of a particular person or organisation. Census microdata products do not contain names or addresses, and each product has its own assessment processes and confidentiality measures applied to ensure it is sufficiently confidentialised.

Available products

Detailed microdata: allows in-depth analysis of detailed microdata within the ABS' secure DataLab environment. The ACLD datasets available in the DataLab are:

  • Australian Census Longitudinal Dataset, 2006-2011 (with visa variables)
  • Australian Census Longitudinal Dataset, 2006-2011-2016-2021
  • Australian Census Longitudinal Dataset, 2011-2016-2021
  • Australian Census Longitudinal Dataset, 2016-2021

TableBuilder: allows users to build tables based on underlying microdata. The ACLD datasets available in TableBuilder are:

  • Australian Census Longitudinal Dataset, 2006-2011 (with visa variables)
  • Australian Census Longitudinal Dataset, 2006-2011 (with experimental social security and related variables)
  • Australian Census Longitudinal Dataset, 2006-2011-2016
  • Australian Census Longitudinal Dataset, 2006-2011-2016-2021
  • Australian Census Longitudinal Dataset, 2011-2016
  • Australian Census Longitudinal Dataset, 2011-2016-2021
  • Australian Census Longitudinal Dataset, 2016-2021

Applying for access

Before applying for access, users should read the Responsible use of ABS microdata user guide to understand the obligations when using microdata. Additionally, TableBuilder users should read and familiarise themselves with the information contained in the TableBuilder User Guide.

The list of variables (also referred to as data items) included in each of the ACLD datasets is available for download under Data downloads.

To apply, see the TableBuilder and DataLab pages.

Data available on request

Data obtained in the Census but not contained in a Census data product may be available from the ABS, on request, as statistics in tabulated form. Subject to confidentiality and sampling variability constraints, special tabulations can be produced incorporating variables, populations and geographic areas selected to meet individual requirements. These are available on a fee for service basis.  

Enquiries should be submitted via an Information consultancy form.  

To view variables available for request, refer to either the current 2021 Census dictionary or the historical dictionaries from previous Census years.

Further information

Further information about the ACLD can be found on the ACLD page.

Further information about ABS statistical data integration is available on the ABS Data Integration page.

Methodology

Scope and coverage

The ACLD is a random 5% sample panel of people enumerated in Australia on each Census Night. The sample panels are linked to subsequent Censuses using statistical techniques. Four waves of data have contributed to the ACLD so far, from the 2006 (wave 1), 2011 (wave 2), 2016 (wave 3) and 2021 (wave 4) Censuses.

The Census covers all areas in Australia and includes people living in both private and non-private dwellings but excludes: 

  • diplomatic personnel of overseas governments and their families
  • Australian residents overseas on Census Night.

Visitors

Overseas visitors were excluded from the ACLD samples.  

The ACLD does include visitors to a household. These are people who were enumerated in a household they do not usually live in. Family information cannot be derived for these people and as such, all family, spouse, and male and female parent related variables are not applicable for these individuals. 

All dwelling related variables have been made applicable for visitors to a household. This information relates to their dwelling of enumeration on Census Night, not their usual residence. 

Most household variables are not applicable for visitors to a household, however for four variables, visitors have been included in order to align to standard Census derivations of that variable. These comprise: 

  • Total household income as stated (weekly) of household in which person was enumerated. 
  • Total household income (weekly) of household in which person was enumerated. 
  • Household income derivation indicator of household in which person was enumerated. 
  • Household composition of household in which person was enumerated. 

Any applicable household information for a visitor to a household relates to their place of enumeration, not usual residence. 

Where a variable is also applicable for visitors to a household, the usual address indicator variable for the relevant Census year can be used to restrict the table to usual residents only. 

For further information on place of enumeration and place of usual residence, see Comparing Place of enumeration with Place of usual residence.

The cell comments available in the data item lists provide precise information on who is, and is not, applicable for each variable. 

Persons temporarily absent on Census Night

The Census household form provides the opportunity to list up to three people who were temporarily absent from the dwelling on Census Night. A limited amount of information is collected for these people. This information is used to better derive the family and household characteristics of the dwelling. In deriving family and household related variables for the ACLD, information on persons temporarily absent was included where relevant and available. Details are provided in cell comments in the data item lists.

Further information 

For more information on the scope and coverage of the Census see: 

Sample design

Multi-panel sample method

The ACLD sample is maintained through the application of a multi-panel framework. This provides an approach for selecting records in the ACLD to create panels that maintain the longitudinal and cross-sectional representativeness of the dataset over time, while minimising the impact of accumulated linkage bias on longitudinal analysis. 

The multi-panel framework comprises multiple overlapping panels, with each dataset containing one panel of records representing a single Census population (i.e. 2006 or 2011 or 2016). Each Census year a panel is selected and linked to subsequent Censuses. The sample selection strategy for each panel is designed to maintain a linked sample size of 5%, maximise sample overlap between the panels, and introduce new records to each panel to account for new births, migrants and missed links in previous panels. This means that new births or migrants from 2007 would not have been added to the Australian Census Longitudinal Dataset, 2006-2011-2016-2021 dataset; however, they may have been selected in the 2011 panel and therefore included in the Australian Census Longitudinal Dataset, 2011-2016-2021 dataset. This allows flexibility for users to draw on the most appropriate panel for their research question.

For further information on the multi-panel framework refer to Information Paper: Australian Census Longitudinal Dataset, Methodology and Quality Assessment, 2006-2016.

Sample maintenance

Without sample maintenance, the ACLD would decline in its ability to accurately reflect the Australian population over time, due to:

  • people newly in scope of the ACLD (i.e. children born and immigrants who arrived in Australia since the previous Census) not being represented in the sample,
  • people no longer being in scope due to death or overseas migration, and
  • missing and/or incorrect links.

The 2006 panel sample was 4.9%. This achieved a linked sample size of 3.8% of the population after missed links and people no longer being in scope due to death or overseas migration. The 2006 panel sample of 979,662 records from the 2006 Census was linked to the 2011 Census, resulting in a linked sample size of 756,945 records at a linkage rate of 77.3%. From the 2006 panel sample, 605,618 records linked to both the 2011 and 2016 Censuses, and 501,941 records linked to all three subsequent Censuses (2011, 2016 and 2021).

The 2011 panel sample was increased slightly to 5.7% to achieve a linked sample size of no greater than 5% of the population after allowing for missed links and people no longer being in scope. The 2011 panel sample of 1,221,059 records from the 2011 Census was linked to the 2016 Census, resulting in a linked sample size of 927,517 records at a linkage rate of 76.0%. This achieved a linked sample size of 4.3% of the population. From the 2011 panel sample, 814,337 records linked to both the 2016 and 2021 Censuses.

The 2016 panel sample was 5.6%. The 2016 panel sample of 1,308,274 records from the 2016 Census was linked to the 2021 Census, resulting in a linked sample size of 1,088,307 records at a linkage rate of 83.2%. This achieved a linked sample size of 4.7% of the population after allowing for missed links and people no longer being in scope.

In each case the linkage sample size decrease was due to missed links and people no longer being in scope due to death or overseas migration.
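The relationship between the figures above can be checked with a little arithmetic. The following is a minimal sketch, not an ABS method: it recomputes each linkage rate from the record counts quoted in this section, and approximates the linked sample size as the panel sampling fraction multiplied by the linkage rate. The function and dictionary names are illustrative only.

```python
# Panel sizes, linked counts and sampling fractions quoted in this section.
PANELS = {
    "2006": {"panel": 979_662, "linked": 756_945, "sample_pct": 4.9},
    "2011": {"panel": 1_221_059, "linked": 927_517, "sample_pct": 5.7},
    "2016": {"panel": 1_308_274, "linked": 1_088_307, "sample_pct": 5.6},
}


def linkage_rate(linked: int, panel: int) -> float:
    """Share of panel records linked to the next Census, as a percentage."""
    return 100.0 * linked / panel


def linked_sample_pct(sample_pct: float, rate_pct: float) -> float:
    """Approximate linked sample size as a percentage of the population."""
    return sample_pct * rate_pct / 100.0


for year, p in PANELS.items():
    rate = linkage_rate(p["linked"], p["panel"])
    linked_pct = linked_sample_pct(p["sample_pct"], rate)
    print(f"{year} panel: linkage rate {rate:.1f}%, linked sample about {linked_pct:.1f}%")
```

Rounded to one decimal place, these calculations agree with the linkage rates and linked sample sizes reported above.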

ACLD multi-panel linkage

The 2006 panel sample is 4.9% (979,662 person records). This achieved a linked sample size of 3.8% to 2011 records: 77.3% (756,945 person records) of the 2006 panel sample was linked to 2011 records. From the 2006 panel sample, 61.8% (605,618 person records) linked to both the 2011 and 2016 Censuses, which is 3.1% of the 2006 population, and 51.2% (501,941 person records) linked to the three (2011, 2016 and 2021) Censuses, which is 2.5% of the 2006 population.

The 2011 panel sample is 5.7% (1,221,059 person records). This achieved a linked sample size of 4.3% to 2016 records: 76.0% (927,517 person records) of the 2011 panel sample was linked to 2016 records. From the 2011 panel sample, 66.7% (814,337 person records) linked to both the 2016 and 2021 Censuses, which is 3.8% of the 2011 population.

The 2016 panel sample is 5.6% (1,308,274 person records). This achieved a linked sample size of 4.7% to 2021 records: 83.2% (1,088,307 person records) of the 2016 panel sample was linked to 2021 records, which is 4.7% of the 2016 population.

696,356 person records overlapped between the 2006 and 2011 panel samples. 817,915 person records overlapped between 2011 and 2016 panel samples.

Linking methodology

The ACLD products in this release have been produced by making use of the following linkages: 

  • 2006 Census ACLD sample to 2011 Census
  • 2011 Census ACLD sample to 2016 Census
  • 2006-2011 ACLD sample to 2016 Census
  • 2016 Census to June 2019 Person Linkage Spine 
  • 2021 Census to June 2022 Person Linkage Spine 

ACLD sample linkages

ACLD sample linkages between each Census year were undertaken using a mix of deterministic and probabilistic linkage methodologies. 

Deterministic linkage methodology uses pre-defined rules to find unique matches between datasets. Matching rules may be gradually broadened to tolerate differences between datasets. 
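The idea of rules that are gradually broadened can be sketched in a few lines. This is an illustrative sketch only, not the ABS linkage implementation: the linkage fields (`dob`, `sex`, `birthplace`, `area`), the rule order and the function names are all assumptions made for the example. Each pass accepts a link only when a key is unique in both files, and later passes use fewer fields so that they tolerate differences on the dropped field.

```python
from collections import defaultdict

# Hypothetical rule passes, strictest first. Field names are assumptions.
RULES = [
    ("dob", "sex", "birthplace", "area"),
    ("dob", "sex", "birthplace"),  # tolerates a change of area
]


def _key(rec, rule):
    """Build a match key; return None if any field in the rule is missing."""
    key = tuple(rec.get(field) for field in rule)
    return None if None in key else key


def deterministic_link(file_a, file_b, rules=RULES):
    """Link records over successive passes, accepting only unique matches."""
    links = {}
    used_b = set()
    for rule in rules:
        index_a, index_b = defaultdict(list), defaultdict(list)
        for rec in file_a:
            key = _key(rec, rule)
            if key is not None and rec["id"] not in links:
                index_a[key].append(rec["id"])
        for rec in file_b:
            key = _key(rec, rule)
            if key is not None and rec["id"] not in used_b:
                index_b[key].append(rec["id"])
        for key, ids_a in index_a.items():
            ids_b = index_b.get(key, [])
            if len(ids_a) == 1 and len(ids_b) == 1:
                links[ids_a[0]] = ids_b[0]
                used_b.add(ids_b[0])
    return links


earlier = [
    {"id": "a1", "dob": "1980-02-01", "sex": "F", "birthplace": "AU", "area": "X1"},
    {"id": "a2", "dob": "1975-07-15", "sex": "M", "birthplace": "NZ", "area": "X2"},
]
later = [
    {"id": "b1", "dob": "1980-02-01", "sex": "F", "birthplace": "AU", "area": "X1"},
    {"id": "b2", "dob": "1975-07-15", "sex": "M", "birthplace": "NZ", "area": "X9"},
]
links = deterministic_link(earlier, later)
```

In this toy example the first record pair links on the strict pass, while the second pair links only on the broader pass because the area differs.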

Probabilistic linkage methodology allows for links to be assigned despite missing or incomplete data, provided there is enough agreement between linkage variables to offset any disagreement. Probabilistic linkage produces a linkage weight, which is a numerical measure that shows how well records match.
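One common way to form such a weight is the Fellegi-Sunter approach, in which each linkage variable adds a log-ratio for agreement or disagreement and missing values add nothing. The source does not state which model was used, so the sketch below is an assumption, and the field names and m/u probabilities are invented for illustration.

```python
import math

# Hypothetical probabilities per field:
#   m = P(fields agree | records belong to the same person)
#   u = P(fields agree | records belong to different people)
FIELD_PROBS = {
    "dob": (0.95, 0.001),
    "sex": (0.98, 0.5),
    "area": (0.70, 0.01),
}


def linkage_weight(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum per-field log2 agreement/disagreement weights for a record pair."""
    total = 0.0
    for field, (m, u) in probs.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # missing data contributes no evidence either way
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


same_moved = linkage_weight(
    {"dob": "1980-02-01", "sex": "F", "area": "X1"},
    {"dob": "1980-02-01", "sex": "F", "area": "X9"},
)
different = linkage_weight(
    {"dob": "1980-02-01", "sex": "F", "area": "X1"},
    {"dob": "1991-11-30", "sex": "M", "area": "X9"},
)
```

Here agreement on date of birth and sex outweighs the disagreement on area, so `same_moved` is positive, while `different` is negative. In practice a threshold on the weight would decide which pairs are accepted as links.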

There are two main reasons why some records were not linked across Census files: 

  • Records belonging to the same individual were present at both time points, but these records failed to be linked because they contained missing or inconsistent information. 
  • The person had no record in the later Census. 

Census to ABS Person Spine linkages

The 2016 and 2021 Censuses were both linked to the ABS Person Linkage Spine, which is the linkage infrastructure that underpins the Person-Level Integrated Data Asset (PLIDA).  

The Spine aims to cover all people who were resident in Australia for the given reference period. It is updated annually, using the following administrative datasets:  

  • Medicare Consumer Directory
  • DOMINO Centrelink Administrative Data
  • Personal Income Tax Client Register. 

For further information about the Spine see Person linkage spine.

For the 2016 Census, 93.68% of records linked to the June 2019 Spine. For the 2021 Census, 96.26% of records linked to the June 2022 Spine. 

There are two main reasons why some records were not linked to the Spine: 

  • Records belonging to the same individual were present on both the Spine and the Census, but they contained missing or inconsistent information. It should be noted that the Spine contains longitudinal linkage data to mitigate this issue. 
  • The individual who completed the Census was not present on the Spine. The Spine aims to cover all persons who are resident in Australia for a given reference period, but it does not have perfect population coverage.

Linkage results

Weighting, benchmarking and estimation

Weighting is the process of adjusting a sample to infer results for the relevant population. To do this, a 'weight' is allocated to each sample unit - in this case, persons. The weight can be considered an indication of how many people in the relevant population are represented by each person in the sample. Weights were created for linked records in the ACLD to enable longitudinal population estimates to be produced.

Each panel of the ACLD is a random 5% sample of persons enumerated in Australia on Census Night. As such, each person in the sample should represent about 20 people in the Australian population. Between Censuses, however, the Australian population in scope of the ACLD changes as people die or move overseas. In addition, Census net undercount and data quality can affect the capacity to link equivalent records across waves.

The ACLD weights benchmark the linked records to the estimated Australian in scope population. The weights were based on four components: the design weight, undercoverage adjustment, missed link adjustment and population benchmarking.

Weights were benchmarked to the following population groups:

  • State/territory by age groups (0-14 and ten-year groups to 85+) by sex, and
  • Indigenous status by state/territory.
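As a rough illustration of what benchmarking does, the sketch below scales weights so that they sum to a known population total within each benchmark cell (simple post-stratification). This is an assumption made for illustration: the ACLD weights are benchmarked to two sets of groups and include other adjustment components, which this single-step sketch does not reproduce. Field names and totals are invented.

```python
import math
from collections import defaultdict


def benchmark_weights(records, benchmarks, cell_fields, weight_field="weight"):
    """Scale weights so each benchmark cell sums to its population total."""
    cell_sums = defaultdict(float)
    for rec in records:
        cell_sums[tuple(rec[f] for f in cell_fields)] += rec[weight_field]
    adjusted = []
    for rec in records:
        cell = tuple(rec[f] for f in cell_fields)
        factor = benchmarks[cell] / cell_sums[cell]
        adjusted.append({**rec, weight_field: rec[weight_field] * factor})
    return adjusted


# Illustrative linked records, each starting from a design weight of 20.
sample = [
    {"state": "NSW", "age_group": "0-14", "sex": "F", "weight": 20.0},
    {"state": "NSW", "age_group": "0-14", "sex": "F", "weight": 20.0},
    {"state": "NSW", "age_group": "0-14", "sex": "M", "weight": 20.0},
    {"state": "NSW", "age_group": "0-14", "sex": "M", "weight": 20.0},
]
# Invented population totals for each state by age group by sex cell.
totals = {("NSW", "0-14", "F"): 50.0, ("NSW", "0-14", "M"): 44.0}
benchmarked = benchmark_weights(sample, totals, ("state", "age_group", "sex"))
```

After the adjustment, the weights in each cell add up to the cell's benchmark total, which is the property that benchmarking is intended to deliver.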

Estimates of population groups are obtained by summing the weights of persons with the characteristic(s) of interest.
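Summing weights can be expressed very compactly. The sketch below is illustrative only: the record layout and the field names `weight`, `sex` and `state` are assumptions, not ACLD variable names.

```python
def weighted_estimate(records, predicate, weight_field="weight"):
    """Estimate a population count by summing weights of matching records."""
    return sum(rec[weight_field] for rec in records if predicate(rec))


# Illustrative linked records with invented weights.
people = [
    {"sex": "F", "state": "VIC", "weight": 21.5},
    {"sex": "F", "state": "VIC", "weight": 19.0},
    {"sex": "M", "state": "VIC", "weight": 24.5},
    {"sex": "F", "state": "TAS", "weight": 30.0},
]

vic_females = weighted_estimate(
    people, lambda r: r["sex"] == "F" and r["state"] == "VIC"
)
sample_count = sum(1 for r in people if r["sex"] == "F" and r["state"] == "VIC")
```

In this toy data two sample records represent an estimated 40.5 people, which shows why counting sample records is not a substitute for summing weights.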

Carefully consider which dataset and weight are most appropriate for your analysis based on the end point and in-scope population of your research. Multiple weights are available in the ACLD detailed microdata products; however, only one weight is available on each of the ACLD TableBuilder datasets. For further information see Using the ACLD in TableBuilder.

For further information about ACLD weighting and estimation refer to Information Paper: Australian Census Longitudinal Dataset, Methodology and Quality Assessment, 2006-2016 (cat. no. 2080.5).

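| Dataset Name | Weight Scope (a) | Population Benchmark (b)(c) | Weights Mean Value Males | Weights Mean Value Females | Minimum Weight Value | Maximum Weight Value |
| --- | --- | --- | --- | --- | --- | --- |
| Australian Census Longitudinal Dataset, 2006-2011 | 2006-2011 (original linkage) | Adjusted 2011 ERP | 24.8 | 24.0 | 17.1 | 103.4 |
| Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2011 (re-linkage) | Adjusted 2011 ERP | 26.6 | 25.0 | 16.1 | 176.9 |
| Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2011-2016 | Adjusted 2016 ERP | 31.5 | 29.4 | 15.9 | 341.3 |
| Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2011-2016-2021 | Adjusted 2021 ERP | 36.2 | 33.4 | 16.8 | 602.8 |
| Australian Census Longitudinal Dataset, 2011-2016-2021 | 2011-2016 | Adjusted 2016 ERP | 23.2 | 22.3 | 14.8 | 83 |
| Australian Census Longitudinal Dataset, 2011-2016-2021 | 2011-2016-2021 | Adjusted 2021 ERP | 25.1 | 23.8 | 14.6 | 192.4 |
| Australian Census Longitudinal Dataset, 2016-2021 | 2016-2021 | Adjusted 2021 ERP | 21.4 | 20.6 | 14.3 | 123.5 |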
  (a) The weight scope refers to the sub-population of linked records across the different Census years. Each of these sub-populations has been weighted up to population counts.
  (b) ERP = Estimated Resident Population. The end of June ERP was selected for each Census night.
  (c) The ERP was adjusted by the estimated probability to cover the longitudinal population in scope.

In all ACLD datasets the mean weight was higher for people of Aboriginal and Torres Strait Islander origin and for people in the Northern Territory. For the 2006-2011 original linkage the mean weight was higher for people who moved interstate between 2006 and 2011.

Sources of error

All reasonable attempts have been made to ensure the accuracy of the longitudinal dataset. Nevertheless, potential sources of error, including sampling, linking and Census quality error, should be kept in mind when interpreting the results.

Sampling error

Sampling error occurs because only a small proportion of the total population is used to produce estimates that represent the whole population. Sampling error refers to the fact that for a given sample size, each sample will produce different results, which will usually not be equal to the population value.

There are two common ways of reducing sampling error - increasing sample size and/or utilising an appropriate selection method (for example, multi-stage sampling would be appropriate for household surveys). Given the large sample size for the ACLD (1 in 20 persons), and simple random selection, sampling error is minimal.
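To give a feel for the size of sampling error at this scale, the sketch below computes the textbook standard error of an estimated proportion under simple random sampling, with a finite population correction. It is a simplified illustration under stated assumptions: it ignores weighting, missed links and Census quality error, and the sample size of one million is a round illustrative figure rather than a quoted ACLD count.

```python
import math


def proportion_standard_error(p: float, n: int, sampling_fraction: float) -> float:
    """Standard error of a proportion under simple random sampling.

    Applies the finite population correction (1 - sampling_fraction).
    """
    return math.sqrt(p * (1.0 - p) / n * (1.0 - sampling_fraction))


# Illustrative inputs: a characteristic held by 10% of people,
# a sample of one million records, and a 1 in 20 sampling fraction.
se = proportion_standard_error(p=0.10, n=1_000_000, sampling_fraction=0.05)
print(f"standard error: {se:.5f}")
```

Under these assumptions the standard error is a small fraction of a percentage point. Estimates for small sub-populations rest on far fewer records and would have correspondingly larger standard errors.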

Managing Census quality

The ABS aims to produce high quality data from the Census. To achieve this, extensive effort is put into Census form design, collection procedures and processing procedures.

There are four principal sources of error in Census data: respondent error, processing error, partial response and undercount. Quality management of the Census program aims to reduce error as much as possible, and to provide a measure of the remaining error to data users, to allow them to use the data in an informed way.

For information on the quality of 2021 Census data see Managing Census quality, and for historic Censuses see Data Quality.  

The 2021 Census Statistical Independent Assurance Panel concluded that the 2021 Census data is fit-for-purpose, is of comparable quality to the 2011 and 2016 Censuses and can be used with confidence. For further information see Report on the quality of 2021 Census data: Statistical Independent Assurance Panel to the Australian Statistician.

Quality indicators

The ACLD contains several variables that relate to the quality of linkage and have been collectively named quality indicators. The first of these are consistency flags. These variables measure the consistency of reporting on linked records between 2006 and 2011, 2011 and 2016, and 2016 and 2021. Consistency flags have been created for Census variables that would not be expected to change over time or have unlikely transitions over time. These are as follows: 

  • Age 
  • Sex 
  • Birthplace of Person 
  • Birthplace of Spouse or Partner 
  • Birthplace of Female Parent 
  • Birthplace of Male Parent 
  • Year of Arrival 
  • Indigenous Status 
  • Registered Marital Status 
  • Highest Year of School Completed 
  • Level of Highest Non-School Qualification 
  • Number of Children Ever Born.

Consistency flags can be used with other variables. For example, age inconsistency can be cross tabulated with sex to examine potential sex differences in the reporting of age. 

In addition to the consistency flags, a "Record linked in YEAR" flag is also available. This flag can be cross tabulated with another data item to examine linkage rates (that is, the proportion of records linked). For example, cross tabulating the record linked flag with State/Territory of usual residence enables an examination of differences in linkage rates between the states and territories. 
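The cross tabulation described above amounts to counting linked and total records within each group. The following sketch shows the calculation on unit records; the field names `state` and `linked` are hypothetical stand-ins for the State/Territory variable and the record linked flag. Unweighted counts are used because weights exist only for linked records.

```python
from collections import Counter


def linkage_rates(records, group_field="state", linked_field="linked"):
    """Return the proportion of records linked within each group."""
    totals, linked = Counter(), Counter()
    for rec in records:
        totals[rec[group_field]] += 1
        if rec[linked_field]:
            linked[rec[group_field]] += 1
    return {group: linked[group] / totals[group] for group in totals}


# Illustrative panel records with an invented linked flag.
panel = [
    {"state": "NSW", "linked": True},
    {"state": "NSW", "linked": True},
    {"state": "NSW", "linked": False},
    {"state": "NSW", "linked": True},
    {"state": "NT", "linked": True},
    {"state": "NT", "linked": False},
]
rates = linkage_rates(panel)
```

With this toy data the result is 0.75 for NSW and 0.5 for NT.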

Data consistency

The ACLD is a longitudinal dataset using data from successive Censuses. 

While the Censuses had predominantly the same questions and were processed in a similar way, there were some differences between them. For example, several changes were made to how industry of employment information was collected for the 2016 Census. The ABS advises that this data is not directly comparable to 2011 industry data and should not be used to measure longitudinal transitions. For further information refer to Industry of Employment (INDP) in Census of Population and Housing: Understanding the Census and Census Data, Australia, 2016.

Other variables that are different between Census years are personal, family and household income. Income was collected in ranges and these ranges are different in different Census years. The ACLD does not include an adjustment to income data for inflation. 

Users are encouraged to read the Census dictionary variable pages to understand Census variables, concepts, and changes over time. See the 2021 Census dictionary or the historical dictionaries from previous Census years. For additional useful information about the quality of the variables in the ACLD see Quality Declaration.

A small percentage of linked records have inconsistent data, such as a different country of birth at the two time points or an age inconsistency of more than one year (when the expected five year difference is accounted for). Inconsistencies may be due to: 

  • false link - the record pair does not belong to the same individual 
  • reporting error - information for the same individual was reported differently at different time points 
  • processing error - the value of a variable was inaccurately assigned or imputed during processing. 
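The age check described above can be written as a one-line rule. This is a minimal sketch with assumed parameter names: it subtracts the expected gap between Censuses from the reported age difference and flags the pair when the remainder exceeds one year.

```python
def age_inconsistent(age_earlier: int, age_later: int,
                     years_between: int = 5, tolerance: int = 1) -> bool:
    """Flag a linked pair whose ages differ from the expected gap by more than the tolerance."""
    return abs((age_later - age_earlier) - years_between) > tolerance


# A person aged 30 at one Census is expected to be about 35 at the next.
checks = {
    (30, 35): age_inconsistent(30, 35),  # exactly as expected
    (30, 36): age_inconsistent(30, 36),  # one year out, within tolerance
    (30, 37): age_inconsistent(30, 37),  # two years out, flagged
    (30, 33): age_inconsistent(30, 33),  # two years short, flagged
}
```

For links spanning more than one Census, `years_between` would be set to 10 or 15 as appropriate.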

In most analyses, inconsistent information is likely to have only a small impact. Characteristics from the 2006, 2011, 2016 or the 2021 data can be used in tables and some exploration of consistency over time will assist in drawing appropriate conclusions. 

No data editing was applied to the file beyond that which had already taken place during the relevant Census processing period. 

There are numerous ways to define 'consistency'. The consistency flags have fine level categories to allow users flexibility in using their own definition of 'consistent' or 'inconsistent'. For example, where one Census has 'not stated' for the year of arrival variable, a user can decide whether the record should be considered consistent or not. The same applies to where the response for one Census is 'not applicable'. The labels attached to each category suggesting consistency or inconsistency will assist the user in determining which records are consistent or inconsistent for their needs.
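A user-defined definition of consistency can be expressed as a small mapping from flag categories to a yes/no decision. The category labels below are hypothetical placeholders, not the actual labels on the ACLD consistency flags; the point of the sketch is only that the treatment of 'not stated' and 'not applicable' is a choice made by the analyst.

```python
# Hypothetical category labels; the real categories are in the data item lists.
CONSISTENT = "consistent"
INCONSISTENT = "inconsistent"
NOT_STATED = "not stated in one Census"
NOT_APPLICABLE = "not applicable in one Census"


def is_consistent(flag: str, not_stated_ok: bool = True,
                  not_applicable_ok: bool = True) -> bool:
    """Collapse a fine-level consistency category to a user-defined boolean."""
    if flag == CONSISTENT:
        return True
    if flag == INCONSISTENT:
        return False
    if flag == NOT_STATED:
        return not_stated_ok
    if flag == NOT_APPLICABLE:
        return not_applicable_ok
    raise ValueError(f"unknown consistency category: {flag!r}")


flags = [CONSISTENT, NOT_STATED, INCONSISTENT, NOT_APPLICABLE]
lenient = [is_consistent(f) for f in flags]
strict = [is_consistent(f, not_stated_ok=False, not_applicable_ok=False) for f in flags]
```

Running an analysis under both a lenient and a strict definition is one way to see how sensitive the results are to that choice.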

Proportion of linked records with inconsistent data

ACLD 2006-2011

Product overview

Two ACLD 2006-2011 datasets contain the original linkage that was released in 2013. These are:

  • Australian Census Longitudinal Dataset, 2006-2011 (with visa variables)
  • Australian Census Longitudinal Dataset, 2006-2011 (with experimental social security and related variables)

Both of these datasets are available in TableBuilder; however, only the Australian Census Longitudinal Dataset, 2006-2011 (with visa variables) is available as a detailed microdata product within the DataLab. 

The 2006-2011 ACLD is a representative sample of almost one million records from the 2006 Census (wave 1) linked with corresponding records from the 2011 Census (wave 2). 

New waves of Census data will not be added to these datasets but they will be retained and are recommended for analysis of visa class or social security information between 2006-2011.

Variables

ACLD 2006-2011-2016-2021

Product overview

The 2006-2011-2016-2021 ACLD is a representative panel sample of almost one million records from the 2006 Census (wave 1) brought together with corresponding records from the 2011 Census (wave 2), 2016 Census records (wave 3), and 2021 Census records (wave 4). 

The 2006 panel sample of records was originally linked to the 2011 Census and released in 2013. A new linkage between the 2006 panel and 2011 Census records was used for this dataset to take advantage of improved linking methodology since the initial release. The linked 2011 records were then subsequently linked to records from the 2016 Census, and 2016 linked records then linked to 2021 Census records. 

The 2006-2011-2016 ACLD dataset in TableBuilder is the precursor to this dataset. As each TableBuilder dataset can only contain one weight variable, the 2006-2011-2016 ACLD TableBuilder dataset will be retained for analysis of the 2006-2016 population with the weights designed for this linked population. 

Three weight variables, each designed for the different linked populations, are available on the 2006-2011-2016-2021 ACLD detailed microdata dataset. 

The 2006-2011-2016-2021 ACLD dataset is recommended for analysis of the 2006-2011, 2006-2016 and 2006-2021 longitudinal populations. 

Variables

ACLD 2011-2016-2021

Product overview

The 2011-2016-2021 ACLD is a representative sample of over 1.2 million records from the 2011 Census (wave 2) brought together with corresponding records from the 2016 Census (wave 3) and 2021 Census records (wave 4). The 2011 panel includes new births and migrants since the 2006 Census.

The 2011 panel sample of records has been linked to records from the 2016 Census and 2016 linked records then subsequently linked to 2021 Census records. 

The 2011-2016 ACLD dataset in TableBuilder is the precursor to this dataset. As each TableBuilder dataset can only contain one weight variable, the 2011-2016 ACLD TableBuilder dataset will be retained for analysis of the 2011-2016 population with the weights designed for this linked population. 

Two weight variables, each designed for the different linked populations, are available on the 2011-2016-2021 ACLD detailed microdata dataset. 

The 2011-2016-2021 ACLD dataset is recommended for analysis of the 2011-2016 and 2011-2021 longitudinal populations. 

Variables

ACLD 2016-2021

Product overview

The 2016-2021 ACLD is a representative sample of 1.3 million records from the 2016 Census (wave 3) brought together with corresponding records from the 2021 Census (wave 4). The 2016 panel includes new births and migrants since the 2011 Census (and therefore the 2011 panel sample). 

The 2016-2021 ACLD product is recommended for analysis of the 2016-2021 longitudinal population.

Variables

Using the ACLD in the DataLab

The DataLab is an interactive data analysis solution available for high end users to run advanced multivariate statistical analyses, for example, multiple regressions and structural equation modelling. Controls in the DataLab have been put in place to protect against the identification of individuals and organisations. All output from DataLab sessions is cleared by an ABS officer before it is released.  

For more information about the DataLab please see DataLab.

Counting units and weights

Weighting is the process of adjusting results from a sample to infer results for the total population. To do this, a weight is allocated to each person. The weight is the value that indicates how many population units are represented by the sample unit. 

When producing estimates of sub-populations from the detailed microdata, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.

There are multiple weights available in the detailed microdata datasets which contain data from more than two Census periods. You should use the weight that is most appropriate for your analysis based on the end point and the in-scope population of your research. 

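| Dataset Name | Analysis Period | Weight Mnemonic |
| --- | --- | --- |
| Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2011 | WEIGHT4_06_11 |
| Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2016 | WEIGHT4_06_11_16 |
| Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2021 | WEIGHT4_06_11_16_21 |
| Australian Census Longitudinal Dataset, 2011-2016-2021 | 2011-2016 | WEIGHT4_11_16 |
| Australian Census Longitudinal Dataset, 2011-2016-2021 | 2011-2021 | WEIGHT4_11_16_21 |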
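The table above can be encoded as a lookup so that the weight is chosen from the analysis period rather than typed by hand. This is a sketch under assumptions: the weight mnemonics come from the table, but the short dataset keys, the record layout, and the convention that records not linked for a period carry a missing (`None`) weight are illustrative choices, not documented DataLab behaviour.

```python
# Weight mnemonics from the table above, keyed by (dataset, analysis period).
WEIGHT_MNEMONICS = {
    ("2006-2011-2016-2021", "2006-2011"): "WEIGHT4_06_11",
    ("2006-2011-2016-2021", "2006-2016"): "WEIGHT4_06_11_16",
    ("2006-2011-2016-2021", "2006-2021"): "WEIGHT4_06_11_16_21",
    ("2011-2016-2021", "2011-2016"): "WEIGHT4_11_16",
    ("2011-2016-2021", "2011-2021"): "WEIGHT4_11_16_21",
}


def weight_for(dataset: str, period: str) -> str:
    """Return the weight mnemonic for a dataset and analysis period."""
    try:
        return WEIGHT_MNEMONICS[(dataset, period)]
    except KeyError:
        raise ValueError(f"no weight for {dataset!r} over {period!r}") from None


def weighted_total(records, dataset: str, period: str) -> float:
    """Sum the period-appropriate weight, skipping records with no weight."""
    name = weight_for(dataset, period)
    return sum(rec[name] for rec in records if rec.get(name) is not None)


# Illustrative records: the second person was not linked in 2016.
rows = [
    {"WEIGHT4_11_16": 22.0, "WEIGHT4_11_16_21": 25.0},
    {"WEIGHT4_11_16": None, "WEIGHT4_11_16_21": None},
    {"WEIGHT4_11_16": 23.5, "WEIGHT4_11_16_21": None},
]
total_2016 = weighted_total(rows, "2011-2016-2021", "2011-2016")
total_2021 = weighted_total(rows, "2011-2016-2021", "2011-2021")
```

Using the weight that matches the end point of the analysis keeps estimates tied to the population benchmark intended for that linked population.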

Using the ACLD in TableBuilder

TableBuilder user guide

The TableBuilder User Guide is a comprehensive reference guide for the web interface of TableBuilder. It includes information on building and working with tables, customising data, understanding the results, and confidentiality processes.

Counting units and weights

Weighting is the process of adjusting results from a sample to infer results for the total population. To do this, a weight is allocated to each linked person. The weight is the value that indicates how many population units are represented by the sample unit.

Both the sample and weighted count options have been made available for the ACLD. It is therefore critical that weighted or unweighted counts are selected as appropriate when specifying tables. Weights have only been created for, and applied to, linked records in the ACLD to enable longitudinal population estimates to be produced. The following image shows the available Summation Options.

Image: Screen shot from TableBuilder showing Summation Options.

The default option used for the ACLD is weighted count. Weights should be used when making inferences about the longitudinal Australian population and will be the basis for most analyses. The weight applied in each ACLD TableBuilder dataset has been generated to enable analysis for the full longitudinal period. For example, in the 2006-2011-2016-2021 ACLD dataset, the weighted population is people from the 2006 panel sample who have been linked in 2011, 2016 and 2021.

Carefully consider which dataset is most appropriate for your analysis based on the end point and the in-scope population of your research. Further information on the weight scope applied to the different TableBuilder datasets can be found in the table below.

Dataset Name | Weight Scope (a) | Population Benchmark (b)(c)
Australian Census Longitudinal Dataset, 2006-2011 | 2006-2011 (original linkage) | Adjusted 2011 ERP
Australian Census Longitudinal Dataset, with Social Security and Related Information, experimental statistics, 2006-2011 | 2006-2011 (original linkage) | Adjusted 2011 ERP
Australian Census Longitudinal Dataset, 2006-2011-2016 | 2006-2016 | Adjusted 2016 ERP
Australian Census Longitudinal Dataset, 2006-2011-2016-2021 | 2006-2021 | Adjusted 2021 ERP
Australian Census Longitudinal Dataset, 2011-2016 | 2011-2016 | Adjusted 2016 ERP
Australian Census Longitudinal Dataset, 2011-2016-2021 | 2011-2021 | Adjusted 2021 ERP
Australian Census Longitudinal Dataset, 2016-2021 | 2016-2021 | Adjusted 2021 ERP
  (a) The weight scope in ACLD TableBuilder datasets refers to records which have been linked across all Census years available within the dataset. This is the sub-population which has been weighted up to population counts.
  (b) ERP = Estimated Resident Population. The end of June ERP was selected for each Census night.
  (c) The ERP was adjusted by the estimated probability to cover the longitudinal population in scope.
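
The weight scope described in the first note can be pictured as a filter over panel records. The sketch below is illustrative only: the record layout, the `linked` flags and the function name `in_weight_scope` are hypothetical and do not reflect how the ACLD stores linkage information.

```python
# Hypothetical 2006 panel records. For each later Census year, True means
# the record was linked to that Census. All values are invented.
panel_2006 = [
    {"id": 1, "linked": {2011: True, 2016: True, 2021: True}},
    {"id": 2, "linked": {2011: True, 2016: False, 2021: True}},
    {"id": 3, "linked": {2011: True, 2016: True, 2021: False}},
]


def in_weight_scope(record, census_years):
    """Return True if the record is linked in every listed Census year."""
    return all(record["linked"].get(year, False) for year in census_years)


# Scope for a 2006-2021 weight: linked in 2011, 2016 and 2021.
scope_2006_2021 = [
    r["id"] for r in panel_2006 if in_weight_scope(r, [2011, 2016, 2021])
]

# Scope for a 2006-2016 weight: linked in 2011 and 2016.
scope_2006_2016 = [
    r["id"] for r in panel_2006 if in_weight_scope(r, [2011, 2016])
]
```

In this toy data a record can fall inside the scope of a shorter analysis period but outside the scope of a longer one, which is why the end point of the research matters when choosing a dataset.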

Uses for unweighted counts are generally limited to research into unlinked records and more sophisticated analysis for those seeking to understand the weighting methodology better or wishing to apply their own weighting methods.

Excluding unlinked records in TableBuilder

When using the weighted summation option in TableBuilder, no results will be returned for unlinked records, as weights were not applied to these records. Results including unlinked records will only be returned if analysis is performed on unweighted data.

To exclude unlinked records from your analysis, deselect the "Unlinked record" category in each data item before adding it to the table. The resulting unweighted table gives the sample counts that correspond to the equivalent table run with weights. Refer to the TableBuilder User Guide for more information on how to select data items for tables.

If the "Unlinked record" category is present on a data item that has already been added to a table, it can be removed by selecting this category within the relevant data item and then pressing the "Remove from Table" button.

Image: Screen shot from TableBuilder showing an example of deselection of the "unlinked record" category.
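
The relationship between a weighted table and the unweighted table with unlinked records excluded can be sketched in Python as follows. This is not how TableBuilder is implemented; the records, the field name `labour_force` and the use of `None` for a missing weight are assumptions made for illustration.

```python
# Hypothetical records: linked records carry a weight, unlinked ones do not.
records = [
    {"labour_force": "Employed", "weight": 20.0},
    {"labour_force": "Employed", "weight": 22.0},
    {"labour_force": "Not in the labour force", "weight": 19.0},
    {"labour_force": "Unlinked record", "weight": None},
    {"labour_force": "Unlinked record", "weight": None},
]


def tabulate(rows, key, weighted):
    """Build a one-way table of weighted or unweighted counts."""
    table = {}
    for row in rows:
        if weighted:
            if row["weight"] is None:
                continue  # unweighted (unlinked) records contribute nothing
            value = row["weight"]
        else:
            value = 1
        table[row[key]] = table.get(row[key], 0) + value
    return table


weighted_table = tabulate(records, "labour_force", weighted=True)
sample_table = tabulate(records, "labour_force", weighted=False)

# Deselecting the "Unlinked record" category from the unweighted table
# leaves the sample counts behind each cell of the weighted table.
linked_sample = {
    category: count
    for category, count in sample_table.items()
    if category != "Unlinked record"
}
```

After the exclusion, `linked_sample` and `weighted_table` cover the same categories, so each weighted estimate can be read alongside the number of sample records it is based on.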

Relative standard error

While weighted counts are available in the ACLD TableBuilder, the Relative Standard Error (RSE) is not calculated for these counts, because the confounding effects of linking error in the sample could not be quantified.

An RSE value of 200 will appear in your TableBuilder tables. Do not use these values, as no RSE has been calculated for ACLD counts.

Confidentiality features in TableBuilder

In accordance with the Census and Statistics Act 1905, all the data in TableBuilder are subjected to a confidentiality process before release. This confidentiality process is undertaken to avoid releasing information that may allow the identification of particular individuals, families, households, dwellings or businesses.

For further information, see Perturbation in the TableBuilder User Guide.

Data downloads

Data item lists

Data files

History of changes

Glossary

Quality declaration

Institutional environment

Relevance

Timeliness

Accuracy

Coherence

Interpretability

Accessibility

Previous catalogue number

This release previously used catalogue number 2080.0.
