2080.5 - Information Paper: Australian Census Longitudinal Dataset, Methodology and Quality Assessment, 2011-2016
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 27/02/2018   

APPENDIX: WEIGHTING THE ACLD

INTRODUCTION

The process of weighting enables the data user to estimate the number of people in the whole population with particular characteristics based on the observations from a sample. To do this, a 'weight' is allocated to each sample unit. The value of the weight indicates how many population units are represented by the sample unit.
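
As a minimal sketch of how weights are used, a population estimate is the sum of the weights of the sample records that have the characteristic of interest. The records, field names and the `weighted_count` helper below are invented for illustration and are not ACLD data or ABS code.

```python
# Toy linked sample: each record carries a weight and some characteristics.
# All values are invented for illustration; they are not ACLD data.
records = [
    {"weight": 22.0, "sex": "F", "employed": True},
    {"weight": 19.5, "sex": "M", "employed": False},
    {"weight": 30.0, "sex": "F", "employed": True},
    {"weight": 24.5, "sex": "M", "employed": True},
]


def weighted_count(rows, predicate):
    """Estimate a population count as the sum of weights of matching records."""
    return sum(r["weight"] for r in rows if predicate(r))


# Estimated number of employed people, and the estimated population size.
employed_estimate = weighted_count(records, lambda r: r["employed"])
total_estimate = weighted_count(records, lambda r: True)
print(employed_estimate, total_estimate)
```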

The ACLD is designed to measure change in Australian society over time. For the 2011 ACLD Panel, a longitudinal weight has been implemented which allows the weighted sample to represent all persons who were in scope of both the 2011 and 2016 Censuses. As shown in Figure 1, this ‘longitudinal population’ is the overlap between the two Censuses (the shaded region). To estimate the size of this population, the 2016 Estimated Resident Population (ERP) was multiplied by the estimated probability that a person was in scope in 2011, calculated from 2016 Census responses to the question on address five years ago. Further information on this approach can be found in Chipperfield, Brown & Watson (2017).


FIGURE 1 - IN SCOPE POPULATION FOR THE AUSTRALIAN CENSUS LONGITUDINAL DATASET, 2011-2016

Venn diagram showing the 2011 and 2016 populations, which mostly overlap. The part of the 2011 circle that does not overlap with 2016 is deaths and overseas departures. The part of the 2016 circle that does not overlap with 2011 is births and overseas arrivals.


This method for estimating the overlapping 2011 and 2016 populations is an improvement over the method outlined in the previous issue of this Information Paper (see Australian Census Longitudinal Dataset, Methodology and Quality Assessment, 2006-2011 (cat. no. 2080.5)). The current approach overcomes a limitation of the previous approach, which did not accurately account for:

  • people who were overseas arrivals after one Census and who subsequently left Australia (or died) before the following Census; and
  • people who left Australia after one Census and then returned to Australia before the following Census.


CALCULATING LONGITUDINAL WEIGHTS

Longitudinal weights were calculated for each 2011 Panel sample record that was linked to a 2016 Census record. No weights were calculated for the unlinked records. The longitudinal weight for a linked sample record in this release of the ACLD is a measure of how many people it represents in the 2011 and 2016 overlapping populations. The weights consist of three components. The first component reflects the probability of a record being selected in the 2011 Panel sample. The second component takes into account that some selected records are less likely to be linked than other selected records. The third component takes into account Census undercount (e.g. a person in an undercounted population is less likely to have a Census record and so is less likely to have a selected Census record) and ensures that the weights are consistent with population benchmarks. These three components lead to the final weight, calculated as:

Final weight = (Design weight) x (Missed link adjustment) x (Calibration to known population totals adjustment).
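
The same relationship can be written in symbols. The notation below is introduced here for illustration only and does not appear in the source: for linked record i, d_i is the design weight, p̂_i is the estimated propensity to link (its inverse being the missed link adjustment described below), and g_i is the calibration adjustment.

```latex
w_i^{\text{final}} \;=\; d_i \times \frac{1}{\hat{p}_i} \times g_i
```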

Design Weight

The records in the 2011 ACLD Panel sample were selected from the 2011 Census population using equal probability random selection. For a sample size of 5.7% (see Section 1.1 Overview, 2011 Panel), the design weight for all records of the ACLD is the inverse of the probability of selection, and is equal to 17.5.
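
The arithmetic behind the design weight can be checked directly. This is a small sketch using the 5.7% sampling fraction stated above; the variable names are illustrative.

```python
# Equal probability selection: every record has the same chance of selection,
# so every record gets the same design weight (the inverse of that chance).
sampling_fraction = 0.057  # 5.7% sample, as stated in the text
design_weight = 1 / sampling_fraction
print(round(design_weight, 1))  # 17.5
```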

Missed Link Adjustment

A missed link occurs when a 2011 sampled record has a corresponding 2016 Census record, but the link is not identified. As missed links are more likely to occur in certain population groups, not making this adjustment would mean that these population groups would be under-represented in the linked sample. The missed link adjustment weight is equal to the inverse of the estimated propensity to link. No attempt was made to correct for false links.

The propensity to link was estimated using a logistic regression model that was applied to the 2011 Panel sample with link status as the response variable. The logistic regression model describes a relationship between a 2011 sample record's propensity to link and its values for a range of 2011 Census variables such as Indigenous status, marital status, country of birth, language spoken at home and English proficiency, labour force participation and occupation, educational attainment, mobility (whether moved address in the preceding year) and remoteness. It was found that the estimated propensity to link varied considerably between records.
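
The sketch below shows how a fitted logistic model could turn a record's characteristics into a propensity to link, and then into a missed link adjustment. The coefficients, variable names and functions are hypothetical and chosen only for illustration; the source does not publish the fitted models.

```python
import math

# Hypothetical coefficients for illustration only; not the ABS fitted model.
# Keys other than "intercept" are invented 0/1 indicators for 2011 variables.
COEFFICIENTS = {
    "intercept": 1.5,
    "moved_last_year": -0.6,
    "remote_area": -0.4,
}


def propensity_to_link(record):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_k * x_k)))."""
    linear = COEFFICIENTS["intercept"]
    for name, beta in COEFFICIENTS.items():
        if name != "intercept":
            linear += beta * record.get(name, 0)
    return 1.0 / (1.0 + math.exp(-linear))


def missed_link_adjustment(record):
    """The adjustment is the inverse of the estimated propensity to link."""
    return 1.0 / propensity_to_link(record)


stayer = {"moved_last_year": 0, "remote_area": 0}
mover = {"moved_last_year": 1, "remote_area": 1}
print(missed_link_adjustment(stayer), missed_link_adjustment(mover))
```

Under these assumed coefficients, the record with the lower propensity to link receives the larger adjustment, which is how harder-to-link groups are kept from being under-represented.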

Two separate models were applied to the 2011 Panel sample. The first model was applied to people under the age of 15 years on 2011 Census night. This model excluded the variables that were not applicable to people under 15 years of age, such as marital status and labour force participation. The second model was applied to the remainder of the sample (persons aged 15 years or over in 2011).

The missed link adjustment assumes that the ACLD contains no false links. It does not assume that every record in the 2011 sample has a corresponding 2016 Census record.

Calibration to Longitudinal Population Totals

At this point in the process an intermediate weight had been calculated for each linked sample record, equal to (Design weight) x (Missed link adjustment). This intermediate weight was then calibrated (or adjusted) so that the resulting weighted counts of the ACLD links would equal estimates of the longitudinal population size at the national and selected sub-national levels. The weights were calibrated to two sets of longitudinal population groups:
  1. state/territory, by sex, by ten-year age group;
  2. Indigenous status by state/territory.

The size of these longitudinal population groups was estimated by multiplying the 2016 ERP for each group by the estimated proportion of 2016 Census responders for that group who reported being in scope of the 2011 Census, i.e. resident at an Australian address on 2011 Census night. This proportion was estimated using the responses to the 2016 Census address five years ago question.
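
A minimal sketch of this benchmark calculation follows. The group labels, ERP figures and proportions are invented for illustration and are not ABS estimates.

```python
# Hypothetical 2016 ERP for two calibration groups; not actual ABS figures.
erp_2016 = {"group_a": 1_000_000, "group_b": 250_000}

# Assumed share of 2016 Census responders in each group who reported an
# Australian address five years ago (in scope of the 2011 Census).
prop_in_scope_2011 = {"group_a": 0.90, "group_b": 0.80}

# Longitudinal population benchmark = 2016 ERP x proportion in scope in 2011.
longitudinal_population = {
    group: erp_2016[group] * prop_in_scope_2011[group] for group in erp_2016
}
print(longitudinal_population)
```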

The intermediate weights were calibrated using a 'raking' tool: a program that determines record-level weights by making iterative horizontal and vertical passes through the unit records until the weights converge on a satisfactory final set. Unlike in the previous ACLD release, it was not necessary to impose bounds on the calibration adjustment because no extremely high or low final weights were produced.
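
The ABS raking tool itself is not described in detail, so the following is only a generic sketch of raking (iterative proportional fitting), not the ABS implementation. The function name `rake`, the toy records and the margin totals are all assumptions, and the two margins used here (state and sex) are simpler than the benchmark groups listed above.

```python
def rake(records, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting of record weights to marginal totals.

    records: list of dicts with a "weight" key plus one key per margin variable.
    margins: {variable: {category: target_total}}.
    Returns a new list of calibrated weights (the input is not modified).
    """
    weights = [r["weight"] for r in records]
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total in each category of this variable.
            totals = {cat: 0.0 for cat in targets}
            for r, w in zip(records, weights):
                totals[r[var]] += w
            # Scale each record so that its category matches the target total.
            for i, r in enumerate(records):
                factor = targets[r[var]] / totals[r[var]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        # Stop once a full pass over all margins changes nothing materially.
        if max_change < tol:
            break
    return weights


# Toy intermediate weights and benchmark totals, invented for illustration.
records = [
    {"weight": 20.0, "state": "NSW", "sex": "M"},
    {"weight": 20.0, "state": "NSW", "sex": "F"},
    {"weight": 20.0, "state": "VIC", "sex": "M"},
    {"weight": 20.0, "state": "VIC", "sex": "F"},
]
margins = {"state": {"NSW": 50, "VIC": 40}, "sex": {"M": 44, "F": 46}}
calibrated = rake(records, margins)
print(calibrated)
```

Each pass rescales the weights to match one margin, which slightly disturbs the other; repeating the passes brings both sets of weighted totals into agreement with the benchmarks.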

This calibration adjustment improves the accuracy of weighted estimates and it implicitly adjusts for the small proportion of people who were in scope for the 2011 Census but did not complete a Census form in 2016.


Summary of Weights

The mean weight for selected characteristics indicates how far the final weight differs from the initial design weight (17.5) after adjusting for missed links and Census undercount. Table A.1 shows that the mean final weight for linked records is 22.3 for females and 23.2 for males. The largest weight was 83.1 and the smallest was 14.8. The mean weight was higher for Aboriginal and Torres Strait Islander persons (30.4) and for people in the Northern Territory (30.2).


TABLE A.1 - DESCRIPTIVE STATISTICS FOR WEIGHTS, by Selected Characteristics, 2016

Characteristic                            Count (a)  Minimum Weight  Maximum Weight  Mean Weight  Standard Deviation  Median Weight

SEX
Male                                        450 054            14.8            83.1         23.2                 4.9           22.0
Female                                      477 460            14.8            81.9         22.3                 4.7           21.3

AGE
0-14                                        126 917            16.1            81.9         22.3                 5.1           21.0
15-24                                       119 827            16.5            67.5         23.5                 4.0           22.7
25-34                                       109 438            17.4            83.1         27.9                 5.3           27.3
35-44                                       127 858            15.9            60.6         23.3                 4.7           22.7
45-54                                       139 010            16.0            68.8         21.9                 4.1           20.8
55-64                                       127 797            15.6            62.7         21.3                 3.8           20.1
65-74                                        99 605            15.2            55.5         20.6                 3.4           19.4
75-84                                        54 414            14.8            58.0         20.2                 3.5           18.8
85 or over                                   22 651            14.8            56.2         21.2                 3.9           19.8

INDIGENOUS STATUS
Aboriginal and/or Torres Strait Islander     23 059            18.9            83.1         30.4                 6.0           29.9
Other (b)                                   904 457            14.8            81.9         22.5                 4.6           21.5

STATE/TERRITORY OF USUAL RESIDENCE
New South Wales                             296 695            15.5            81.9         22.7                 4.9           21.6
Victoria                                    234 629            15.7            80.5         22.7                 4.7           21.6
Queensland                                  185 576            15.7            81.2         23.0                 4.6           22.2
South Australia                              71 025            14.8            68.7         21.5                 4.3           20.3
Western Australia                            95 198            15.4            73.8         23.0                 5.2           21.7
Tasmania                                     21 781            15.9            58.8         21.8                 3.8           20.7
Northern Territory                            6 929            17.1            83.1         30.2                 8.0           28.7
Australian Capital Territory                 15 583            15.2            60.0         22.0                 4.6           21.1

(a) Counts presented in the table have been perturbed.
(b) Includes non-Indigenous persons and persons who did not state an Indigenous status in 2016.
Source: ABS, Australian Census Longitudinal Dataset 2011-2016.