2901.0 - Census of Population and Housing: Census Dictionary, 2016  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 23/08/2016   


Derivations and imputations

Derivations: Derivation is the process by which some variables are assigned values based on responses to other questions or, where no response has been provided, on responses given by other family members present in the same dwelling.

Variables that may be derived from responses given by other family members present in the same dwelling are:
    • Country of Birth of Person (BPLP)
    • Country of Birth of Father (BPMP)
    • Country of Birth of Mother (BPFP)
    • Language Spoken at Home (LANP)

If there is insufficient information provided to derive a response for these items, they are determined to be 'Not stated'.
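The fallback described above can be sketched in Python. This is only an illustration, not the ABS procedure: the source does not say how responses from several family members are reconciled, so taking the most common stated value is an assumption, and the function name `derive_from_family` and the dictionary record layout are hypothetical.

```python
from collections import Counter

NOT_STATED = "Not stated"


def derive_from_family(person, family_members, variable):
    """Return a value for `variable`, borrowing from family members if needed."""
    own = person.get(variable)
    if own is not None:
        return own  # the person answered the question themselves
    # Collect stated responses from family members in the same dwelling.
    stated = [m[variable] for m in family_members if m.get(variable) is not None]
    if not stated:
        return NOT_STATED  # insufficient information to derive a response
    # ASSUMPTION: use the most common family response (not stated in the source).
    return Counter(stated).most_common(1)[0][0]


child = {"LANP": None}
parents = [{"LANP": "Greek"}, {"LANP": "Greek"}]
print(derive_from_family(child, parents, "LANP"))
```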

In addition, the derivation process is used to create new variables by combining responses from a number of questions. Variables which are created this way include:
    • Mortgage Repayments (monthly) Dollar Values (MRED)
    • Rent (weekly) Dollar Values (RNTD)
    • Tenure Type (TEND)
    • Labour Force Status (LFSP)
    • Core Activity Need for Assistance (ASSNP)

Imputation: Imputation is a statistical process for predicting values where no response was provided to a question and a response could not be derived.

Where no Census form is returned, the number of males and females in 'non-contact' private dwellings that are thought to be occupied will be imputed. In addition, the following key demographic variables may also be imputed, if they are 'Not stated':
    • Age (AGEP)
    • Place of Usual Residence (PURP)
    • Registered Marital Status (MSTP)
    • Sex (SEXP)

The primary imputation method used for the 2016 Census is known as 'hotdecking'. Other imputation processes use probability methods. In general, the hotdecking method involves locating a donor record and copying the relevant responses to the record requiring imputation. The donor record will have similar characteristics and must also have the required variable(s) stated. In addition, the donor record will be located geographically as close as possible to the location of the record to be imputed. The match must occur within the same Capital City or Balance of State.

The methodology for imputation is tailored to two situations: firstly, where no Census form has been returned, and secondly, where a partially completed form was returned.
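A minimal Python sketch of the general hotdecking idea follows. It makes several assumptions that are not in the source: records are dictionaries, 'similar characteristics' is simplified to exact equality on a list of match keys, closeness is straight-line distance between hypothetical `xy` coordinates, and `region` stands in for the Capital City or Balance of State area.

```python
import math


def hotdeck(recipient, donors, match_keys, impute_keys):
    """Copy impute_keys from the nearest eligible donor; return the donor or None."""
    eligible = [
        d for d in donors
        if d["region"] == recipient["region"]                   # same Capital City / Balance of State
        and all(d[k] == recipient[k] for k in match_keys)       # ASSUMPTION: 'similar' means equal
        and all(d.get(k) is not None for k in impute_keys)      # required variables must be stated
    ]
    if not eligible:
        return None
    # ASSUMPTION: geographic closeness measured as Euclidean distance.
    nearest = min(eligible, key=lambda d: math.dist(d["xy"], recipient["xy"]))
    for k in impute_keys:
        recipient[k] = nearest[k]
    return nearest


recipient = {"region": "Greater Sydney", "xy": (0.0, 0.0), "STRD": "Separate house", "AGEP": None}
donors = [
    {"region": "Greater Sydney", "xy": (5.0, 5.0), "STRD": "Separate house", "AGEP": 40},
    {"region": "Greater Sydney", "xy": (1.0, 1.0), "STRD": "Separate house", "AGEP": 33},
    {"region": "Rest of NSW", "xy": (0.1, 0.1), "STRD": "Separate house", "AGEP": 70},
]
hotdeck(recipient, donors, ["STRD"], ["AGEP"])
print(recipient["AGEP"])
```

In this example the third donor is physically closest but is excluded because it lies in a different region, so the value is copied from the second donor.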

No Census form returned - private dwelling: Where a private dwelling was identified as occupied on Census night but a Census form was not returned, the number of males and females normally in the dwelling and their key demographic variables require imputation. In these cases, the non-demographic variables are set to 'Not stated' or 'Not applicable'.

For dwellings where the number of males and females is unknown, two imputation processes are performed. Initially, these records have their number of males and females imputed using hotdecking. Then a second imputation (also using hotdecking) is run to impute the key demographic variables for the newly created person records.

To hotdeck the number of males and females, the donor records must meet several conditions:
    • They must be occupied private dwellings where a form was returned and contain a maximum of 6 persons
    • They must have a similar Dwelling Structure (STRD) and Dwelling Location (DLOD) to the record to be imputed and
    • They must be located geographically as close as possible to the location of the record to be imputed.

The number of males and females are the only data copied from the donor record in the first hotdecking process.
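The donor conditions for this first hotdecking process might be expressed as follows. This is a sketch under stated assumptions: the field names `form_returned`, `occupied_private` and `position` are hypothetical, 'similar' Dwelling Structure and Dwelling Location are simplified to exact matches, and closeness is reduced to the difference between one-dimensional `position` values.

```python
def eligible_count_donors(target, dwellings):
    """Dwellings that satisfy the donor conditions for imputing person counts."""
    return [
        d for d in dwellings
        if d["form_returned"]                       # a form was returned
        and d["occupied_private"]                   # occupied private dwelling
        and d["males"] + d["females"] <= 6          # maximum of 6 persons
        and d["STRD"] == target["STRD"]             # ASSUMPTION: 'similar' means equal
        and d["DLOD"] == target["DLOD"]
    ]


def impute_counts(target, dwellings):
    """Copy only the male and female counts from the nearest eligible donor."""
    donors = eligible_count_donors(target, dwellings)
    if not donors:
        return False
    # ASSUMPTION: 'position' is a hypothetical 1-D stand-in for geographic location.
    donor = min(donors, key=lambda d: abs(d["position"] - target["position"]))
    target["males"], target["females"] = donor["males"], donor["females"]
    return True


target = {"STRD": "Separate house", "DLOD": "Other", "position": 10,
          "males": None, "females": None}
dwellings = [
    {"form_returned": True, "occupied_private": True, "STRD": "Separate house",
     "DLOD": "Other", "position": 12, "males": 1, "females": 2},
    {"form_returned": True, "occupied_private": True, "STRD": "Separate house",
     "DLOD": "Other", "position": 11, "males": 4, "females": 4},   # 8 persons: excluded
    {"form_returned": False, "occupied_private": True, "STRD": "Separate house",
     "DLOD": "Other", "position": 10, "males": None, "females": None},
]
impute_counts(target, dwellings)
print(target["males"], target["females"])
```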

In the next process, the records which have just had their number of males and females imputed are subjected to the same hotdecking process as those records where the number of males and females had been ascertained.

This hotdecking process imputes the key demographic variables. Again the donor records must meet several conditions:
    • They must be records where everyone within the dwelling provided all their demographic characteristics
    • They must have similar Dwelling Structure (STRD) and Dwelling Location (DLOD)
    • They must have identical counts of males and females and
    • They must be located geographically as close as possible to the location of the record to be imputed.

The key demographic variables are then copied from the donor records to the records requiring imputation.

No Census form returned - non-private dwelling: Where a person in a non-private dwelling did not return a form, their demographic characteristics are copied from another person in a similar non-private dwelling using Type of Non-Private Dwelling (NPDD).

Census form returned: Where a form was returned, some or all of the demographic characteristics may require imputation. Characteristics are imputed using a combination of hotdecking and probability techniques.

If there is not enough information on the form to determine the sex (SEXP) of the person (or it is not appropriate to do so) then each record is randomly allocated a male or female sex.
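As an illustration, random allocation of sex could look like the sketch below. The source says only that the allocation is random, so the equal probability for each value is an assumption, as are the function name `impute_sex` and the use of a boolean `IFSEXP` field to mark the record as imputed.

```python
import random


def impute_sex(record, rng):
    """Randomly allocate a sex when SEXP is missing and mark the record as imputed."""
    if record.get("SEXP") is None:
        # ASSUMPTION: male and female are equally likely.
        record["SEXP"] = rng.choice(["Male", "Female"])
        record["IFSEXP"] = True  # hypothetical boolean form of the imputation flag
    return record


rng = random.Random(2016)  # seeded so that a run can be reproduced
print(impute_sex({"SEXP": None}, rng))
```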

Registered Marital Status imputation is carried out by finding a similar person in a similar responding dwelling based on the variables:
    • Sex (SEXP)
    • Relationship in Household (RLHP)
    • Age (AGEP)
    • Dwelling Type (DWTD) and
    • Type of Non-Private Dwelling (NPDD).

Registered Marital Status is only imputed for persons aged 15 years and over, and set to 'Not applicable' for persons aged under 15 years.
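The age rule and the donor match can be combined in a short sketch. It assumes the person passed in has no stated marital status, treats SEXP, RLHP, DWTD and NPDD as exact-match keys and picks the donor closest in age; the source does not specify how similarity is measured, and the 'Not stated' fallback when no donor exists is also an assumption.

```python
MATCH_KEYS = ("SEXP", "RLHP", "DWTD", "NPDD")


def impute_mstp(person, respondents):
    """Return an imputed Registered Marital Status for a person missing MSTP."""
    if person["AGEP"] < 15:
        return "Not applicable"  # never imputed for persons aged under 15
    candidates = [
        r for r in respondents
        if r.get("MSTP") is not None
        and r["AGEP"] >= 15
        and all(r[k] == person[k] for k in MATCH_KEYS)  # ASSUMPTION: exact match
    ]
    if not candidates:
        return "Not stated"  # ASSUMPTION: fallback when no donor is found
    # ASSUMPTION: among matching respondents, prefer the one closest in age.
    donor = min(candidates, key=lambda r: abs(r["AGEP"] - person["AGEP"]))
    return donor["MSTP"]


base = {"SEXP": "Female", "RLHP": "Lone person",
        "DWTD": "Private dwelling", "NPDD": "Not applicable"}
respondents = [
    {**base, "AGEP": 60, "MSTP": "Widowed"},
    {**base, "AGEP": 36, "MSTP": "Never married"},
]
print(impute_mstp({**base, "AGEP": 34}, respondents))
print(impute_mstp({**base, "AGEP": 9}, respondents))
```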


Where a complete usual address on Census night is not provided, the information that is provided is used to impute an appropriate Mesh Block (as well as Statistical Area Level 1 and Statistical Area Level 2). A similar person in a similar dwelling is located, and missing usual residence fields are copied to the imputed variable. The match is based on the variables:
    • Residential Status in a Non-Private Dwelling (RLNP)
    • Dwelling Location (DLOD) and
    • Type of Non-Private Dwelling (NPDD).

Where date of birth or age details are incomplete or missing, the variable Age (AGEP) is imputed based on distribution patterns found in the responding population. Variables used in the imputation of age include:
    • Sex (SEXP)
    • Relationship in Household (RLHP)
    • Registered Marital Status (MSTP)
    • Indigenous Status (INGP)
    • Type of Education Institution Attending (TYPP) and
    • Type of Non-Private Dwelling (NPDD).

Additional variables may also be used where they are shown to correlate with age.
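One way to picture imputation from distribution patterns is to sample an age from the ages observed among respondents who share the same characteristics. The sketch below does this for an arbitrary set of class variables; the function names, the record layout and the choice to return `None` when no comparable respondents exist are all assumptions rather than the ABS method.

```python
import random
from collections import Counter, defaultdict


def build_age_distributions(respondents, class_keys):
    """Count observed ages within each combination of the class variables."""
    dists = defaultdict(Counter)
    for r in respondents:
        dists[tuple(r[k] for k in class_keys)][r["AGEP"]] += 1
    return dists


def impute_age(person, dists, class_keys, rng):
    """Draw an age with probability proportional to its observed frequency."""
    counts = dists.get(tuple(person[k] for k in class_keys))
    if not counts:
        return None  # ASSUMPTION: no comparable respondents, leave unresolved
    ages = list(counts)
    return rng.choices(ages, weights=[counts[a] for a in ages], k=1)[0]


keys = ("SEXP", "RLHP")
respondents = [
    {"SEXP": "Female", "RLHP": "Child", "AGEP": 3},
    {"SEXP": "Female", "RLHP": "Child", "AGEP": 5},
    {"SEXP": "Female", "RLHP": "Child", "AGEP": 5},
    {"SEXP": "Male", "RLHP": "Lone person", "AGEP": 71},
]
dists = build_age_distributions(respondents, keys)
print(impute_age({"SEXP": "Female", "RLHP": "Child"}, dists, keys, random.Random(0)))
```

In this toy data an age of 5 is twice as likely to be drawn as an age of 3 for a female child, mirroring the frequencies in the responding records.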

Persons who provided partial or no information about their place of work will have a place of work (Destination Zone) imputed to them. This is imputed based on distributions of responses observed in the responding population. Depending on the level of imputation required, place of work imputation may use the following variables (where available) in its method:
    • Place of usual residence (PURP)
    • Industry of employment (INDP)
    • Method of travel to work (MTWP).

Records that have required imputation can be identified using the Imputation flags:
    • Imputation Flag for Age (IFAGEP)
    • Imputation Flag for Number of Males and Females in Dwelling (IFNMFD)
    • Imputation Flag for Place of Usual Residence (IFPURP)
    • Imputation Flag for Place of Work (IFPOWP)
    • Imputation Flag for Registered Marital Status (IFMSTP)
    • Imputation Flag for Sex (IFSEXP).
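When analysing data, the flags can be used to separate imputed from reported values. The sketch below assumes a flattened record in which each flag is a simple boolean; the published flags are coded categories, and IFNMFD applies to dwellings rather than persons, so this layout and the function name `imputed_flags` are purely illustrative.

```python
FLAGS = ("IFAGEP", "IFNMFD", "IFPURP", "IFPOWP", "IFMSTP", "IFSEXP")


def imputed_flags(record):
    """List the imputation flags that are set on a record."""
    # ASSUMPTION: flags are stored as booleans; a missing flag means not imputed.
    return [f for f in FLAGS if record.get(f)]


records = [
    {"id": 1, "IFAGEP": True, "IFSEXP": False},
    {"id": 2},
    {"id": 3, "IFPURP": True, "IFMSTP": True},
]
# Keep only records where no variable was imputed.
fully_reported = [r["id"] for r in records if not imputed_flags(r)]
print(fully_reported)
```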






