2970.0.55.023 - 2001 Census of Population and Housing - Fact Sheet: Income Imputation, 2001
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 03/06/2002 First Issue
Income Imputation
These ranges, which are used on the Census form, were chosen after analysing data from the Survey of Income and Housing Costs (SIHC), in which income was collected in actual dollars rather than ranges.

Household and family income
Household and family incomes (HIND and FINF) were not collected in the Census but were derived from person level income data. Because person income was collected in ranges, it is not possible to aggregate the ranges directly to derive household and family incomes. To overcome this, data from the 1999/2000 SIHC were used to impute an income value for each person. The imputed values for each person were then aggregated to create imputed household and family level incomes.

The imputation process
The process involved analysis of the SIHC data to determine the imputation values to be used. Each of the 12,000 SIHC person records had the appropriate Census income range identifier (as defined above) allocated to it. For each range, the weighted mean, the weighted median, and the midpoint of the range (with an arbitrarily assigned value used as the midpoint of the $1,500 or more range) were calculated. Each of these measures was then aggregated across persons to derive imputed household and family level incomes. These imputes were then compared with the actual household and family incomes reported in SIHC to determine which would be used to impute Individual Incomes for Census records:
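The per-range measures and the aggregation step described above can be illustrated with a minimal sketch. It covers only the weighted median measure. The range boundaries, the survey records, and all function names are hypothetical and are not taken from SIHC or the Census, and the weighted median uses one common convention (the smallest value at which cumulative weight reaches half the total), which may differ from the method actually applied.

```python
from collections import defaultdict

# Hypothetical weekly income ranges: (lower bound inclusive, upper bound
# exclusive); None marks the open-ended top range. Not the Census ranges.
RANGES = [(0, 200), (200, 500), (500, 1000), (1000, None)]


def range_id(income):
    """Return the index of the range that contains `income`."""
    for i, (low, high) in enumerate(RANGES):
        if income >= low and (high is None or income < high):
            return i
    raise ValueError(f"income {income} is below the lowest range")


def weighted_median(pairs):
    """Weighted median of (value, weight) pairs: the smallest value at
    which the cumulative weight reaches half of the total weight."""
    ordered = sorted(pairs)
    half = sum(weight for _, weight in ordered) / 2
    running = 0.0
    for value, weight in ordered:
        running += weight
        if running >= half:
            return value


def median_imputes(records):
    """Map each range index to the weighted median income of its records.
    `records` holds (household_id, income, weight) tuples."""
    by_range = defaultdict(list)
    for _, income, weight in records:
        by_range[range_id(income)].append((income, weight))
    return {r: weighted_median(pairs) for r, pairs in by_range.items()}


def imputed_household_incomes(records, imputes):
    """Sum each person's impute (looked up by range) within a household."""
    totals = defaultdict(float)
    for household, income, _ in records:
        totals[household] += imputes[range_id(income)]
    return dict(totals)


# Made-up survey records: (household_id, weekly income, survey weight).
records = [
    ("A", 150, 1.0), ("A", 620, 2.0),
    ("B", 180, 3.0), ("B", 700, 1.0),
    ("C", 900, 1.0), ("C", 1600, 1.0),
]
imputes = median_imputes(records)
households = imputed_household_incomes(records, imputes)
```

In this sketch the imputed household totals could then be compared with the sum of the actual incomes in each household, which is the kind of comparison the text describes for choosing between the mean, median, and midpoint.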
However, differences between person and household income ranges caused some problems. The ranges used for household and family level income are slightly finer than those used for person level income:
At the higher end of the scale, some one-income households and families were assigned to incorrect income ranges because of the fixed person level imputes used. To overcome this problem, the use of randomly assigned income values was investigated. Randomly assigned person level imputes were generated using assorted relative frequency distributions obtained from weighted SIHC data. These were used to generate household and family level incomes as described above.

The resulting imputed household and family income distributions, compared to the actual income distribution from SIHC, were marginally better than the imputed distribution obtained using median imputes. However, the randomly assigned imputes resulted in significantly more households and families being assigned to incorrect income ranges than the median imputes did. Thus, the conclusion reached was that the median imputes were the most appropriate to use for the 2001 Census household and family income imputation.

The imputed values
The imputed values (Estimated income value) for each person income range, calculated using SIHC median values, were:
These are the values used when summing records to create household and family incomes, and in calculating median values. NOTE: Individual Income is only published in ranges, so these estimated values will not apply.

Median Values for open-ended ranges
To calculate a median value for Individual, Family or Household Income, it is necessary to identify the range in which the median lies and then to estimate where within the range the median would be. When the median lies within a range which does not have two specified finite end points (i.e. $2,000 or more in the case of HIND and FINF), the default value (i.e. $2,000 in the case of HIND and FINF) is retained. This is generally indicated by a table note stating that, when this value appears in the table, the true median income is some value in the range $2,000 or more.

Introduction of Sampling Error
The median income values from the SIHC used as the impute values are subject to sampling error, since SIHC is a sample survey. It would be appropriate to indicate that the household and family incomes, and therefore any derivations based on these values such as median income values for these units, are subject to both sampling and non-sampling error.
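The median calculation for ranged income described under "Median Values for open-ended ranges" can be sketched as follows. The text does not say how the position of the median within a range is estimated, so this sketch assumes linear interpolation (incomes spread evenly within the range), a common choice for grouped data. Only the $2,000 or more top range comes from the text; the other boundaries, the counts, and the function name are hypothetical.

```python
# Hypothetical (lower, upper) ranges in ascending order. Only the open-ended
# "$2,000 or more" top range (upper=None) is taken from the text.
RANGES = [(0, 500), (500, 1000), (1000, 2000), (2000, None)]


def grouped_median(ranges, counts):
    """Estimate a median from the number of units in each income range."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no units to take a median of")
    half = total / 2
    below = 0  # units in the ranges below the current one
    for (lower, upper), count in zip(ranges, counts):
        if count > 0 and below + count >= half:
            if upper is None:
                # Open-ended range: retain the default value (lower bound).
                return float(lower)
            # Assumption: incomes are spread evenly within the range.
            return lower + (half - below) / count * (upper - lower)
        below += count


# Median falls in the 500 to 1000 range and is interpolated.
interior = grouped_median(RANGES, [10, 20, 10, 0])

# Median falls in the open-ended range, so the default 2000 is returned.
open_ended = grouped_median(RANGES, [1, 1, 1, 7])
```

A returned value equal to the lower bound of the open-ended range would be the case that calls for the table note, since the true median is then known only to be $2,000 or more.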