6527.0 - Household Expenditure Survey, Australia: User Guide, 1998-99

ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 15/11/2001

Chapter 4. Survey design and estimation

Sample design
Sample loss
Responding households
Imputation
Final sample
Weighting
Benchmarking
Estimation
Reliability of estimates

SAMPLE DESIGN

The sample was designed to produce reliable estimates for households resident in private dwellings aggregated for Australia, for each state and for the capital cities in each state and territory.

SAMPLE LOSS

Sample loss refers to units which have been selected in the sample but are out of scope in the survey. The sampling units in the HES are private dwellings. Dwellings which are out of scope include those which are found to be vacant, under construction, converted to non-dwellings or demolished. Additionally, dwellings containing no in-scope residents (e.g. dwellings occupied by foreign diplomats and their dependants) are also out of scope. In 1998-99, of the 10,298 private dwellings selected in the sample, 1,390 dwellings were found to be out of scope.

RESPONDING HOUSEHOLDS

Of the 8,908 selected dwellings remaining after sample loss, 2,015 did not contribute to the values of HES expenditure or income. These included households that could not be contacted, had language problems, refused to participate, or were affected by the death or illness of a household member. Also excluded were households in which the reference person or spouse did not respond to key questions in the survey, such as income. Thus, of the 8,908 dwellings in the scope of the survey, 6,893 (77%) were included in the final estimates.
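The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative only; the numbers are those quoted in the text.

```python
# Figures quoted in the text for the 1998-99 HES.
selected_dwellings = 10_298
out_of_scope = 1_390       # sample loss
non_contributing = 2_015   # in scope, but not used in the estimates

in_scope = selected_dwellings - out_of_scope
final_sample = in_scope - non_contributing
response_rate = final_sample / in_scope

print(in_scope, final_sample, f"{response_rate:.0%}")  # prints: 8908 6893 77%
```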

IMPUTATION

Of the households which provided most of the required HES information but were unable, or unwilling, to provide all of it, some were able to be retained in the sample and their missing values deduced or imputed.

For some of these households, missing information could be deduced using additional information supplied on the questionnaire (such as prices for given quantities and types of bread and milk purchased from given types of outlets).

In the remainder of cases, the missing information was imputed. Imputation is the process of replacing missing values with substitute values during processing. Imputation was carried out at two levels:

where a value was missing for a particular item, the missing value was replaced with a value which had been reported by another person or household with similar characteristics; and
where questionnaires or diaries were missing for a person in the household (other than the reference person or spouse) the missing information was replaced with whole questionnaires or diaries of another individual from a household with similar composition and characteristics.

In either case, the record providing the missing information is known as the donor record. Donors were selected so that, as far as possible, the information they provided would be an appropriate proxy for the information that was missing. Depending on which values were being imputed, donors were taken from the pool of complete households or individual records with complete information for the block of questions in which the missing information was located.

To better match donors to recipient records, both sets of records were ordered according to characteristics (such as number of adults and children present) associated with the blocks of variables being imputed. Recipients with missing information were matched with donors who fell into the same classes as themselves.
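The matching of recipients to donors within classes can be illustrated with a minimal sketch. It is not the procedure actually used for the HES: the field names (`id`, `adults`, `children`, `bread`), the single imputed item and the rule of cycling through donors in order are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import cycle


def impute_from_donors(records, class_keys, item):
    """Fill missing `item` values using donors from the same class.

    Returns new records ordered by class; an imputed record gains a
    'donor_id' entry. A recipient whose class has no donor stays missing.
    """
    def cls(record):
        return tuple(record[k] for k in class_keys)

    # Order all records by the class characteristics.
    ordered = sorted(records, key=cls)

    # Donor pool: records with a reported value, grouped by class.
    donors = defaultdict(list)
    for record in ordered:
        if record[item] is not None:
            donors[cls(record)].append(record)
    pools = {c: cycle(d) for c, d in donors.items()}

    result = []
    for record in ordered:
        if record[item] is None and cls(record) in pools:
            donor = next(pools[cls(record)])  # next donor in the same class
            record = {**record, item: donor[item], "donor_id": donor["id"]}
        result.append(record)
    return result


# Hypothetical records: weekly bread expenditure is missing for two households.
households = [
    {"id": 1, "adults": 2, "children": 0, "bread": 4.10},
    {"id": 2, "adults": 2, "children": 0, "bread": None},
    {"id": 3, "adults": 1, "children": 2, "bread": 6.50},
    {"id": 4, "adults": 1, "children": 2, "bread": None},
]
completed = impute_from_donors(households, ["adults", "children"], "bread")
```

In this sketch household 2 receives its value from household 1, and household 4 from household 3, because each pair shares the same number of adults and children.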

Edits were applied before and after imputation took place, to ensure that errors were not introduced through the addition of donor information.

FINAL SAMPLE

The sample on which estimates were based, or the final HES sample, is composed of households for which all necessary information is available. The information may have been wholly provided at the interview or may have been completed through imputation for partially responding households. The 1998-99 HES final sample included approximately 600 households which had at least one imputed value. Over 40% of these households had only a single value missing.

Table 2. HES FINAL SAMPLE: NUMBER OF HOUSEHOLDS, 1998-99


	Capital city	Balance of state/territory	Total

New South Wales	1,327	706	2,033
Victoria	992	377	1,369
Queensland	580	516	1,096
South Australia	420	144	564
Western Australia	475	175	650
Tasmania	389	91	480
Northern Territory	335	89	424
Australian Capital Territory	277	-	277
Australia	4,795	2,098	6,893

WEIGHTING

Expansion factors, or weights, are values by which information for sample households is multiplied to produce estimates for the whole population.

Initial weights, based on the sample design, are equal to the inverse of the probability of selection. Weights for each member of the household are the same as the weight for the household itself.
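As a small illustration of the initial weights, the sketch below inverts a selection probability and copies the household weight to each member. The selection probability of 1 in 800 and the identifiers are hypothetical values chosen for the example, not figures from the survey.

```python
def initial_weight(selection_probability):
    """Initial weight: the inverse of the probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


# Hypothetical household selected with probability 1 in 800.
household = {
    "id": "H1",
    "selection_probability": 1 / 800,
    "persons": ["P1", "P2", "P3"],
}

household_weight = initial_weight(household["selection_probability"])

# Every member of the household carries the household's weight.
person_weights = {p: household_weight for p in household["persons"]}
```

Here the household, and each of its three members, represents about 800 units in the population.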

In previous surveys, these initial weights have been adjusted to account for non-response. For the 1998-99 HES the demographic and geographic information available for non-respondents was analysed to determine whether a strong relationship existed between household non-response and its demographic and geographic characteristics. No strong relationship was detected so no adjustment to the initial weights to account for non-response was required.

BENCHMARKING

To adjust for underenumeration and to align survey estimates with independent population estimates, the weights were calibrated against person and household benchmarks. Using an iterative procedure, the weights were adjusted so that person and household estimates conformed with external person and household benchmarks. The two person benchmarks which were used in 1998-99 were: state/territory population estimates by eight age categories; and labour force status estimates (from Labour Force Survey data) by capital city/balance of state or territory by sex by five age categories. The two household benchmarks were: nine categories of household composition by capital city/balance of state or territory; and state by capital city/balance of state or territory. See the section on comparability between the 1998-99 HES and the 1993-94 HES in chapter 5 for further details of benchmarks used.

The household benchmarks were based on provisional estimates of numbers of households in Australia. The benchmarks were adjusted to include households and persons residing in private dwellings only and therefore do not, and are not intended to, match estimates of the total Australian resident population published in other ABS publications.

The benchmarks do not include people living in sparsely settled areas in the Northern Territory.
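The text does not name the iterative procedure used to calibrate the weights. One common technique of this kind is raking (iterative proportional fitting), sketched below purely as an illustration. The two benchmark dimensions, their categories and the target totals are invented for the example, and the sketch handles household-level benchmarks only.

```python
def rake(units, benchmarks, max_iter=100, tol=1e-8):
    """Scale weights repeatedly until weighted totals match every benchmark.

    units: list of dicts with a 'weight' plus one key per benchmark dimension.
    benchmarks: {dimension: {category: target_total}}.
    Returns the adjusted weights in the same order as `units`.
    Assumes every category has at least one unit with a positive weight.
    """
    weights = [u["weight"] for u in units]
    for _ in range(max_iter):
        max_change = 0.0
        for dim, targets in benchmarks.items():
            # Current weighted total for each category of this dimension.
            totals = {category: 0.0 for category in targets}
            for u, w in zip(units, weights):
                totals[u[dim]] += w
            # Scale each unit so its category total hits the benchmark.
            for i, u in enumerate(units):
                factor = targets[u[dim]] / totals[u[dim]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all benchmarks already met
            break
    return weights


# Hypothetical sample of four households with equal initial weights.
units = [
    {"area": "capital", "composition": "couple", "weight": 100.0},
    {"area": "capital", "composition": "single", "weight": 100.0},
    {"area": "balance", "composition": "couple", "weight": 100.0},
    {"area": "balance", "composition": "single", "weight": 100.0},
]
# Hypothetical benchmark totals (both dimensions sum to 400 households).
benchmarks = {
    "area": {"capital": 260.0, "balance": 140.0},
    "composition": {"couple": 220.0, "single": 180.0},
}
calibrated = rake(units, benchmarks)
```

After calibration the weighted number of households in each area and in each composition category agrees with the corresponding benchmark, which is the property the procedure described in this section is intended to achieve.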

ESTIMATION

Estimates produced from the survey are usually in the form of averages (e.g. average weekly household expenditure on clothing and footwear), or counts (e.g. total number of households that own their dwelling). For counts, the estimate is obtained by summing the weights of the responding households in the required group (e.g. those households owning their dwelling). Averages are obtained by adding the weighted household values, and then dividing by the estimated number of households. For example, average weekly expenditure on clothing and footwear by Victorian households is the weighted sum of the average weekly expenditure of each selected household in Victoria that reported such expenditure, divided by the estimated number of households in Victoria. Note that the denominator is the total number of households, not just the number of households that reported expenditure on the particular item.
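The two kinds of estimate can be sketched as follows. The weights, field names and expenditure values are hypothetical; the point of the example is that a household with no expenditure on the item still contributes its weight to the denominator of the average.

```python
def estimate_count(households, in_group):
    """Estimated number of households in a group: the sum of their weights."""
    return sum(h["weight"] for h in households if in_group(h))


def estimate_average(households, value_key):
    """Weighted sum of values divided by the estimated number of households.

    Households reporting zero for the item still count in the denominator.
    """
    estimated_households = sum(h["weight"] for h in households)
    weighted_sum = sum(h["weight"] * h[value_key] for h in households)
    return weighted_sum / estimated_households


# Hypothetical sample of three households.
sample = [
    {"weight": 900.0, "owns_dwelling": True, "clothing": 40.0},
    {"weight": 1100.0, "owns_dwelling": False, "clothing": 0.0},
    {"weight": 1000.0, "owns_dwelling": True, "clothing": 20.0},
]

owners = estimate_count(sample, lambda h: h["owns_dwelling"])
average_clothing = estimate_average(sample, "clothing")
```

In this example the estimated number of owner households is 1,900, and the average weekly clothing expenditure is 56,000 / 3,000, or about $18.67, across all 3,000 estimated households.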

RELIABILITY OF ESTIMATES

The estimates provided in this publication are subject to two types of error.

Non-sampling error

Non-sampling error can occur whether the estimates are derived from a sample or from a complete collection. Three major sources of non-sampling error are:

inability to obtain data from all households included in the sample. Although a non-response adjustment to the sampling weights was not necessary in 1998-99 (see section on weighting in this chapter), some bias may remain;
errors in reporting on the part of both respondents and interviewers. These reporting errors may arise through inappropriate wording of questions, misunderstanding of what data are required, inability or unwillingness to provide accurate information and mistakes in answers to questions; and
errors arising during processing of the survey data. These processing errors may arise through mistakes in coding and data recording.

Non-sampling errors are difficult to measure in any collection. However, every effort is made to minimise these errors. In particular, the effect of the reporting and processing errors described above is minimised by careful questionnaire design, intensive training and supervision of interviewers, asking respondents to refer to records whenever possible and by extensive editing and quality control checking at all stages of data collection and processing.

The error due to non-response is minimised by:

re-visiting all initially non-responding households in order to explain the importance of their cooperation to the project; and
ensuring the weighted file is representative of the population by calibrating to benchmarks.

Sampling error

The HES estimates are based on a sample of possible observations. Hence, they are subject to sampling variability and estimates may differ from the figures that would have been produced if information had been collected for all households. Further information on sampling error is given in appendix 1.