6527.0 - Household Expenditure Survey, Australia: User Guide, 1998-99
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 15/11/2001
INTRODUCTION
There are about 2 chances in 3 that a sample estimate will differ by less than one standard error from the figure that would have been obtained if all households had been surveyed, and about 19 chances in 20 that the difference will be less than two standard errors.

The relative standard error (RSE) is the standard error expressed as a percentage of the estimate. Only estimates with relative standard errors of 25% or less are considered sufficiently reliable for most purposes. However, estimates with higher relative standard errors are included in some HES publications, because they are the best estimates available. In HES publications, estimates with an RSE of 25% to 50% are preceded by an asterisk (e.g. *3.4) and those with an RSE of more than 50% are preceded by a double asterisk (e.g. **6.1) to indicate that they should be used with caution.

NON-SAMPLING ERROR

The imprecision due to sampling variability, which is measured by the standard error, should not be confused with inaccuracies that may occur because of imperfect reporting by respondents, errors made in collection such as in recording and coding data, and errors made in processing the data. Inaccuracies of this kind are referred to as non-sampling error, and they may occur in any enumeration, whether it be a full count or a sample. It is not possible to quantify non-sampling error, but every effort is made to reduce it to a minimum. This is done by careful design of questionnaires, intensive training and supervision of interviewers, and efficient operating procedures.

CALCULATING RELATIVE STANDARD ERRORS

The ABS has calculated the relative standard errors for a variety of the HES estimates, using a technique known as the jackknife. Regression models were then fitted to the jackknife relative standard errors, to smooth the results and to summarise them in a form concise enough to publish.
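As a rough illustration of the jackknife idea only, the sketch below applies a simple delete-one jackknife to a small list of values. It is not the ABS procedure, which works on the full survey design and is not detailed in this guide; the function names `jackknife_se` and `jackknife_rse` and the sample data are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def jackknife_se(values, estimator=mean):
    """Delete-one jackknife standard error of `estimator` over `values`.

    Each replicate recomputes the estimate with one observation removed;
    the spread of the replicates measures sampling variability.
    """
    values = list(values)
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    replicates = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    centre = sum(replicates) / n
    variance = (n - 1) / n * sum((rep - centre) ** 2 for rep in replicates)
    return math.sqrt(variance)


def jackknife_rse(values, estimator=mean):
    """Relative standard error (%): standard error as a percentage of the estimate."""
    return 100.0 * jackknife_se(values, estimator) / estimator(values)


# Hypothetical weekly expenditure values for five households.
sample = [10.0, 12.0, 9.0, 15.0, 14.0]
print(jackknife_se(sample), jackknife_rse(sample))
```

For a simple mean, the delete-one jackknife reproduces the familiar standard error s / sqrt(n), which gives a convenient check on the sketch.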
The outcome of this work is published in each HES publication, where data are provided to enable relative standard errors to be calculated for each estimate shown in the publication. Table A3.1 (in appendix 3) shows the relative standard error for each expenditure item, at the Australia level. Table A1.1 on the next page shows the relative standard error for each household characteristic, at the Australia level.

To obtain the relative standard error for an estimate at any other level (e.g. for a state, or for an income quintile), the value in table A1.1 or table A3.1, as appropriate, must be adjusted to take account of the smaller size of the sample contributing to that particular estimate. Because the sample size is smaller, the relative standard error will be larger.

The first step in making this adjustment is to look up the number of sampled households contributing to the estimate for the item: the 'Number of households in sample' from a particular state, or income quintile, will be shown in the table which contains the estimate of interest.

The relative standard error for an estimate can then be calculated by multiplying the relative standard error for the item at the Australia level (found directly from table A1.1 or A3.1) by an adjustment factor (found from graph A1.2) which compensates for the smaller sample size.

In theory, each different item requires a different adjustment factor. However, to prevent graph A1.2 from becoming illegible, the items have been formed into six groups (labelled A-F). Within each group of items, the theoretical adjustment factors are similar enough that a common adjustment factor can be used in practice. Table A1.1 indicates the group to which each household characteristic belongs. Table A3.1 indicates the group to which each expenditure item belongs.

A1.1 RELATIVE STANDARD ERRORS OF HOUSEHOLD CHARACTERISTICS
(a) This estimate for Australia is a benchmark total. RSEs for benchmark values should not be referenced from this publication. See paragraphs under heading of Standard Errors for Benchmark Totals for more details.

Graph A1.2 plots the adjustment factor for each of these six groups (A-F) of items against sample size. The adjustment factor for a particular estimate can be read off this graph, once the sample size contributing to the estimate and the group to which the item belongs have been determined.

In brief, the procedure for calculating the relative standard error for a particular estimate is as follows:
RSE = FCT * R%

where
R = the relative standard error of the estimate for Australia, given in table A1.1 or A3.1; and
FCT = a factor based on the number of sampled households, given in graph A1.2.
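The adjustment procedure, together with the asterisk convention described in the introduction, can be sketched in a few lines of Python. This is an illustration only: the function names and the input values (R = 10.0%, FCT = 2.6) are hypothetical and are not taken from table A1.1, table A3.1 or graph A1.2.

```python
def adjusted_rse(australia_rse_pct, adjustment_factor):
    """RSE = FCT * R%: scale the Australia-level RSE by the graph A1.2 factor."""
    return adjustment_factor * australia_rse_pct


def flag_estimate(estimate, rse_pct):
    """Prefix an estimate with the reliability marker used in HES publications.

    RSE of 25% or less: no marker
    RSE above 25% and up to 50%: single asterisk
    RSE above 50%: double asterisk
    """
    if rse_pct > 50:
        prefix = "**"
    elif rse_pct > 25:
        prefix = "*"
    else:
        prefix = ""
    return f"{prefix}{estimate}"


# Hypothetical inputs: Australia-level RSE of 10.0% and a factor of 2.6
# read from the graph for a small subgroup sample.
rse = adjusted_rse(10.0, 2.6)
print(rse, flag_estimate(3.4, rse))
```

The adjustment factor itself still has to be read from graph A1.2 by hand, using the item's group (A-F) and the number of households in the sample.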
STANDARD ERRORS FOR BENCHMARK TOTALS

As outlined in chapter 4, estimates derived from the survey were obtained using a complex regression estimation procedure which ensures that survey estimates conform to independently estimated distributions of the population, also called benchmark totals.

The relative standard error of benchmark totals, and benchmark totals by quintile, should not be referenced from this publication. (All benchmark totals are footnoted "a" in table A1.1.) An indication of the quality of some household benchmark totals may be found in Household Estimates 1986, 1991-94 (Cat. no. 3229.0). Person benchmark totals are not subject to sampling error, but are subject to non-sampling error.

The Australia-level relative standard errors of benchmark values are provided only as a means of calculating non-benchmark total estimates. For example, the average number of people aged 65 years and over in a household is a benchmark total, so its Australian RSE should not be referenced from this publication; its Australian RSE in table A1.1 should only be used to calculate the RSE of non-benchmark estimates, such as the average number of people aged 65 years and over living in a couple only household.

CALCULATION OF STANDARD ERRORS FOR DERIVED STATISTICS

Many figures of interest may be derived by taking sums, differences and ratios of the tabulated data. Approximate standard errors for these ‘derived estimates’ can be calculated using the formulae below, in which x1 and x2 are estimates and SE(x1) and SE(x2) are the standard errors of x1 and x2. Exact standard errors for these ‘derived estimates’ have not been published, although they could be calculated upon request.

Note: The approximate formulae are derived assuming the correlation between x1 and x2 is zero. Correlation, in this context, is a statistical estimate which measures the linear relationship between x1 and x2 and takes values in the range [-1,1].
The correlation will be exactly zero if the two estimates are based on independent subgroups of the sample (e.g. different states or income groups). Two estimates of the same subgroup will be positively correlated if large values of the items are likely to occur together (e.g. estimates of expenditure on transport are likely to be correlated with estimates of expenditure on purchase of vehicles because purchase of vehicles is a large part of the expenditure included in expenditure on transport).

Converting between relative standard error (RSE) and standard error (SE)

The relative standard error is the standard error expressed as a percentage of the estimate. Formulae for converting standard errors to relative standard errors and relative standard errors to standard errors are:

RSE%(x) = (SE(x) * 100) / x
SE(x) = (RSE%(x) * x) / 100

Returning to the expenditure on transport example, average expenditure on transport (x1) at the fourth income quintile level was $154.80 and the RSE was equal to 4.6%. Therefore, the standard error (SE(x1)) was equal to ($154.80 * 4.6) / 100 = $7.12.

Calculating the standard error for summed estimates

New items or categories of expenditure can be derived by combining existing ones. The approximate standard error of the summed estimate is:

SE(x1 + x2) = sqrt( [SE(x1)]^2 + [SE(x2)]^2 )

For example, to create a new category of expenditure, say expenditure on transport and personal care, the standard errors of expenditure on both transport and personal care are needed. At the Australia level, the estimates for expenditure on transport ($117.82) and personal care ($13.73) can be obtained from table 1 of the 1998-99 HES publication Summary of Results (Cat. no. 6530.0). The standard error of each estimate is obtained by converting its relative standard error from table A3.1 as shown above, and the two standard errors are then combined:

SE(transport + personal care) = sqrt( [SE(transport)]^2 + [SE(personal care)]^2 )

Note that if there was a non-zero correlation between x1 and x2 then the standard error for a sum would be:

SE(x1 + x2) = sqrt( [SE(x1)]^2 + [SE(x2)]^2 + 2 * r * SE(x1) * SE(x2) )

where r is the sample correlation coefficient.
Thus, if the two estimates are positively correlated (i.e. r > 0) then the approximate formula will underestimate the standard error of the sum; similarly, if there is a negative correlation (i.e. r < 0) then the standard error will be overestimated.

Calculating the standard error for the difference between estimates

The standard error of the difference can be used to determine whether two estimates are significantly different, that is, whether the difference is unlikely to be due to sampling variability. If the difference between estimates is greater than twice the standard error of the difference, then the estimates are said to be statistically different at the 95% confidence level. The approximate standard error of the difference between estimates is:

SE(x1 - x2) = sqrt( [SE(x1)]^2 + [SE(x2)]^2 )

As can be seen, the approximate standard error of the difference involves the same calculation as the standard error of the sum. This approximation is accurate provided that the two estimates have zero correlation. If correlation exists then the standard error formula becomes:

SE(x1 - x2) = sqrt( [SE(x1)]^2 + [SE(x2)]^2 - 2 * r * SE(x1) * SE(x2) )

In this case a positive correlation means the approximate formula will overestimate the standard error, whilst a negative correlation means it will underestimate it.

Calculating the standard error of the ratio of estimates

Two items can be compared by calculating the ratio of one to the other. For example, researchers may want to express expenditure on petrol (expenditure code 10010301) as a percentage of total expenditure on transport costs (the sum of all expenditure codes beginning with 10). The relative standard error of the percentage or proportion can be approximated using the formula:

RSE%(x1 / x2) = sqrt( [RSE%(x1)]^2 + [RSE%(x2)]^2 )

As can be seen, this formula is similar to that used for calculating sums and differences between estimates, except that relative standard errors are used in the formula in place of the standard errors.
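The calculations for derived statistics can be collected into a short Python sketch. The function names are hypothetical, and the ratio function assumes the zero-correlation form implied by the description above (the same shape as the sum formula, with relative standard errors in place of standard errors). Apart from the $154.80 and 4.6% transport figures quoted earlier, the numbers are illustrative only.

```python
import math


def rse_to_se(estimate, rse_pct):
    """Convert a relative standard error (%) to a standard error."""
    return estimate * rse_pct / 100.0


def se_to_rse(estimate, se):
    """Convert a standard error to a relative standard error (%)."""
    return 100.0 * se / estimate


def se_of_sum(se1, se2, r=0.0):
    """Standard error of x1 + x2; r is the correlation (0 for the approximation)."""
    return math.sqrt(se1 ** 2 + se2 ** 2 + 2.0 * r * se1 * se2)


def se_of_difference(se1, se2, r=0.0):
    """Standard error of x1 - x2; r is the correlation (0 for the approximation)."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)


def rse_of_ratio(rse1_pct, rse2_pct):
    """Approximate RSE (%) of x1 / x2, assuming zero correlation (assumed form)."""
    return math.sqrt(rse1_pct ** 2 + rse2_pct ** 2)


def significantly_different(x1, x2, se1, se2):
    """True if the difference exceeds twice its approximate standard error."""
    return abs(x1 - x2) > 2.0 * se_of_difference(se1, se2)


# Transport at the fourth income quintile: $154.80 with an RSE of 4.6%.
se_transport = rse_to_se(154.80, 4.6)
print(round(se_transport, 2))

# Illustrative standard errors of 3.0 and 4.0 show the effect of correlation.
print(se_of_sum(3.0, 4.0), se_of_sum(3.0, 4.0, r=0.5))
print(se_of_difference(3.0, 4.0), se_of_difference(3.0, 4.0, r=0.5))
```

With r left at its default of zero, `se_of_sum` and `se_of_difference` return the same value, matching the observation that the two approximate formulae involve the same calculation.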