TECHNICAL NOTE DATA QUALITY
1 When interpreting the results of a survey it is important to take into account factors that may affect the reliability of estimates. Such factors can be classified as either sampling error or non-sampling error.
SAMPLING ERROR
2 Estimates in this publication are based on information collected for a sample of registered motor vehicles, rather than a full enumeration, and are therefore subject to sampling error. They may differ from the data that would have been produced if the information had been obtained for all registered motor vehicles. Examples of the sampling error for this publication are included in this technical note.
3 The sampling error associated with any estimate can be calculated from the sample results. One measure of sampling error is given by the standard error, which indicates the extent to which an estimate might have varied by chance because only a sample of vehicles was included. There are about two chances in three that a sample estimate will differ by less than one standard error from the data that would have been obtained if all vehicles had been included, and about 19 chances in 20 that the difference will be less than two standard errors.
4 Another measure of sampling variability is the relative standard error (RSE), which is obtained by expressing the standard error as a percentage of the estimate to which it refers. The RSE is a useful measure in that it provides an immediate indication of the percentage error likely to have occurred due to sampling. In this publication, estimates with an RSE between 10% and 25% are annotated with the symbol '^'. These estimates should be used with caution as they are subject to sampling variability too high for some purposes. Estimates with an RSE between 25% and 50% are annotated with the symbol '*', indicating that the estimate should be used with caution as it is subject to sampling variability too high for most practical purposes. Estimates with an RSE greater than 50% are annotated with the symbol '**', indicating that the sampling variability causes the estimates to be considered too unreliable for general use.
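The annotation rule described above can be sketched in a few lines of Python. This is an illustration only: the function names `rse_percent` and `rse_annotation` are hypothetical, and the treatment of values falling exactly on a band boundary (10, 25 or 50) is an assumption, since the note does not say how such ties are handled.

```python
def rse_percent(estimate: float, standard_error: float) -> float:
    """Relative standard error: the standard error as a percentage of the estimate."""
    return 100.0 * standard_error / abs(estimate)


def rse_annotation(rse: float) -> str:
    """Return the reliability symbol for an RSE given in per cent.

    Assumption (not stated in the note): an RSE of exactly 10 is annotated
    with '^', and values of exactly 25 or 50 stay in the lower band.
    """
    if rse > 50:
        return "**"  # considered too unreliable for general use
    if rse > 25:
        return "*"   # variability too high for most practical purposes
    if rse >= 10:
        return "^"   # variability too high for some purposes
    return ""        # no annotation


# RSEs taken from the tables in this note
symbol_buses_australia = rse_annotation(3)        # no annotation
symbol_motor_cycles_nsw = rse_annotation(18)      # '^'
symbol_non_freight_wa = rse_annotation(27)        # '*'
symbol_leaded_light_commercial = rse_annotation(63)  # '**'
```

Under these assumptions, an RSE of 18% (motor cycles, New South Wales, total kilometres travelled) is flagged with '^', while an RSE of 63% (leaded petrol used by light commercial vehicles) is flagged with '**'.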
5 The RSEs relating to 2003 estimates contained in Table 4 of this publication are shown in the following table.
RSE OF MOTOR VEHICLE USE(a), State/territory of registration - Type of vehicle |
| |
| Passenger vehicles | Motor cycles | Light commercial vehicles | Rigid trucks | Articulated trucks | Non-freight carrying trucks | Buses | Total | |
| % | % | % | % | % | % | % | % | |
Total kilometres travelled | |
| |
New South Wales | 5 | 18 | 6 | 6 | 6 | 19 | 7 | 4 | |
Victoria | 6 | 21 | 8 | 8 | 5 | 17 | 7 | 5 | |
Queensland | 6 | 16 | 6 | 8 | 7 | 15 | 7 | 4 | |
South Australia | 7 | 19 | 7 | 7 | 8 | 23 | 7 | 5 | |
Western Australia | 6 | 16 | 8 | 8 | 6 | 27 | 8 | 5 | |
Tasmania | 8 | 16 | 10 | 6 | 6 | 19 | 8 | 6 | |
Northern Territory | 6 | 15 | 7 | 9 | 11 | 23 | 10 | 4 | |
Australian Capital Territory | 5 | 16 | 6 | 6 | 8 | 17 | 9 | 4 | |
Australia | 3 | 9 | 3 | 3 | 3 | 8 | 3 | 2 | |
Number of vehicles | |
| |
New South Wales | 2 | 4 | 4 | 2 | 3 | 15 | 4 | 2 | |
Victoria | 2 | 5 | 4 | 2 | 3 | 10 | 5 | 2 | |
Queensland | 2 | 4 | 3 | 3 | 4 | 10 | 4 | 2 | |
South Australia | 3 | 4 | 5 | 2 | 4 | 7 | 3 | 2 | |
Western Australia | 2 | 4 | 4 | 2 | 3 | 14 | 5 | 2 | |
Tasmania | 2 | 3 | 3 | 2 | 3 | 6 | 5 | 1 | |
Northern Territory | 3 | 6 | 3 | 5 | 9 | 11 | 4 | 2 | |
Australian Capital Territory | 2 | 4 | 3 | 2 | 4 | 11 | 5 | 2 | |
Australia | 1 | 2 | 2 | 1 | 1 | 5 | 2 | 1 | |
Average kilometres travelled | |
| |
New South Wales | 5 | 17 | 5 | 6 | 5 | 11 | 6 | 4 | |
Victoria | 6 | 20 | 7 | 7 | 4 | 15 | 6 | 5 | |
Queensland | 5 | 15 | 6 | 8 | 5 | 14 | 6 | 4 | |
South Australia | 6 | 18 | 8 | 7 | 7 | 22 | 7 | 5 | |
Western Australia | 6 | 16 | 7 | 8 | 5 | 28 | 7 | 4 | |
Tasmania | 7 | 16 | 9 | 6 | 5 | 19 | 7 | 5 | |
Northern Territory | 5 | 15 | 6 | 7 | 14 | 19 | 9 | 4 | |
Australian Capital Territory | 4 | 15 | 5 | 5 | 6 | 15 | 8 | 4 | |
Australia | 3 | 8 | 3 | 3 | 2 | 7 | 3 | 2 | |
| |
(a) These relative standard errors relate to the estimates in table 4. |
6 As an example of the use of an RSE, the 2003 estimate for kilometres travelled by all passenger vehicles registered in Australia is 151,743 million kilometres (Table 4 of the publication). The RSE for this estimate is 3%, as shown above. Therefore, the standard error for the 2003 estimate of kilometres travelled by passenger vehicles is 3% of 151,743 million kilometres, or 4,552 million kilometres. There are about two chances in three that the figure that would have been obtained if all vehicles had been included lies in the range 147,191 million kilometres to 156,295 million kilometres (the estimate plus or minus one standard error). There are about 19 chances in 20 that the figure lies in the range 142,639 million kilometres to 160,847 million kilometres (plus or minus two standard errors).
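The arithmetic in this example can be reproduced with a short Python sketch. The helper names `standard_error` and `interval` are hypothetical and chosen for illustration; the input figures are those quoted in the paragraph above.

```python
def standard_error(estimate: float, rse_percent: float) -> float:
    """Convert an RSE (per cent) back into a standard error."""
    return estimate * rse_percent / 100.0


def interval(estimate: float, se: float, n_se: int) -> tuple:
    """Range of the estimate plus or minus n_se standard errors."""
    return (estimate - n_se * se, estimate + n_se * se)


estimate = 151_743  # million kilometres, passenger vehicles, Australia, 2003
se = round(standard_error(estimate, 3))  # rounded to whole million kilometres

two_in_three = interval(estimate, se, 1)      # about two chances in three
nineteen_in_twenty = interval(estimate, se, 2)  # about 19 chances in 20

print(se, two_in_three, nineteen_in_twenty)
```

Because the published RSE is itself rounded to a whole per cent, a standard error derived this way is only approximate.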
7 It is important to note that estimates at more detailed levels than the above are subject to higher RSEs and therefore are less reliable.
8 RSEs for other key variables are shown in the following tables. The RSEs of further detailed variables can be made available on request.
RSE OF FUEL CONSUMPTION(a), Type of fuel - Type of vehicle |
| |
| | Passenger vehicles | Motor cycles | Light commercial vehicles | Rigid trucks | Articulated trucks | Non-freight carrying trucks | Buses | Total | |
| | % | % | % | % | % | % | % | % | |
Total fuel consumption | |
| |
Petrol | | | | | | | | | |
| Leaded | 53 | 37 | 63 | 47 | 100 | 68 | 99 | 43 | |
| Lead replacement | 13 | 29 | 18 | 23 | 66 | 28 | 47 | 11 | |
| Unleaded | 3 | 10 | 6 | 39 | 97 | 26 | 14 | 3 | |
| Total | 3 | 9 | 5 | 20 | 55 | 20 | 14 | 3 | |
Diesel | | 18 | - | 6 | 4 | 3 | 10 | 4 | 3 | |
LPG/CNG/dual fuel | | 19 | - | 15 | 29 | 70 | 37 | 19 | 13 | |
Total | | 3 | 9 | 3 | 4 | 3 | 9 | 3 | 2 | |
Average rate of fuel consumption | |
| |
Petrol | | | | | | | | | |
| Leaded | 5 | 21 | 32 | 12 | 95 | 13 | 97 | 7 | |
| Lead replacement | 3 | 7 | 7 | 9 | 13 | 21 | 25 | 3 | |
| Unleaded | 1 | 3 | 1 | 24 | 90 | 16 | 3 | 1 | |
| Total | 1 | 3 | 1 | 7 | 17 | 13 | 3 | 1 | |
Diesel | | 4 | - | 2 | 2 | 1 | 5 | 2 | 2 | |
LPG/CNG/dual fuel | | 10 | - | 5 | 17 | 67 | 28 | 13 | 5 | |
Total | | 1 | 3 | 1 | 2 | 1 | 4 | 2 | 1 | |
| |
- nil or rounded to zero (including null cells) |
(a) These RSEs relate to the estimates in table 5. |
RSE OF FREIGHT VEHICLES(a), State/territory of operation |
| |
| Light commercial vehicles | Rigid trucks | Articulated trucks | Total | |
| % | % | % | % | |
Total tonne-kilometres | |
| |
New South Wales | 11 | 12 | 4 | 7 | |
Victoria | 12 | 10 | 4 | 7 | |
Queensland | 11 | 18 | 5 | 8 | |
South Australia | 18 | 13 | 7 | 9 | |
Western Australia | 16 | 11 | 8 | 10 | |
Tasmania | 15 | 15 | 7 | 10 | |
Northern Territory | 23 | 21 | 15 | 31 | |
Australian Capital Territory | 22 | 20 | 24 | 17 | |
Australia | 6 | 6 | 3 | 4 | |
| |
(a) These RSEs relate to the estimates in table 13. |
9 Summary tables in this publication contain estimates from the 1999 to 2003 SMVUs. The SMVU is not designed to minimise the standard errors of the movements between reference periods, so care should be taken in drawing inferences from changes in data over these years. Data from the 1998 to 2002 SMVUs have been post-stratified because of problems identified with the sample selections for these surveys. For the 2003 survey these problems were rectified and post-stratification was therefore not necessary. The post-stratified results for the 1998 to 2002 SMVUs may differ from those that would have been obtained had the initial sample selections been correct. For this reason, direct comparisons between 2003 and earlier data should be made with caution. See Technical Note 2: Methodological Review in Survey of Motor Vehicle Use, Australia, 12 months ended 31 October 2002 (cat. no. 9208.0) for information on the post-stratification process.
10 The standard error for the movement can be calculated using:
$$SE(M) = \sqrt{SE(X_1)^2 + SE(X_2)^2}$$
where
$X_1$ is an estimate of the total of the variable of interest, obtained from the 1st time point
$X_2$ is an estimate of the total of the same variable of interest, obtained from the 2nd time point
$M$ is an estimate of the movement of the total of the variable of interest from the 1st time point to the 2nd time point, i.e. $M = X_2 - X_1$
11 For total kilometres travelled by type of vehicle from the 1999 and 2003 SMVUs, the standard errors of the movements and the estimates from which they are derived are shown in the following table.
SE OF THE MOVEMENT OF TOTAL KILOMETRES TRAVELLED |
| |
| LEVEL ESTIMATES (1999 and 2003) | MOVEMENT ESTIMATES | |
| |
| 1999 | RSE (1999) | 2003 | RSE (2003) | Movement | SE (Movement)(a) | |
| mill. | % | mill. | % | mill. | mill. | |
| |
Passenger vehicles | 132,706 | 3 | 151,743 | 3 | 19,037 | 5,649 | |
Motor cycles | 981 | 10 | 1,376 | 9 | 396 | 151 | |
Light commercial vehicles | 25,374 | 4 | 32,671 | 3 | 7,298 | 1,415 | |
Rigid trucks | 6,486 | 3 | 7,768 | 3 | 1,282 | 316 | |
Articulated trucks | 5,347 | 3 | 5,841 | 3 | 495 | 217 | |
Non-freight carrying trucks | 316 | 18 | 203 | 8 | -113 | 60 | |
Buses | 1,843 | 4 | 1,893 | 3 | 50 | 90 | |
Total | 173,053 | 2 | 201,497 | 2 | 28,444 | 5,830 | |
| |
(a) Calculated on unrounded data. |
12 For example, the standard error for the movement from the 1999 to the 2003 SMVU in the estimate of total kilometres travelled by all passenger vehicles registered in Australia is 5,649 million kilometres. Since the movement between the estimates, 19,037 million kilometres, is more than twice its standard error, the ABS can say with 95 percent (19 chances in 20) confidence that the movement is significantly different from zero. Note that most of the movements from the 1999 to the 2003 SMVU (all except those for non-freight carrying trucks and buses) are more than two standard errors of the movement and are therefore significantly different from zero.
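The significance check in this example can be sketched in Python as follows. The function names are hypothetical. `movement_se` combines the two standard errors in quadrature, which assumes the two survey samples are independent; that assumption is made here for illustration. The movements and standard errors in `published` are copied from the table above.

```python
import math


def movement_se(se_first: float, se_second: float) -> float:
    """Standard error of a movement between two estimates.

    Assumption: the two samples are independent, so variances add.
    """
    return math.sqrt(se_first ** 2 + se_second ** 2)


def is_significant(movement: float, se_movement: float, n_se: float = 2.0) -> bool:
    """True if the movement exceeds n_se standard errors in magnitude."""
    return abs(movement) > n_se * se_movement


# (movement, SE of movement) in million kilometres, from the table above
published = {
    "Passenger vehicles": (19_037, 5_649),
    "Motor cycles": (396, 151),
    "Light commercial vehicles": (7_298, 1_415),
    "Rigid trucks": (1_282, 316),
    "Articulated trucks": (495, 217),
    "Non-freight carrying trucks": (-113, 60),
    "Buses": (50, 90),
    "Total": (28_444, 5_830),
}

significant = {
    vehicle: is_significant(movement, se)
    for vehicle, (movement, se) in published.items()
}

# Approximation from the rounded 3% RSEs for passenger vehicles. It will not
# match the published 5,649 exactly, because that figure is calculated on
# unrounded data.
approx_se = movement_se(132_706 * 0.03, 151_743 * 0.03)
```

With the published figures, every vehicle type except non-freight carrying trucks and buses passes the two standard error test.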
NON-SAMPLING ERROR
13 Non-sampling error covers the range of errors that are not caused by sampling and can occur in any statistical collection whether it is based on full enumeration or a sample. For example, non-sampling error can occur because of non-response to the statistical collection, errors in reporting by providers, definition or classification difficulties, errors in transcribing and processing data and under-coverage of the frame from which the sample was selected. If these errors are systematic (not random) then the survey results will be distorted in one direction and therefore will be unrepresentative of the target population. Systematic errors are called bias.
14 Non-sampling error is minimised by the use of pre-advice methodology. This involves vehicle owners receiving early advice about their inclusion in the survey, which encourages a higher degree of record keeping. In addition, the reporting of odometer readings taken at the start and end of the survey periods (approximately three months apart) provides reliable estimates of total distance travelled without recall bias.
Response and non-response
15 An important factor that affects non-sampling error is the response rate achieved. Responses were received from 83% of all of the selections for 2003. After removing those vehicles that had been found to be deregistered or out of scope, the live response rate for the 2003 SMVU was 82%.
16 The ABS makes all reasonable efforts to maximise response rates. Where appropriate, mail reminders and telephone follow-up are used to attempt to contact non-responding vehicle owners.
17 A large non-response increases the potential for non-response bias, which occurs if the usage patterns of the non-responding vehicles differ significantly from those of the responding vehicles. For the SMVU, it is assumed that the characteristics of non-responding vehicles including the proportion of deregistered, out of scope and nil use vehicles are the same as for responding vehicles.
Imputation
18 As for previous surveys, the need for imputation of unfilled items on the returned questionnaires remained quite high. Imputation is the process whereby a value is generated for a missing data item by averaging the responses for similar vehicles which were operating during the reference period. Of the questionnaires returned for 2003 that reported some vehicle use, 9% needed imputation of one or more items apart from the average rate of fuel consumption. The imputation rate for average rate of fuel consumption for 2003 was 28%.
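A minimal sketch of this kind of imputation is given below, under stated assumptions. The note does not define how "similar vehicles" are grouped, so grouping by vehicle type is an assumption, and the function name, field names and records are all hypothetical rather than survey data.

```python
from statistics import mean


def impute_by_group_mean(records, group_key, item):
    """Fill missing `item` values (None) with the mean of the values
    reported by other records in the same group."""
    reported = {}
    for record in records:
        if record[item] is not None:
            reported.setdefault(record[group_key], []).append(record[item])
    group_means = {group: mean(values) for group, values in reported.items()}

    completed = []
    for record in records:
        filled = dict(record)  # leave the input records unchanged
        missing = filled[item] is None
        if missing and filled[group_key] in group_means:
            filled[item] = group_means[filled[group_key]]
        # Flag only the values that were actually filled in
        filled["imputed"] = missing and filled[item] is not None
        completed.append(filled)
    return completed


# Hypothetical responses: fuel_rate in litres per 100 kilometres
records = [
    {"vehicle_type": "Buses", "fuel_rate": 28.0},
    {"vehicle_type": "Buses", "fuel_rate": 30.0},
    {"vehicle_type": "Buses", "fuel_rate": None},
    {"vehicle_type": "Motor cycles", "fuel_rate": 6.0},
]

completed = impute_by_group_mean(records, "vehicle_type", "fuel_rate")
```

Here the third record receives the mean of the two reported bus values and is flagged as imputed. A record whose group has no reported values is left missing.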
Adjustments
19 The SMVU measures the use of all vehicles registered during the reference year. Because selections are taken from vehicles registered some time before the beginning of each collection period, adjustments are made to account for the change in size of the registered motor vehicle fleet since the population frame was created. For the 2003 SMVU the frame was created on 31 March 2002. The adjustments cover two categories:
- re-registrations - older vehicles that are returning to the registered vehicle fleet after a period of deregistration, and
- new motor vehicles - vehicles which have not been previously registered.
CONTRIBUTION OF ADJUSTMENTS FOR RE-REGISTRATIONS, Australia - SMVU 2003 |
| |
| Percentage of total kilometres travelled | |
Type of vehicle | % | |
| |
Passenger vehicles | 2 | |
Motor cycles | 6 | |
Light commercial vehicles | 2 | |
Rigid trucks | 2 | |
Articulated trucks | 4 | |
Non-freight carrying trucks | 2 | |
Buses | -1 | |
Total | 2 | |
| |
CONTRIBUTION OF NEW VEHICLES REGISTERED AFTER 31 MARCH 2002(a) |
| |
| Percentage of total kilometres travelled | |
Type of vehicle | % | |
| |
Passenger vehicles | 9 | |
Motor cycles | 17 | |
Light commercial vehicles | 11 | |
Rigid trucks | 10 | |
Articulated trucks | 14 | |
Non-freight carrying trucks | 8 | |
Buses | 11 | |
Total | 10 | |
| |
(a) Based on data from Sales of New Motor Vehicles, Australia (Cat. no. 9314.0). |
20 These activities occur continuously and the adjustments are made to account for the registrations that are estimated to have been added to or removed from the registered vehicle fleet between the population frame date and the reference period. As deaths are estimated, it is possible for the re-registration factor to be negative.
21 Users should contact the ABS if they have any queries on the quality and reliability of estimates for particular purposes.