TECHNICAL NOTE DATA QUALITY INDICATORS
DATA QUALITY
1 When interpreting the results of a survey it is important to take into account factors that may affect the reliability of estimates. The survey procedures as well as sampling and non-sampling errors should be considered. Examination of the following quality indicators will assist users in determining fitness for purpose of estimates produced from the Survey of Motor Vehicle Use (SMVU).
SAMPLING ERROR
2 Estimates from SMVU are based on information collected for a sample of registered motor vehicles, rather than all registered vehicles. The estimates may differ from those that would have been produced if all registered motor vehicles had been included in the survey. This difference is referred to as sampling error.
3 One measure of sampling error is the Relative Standard Error (RSE), which indicates the extent to which a survey estimate is likely to deviate from the true population value, expressed as a percentage of the estimate. Estimates with an RSE of 25% or greater are subject to high sampling error and should be used with caution.
4 In the datacubes associated with this release, estimates are presented side by side with their RSEs. It is important to consider the RSEs when using estimates produced from SMVU, as they indicate the reliability of the estimates and therefore the weight that can be placed on interpretations drawn from the data.
5 Another measure of sampling variability is the Standard Error (SE), which is an indication of the sampling error expressed in numeric terms.
6 The reliability of estimates can also be assessed in terms of a confidence interval. Confidence intervals represent the range in which the population value is likely to lie. They are constructed using the estimate of the population value and its associated standard error. For example, there is approximately a 95% chance (i.e. 19 chances in 20) that the population value lies within two standard errors of the estimate, so the 95% confidence interval is equal to the estimate plus or minus two standard errors.
7 The example below demonstrates how each of the reliability measures described above can be calculated and interpreted:
Relative Standard Error (RSE)
From Table 4 of the datacube:
Total kilometres travelled by passenger vehicles, Australia, 2015-16
Estimate = 175,899 million kilometres
RSE = 2.74%
Since the RSE on the estimate is less than 25%, the estimate would be considered reliable enough for general use.
Standard Error (SE)
SE = RSE x estimate
SE (Total kilometres travelled by passenger vehicles, Australia, 2015-16) = 2.74% x 175,899 = 4,820 million kilometres
95% Confidence Interval
95% confidence interval = Estimate plus or minus 2 x SE
Lower limit of the interval = 175,899 - (2 x 4,820) = 166,259 million kilometres
Upper limit of the interval = 175,899 + (2 x 4,820) = 185,539 million kilometres
95% Confidence Interval = 166,259 to 185,539 million kilometres
There is, therefore, approximately a 95% chance that the true distance travelled by registered passenger vehicles in Australia in 2015-16 is between 166,259 and 185,539 million kilometres.
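The worked example above can be reproduced with a few lines of code. The following is a minimal sketch only: the function name `reliability_measures` and its parameter names are illustrative assumptions, not part of any ABS tool. It applies the rules stated in this note (SE = RSE x estimate, 95% confidence interval = estimate plus or minus 2 x SE, and a caution flag at an RSE of 25% or greater).

```python
def reliability_measures(estimate, rse_percent, high_rse_threshold=25.0):
    """Return the SE, approximate 95% confidence interval and a caution flag.

    estimate           -- survey estimate (here, millions of kilometres)
    rse_percent        -- relative standard error as a percentage
    high_rse_threshold -- RSE (%) at or above which caution is advised
    """
    # SE = RSE x estimate, with the RSE converted from a percentage
    se = estimate * rse_percent / 100.0
    # Approximate 95% confidence interval: estimate plus or minus 2 x SE
    lower = estimate - 2 * se
    upper = estimate + 2 * se
    return {
        "se": se,
        "ci_95": (lower, upper),
        "use_with_caution": rse_percent >= high_rse_threshold,
    }


# Total kilometres travelled by passenger vehicles, Australia, 2015-16
passenger = reliability_measures(175_899, 2.74)
print(round(passenger["se"]))          # 4820
print(passenger["use_with_caution"])   # False
```

Because this sketch keeps the SE unrounded when forming the interval, its limits can differ by about one million kilometres from the worked example, which rounds the SE to 4,820 first.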
8 It is important to note that estimates at more detailed levels than the above are subject to higher RSEs and therefore are less reliable.
9 The movement estimated by comparing SMVU data from different time periods is also subject to sampling error.
10 The standard error for the movement between two years can be approximated for SMVU using the following formula:
$$SE(\hat{M}) \approx \sqrt{[SE(\hat{X}_1)]^2 + [SE(\hat{X}_2)]^2}$$
where:
- $\hat{X}_1$ is an estimate of the total of the variable of interest, obtained from the 1st time point
- $\hat{X}_2$ is an estimate of the total of the same variable of interest, obtained from the 2nd time point
- $\hat{M}$ is an estimate of the movement of the total of the variable of interest from the 1st time point to the 2nd time point, i.e. $\hat{M} = \hat{X}_2 - \hat{X}_1$
11 Estimates of movement produced from SMVU are subject to significant sampling error, and particular caution should be used when making inferences about differences between estimates over time.
12 The example below demonstrates how the reliability of movement in SMVU estimates can be calculated and interpreted:
Standard Error (SE) of movement
From Table 4 of the datacube:
Total kilometres travelled by passenger vehicles, Australia, 2014 = 176,805 million kilometres (RSE = 3.15%)
Total kilometres travelled by passenger vehicles, Australia, 2016 = 175,899 million kilometres (RSE = 2.74%)
Movement between estimates = -906 million kilometres
SE(Movement) = 7,362 million kilometres
95% Confidence Interval of movement
95% confidence interval = Estimate plus or minus 2 x SE
Lower limit of the interval = -906 - (2 x 7,362) = -15,632 million kilometres (calculated on unrounded figures)
Upper limit of the interval = -906 + (2 x 7,362) = 13,820 million kilometres (calculated on unrounded figures)
There is, therefore, approximately a 95% chance that the true movement in distance travelled by registered passenger vehicles in Australia from 2014 to 2016 is between a decrease of 15,632 million kilometres and an increase of 13,820 million kilometres.
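The movement calculation can also be sketched in code. This sketch rests on an assumption that should be read as such: it treats the two survey samples as independent, so that the standard error of the movement is approximated by the square root of the sum of the squared standard errors of the two level estimates. That approximation reproduces the published SE for passenger vehicles to within rounding. The function name `movement_reliability` is hypothetical.

```python
import math


def movement_reliability(est_1, rse_1_percent, est_2, rse_2_percent):
    """Approximate the movement, its SE and a 95% confidence interval.

    Assumes the two estimates come from independent samples, so their
    sampling variances add (an assumption of this sketch).
    """
    # Standard errors of the two level estimates: SE = RSE x estimate
    se_1 = est_1 * rse_1_percent / 100.0
    se_2 = est_2 * rse_2_percent / 100.0
    # Movement from the 1st time point to the 2nd
    movement = est_2 - est_1
    # SE of the movement: square root of the sum of squared SEs
    se_movement = math.sqrt(se_1 ** 2 + se_2 ** 2)
    ci_95 = (movement - 2 * se_movement, movement + 2 * se_movement)
    return movement, se_movement, ci_95


# Passenger vehicles: 2014 and 2016 estimates (million km) with published RSEs
movement, se_movement, ci_95 = movement_reliability(176_805, 3.15, 175_899, 2.74)
print(movement)            # -906
print(round(se_movement))  # about 7,365 using the rounded RSEs
```

The result differs slightly from the published 7,362 million kilometres because the published figure is calculated on unrounded RSEs, whereas this sketch starts from RSEs rounded to two decimal places.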
13 The table below presents the standard error and 95% confidence intervals for the estimated movement in total kilometres travelled by type of vehicle from SMVU 2014 to SMVU 2016.
SE OF THE MOVEMENT OF TOTAL KILOMETRES TRAVELLED - 2014 and 2016(a) |
|
| LEVEL ESTIMATES (b) | MOVEMENT ESTIMATES (b) |
| 2014 | RSE (2014) | 2016 | RSE (2016) | Movement | SE (Movement) | 95% Confidence Interval of movement |
| | | | | | | Lower Limit | Upper Limit |
| mill. | % | mill. | % | mill. | mill. | mill. | mill. |
|
Type of vehicle | | | | | | | | |
Passenger vehicles | 176 805 | 3.15 | 175 899 | 2.74 | -906 | 7 362 | -15 632 | 13 820 |
Motor cycles | 2 162 | 9.12 | 2 176 | 9.71 | 13 | 288 | -564 | 591 |
Light commercial vehicles | 45 540 | 2.73 | 50 778 | 3.14 | 5 238 | 2 021 | 1 196 | 9 280 |
Rigid trucks | 9 394 | 2.96 | 10 301 | 2.90 | 907 | 407 | 92 | 1 723 |
Articulated trucks | 7 820 | 1.62 | 7 613 | 1.84 | -207 | 188 | -584 | 171 |
Non-freight carrying trucks | 346 | 14.72 | 290 | 15.63 | -55 | 68 | -192 | 81 |
Buses | 2 304 | 4.23 | 2 456 | 4.41 | 152 | 145 | -139 | 444 |
Total | 244 369 | 2.33 | 249 512 | 2.09 | 5 143 | 7 724 | -10 306 | 20 592 |
|
(a) Data for 2014 are for 12 months ended 31 October and data for 2016 are for 12 months ended 30 June. |
(b) Calculated on unrounded RSE estimates. |
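One way to read the table above is to check, for each vehicle type, whether the 95% confidence interval of the movement includes zero. If it does, the data do not give clear evidence of a real increase or decrease. The sketch below applies that check to the published limits; the dictionary layout and the helper name `interval_excludes_zero` are illustrative assumptions, not ABS conventions.

```python
# (movement, lower limit, upper limit) in millions of kilometres,
# copied from the published table of movement estimates, 2014 to 2016
movements = {
    "Passenger vehicles": (-906, -15_632, 13_820),
    "Motor cycles": (13, -564, 591),
    "Light commercial vehicles": (5_238, 1_196, 9_280),
    "Rigid trucks": (907, 92, 1_723),
    "Articulated trucks": (-207, -584, 171),
    "Non-freight carrying trucks": (-55, -192, 81),
    "Buses": (152, -139, 444),
    "Total": (5_143, -10_306, 20_592),
}


def interval_excludes_zero(lower, upper):
    """True when the whole interval lies above zero or below zero."""
    return lower > 0 or upper < 0


# Vehicle types whose 95% confidence interval does not contain zero
clear_movements = [
    vehicle_type
    for vehicle_type, (_, lower, upper) in movements.items()
    if interval_excludes_zero(lower, upper)
]
print(clear_movements)  # ['Light commercial vehicles', 'Rigid trucks']
```

On this reading, only light commercial vehicles and rigid trucks show a movement whose interval lies entirely above zero; for all other vehicle types, and for the total, the interval spans both a decrease and an increase.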
NON-SAMPLING ERROR
14 Non-sampling error covers the range of errors that are not caused by sampling and can occur in any statistical collection whether it is based on full enumeration or a sample. For example, non-sampling error can occur because of non-response to the statistical collection, errors or omissions in reporting, definition or classification difficulties, errors in transcribing and processing data and under-coverage of the frame from which the sample was selected. If these errors are systematic (not random) then the survey results will be distorted in one direction and therefore will be unrepresentative of the target population. Systematic errors result in bias.
15 A number of indicators of possible non-sampling error are outlined below.
Imputation
16 Imputation is the process whereby a value is generated for missing data. Data may be missing for a particular data item (partial imputation), or for a unit which has not responded to the questionnaire (full imputation). For SMVU, imputed values are based on responses for similar vehicles which were operating for the reference period.
17 Imputation introduces non-sampling error, and the contribution to estimates from imputed data provides one measure of the reliability of the estimates. As for previous surveys, the need for imputation of unanswered items on the returned questionnaires remained quite high. The tables below show the percentage contribution to the estimates from both partial and full imputation.
CONTRIBUTION TO ESTIMATES FROM IMPUTATION(a), State/territory of registration |
|
| Percentage of total kilometres travelled | Percentage of total tonne-kilometres travelled | Percentage of fuel consumption |
| % | % | % |
|
New South Wales | 24 | 28 | 52 |
Victoria | 20 | 36 | 48 |
Queensland | 25 | 35 | 48 |
South Australia | 20 | 30 | 47 |
Western Australia | 21 | 29 | 46 |
Tasmania | 22 | 27 | 47 |
Northern Territory | 31 | 33 | 50 |
Australian Capital Territory | 23 | 36 | 44 |
Australia | 23 | 32 | 49 |
|
(a) Includes both partial and full imputation |
CONTRIBUTION TO ESTIMATES FROM IMPUTATION(a), Type of vehicle |
|
| Percentage of total kilometres travelled | Percentage of total tonne-kilometres travelled | Percentage of fuel consumption |
| % | % | % |
|
Passenger vehicles | 22 | . . | 52 |
Motor cycles | 26 | . . | 53 |
Light commercial vehicles | 25 | 55 | 42 |
Rigid trucks | 19 | 32 | 36 |
Articulated trucks | 20 | 31 | 45 |
Non-freight carrying trucks | 18 | . . | 23 |
Buses | 11 | . . | 48 |
Total | 23 | 32 | 49 |
|
. . not applicable |
(a) Includes both partial and full imputation |
Response and non-response
18 An important factor that affects non-sampling error is the response rate achieved. The ABS makes all reasonable efforts to maximise response rates. For SMVU, mail reminders and telephone follow-up were used to attempt to contact non-responding vehicle owners. Usable responses were received from 79% of all selections for 2016, comprising 76% from registered vehicles and 3% from unregistered vehicles (including deregistered vehicles, out-of-scope vehicles and duplicates).
RESPONSE AND NON-RESPONSE BY CATEGORY |
|
| | Percentage of selections 2016 |
| | % |
|
Response received | |
| Registered vehicle | 76 |
| Unregistered vehicle(a) | 3 |
Non-response | |
| Untraceable - mailing address unknown | 3 |
| Other(b) | 18 |
Total selections | 100 |
|
(a) Includes deregistration, out of scope and duplicates. |
(b) Includes: responses that were unusable because of unresolved queries or where the vehicle was sold during the reference period and the reported data covered less than 14 days; non-response where no listing could be found to enable contact by telephone; and owner contacted by telephone but response still not secured. |
19 After removing those vehicles that had been found to be deregistered or out of scope, the response rate for the 2016 SMVU was 79%.
20 Response rates for each state and territory, and for each vehicle type, are shown in the following tables:
RESPONSE RATES, State/Territory |
|
| Response rate |
| % |
|
New South Wales | 81 |
Victoria | 79 |
Queensland | 78 |
South Australia | 83 |
Western Australia | 80 |
Tasmania | 80 |
Northern Territory | 72 |
Australian Capital Territory | 77 |
Australia | 79 |
|
RESPONSE RATES, Type of vehicle |
|
| Response rate |
| % |
|
Passenger vehicles | 76 |
Motor cycles | 74 |
Light commercial vehicles | 75 |
Rigid trucks | 79 |
Articulated trucks | 81 |
Non-freight carrying trucks | 85 |
Buses | 84 |
Total | 79 |
|
21 For the SMVU, it is assumed that the characteristics of non-responding vehicles are the same as for like responding vehicles. Non-response has the potential to cause non-response bias, which occurs if the usage patterns of the non-responding vehicles differ from those of the responding vehicles. For example, the lowest response rate achieved by vehicle type was for motor cycles (74%). This could result in the estimates for motor cycles being of lower quality than those for other vehicle types.
Frame quality
22 A population or survey frame of 18.2 million vehicles was identified on 31 January 2015 using information obtained from the state and territory motor vehicle registration authorities, as part of the annual ABS Motor Vehicle Census (MVC) (cat. no. 9309.0).
23 The reliability of this frame in providing an accurate number of vehicles in scope of the survey is indicated by the number of duplicate vehicle registrations, vehicle de-registrations prior to frame extract, and out-of-scope vehicles identified. For 2016, approximately 0.7% of the total frame was identified as falling into these categories. This indicates the frame was reliable in terms of providing an accurate number of registered vehicles in Australia.
24 Another indicator of frame quality is the number of units identified as in scope with different characteristics compared to what was recorded on the frame. For SMVU, this can arise when respondents indicate an alteration has been made to the vehicle body, resulting in a different body type to that recorded on the frame. These changes can happen during the time-lag between finalising the frame and collection of SMVU data (between 5 and 17 months). Vehicle classification anomalies can also result from data supplied by state and territory vehicle registration authorities.
25 An assessment of vehicle classification anomalies from 2016 data shows that while there was no bias towards specific states or territories, there were marked discrepancies for some vehicle types. For vehicles on the frame that were listed as non-freight carrying trucks, 25.0% were found to be other vehicle types and 13.6% of vehicles listed as buses were found to be other vehicle types. This issue was not significant for other vehicle types on the frame.
SURVEY PROCEDURES
26 The survey comprises three independent samples, with a different one used for each four-month period in the overall 12-month survey period. Estimates from each of these samples are aggregated and adjusted for new motor vehicles and re-registrations of vehicles to produce an annual estimate.
Adjustments
27 The SMVU aims to measure the use of all vehicles registered during the reference year. Because selections are taken from vehicles registered some time before the beginning of each collection period, adjustments are made to account for the change in size of the registered motor vehicle fleet since the population frame was created. For the 2016 SMVU, the frame was created on 31 January 2015. These adjustments involved two categories:
- re-registrations - older vehicles that are returning to the registered vehicle fleet after a period of de-registration, and
- new motor vehicles - vehicles which have not been previously registered.
CONTRIBUTION OF ADJUSTMENTS FOR RE-REGISTRATIONS(a), Australia - 2007, 2010, 2012, 2014 and 2016(b) |
|
| PERCENTAGE OF TOTAL KILOMETRES TRAVELLED |
| 2007 | 2010 | 2012 | 2014 | 2016 |
| % | % | % | % | % |
|
Type of Vehicle | | | | | |
Passenger vehicles | 3 | 2 | 1 | - | - |
Motor cycles | 7 | 8 | 7 | 1 | 3 |
Light commercial vehicles | 2 | 2 | 2 | - | 1 |
Rigid trucks | 2 | 3 | 3 | - | 2 |
Articulated trucks | 4 | 4 | 4 | -1 | 2 |
Non-freight carrying trucks | 2 | 6 | 1 | 1 | 2 |
Buses | -2 | 6 | 5 | 2 | 2 |
Total | 3 | 2 | 1 | - | - |
|
- nil or rounded to zero (including null cells) |
(a) Estimates for 2014 were produced using a different method from that used in 2007, 2010, 2012 and 2016. The contribution of adjustments for re-registrations for 2014 is not comparable with other years. |
(b) Data for 2007, 2010, 2014 are for 12 months ended 31 October. Data for 2012 and 2016 are for 12 months ended 30 June. |
28 These activities occur continuously and the adjustments are made to account for the registrations that are estimated to have been added to or removed from the registered vehicle fleet between the population frame date and the end of the reference period. The adjustment process also accounts for de-registrations. This means it is possible for the re-registration factor to be negative.
CONTRIBUTION OF NEW VEHICLES REGISTERED AFTER FRAME CREATION - 2007, 2010, 2012, 2014 and 2016(a) |
|
| PERCENTAGE OF TOTAL KILOMETRES TRAVELLED |
| 2007 | 2010 | 2012 | 2014 | 2016 |
| % | % | % | % | % |
|
Type of Vehicle | | | | | |
Passenger vehicles | 10 | 9 | 7 | 10 | 7 |
Motor cycles | 15 | 11 | 9 | 11 | 8 |
Light commercial vehicles | 14 | 10 | 8 | 11 | 7 |
Rigid trucks | 12 | 8 | 6 | 6 | 6 |
Articulated trucks | 17 | 11 | 9 | 16 | 8 |
Non-freight carrying trucks | 9 | 8 | 13 | 13 | 11 |
Buses | 16 | 5 | 5 | 3 | 4 |
Total | 11 | 9 | 7 | 10 | 7 |
|
(a) Data for 2007, 2010, 2014 are for 12 months ended 31 October. Data for 2012 and 2016 are for 12 months ended 30 June. |
Nil use
29 Some providers may report nil use for the four-month reference period in which they were selected. Nil use vehicles are registered vehicles for which no travel was reported during that specific reference period. Nil use vehicles are included in the survey as their reported nil use is representative of other vehicles in the population. Vehicles may have nil use due to factors such as seasonal usage, mechanical faults or economic conditions. Where a provider gives a nil use response, a follow-up phone call is used to check the veracity of the response.
NIL USE, Vehicle type - 2007, 2010, 2012, 2014 and 2016(a) |
|
| 2007 | 2010 | 2012 | 2014 | 2016 |
NUMBER OF REGISTERED VEHICLES WITH NIL USE |
|
Passenger vehicles | 456 884 | 561 613 | 479 179 | 476 348 | 315 089 |
Motor cycles | 125 547 | 148 217 | 182 308 | 196 887 | 231 039 |
Light commercial vehicles | 114 241 | 122 227 | 71 292 | 103 727 | 99 456 |
Rigid trucks | 36 660 | 34 647 | 36 549 | 38 541 | 39 461 |
Articulated trucks | 3 680 | 5 165 | 6 162 | 6 652 | 5 092 |
Non-freight carrying trucks | 1 418 | 2 424 | 3 157 | 2 566 | 1 532 |
Buses | 1 510 | 2 831 | 1 809 | 2 006 | 2 644 |
Total | 739 940 | 877 123 | 780 455 | 826 725 | 694 315 |
PROPORTION OF REGISTERED VEHICLES WITH NIL USE (%) |
|
Passenger vehicles | 4 | 4 | 5 | 4 | 2 |
Motor cycles | 22 | 25 | 23 | 26 | 28 |
Light commercial vehicles | 6 | 5 | 5 | 3 | 3 |
Rigid trucks | 9 | 9 | 8 | 8 | 8 |
Articulated trucks | 6 | 5 | 6 | 7 | 5 |
Non-freight carrying trucks | 7 | 7 | 11 | 15 | 7 |
Buses | 2 | 2 | 4 | 2 | 3 |
Total | 5 | 5 | 6 | 5 | 4 |
|
(a) Data for 2007, 2010, 2014 are for 12 months ended 31 October. Data for 2012 and 2016 are for 12 months ended 30 June. |