Appendix A. Data quality framework
During phase 1 of this review, the review team used the ABS data quality framework to help assess whether the SMVU data are 'fit for purpose'. A summary of this framework is provided below for reference.
The ABS data quality framework incorporated into the decision cycle is based on a framework developed by Statistics Canada, which identifies six key dimensions of data quality:
Relevance
Accuracy
Timeliness
Accessibility
Interpretability
Coherence
This data quality framework has been published internationally (Brackstone, G. (1999), "Managing Data Quality in a Statistical Agency", Survey Methodology, Vol. 25, No. 2, Statistics Canada) and has been recommended by the ANAO as 'better practice' in specifying performance measures ("ATO Performance Reporting under the Outcomes and Outputs Framework, Australian Taxation Office", Audit Report No. 46 2000-01, pp. 63-64).
More specifically, the six dimensions of quality can be described as follows:
The relevance of statistical information reflects the degree to which it meets the real needs of clients. It is concerned with whether the available information sheds light on the issues most important to users. Relevance is generally described in terms of key user needs, key concepts and classifications used and the scope of the collection (including the reference period). These components are then compared against specific user needs to assess relevance.
The accuracy of statistical information is the degree to which the information correctly describes the phenomena it was designed to measure. It is usually characterised in terms of error in statistical estimates and is traditionally decomposed into bias (systematic error) and variance (random error) components. It may also be described in terms of major sources of error that potentially cause inaccuracy (e.g. sampling, non-response).
The timeliness of statistical information refers to the delay between the reference point (or the end of the reference period) to which the information pertains, and the date on which the information becomes available.
The accessibility of statistical information refers to the ease with which it can be referenced by users. This includes the ease with which the existence of information can be ascertained, as well as the suitability of the form or medium through which the information can be accessed. The cost of the information may also be an aspect of accessibility for some users.
The interpretability of statistical information reflects the availability of the supplementary information and metadata necessary to interpret and utilise it appropriately. This supplementary information normally covers the concepts, classifications and measures of accuracy that apply to the data, and it should be both available and clearly expressed. Interpretability also includes presenting the data in a way that aids their correct interpretation.
The coherence of statistical information reflects the degree to which it can be successfully brought together with other statistical information within a broad analytic framework and over time. Coherence encompasses the internal consistency of a collection as well as its comparability both over time and with other data sources. The use of standard concepts, classifications and target populations promotes coherence, as does the use of common methodology across surveys.
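As an illustration of the accuracy dimension described above, the traditional decomposition of error into bias and variance can be written as a standard identity for the mean squared error of an estimate. The notation here is an assumption introduced only for this sketch and does not come from the framework itself: \hat{\theta} denotes a survey estimate and \theta the true value it is intended to measure.

```latex
\[
\operatorname{MSE}(\hat{\theta})
  = \mathbb{E}\!\left[(\hat{\theta} - \theta)^{2}\right]
  = \underbrace{\left(\mathbb{E}[\hat{\theta}] - \theta\right)^{2}}_{\text{bias}^{2}\ \text{(systematic error)}}
  + \underbrace{\operatorname{Var}(\hat{\theta})}_{\text{variance (random error)}}
\]
```

Read this way, sampling error contributes mainly to the variance term, while sources such as non-response can contribute to the bias term if respondents differ systematically from non-respondents.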