ABS Data Quality Statement Checklist

    1. What is a Data Quality Statement?


The ABS recommends that when assessing the quality of a statistical collection or product, a Data Quality Statement be developed. A Data Quality Statement is a presentation of information about the quality of a statistical collection or product using the Data Quality Framework. These may also be referred to as Data Quality Declarations.

    2. Why use the ABS Data Quality Framework to assess Data Quality?

Among national statistical agencies, quality is generally accepted as "fitness for purpose". Fitness for purpose implies an assessment of an output with specific reference to its intended objectives or aims. Quality is a multidimensional concept that covers not only the accuracy of statistics but also other aspects such as relevance and interpretability.

The ABS Data Quality Framework is based on the Statistics Canada Quality Assurance Framework (2002) and the European Statistics Code of Practice (2005). It comprises seven dimensions of quality, reflecting a broad and inclusive approach to quality definition and assessment.

    3. How to develop a Data Quality Statement?


a. Identify the data that is being assessed.

b. Review and assess the data against each of the seven dimensions:

      1. Institutional Environment

      2. Relevance

      3. Timeliness

      4. Accuracy

      5. Coherence

      6. Interpretability

      7. Accessibility

c. Create a Data Quality Statement using the answers from the Data Quality checklist.
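
The three steps above can be sketched in code. The following is a minimal illustration only, not an ABS tool: the function names `build_statement` and `missing_dimensions`, the dictionary layout and the sample answers are all hypothetical. Only the seven dimension names come from the framework.

```python
# Illustrative sketch only: function names, layout and sample answers are hypothetical.
DIMENSIONS = [
    "Institutional Environment",
    "Relevance",
    "Timeliness",
    "Accuracy",
    "Coherence",
    "Interpretability",
    "Accessibility",
]


def build_statement(data_name, answers):
    """Assemble a plain-text statement from checklist answers.

    `answers` maps a dimension name to a list of answer strings.
    Dimensions with no answers are flagged so that gaps stay visible.
    """
    unknown = set(answers) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"Unknown dimension(s): {sorted(unknown)}")

    lines = [f"Data Quality Statement: {data_name}"]
    for dimension in DIMENSIONS:
        lines.append("")
        lines.append(dimension)
        responses = answers.get(dimension, [])
        if not responses:
            lines.append("  (not yet assessed)")
        for response in responses:
            lines.append(f"  - {response}")
    return "\n".join(lines)


def missing_dimensions(answers):
    """Return the dimensions that still have no answers, in framework order."""
    return [d for d in DIMENSIONS if not answers.get(d)]


# Hypothetical answers for two of the seven dimensions.
example_answers = {
    "Timeliness": ["Collected annually; reference period is the financial year."],
    "Accuracy": ["Sample survey; standard errors are published for key items."],
}
statement = build_statement("Example collection", example_answers)
print(statement)
print("Still to assess:", missing_dimensions(example_answers))
```

Keeping the dimensions in a fixed list means every statement presents them in the same order, and an unanswered dimension appears as an explicit gap rather than being silently omitted.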


    4. Checklist of questions to generate a Data Quality Statement

      The checklist is made up of two components: first, the type of data being assessed, and second, the questions used to identify and rate the quality of the specific data.

      a. Data Details:

What type of data are you assessing?
          • Administrative data sources (data generated or obtained as a by-product of administrative sources or processes);
          • Survey data sources; or
          • Combination of data sources.

      b. ABS Data Quality Framework checklist:

          Institutional Environment

            i. Which organisation(s) collect the data, and what sort of organisation is this (or are these)?

            ii. What authority/legislation/agreement was the data collected under?

            iii. Which organisation(s) compile the data, and what sort of organisation is this (or are these)?

            iv. Is statistical confidentiality guaranteed, and if so, under what authority or legislation?

            v. To what extent and how quickly are any identified errors in published statistics corrected and publicised?


          Relevance

            i. About whom, or what, was the data collected?

            ii. What levels of geography are data available for?

            iii. What key data items are available?

            iv. If rates and percentages have been calculated, are the numerators and denominators from the same data source(s)? If not, please provide more information.

            v. Other questions you may wish to consider include, for example:

                1. What was the original purpose for collecting the data?

                2. What does the data not represent or cover?

                3. Have standard classifications been used? If not, why not?

          Timeliness

            i. How often is the data collected or expected to be collected?

            ii. When did the data become available?

            iii. What is the reference period for the data?

            iv. Other questions you may wish to consider include, for example:

                1. Are there likely to be updates or revisions to the data after its release?

                2. Are there other, less frequent data sources containing more detailed data that can be used in other reporting years when available?

          Accuracy

            i. How was the data collected?

            ii. Has the data been adjusted in any way? If so, by how much, and for which data items?

            iii. What is the sample size?

            iv. What is the collection size?

            v. What are the standard errors for the key data items?

            vi. Please specify any known issues with undercounts. What is done to manage these?

            vii. Please specify any known issues with overcounts. What is done to manage these?

            viii. Other questions you may wish to consider include, for example:

                1. Are any sensitive questions or topics collected that may cause bias?

                2. What steps have been taken to minimise processing errors?

                3. What are the non-response, non-reporting, or item non-reporting rates?

                4. Are any parts of the population unaccounted for in the data collected?
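
Several of the Accuracy questions above ask for quantities that can be calculated directly. The sketch below shows one way to do so under stated assumptions: it treats the sample as a simple random sample with no finite population correction, which will not hold for surveys with complex designs (those need design-based variance estimates from the collecting organisation). The function names and all figures are hypothetical.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion, assuming simple random sampling."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return math.sqrt(p * (1.0 - p) / n)


def relative_standard_error(estimate, standard_error):
    """Standard error expressed as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("relative standard error is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)


def response_rate(responding_units, selected_units):
    """Percentage of selected units that responded."""
    if selected_units <= 0:
        raise ValueError("selected_units must be positive")
    return 100.0 * responding_units / selected_units


# Hypothetical figures, for illustration only.
se = proportion_standard_error(p=0.2, n=1600)
rse = relative_standard_error(0.2, se)
rate = response_rate(responding_units=1600, selected_units=2000)
print(f"SE = {se:.4f}, RSE = {rse:.1f}%, response rate = {rate:.1f}%")
```

The non-response rate is simply 100 minus the response rate. Reporting the relative standard error alongside the standard error helps users judge whether an estimate is reliable enough for their purpose.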

          Coherence

            i. How consistent is the data over time? If there are differences, what are they and what is their impact?

            ii. Are the State/Territory/Regional data consistent with each other and with the Australia-level data?

            iii. If the data for the quality statement is based on a percentage or rate, how do the numerator and denominator compare with each other? What are the differences which affect their comparability? What is the impact of these differences?

            iv. Is a time series available for this data?

            v. Other questions you may wish to consider include, for example:

                1. Have there been changes to the underlying data collection?

                2. Have any real-world events affected the data since the previous release? How have these effects on the data been managed?

                3. What other data sources is this data comparable with?

                4. What other data sources in society report similar information? How do these data sources compare?
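
One simple way to examine the question above about consistency between regional and national figures is to check whether the regional values add up to the national value. The sketch below is an illustration under assumptions: the function names, the region labels, the figures and the 0.5 per cent tolerance are all hypothetical, and a small gap may be legitimate (for example, because of rounding or confidentialisation).

```python
def additivity_gap(regional_values, national_value):
    """Return (difference, percentage difference) between the sum of
    regional values and the national value."""
    if national_value == 0:
        raise ValueError("national_value must be non-zero")
    total = sum(regional_values.values())
    difference = total - national_value
    return difference, 100.0 * difference / national_value


def is_additive(regional_values, national_value, tolerance_pct=0.5):
    """True if the regional sum is within tolerance_pct of the national value."""
    _, pct = additivity_gap(regional_values, national_value)
    return abs(pct) <= tolerance_pct


# Hypothetical figures, for illustration only.
regions = {"Region A": 400, "Region B": 350, "Region C": 245}
national = 1000

difference, pct = additivity_gap(regions, national)
print(f"Regional sum differs from national value by {difference} ({pct:.2f}%)")
print("Within tolerance:", is_additive(regions, national))
```

Any gap found this way is a prompt for explanation in the Coherence section of the statement, not necessarily evidence of an error.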

          Interpretability

            i. Is there a particular context that this data needs to be considered within?

            ii. What other information is available to help users better understand this data source?

            iii. Other questions you may wish to consider include, for example:

                1. Are there any ambiguous or technical terms that may need further explanation?

          Accessibility

            i. Can unpublished data be requested?

            ii. What are the contact details for requesting more information?

            iii. In which formats is the data available for people to use? Where and how can these be accessed?

            iv. Are there any privacy or confidentiality issues that prevent the data from being released publicly?