Address to the Graduate Data Forum 2020
(Hosted by IPAA in partnership with the Graduate Data Network)
Dr David Gruen
Tuesday 7 April 2020
A data-based APS, reflections and new directions
Introduction
I would like to begin by acknowledging the Ngunnawal people, the Traditional Custodians of this land. I would also like to pay my respects to their Elders past and present and extend that respect to Aboriginal and Torres Strait Islanders present here today.
Thank you to IPAA and the Graduate Data Network for providing the opportunity for me to speak today, even though it has to be done virtually, rather than face-to-face.
Just at the moment, it is hard to talk about anything other than COVID-19.
In less than a month, the spread of COVID-19 has turned many of our lives upside down and led to rapid-fire responses by governments, each more remarkable than the one before, of a kind that none of us has seen in our lifetimes.
Importantly for the topic today, the spread of COVID-19 has also highlighted the importance of high-quality trusted data in circumstances where it is particularly difficult for the community and policymakers to quickly and accurately assess what is going on.
I am going to talk about a few different things today.
But with COVID-19 on my mind, let me begin by talking about some of the responses the ABS has in train to provide data that is as up-to-date as possible to inform the community and policymakers.
In a press release I issued on 16 March, I said:
“The economic implications of the spread of the coronavirus, COVID-19, are highly uncertain. In these circumstances, there are sizeable benefits for the community and governments to have access to information about the economic responses of individuals and businesses that is as up-to-date as possible.
The ABS has considered what additional, more up-to-date information it can provide, over and above the existing statistical releases, to enhance understanding of the economic impacts of the coronavirus.”
I then listed a series of extra surveys the ABS is running, as well as other data we committed to release in preliminary form more quickly than normal.
The first new survey we ran, ‘Business Impacts of COVID-19’, was designed in double-quick time, with help from Treasury guiding us on the information they thought would be valuable to collect. Given that it was no longer safe to conduct face-to-face interviews, this survey was conducted exclusively over the phone.
Interviews began on Monday, 16 March, and were completed on Monday, 23 March. We had planned to collect data for the survey for longer than this, but stopped early because the Australian Government announced Stage 1 restrictions on social gatherings on Sunday, 22 March. That was a profound change in the economic environment facing many firms, which would have created a big structural break in these data had we continued to collect them.
Six days of interviews enabled us to collect responses from just over 1,200 businesses, and provided enough detail for us to report results disaggregated by business size – that is, by number of employees – and by industry sector.
Having finished collecting data for the survey on 23 March, we published the results on our website on 26 March – three days later.
I can’t be certain, but I suspect that may be the fastest survey ever conducted from start to finish by the ABS.
I don’t know who came up with the phrase ‘never waste a crisis’ but it seems particularly apt to me at present!
We have now finished collecting data for the second iteration of this business survey, and will be publishing results shortly.
We plan to run this business survey at least monthly while the economic effects of the spread of COVID-19 continue to evolve, and continue to be of relevance to the community and policymakers.
We are also running a household survey, which we have dubbed Rapid Acquisition of Population Information and Data, or RAPID. This household survey is collecting information on changes in the employment circumstances of members of the household, and on the impacts that the restrictions on social gatherings are having on people.
We completed enumeration of this survey late last week, having received responses from just over 1,000 households. As with the business survey, our aim is to publish the results as soon as possible.
Along with these new surveys, we are identifying new sources of data to help shed light on the evolving implications of the spread of the virus. These include using:
- near-real-time scanner data from the supermarket chains to reveal the community’s changing expenditure patterns (toilet paper anyone?);
- Single Touch Payroll data from the ATO to shed light on employment and income patterns across industries; and
- interactive mapping technology to provide geographical information on the distribution across Australia of older people, and those with health risk factors relevant to their susceptibility to infection with COVID-19.
As I said, never waste a crisis!
New technology – Automated Image Recognition in the ABS Address Register
Underpinning the Census and our household collections is the ABS Address Register. Every quarter, there are around 500,000 ‘use of address’ changes which are signalled by comparisons with administrative data. (For example, in a typical quarter there might be 100,000 new addresses identified and 400,000 changes noted such as the demolition, subdivision, conversion or completion of a building at a given address.)
Ideally, all of these potential changes would be quality assured, but due to resource limitations, only around 34,000 addresses were previously able to be confirmed by desktop review each quarter. These traditional reviews use aerial imagery and other research tools to make assessments of address use and dwelling structure.
In collaboration with CSIRO's Data61, the ABS has developed a machine learning model called Automated Image Recognition (AIR). This allows automatic checking of large volumes of addresses based purely on aerial imagery. Compared with the 34,000 address checks previously completed each quarter, the ABS will be able to quality assure five times that number of addresses using the same resources. The new approach will result in a much better register, which in turn saves time and money during data collection.
This new technology has just gone into production. The Address Register team is now working with the UNECE High-Level Group for the Modernisation of Official Statistics to share lessons from this project with other national statistical organisations.
Identifying agricultural crops using satellite imagery
The ABS is sharing microdata with Geoscience Australia to identify the presence of agricultural crops, drawing on a subset of the Rural Environment and Agricultural Commodities Survey (REACS) and the Agriculture Census. The presence of crops is identified at the level of the cadastre (land parcel), which is the finest level of spatial detail that can be produced from ABS Agriculture surveys.
The basic microdata will be used with satellite imagery data, allowing trusted users to develop machine learning methods to better identify different crops from satellite imagery. By providing these data, the ABS significantly increases the amount and spatial extent of validation information, which will allow machine learning approaches to be more effective.
These data are being used to support the creation of open source, national, land use and land cover maps that will be a key input into a National Land Account statistical release. The Land Use and Land Cover maps will also assist the government's ability to respond to major events such as floods, fires and droughts.
In the longer term, these data will also be used to create maps of specific crops at a national scale, which will assist in producing agriculture statistics.
Future innovation – Single Touch Payroll (STP) data
The possibility now exists to compile national statistics, of a consistently high quality and on a consistent conceptual basis, without running a purpose-built collection at all. Single Touch Payroll (STP) is a new way of reporting tax and superannuation information to the Australian Taxation Office (ATO). With STP, businesses report employees' payroll information – such as salaries and wages, pay as you go (PAYG) withholding and super – to the ATO each time they pay employees through STP-enabled software.
From 1 July 2018, large employers with 20 or more employees were brought into STP, and small employers with 19 or fewer employees started from 1 July 2019. The ATO is already receiving over 2 million employer records and 20 million employee records every month. (This accounts for 95% of large employers and 50% of small employers.)
STP data provides enormous potential for the ABS to fundamentally change how we use administrative data to compile official statistics. In addition to the comprehensive and detailed nature of the data, the highly frequent and timely flow of STP data also presents new opportunities.
STP has the potential to further reduce business reporting burden if STP data can supplement, substitute for or replace some business survey questions (and potentially entire surveys, such as the Average Weekly Earnings collection). This would certainly be a revolutionary rather than evolutionary change.
With all of these changes to the data landscape, and new methods of analysis opening up all the time, it’s an exciting time to be a data analyst, or even the Australian Statistician!
Thank you.