Australian Bureau of Statistics
3412.0.55.002 - Information Paper: Further Improvements to Net Overseas Migration Estimation, Dec 2013
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 17/12/2013
The paper will conclude with a discussion of the potential for further improvements.
Estimates of both ERP and NOM for Australia and each of the States and Territories are published quarterly in Australian Demographic Statistics (cat. no. 3101.0). The improvements outlined for preliminary NOM in this paper will be introduced from the June 2013 issue of Australian Demographic Statistics (cat. no. 3101.0), due for release on 17 December 2013. Changes made to final NOM, from September quarter 2006 onwards, were released in the December 2012 issue of Australian Demographic Statistics (cat. no. 3101.0) on 20 June 2013.
Conceptually, NOM is based on an international traveller's duration of stay being in or out of Australia for 12 months or more. With the introduction of the '12/16 month rule' method for estimating NOM, these 12 months do not have to be continuous and are measured over a 16 month reference period. For example, whether a traveller is in or out of the population is determined by their exact duration of stay in, or away from, Australia over the 16 months following arrival or departure. The 'duration of stay' is a key component in the successful measurement of NOM.
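As an illustration only, the '12/16 month rule' for an arriving traveller can be sketched as below. The day counts used for "12 months" and "16 months", the function names and the record layout are assumptions made for this example; the paper states the rule in months and does not describe the ABS implementation.

```python
from datetime import date, timedelta

# Assumed day-count approximations; the paper gives the rule in months only.
THRESHOLD_DAYS = 365   # "12 months"
WINDOW_DAYS = 487      # "16 months" (roughly 16 * 30.44 days)


def days_in_australia(stays, window_start, window_days=WINDOW_DAYS):
    """Total days spent in Australia inside the reference window.

    `stays` is a list of (arrival, departure) date pairs, with the
    departure date treated as exclusive. Stays need not be continuous.
    """
    window_end = window_start + timedelta(days=window_days)
    total = 0
    for arrival, departure in stays:
        start = max(arrival, window_start)   # clip the stay to the window
        end = min(departure, window_end)
        if end > start:
            total += (end - start).days
    return total


def counts_as_nom_arrival(stays, first_arrival):
    """True if the traveller is in Australia for 12 of the 16 months."""
    return days_in_australia(stays, first_arrival) >= THRESHOLD_DAYS


# A traveller who leaves for a month part-way through still qualifies.
stays = [(date(2012, 1, 1), date(2012, 7, 1)),
         (date(2012, 8, 1), date(2013, 6, 1))]
print(counts_as_nom_arrival(stays, date(2012, 1, 1)))
```

The same test, applied to days spent away from Australia after a departure, would give the corresponding check for NOM departures.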
To estimate preliminary NOM, before data for the full 16 months become available, the ABS has a propensity model that uses migration adjustments derived from final NOM one year earlier. The migration adjustments are applied to travellers with similar characteristics, grouped according to the following variables: initial category of travel, age, country of citizenship, and state or territory of usual/intended residence.
The overseas arrivals and departures (OAD) data is the main input data used in the estimation of NOM. Prior to a rebuild of the OAD system by the ABS, as noted later in this paper, the old processing system automatically set any missing duration of stay to a short-term movement. It therefore had a deflationary impact on the long-term categories of travel. This directly affected the quality of the initial category of travel, which is the first of the grouping variables used within the propensity model for preliminary NOM estimation.
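The propensity approach described above can be pictured with a minimal sketch. The paper does not give the formula for the migration adjustments, so this example assumes that an adjustment is simply the share of travellers in a group, one year earlier, who ended up counted in final NOM. The function names and the group labels are hypothetical, and only the arrivals side is shown.

```python
from collections import defaultdict


def migration_adjustments(final_records):
    """Share of travellers in each group who were counted in final NOM.

    `final_records` is an iterable of (group, in_final_nom) pairs taken
    from the period one year earlier. A group would be a combination such
    as (initial category of travel, age, citizenship, state).
    """
    totals = defaultdict(int)
    counted = defaultdict(int)
    for group, in_final_nom in final_records:
        totals[group] += 1
        counted[group] += int(in_final_nom)
    return {group: counted[group] / totals[group] for group in totals}


def preliminary_estimate(current_counts, adjustments, default=0.0):
    """Apply last year's adjustments to the current traveller counts."""
    return sum(count * adjustments.get(group, default)
               for group, count in current_counts.items())


# Hypothetical groups, for illustration only.
last_year = [("long-term visitor", True)] * 3 + [("long-term visitor", False)]
last_year += [("short-term visitor", False)] * 2
adjustments = migration_adjustments(last_year)
print(preliminary_estimate({"long-term visitor": 200,
                            "short-term visitor": 100}, adjustments))
```

Because the groups are formed partly from the initial category of travel, any error in that variable carries straight through to the adjustment that is applied.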
Source of overseas migration data
The ABS statistics on overseas migration are calculated using administrative data collected and compiled by the Department of Immigration and Border Protection (DIBP). At present, the main source of data on overseas migration is incoming and outgoing passenger cards, matched with data from passports and visa permits. Information from these three data sources is collected, compiled and matched by DIBP and stored with movement records on its Travel and Immigration Processing System (TRIPS). Each month these matched OAD records are supplied to the ABS and then processed within the OAD system.
Quarterly NOM estimates are sourced from this processed monthly OAD matched data and then combined with monthly extracts of unmatched OAD records. Unmatched OAD records are those where an inward/outward movement has been recorded by DIBP within the TRIPS system, but the movement could not be matched with an equivalent passenger card, passport or visa permit.
3. IMPROVING THE QUALITY OF INPUT DATA USED IN ESTIMATING NOM - (NOM IMPROVEMENTS 2013)
Rebuild of the OAD system
In 2013, the ABS completed a rebuild of the OAD system (ROADS). The primary aim of this project was to improve the quality of OAD data, given its importance as the main data used to estimate NOM. The new system was thoroughly tested by processing over ten years of data. This time frame allowed for the complete re-processing of the NOM time series to incorporate the improvements and a thorough assessment of any changes to NOM estimation. It also allowed for new final NOM estimates for the 2006-2011 period to be produced and incorporated into the final rebasing of Australia's population estimates which was released in Australian Demographic Statistics, December Quarter 2012 (cat. no. 3101.0) on 20 June 2013. In addition, from 2006 onwards the ERP by country of birth series will also be updated with this improved data.
Detailed information on the changes and improvements made with the complete rebuild of the OAD system (ROADS), and the new OAD data time series from July 2004, will be made available with the release of Overseas Arrivals and Departures, Australia, January 2014 (cat. no. 3401.0) scheduled for 11 March 2014.
Through the process of the rebuild, all derivations, logical edits and imputations have been re-designed based on the best information, practices and methodology available at the time. All imputations within the rebuilt OAD system use a hot deck imputation method. For hot deck imputations, if a record has missing responses (called a recipient), then it receives those of another record (called a donor) which had a full set of responses before the imputation process began. The recipient record keeps all of its original responses and only has the missing responses imputed, thereby keeping as much of the collected information for that record as possible.
The idea behind this imputation is to use a set of characteristics that make the donor and recipient records as similar as possible. The characteristics used within the rebuilt OAD system vary between the different imputations. Different combinations of characteristics were tested for each of the imputations to ascertain which would give better results. The characteristics used include age, country of citizenship, country of stay, direction of travel, initial category of travel, passenger card box type, reason for journey and sampled or non-sampled data.
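A minimal sketch of a hot deck imputation of this kind is given below. It is not the ABS implementation: the function name, the record layout and the choice of a random donor within each matching class are assumptions made for illustration.

```python
import random


def hot_deck_impute(records, match_keys, target, rng=None):
    """Fill a missing `target` value from a donor with matching characteristics.

    Donors are records that had a full set of responses before imputation
    began, so imputed values are never passed on to other recipients.
    Returns new records; the originals are left unchanged.
    """
    rng = rng or random.Random()

    # Build the donor pools from the original, fully reported records.
    donors = {}
    for rec in records:
        if all(value is not None for value in rec.values()):
            key = tuple(rec[k] for k in match_keys)
            donors.setdefault(key, []).append(rec)

    imputed = []
    for rec in records:
        out = dict(rec)                      # keep every original response
        if out.get(target) is None:
            pool = donors.get(tuple(rec[k] for k in match_keys))
            if pool:                         # leave missing if no donor exists
                out[target] = rng.choice(pool)[target]
        imputed.append(out)
    return imputed


records = [
    {"age": "25-29", "citizenship": "NZ", "duration_of_stay": "long-term"},
    {"age": "25-29", "citizenship": "NZ", "duration_of_stay": None},
    {"age": "60-64", "citizenship": "UK", "duration_of_stay": None},
]
for rec in hot_deck_impute(records, ["age", "citizenship"], "duration_of_stay"):
    print(rec)
```

In this sketch a recipient with no matching donor is left missing; in practice a coarser set of matching characteristics would usually be tried next.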
There are a number of imputations undertaken that specifically improve the quality of variables that flow through to the data used in NOM processing. They include country of stay, duration of stay, initial category of travel, passenger card box type, reason for journey and a specific one for the country of birth of New Zealand citizens (the latter of which is summarised in Appendix 1). Improving the initial category of travel imputation, in particular, has provided specific changes to the input data used within the NOM propensity model, which in turn has improved preliminary NOM estimation.
4. RESULTS OF IMPROVEMENTS MADE TO PRELIMINARY NOM ESTIMATION
The rebuild of the OAD system has improved the quality of matched OAD data. This OAD data is the main data used in the estimation of NOM. The improvements in this input data have therefore also provided improvements in the preliminary NOM estimates.
As shown in the table below, the previous OAD input data produced a difference between preliminary and final NOM of 24,935 persons in 2006-07. When the new OAD input was used, the difference was 21,118, an improvement of 15%. For the 23 quarters tested, only 4 quarters (June 2009, June 2010, December 2010 and March 2012) did not show an improvement to preliminary NOM. At the annual level, 2010-11 was the only year that did not show an improvement to preliminary NOM. Although there are fluctuations from quarter to quarter and for each of the States and Territories (see Graph 1), there is a consistent improvement over time.
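The 15% figure for 2006-07 can be checked directly from the two differences quoted above. The function name below is only for illustration.

```python
def improvement(old_gap, new_gap):
    """Relative reduction in the gap between preliminary and final NOM."""
    return (old_gap - new_gap) / old_gap


# 2006-07: gap using the previous OAD input versus the new OAD input.
share = improvement(24_935, 21_118)
print(f"{share:.1%}")  # 15.3%
```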
Analysis of the changes resulting from using new OAD data for processing preliminary NOM, compared with using the old OAD data, shows a reasonably consistent improvement for each of the States and Territories over time. Any positive number reflects an improvement on the previous preliminary NOM estimate, whereas a negative number indicates the reverse (Graph 1). For Victoria, there is an improvement each year in preliminary NOM from using the new OAD input data. For Queensland, the Northern Territory and the ACT there was an improvement in 5 of the 6 years tested. For NSW, Western Australia and Tasmania there was an improvement in 4 of the 6 years tested. Although there were fluctuations for South Australia from year to year, there was still a net improvement over the whole time period.
Graph 1 - Annual changes to preliminary NOM estimates - by State(a), based on a comparison between using old & new OAD input data
5. IMPROVED QUALITY OF CHARACTERISTICS AVAILABLE FROM FINAL NOM DATA
As mentioned earlier, there are a number of imputations undertaken that improve the quality of variables which flow through to the input data for NOM processing. Not only has this improved the data used for estimating NOM, but also the analytical dataset called the Travellers' Characteristics Database. Therefore, there are improvements to the quality of the following variables: country of birth, country of stay, initial category of travel and reason for journey.
A special imputation in the rebuilt OAD system to improve the quality of country of birth of New Zealand citizens has flowed through to NOM data, and thereby to the Travellers' Characteristics Database as well as the ERP by country of birth series from 2006 onwards. The table below shows changes to NOM for the top 10 countries of birth for New Zealand citizens over the intercensal period 2006 to 2011. It compares the previous NOM (which uses old OAD input data) with the improved NOM (which uses new OAD input data). It clearly shows that the old method had been imputing too many travellers as New Zealand born: 91.9% of all New Zealand citizens who contributed to NOM during this period. For information on the new imputation for country of birth of New Zealand citizens see Appendix 1.
An additional comparison between NOM estimates and Census data also highlights the improvements made to the country of birth data. For more information see Appendix 2: Comparison of recent migrants in NOM and Census data.
6. CHANGES TO REVISION TIMETABLES FOR NOM
The quarterly variability always experienced in Australia's population growth is predominantly driven by changing trends in NOM. The previous practice was to revise NOM estimates only once every six months. To help reduce the impact of possible large revisions to population estimates under that practice, the ABS has changed to a quarterly revision cycle. Consultation undertaken with major stakeholders prior to changing the revision cycle showed there was general support for this change.
The first quarterly revision cycle for publishing final NOM started with the March 2013 issue of Australian Demographic Statistics (cat. no. 3101.0), released on 26 September 2013.
7. FUTURE DIRECTIONS
The NOM improvements 2013, outlined in this information paper, provide an update on some of the major work recently undertaken by the ABS to improve the estimation and quality of statistics on NOM.
Additional investigations are planned, which will likely result in further improvements. For example, extensive work has already been undertaken by the ABS to examine the groupings of travellers that are used by the propensity model for estimating preliminary NOM. With the improvements to the input OAD data used to estimate NOM, as noted in this paper, and the longer time series of final NOM estimates that is now available, the ABS will revisit the propensity model and re-examine the cross-classifications used. Currently, groupings are made by the following variables: initial category of travel, age, country of citizenship and state or territory of usual/intended residence. The effectiveness of other variables such as direction of travel, country of birth, port code and visa class will be examined and other areas of research such as the use of time series analysis may be undertaken. However, their use for improving preliminary NOM estimation will depend on the feasibility of being able to implement them.
The ABS will continue to collaborate with DIBP on projects to identify and improve the quality of the administrative data within the Travel and Immigration Processing System (TRIPS).
The ABS data referred to throughout this paper is sourced exclusively from data provided by the Department of Immigration and Border Protection (DIBP) each month. Their continued cooperation and support is highly valued and appreciated; without it, the wide range of statistics available on overseas arrivals and departures, net overseas migration and the country of birth of Australian residents published by the ABS would not be available. All data received by the ABS is treated in strict confidence, as required by the Census and Statistics Act 1905.
8. APPENDIX 1. Specific Imputation for Country of Birth of New Zealand Citizens
With the introduction of biometric passports for New Zealand (NZ) citizens in April 2005, the country of birth information was removed from the passport and replaced with a place of birth, for example Wellington, Auckland, Christchurch or Melbourne to name just a few. The passport was the only source of information on the country of birth of NZ citizens travelling to, or from, Australia. For other travellers who are not NZ citizens, country of birth information can be obtained from their passport or visa information. Visa information for most NZ citizens is not available as, under the trans-Tasman agreement, they do not need to hold a visa prior to travel to Australia.
Therefore, with the increased numbers of travellers holding NZ biometric passports, the proportion of movement records with a missing country of birth increased substantially. For example, by April 2013 the matched OAD data showed the number of records missing a country of birth was 251,176. NZ passport holders represented 95% of these records. Earlier, for April 2005, NZ passport holders represented only 6% of the records missing a country of birth. By April 2007 this had increased to 79%. As a temporary fix to alleviate this issue, a basic imputation was introduced by the ABS in late 2007.
In 2013, in order to greatly improve on the imputation introduced in 2007, a special imputation for country of birth of NZ citizens has been put in place within the rebuilt OAD system. It will improve country of birth statistics in the OAD, NOM, and ERP by country of birth collections. It will be introduced in the OAD collection for the first time with the release of the January 2014 OAD data and revised back to July 2004; for the NOM collection and the Travellers' Characteristics Database it will be revised back to December quarter 2003, and for the ERP collection by country of birth it will be revised back to 2006.
There are five steps to the process to generate country of birth when missing:
1. Prior to imputation, if country of birth is missing for a NZ citizen, the system scans historical records of NZ citizens back to 2003 to see if there is an earlier record of the individual's country of birth. The system is able to do this through the use of a unique personal identifier provided to each traveller who crosses Australia's international border. If a matching record is found, its country of birth is used and no imputation is needed. Currently, in 2013, approximately 80% are being matched.
2. If country of birth is still unknown after Step 1, the system will scan all previous imputations for country of birth for NZ citizens to see if there is an existing record for that individual. This ensures an individual's country of birth is only ever imputed once although they may cross Australia's international borders many times.
3. If country of birth is still unknown after Step 2, but there is a place of birth supplied on the NZ biometric passport, then a place of birth to country of birth concordance is used. This concordance is dynamic and is updated each month from the historical time series, which is also updated monthly with additional data supplied by DIBP. The historical time series is summarised (added up) for each place of birth, separately within each country of birth. That is, if the name of a place of birth is used in more than one country (for example, 'Wellington' can be found in Australia, Canada, NZ, South Africa, the UK and the USA), then the method adds up the number of instances within each of those countries from the historical series. This summary is the concordance between place of birth and country of birth. In essence, where a record is missing country of birth, the imputation considers all possible donors with a matching place of birth. It then chooses a random donor, with probability based on how often each country occurs in the concordance, and copies across the donor's corresponding country of birth.
Usually by the end of Step 3, up to 98% of NZ citizens with a missing record have been provided a country of birth.
4. If country of birth is still unknown after Step 3, but there is a place of birth supplied then a search is done on all NZ towns and place names. If a match is found it is assumed the country of birth of that record is New Zealand. Currently, very few records are imputed at this stage.
5. Lastly, if country of birth is still unknown for any NZ citizen after all other steps are taken, then the standard hot deck imputation is applied but only for non-New Zealand born as it is assumed any New Zealand born should have been picked up in the previous 4 steps. Currently, less than 1% usually make it to this stage of the imputation.
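The five steps above can be sketched as a simple cascade. This is an illustration under stated assumptions rather than the ABS system: the function names, the record fields (`person_id`, `place_of_birth`), the in-memory lookups and the `fallback` callable standing in for the Step 5 hot deck are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def build_concordance(history):
    """Count historical (place of birth, country of birth) pairs."""
    concordance = defaultdict(Counter)
    for place, country in history:
        concordance[place][country] += 1
    return concordance


def draw_country(place, concordance, rng):
    """Pick a country for `place`, weighted by its historical frequency."""
    counts = concordance.get(place)
    if not counts:
        return None
    countries = list(counts)
    weights = [counts[c] for c in countries]
    return rng.choices(countries, weights=weights, k=1)[0]


def country_of_birth(record, known, imputed, concordance, nz_places,
                     fallback, rng):
    """Return (country, step) for a NZ citizen missing country of birth."""
    pid = record["person_id"]
    place = record.get("place_of_birth")

    if pid in known:                 # Step 1: earlier reported record
        return known[pid], 1
    if pid in imputed:               # Step 2: reuse an earlier imputation
        return imputed[pid], 2

    step = 3                         # Step 3: weighted draw from concordance
    country = draw_country(place, concordance, rng) if place else None
    if country is None and place in nz_places:
        country, step = "New Zealand", 4    # Step 4: NZ place-name match
    if country is None:
        country, step = fallback(record), 5  # Step 5: hot deck, non-NZ born

    imputed[pid] = country           # impute each individual only once
    return country, step


history = [("Wellington", "New Zealand")] * 9 + [("Wellington", "Australia")]
concordance = build_concordance(history)
imputed = {}
print(country_of_birth({"person_id": "p1", "place_of_birth": "Wellington"},
                       {}, imputed, concordance, set(),
                       lambda rec: "Other", random.Random()))
```

Storing each imputed value against the personal identifier is what allows Step 2 to return the same country of birth on every later border crossing.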
9. APPENDIX 2. Comparison of recent migrants in NOM and Census data
It is difficult to make a direct comparison between how recent migration is captured by the Census-based estimates and how it is measured in NOM, since the Census does not collect the cross-border movement information required to establish whether a person satisfies the '12/16 month rule'.
However, while noting this particular limitation, the ABS previously undertook a comparison of recent migrants using Census-based data with the equivalent NOM estimate - see Feature Article 3: Comparison of Net Overseas Migration and Estimates from the 2011 Census in Australian Demographic Statistics, December Quarter 2012 (cat. no. 3101.0). The results seen in Graph 2 below show a good alignment of the top 10 countries of birth. The NOM estimates in this case were the old NOM estimates used prior to the additional improvements to NOM that have been noted in this paper, in particular the improvements made to the country of birth imputation for New Zealand citizens.
The new results seen in Graph 3, which use the improved NOM estimates, clearly show a much better alignment for the New Zealand born with the Census-based data. This provides another indication that the changes made to the processing of the underlying input data used in NOM estimation have improved the quality of the output data for NOM.
This page last updated 17 December 2013