APPENDIX 2 DATA QUALITY ISSUES
Missing responses for country of residence/stay
A further breakdown in Table A2 shows the proportion of responses missing for each passenger card box type.
Age
The primary source for data on this variable is passport or visa information. An alternate source used is the passenger card. Age is calculated using date of birth. For imputation, the variables used to align the recipient with a suitable donor are: the passenger card box type and reason for journey. Generally, little imputation is required for age (less than 1% of relevant records). All movements are fully imputed for this variable. See section 3. Data Imputation above.

Country of birth
The primary source for data on this variable is passport or visa information if available. It is not available from the passenger card. The majority of imputations are for NZ citizens. There are two separate parts to the imputation for country of birth. A specific imputation is in place for the country of birth of New Zealand (NZ) citizens, as data for this variable is not directly available from the passport or visa of NZ citizens. For details see 'Specific imputation for country of birth of New Zealand citizens' in Section 5 below. The second part is only used for non-NZ citizens. For this imputation, the variables used to align the recipient with a suitable donor are: category of movement and country of citizenship. Generally, imputation required for country of birth is less than 7% of relevant records. All movements are fully imputed for this variable. See section 3. Data Imputation above.

Country of citizenship
The primary source for data on this variable is passport or visa information. An alternate source used is the passenger card. Country of citizenship is not imputed unless partial information is supplied on the passenger card. For example, if no official nationality is available and Europe was supplied as the nationality on the passenger card it would then be imputed to a country in Europe. The specific region or country grouping provided as the recipient's citizenship is used to align the recipient with a suitable donor for imputation.
For the example noted above the donor would be from Europe. Generally, little imputation is required for country of citizenship (less than 1.5% of relevant records). See section 3. Data Imputation above.

Country of embarkation/disembarkation
The only available source for data on this variable is the passenger card. There are two separate parts to the imputation for country of embarkation/disembarkation. The first part is used if a response is missing. For this imputation the variables used to align the recipient with a suitable donor are: sampled/non-sampled data and country of residence/stay. The second part is only used if partial information is supplied on the passenger card. For example, if Europe was supplied as the country of embarkation/disembarkation on the passenger card it is imputed to a country in Europe. The specific region or country grouping provided as the recipient's country of embarkation/disembarkation is used to align the recipient with a suitable donor for imputation. For the example noted above the donor would be from Europe. Generally, imputation required for country of embarkation/disembarkation is less than 8% of relevant records. All movements are fully imputed for this variable. See section 3. Data Imputation above.

Country of residence/stay
The primary source for data on this variable is the passenger card. As an alternative, visa information may be used for some travellers if available. There are two separate parts to the imputation for country of residence/stay. The first part is used if a response is missing. For this imputation the variables used to align the recipient with a suitable donor are: sampled/non-sampled data, country of citizenship, category of movement and reason for journey. The second part is only used if partial information is supplied on the passenger card. For example, if Europe was supplied as the country of residence/stay on the passenger card then it is imputed to a country in Europe.
The specific region or country grouping provided as the recipient's country of residence/stay is used to align the recipient with a suitable donor for imputation. For the example noted above the donor would be from Europe. Generally, imputation required for country of residence/stay is less than 16% of relevant records. Country of residence/stay is not imputed for permanent arrivals but all other movements are fully imputed. See section 3. Data Imputation above.

Prior to the rebuild of the OAD system and the revision back to July 2004, the ABS imputed this data item in two stages. In the first stage, records with country of residence/stay missing were set to country of disembarkation/embarkation if a response was available. In the second stage, for remaining records where country of stay/residence was missing, values were imputed at the category of movement, reason for journey and country of citizenship level based on responses to other cards within each subgroup. For permanent arrivals, imputation was undertaken using a combination of country of embarkation and the stated responses of other permanent arrivals.

Duration of stay - current
Data on this variable is from two separate sources depending on whether it is the first leg or the second leg of a traveller's journey.
There are two separate parts to the imputation for duration of stay. The first part is used if a response is missing. For this imputation, the variables used to align the recipient with a suitable donor are: passenger card box type, intention to live in Australia for next 12 months (for arrivals only), country of birth and country of citizenship.

The second imputation is only used when a traveller has put one year exactly as their intended duration of stay. It will therefore only apply to those who have completed box B or box E on the passenger card. This imputation reflects historical patterns that clearly show the majority stay less than one year. The imputation first involves creating an historical data set based on information from two years earlier. It then calculates the actual recorded duration of stay for those travellers who had originally put one year exactly as their intended duration of stay. For this imputation the variables used to align the recipient with a suitable donor are: corresponding months in the historical data set, those who also stated exactly one year, passenger card box type, intention to live in Australia for next 12 months (for arrivals only), and country of citizenship. For the proportions imputed to either a long-term or a short-term stay, for each passenger card box type for the current months, see Table A3 below.

Generally, imputation required for missing duration of stay is less than 10% of relevant records. All movements are fully imputed for this variable. See section 3. Data Imputation above.
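The second imputation above can be read as a hot deck drawn from a historical donor pool. The following is a minimal Python sketch under stated assumptions, not the ABS implementation: the record layout, the field names (`month`, `box_type`, `intends_12_months`, `citizenship`, `actual_days`) and both function names are hypothetical. The text does not say what happens when no donor matches, so the sketch simply returns `None` in that case.

```python
import random
from collections import defaultdict


def build_donor_pools(donors, keys):
    """Group donor records by the tuple of their matching-variable values."""
    pools = defaultdict(list)
    for rec in donors:
        pools[tuple(rec[k] for k in keys)].append(rec)
    return pools


def hot_deck_impute(recipient, pools, keys, target, rng):
    """Copy `target` from a randomly chosen donor in the recipient's class.

    Returns None when no donor shares the recipient's class (assumption:
    the fallback used in practice is not described in the source text).
    """
    pool = pools.get(tuple(recipient[k] for k in keys))
    if not pool:
        return None
    return rng.choice(pool)[target]


# Hypothetical matching variables, loosely following the list in the text.
KEYS = ("month", "box_type", "intends_12_months", "citizenship")

# Hypothetical historical records from two years earlier: travellers who
# stated exactly one year, with the stay actually recorded for them.
historical = [
    {"month": 4, "box_type": "B", "intends_12_months": False,
     "citizenship": "UK", "actual_days": 120},
    {"month": 4, "box_type": "B", "intends_12_months": False,
     "citizenship": "UK", "actual_days": 400},
]

pools = build_donor_pools(historical, KEYS)
recipient = {"month": 4, "box_type": "B", "intends_12_months": False,
             "citizenship": "UK"}
imputed_days = hot_deck_impute(recipient, pools, KEYS, "actual_days",
                               random.Random(0))
```

The imputed duration would then decide whether the movement is treated as short-term or long-term. Because donors are drawn at random within each class, the share of recipients allocated to each outcome follows the behaviour observed in the historical data.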
For a complete list of the categories of movement see the Glossary.

There is evidence to suggest that when completing the intended duration of stay question on the incoming passenger card (Box B), some passengers are entering their arrival/departure date or their birth date rather than their intended duration of stay. From September 2003, a rule has been implemented in the data processing system stating that if all three elements are complete (years, months and days), then the intended duration of stay is to be coded to a non-response. Prior to July 2004, the ABS assigned a 'not stated' duration of stay as 10 days and therefore as a short-term movement.

Duration of stay - historical
Prior to the rebuild of the OAD system, a simple assumption was in place that set any traveller with a missing duration of stay to 10 days and therefore to a short-term movement. Missing duration of stay using the improved hot deck imputation methodology, as noted in 'Duration of stay - current' section above, has been revised back to July 2004.

Over time, there have been a number of changes to information collected on duration of stay. Initially, the intended duration of stay was only collected from information provided by all travellers on incoming and outgoing passenger cards in the intended length of stay fields. Therefore historically, the first leg and second leg of a journey both collected duration of stay based on intention. With the introduction of TRIPS by DIBP in July 1990, the new system made possible the calculation of the actual length of stay/absence for travellers on the second leg of their journey (i.e. departing overseas visitors and returning Australian residents). This calculation based on TRIPS data commenced in July 1998. This change resulted in an improvement in data quality for duration of stay. In particular, the number of passengers recorded as staying for one year exactly declined significantly for this group of travellers.
The introduction of a new passenger card processing system from July 2001 provided further evidence of travellers rounding to one year exactly for their intended duration of stay in Australia or overseas. To reflect the historical movement patterns, the records with a reported duration of one year exactly were allocated to short-term or long-term. For visitors arriving in Australia, 75% of such records were allocated to short-term and 25% to long-term. For residents departing Australia, the distribution was 67% short-term and 33% long-term. With the rebuild of the OAD system, these proportional splits were able to be based on the behaviour of travellers from two years earlier - see Table A3 and the section above 'Duration of stay - current'. Data back to July 2004 has been revised using the new methodology based on these dynamic proportional splits.

Missing response rates for the duration of stay are only available since November 1998. Prior to this, imputation carried out as part of processing by DIBP prevented reliable estimation for missing duration of stay.

Passenger card box type
The primary source for data on this variable is the passenger card. Administrative systems at DIBP can be used as an alternate source for some travellers. For example, all travellers with a permanent arrivals visa would be converted to box A (migrating permanently to Australia). For this imputation the variables used to align the recipient with a suitable donor are: sampled or non-sampled data, direction of traveller, intention to live in Australia for next 12 months (for arrivals only), and country of citizenship. Generally, very little imputation is required for passenger card box type (less than 0.5% of all movement records). All movements are fully imputed for this variable. See section 3. Data Imputation above.

Reason for journey
The only source available for data on this variable is the passenger card.
For this imputation the variables used to align the recipient with a suitable donor are: sampled or non-sampled data, passenger card box type, category of movement, and age. Generally, imputation required for reason for journey is less than 10% of relevant records. All movements are fully imputed for this variable. See section 3. Data Imputation above.

Sex
The only source available for data on this variable is passport or visa information. For this imputation the variables used to align the recipient with a suitable donor are: the passenger card box type and reason for journey. Generally, little imputation is required for sex (less than 1.5% of relevant records). All movements are fully imputed for this variable. See section 3. Data Imputation above.

State or territory of stay/residence
The only source available for data on this variable is the passenger card. For this imputation the variables used to align the recipient with a suitable donor are: sampled or non-sampled data, state of clearance, and country of citizenship. Generally, imputation required for state or territory of stay/residence is less than 7% of relevant records. All movements are fully imputed for this variable. See section 3. Data Imputation above.

If a correction to the passenger card box type marked by a long-term visitor departure is made (e.g. a visitor incorrectly marks a resident box of E (Aust. resident departing temporarily) or F (Aust. resident departing permanently)), then the state of stay recorded in the incorrect box is applied.

5. SPECIFIC ISSUES FOR NEW ZEALAND PASSPORT HOLDERS

Allocating passenger card box type
Under the Trans-Tasman Agreement, New Zealand (NZ) citizens are not required to have a visa to travel to Australia. As a result, on their arrival in Australia visa documentation cannot be used to determine whether they are a permanent migrant, a temporary visitor, or an Australian resident returning from NZ.
Analysis undertaken by DIBP suggests that a substantial proportion of holders of NZ passports tick Box A (migrating permanently to Australia) each time they arrive in the country, causing an overcount of NZ migrants entering Australia. The following edits were applied to correct the over-counting of NZ migrants.

From July 2001 to June 2002, DIBP coded all NZ citizen arrivals who had ticked Box A (migrating permanently to Australia) and had been to Australia previously (based on DIBP records) to resident returning (Box C). If these people were visitors previously, this recoding had the effect of incorrectly reducing the number of NZ migrants whilst at the same time incorrectly increasing the number of NZ citizens who were returning residents. This problem was overcome by moving the NZ citizens who had been changed by DIBP from Box A to Box C back to Box A.

Since July 2002, DIBP has utilised a new edit system to ensure accurate measurement of permanent arrivals of NZ citizens. Where a person ticks Box A on his/her passenger card (migrating permanently to Australia), the record is verified by checking previous entries and related passenger card records, and if the person is previously recorded as a permanent migrant or resident then they will be counted as a returning resident. This resulted in more accurate recording of NZ citizens who were migrating permanently to Australia and those who were residents returning.

In 2007, to better measure the changes in traveller behaviour and more accurately capture and measure temporary migration, the ABS introduced improved methods for calculating net overseas migration. This is now the most appropriate source for statistics on migration into, and out of, Australia. Data is available from December quarter 2003. See Explanatory Note 71 in Migration Australia, 2011-12 and 2012-13 (cat. no. 3412.0).
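The edit in place since July 2002 amounts to a simple recoding rule. A minimal Python sketch follows, assuming a hypothetical representation in which each traveller's earlier records have already been reduced to status labels. The function name, the label strings and the data layout are illustrative and are not taken from DIBP's system.

```python
def edit_nz_box_type(ticked_box, is_nz_citizen, prior_statuses):
    """Return the box type under which an arrival record is counted.

    prior_statuses: hypothetical labels summarising the traveller's earlier
    records, e.g. "visitor", "permanent_migrant" or "resident".
    """
    # The edit only concerns NZ citizens who ticked Box A.
    if not is_nz_citizen or ticked_box != "A":
        return ticked_box
    # Previously recorded as a permanent migrant or resident:
    # count as a returning resident (Box C).
    if any(s in {"permanent_migrant", "resident"} for s in prior_statuses):
        return "C"
    # Otherwise the Box A response stands (first-time or prior visitor).
    return "A"


first_arrival = edit_nz_box_type("A", True, [])
prior_visitor = edit_nz_box_type("A", True, ["visitor"])
prior_resident = edit_nz_box_type("A", True, ["visitor", "resident"])
```

Unlike the July 2001 to June 2002 coding, a previous visit on its own does not trigger the change to Box C in this sketch. Only an earlier record as a permanent migrant or resident does.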
Specific imputation for country of birth of New Zealand citizens
With the introduction of biometric passports for New Zealand (NZ) citizens in April 2005, the country of birth information was removed from the passport and replaced with a place of birth, for example Wellington, Auckland, Christchurch or Melbourne. The passport was the only source of information on the country of birth of NZ citizens travelling to, or from, Australia. For other travellers who are not NZ citizens, country of birth information can be obtained from their passport or visa information. However, visa information for most NZ citizens is not available as, under the trans-Tasman agreement, they do not need to hold a visa prior to travel to Australia.

Therefore, with the increased numbers of travellers holding NZ biometric passports, the proportion of movement records with a missing country of birth has increased substantially. For example, by April 2013 the matched OAD data showed the number of records missing a country of birth was 251,176. NZ passport holders represented 95% of these records. Earlier, for April 2005, NZ passport holders represented only 6% of the missing country of birth records. By April 2007 this had increased to 79%.

As a temporary fix to alleviate this growing issue a basic imputation was introduced by the ABS in late 2007. From August 2007 the imputation used donors with a similar category of movement and country of citizenship.

From 2013, a special imputation for country of birth of NZ citizens has been introduced within the rebuilt OAD system, with data revised back to July 2004. It has improved country of birth statistics in OAD, and also outputs on net overseas migration (NOM), and the Estimated Resident Population by country of birth. There are five steps to the process to generate country of birth when missing:

1. Prior to imputation, if country of birth is missing for a NZ citizen the system will scan historical records of NZ citizens back to 2003 to see if there is an earlier record of the individual's country of birth. This is made possible through the use of a unique personal identifier provided to each traveller who crosses Australia's international border. This step will look for a record with a matching personal identifier and if one is found will use the country of birth of the matched record. In 2013 approximately 80% of records with a missing country of birth are being matched with an historical record.
2. If country of birth is still unknown after Step 1, the system will scan all previous imputations for country of birth for NZ citizens to see if there is an existing record for that individual. This ensures an individual's country of birth is only ever imputed once although they may cross Australia's international borders many times.
3. If country of birth is still unknown after Step 2, but there is a place of birth supplied on the NZ biometric passport, then a place to country of birth concordance is used. This concordance is dynamic and is updated each month from the historical time series, which is also updated monthly with additional data supplied by DIBP. The number of records for each place of birth, separately within each country of birth, is then determined cumulatively from the historical time series. That is, if the name of a place of birth is used in more than one country (for example, 'Wellington' can be found in Australia, Canada, NZ, South Africa, UK and the USA), then the method adds up the number of instances within each of those countries from the historical series. Where a record is missing country of birth, the imputation will consider all possible donors with a matching place of birth. It will then choose a random donor based on its probability of occurring from the concordance, and copy across the donor's corresponding country of birth. By the end of Step 3, up to 98% of NZ citizens with a missing value have been provided a country of birth.
4. If country of birth is still unknown after Step 3, but there is a place of birth supplied, then a search is done on all NZ towns and place names. If a match is found it is assumed the country of birth of that record is New Zealand. Very few records are imputed using this step.
5. Lastly, if country of birth is still unknown for any NZ citizen after all other steps are taken, then the standard hot deck imputation is applied but only for non-New Zealand born as it is assumed any New Zealand born will have been picked up in the previous four steps. Currently, less than 1% of records are imputed using this step.

6. HISTORY OF PROCESSING CHANGES

July 1998, Permanent Departures
Prior to July 1998, the number of overseas-born (excluding NZ) permanent departures of Australian residents was overstated. In July 1998, DIBP introduced a Box type validation edit to the processing system. This edit checks and corrects the Box type according to the Visa Class/subclass. With the exception of Australian and NZ citizens, only Australian residents departing permanently (Box F) who hold permanent visas are retained in this Box type. For temporary visa holders who incorrectly ticked Box F, their Box type was changed to visitor or temporary entrant departing (Box D).

July to December 1998, Reason for Journey
Before the introduction of the redesigned passenger card in July 1998, 5% of short-term visitor arrivals, on average, were recorded as having a reason for journey of 'Other' or 'Not Stated'. This percentage rose to 14% for July, 16% in August and 29% in September 1998 as a result of processing problems. These problems were addressed by DIBP, with the percentage of 'Other' and 'Not Stated' dropping to 8% and 7% in October and November respectively. From January 1999, OAD statistics referencing these three months have been revised.
The revised data were calculated by estimating the number of persons responding 'Other/Not Stated' using past trends for each country of citizenship and proportionally allocating any persons in excess of the estimated 'Other/Not Stated' total amongst the remaining categories.

July to December 1998, State or territory of residence/stay
For the months of August 1998, September 1998 and October 1998, data entry problems experienced by DIBP caused an overstatement of the Northern Territory as the main state of stay with a corresponding understatement for the remaining states and territories. In November 1998 these numbers returned to levels more comparable with previous years, with DIBP indicating that they had instigated data quality procedures to address this issue. From January 1999, OAD statistics referencing these months have been revised. The revised data were calculated by estimating the number of persons indicating the Northern Territory as their main state of residence/stay using past trends and proportionally allocating any persons in excess of these estimates amongst the remaining states and territories.

With the introduction of the new processing system from July 2001, DIBP provided the ABS with data on all missing values for state or territory of residence/stay. From July 2001 to June 2004, any missing values for state or territory of residence/stay were imputed using category of movement and state of clearance.

September 1998, Age, Country of Birth, Citizenship and Sex
A problem was experienced in the processing of OAD data for movement dates between 6 September 1998 and 16 September 1998, following the introduction of changes to DIBP's input processing system. This problem may affect in the order of 10% of all September 1998 records used in estimation and result in incorrect details for citizenship, date of birth, sex and country of birth.
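The revisions to the 1998 reason for journey and state of residence/stay data described above share one pattern: cap an inflated category at its trend-based estimate and spread the excess over the remaining categories. The following is a minimal Python sketch of that pattern. It assumes the excess is shared in proportion to each remaining category's reported count (the text says only "proportionally") and that the trend-based estimate is supplied as an input. The function name and all figures are hypothetical.

```python
def reallocate_excess(counts, inflated, expected):
    """Cap `inflated` at `expected` and share the excess pro rata.

    counts:   reported count per category
    inflated: the category known to be overstated
    expected: trend-based estimate for that category (supplied by the caller)
    """
    excess = counts[inflated] - expected
    others = {k: v for k, v in counts.items() if k != inflated}
    total_others = sum(others.values())
    # Nothing to move, or nowhere to move it: return the counts unchanged.
    if excess <= 0 or total_others == 0:
        return dict(counts)
    # Assumption: each remaining category receives a share of the excess
    # proportional to its own reported count.
    revised = {k: v + excess * v / total_others for k, v in others.items()}
    revised[inflated] = expected
    return revised


# Hypothetical counts for one country of citizenship in one month.
reported = {"Other/Not Stated": 290, "Holiday": 500, "Business": 200}
revised = reallocate_excess(reported, "Other/Not Stated", expected=50)
```

The overall total is preserved, so only the distribution across categories changes.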
September 1999, China and Hong Kong
September 1999 overseas arrivals and departures data were revised for movements from, and to, China and Hong Kong in respect of three variables: country of birth, country of citizenship and country of residence/stay. Changes to 'country of birth' and 'country of citizenship' have been made from data supplied by DIBP. Changes to 'country of residence/stay' have been made by assuming the average proportion of country of birth to country of residence/stay for migrants from China and Hong Kong in September 1995 to September 1998.

July 2004, All Data
In 2013, the ABS completed a rebuild of the system which creates OAD data. All OAD data have been revised back to July 2004 based on the improved methodology.

January 2013, Duration of Stay and Reason for Journey
Investigations by the ABS and DIBP uncovered a high non-response rate for both duration of stay and reason for journey for the month of January 2013. This was mainly due to changes to the collection and processing of passenger cards, which were introduced in that month. January is the only month that has been affected and the non-response rates for subsequent months are at an acceptable level. The ABS and DIBP reprocessed January 2013 data. All associated time series spreadsheets and data files were revised.