Data integration service

If your project needs cannot be met using our current integrated data products, the ABS provides a cost-recovered data integration service.

As an Accredited Integrating Authority, we can: 

  • Integrate new datasets with our existing data assets, including PLIDA and BLADE
  • Advise on how to establish and run data integration projects

Process for data integration

Outlined below are the steps involved in undertaking data integration to the PLIDA and BLADE data assets.

Data integrated with the PLIDA and BLADE data assets is provided to approved users as detailed microdata via the secure ABS DataLab.

Access and services outlines how to access integrated data, including existing data.

1. Submit a project proposal

The first step is to contact ABS Data Services via data.services@abs.gov.au. We will discuss your project requirements and data options and walk you through the data integration process. We will then provide you with an integrated data project proposal template or information on how to log in to our user portal, myDATA Beta. This step requires you to detail the research you plan to undertake.

2. Technical assessment

Once the project proposal is accepted, the ABS conducts an initial assessment under the Five Safes framework and ensures data is suitable and available for your research purpose. We will contact the project lead if further details are required.

3. Quote and timing

Once the data requirements are clear, we will provide a quote for the data integration services and other associated charges (such as DataLab access). A timeline for delivery will also be provided. For data integration work to progress, a signed Agreement to Proceed must be returned within 30 days of the quote being received. After signing, an invoice is issued to the project lead.

4. Approvals

Following acceptance of the quote, we undertake a detailed assessment under the Five Safes framework and organise the relevant ABS and data custodian approvals. Data custodians may recommend additional requirements, including ethics approval. For linkage of new data, a safe data assessment is conducted under the guidance of the ABS Disclosure Review Committee.

5. Service delivery

Once the invoice is paid and approvals are in place, the data integration will take place based on the agreed timeline.

6. Access

We will advise of access requirements at the start of the project. If the project requires access to detailed microdata via the ABS DataLab, we will support your research team through the Safe Researcher onboarding process.

If a project is subject to an existing Memorandum of Understanding/Agreement in Writing with the ABS, the process may differ.

Charges

The ABS charges for custom data integration services on a cost-recovery basis. We can provide a quote estimate at any stage if you contact us at data.services@abs.gov.au.

Charges will depend on the complexity of the required data product, the level of curation required and the number of new linkages.

Custom microdata products are accessed through the ABS DataLab. DataLab access is charged separately to data integration fees. DataLab charges outlines the associated DataLab access fees.

Charges increase when a project requires:

  • Assembly/linkage of complex datasets
  • Data item aggregation/derivations
  • A Privacy Impact Assessment (PIA)
  • Extensive data cleaning and/or resupply
     

The tables below provide indicative costs for new data integration or linkage projects. Please note that very large or complex projects will be subject to custom charges. 

 

Data integration charges per project

The fees below are charged on a per project basis. New requests for data integration, or changes to existing requests, may result in additional charges. Not all charges will apply to your project; those that do will depend on individual project requirements.

Please refer to the information below for descriptions of what each service involves and when it is applicable. 

Goods and Services Tax (GST) will be applied to charges unless the organisation is exempt.

Assessment charges

The Assessment charge covers the costs associated with establishing and supporting data integration requests. 

The scope of this fee includes technical assessment, advisory services, research and assessment of the legislation and privacy statements applicable to the requested data, risk assessment, governance reviews and output clearance assessment.

Assessment charges | Excluding GST | Description
Simple | $3,980 | The simple Assessment charge will apply, in most instances, to projects with resupply or reuse of existing data arrangements.
Complex | $5,890 | The complex Assessment charge will apply, in most instances, to projects with new data integration requests or refreshes of existing datasets.
Privacy charges
Privacy charges | Excluding GST | Description
Data Integration Plan (DIP) | $5,250

A Data Integration Plan (DIP) is prepared for new projects where the initial governance review determines that a project cannot be accommodated under current data integration governance arrangements or the project is assessed as high risk.  

This could include instances where:

  • A Privacy Threshold Assessment (PTA) has led to the recommendation that a Privacy Impact Assessment (PIA) should be conducted
  • Additional measures are needed to fully assess the risk profile of a project
  • Additional measures are required for the ABS to meet its obligations under applicable legislation and guidelines
  • A change in the data handling process is necessary
  • New transparency measures will need to be enacted
Privacy Threshold Assessment (PTA) - Simple | $2,670

A Privacy Threshold Assessment (PTA) is a preliminary assessment that helps to determine a project's potential privacy impacts. This includes consideration of whether a project is high risk and should be subject to a Privacy Impact Assessment (PIA) as required by the Australian Government Agencies Privacy Code.

New projects that propose to integrate one dataset will require a simple PTA. Existing projects proposing to integrate an additional dataset will also require a simple PTA.

Privacy Threshold Assessment (PTA) - Complex | $7,710

A Privacy Threshold Assessment (PTA) is a preliminary assessment that helps to determine a project's potential privacy impacts. This includes consideration of whether a project is high risk and should be subject to a Privacy Impact Assessment (PIA) as required by the Australian Government Agencies Privacy Code.

New projects that propose to integrate multiple datasets will require a complex PTA. Existing projects proposing to integrate multiple additional datasets will also require a complex PTA. 

Privacy Impact Assessment (PIA) | Custom

A Privacy Impact Assessment (PIA) is a systematic assessment that helps the ABS identify and manage the privacy impacts of a data integration project.

Custom charges apply as the size, scale and scope of PIAs vary greatly depending on the need for stakeholder consultation, the nature of the data, the risk level of the project and the complexity of changes to personal information and data handling practices.

PIAs may be conducted by the ABS or by independent consultants. 

Other privacy measures | Custom | Additional privacy measures are required for all new types or categories of data proposed for integration with PLIDA. This includes targeted stakeholder consultation and additional transparency activities.
Data supply agreement charges

Data supply agreements ensure that the data supplier and the ABS have the authority to disclose and collect the specified data. The agreement provides specific details on the data to be provided, how the data will be handled and other transparency requirements.

Data supply agreement charges | Excluding GST | Description
Simple | $4,540 | A simple data supply agreement is generally a straightforward Letter of Exchange (as opposed to an MoU or more complex agreement) with a single data custodian, a small number of datasets and linkage to either PLIDA or BLADE (not both).
Complex | $5,920 | A complex data supply agreement is generally a more detailed document, often covering a larger number of datasets to be supplied and multiple data custodians, and may have linkage requirements for both PLIDA and BLADE.

 

Data integration charges per dataset

The fees below are charged on a per dataset basis where integration to PLIDA or BLADE is required. A dataset refers to a collection of data collected and maintained by a data custodian for a specific purpose and may be made up of one or more data tables.

New requests for data integration, or changes to existing requests, may result in additional charges. Not all charges will apply to your project; those that do will depend on individual project requirements.

Please refer to the descriptions for more information on what each service involves and when it is applicable.

Please note, these costs are indicative and in some instances repeat linkage and assembly work will result in a decrease in charges due to efficiency gains from ongoing delivery. Very large or complex projects will also be subject to custom charges. 

PLIDA linkage charges
Linkage charges | Excluding GST | Description
Simple | $31,310

A simple linkage is the linkage of a single, person-level dataset to the Person Linkage Spine using deterministic linkage methodology.

The dataset provided to the ABS must contain a unique person ID for each individual, and each of the following linkage variables:

  • First name 
  • Surname
  • Date of birth
  • Sex/gender
  • Residential address

Additional linkage variables may be supplied to support a high-quality linkage outcome.
Complex | $48,690

A complex linkage is the linkage of up to three person-level datasets to the Person Linkage Spine using deterministic linkage methodology.

If these datasets do not contain a common unique person ID, it must be possible to construct one from the available source data.

The datasets must contain a residential address and at least 3 of the following linkage variables:

  • First name 
  • Surname
  • Date of birth
  • Sex/gender
  • Additional linkage variables may be supplied to support a high-quality linkage outcome.
Data resupply | $3,130 | In the event that data needs to be re-supplied due to major corrections required at the data provider end, additional fees may apply to cover the costs of additional checks to ensure data quality and accurate linkage.
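
As an illustration only, the linkage variable requirements described above could be checked against a dataset's column names before supply. This is a minimal sketch, not an ABS tool: the column names (`person_id`, `first_name` and so on) and both function names are hypothetical, and the check covers a single dataset at a time.

```python
# Linkage variables named in the PLIDA linkage descriptions above.
# The column names are hypothetical; a real data supply will differ.
NAME_DOB_SEX = {"first_name", "surname", "date_of_birth", "sex"}
ADDRESS = "residential_address"
PERSON_ID = "person_id"


def meets_simple_requirements(columns):
    """Simple linkage: a unique person ID plus all five linkage variables."""
    cols = set(columns)
    return PERSON_ID in cols and ADDRESS in cols and NAME_DOB_SEX <= cols


def meets_complex_requirements(columns):
    """Complex linkage: residential address plus at least 3 of the other 4.

    The person ID is not checked here because the description allows it
    to be constructed from the available source data.
    """
    cols = set(columns)
    return ADDRESS in cols and len(NAME_DOB_SEX & cols) >= 3
```

A dataset with first name, surname, date of birth and residential address would pass the complex check but not the simple one, since it lacks sex/gender and a person ID.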
PLIDA assembly charges

Assembly work is required to make integrated datasets available in the DataLab. Before being loaded into the DataLab, all tables in a dataset will go through a series of data quality and confidentiality checks and treatments. Treated data is packaged into DataLab products along with mapping files that enable the data to be joined with other PLIDA data.

The scale of assembly work depends on a range of factors, including the size, complexity and cleanliness of a dataset. 

Assembly charges | Excluding GST | Description
Small dataset | $9,110

A small dataset comprises 5 tables or fewer and no more than 50 variables in total.

The dataset also contains no or minimal additional identifiers that would require de-identification and does not require non-standard confidentiality treatments. 

Medium dataset | $14,540

A medium dataset comprises 10 tables or fewer and no more than 150 variables in total.

Alternatively, a medium dataset may be smaller than this but contain additional identifiers that require de-identification, or require other non-standard confidentiality treatments.

Large dataset | $24,620 | A large dataset comprises more than 10 tables and more than 150 variables.
Data resupply | $2,090 | In the event that data needs to be re-supplied due to major corrections required at the data provider end, additional fees may apply to cover the costs of additional checks to ensure data is suitable for integration with PLIDA.
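
The size categories above can be read as a simple decision rule. The sketch below is one hedged interpretation, not an official calculator: the function name and the `needs_extra_treatment` flag are hypothetical, and the handling of datasets that exceed only one of the medium limits is an assumption, since the descriptions do not cover that case.

```python
def assembly_size(tables, variables, needs_extra_treatment=False):
    """Return 'small', 'medium' or 'large' for a PLIDA dataset.

    needs_extra_treatment: True if the dataset has additional identifiers
    requiring de-identification or needs non-standard confidentiality
    treatments, which moves an otherwise small dataset to medium.
    """
    if tables <= 5 and variables <= 50 and not needs_extra_treatment:
        return "small"
    if tables <= 10 and variables <= 150:
        return "medium"
    # Assumption: anything beyond the medium limits is treated as large,
    # including datasets that exceed only one of the two limits.
    return "large"


# Indicative assembly charges (excluding GST) from the table above.
INDICATIVE_ASSEMBLY_CHARGE = {"small": 9110, "medium": 14540, "large": 24620}
```

For example, `assembly_size(3, 40, needs_extra_treatment=True)` returns `"medium"`, reflecting the rule that a smaller dataset needing de-identification is charged at the medium rate. Actual charges are set by the ABS quote.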
BLADE linkage charges
BLADE linkage charges | Excluding GST | Description
Simple | $16,860 | A simple linkage is a direct linkage to the BLADE asset. The dataset must contain simple analytical variables and must not contain additional unique identifiers that require de-identification or free-text fields that need to be reviewed for identifying information.
Complex | $27,230 | A complex linkage may require linkage to the BLADE asset and other supplied datasets via a secondary linkage variable. The supplied dataset may include variables or additional unique identifiers that require de-identification or free-text fields that need to be reviewed for identifying information.
Data resupply | $4,910 | In the event that data needs to be re-supplied due to major corrections required at the data provider end, additional fees may apply to cover the costs of additional checks to ensure data quality and accurate linkage.
Location linkage charges

The Location Modular Product enables the ABS to receive location-based data, which can then be linked to other location-based datasets and to PLIDA or BLADE via the location information included in those integrated data assets. Including location-based data in the ABS's current offering of integrated data enables more geospatial analysis over a potentially wider range of topics, further assisting evidence-based policy development and evaluation.

Location linkage charges | Excluding GST | Description

Data integration (linkage and assembly) charges per dataset

Basic dataset | $5,750 | The dataset contains address information only. The address information provided is clean and of good quality. The dataset is only used to scope PLIDA or BLADE to the population of interest.
Small dataset | $10,477 | A small dataset comprises no more than 30 variables in total. The address information provided is clean and of good quality. The dataset also contains no additional identifiers that would require de-identification and does not require non-standard confidentiality treatments.
Medium dataset | $16,721 | A medium dataset comprises no more than 100 variables in total. Alternatively, a medium dataset may be smaller than this but contain additional identifiers that require de-identification, or require other non-standard confidentiality treatments.
Large dataset | $28,313 | A large dataset comprises more than 100 variables.
Data resupply | $2,404 | In the event that data needs to be re-supplied due to major corrections required at the data provider end, additional fees may apply to cover the costs of additional checks to ensure data is suitable for integration.
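
To show how the per project and per dataset charges combine, the sketch below totals the indicative charges for a hypothetical project that links one new person-level dataset to PLIDA. The choice of line items is illustrative only, the 10% GST rate is an assumption not stated on this page, and DataLab access fees are excluded. An actual quote from the ABS may differ.

```python
from decimal import Decimal

GST_RATE = Decimal("0.10")  # assumption: standard Australian GST rate

# Indicative charges (excluding GST) taken from the tables above.
# Which items apply to a real project is decided during the quote.
charges_ex_gst = {
    "Assessment (complex)": Decimal("5890"),
    "Privacy Threshold Assessment (simple)": Decimal("2670"),
    "Data supply agreement (simple)": Decimal("4540"),
    "PLIDA linkage (simple)": Decimal("31310"),
    "PLIDA assembly (small dataset)": Decimal("9110"),
}

subtotal = sum(charges_ex_gst.values(), Decimal("0"))
gst = subtotal * GST_RATE
total = subtotal + gst

print(f"Subtotal (ex GST): ${subtotal:,.2f}")
print(f"GST:               ${gst:,.2f}")
print(f"Total (inc GST):   ${total:,.2f}")
```

Under these assumptions the subtotal is $53,520 before GST. `Decimal` is used rather than floating point so that currency amounts are not subject to rounding error.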

Quality

The ABS takes care to ensure we provide tables and data of sufficient quality for analysis. For this reason, we may decline to provide a service if we consider that the data quality would not be fit for purpose.

Data security and confidentiality

The ABS is committed to the protection of personal data and data pertaining to businesses. The ABS reserves the right to apply additional controls under the Five Safes framework to ensure data and confidentiality are adequately protected.
