Microdata and TableBuilder: Characteristics of Employment, Australia
Enables detailed analysis of employment characteristics
Characteristics of Employment in TableBuilder and microdata
Characteristics of Employment (COE) data for 2014 to 2024 are now available in TableBuilder and in ABS DataLab, where they are released as a supplementary file for the Longitudinal Labour Force (LLFS) microdata. All existing users of the LLFS microdata will automatically get access to the COE file, and new users can apply for access to both files.
Introduction
The Characteristics of Employment survey (COE) is conducted in August and is designed to provide statistics on employment across the following 18 concepts:
- Away from work.
- Casual work and Job security.
- Characteristics of employment (all jobs).
- Characteristics of main job.
- Characteristics of second job.
- Demography.
- Earnings in main job (median, mean and distribution of weekly and hourly earnings).
- Education and Qualifications.
- Families and children.
- Fixed-term contracts.
- Independent contractors.
- Job flexibility and Working from home.
- Labour hire.
- Leave entitlements.
- Overemployment and Overtime.
- Trade union membership.
- Underemployment.
- Working arrangements and Working patterns.
Data from the COE survey are released in both TableBuilder and as microdata in ABS DataLab.
TableBuilder is an online tool for creating tables and graphs from underlying microdata. Refer to TableBuilder for more information.
Microdata in ABS DataLab are the most detailed information available from a survey and are generally the responses to individual questions on the questionnaire or data derived from two or more questions. DataLab is the analysis solution for high-end users who want to undertake real time complex analysis of detailed microdata in a secure environment. Refer to DataLab for more information.
Accessing the data
Compare data services to see what's right for you. Information on how to apply for access can be found in TableBuilder and DataLab.
Further information about these products, and other information to assist users in understanding and accessing microdata in general, is available from the Microdata and TableBuilder Entry Page.
For further support in the use of this product, please contact Microdata Access Strategies via microdata.access@abs.gov.au.
Privacy
The ABS Privacy Policy outlines how the ABS handles any personal information that you provide to us.
Data and file structure
Survey methodology
General information about the Characteristics of Employment (COE) survey, including summary results, is available in the following publications:
- Characteristics of employment.
- Employee earnings.
- Trade union membership.
- Working arrangements.
- Labour hire workers.
Detailed information about the survey including scope and coverage, survey design, data collection methodology, weighting, estimation and benchmarking, estimate reliability and a glossary can be accessed from the Methodology page of the publication.
Data items
The data items included in the COE TableBuilder are grouped into 18 categories as shown in the image below. See the complete data items list below.
Subject to confidentiality and sampling variability constraints, special tabulations can be produced incorporating data items, populations and geographic areas selected to meet individual requirements. These are available, on request, on a fee for service basis. For more information, contact the ABS by visiting Contact us or email the Labour Statistics Branch at labour.statistics@abs.gov.au.
File structure
The underlying format of the COE TableBuilder file is structured at a single person level. This person level contains general demographic information such as age, sex and country of birth, as well as details about status of employment, weekly earnings, working arrangements, trade union membership and educational qualifications.
When tabulating data from TableBuilder, person weights are automatically applied to the underlying sample counts to provide the survey's population estimates.
Reference year
The COE TableBuilder contains a mandatory field called Reference year to allow for historical analysis. By default, this field will be present in any new table as per the image below:
Individual years can be removed from the table using the data item panel by selecting the required year and removing it from the table as per the image below:
However, at least one category (reference year) of the mandatory field must be present in a table for TableBuilder to retrieve data.
Two-yearly content
The COE TableBuilder contains both annual and two-yearly content. Two-yearly data items are collected every second year, on an alternating basis. Data items are labelled as available for either "All years," "Even years only" or "Odd years only" in the Data item list.
When a data item is placed in a table for a particular reference year where the data was not collected, TableBuilder will return estimates in a category that contains the label "Not collected (even years only)" or "Not collected (odd years only)." When data for a biennial item is requested across multiple years, TableBuilder will retrieve data for the applicable reference years and return "Not collected" for the years where the data was not collected.
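The availability labels can be turned into a simple check of which reference years hold data for an item. The following Python sketch is illustrative only: the function name is an assumption and is not part of TableBuilder, while the labels are those used in the Data item list.

```python
def collected_in(year: int, availability: str) -> bool:
    """Return True if an item with this availability label was collected in `year`."""
    if availability == "All years":
        return True
    if availability == "Even years only":
        return year % 2 == 0
    if availability == "Odd years only":
        return year % 2 == 1
    raise ValueError(f"Unknown availability label: {availability!r}")


# Reference years covered by the COE TableBuilder (2014 to 2024).
reference_years = range(2014, 2025)

# Years in which an "Even years only" item holds data; the other years
# would fall in a "Not collected" category.
even_item_years = [y for y in reference_years if collected_in(y, "Even years only")]
odd_item_years = [y for y in reference_years if collected_in(y, "Odd years only")]
```

Checking availability before building a multi-year table helps to avoid misreading a "Not collected" category as a genuine zero.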
Not applicable categories
Most data items included in the TableBuilder file include a 'Not applicable' category. This category generally represents the number of people who were not asked a particular question or the number of people excluded from the population for a data item when that item was derived (e.g. Status of employment in second job is not applicable for people without a second job).
Since 2021, the "Not applicable" categories in each item have had a descriptive label to describe which populations are not included for the data item. For example, in the data item “Time remaining on fixed-term contract in main job,” the “Not applicable” category has been labelled as “Not on a fixed-term contract (ongoing)” to indicate that only people on a fixed-term contract were asked about the time remaining on their contract.
Table populations
The population relevant to each data item should be kept in mind when extracting and analysing data. The actual population count for each data item is equal to the total cumulative frequency minus the 'Not applicable' category.
Generally, some populations can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employees', any data item with that population (excluding the 'Not applicable' category) could be used.
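The population calculation described above can be sketched in Python as follows. The function name and the figures are hypothetical and for illustration only. The label of the 'Not applicable' category is passed in explicitly because, since 2021, these categories carry descriptive labels.

```python
def population_count(category_estimates: dict[str, float],
                     not_applicable_label: str = "Not applicable") -> float:
    """Total across all categories minus the 'Not applicable' category."""
    total = sum(category_estimates.values())
    return total - category_estimates.get(not_applicable_label, 0.0)


# Hypothetical estimates ('000) for a fixed-term contract data item.
time_remaining = {
    "Less than 6 months": 150.0,
    "6 months or more": 250.0,
    "Not on a fixed-term contract (ongoing)": 9600.0,
}

on_fixed_term = population_count(
    time_remaining,
    not_applicable_label="Not on a fixed-term contract (ongoing)",
)
```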
Zero value cells
Tables generated from sample surveys will sometimes contain cells with zero values because no respondents who satisfied the parameters of a particular cell in a table were in the survey. This is despite there being people in the general population with those characteristics. This is an example of sampling variability, which occurs with all sample surveys. Relative Standard Errors cannot be generated for zero cells.
Median earnings
For the Characteristics of Employment survey, median weekly earnings are a more robust measure of the centre of earnings data than mean weekly earnings and have been given more prominence since 2017.
To minimise the risk of identifying individuals in aggregate statistics, a technique is used to randomly adjust cell values. This technique is called perturbation. Perturbation involves small random adjustments of the statistics and is considered the most satisfactory technique for avoiding the release of identifiable statistics while maximising the range of information that can be released.
The ABS has tested and implemented a perturbation process for median earnings data to ensure both that the confidentiality of individuals is maintained and that the integrity of medians is preserved.
Hourly earnings in main job
Since 2021, the parametric item "Hourly earnings in main job," which is available under Summation options in TableBuilder, has been changed from dollars to cents. Median and mean hourly earnings are more accurately calculated using cents as the software performs better with integers. Custom ranges can be specified in one-dollar increments (100 cents) between $5 per hour and $385 per hour (500 and 38500 cents).
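Because the item is held in cents, custom range boundaries chosen in dollars need to be converted before use, and results need to be converted back for reporting. The following Python sketch shows the conversion under the limits stated above. The function and constant names are hypothetical and are not part of TableBuilder.

```python
MIN_CENTS = 500      # $5 per hour
MAX_CENTS = 38_500   # $385 per hour


def dollars_to_cents(dollars: int) -> int:
    """Convert a whole-dollar hourly rate to cents, checking the allowed range."""
    if not isinstance(dollars, int):
        raise TypeError("Custom ranges are specified in whole dollars")
    cents = dollars * 100  # whole dollars guarantee 100-cent increments
    if not MIN_CENTS <= cents <= MAX_CENTS:
        raise ValueError(f"${dollars} per hour is outside the $5 to $385 range")
    return cents


def cents_to_dollars(cents: float) -> float:
    """Convert a result expressed in cents back to dollars for reporting."""
    return cents / 100


# A custom range of $25 to $40 per hour, expressed in cents.
range_in_cents = (dollars_to_cents(25), dollars_to_cents(40))

# A hypothetical median of 3650 cents reported as dollars.
median_dollars = cents_to_dollars(3_650)
```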
Occupation data
Occupation data is provided to the unit group level (4-digit). TableBuilder may suppress this level if a requested table is too finely detailed. Occupation data can be aggregated to minor, sub-major and major group levels (3-, 2- and 1-digit levels) to avoid table suppression.
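Outside TableBuilder, the same aggregation can be sketched in Python. The sketch assumes a hierarchical coding scheme in which the leading digits of a 4-digit unit group code identify its minor (3-digit), sub-major (2-digit) and major (1-digit) groups. The codes, counts and function name are hypothetical.

```python
from collections import Counter

# Number of leading digits that identify each level (assumed hierarchy).
LEVELS = {"major": 1, "sub-major": 2, "minor": 3, "unit": 4}


def occupation_group(unit_group_code: str, level: str) -> str:
    """Truncate a 4-digit unit group code to the requested level."""
    if len(unit_group_code) != 4 or not unit_group_code.isdigit():
        raise ValueError(f"Expected a 4-digit code, got {unit_group_code!r}")
    return unit_group_code[:LEVELS[level]]


# Hypothetical estimates by unit group code.
unit_group_counts = {"2211": 120.0, "2212": 35.0, "2221": 18.0, "3111": 9.0}

# Aggregate the unit group estimates to the sub-major (2-digit) level.
sub_major_counts = Counter()
for code, count in unit_group_counts.items():
    sub_major_counts[occupation_group(code, "sub-major")] += count
```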
Using TableBuilder
For general information relating to the TableBuilder or instructions on how to use features of the TableBuilder product, please refer to TableBuilder.
More specific information applicable to the Characteristics of Employment (COE) survey TableBuilder, which should enable users to understand, interpret and tabulate the data, is outlined below.
Confidentiality in TableBuilder
In accordance with the Census and Statistics Act 1905, all the data in TableBuilder are subjected to a confidentiality process before release. This confidentiality process is undertaken to avoid releasing information that may allow the identification of individuals, families, households, dwellings or businesses.
Processes used in TableBuilder to confidentialise records include the following:
- perturbation of data
- table suppression
Perturbation effects
To minimise the risk of identifying individuals in aggregate statistics, a technique is used to randomly adjust cell values. This technique is called perturbation. Perturbation involves small random adjustments of the statistics and is considered the most satisfactory technique for avoiding the release of identifiable statistics while maximising the range of information that can be released. These adjustments have a negligible impact on the underlying pattern of the statistics.
The introduction of these random adjustments results in tables that may not add up. Randomly adjusted individual cells will be consistent across tables, but the total in any table may not equal the sum of the individual cell values. The size of the difference between summed cells and the relevant total will generally be very small.
Please be aware that perturbation may result in components being larger than their totals. This should be kept in mind when calculating proportions.
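The effect on additivity and proportions can be illustrated with a small Python example. The figures are invented for illustration and do not represent actual ABS output or the perturbation method itself. The example only shows what a user may observe in a perturbed table.

```python
# Hypothetical perturbed cell values and the separately perturbed total.
cells = {"Full-time": 6012.4, "Part-time": 2890.1}
published_total = 8901.7

# The cells need not sum to the published total.
additivity_gap = sum(cells.values()) - published_total

# Proportions calculated against the published total need not sum to 1.
proportions = {name: value / published_total for name, value in cells.items()}
proportion_sum = sum(proportions.values())
```

In this example the cells sum to slightly more than the published total, so the calculated proportions sum to slightly more than 1.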
Table suppression
Some tables generated within TableBuilder may contain a substantial proportion of very low counts within cells (excluding cells that have counts of zero). When this occurs, all values within the table are suppressed to preserve confidentiality. The following error messages are displayed (in red) at the bottom of the table when table suppression has occurred.
ERROR: The table has been suppressed as it is too sparse.
ERROR: table cell values have been suppressed.
Counting units and weights
Weighting is the process of adjusting results from a sample survey to infer results for the total population. To do this, a 'weight' is allocated to each record. The weight is the value that indicates how many population units are represented by each sample unit.
To produce estimates for the in-scope population you must use a weight field in your tables. In TableBuilder, weight fields can be found under the Summation Options category in the left-hand pane under the applicable level. If you do not select a weight field, TableBuilder will apply 'Person weight' by default. This will give you estimates of the number of persons.
If you are estimating the number of persons with certain characteristics (e.g. 'Number of jobs held last week') the weight listed under the category heading 'Person level weighting' must be used.
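The weighting principle can be sketched in Python: a population estimate is the sum of the weights of the sample records that have the characteristic of interest. The records, field names and function name below are hypothetical and do not reflect the actual file layout.

```python
# Hypothetical sample records, each carrying a person weight.
records = [
    {"jobs_held": 1, "person_weight": 812.5},
    {"jobs_held": 2, "person_weight": 640.0},
    {"jobs_held": 1, "person_weight": 955.25},
]


def weighted_count(records, predicate, weight_field="person_weight"):
    """Sum the weights of the records that satisfy `predicate`."""
    return sum(r[weight_field] for r in records if predicate(r))


# Estimated number of persons holding two or more jobs.
multiple_job_holders = weighted_count(records, lambda r: r["jobs_held"] >= 2)

# Estimated size of the whole in-scope population.
all_persons = weighted_count(records, lambda r: True)
```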
When creating a table, a default Summation Item will need to be the Reference year as this item will provide data for the relevant year. This item will then be used for time-series purposes as future data becomes available.
Selecting data items for cross-tabulation
The Person level contains a range of data items detailing the characteristics of respondents including demographic, education, labour force, earnings, working arrangements, trade union membership and population variables.
Populations and data items
When adding a data item to a table, it should be noted that not all respondents to the survey may be associated with that data item. For example, the data item “Duration of current trade union membership” is only applicable to "Trade union members." When using this item in a table, it would be appropriate to also use the population "P12 - Trade union members," to restrict the output of this table to this population only.
Similarly, if multiple data items are included in a table, they should all apply to the same population group.
Cross-tabulating data items on the same level
Cross-tabulating data from the Person Level with other data items from the same level will produce data about people. For example, cross-tabulating the geographic variable 'State or territory of usual residence' by the 'Hours usually worked in main job' produces a table showing the number of people in each state or territory by the hours that they usually work each week in their main job.
Multi-response data items
A number of the survey's data items allow respondents to report more than one response. These are referred to as 'multi–response data items'. An example of such a data item is pictured below. For this data item, respondents can report all the days of the week they usually work.
When a multi–response data item is tabulated, a person is counted against each response they have provided (e.g. a person who responds 'Monday' and 'Thursday' and 'Saturday' will be counted once in each of these three categories).
As a result, each person in the appropriate population is counted at least once, and some people are counted multiple times. Therefore, the total for a multi–response data item will be less than or equal to the sum of its components.
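The counting rule for multi-response items can be illustrated with a short Python example. The respondents and responses are hypothetical, and the counts are unweighted to keep the example simple.

```python
from collections import Counter

# Hypothetical responses: the days of the week each person usually works.
days_usually_worked = {
    "person_1": {"Monday", "Thursday", "Saturday"},
    "person_2": {"Monday"},
    "person_3": {"Tuesday", "Thursday"},
}

# Each person is counted once against every response they gave.
category_counts = Counter(
    day for days in days_usually_worked.values() for day in days
)

# The total counts each person once only.
total_people = len(days_usually_worked)
sum_of_components = sum(category_counts.values())
```

Here three people provide six responses in total, so the total (3) is less than the sum of the components (6).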
For more information on definitions and concepts that apply to the data items in this file, please refer to Characteristics of Employment and Labour Force, Australia.
Using DataLab
DataLab allows real time access to detailed microdata files through a portal to a secure ABS environment. Using detailed microdata in DataLab allows users to run advanced statistical analyses using recent analytical software.
About DataLab
Detailed microdata files in DataLab can be accessed on-site at ABS offices or in a secure virtual environment from your own computer. All unit record data remains in DataLab, and any analysis results or tables are checked by the ABS before being provided to users.
Refer to DataLab for more information, including prerequisites for DataLab access.
Record identifiers
The record identifiers used in the COE and LLFS microdata are consistent across both files. This is to facilitate data linkage between the two files and enable further analysis. The COE survey is collected from private dwellings in seven-eighths of the Labour Force Survey (LFS) sample, so not all August records on the LLFS will have a corresponding COE record. More details on these records and the formatting of record identifiers can be found in the Data Item List.
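Linking the two files on the shared record identifier can be sketched in Python as follows. The field names (record_id, lfs_status, union_member) and the records are hypothetical; the actual identifier names and formats are given in the Data Item List. The sketch also shows that some August LLFS records will have no COE match.

```python
# Hypothetical August LLFS records and COE records sharing an identifier.
llfs_august = [
    {"record_id": "A001", "lfs_status": "Employed"},
    {"record_id": "A002", "lfs_status": "Employed"},
    {"record_id": "A003", "lfs_status": "Employed"},
]
coe = [
    {"record_id": "A001", "union_member": True},
    {"record_id": "A002", "union_member": False},
]

# Index the COE file by record identifier for fast lookup.
coe_by_id = {record["record_id"]: record for record in coe}

# Keep LLFS records that have a COE match, combining the fields of both.
linked = [
    {**llfs_record, **coe_by_id[llfs_record["record_id"]]}
    for llfs_record in llfs_august
    if llfs_record["record_id"] in coe_by_id
]

# LLFS records with no corresponding COE record.
unmatched_ids = [
    llfs_record["record_id"]
    for llfs_record in llfs_august
    if llfs_record["record_id"] not in coe_by_id
]
```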
Weights
Person level weights (and replicate weights for calculating standard errors) are provided on the COE file. These differ from the weights provided on the LLFS file, as the weights are recalibrated for COE due to the reduced sample size compared to the LFS. Aggregate estimates from both sets of weights will align closely, as the COE survey data is benchmarked to match trend estimates from the LFS (as published in the October 2023 issue), but care should be taken when performing micro analysis.
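Replicate weights are generally used by recalculating an estimate once with each replicate weight and measuring how far those estimates spread around the full-sample estimate. The Python sketch below assumes a delete-a-group jackknife, which this page does not confirm. The variance formula, the number of replicates and the figures are assumptions, and the Methodology page should be consulted for the method that applies to the COE file.

```python
import math


def jackknife_standard_error(full_estimate: float,
                             replicate_estimates: list[float]) -> float:
    """Standard error under an assumed delete-a-group jackknife.

    SE = sqrt((G - 1) / G * sum((replicate - full) ** 2)), where G is the
    number of replicate estimates.
    """
    g = len(replicate_estimates)
    if g < 2:
        raise ValueError("At least two replicate estimates are required")
    squared_deviations = sum(
        (rep - full_estimate) ** 2 for rep in replicate_estimates
    )
    return math.sqrt((g - 1) / g * squared_deviations)


# Hypothetical estimate from the main weight and from three replicate weights.
full_estimate = 100.0
replicate_estimates = [98.0, 101.0, 103.0]

standard_error = jackknife_standard_error(full_estimate, replicate_estimates)
relative_standard_error = standard_error / full_estimate * 100  # per cent
```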
COE weights are recommended for cross-sectional analysis of COE data items, but when linking COE and LLFS data for longitudinal analysis, new weights should be calculated based on the population benchmarks provided on the LLFS file. Care should be taken to account for attrition bias by adjusting the weights appropriately (increasing the weights for those more likely to leave the LFS). More information on using benchmarks and weights for longitudinal analysis is provided in Longitudinal Labour Force.
Earnings and other parameters
Data related to earnings and other parameters (including duration, hours, ages, etc.) are presented using two data items:
- flag item - indicates which records have earnings or other parametric data and which records do not. A flag of '1' indicates that the record has parametric data, and flags of '0' or negative values indicate that the record does not have parametric data and are categorised by the reason why they were excluded. Flag data items use identifiers that end in 'A'.
- values item - provides the value for earnings or other parametric information. Values data items have identifiers that end in 'B'.
Records that have a values data item equal to '0' have an ambiguous meaning when used on their own - it could indicate a parametric value of zero (e.g. 0 hours), or it could also indicate that there is no parametric data. Zero values should be interpreted in conjunction with the flag data item to determine the meaning and whether they should be included in analysis or not.
When analysing parametric items, such as calculating median or average weekly earnings, the data should first be filtered for records that have a flag value of '1' before performing the analysis.
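The filtering step can be sketched in Python as follows. The identifiers HOURS_A and HOURS_B are hypothetical names that follow the 'A' (flag) and 'B' (values) suffix convention described above, and the records and weights are invented. The weighted median shown is one simple definition (the smallest value at which the cumulative weight reaches half the total weight) and is not necessarily the method used by the ABS.

```python
def weighted_median(values_and_weights):
    """Smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(values_and_weights)
    total_weight = sum(weight for _, weight in pairs)
    if total_weight <= 0:
        raise ValueError("Total weight must be positive")
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total_weight / 2:
            return value


# Hypothetical records: flag item (HOURS_A), values item (HOURS_B), weight.
records = [
    {"HOURS_A": 1, "HOURS_B": 38, "weight": 500.0},
    {"HOURS_A": 1, "HOURS_B": 0, "weight": 300.0},   # genuine value of zero
    {"HOURS_A": 0, "HOURS_B": 0, "weight": 400.0},   # no parametric data
    {"HOURS_A": -1, "HOURS_B": 0, "weight": 250.0},  # no parametric data
    {"HOURS_A": 1, "HOURS_B": 45, "weight": 450.0},
]

# Keep only records flagged '1' before analysing the values item.
valid = [(r["HOURS_B"], r["weight"]) for r in records if r["HOURS_A"] == 1]
median_filtered = weighted_median(valid)

# Without the filter, zeros that mean "no data" distort the result.
median_unfiltered = weighted_median(
    (r["HOURS_B"], r["weight"]) for r in records
)
```

In this example the filtered median is 38, while the unfiltered median is 0 because records without parametric data are wrongly treated as zero values.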
Data item lists
2014 to 2024
DataLab: Characteristics of employment, Data item list, 2014 to 2024
TableBuilder: Characteristics of employment, Data item list, 2014 to 2024
Historical microdata - 1998 to 2010
DataLab: Forms of employment, Data item list, 1998
DataLab: Employee earnings, benefits and trade union membership, Data item list, 2004
DataLab: Employee earnings, benefits and trade union membership, Data item list, 2006
DataLab: Employee earnings, benefits and trade union membership, Data item list, 2008
DataLab: Forms of employment, Data item list, 2008
DataLab: Employee earnings, benefits and trade union membership, Data item list, 2010
Previous microdata releases
Historical microdata, 1998 to 2010
Prior to 2014, microdata relating to employment characteristics was released in a number of Confidentialised Unit Record Files (CURFs). These files are available in ABS DataLab.
- Forms of Employment, August 1998.
- Employee Earnings, Benefits and Trade Union Membership, August 2004.
- Employee Earnings, Benefits and Trade Union Membership, August 2006.
- Employee Earnings, Benefits and Trade Union Membership, August 2008.
- Forms of Employment, November 2008.
- Employee Earnings, Benefits and Trade Union Membership, August 2010.
History of changes
19/12/2024
This update coincides with the release of microdata from the 2024 Characteristics of Employment Survey into TableBuilder and DataLab.
08/03/2024
This update coincides with the release of microdata from the 2023 Characteristics of Employment Survey into TableBuilder.
13/12/2023
This update coincides with the release of microdata from the 2023 Characteristics of Employment Survey into DataLab. The release of the August 2023 Characteristics of Employment microdata into TableBuilder has been delayed until next year due to continuing upgrades to the TableBuilder system infrastructure.
16/12/2022
This update coincides with the release of microdata from the 2022 Characteristics of Employment Survey into TableBuilder and DataLab.
New items were added to make it easier to identify employees on a fixed-term contract and labour hire workers in TableBuilder. No changes were required for the microdata in DataLab.
04/08/2022
This issue provides details on the first release of COE microdata in DataLab as a supplementary file for the Longitudinal Labour Force (LLFS) microdata. This file features data collected annually for the months August 2014 to August 2021. A future update to include the results from the August 2022 COE survey is scheduled for release later in the year on 14 December 2022.
20/04/2022
Minor updates to labels related to fixed-term contracts and leave entitlements. More details are provided in the updated Data Item List available in Data Downloads.
14/12/2021
This update coincides with the release of data from the 2021 Characteristics of Employment Survey into TableBuilder.
Many improvements were made to the COE TableBuilder to simplify the way the data items were presented, increase the usability and increase the range of data items.
- Data items now grouped under 18 conceptual groups.
- The "Not applicable" categories in each data item now have descriptive labels to describe which populations are not included.
- Data item labels were revised or shortened to improve interpretability (note the concepts remained the same).
- Hours worked, hours preferred and duration data items were aligned to consistent higher level groupings.
- Occupation data now provided at the detailed unit group level (4-digit).
- The parametric item for hourly earnings in main job was changed from dollars to cents, as the software performs better with integers.
Previous catalogue number
This release previously used catalogue number 6333.0.00.001.