Microdata and TableBuilder: Sun protection behaviours

Contains data on sun protection behaviours, sunburn and suntan.

The Sun protection behaviours survey collected information from people aged 15 years and over about sunburn, sun protection and sun tanning. The survey was funded by the Cancer Councils of Australia. See Sun protection behaviours for summary results, methodology and other information.

Accessing the data

TableBuilder - produce your own tables and graphs.

Detailed microdata - approved users can access a remote desktop environment in DataLab for in-depth and interactive data analysis using a range of statistical software packages.

Compare data services to see what's right for you or refer to Frequently asked questions.

Data and file structure

Data items include:

  • Demographics, such as age, sex and country of birth
  • Geography
  • Labour force characteristics
  • Education, such as highest educational attainment
  • Income
  • Self-assessed health status
  • Sun protection behaviours, including whether outdoors for more than 15 minutes during peak UV time in the last week and sun protection measure(s) implemented.

Refer to the microdata data item list in the data downloads section for detailed information on items available.

The Sun protection behaviours data is structured as a single-level person file.

Using TableBuilder

Refer to the relevant sections of the TableBuilder main page for information on how to create basic tables, custom groups, graphs and large tables.

Weights

When tabulating data in TableBuilder, person weights are automatically applied to the underlying sample counts. Weighting is the process of adjusting results from a sample survey to infer results for the total population. To do this, a 'weight' is allocated to each sample unit. The weight is the value that indicates how many population units are represented by the sample unit.

Not applicable categories

Most data items in the TableBuilder file include a 'Not applicable' category. Where relevant, the classification values of these categories are shown in the microdata data item list in the Data downloads section. The 'Not applicable' category generally represents the number of people who were not asked a particular question, or who were excluded from the population for a data item when that item was derived. For example, 'Most recent date experienced sunburn' is not applicable for people who did not experience sunburn in the last week.

Table populations

The population relevant to each data item is identified in the data item list and should be kept in mind when extracting and analysing data. The actual population estimate for each data item is equal to the total cumulative frequency minus the 'Not applicable' category.
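The arithmetic described above can be sketched in a few lines of Python. The category labels and counts below are hypothetical and are not taken from the survey; they only illustrate subtracting the 'Not applicable' category from the table total.

```python
# Hypothetical weighted counts for one data item (not real survey figures).
counts = {
    "Sunburnt in last week": 1200.0,
    "Not sunburnt in last week": 8300.0,
    "Not applicable": 500.0,
}


def applicable_population(table, na_label="Not applicable"):
    """Return the table total minus the 'Not applicable' category."""
    total = sum(table.values())
    return total - table.get(na_label, 0.0)


population = applicable_population(counts)
print(population)
```

If a data item has no 'Not applicable' category, the function simply returns the table total.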

Continuous data items

The Sun protection behaviours TableBuilder includes several continuous data items. For these items:

  • Responses can take a value at any point along a continuum.
  • Some items are allocated special codes for certain responses (e.g. 0 = 'Not applicable'). These codes are shown in the data item list.
  • When ranges are created in TableBuilder, special codes are automatically excluded. The total therefore shows only 'valid responses' rather than all responses (including special codes).
  • Items with special codes have a corresponding categorical item on the Person level that allows data for the special codes to be displayed. Refer to the data item list.
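For users reproducing this behaviour outside TableBuilder, the following is a minimal sketch of how special codes might be separated from valid responses before ranges are built. The values, the ranges and the single special code (0 = 'Not applicable') are hypothetical, and the counts are unweighted for simplicity; the actual codes for each item are in the data item list.

```python
# Hypothetical special codes; the real codes are listed in the data item list.
SPECIAL_CODES = {0: "Not applicable"}

# Hypothetical responses for one continuous data item (unweighted).
values = [0, 3, 12, 0, 7, 25, 1]


def tabulate_ranges(values, ranges, special_codes):
    """Count valid responses in each inclusive range, excluding special codes."""
    valid = [v for v in values if v not in special_codes]
    table = {f"{lo}-{hi}": sum(lo <= v <= hi for v in valid) for lo, hi in ranges}
    table["Total (valid responses)"] = len(valid)
    return table


ranges = [(1, 9), (10, 19), (20, 29)]
range_table = tabulate_ranges(values, ranges, SPECIAL_CODES)

# The special codes are reported separately, as a categorical item would do.
special_table = {label: values.count(code) for code, label in SPECIAL_CODES.items()}

print(range_table)
print(special_table)
```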

Multiple-response data items

A number of data items allow respondents to report more than one response. For these items, a person is counted against each category they responded to and consequently the sum of the categories may be different to the total. An example of such a data item is 'Days of the week when outdoors for longer than 15 minutes during peak UV times'. For this data item, respondents can report multiple days on which they were outdoors in this manner.

Multiple-response data items are identified in the data item list by 'multiple response' in the data item label. In TableBuilder they are denoted by '(MR)' in the data item label. The data item list can be accessed from the Data downloads section.
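The effect of multiple responses on totals can be illustrated with a short sketch. The three respondents and the field name "days" are hypothetical; the point is only that the sum of the category counts can exceed the number of persons.

```python
from collections import Counter

# Hypothetical respondents, each reporting one or more days outdoors.
respondents = [
    {"days": {"Saturday", "Sunday"}},
    {"days": {"Monday"}},
    {"days": {"Saturday"}},
]

# A person is counted once against every category they reported.
category_counts = Counter(day for person in respondents for day in person["days"])

total_persons = len(respondents)
sum_of_categories = sum(category_counts.values())

print(dict(category_counts))
print(total_persons, sum_of_categories)
```

Here three persons produce four category counts, so the categories should not be summed to obtain a person total.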

Confidentiality

A confidentiality process called perturbation is applied to the data in TableBuilder to avoid releasing information that may lead to the identification of individuals, families, households, dwellings or businesses. See Confidentiality and relative standard error.

Using DataLab

The DataLab environment provides real-time access to detailed microdata from the Sun protection behaviours survey.

The DataLab is an interactive data analysis solution in which users can run advanced statistical analyses, for example, multiple regressions and structural equation modelling. The DataLab environment contains recent versions of analytical software, including R, SAS, Stata and Python. Controls have been put in place in the DataLab to prevent the identification of individuals and organisations. All output from DataLab sessions is cleared by an ABS officer before it is released.

For information about all of the data items available in the Sun protection behaviours DataLab product please see the microdata data item list.  

For more information on DataLab, including prerequisites for access, please see the DataLab page.

Reliability of estimates

As the survey was conducted on a sample of households in Australia, the method of sample selection must be taken into account when deriving estimates from the detailed microdata. A person's chance of selection in the survey varied depending on the state or territory in which they lived. If these chances of selection are not accounted for by using appropriate weights, the results could be biased.

Each person record has a main weight (FINWTPC). This weight indicates how many population units are represented by the sample unit. When producing estimates of sub-populations from the detailed microdata, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. If each person’s weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.
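As a minimal sketch of the point above, the following Python compares weighted estimates with raw sample counts. Only the weight name FINWTPC comes from the microdata; the records, the grouping field "sex" and the weight values are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical person records; only the weight name FINWTPC is from the microdata.
records = [
    {"sex": "Male", "FINWTPC": 1500.0},
    {"sex": "Female", "FINWTPC": 2100.0},
    {"sex": "Female", "FINWTPC": 900.0},
]


def weighted_estimate(records, by, weight="FINWTPC"):
    """Estimate the population in each category by summing person weights."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[by]] += rec[weight]
    return dict(totals)


estimates = weighted_estimate(records, by="sex")
sample_counts = Counter(rec["sex"] for rec in records)

print(estimates)            # population estimates (sum of weights)
print(dict(sample_counts))  # raw sample counts, not population estimates
```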

It is also important to calculate a measure of sampling error for each estimate. Sampling error occurs because only part of the population is surveyed to represent the whole population. It should be considered when interpreting estimates, as it gives an indication of their accuracy and of the confidence that can be placed in interpretations based on them. Measures of sampling error include standard error (SE), relative standard error (RSE) and margin of error (MoE). These measures can be estimated using the replicate weights. The replicate weight variables provided on the microdata are labelled WPC01XX, where XX represents the number of the given replicate group. The exact number of replicates will vary depending on the survey but will generally be 30, 60 or 200 replicate groups. As an example, for survey microdata with 30 replicate groups, you will find 30 person replicate weight variables labelled WPC0101 to WPC0130.
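The following is a hedged sketch of how replicate weights might be used to estimate sampling error. It assumes a delete-a-group jackknife, where SE = sqrt((G - 1) / G × Σ (replicate estimate - main estimate)²) and G is the number of replicate groups; this formula is not stated on this page and should be confirmed against the replicate weights guidance before use. The records, the field name "sunburnt" and the use of only two replicate groups are hypothetical, chosen to keep the example short. The 95% MoE is taken as 1.96 × SE.

```python
import math


def jackknife_se(main_estimate, replicate_estimates):
    """Standard error under an assumed delete-a-group jackknife."""
    g = len(replicate_estimates)
    squared_diffs = sum((rep - main_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt((g - 1) / g * squared_diffs)


def estimate_with_error(records, in_category, main_weight, replicate_weights):
    """Weighted estimate for a category with SE, RSE (%) and 95% MoE."""
    selected = [r for r in records if in_category(r)]
    main = sum(r[main_weight] for r in selected)
    reps = [sum(r[w] for r in selected) for w in replicate_weights]
    se = jackknife_se(main, reps)
    rse_pct = 100 * se / main if main else None
    return {"estimate": main, "se": se, "rse_pct": rse_pct, "moe_95": 1.96 * se}


# Hypothetical records with two replicate groups (real files have 30 or more).
records = [
    {"sunburnt": True, "FINWTPC": 100.0, "WPC0101": 110.0, "WPC0102": 90.0},
    {"sunburnt": False, "FINWTPC": 200.0, "WPC0101": 190.0, "WPC0102": 210.0},
    {"sunburnt": True, "FINWTPC": 50.0, "WPC0101": 55.0, "WPC0102": 45.0},
]

# Replicate weight names follow the WPC01XX pattern described above.
replicate_weights = [f"WPC01{group:02d}" for group in range(1, 3)]

result = estimate_with_error(
    records, lambda r: r["sunburnt"], "FINWTPC", replicate_weights
)
print(result)
```

For a file with 30 replicate groups, the same pattern with range(1, 31) would generate the names WPC0101 to WPC0130.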

Using replicate weights for estimating sampling error

Overview of replication methods

How to use replicate weights

Data downloads

Sun protection behaviours, Nov 2023 to Feb 2024 Microdata Data Item List

Methodology


Post release changes

