4177.0.55.002 - Microdata: Participation in Sport and Physical Recreation, Australia, 2011-12 Quality Declaration 
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 09/05/2013   

FILE STRUCTURE


DATA AVAILABLE BY LEVEL
WEIGHTS AND ESTIMATION
NOT APPLICABLE CATEGORIES
SPECIAL CODES
HOUSEHOLD SIZE
POPULATIONS
STANDARD ERRORS


DATA AVAILABLE BY LEVEL

The 2011-12 Multipurpose Household Survey (MPHS) asked 13,630 respondents across Australia a range of questions about their participation in sport and physical recreation activities over a 12-month period. Responses to these questions, along with a range of socio-demographic data, are available as microdata through Confidentialised Unit Record Files (CURFs) or through TableBuilder files. The microdata files have two levels:
1. Person level
2. Activity level

These levels are hierarchical as each activity must be linked to a person. A person identifier exists on each level which allows data users to combine people's characteristics with the activities they undertake.
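The linkage between the two levels can be illustrated with a minimal Python sketch. The field name PERSON_ID and the sample records below are hypothetical; the actual identifier name and data items are given in the data item list.

```python
# Hypothetical records: PERSON_ID stands in for the person identifier
# that appears on both levels; all values are made up for illustration.
persons = [
    {"PERSON_ID": 1, "SEX": "F", "AGE": 34},
    {"PERSON_ID": 2, "SEX": "M", "AGE": 51},
]
activities = [
    {"PERSON_ID": 1, "ACTIVITY": "Swimming"},
    {"PERSON_ID": 1, "ACTIVITY": "Walking"},
    {"PERSON_ID": 2, "ACTIVITY": "Golf"},
]


def attach_person_details(persons, activities, key="PERSON_ID"):
    """Return one merged record per activity, carrying the person's characteristics."""
    by_id = {p[key]: p for p in persons}  # index the Person level by identifier
    return [{**by_id[a[key]], **a} for a in activities]


merged = attach_person_details(persons, activities)
```

Each merged record holds one activity together with the characteristics of the person who undertook it, so a person with several activities appears on several records.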

Person level

The Person level contains all of the standard demographic characteristics of each person such as age, sex, country of birth, education and labour force status. The level also contains person characteristic data items relevant to participation in sport and physical recreation activities.

In addition, the level includes some household characteristics applicable to the respondent such as equivalised weekly household income and whether any children aged 14 years or under are present in the household.

All geographic identifiers are included on the Person level (i.e. state/territory of usual residence, remoteness area and capital city/balance of state).

Activity level

The Activity level contains details relating to each individual sport or physical recreation activity participated in by each respondent. A maximum of 10 sports and physical recreation activities were recorded in the survey for each respondent.

A complete data item list can be accessed from the Downloads page.

WEIGHTS AND ESTIMATION

As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.

There is one weight provided: a person weight (FINWTPA). This should be used when analysing data at the person level and activity level.

Where estimates are derived, it is essential that they are calculated by adding the weights of persons in each category, and not just by counting the number of records falling into each category. If each person's 'weight' were to be ignored, then no account would be taken of a person's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that the person estimates conform to an independently estimated distribution of the population by age, sex, state/territory, part of state and labour force status.
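As an illustration of the difference between counting records and adding weights, the following Python sketch sums FINWTPA within each category of a data item. The records and weight values are invented for the example, and SEX is used only as a convenient category name.

```python
from collections import defaultdict

# Made-up person records; only the weight name FINWTPA comes from the file.
records = [
    {"SEX": "F", "FINWTPA": 1200.0},
    {"SEX": "F", "FINWTPA": 800.0},
    {"SEX": "M", "FINWTPA": 1500.0},
]


def weighted_counts(records, category, weight="FINWTPA"):
    """Estimate the population in each category by adding person weights."""
    totals = defaultdict(float)
    for r in records:
        totals[r[category]] += r[weight]
    return dict(totals)


def unweighted_counts(records, category):
    """Count sample records per category (not a population estimate)."""
    counts = defaultdict(int)
    for r in records:
        counts[r[category]] += 1
    return dict(counts)


estimates = weighted_counts(records, "SEX")
sample_sizes = unweighted_counts(records, "SEX")
```

Here the two female records represent an estimated 2,000 persons, whereas a simple record count would give 2.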

NOT APPLICABLE CATEGORIES

Most data items in the microdata include a 'Not applicable' category. This category comprises those respondents who were not asked a particular question and who are therefore outside the population to which the data item refers. The classification values of the 'Not applicable' category, where relevant, are shown in the data item lists in the Downloads tab.

SPECIAL CODES

For some data items certain classification values have been reserved as special codes and must not be added as if they were quantitative values. These special codes generally relate to data items such as income. For example, code 999999999 for the data item 'Weekly personal income from all sources - Parametric', refers to 'Income unknown or not stated'.
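The effect of a special code can be seen in a short Python sketch that computes a weighted mean income. The field name INCOME is a hypothetical stand-in for the parametric income item, and the records are invented; only the code 999999999 and the weight FINWTPA come from the text above.

```python
INCOME_UNKNOWN = 999999999  # 'Income unknown or not stated'


def weighted_mean_income(records, income="INCOME", weight="FINWTPA"):
    """Weighted mean of income, leaving out records that carry the special code."""
    valid = [r for r in records if r[income] != INCOME_UNKNOWN]
    total_weight = sum(r[weight] for r in valid)
    if total_weight == 0:
        return None  # no usable records
    return sum(r[income] * r[weight] for r in valid) / total_weight


# Made-up records; INCOME is an assumed field name.
records = [
    {"INCOME": 500, "FINWTPA": 1000.0},
    {"INCOME": INCOME_UNKNOWN, "FINWTPA": 2000.0},
    {"INCOME": 1500, "FINWTPA": 1000.0},
]
mean_income = weighted_mean_income(records)
```

If the special code were treated as a dollar amount, the second record alone would push the mean into the hundreds of millions. Any 'Not applicable' code for the item would need to be excluded in the same way.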

HOUSEHOLD SIZE

Some inconsistencies may occur between household size and family composition recorded on the CURF and TableBuilder file. The household size is calculated by adding together all persons in a household who are in scope of the MPHS. The scope of the MPHS excludes several groups of people such as those in the defence forces. By comparison, when deriving the family composition there are no scope exclusions and all members of the household are included. An inconsistency may arise, for example, where one member of a couple in a household may be in the defence forces. In this case, the household size will be one person, while the family composition data item will indicate a couple household. The Participation in Sport and Physical Recreation Explanatory Notes page provides more detail about the MPHS scope.

POPULATIONS

The population relevant to each data item is shown in the data item list and should be considered when extracting and analysing the microdata. The actual population count for each data item is equal to the total cumulative frequency minus the frequency of the 'Not applicable' category.

Generally, all populations, including very specific populations, can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employed persons', any data item with that population (excluding the 'Not applicable' category) can be used as a filter.

As a further example, the CURF data items 'Status in employment' (STATEMPC) and 'Industry (ANZSIC 2006)' (IND06CF) are applicable to employed persons only. Therefore, either of the following filters could be used when restricting a table to 'Employed persons' only:

STATEMPC > 0 or IND06CF < 2600

(Note: For those data items, the 'Not applicable' categories (i.e. those persons who are not employed) are codes 0 and 2600 respectively and would be excluded from either population filter shown above.)

Similarly, code 1 for the data item 'Labour force status' (LFSTATUS) is 'employed persons'. If the population of interest is employed persons, the filter LFSTATUS = 1 could be used.
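The three filters described above can be sketched in Python as follows. The records are invented, and the LFSTATUS value 2 used for the person who is not employed is an assumption made for the example; only the codes 0, 2600 and 1 quoted above are taken from the text.

```python
# Made-up records. LFSTATUS = 2 for the non-employed person is an assumed code.
records = [
    {"STATEMPC": 1, "IND06CF": 1200, "LFSTATUS": 1, "FINWTPA": 900.0},
    {"STATEMPC": 0, "IND06CF": 2600, "LFSTATUS": 2, "FINWTPA": 1100.0},
    {"STATEMPC": 2, "IND06CF": 300, "LFSTATUS": 1, "FINWTPA": 700.0},
]


def total_weight(records, keep, weight="FINWTPA"):
    """Sum the person weights of the records that pass the filter."""
    return sum(r[weight] for r in records if keep(r))


# Each filter should select the same 'Employed persons' population.
by_status = total_weight(records, lambda r: r["STATEMPC"] > 0)
by_industry = total_weight(records, lambda r: r["IND06CF"] < 2600)
by_lfs = total_weight(records, lambda r: r["LFSTATUS"] == 1)
```

Because all three filters exclude the same 'Not applicable' records, they yield the same weighted estimate of employed persons.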

STANDARD ERRORS

Each record on the person level and activity level also contains 30 replicate weights and, by using these weights, it is possible to calculate Standard Errors (SEs) for weighted estimates produced from the microdata. This method is known as the 30 group Jack-knife variance estimator.

Under the Jack-knife method of replicate weighting, weights were derived as follows:
  • 30 replicate groups were formed, each group mirroring the overall sample (units from a collection district all belong to the same replicate group, and a unit can belong to only one replicate group)
  • one replicate group was dropped from the file and then the remaining records were weighted in the same manner as for the full sample
  • records in that group that were dropped received a weight of zero.
This process was repeated for each replicate group (i.e. a total of 30 times). Ultimately each record had 30 replicate weights attached to it with one of these being the zero weight.
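The structure of the resulting replicate weights can be sketched in Python. This is a simplification and not the procedure actually used: the remaining records are merely scaled by 30/29 here, whereas the survey reweighted them in the same manner as for the full sample. The function name and inputs are hypothetical.

```python
def make_replicate_weights(weights, groups, n_groups=30):
    """Build delete-a-group replicate weights for each record.

    Simplifying assumption: records outside the dropped group are scaled
    by n_groups / (n_groups - 1) rather than fully reweighted.
    """
    factor = n_groups / (n_groups - 1)
    replicate = []
    for w, g in zip(weights, groups):
        # Replicate k drops group k: that group's records get zero weight.
        replicate.append(
            [0.0 if g == k else w * factor for k in range(1, n_groups + 1)]
        )
    return replicate


# Two made-up records belonging to replicate groups 1 and 2.
rep = make_replicate_weights([100.0, 200.0], [1, 2])
```

Each record ends up with 30 replicate weights, exactly one of which is zero (the replicate in which its own group was dropped).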

Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses, such as chi-square tests and logistic regression, to be conducted in a way that takes the sample design into account. For any variable of interest, an estimate can be calculated using each of the 30 sets of replicate weights, giving 30 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight), is then used to approximate the variance of the full sample estimate.

To obtain the SE of a weighted estimate y, the same estimate is calculated using each of the 30 replicate weights. The variability between these replicate estimates (denoted y(g) for group number g) is used to measure the SE of the original weighted estimate y using the formula:

$$SE(y) = \sqrt{\frac{29}{30} \sum_{g=1}^{30} \left( y(g) - y \right)^{2}}$$

where:

g = the replicate group number

y(g) = the weighted estimate, having applied the weights for replicate group g

y = the weighted estimate from the sample.
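The calculation can be sketched in Python, assuming the usual delete-a-group form of the estimator, in which the sum of squared differences between each y(g) and y is multiplied by 29/30 before the square root is taken. The function name and the example values are hypothetical.

```python
import math


def jackknife_se(full_estimate, replicate_estimates):
    """Delete-a-group Jack-knife SE of a weighted estimate.

    Assumes SE = sqrt((G - 1) / G * sum((y(g) - y)^2)), with G the number
    of replicate groups (30 for this file).
    """
    n_groups = len(replicate_estimates)
    if n_groups < 2:
        raise ValueError("at least two replicate estimates are required")
    sum_sq = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((n_groups - 1) / n_groups * sum_sq)


# Made-up estimates: y from the person weight, y(g) from each replicate weight.
y = 1000.0
y_reps = [1010.0 if g % 2 else 990.0 for g in range(30)]
se = jackknife_se(y, y_reps)
```

In practice y would be obtained by adding FINWTPA over the records of interest, and each y(g) by adding the corresponding replicate weight over the same records.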


The 30 group Jack-knife method can be applied not just to estimates of the population total, but also where the estimate y is a function of estimates of the population total, such as a proportion, difference or ratio. For more information on the 30 group Jack-knife method of SE estimation, see Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee), July 1999 (cat. no. 1352.0.55.029).
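For a proportion, the same approach applies: the proportion is recalculated under each set of replicate weights and the Jack-knife formula is applied to those values. The Python sketch below assumes the delete-a-group form with a (G - 1)/G multiplier. It uses three replicate groups instead of 30 only to keep the example short, and the field names PARTICIPANT and REPWTS are hypothetical.

```python
import math


def weighted_proportion(records, flag, weights):
    """Share of the total weight held by records where `flag` is true."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w for r, w in zip(records, weights) if r[flag]) / total


def proportion_se(records, flag, full_key="FINWTPA", rep_key="REPWTS"):
    """Return (proportion, Jack-knife SE) from full and replicate weights."""
    full = weighted_proportion(records, flag, [r[full_key] for r in records])
    n_reps = len(records[0][rep_key])
    reps = [
        weighted_proportion(records, flag, [r[rep_key][g] for r in records])
        for g in range(n_reps)
    ]
    sum_sq = sum((p - full) ** 2 for p in reps)
    return full, math.sqrt((n_reps - 1) / n_reps * sum_sq)


# Made-up records, one per replicate group; REPWTS is an assumed field name.
records = [
    {"PARTICIPANT": True, "FINWTPA": 100.0, "REPWTS": [0.0, 150.0, 150.0]},
    {"PARTICIPANT": False, "FINWTPA": 100.0, "REPWTS": [150.0, 0.0, 150.0]},
    {"PARTICIPANT": True, "FINWTPA": 100.0, "REPWTS": [150.0, 150.0, 0.0]},
]
proportion, se = proportion_se(records, "PARTICIPANT")
```

The numerator and denominator are both recomputed for every replicate, so the SE reflects the variability of the ratio itself and not just of one of its components.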

Use of the 30 group Jack-knife method for complex estimates, such as regression parameters from a statistical model, is not straightforward and may not be appropriate. The method as described does not apply to investigations where survey weights are not used, such as in unweighted statistical modelling.