4430.0.30.002 - Microdata: Disability, Ageing and Carers, Australia, 2012 Quality Declaration 
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 22/05/2014   

FILE STRUCTURE


DATA AVAILABLE BY LEVEL
USING REPEATING DATASETS
WEIGHTS AND ESTIMATION


DATA AVAILABLE BY LEVEL

The SDAC 2012 data is available across ten levels. Many of the levels have a hierarchical relationship:

1. Household
2. Family
3. Income Unit
4. Person
5. All conditions
6. Restrictions
7. Specific activities
8. All recipients (not available on TableBuilder)
9. Broad activities
10. Assistance providers

The first four levels of SDAC data are in a hierarchical relationship, where each level is derived from the previous. These levels can be described as follows: a person is a member of an income unit, which is a member of a family, which is a member of a household. A household may have more than one family, while a family may have more than one income unit, and so on.

Levels 5 to 9 relate to the characteristics of conditions, restrictions and activities, with each being a sub-level of level 4 (Person). That is, a person can have multiple conditions and restrictions, as well as require assistance with one or more activities. Level 10 is a sub-level of level 9 (Broad activities), as it relates to characteristics of the assistance provided for activities identified in level 9. An activity can be undertaken with the assistance of one or more providers.

There are ‘dummy’ or ‘Not Applicable’ records at each of the sub-person levels 5 to 9, which cover those instances where a person would otherwise not contribute a record to a particular level. For example, a person with no conditions has no real record on the 'All conditions' level, so a ‘Not Applicable’ record stands in for them. This allows data items on sub-person levels to be used for calculating the total of ‘all persons’.

Additionally, ‘Not Applicable’ records exist at the Assistance Providers level for those people who experience difficulty with a broad activity (i.e. a record exists on level 9), but do not have a provider of assistance for that activity (i.e. no record exists on level 10).
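As a rough illustration of how ‘Not Applicable’ records keep a sub-person level usable for ‘all persons’ totals, the sketch below uses invented records. The field names (person_id, condition) and all values are hypothetical and are not the CURF layout.

```python
# Hypothetical person-level records (level 4); identifiers are invented.
persons = [{"person_id": 1}, {"person_id": 2}, {"person_id": 3}]

# Hypothetical 'All conditions' records (level 5). Person 3 reported no
# conditions, so a single 'Not applicable' record stands in for them.
conditions = [
    {"person_id": 1, "condition": "Arthritis"},
    {"person_id": 1, "condition": "Asthma"},
    {"person_id": 2, "condition": "Hearing loss"},
    {"person_id": 3, "condition": "Not applicable"},
]

# Every person appears on the level, so it can still be used for
# 'all persons' totals.
persons_on_level = {rec["person_id"] for rec in conditions}
all_persons_covered = persons_on_level == {p["person_id"] for p in persons}

# Excluding the dummy records gives the persons who reported a condition.
persons_with_condition = {
    rec["person_id"] for rec in conditions if rec["condition"] != "Not applicable"
}

print(all_persons_covered, len(persons_with_condition))
```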

Broadly, each level provides the following:
  • household level - information about the household size and structure and household income details
  • family level - information about the family size and structure, including whether there is a carer and/or a person with disability in the family
  • income unit level - information about the income unit size and whether there is a primary carer in the income unit
  • person level (main level and includes person level 2 on TableBuilder) - all demographic and socio-economic characteristics of the survey respondents, and most of the health and related information they provided
  • all conditions level - detailed information about the long-term health conditions reported in the survey
  • restrictions level - detailed information about the restrictions reported in the survey
  • specific activities level - detailed information about how much support people need to perform specific activities, such as moving about their place of residence
  • recipient level (not available on TableBuilder) - detailed information on respondents who need help or supervision with everyday activities because of their age or disability, including the types of assistance they need
  • broad activities level - detailed information about how much support people need to perform tasks at the broad activity level (e.g. mobility, communication)
  • assistance providers level - detailed information on people providing assistance to others because of age or disability, including the types of assistance they provide.

While the survey collects data from private dwellings and health establishments, only private dwellings are included at the household, family and income unit levels. A full listing of output data items available on the CURF and TableBuilder can be accessed on the Downloads tab of this release.

USING REPEATING DATASETS

The 'one to many' relationships described by levels 5 to 10 are known as repeating datasets, that is, sets of data with a counting unit that may be repeated for a person. For example, the repeating dataset for conditions has one record per condition reported, because the condition is the counting unit. Repeating datasets are only useful when common information is collected for each instance of a counting unit. For example, each condition reported has the data item 'Whether reported condition is main condition' associated with it. Only one of the conditions reported for a particular person has a 1 (Yes) for this item. This enables a table to be run on 'All conditions' by 'Whether reported condition is main condition' to ascertain which conditions cause the most problems.
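A minimal sketch of such a table for one person is shown here, using invented records. The field names are hypothetical. The value 1 (Yes) follows the text, while the value 0 for 'No' is an assumption made only for this example.

```python
from collections import Counter

# Invented conditions-level records for a single person. 'is_main' stands in
# for 'Whether reported condition is main condition': 1 = Yes as in the text,
# 0 = No is an assumed code for this sketch.
records = [
    {"person_id": 1, "condition": "Arthritis", "is_main": 1},
    {"person_id": 1, "condition": "Asthma", "is_main": 0},
    {"person_id": 1, "condition": "Hearing loss", "is_main": 0},
]

# Cross-tabulate condition by the main-condition flag.
table = Counter((r["condition"], r["is_main"]) for r in records)

# The total counts conditions (the counting unit), not persons.
total_conditions = sum(table.values())
main_conditions = [cond for (cond, flag) in table if flag == 1]

for (cond, flag), n in sorted(table.items()):
    print(cond, flag, n)
print("Total conditions:", total_conditions)
```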

Note that although the output above only relates to a single person, the totals are a count of all conditions for that person. As with the person level file, some data items in a repeating dataset are only applicable to a particular sub-population of the dataset. For instance, the item 'Whether assistance is always or sometimes required with each activity' from the specific activities level is only applicable for activities where the respondent needs assistance. Records outside the sub-population will appear as 'Not applicable'.

WEIGHTS AND ESTIMATION

As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.

There are two survey weights provided: a person weight (FINWTP) and a household weight (FINWTH). These should be used when analysing data at the person and household level respectively. The household weight should also be used for the family level and income unit level and the person weight for all other levels.

Where estimates are derived, it is essential that they are calculated by adding the weights of persons or households, as appropriate, in each category, and not just by counting the number of records falling into each category. If each person's or household's 'weight' were to be ignored, then no account would be taken of a person's or household's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that:
  • person estimates conform to an independently estimated distribution of the population by dwelling type (private and non-private), age, sex, state/territory and part of state
  • household estimates conform to an independently estimated distribution of households by certain household characteristics (e.g. by number of adults and children), state/territory and part of state
rather than to the distributions within the sample itself.

Counting units and weights

The counting unit for level one is the household, for level two the family, for level three the income unit, for level four the person, for level five all conditions, for level six all restrictions, for level seven all specific activities, for level eight recipients of care, for level nine all broad activities and for level ten all assistance providers. There is a weight attached to each level in order to estimate the total population of the respective counting unit. The weight on levels one to three is the household weight and the weight on levels four to ten is the person weight.

What you count depends on the level from which you select the weight. A household level weight estimates the number of households with a particular characteristic. Likewise, the weight included in the family level estimates the number of families, and the weight included in the income unit level estimates the number of income units, with the selected characteristic. Only private dwellings are included at the household, family and income unit levels.
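The difference between counting records and adding weights can be sketched as follows. FINWTH is the household weight named above. The 'has_carer' item, the helper function and all values are invented for illustration.

```python
# Invented household-level records. FINWTH is the household weight named in
# the text; 'has_carer' and every value here are hypothetical.
households = [
    {"FINWTH": 850.0, "has_carer": True},
    {"FINWTH": 1200.0, "has_carer": False},
    {"FINWTH": 640.0, "has_carer": True},
]


def weighted_estimate(records, weight_field, predicate):
    """Sum the weights of the records that satisfy the predicate."""
    return sum(r[weight_field] for r in records if predicate(r))


# Number of sample records in the category (not a population estimate).
record_count = sum(1 for h in households if h["has_carer"])

# Estimated number of households in the category.
estimate = weighted_estimate(households, "FINWTH", lambda h: h["has_carer"])

print(record_count, estimate)
```

The same helper could be pointed at the family or income unit level, where the household weight also applies.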

A person weight stored on the person level, or below, provides an estimate of the number of persons with the selected characteristic. When the weights from levels five to ten are used, the population is restricted to persons who have a record on the particular level, and each person's weight is repeated for each instance of the counting unit. Replicate weights have also been included and these can be used to calculate the standard error. For more information, refer to the Standard Errors section below.
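The effect of a person weight being repeated on a sub-person level can be sketched like this. FINWTP is the person weight named above. The 'person_id' field and all values are hypothetical, and the de-duplication step is one possible approach rather than a documented procedure.

```python
# Invented conditions-level records. FINWTP is the person weight named in the
# text; 'person_id' and every value here are hypothetical.
condition_records = [
    {"person_id": 1, "FINWTP": 300.0, "condition": "Arthritis"},
    {"person_id": 1, "FINWTP": 300.0, "condition": "Asthma"},
    {"person_id": 2, "FINWTP": 450.0, "condition": "Asthma"},
]

# Summing over every record counts each instance of the counting unit,
# so this estimates the number of conditions.
estimated_conditions = sum(r["FINWTP"] for r in condition_records)

# Keeping one weight per person avoids repeating a person for each of
# their conditions, so this estimates the number of persons.
weight_by_person = {r["person_id"]: r["FINWTP"] for r in condition_records}
estimated_persons = sum(weight_by_person.values())

print(estimated_conditions, estimated_persons)
```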

Standard Errors

Each record on the household level and person level also contains 60 replicate weights and, by using these weights, it is possible to calculate standard errors for weighted estimates produced from the microdata. This method is known as the 60 group Jack-knife variance estimator.

Under the Jack-knife method of replicate weighting, weights were derived as follows:
  • 60 replicate groups were formed with each group formed to mirror the overall sample (where units from a collection district all belong to the same replicate group and a unit can belong to only one replicate group)
  • one replicate group was dropped from the file and then the remaining records were weighted in the same manner as for the full sample
  • records in that group that were dropped received a weight of zero.

This process was repeated for each replicate group (i.e. a total of 60 times). Ultimately each record had 60 replicate weights attached to it with one of these being the zero weight.
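The structure of the resulting replicate weights can be sketched as below. This is a deliberate simplification: the remaining records are simply scaled up by G/(G-1), whereas the process described above re-runs the full weighting for each replicate. The function name and the three-group example are invented.

```python
def make_replicate_weights(weights, groups, n_groups):
    """Return one list of n_groups replicate weights per record.

    Simplification (an assumption of this sketch): records outside the
    dropped group are scaled by n_groups / (n_groups - 1) instead of being
    re-weighted in the same manner as the full sample.
    """
    scale = n_groups / (n_groups - 1)
    return [
        # Zero weight when the record's own group is the one dropped.
        [0.0 if g == group else w * scale for g in range(1, n_groups + 1)]
        for w, group in zip(weights, groups)
    ]


# Three records and three replicate groups, to keep the output readable;
# the survey itself uses 60 groups.
weights = [100.0, 200.0, 150.0]
groups = [1, 2, 3]
replicates = make_replicate_weights(weights, groups, n_groups=3)

for row in replicates:
    print(row)
```

Each record ends up with exactly one zero replicate weight, in the position of its own replicate group.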

Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses, such as chi-square and logistic regression, to be conducted in a way that takes the sample design into account. Replicate estimates for any variable of interest can be calculated using the 60 sets of replicate weights, giving 60 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight), is then used to approximate the variance of the full sample estimate.

To obtain the standard error of a weighted estimate y, the same estimate is calculated using each of the 60 replicate weights. The variability between these replicate estimates (denoted y(g) for group number g) is used to measure the standard error of the original weighted estimate y using the formula:

$$SE(y) = \sqrt{\frac{59}{60} \sum_{g=1}^{60} \left( y(g) - y \right)^{2}}$$

where:

g = the replicate group number

y(g) = the weighted estimate, having applied the weights for replicate group g

y = the weighted estimate from the sample.
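A minimal sketch of this calculation follows, assuming the usual form of the estimator in which the squared deviations of the replicate estimates from y are summed and scaled by (G-1)/G, that is 59/60 for 60 groups. The function name and the replicate estimates are invented.

```python
import math


def jackknife_se(full_estimate, replicate_estimates):
    """Standard error of a weighted estimate from its replicate estimates.

    Assumes the delete-a-group form: sqrt((G - 1) / G * sum((y(g) - y)^2)),
    where G is the number of replicate groups (60 for this survey).
    """
    n_groups = len(replicate_estimates)
    factor = (n_groups - 1) / n_groups
    squared_deviations = sum(
        (y_g - full_estimate) ** 2 for y_g in replicate_estimates
    )
    return math.sqrt(factor * squared_deviations)


# Invented values: a full-sample estimate of 1000 and 60 replicate
# estimates that alternate between 990 and 1010.
y = 1000.0
y_reps = [990.0, 1010.0] * 30
se = jackknife_se(y, y_reps)
print(round(se, 2))
```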

The 60 group Jack-knife method can be applied not just to estimates of the population total, but also where the estimate y is a function of estimates of the population total, such as a proportion, difference or ratio. For more information on the 60 group Jack-knife method of SE estimation, see Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee), July 1999 (cat. no. 1352.0.55.029).
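For an estimate that is a function of totals, the same idea applies: the function is recomputed under each set of replicate weights. The sketch below does this for a proportion. It uses three invented replicate groups for brevity, hypothetical function names, and the assumed (G-1)/G scaling, which is 59/60 for 60 groups.

```python
import math


def weighted_proportion(flags, weights):
    """Weighted share of records whose flag is True."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


def jackknife_se_of_proportion(flags, full_weights, replicate_weight_sets):
    """Return (proportion, standard error).

    replicate_weight_sets holds one list of record weights per replicate
    group; the proportion is recomputed under each of them.
    """
    y = weighted_proportion(flags, full_weights)
    reps = [weighted_proportion(flags, ws) for ws in replicate_weight_sets]
    n_groups = len(reps)
    variance = (n_groups - 1) / n_groups * sum((y_g - y) ** 2 for y_g in reps)
    return y, math.sqrt(variance)


# Invented data: three records, each in its own replicate group.
flags = [True, False, True]
full_weights = [100.0, 200.0, 150.0]
replicate_weight_sets = [
    [0.0, 300.0, 225.0],   # group 1 dropped
    [150.0, 0.0, 225.0],   # group 2 dropped
    [150.0, 300.0, 0.0],   # group 3 dropped
]

proportion, se = jackknife_se_of_proportion(
    flags, full_weights, replicate_weight_sets
)
print(proportion, se)
```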

Use of the 60 group Jack-knife method for complex estimates, such as regression parameters from a statistical model, is not straightforward and may not be appropriate. The method as described does not apply to investigations where survey weights are not used, such as in unweighted statistical modelling.