4324.0.55.001 - Microdata: Australian Health Survey, National Health Survey, 2011-12 Quality Declaration 
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 11/05/2017   

USING THE BASIC CURF


ABOUT THE BASIC CURF

The NHS 2011–12 Basic CURF contains unit records relating to all of the survey respondents. The data are released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses of survey respondents on the CURF and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:
    • the level of detail of many data items has been reduced by grouping, ranging or top coding values
    • some unusual records have been changed to protect against identification
    • some data items that were collected have been excluded
    • income data has been perturbed.

The nature of the changes made, and the relatively small number of records involved, mean that the effect on data for analysis purposes is considered negligible. These changes also mean that estimates produced from the Basic CURF may differ from those published in Australian Health Survey: First Results, 2011-12 (cat. no. 4364.0.55.001), subsequent publications and/or the Expanded CURF.

Detailed information about the data collected, comments regarding data quality and other points to assist in using and interpreting the data are contained in Australian Health Survey: Users' Guide, 2011-13 (cat. no. 4363.0.55.001). It is recommended that relevant parts of the guide be read in conjunction with the use of the NHS 2011-12 Basic CURF.


COUNTS AND WEIGHTS
NUMBER OF RECORDS BY LEVEL, NHS 2011-12 BASIC CURF

LEVEL | RECORD COUNTS (UNWEIGHTED) | WEIGHTED COUNTS (if applicable)
Household level | 15 565 | 8 581 354
Person level (Selected persons) | 20 426 | 22 105 281
Alcohol Day | 31 844 | N/A
Alcohol Type | 34 747 | N/A
Actions (Condition Group) | 48 709 | N/A
Conditions level | 72 694 | N/A
Medication level | 49 172 | N/A
Biomedical level (Persons 5+) | 20 426 | 20 649 321


Weights and Hierarchical Files

Weight Variables

There are three weight variables on the file:

Household Weight (NHSFHHWT)
- Household level - Benchmarked
Person Weight (NHSFINWT)
- Selected Person level - Benchmarked to the total population.
Biomedical Weight (NHMSPERW)
- Biomedical level - Benchmarked to the total population aged 5 years and over. Note that this level also contains records for persons who did not participate in the biomedical component; their biomedical weight is set to 0 so they do not contribute to estimates. When using biomedical variables in conjunction with other variables on the biomedical level or with variables from other levels, the biomedical weight should be used.

There are no weights associated with the other levels. This is because a person can have more than one record on these levels. If, for example, NHSFINWT is merged onto the Conditions level, it will be attached to each condition record and will therefore be repeated for each person who has more than one condition. This should be considered when producing tables. See 'Copying information across levels' below for more information.

For more information about weights, see 'Reliability of Estimates' below.

Using Weights

The NHS is a sample survey. To produce estimates for the in-scope population, you must use weight fields in your calculations. The 'Biomedical Weight (Benchmarked weight)' must be used for all tables where a biomedical level data item is used, including where biomedical items are used with items from other levels. The weight applied to data at non-benchmarked levels determines what the resulting estimate represents, as shown in the examples below:

Level of Data Item | Estimates if use Household Weight | Estimates if use Person Weight
Household | Households with the specified characteristics. | Persons in households with the specified characteristics.
Person level (Selected persons) | Households containing one or more selected persons with the specified characteristics. | Persons with the specified characteristics.
Alcohol Day | Households containing one or more selected persons with one or more alcohol days with the specified characteristics. | Persons with one or more alcohol days with the specified characteristics.
Alcohol Type | Households containing one or more selected persons with one or more alcohol types with the specified characteristics. | Persons with one or more alcohol types with the specified characteristics.
Actions (Condition Group) | Households containing one or more selected persons with one or more actions with the specified characteristics. | Persons with one or more actions with the specified characteristics.
Conditions | Households containing one or more selected persons with one or more conditions with the specified characteristics. | Persons with one or more conditions with the specified characteristics.
Medication | Households containing one or more selected persons with one or more medications with the specified characteristics. | Persons with one or more medications with the specified characteristics.
Biomedical | Not applicable because not all households contain at least one biomedical participant. | Persons with the specified biomedical characteristics.*

*Note: Biomedical persons (Benchmarked weight) must be used to produce population estimates of persons 5 years and over with the specified biomedical characteristics, rather than Persons (Benchmarked weight). The Biomedical persons (Benchmarked weight) applies a weight of 0 to children under 5 years and biomedical non-participants, ensuring that they do not contribute to the population estimate.
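
As an illustration of applying the weights, the following is a minimal SAS sketch rather than ABS-supplied code. It assumes the libref NHS11B points to the Basic CURF SAS files, as in the merge examples later on this page, and that each weight variable sits on the level file described above.

```sas
/* Assumption: libref NHS11B points to the Basic CURF SAS files. */

/* Sum of the person weights: the estimated in-scope population. */
PROC MEANS DATA=NHS11B.NHS11BSP SUM MAXDEC=0;
    VAR NHSFINWT;
RUN;

/* Weighted person-level table: estimated number of persons by sex. */
PROC FREQ DATA=NHS11B.NHS11BSP;
    TABLES SEX;
    WEIGHT NHSFINWT;
RUN;

/* Sum of the biomedical weights. Records with a weight of 0
   (non-participants and children under 5) add nothing to the total. */
PROC MEANS DATA=NHS11B.NHS11BBI SUM MAXDEC=0;
    VAR NHMSPERW;
RUN;
```

If the weights behave as described, the two sums would be expected to match the weighted counts shown for the Person and Biomedical levels in the record count table above.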


IDENTIFIERS

Every record on each level of the file is uniquely identified.

The identifiers ABSLID, ABSPID, ABSBID, ABSTID, ABSGID, ABSCID, ABSMID and ABSUID appear on all levels of the file. Where the information for the identifier is not relevant for a level, it has a value of 0. See the Data Item List for details on which ID equates to which level.

Each household has a unique thirteen-digit random identifier, ABSLID*. This identifier appears on the household level and is repeated on every record, at every level, pertaining to that household. The combination of identifiers uniquely identifies a record at a particular level as shown below:

1. Household = ABSLID*
2. Person = ABSLID* + ABSPID
3. Alcohol Day = ABSLID* + ABSPID + ABSBID
4. Alcohol Type = ABSLID* + ABSPID + ABSBID + ABSTID
5. Actions = ABSLID* + ABSPID + ABSGID
6. Conditions = ABSLID* + ABSPID + ABSGID + ABSCID
7. Medication = ABSLID* + ABSPID + ABSMID
8. Biomedical = ABSLID* + ABSPID + ABSUID

*Note the SAS name for the Household record identifier is ABSHHID on the Expanded CURF.

ABSLID assists with linking together people of the same household and also with household characteristics such as geography (located on the household level). The combination of ABSLID, ABSPID, ABSGID and ABSCID identifies each individual condition record a person has. When merging data with a level above, only those identifiers relevant to the level above are required. When merging with the level below (for example, the conditions level with the person level), the data on the person level will duplicate for each condition. See 'Copying information across levels' below for more information.
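
One way to confirm that a combination of identifiers is unique at a given level is sketched below for the Conditions level. This is an illustrative check, not part of the CURF documentation; CN_UNIQUE and CN_DUPS are hypothetical dataset names, and the libref NHS11B is assumed to point to the Basic CURF SAS files.

```sas
/* Sort the Conditions level by its full key. Any record whose key
   repeats an earlier record is written to CN_DUPS rather than CN_UNIQUE. */
PROC SORT DATA=NHS11B.NHS11BCN OUT=CN_UNIQUE NODUPKEY DUPOUT=CN_DUPS;
    BY ABSLID ABSPID ABSGID ABSCID;
RUN;

/* If the key is unique, CN_DUPS has no observations. */
PROC CONTENTS DATA=CN_DUPS;
RUN;
```

The same pattern can be applied to any other level by swapping in that level's file and the identifier combination listed above.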

Copying information across levels


The following SAS code is an example of copying information from a lower level to a level above:
    /* Create a sorted temporary dataset based on the Conditions level */
    PROC SORT DATA=NHS11B.NHS11BCN OUT=SORTED_BCN;
        BY ABSLID ABSPID ABSGID;
    RUN;

    /* Count the current, diagnosed, long-term conditions within each
       unique combination of ABSLID, ABSPID and ABSGID */
    DATA TOT_LTC (KEEP=ABSLID ABSPID ABSGID LONGTERM);
        SET SORTED_BCN;
        BY ABSLID ABSPID ABSGID;
        RETAIN LONGTERM;

        IF FIRST.ABSGID THEN LONGTERM=0; /* Reset the count at the start of each condition group */

        IF CONDSTAT=1 THEN LONGTERM=LONGTERM+1; /* Add one for each diagnosed, long-term condition */

        IF LAST.ABSGID THEN OUTPUT; /* Output one record, holding the total, for each combination */
    RUN;

    /* Create a sorted temporary dataset based on the Actions level */
    PROC SORT DATA=NHS11B.NHS11BAC OUT=SORTED_BAC;
        BY ABSLID ABSPID ABSGID;
    RUN;

    /* Copy the count up from the Conditions level to the Actions level */
    DATA MRGFILES;
        MERGE TOT_LTC SORTED_BAC;
        BY ABSLID ABSPID ABSGID;
    RUN;

    /* Sample count of the data copied up to the Actions level */
    PROC FREQ DATA=MRGFILES;
        TABLES LONGTERM;
    RUN;

The new variable LONGTERM presents a count of the number of diagnosed, long-term conditions belonging to each actions record. For example, a person with two current, diagnosed, long-term cardiovascular conditions (on the Conditions level) would have a value of '2' for LONGTERM on the cardiovascular actions record on the Actions level.

The following SAS code is an example of copying information from a higher level to a level below:
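    /* Create a sorted temporary dataset of the Person level items to copy down */
    PROC SORT DATA=NHS11B.NHS11BSP OUT=SORTED_BSP (KEEP=ABSLID ABSPID AGEB SEX);
        BY ABSLID ABSPID;
    RUN;

    /* Create a sorted temporary dataset based on the Actions level */
    PROC SORT DATA=NHS11B.NHS11BAC OUT=SORTED_BAC;
        BY ABSLID ABSPID;
    RUN;

    /* Attach AGEB and SEX to every Actions record for the person */
    DATA MRGFILES;
        MERGE SORTED_BAC (IN=INAC) SORTED_BSP;
        BY ABSLID ABSPID;
        IF INAC; /* Keep Actions records only, dropping persons with no Actions record */
    RUN;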

This merge matches one Person record to many Actions records. The data items copied from the person level ('AGEB' and 'SEX' in the example) will be repeated for the counting unit of the level they have been added to, Actions in this case. Each Actions record will therefore receive the same AGEB and SEX of the Person they belong to.
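
Because a person weight merged onto a lower level is repeated on every record for that person, a weighted table run directly on that level would count some persons more than once. The sketch below shows one possible way to avoid this when estimating the number of persons with at least one diagnosed, long-term condition: reduce the Conditions level to one record per person before applying NHSFINWT. It is illustrative only; HAS_LTC, PERSON_FLAG and ANY_LTC are hypothetical names, and CONDSTAT=1 is interpreted as in the example above.

```sas
/* One record per person with at least one condition where CONDSTAT=1 */
PROC SORT DATA=NHS11B.NHS11BCN (WHERE=(CONDSTAT=1))
          OUT=HAS_LTC (KEEP=ABSLID ABSPID) NODUPKEY;
    BY ABSLID ABSPID;
RUN;

/* Person level records with the person weight */
PROC SORT DATA=NHS11B.NHS11BSP OUT=SORTED_BSP (KEEP=ABSLID ABSPID NHSFINWT);
    BY ABSLID ABSPID;
RUN;

/* Flag each person once, so the weight is counted once per person */
DATA PERSON_FLAG;
    MERGE SORTED_BSP (IN=INSP) HAS_LTC (IN=INLTC);
    BY ABSLID ABSPID;
    IF INSP;
    ANY_LTC=INLTC; /* 1 if the person has a qualifying condition, otherwise 0 */
RUN;

/* Estimated number of persons with and without a qualifying condition */
PROC FREQ DATA=PERSON_FLAG;
    TABLES ANY_LTC;
    WEIGHT NHSFINWT;
RUN;
```

The result corresponds to the 'Persons with one or more conditions with the specified characteristics' cell in the weights table above.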

For more information regarding merges across levels (including sample SPSS and Stata code), see SAMPLE CODE AND USING CURFS.


MULTI-RESPONSE ITEMS


A number of questions in the survey allowed respondents to provide one or more responses. Each response category for these multi-response data items is treated as a separate data item. On the CURF, these data items share the same identifier (SAS name) prefix, but are each separately suffixed with a letter - A for the first response, B for the second response, C for the third response and so on.

For example, the multi-response data item 'Days in last week consumed alcohol' has seven response categories (excluding 'Not applicable' and 'No alcohol consumed in last week'). There are seven data items named ALCDYWA, ALCDYWB, ALCDYWC, ..., ALCDYWG. Each data item in the series will have either a positive response code or a null response code, with the exception of the first item in the series, ALCDYWA. ALCDYWA has four potential response codes: the positive response code 1 - 'Monday', the code 0 - null response, as well as the two additional response codes, code 8 - 'No alcohol consumed in last week' and code 9 - 'Not applicable'. The remaining items, ALCDYWB to ALCDYWG, have just two response codes each: either the null response or a valid response corresponding to the day of the week. The data item list identifies all multi-response items and lists the corresponding codes and response categories.

Note that the sum of individual multi-response categories will be greater than or equal to the population applicable to the particular data item as respondents are able to select more than one response.
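
The multi-response structure described above can be collapsed into a single derived item, as in the following sketch. It rests on several assumptions that should be checked against the data item list: that the ALCDYWA to ALCDYWG items are numeric and held on the Person level, that the null response is coded 0 for every item in the series, and that any positive code on ALCDYWB to ALCDYWG denotes a day on which alcohol was consumed. ALC_DAYS and NDAYS are hypothetical names.

```sas
DATA ALC_DAYS;
    SET NHS11B.NHS11BSP (KEEP=ABSLID ABSPID NHSFINWT
                              ALCDYWA ALCDYWB ALCDYWC ALCDYWD
                              ALCDYWE ALCDYWF ALCDYWG);
    ARRAY DAYS{7} ALCDYWA ALCDYWB ALCDYWC ALCDYWD ALCDYWE ALCDYWF ALCDYWG;

    IF ALCDYWA=9 THEN NDAYS=.;      /* Not applicable */
    ELSE IF ALCDYWA=8 THEN NDAYS=0; /* No alcohol consumed in last week */
    ELSE DO;
        NDAYS=0;
        DO I=1 TO 7;
            IF DAYS{I}>0 THEN NDAYS=NDAYS+1; /* Positive response for that day */
        END;
    END;
    DROP I;
RUN;

/* Estimated number of persons by number of days alcohol was consumed */
PROC FREQ DATA=ALC_DAYS;
    TABLES NDAYS;
    WEIGHT NHSFINWT;
RUN;
```

Tabulating the derived NDAYS item counts each person once, whereas summing weighted counts across the seven separate items counts a person once for every day selected.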


RELIABILITY OF ESTIMATES


As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for, by use of appropriate weights, the results will be biased. For details on the NHS weighting process, see Weighting, Benchmarks and Estimation procedures in Australian Health Survey: Users' Guide, 2011-13 (cat. no. 4363.0.55.001).

Each person record has a main weight (NHSFINWT). This weight indicates how many population units are represented by the sample unit. When producing estimates of sub-populations from the CURF, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. If each person's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.

Each person record on the CURF contains 60 replicate weights in addition to the main weight. Replicate weights can be used to calculate measures of sampling error. For details on sampling error calculations and replicate weights, see Technical Note.
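
The sketch below shows how the 60 replicate weights might be used to approximate a standard error. It assumes the delete-a-group jackknife form SE(y) = sqrt((59/60) x sum over g of (y(g) - y)^2), where y is the estimate under the main weight and y(g) the estimate under the g-th replicate weight; this form should be confirmed against the Technical Note. The names REPWT1 to REPWT60 are hypothetical placeholders for the replicate weight variables, and the sub-population condition and its code value are purely illustrative.

```sas
/* Placeholder sub-population condition; replace with the group of interest. */
%LET SUBPOP = SEX = 1;

/* Weighted count under the main weight (EST) and each replicate weight (REP1-REP60).
   REPWT1-REPWT60 are hypothetical names; use the names in the data item list. */
PROC MEANS DATA=NHS11B.NHS11BSP NOPRINT;
    WHERE &SUBPOP;
    VAR NHSFINWT REPWT1-REPWT60;
    OUTPUT OUT=TOTALS SUM=EST REP1-REP60;
RUN;

/* Jackknife standard error and relative standard error (assumed formula) */
DATA SE_EST;
    SET TOTALS;
    ARRAY R{60} REP1-REP60;
    SSQ=0;
    DO G=1 TO 60;
        SSQ=SSQ+(R{G}-EST)**2;
    END;
    SE=SQRT((59/60)*SSQ);
    RSE=100*SE/EST;
    KEEP EST SE RSE;
RUN;

PROC PRINT DATA=SE_EST;
RUN;
```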


CODING ISSUES

Issues have been identified with the coding of the 2011-12 data for the conditions 'Back pain/problem, disc disorder', 'Diseases of the digestive system', 'Symptoms, signs and conditions not elsewhere classified', 'Rheumatism' and 'Other diseases of the musculoskeletal system and connective tissue'. Analysis indicates that at the national level:
  • 'Back pain/problem, disc disorder' was under-reported by approximately 545,000 people
  • 'Diseases of the digestive system' was over-reported by approximately 230,000 people
  • 'Symptoms, signs and conditions not elsewhere classified' was over-reported by approximately 255,000 people
  • 'Rheumatism' was over-reported as a 'Current and long term' condition by approximately 145,000 people
  • 'Other diseases of the musculoskeletal system and connective tissue' was over-reported as a 'Current and long term' condition by approximately 275,000 people.

Therefore, 2011-12 data for these conditions are not comparable to other years. However, 2014-15 data have been correctly coded.


BASIC CURF FILES

ASCII text format files

These files contain the raw confidentialised survey data in hierarchical, comma-delimited ASCII text format.

NHS11B.csv contains all levels
NHS11BHH.csv contains Household level data
NHS11BSP.csv contains Person level data
NHS11BA3.csv contains Alcohol day level data
NHS11B14.csv contains Alcohol type level data
NHS11BAC.csv contains Actions level data
NHS11BCN.csv contains Conditions level data
NHS11BMD.csv contains Medications level data
NHS11BBI.csv contains Biomedical level data

SAS files

These files contain the data for the CURF in SAS format.

NHS11BHH.sas7bdat contains Household level data
NHS11BSP.sas7bdat contains Person level data
NHS11BA3.sas7bdat contains Alcohol day level data
NHS11B14.sas7bdat contains Alcohol type level data
NHS11BAC.sas7bdat contains Actions level data
NHS11BCN.sas7bdat contains Conditions level data
NHS11BMD.sas7bdat contains Medications level data
NHS11BBI.sas7bdat contains Biomedical level data

SPSS files

These files contain the data for the CURF in SPSS format.

NHS11BHH.sav contains Household level data
NHS11BSP.sav contains Person level data
NHS11BA3.sav contains Alcohol day level data
NHS11B14.sav contains Alcohol type level data
NHS11BAC.sav contains Actions level data
NHS11BCN.sav contains Conditions level data
NHS11BMD.sav contains Medications level data
NHS11BBI.sav contains Biomedical level data

Stata files

These files contain the data for the CURF in Stata format.

NHS11BHH.dta contains Household level data
NHS11BSP.dta contains Person level data
NHS11BA3.dta contains Alcohol day level data
NHS11B14.dta contains Alcohol type level data
NHS11BAC.dta contains Actions level data
NHS11BCN.dta contains Conditions level data
NHS11BMD.dta contains Medications level data
NHS11BBI.dta contains Biomedical level data

Information files

FORMATS.sas7bcat is a SAS library containing formats
NHS11B.sas contains a SAS program to load NHS11B.csv and the SAS formats into SAS for Windows
IMPORTANT INFORMATION.pdf describes the file contents of the CURF and information on using the CURF
COPYRITE1.bat describes Copyright obligations for CURF users
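
The SAS examples on this page refer to a libref named NHS11B. A minimal set-up sketch is shown below for users working directly from the SAS files. It assumes the SAS data files and FORMATS.sas7bcat have been copied to a single folder; the folder path is hypothetical and should be replaced with the actual location.

```sas
/* Hypothetical folder; replace with the location of the CURF files on your system. */
LIBNAME NHS11B 'C:\CURF\NHS11B';

/* Make the formats stored in FORMATS.sas7bcat (catalog NHS11B.FORMATS) available. */
OPTIONS FMTSEARCH=(NHS11B.FORMATS);

/* Quick check that the Person level file can be read */
PROC CONTENTS DATA=NHS11B.NHS11BSP;
RUN;
```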

Frequency files

The following plain text format files contain data item code values and category labels at each level. Weighted frequencies are included for the household, person and biomedical levels.

FREQUENCIES_NHS11BHH.txt contains Household level data
FREQUENCIES_NHS11BSP.txt contains Person level data
FREQUENCIES_NHS11BA3.txt contains Alcohol day level data
FREQUENCIES_NHS11B14.txt contains Alcohol type level data
FREQUENCIES_NHS11BAC.txt contains Actions level data
FREQUENCIES_NHS11BCN.txt contains Conditions level data
FREQUENCIES_NHS11BMD.txt contains Medications level data
FREQUENCIES_NHS11BBI.txt contains Biomedical level data