ABOUT THE CURFs
The 2012 CURFs contain unit records relating to almost all of the survey respondents. The data are released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses of survey respondents on the CURFs and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:
Steps to confidentialise the datasets made available on the CURFs are undertaken in such a way as to ensure the integrity of the datasets and optimise the content, while maintaining the confidentiality of respondents. Intending purchasers should ensure that the data they require, at the level of detail they require, are available on the CURFs; data obtained in the survey but not contained on the CURFs may be available in tabulated form on request. The Data Item List document referred to in the Summary tab lists the data items and categories on the HEC 2012 Basic and Expanded CURFs, and is available as a datacube from the Downloads tab.
CONTENTS OF THE CURFs
This section provides details of the files included on each of the Basic and Expanded CURFs.
HEC BASIC CURF FILE CONTENTS
The HEC Basic CURF distributed on CD–ROM or via the RADL contains the following files:
These files contain the raw confidentialised survey data in hierarchical comma delimited ASCII text format.
HEC12B.CSV contains all levels data
HEC12BH.CSV contains the Household level data
HEC12BI.CSV contains the Income unit level data
HEC12BP.CSV contains the Person level data
HEC12BL.CSV contains the Loans level data
HEC12BLN.CSV contains the Longitudinal level data
These files contain the data for the CURF in SAS for Windows format.
HEC12BH.sas7bdat contains the Household level data
HEC12BI.sas7bdat contains the Income unit level data
HEC12BP.sas7bdat contains the Person level data
HEC12BL.sas7bdat contains the Loans level data
HEC12BLN.sas7bdat contains the Longitudinal level data
These files contain the data for the CURF in SPSS for Windows format.
HEC12BH.SAV contains the Household level data
HEC12BI.SAV contains the Income unit level data
HEC12BP.SAV contains the Person level data
HEC12BL.SAV contains the Loans level data
HEC12BLN.SAV contains the Longitudinal level data
These files contain the data for the CURF in STATA format.
HEC12BH.DTA contains the Household level data
HEC12BI.DTA contains the Income unit level data
HEC12BP.DTA contains the Person level data
HEC12BL.DTA contains the Loans level data
HEC12BLN.DTA contains the Longitudinal level data
FORMATS.sas7bcat is a SAS library containing formats
SIH12B.SAS contains a SAS program to run the SAS formats
IMPORTANT INFORMATION.PDF describes the file contents of the CURF and information on using the CURF
COPYRITE1.BAT describes Copyright obligations for CURF users
RESPONSIBLE ACCESS TO CURFs.PDF is an acrobat file explaining CURF users' role and obligations when using confidentialised data
The following plain text format files contain documentation about data item code values and category labels at each level, with weighted and unweighted frequencies for each value.
FREQUENCIES_HEC12BH.TXT contains documentation of the Household level data
FREQUENCIES_HEC12BI.TXT contains documentation of the Income Unit level data
FREQUENCIES_HEC12BP.TXT contains documentation of the Person level data
FREQUENCIES_HEC12BL.TXT contains documentation of the Loans level data
FREQUENCIES_HEC12BLN.TXT contains documentation of the Longitudinal level data
HEC EXPANDED CURF FILE CONTENTS
The HEC Expanded CURF can only be accessed via the RADL or the ABSDL, and contains the following files:
HEC12EH.sas7bdat contains the file of Household level data in SAS for Windows format
HEC12EI.sas7bdat contains the file of Income unit level data in SAS for Windows format
HEC12EP.sas7bdat contains the file of Person level data in SAS for Windows format
HEC12EL.sas7bdat contains the file of Loans level data in SAS for Windows format
HEC12ELN.sas7bdat contains the file of Longitudinal level data in SAS for Windows format
HEC12EH.SAV contains the file of Household level data in SPSS format
HEC12EI.SAV contains the file of Income unit level data in SPSS format
HEC12EP.SAV contains the file of Person level data in SPSS format
HEC12EL.SAV contains the file of Loans level data in SPSS format
HEC12ELN.SAV contains the file of Longitudinal level data in SPSS format
HEC12EH.DTA contains the file of Household level data in STATA format
HEC12EI.DTA contains the file of Income unit level data in STATA format
HEC12EP.DTA contains the file of Person level data in STATA format
HEC12EL.DTA contains the file of Loans level data in STATA format
HEC12ELN.DTA contains the file of Longitudinal level data in STATA format
FORMATS.sas7bcat is a SAS library containing formats.
The following plain text format files contain documentation about data item code values and category labels at each level, with weighted and unweighted frequencies for each value.
FREQUENCIES_HEC12EH.TXT contains documentation of the Household level data
FREQUENCIES_HEC12EI.TXT contains documentation of the Income Unit level data
FREQUENCIES_HEC12EP.TXT contains documentation of the Person level data
FREQUENCIES_HEC12EL.TXT contains documentation of the Loans level data
FREQUENCIES_HEC12ELN.TXT contains documentation of the Longitudinal level data
NOTES ON SPECIFIC DATA ITEMS
The data items included on the CURFs, and the categories within the data items, differ between the Basic and Expanded CURFs. The Expanded CURFs contain more variables than the Basic CURFs as well as more detailed data for selected variables. The data item list also shows the differences between the 2012 Basic and Expanded CURFs. Many of the differences result from the difference in the maximum household size permitted on the Basic and Expanded CURFs.
A complete list of the data items available on each record level for the CURFs, including relevant population and classification details, is available from the Downloads tab.
Many of the data items included on the CURFs are self-explanatory. The Glossary provides links to terms and definitions for most of the survey's data items. However, some items require further explanation.
There are several identifiers on records at each level of the file.
Each household has a unique random identifier. This identifier appears on the household level (ABSHID), and is repeated on the income unit, person, expenditure and loans level records relating to that household.
Each family within the household is numbered sequentially. Non family members, single person households and persons in group households have a sequential "family number" commencing at 50. Family number (ABSFID) appears on the income unit level and the person level. The combination of household and family number uniquely identifies a family.
A family has one or more income units and each income unit within the family is numbered sequentially. Income unit number (ABSIID) appears on the income unit level and the person level. The combination of household, family and income unit number uniquely identifies an income unit.
An income unit has one or more persons and each person within the income unit is numbered sequentially. Person number (ABSPID) appears on the person level. The combination of household, family, income unit and person number uniquely identifies a person.
A household may have one or more loans and each loan within the household is numbered sequentially. Loan number (ABSLID) appears on the loans level. The combination of a household and loan number uniquely identifies a loan.
A household may have one or more longitudinal records and each instance within the household is numbered according to the wave number. Longitudinal number (ABSWID) appears on the longitudinal level. The combination of a household and Longitudinal number uniquely identifies a longitudinal instance.
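The identifier rules above can be checked with a few lines of code. The following is a minimal sketch in Python, assuming the person level file has already been read into a list of records; the identifier names (ABSHID, ABSFID, ABSIID, ABSPID) are those described above, while the record values and helper function names are invented for illustration.

```python
from collections import Counter

# Toy person level records. Identifier names follow the text above;
# the values are invented and do not come from the CURF.
persons = [
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "ABSPID": 1},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "ABSPID": 2},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 2, "ABSPID": 1},
    {"ABSHID": "H002", "ABSFID": 50, "ABSIID": 1, "ABSPID": 1},
]


def person_key(rec):
    """Household, family, income unit and person number together identify a person."""
    return (rec["ABSHID"], rec["ABSFID"], rec["ABSIID"], rec["ABSPID"])


def income_unit_key(rec):
    """Household, family and income unit number together identify an income unit."""
    return (rec["ABSHID"], rec["ABSFID"], rec["ABSIID"])


def duplicated_keys(records, key_func):
    """Return every key that occurs on more than one record."""
    counts = Counter(key_func(r) for r in records)
    return [key for key, n in counts.items() if n > 1]


# An empty list means the composite person key is unique, as expected.
person_dupes = duplicated_keys(persons, person_key)

# Number of person records attached to each income unit.
persons_per_income_unit = Counter(income_unit_key(r) for r in persons)
```

The same pattern, with ABSHID and ABSLID, or ABSHID and ABSWID, as the key, could be used to check the loans and longitudinal levels.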
To enable CURF users greater flexibility in their analyses, the ABS has included Climate Zones, Socio-economic Indexes for Area (SEIFA) and several sub-state geography items on the Basic and Expanded 2012 CURFs. Conditions are placed on the use of these items. Tables showing multiple data items, cross-tabulated by more than one sub-state geography at a time, are not permitted due to the detailed information about small geographic regions that could be presented. However, simple cross-tabulations of population counts by sub-state geographic data items may be useful for clients in order to determine which geography item to include in their primary analysis, and such output is permitted.
The climate zone classification used for HECS is based on the eight broad climatic zones defined by the Australian Building Codes Board (ABCB). Each zone is based on humidity, temperature and rainfall characteristics. For more information please see paragraphs 18 to 20 of the Explanatory Notes, Household Energy Consumption Survey, Australia: Summary of Results, 2012 (cat. no. 4670.0).
The person level records contain detailed information on income by source. The income unit and household level records contain information at a broader level. If detailed information is required for income analyses at the income unit or household level, this can be calculated by aggregating the person level information for each income unit or household. Income is recorded on both a 'current' and a 'previous financial year' basis. For more information about current and previous financial year income, see Part 1.2 'Current, annual and weekly income' in Household Energy Consumption Survey, User Guide, Australia, 2012 (cat. no. 4671.0).
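The aggregation described above might be sketched as follows in Python. The identifier names are those used on the CURFs; the income item names INC_WAGES and INC_PENSION are hypothetical placeholders (the actual item names are in the data item list), and all values are invented.

```python
from collections import defaultdict

# Toy person level records. INC_WAGES and INC_PENSION are hypothetical
# stand-ins for person level income-by-source items.
persons = [
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "INC_WAGES": 900.0, "INC_PENSION": 0.0},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "INC_WAGES": 400.0, "INC_PENSION": 0.0},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 2, "INC_WAGES": 0.0, "INC_PENSION": 350.0},
    {"ABSHID": "H002", "ABSFID": 50, "ABSIID": 1, "INC_WAGES": 1200.0, "INC_PENSION": 0.0},
]


def aggregate(records, key_fields, value_fields):
    """Sum each value field over all records sharing the same key fields."""
    totals = defaultdict(lambda: dict.fromkeys(value_fields, 0.0))
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        for field in value_fields:
            totals[key][field] += rec[field]
    return dict(totals)


sources = ["INC_WAGES", "INC_PENSION"]

# Income by source for each income unit (household + family + income unit number).
by_income_unit = aggregate(persons, ["ABSHID", "ABSFID", "ABSIID"], sources)

# Income by source for each household.
by_household = aggregate(persons, ["ABSHID"], sources)
```

The resulting totals can then be matched to the income unit or household level records using the same identifiers.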
'Total current weekly income from all sources'
The publications relating to the 2012 HECS use this measure of income. It is consistent with the measure of income used in the 2011-12 SIH.
The component items of "Total current weekly income from all sources" are:
Previous financial year exclusion flag
The previous financial year exclusion flag at the person level (FINSCOPE) has a value of 1 for females whose family situation changed since the last financial year at time of interview (by moving in with a new partner, separating from a partner or becoming widowed) and for persons who arrived in Australia during 2012. At the income unit level a value of 1 in the previous financial year exclusion flag (FINSCOPU) indicates income units where the reference person or spouse has FINSCOPE=1. At the household level the previous financial year exclusion flag (FINSCOPH) indicates households where the reference person or spouse of one of the income units in the household has FINSCOPE=1. Users wishing to analyse previous financial year income data may wish to exclude such persons from their analysis (by limiting their analysis to records where FINSCOPE=2).
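As a minimal sketch of the restriction described above, the following Python fragment keeps only person records with FINSCOPE=2. The flag name and its values are as described in the text; the records themselves are invented.

```python
# Toy person level records; only the fields needed for the example are shown.
persons = [
    {"ABSHID": "H001", "ABSPID": 1, "FINSCOPE": 2},
    {"ABSHID": "H001", "ABSPID": 2, "FINSCOPE": 1},  # flagged for exclusion
    {"ABSHID": "H002", "ABSPID": 1, "FINSCOPE": 2},
]

# Keep only records not flagged for exclusion from previous financial
# year income analysis (FINSCOPE=2, as described in the text).
in_scope = [p for p in persons if p["FINSCOPE"] == 2]
```

The same filter could be applied with FINSCOPU at the income unit level or FINSCOPH at the household level.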
Assets and liabilities
The survey collected information on a comprehensive range of household assets and liabilities to enable analysis of net worth and its components across households. Similar data was collected in the Survey of Income and Housing.
Household energy expenditure and consumption
Weekly energy expenditure is included for all households. Households that were unable or refused to supply energy expenditure information had their expenditure imputed. Please see Data processing methods, Survey design and operation, Household Energy Consumption Survey, User Guide, Australia, 2012 (cat. no. 4671.0) for more information. Detailed expenditure and consumption data on electricity, mains gas and bottle gas are only available for households that had their billing details on hand during the interview.
Weekly housing costs used in the 2012 HECS are consistent with those used in the 2011-12 SIH. For further information refer to the section 'Using the CURF' in Microdata: Income and Housing, Australia, 2011–12 (cat. no. 6541.0.30.001).
The CURFs include estimates of imputed rent for owner-occupied dwellings. The imputation has also been applied to other housing tenures in order to value the in-kind benefit conferred to households paying subsidised rent or households occupying their dwelling rent free. Including imputed rent as part of household income and expenditure conceptually treats owner-occupiers as if they were renting their home from themselves, thus simultaneously incurring rental expenditure and earning rental income. Inclusion of imputed rent estimates in income measures is in accord with international standards for household income statistics, and provides a broader picture of the economic well-being of owner-occupier households and their social and economic circumstances relative to other households.
The imputed rent estimates have been included on the CURFs. Two household level variables are included, 'Weekly gross imputed rent' and 'Weekly HH income from net imputed rent'. Gross imputed rent is the market value of the rental equivalent, and has been estimated using hedonic regression. Net imputed rent for owner occupiers has been derived by subtracting the housing costs normally paid by landlords (i.e. rates, mortgage interest, insurance, repairs and maintenance) from gross imputed rent. Income totals incorporating the imputed rent estimates have not been included. Users wishing to analyse the effect of imputed rent on income should add net imputed rent to household income. When analysing household expenditure, gross imputed rent should be added and any housing costs normally paid by a landlord should be deducted. For further information refer to Part 1.12 'Imputed rent estimates' in Survey of Income and Housing, User Guide, Australia, 2011–12 (cat. no. 6553.0).
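The adjustments described above amount to simple arithmetic on household level values. The sketch below, in Python, illustrates them; the function and argument names are hypothetical and the dollar amounts are invented.

```python
def net_imputed_rent(gross_imputed_rent, landlord_type_costs):
    """Net imputed rent: gross imputed rent less the housing costs normally
    paid by landlords (rates, mortgage interest, insurance, repairs and
    maintenance)."""
    return gross_imputed_rent - landlord_type_costs


def income_with_imputed_rent(household_income, net_rent):
    """Household income including imputed rent: add net imputed rent."""
    return household_income + net_rent


def expenditure_with_imputed_rent(household_expenditure, gross_imputed_rent,
                                  landlord_type_costs):
    """Household expenditure including imputed rent: add gross imputed rent
    and deduct housing costs normally paid by a landlord."""
    return household_expenditure + gross_imputed_rent - landlord_type_costs


# Invented weekly amounts for one owner-occupier household.
gross = 400.0
landlord_costs = 150.0

net = net_imputed_rent(gross, landlord_costs)
income = income_with_imputed_rent(1500.0, net)
expenditure = expenditure_with_imputed_rent(1200.0, gross, landlord_costs)
```

On the CURFs the net imputed rent item is already derived, so in practice only the two addition steps would normally be needed.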
Payments to non household members
The financial resources available to certain persons can be affected by regular payments that they may make to provide support for persons outside the household. Information on payments for child support, alimony to former spouse, and payments to family members not in the household have been included on the CURFs.
Imputation flags exist for each module in the questionnaire, rather than for specific data items. A value of 1 (partially imputed) indicates that at least one question in that module was imputed. Referring to the contents of the questionnaire module can provide an indication of whether particular data items may have included imputed data. The number of flags with a value of 1 for a particular record provides an indication of the extent of imputation in that record. A value of 2 (fully imputed) indicates that a person record has been fully imputed. In households where one or more people did not respond, person records were imputed if the non-responding person was not a 'significant person'.
Multiple response data items
The energy and child care topics contain a number of multiple response data items on the CURF. In these instances respondents were able to select one or more response categories, and the output data items are multi-response in nature. This section describes these items and provides some information on how to use them.
On the Basic and Expanded CURFs, the data items are:
RELIABILITY OF ESTIMATES
Use of weights
As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for, by use of appropriate weights, the results will be biased.
Each household, income unit, person and loan record contains a weight. This weight indicates how many population units are represented by the sample unit.
Weights for each member of the household are the same as the weight for the household itself. Information for sampled households can be multiplied by the weights to produce estimates for the whole population. For further information on the weighting process, refer to Part 2.6 'Benchmarks and weighting of survey results' in Household Energy Consumption Survey, User Guide, Australia, 2012 (cat. no. 4671.0).
If estimates of population sub groups are to be derived from the CURF, it is essential that they are calculated by adding the weights of persons/households in each category and not just by counting the number in each category. If each person's/household's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's/household's chance of selection or of different response rates across population groups, with the result that the estimates produced could be seriously biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.
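The difference between counting records and adding weights can be illustrated with a small example. This is a sketch in Python only; the field names WEIGHT and HAS_SOLAR are hypothetical, and the records and weights are invented.

```python
# Toy household level records. WEIGHT stands in for the household weight
# and HAS_SOLAR for any category of interest; both names are hypothetical.
households = [
    {"ABSHID": "H001", "WEIGHT": 1000.0, "HAS_SOLAR": True},
    {"ABSHID": "H002", "WEIGHT": 1000.0, "HAS_SOLAR": False},
    {"ABSHID": "H003", "WEIGHT": 200.0, "HAS_SOLAR": True},
    {"ABSHID": "H004", "WEIGHT": 200.0, "HAS_SOLAR": True},
]


def weighted_count(records, predicate, weight_field="WEIGHT"):
    """Population estimate: the sum of weights of records in the category."""
    return sum(r[weight_field] for r in records if predicate(r))


def has_solar(rec):
    return rec["HAS_SOLAR"]


# Unweighted: simply counting sample records (not a population estimate).
unweighted_share = sum(1 for r in households if has_solar(r)) / len(households)

# Weighted: adding the weights of the records in each category.
estimated_solar = weighted_count(households, has_solar)
estimated_total = weighted_count(households, lambda r: True)
weighted_share = estimated_solar / estimated_total
```

In this invented sample, three of the four records fall in the category (75%), but the weighted estimate is 1,400 of 2,400 households (about 58%), because the records carry different weights.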
It should be noted that as a result of some of the changes made to protect confidentiality on the CURF, estimates of benchmarked items produced from the CURF may not equal the benchmarked values. For further information refer to the 'Reconciliation of the CURF data' document in this product.
Relative sampling error
Two types of error are possible in an estimate based on a sample survey: non sampling error and sampling error. For further information on non-sampling and sampling error refer to 'Reliability of Estimates' in Household Energy Consumption Survey, User Guide, Australia, 2012 (cat. no. 4671.0).
Each record on the CURF contains 60 'replicate weights' in addition to the 'main weight'. These replicate weights can be used to derive estimates of standard error.
The basic idea behind the replication approach is to select subsamples repeatedly (60 times) from the whole sample. For each of these subsamples the statistic of interest is calculated. The variance of the full sample statistic is then estimated using the variability among the replicate statistics calculated from the subsamples. As well as enabling variances of estimates to be calculated relatively simply, replicate weights also enable unit record analyses such as chi–square tests and logistic regression to be conducted which take into account the complex sample design.
There are various ways of creating replicate subsamples from the full sample. The replicate weights produced for the 2012 HECS have been created using a group jack knife method of replication. The formulae for calculating the SE and RSE of an estimate using this method are:
$$\mathrm{SE}(y) = \sqrt{\frac{59}{60} \sum_{g=1}^{60} \left(y_{(g)} - y\right)^{2}}$$
$$\mathrm{RSE}(y) = \frac{\mathrm{SE}(y)}{y} \times 100\%$$
where:
$g = 1, \ldots, 60$ indexes the replicate groups
$y_{(g)}$ = weighted estimate, having applied the weights for replicate group $g$
$y$ = weighted estimate from the full sample.
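The calculation can be sketched in Python as follows. The sketch assumes the usual delete-a-group jack knife form, in which the sum of squared differences between the replicate estimates and the full sample estimate is multiplied by (G-1)/G, where G is the number of replicate groups (60 on the CURF). The function names, the toy data and the use of only three replicate groups are illustrative assumptions.

```python
import math


def weighted_total(values, weights):
    """Weighted estimate of a total: sum of value * weight over records."""
    return sum(v * w for v, w in zip(values, weights))


def jackknife_se(values, main_weights, replicate_weights):
    """Group jack knife standard error of a weighted total.

    replicate_weights holds one list of weights per replicate group,
    each the same length as values. Assumes the (G-1)/G scaling factor.
    """
    n_groups = len(replicate_weights)
    y = weighted_total(values, main_weights)
    sum_sq = sum((weighted_total(values, rw) - y) ** 2 for rw in replicate_weights)
    return math.sqrt((n_groups - 1) / n_groups * sum_sq)


def rse_percent(se, estimate):
    """Relative standard error, expressed as a percentage of the estimate."""
    return se / estimate * 100.0


# Toy data: three records and three replicate groups (the CURF has 60).
values = [10.0, 20.0, 30.0]
main_weights = [2.0, 2.0, 2.0]
replicate_weights = [
    [0.0, 3.0, 3.0],  # group 1 dropped, remaining weights scaled up
    [3.0, 0.0, 3.0],  # group 2 dropped
    [3.0, 3.0, 0.0],  # group 3 dropped
]

estimate = weighted_total(values, main_weights)
se = jackknife_se(values, main_weights, replicate_weights)
rse = rse_percent(se, estimate)
```

With these invented figures the full sample estimate is 120, the replicate estimates are 150, 120 and 90, and the standard error is the square root of 1200 (about 34.6), giving an RSE of about 28.9%. On the CURF the replicate weights are stored as items on each record, so they would first need to be arranged into one list per replicate group.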
It is not clear that the jack knife method will provide good estimates for the variance of quantile boundaries such as the median (see Rao, J.N.K., Wu, C.F.J., and Yue, K. (1992) for some recent work on resampling methods for complex surveys: Survey Methodology, vol. 18, pp. 209–217). An indirect approach (known as the Woodruff method) is available for estimating the variance of a quantile based on replicate weights (see Särndal, Swensson and Wretman: Model Assisted Survey Sampling, Springer–Verlag, 1992).
To enable CURF users to check that they are using the replicate weights correctly, RSEs for selected key items have been calculated from the SIH Expanded CURF, and are presented as part of the sample tabulations available from the Downloads tab. The RSEs for estimates other than medians have been calculated using the group jack knife method, and RSEs for the medians have been calculated using the Woodruff method.