Australian Bureau of Statistics
6541.0.30.001 - Microdata: Income and Housing, Australia, 2011-12 Quality Declaration
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 04/10/2013
ABOUT THE CURFs
The 2011–12 CURFs contain unit records relating to almost all of the survey respondents. The data are released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses of survey respondents on the CURFs and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:
Steps to confidentialise the datasets made available on the CURFs are undertaken in such a way as to ensure the integrity of the datasets and optimise the content, while maintaining the confidentiality of respondents. Intending purchasers should ensure that the data they require, at the level of detail they require, are available on the CURFs; data obtained in the survey but not contained on the CURFs may be available in tabulated form on request. The Data Item List document in the Summary tab describes the list of data items and categories on the SIH 2011–12 Basic and Expanded CURFs; the list itself is available as a datacube from the Downloads tab.
This section provides details of the files included on each of the Basic and Expanded CURFs.
SIH BASIC CURF FILE CONTENTS
The SIH Basic CURF distributed on CD–ROM or via the RADL contains the following files:
These files contain the raw confidentialised survey data in hierarchical comma delimited ASCII text format.
SIH12B.CSV contains data for all levels
SIH12BH.CSV contains the Household level data
SIH12BI.CSV contains the Income unit level data
SIH12BP.CSV contains the Person level data
SIH12BL.CSV contains the Loans level data
These files contain the data for the CURF in SAS for Windows format.
SIH12BH.sas7bdat contains the Household level data
SIH12BI.sas7bdat contains the Income unit level data
SIH12BP.sas7bdat contains the Person level data
SIH12BL.sas7bdat contains the Loans level data
These files contain the data for the CURF in SPSS for Windows format.
SIH12BH.SAV contains the Household level data
SIH12BI.SAV contains the Income unit level data
SIH12BP.SAV contains the Person level data
SIH12BL.SAV contains the Loans level data
These files contain the data for the CURF in STATA format.
SIH12BH.DTA contains the Household level data
SIH12BI.DTA contains the Income unit level data
SIH12BP.DTA contains the Person level data
SIH12BL.DTA contains the Loans level data
FORMATS.sas7bcat is a SAS library containing formats
SIH12B.SAS contains a SAS program to run the SAS formats
IMPORTANT INFORMATION.PDF describes the file contents of the CURF and information on using the CURF
COPYRITE1.BAT describes Copyright obligations for CURF users
RESPONSIBLE ACCESS TO CURFs.PDF is an Acrobat file explaining CURF users' role and obligations when using confidentialised data
The following plain text format files contain documentation about data item code values and category labels at each level, with weighted and unweighted frequencies for each value.
FREQUENCIES_SIH12BH.TXT contains documentation of the Household level data
FREQUENCIES_SIH12BI.TXT contains documentation of the Income Unit level data
FREQUENCIES_SIH12BP.TXT contains documentation of the Person level data
FREQUENCIES_SIH12BL.TXT contains documentation of the Loans level data
SIH EXPANDED CURF FILE CONTENTS
The SIH Expanded CURF can only be accessed via the RADL or the ABSDL, and contains the following files:
The test files mirror the actual data files, but contain random data and random identifiers. They can be downloaded from the RADL website so that users can troubleshoot their code before submitting RADL jobs.
SIH12EH.sas7bdat contains the test file of Household level data in SAS for Windows format
SIH12EI.sas7bdat contains the test file of Income unit level data in SAS for Windows format
SIH12EP.sas7bdat contains the test file of Person level data in SAS for Windows format
SIH12EL.sas7bdat contains the test file of Loans level data in SAS for Windows format
SIH12EH.SAV contains the test file of Household level data in SPSS format
SIH12EI.SAV contains the test file of Income unit level data in SPSS format
SIH12EP.SAV contains the test file of Person level data in SPSS format
SIH12EL.SAV contains the test file of Loans level data in SPSS format
SIH12EH.DTA contains the test file of Household level data in STATA format
SIH12EI.DTA contains the test file of Income unit level data in STATA format
SIH12EP.DTA contains the test file of Person level data in STATA format
SIH12EL.DTA contains the test file of Loans level data in STATA format
SIH12EH.sas7bdat contains the file of Household level data in SAS for Windows format
SIH12EI.sas7bdat contains the file of Income unit level data in SAS for Windows format
SIH12EP.sas7bdat contains the file of Person level data in SAS for Windows format
SIH12EL.sas7bdat contains the file of Loans level data in SAS for Windows format
SIH12EH.SAV contains the file of Household level data in SPSS format
SIH12EI.SAV contains the file of Income unit level data in SPSS format
SIH12EP.SAV contains the file of Person level data in SPSS format
SIH12EL.SAV contains the file of Loans level data in SPSS format
SIH12EH.DTA contains the file of Household level data in STATA format
SIH12EI.DTA contains the file of Income unit level data in STATA format
SIH12EP.DTA contains the file of Person level data in STATA format
SIH12EL.DTA contains the file of Loans level data in STATA format
FORMATS.sas7bcat is a SAS library containing formats.
The following plain text format files contain documentation about data item code values and category labels at each level, with weighted and unweighted frequencies for each value.
FREQUENCIES_SIH12EH.TXT contains documentation of the Household level data
FREQUENCIES_SIH12EI.TXT contains documentation of the Income Unit level data
FREQUENCIES_SIH12EP.TXT contains documentation of the Person level data
FREQUENCIES_SIH12EL.TXT contains documentation of the Loans level data
NOTES ON SPECIFIC DATA ITEMS
The data items included on the CURFs, and the categories within the data items, differ between the Basic and Expanded CURFs. The Expanded CURFs contain more variables than the Basic CURFs as well as more detailed data for selected variables. The data item list also shows the differences between the 2011–12 Basic and Expanded CURFs. Many of the differences result from the difference in the maximum household size permitted on the Basic and Expanded CURFs.
A complete list of the data items available on each record level for the CURFs, including relevant population and classification details, is available from the Downloads tab.
Many of the data items included on the CURFs are self-explanatory. The Glossary provides links to terms and definitions for most of the survey's data items. However, some items require further explanation.
There are several identifiers on records at each level of the file.
Each household has a unique random identifier. This identifier appears on the household level (ABSHID), and is repeated on the income unit, person, expenditure and loans level records relating to that household.
Each family within the household is numbered sequentially. Non family members, single person households and persons in group households have a sequential "family number" commencing at 50. Family number (ABSFID) appears on the income unit level and the person level. The combination of household and family number uniquely identifies a family.
A family has one or more income units and each income unit within the family is numbered sequentially. Income unit number (ABSIID) appears on the income unit level and the person level. The combination of household, family and income unit number uniquely identifies an income unit.
An income unit has one or more persons and each person within the income unit is numbered sequentially. Person number (ABSPID) appears on the person level. The combination of household, family, income unit and person number uniquely identifies a person.
A household may have one or more loans and each loan within the household is numbered sequentially. Loan number (ABSLID) appears on the loans level. The combination of a household and loan number uniquely identifies a loan.
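The way these identifiers combine can be sketched in Python as follows. This is a minimal illustration only: the records are invented, STATE is a hypothetical household item used to show the link, and only the identifier names (ABSHID, ABSFID, ABSIID, ABSPID) come from the description above.

```python
# Invented miniature records; real CURF files carry many more data items.
households = [
    {"ABSHID": "H001", "STATE": 1},   # STATE is a hypothetical household item
    {"ABSHID": "H002", "STATE": 2},
]
persons = [
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "ABSPID": 1},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "ABSPID": 2},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 2, "ABSPID": 1},
    {"ABSHID": "H002", "ABSFID": 50, "ABSIID": 1, "ABSPID": 1},
]


def family_key(rec):
    """Household + family number identifies a family."""
    return (rec["ABSHID"], rec["ABSFID"])


def income_unit_key(rec):
    """Household + family + income unit number identifies an income unit."""
    return family_key(rec) + (rec["ABSIID"],)


def person_key(rec):
    """Household + family + income unit + person number identifies a person."""
    return income_unit_key(rec) + (rec["ABSPID"],)


# Sanity check: the four-part key should never repeat on the person level.
person_keys = [person_key(p) for p in persons]
assert len(set(person_keys)) == len(person_keys)

# Attach household level items to each person record via ABSHID.
household_by_id = {h["ABSHID"]: h for h in households}
linked = [{**household_by_id[p["ABSHID"]], **p} for p in persons]
```

Note that ABSPID on its own repeats across income units, which is why the full combination is needed when matching person records.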
To enable CURF users greater flexibility in their analyses, the ABS has included two Socio-Economic Indexes for Areas (SEIFA) and several sub-state geography items on the Expanded 2011–12 CURFs. Conditions are placed on the use of these items. Tables showing multiple data items, cross-tabulated by more than one sub-state geography at a time, are not permitted due to the detailed information about small geographic regions that could be presented. However, simple cross-tabulations of population counts by sub-state geographic data items may be useful for clients in order to determine which geography item to include in their primary analysis, and such output is permitted.
The person level records contain detailed information on income by source. The income unit and household level records contain information at a broader level. If detailed information is required for income analyses at the income unit or household level, this can be calculated by aggregating the person level information for each income unit or household. Income is recorded on both a 'current' and a 'previous financial year' basis. For more information about current and previous financial year income, see Part 1.2 'Current, annual and weekly income' in Survey of Income and Housing, User Guide, Australia, 2011–12 (cat. no. 6553.0) .
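Aggregating person level information to the income unit or household level might look like the sketch below. The field name WAGES and all values are hypothetical placeholders for a person level income item; only the identifier names come from the CURF description.

```python
from collections import defaultdict

# Invented person level records; WAGES is a hypothetical income item.
persons = [
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "ABSPID": 1, "WAGES": 900.0},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 1, "ABSPID": 2, "WAGES": 350.0},
    {"ABSHID": "H001", "ABSFID": 1, "ABSIID": 2, "ABSPID": 1, "WAGES": 0.0},
    {"ABSHID": "H002", "ABSFID": 50, "ABSIID": 1, "ABSPID": 1, "WAGES": 1200.0},
]


def aggregate(records, key_fields, value_field):
    """Sum value_field over all records sharing the same key_fields."""
    totals = defaultdict(float)
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        totals[key] += rec[value_field]
    return dict(totals)


# Income unit level: household + family + income unit number.
income_unit_wages = aggregate(persons, ("ABSHID", "ABSFID", "ABSIID"), "WAGES")
# Household level: household identifier only.
household_wages = aggregate(persons, ("ABSHID",), "WAGES")
```

The same function serves both levels because the only thing that changes is the set of identifiers used as the grouping key.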
Where possible, supplementary items have been included on the file which replicate the content of the items that have been included on previous issues of the SIH CURFs. The SIH files include two income aggregates, "Total current weekly income from all sources" and "Total current weekly income from all sources (2005–06 basis)".
'Total current weekly income from all sources'
The publications relating to the 2011–12 SIH use this measure of income. It is consistent with the measure of income used in 2007–08 and 2009–10.
The component items of "Total current weekly income from all sources" are:
'Total current weekly income from all sources (2005-06 basis)'
This measure of income is comparable to that used in the publications relating to the 2005–06 survey; however, there are some differences arising from changes and improvements in the collection of information about sources of income that were introduced in 2007–08. The differences are the use of improved reported income from trusts, and the inclusion of a broader measure of income from family members outside the household in place of the earlier restriction to regular cash income from persons outside the household.
Previous financial year exclusion flag
The previous financial year exclusion flag at the person level (FINSCOPE) has a value of 1 for females whose family situation changed since 1 July 2010 (by moving in with a new partner, separating from a partner or becoming widowed) and for persons who arrived in Australia during 2011–12. At the income unit level a value of 1 in the previous financial year exclusion flag (FINSCOPU) indicates income units where the reference person or spouse has FINSCOPE=1. At the household level the previous financial year exclusion flag (FINSCOPH) indicates households where the reference person or spouse of one of the income units in the household has FINSCOPE=1. Users wishing to analyse previous financial year income data may wish to exclude such persons from their analysis (by limiting their analysis to records where FINSCOPE=2).
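Restricting an analysis to in-scope records could be done along these lines. The records and the PREV_FY_INCOME field are invented for illustration; FINSCOPE and its values 1 and 2 are as described above, and the same helper could be pointed at FINSCOPU or FINSCOPH on the income unit or household level.

```python
# Invented person records; PREV_FY_INCOME is a hypothetical field name.
persons = [
    {"ABSPID": 1, "FINSCOPE": 2, "PREV_FY_INCOME": 52000.0},
    {"ABSPID": 2, "FINSCOPE": 1, "PREV_FY_INCOME": 18000.0},
    {"ABSPID": 3, "FINSCOPE": 2, "PREV_FY_INCOME": 61000.0},
]


def keep_in_scope(records, flag_field="FINSCOPE"):
    """Keep records whose exclusion flag equals 2 (not flagged for exclusion)."""
    return [rec for rec in records if rec[flag_field] == 2]


in_scope = keep_in_scope(persons)
```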
Assets and liabilities
The 2011–12 survey collected information on a comprehensive range of household assets and liabilities to enable analysis of net worth and its components across households. Similar data were collected for 2003–04, 2005–06 and 2009–10.
Weekly housing costs included on previous SIH CURFs and used in the publication Housing Occupancy and Costs, Australia, 2011–12 (cat. no. 4130.0) is labelled on the 2011–12 CURFs as "Weekly housing costs (SIHC basis)" and has the field name HCOSTSH. The component items are:
In the 2011–12 publications, housing costs have continued to be measured using HCOSTSH, in order to provide comparability with earlier issues.
However, in SIH surveys since 2003–04, extra information on housing costs has been collected.
A number of other related items are included on the CURF:
The SIH CURFs include estimates of imputed rent for owner-occupied dwellings. The imputation has also been applied to other housing tenures in order to value the in-kind benefit conferred to households paying subsidised rent or households occupying their dwelling rent free. Including imputed rent as part of household income and expenditure conceptually treats owner-occupiers as if they were renting their home from themselves, thus simultaneously incurring rental expenditure and earning rental income. Inclusion of imputed rent estimates in income measures is in accord with international standards for household income statistics, and provides a broader picture of the economic well-being of owner-occupier households and their social and economic circumstances relative to other households.
Two household level variables are included on the SIH CURFs for the imputed rent estimates: 'Weekly gross imputed rent' and 'Weekly HH income from net imputed rent'. Gross imputed rent is the market value of the rental equivalent, and has been estimated using hedonic regression. Net imputed rent for owner occupiers has been derived by subtracting the housing costs normally paid by landlords (i.e. rates, mortgage interest, insurance, repairs and maintenance) from gross imputed rent. Income totals incorporating the imputed rent estimates have not been included. Users wishing to analyse the effect of imputed rent on income should add net imputed rent to household income. When analysing household expenditure, gross imputed rent should be added and any housing costs normally paid by a landlord should be deducted. For further information refer to Part 1.12 'Imputed rent estimates' in Survey of Income and Housing, User Guide, Australia, 2011–12 (cat. no. 6553.0).
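The adjustments described above amount to simple arithmetic, sketched here in Python. All parameter names and dollar amounts are hypothetical; the CURF field names for these items are given in the data item list and are not assumed here.

```python
def income_with_imputed_rent(household_income, net_imputed_rent):
    """Household income adjusted to include net imputed rent."""
    return household_income + net_imputed_rent


def expenditure_with_imputed_rent(expenditure, gross_imputed_rent, landlord_type_costs):
    """Household expenditure with gross imputed rent added and the housing
    costs normally paid by a landlord deducted."""
    return expenditure + gross_imputed_rent - landlord_type_costs


# Invented weekly amounts for one owner-occupier household.
gross_imputed_rent = 400.0
landlord_type_costs = 150.0   # rates, mortgage interest, insurance, repairs
net_imputed_rent = gross_imputed_rent - landlord_type_costs

adjusted_income = income_with_imputed_rent(1500.0, net_imputed_rent)
adjusted_expenditure = expenditure_with_imputed_rent(
    1200.0, gross_imputed_rent, landlord_type_costs
)
```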
Social transfers in kind
The SIH CURFs include estimates of social transfers in kind at the household level. Social transfers in kind consist of goods and services provided free or at subsidised prices by the government. Information reported in the SIH was used as the basis for allocating government social transfers in kind to households based on the composition of households and the characteristics of their members. The value of government social transfers in kind for education, health, housing, social security and welfare, and electricity concessions and rebates (indirect benefits) is added to disposable income to derive disposable income plus social transfers in kind. Final income is equal to disposable income plus social transfers in kind less taxes on production. For further information refer to Appendix 4 'Social transfers in kind' in Survey of Income and Housing, User Guide, Australia, 2011–12 (cat. no. 6553.0).
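The relationship between these income measures can be illustrated with a short Python sketch. The dictionary keys and amounts are invented for the example and are not CURF field names.

```python
def income_measures(disposable_income, transfers_in_kind, taxes_on_production):
    """Return (disposable income plus social transfers in kind, final income)."""
    stik_total = sum(transfers_in_kind.values())
    disposable_plus_stik = disposable_income + stik_total
    final_income = disposable_plus_stik - taxes_on_production
    return disposable_plus_stik, final_income


# Invented weekly amounts for a single household.
transfers = {
    "education": 120.0,
    "health": 180.0,
    "housing": 0.0,
    "social_security_and_welfare": 40.0,
    "electricity_concessions_and_rebates": 5.0,
}
disposable_plus_stik, final_income = income_measures(1400.0, transfers, 110.0)
```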
Imputation flags exist for each module in the questionnaire, rather than for specific data items. A value of 1 (partially imputed) indicates that at least one question in that module was imputed. Referring to the contents of the questionnaire module can provide an indication of whether particular data items may have included imputed data. The number of flags with a value of 1 for a particular record provides an indication of the extent of imputation in that record. A value of 2 (fully imputed) indicates that a person record has been fully imputed. In households where one or more people did not respond, person records were imputed if the non-responding person was not a 'significant person'.
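Summarising the extent of imputation on a record might be sketched as follows. The flag names are hypothetical stand-ins for the module-level flags, and the value 0 for "not imputed" is an assumption made for this example; only the meanings of values 1 and 2 come from the text above.

```python
# Hypothetical flag names; the actual names are in the data item list.
IMPUTATION_FLAGS = ("IMPFLAG_INCOME", "IMPFLAG_HOUSING", "IMPFLAG_ASSETS")


def imputation_summary(record, flag_fields=IMPUTATION_FLAGS):
    """Count partially imputed modules and note whether the record is fully imputed."""
    values = [record[f] for f in flag_fields]
    return {
        "partially_imputed_modules": sum(1 for v in values if v == 1),
        "fully_imputed": 2 in values,
    }


# Invented person record: two modules partially imputed, one untouched
# (0 is an assumed code for "not imputed").
person = {"IMPFLAG_INCOME": 1, "IMPFLAG_HOUSING": 0, "IMPFLAG_ASSETS": 1}
summary = imputation_summary(person)
```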
Payments to non household members
The financial resources available to certain persons can be affected by regular payments that they may make to provide support for persons outside the household. Information on payments for child support, alimony to former spouse, and payments to family members not in the household have been included on the CURFs.
Multiple response data items
The child care topic contains a number of multiple response data items on the 2011–12 CURFs. In these instances respondents were able to select one or more response categories, and the output data items are multi-response in nature. This section describes these items and provides some information on how to use them.
On the Basic and Expanded CURFs, the data items are:
RELIABILITY OF ESTIMATES
Use of weights
As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for, by use of appropriate weights, the results will be biased.
Each household, income unit, person and loan record contains a weight. This weight indicates how many population units are represented by the sample unit.
Weights for each member of the household are the same as the weight for the household itself. Information for sampled households can be multiplied by the weights to produce estimates for the whole population. For further information on the weighting process, refer to Part 2.6 'Benchmarks and weighting of survey results' in Survey of Income and Housing, User Guide, Australia, 2011-12 (cat. no. 6553.0).
If estimates of population subgroups are to be derived from the CURF, it is essential that they are calculated by adding the weights of persons/households in each category and not just by counting the number in each category. If each person's/household's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's/household's chance of selection or of different response rates across population groups, with the result that the estimates produced could be seriously biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.
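The difference between counting records and adding weights can be seen in a small Python sketch. The records, the STATE categories and the field name WEIGHT are all invented for illustration; the actual weight field name is given in the data item list.

```python
from collections import Counter, defaultdict

# Invented person records; WEIGHT is a hypothetical name for the main weight.
persons = [
    {"STATE": "NSW", "WEIGHT": 1500.0},
    {"STATE": "NSW", "WEIGHT": 1700.0},
    {"STATE": "NT", "WEIGHT": 200.0},
    {"STATE": "NT", "WEIGHT": 250.0},
    {"STATE": "NT", "WEIGHT": 150.0},
]


def unweighted_counts(records, category_field):
    """Number of sample records in each category (not a population estimate)."""
    return Counter(rec[category_field] for rec in records)


def weighted_estimates(records, category_field, weight_field):
    """Population estimate for each category: the sum of the record weights."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[category_field]] += rec[weight_field]
    return dict(totals)


sample_counts = unweighted_counts(persons, "STATE")
population_estimates = weighted_estimates(persons, "STATE", "WEIGHT")
```

In this toy sample the NT has more records than NSW, yet its weighted population estimate is far smaller, which is the kind of distortion that counting records alone would introduce.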
It should be noted that as a result of some of the changes made to protect confidentiality on the CURF, estimates of benchmarked items produced from the CURF may not equal the benchmarked values. For further information refer to the 'Reconciliation of the CURF data' document in this product.
Relative sampling error
Two types of error are possible in an estimate based on a sample survey: non-sampling error and sampling error. For further information on non-sampling and sampling error refer to Part 2.8 'Reliability of Estimates' in Survey of Income and Housing, User Guide, Australia, 2011-12 (cat. no. 6553.0).
Each record on the CURF contains 60 'replicate weights' in addition to the 'main weight'. These replicate weights can be used to derive estimates of standard error.
The basic idea behind the replication approach is to select subsamples repeatedly (60 times) from the whole sample. For each of these subsamples the statistic of interest is calculated. The variance of the full sample statistic is then estimated using the variability among the replicate statistics calculated from the subsamples. As well as enabling variances of estimates to be calculated relatively simply, replicate weights also enable unit record analyses such as chi–square tests and logistic regression to be conducted which take into account the complex sample design.
There are various ways of creating replicate subsamples from the full sample. The replicate weights produced for the 2011–12 SIH have been created using a group jack knife method of replication. The formulae for calculating the SE and RSE of an estimate using this method are:
SE(y) = sqrt( (59/60) * sum over g of (y(g) - y)^2 )
RSE(y) = SE(y)/y * 100%
where:
g = 1,..,60 (the no. of replicate groups)
y(g) = weighted estimate, having applied the weights for replicate group g
y = weighted estimate from the full sample.
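The calculation can be sketched in Python as below. This is an illustration under stated assumptions rather than the ABS implementation: it assumes the usual delete-a-group jack knife factor (G - 1)/G, which is 59/60 for 60 replicate groups, and it uses a toy dataset with 3 replicate groups in place of 60. All values and weights are invented.

```python
import math


def weighted_total(values, weights):
    """Weighted estimate of a total: sum of value times weight."""
    return sum(v * w for v, w in zip(values, weights))


def jackknife_se(full_estimate, replicate_estimates):
    """Group jack knife SE, assuming the (G - 1)/G variance factor."""
    g = len(replicate_estimates)
    sum_sq = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((g - 1) / g * sum_sq)


def rse_percent(estimate, se):
    """Relative standard error as a percentage of the estimate."""
    return se / estimate * 100.0


# Toy data: 3 records and 3 replicate groups (the CURF has 60 groups).
values = [10.0, 20.0, 30.0]
main_weights = [1.0, 1.0, 1.0]
# Each replicate zeroes one group and scales the remaining weights up.
replicate_weights = [
    [0.0, 1.5, 1.5],
    [1.5, 0.0, 1.5],
    [1.5, 1.5, 0.0],
]

y = weighted_total(values, main_weights)
y_reps = [weighted_total(values, rw) for rw in replicate_weights]
se = jackknife_se(y, y_reps)
rse = rse_percent(y, se)
```

The same pattern applies to any statistic: compute it once with the main weight, once with each replicate weight, and pass the results to jackknife_se.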
It is not clear that the jack knife method will provide good estimates for the variance of quantile boundaries such as the median (for work on resampling methods for complex surveys, see Rao, J.N.K., Wu, C.F.J., and Yue, K. (1992), Survey Methodology, vol. 18, pp. 209–217). An indirect approach (known as the Woodruff method) is available for estimating the variance of a quantile based on replicate weights (see Särndal, Swensson and Wretman, Model Assisted Survey Sampling, Springer–Verlag, 1992).
To enable CURF users to check that they are using the replicate weights correctly, RSEs for selected key items have been calculated from the SIH Expanded CURF, and are presented as part of the sample tabulations available from the Downloads tab. The RSEs for estimates other than medians have been calculated using the group jack knife method, and RSEs for the medians have been calculated using the Woodruff method.
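One simplified reading of the Woodruff approach for a median is sketched below in Python. It is not the ABS implementation and the details are assumptions: the jack knife factor (G - 1)/G, a normal-approximation interval with z = 1.96, and a quantile defined as the smallest value whose cumulative weight share reaches the target. The data are invented and use 3 replicate groups rather than 60.

```python
import math


def weighted_quantile(values, weights, prob):
    """Smallest value whose cumulative share of total weight reaches prob."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= prob * total:
            return value
    return pairs[-1][0]


def weighted_share_at_or_below(values, weights, cutoff):
    """Weighted proportion of units with a value at or below cutoff."""
    below = sum(w for v, w in zip(values, weights) if v <= cutoff)
    return below / sum(weights)


def woodruff_median_se(values, main_weights, replicate_weights, z=1.96):
    """Approximate SE of the weighted median via an interval on the proportion."""
    median = weighted_quantile(values, main_weights, 0.5)
    # Step 1: jack knife SE of the proportion at or below the median.
    p_full = weighted_share_at_or_below(values, main_weights, median)
    p_reps = [weighted_share_at_or_below(values, rw, median)
              for rw in replicate_weights]
    g = len(p_reps)
    se_p = math.sqrt((g - 1) / g * sum((p - p_full) ** 2 for p in p_reps))
    # Step 2: map the interval for the proportion back to the value scale.
    lower = weighted_quantile(values, main_weights, max(0.0, 0.5 - z * se_p))
    upper = weighted_quantile(values, main_weights, min(1.0, 0.5 + z * se_p))
    # Step 3: convert the interval width into a standard error.
    return median, (upper - lower) / (2 * z)


# Toy data: 6 records in 3 replicate groups of 2 records each.
values = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
main_weights = [1.0] * 6
replicate_weights = [
    [0.0, 0.0, 1.5, 1.5, 1.5, 1.5],
    [1.5, 1.5, 0.0, 0.0, 1.5, 1.5],
    [1.5, 1.5, 1.5, 1.5, 0.0, 0.0],
]
median, median_se = woodruff_median_se(values, main_weights, replicate_weights)
```

With so few records the interval spans the whole toy sample, so the result is only useful for checking the mechanics; the published RSEs in the sample tabulations are the appropriate benchmark for real CURF output.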
This page last updated 3 October 2013