This document was added or updated on 16/02/2015.
EXPLANATORY NOTES - SOCIOECONOMIC FACTORS AND EARLY CHILDHOOD DEVELOPMENT
DATA SOURCES
1 This publication uses data from the 2012 Australian Early Development Index for Queensland and from the 2011 Census of Population and Housing. The Australian Early Development Index is now known as the Australian Early Development Census.
Australian Early Development Census data
2 The Australian Early Development Census (AEDC) is a census of children's health and development in their first year of formal full-time schooling. The AEDC is an Australian Government initiative. The Social Research Centre undertook the 2012 data collection and is responsible for managing the AEDC data.
3 AEDC data are collected using checklists completed by teachers for each child in their class. The AEDC instrument is made up of about 100 questions. Teachers used their knowledge about and observations of children in their class, in conjunction with data from enrolment forms, to complete the AEDC checklists.
4 The 2012 AEDC collection was undertaken between 1 May and 31 July 2012 and information was collected on 289,973 Australian children during their first year of full-time school. Overall, 96.5 per cent of all Australian children who registered to commence school in 2012 are included in this cycle of the AEDC.
5 The AEDC measures five domains of early child development. These domains are Physical health and wellbeing, Social competence, Emotional maturity, Language and cognitive skills (school-based), and Communication skills and general knowledge. AEDC results are typically reported as proportions of children who are regarded as 'on track', 'developmentally at risk' and 'developmentally vulnerable' based on cut-offs for each domain. Developmentally vulnerable children can be classified as vulnerable on one or more domains and/or vulnerable on two or more domains. Children who are identified as having special needs receive AEDC domain scores, but are not included in the ranked ordering of domain scores across all children who participated in the AEDC and do not receive a domain category. For more information see the AEDC website at www.aedc.gov.au.
Census of Population and Housing
6 The Census of Population and Housing (Census) is undertaken by the ABS every five years, and is collected under the authority of the Census and Statistics Act 1905. For information about the 2011 Census, including collection methodology, please refer to the information provided on the Census 2011 Reference and Information section of the ABS website. Information about the data quality of the Census is also available on the ABS website under Census Data Quality.
SCOPE AND COVERAGE
7 The scope of the Integrated Queensland AEDC and ABS Census dataset is restricted to children enrolled in their first year of full-time school in 2009 or 2012, for whom the teacher submitted an AEDC checklist. The coverage of the dataset is further restricted to children whose data were recorded on the 2011 Census of Population and Housing.
8 The 'Socioeconomic factors and early child development' article released in February 2015 is restricted to children who were enrolled in their first year of full-time school in 2012 and who were living in Queensland in 2012. A small number of children who lived in Queensland attended school in another state or territory.
DATA INTEGRATION
9 Statistical data integration involves combining information from different administrative and/or survey sources to provide new datasets for statistical and research purposes. Further information on data integration is available on the National Statistical Service website – Data Integration.
10 Data linking is a key part of statistical data integration and involves the technical process of combining records from different source datasets using variables that are shared between the sources. Data linkage is typically performed on records that represent individual persons, rather than aggregates. The most common methods link records on exact matches for common variables ('deterministic' linkage), or on close matches ranked by probabilities that the variables used will result in a true match ('probabilistic' linkage).
Linkage between Australian Early Development Census and Census data
11 The 2012 AEDC records were linked to the 2011 Census of Population and Housing data using a deterministic linkage methodology that requires exact matches between variables common to both datasets. It is considered a "bronze" standard linkage because name and address information were not used in the linkage process. The data were linked using date of birth, sex, and codes representing small geographic areas.
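The exact-match approach described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used for this project: the field names (`dob`, `sex`, `area_code`, `id`), the sample records and the rule of accepting a link only when the key is unique on both files are all assumptions made for the example.

```python
from collections import defaultdict

def deterministic_link(aedc_records, census_records, keys=("dob", "sex", "area_code")):
    """Link records that agree exactly on every key, keeping only unique matches."""
    def index(records):
        buckets = defaultdict(list)
        for rec in records:
            values = tuple(rec.get(k) for k in keys)
            # A record with a missing linking value cannot be matched exactly.
            if all(v not in (None, "") for v in values):
                buckets[values].append(rec)
        return buckets

    aedc_index = index(aedc_records)
    census_index = index(census_records)

    links = []
    for values, aedc_group in aedc_index.items():
        census_group = census_index.get(values, [])
        # Assumed rule: accept the link only when the key is unique on both files.
        if len(aedc_group) == 1 and len(census_group) == 1:
            links.append((aedc_group[0]["id"], census_group[0]["id"]))
    return links

# Hypothetical records, for illustration only.
aedc = [
    {"id": "A1", "dob": "2006-09-14", "sex": "F", "area_code": "301"},
    {"id": "A2", "dob": "2007-01-02", "sex": "M", "area_code": "302"},
    {"id": "A3", "dob": "2007-03-30", "sex": "M", "area_code": None},
]
census = [
    {"id": "C1", "dob": "2006-09-14", "sex": "F", "area_code": "301"},
    {"id": "C2", "dob": "2007-01-02", "sex": "M", "area_code": "302"},
    {"id": "C3", "dob": "2007-01-02", "sex": "M", "area_code": "302"},
]
links = deterministic_link(aedc, census)
print(links)
```

In this toy data only A1 is linked: A2 has two equally plausible Census candidates, and A3 has a missing area code, so neither can be linked with confidence.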
12 Information about linkage methodologies used in similar studies can be found in Research Paper: Assessing the Quality of Linking School Enrolment Records to 2011 Census Data: Deterministic Linkage Methods (cat. no. 1351.0.55.045) and Research Paper: Assessing the Quality of Different Data Linking Methodologies Across Time, Using Tasmanian Government School Enrolment Data (cat. no. 1351.0.55.047).
Linkage results
13 At the completion of the linkage process, 41,849 (67.9%) of the 61,593 records received from the 2012 Queensland AEDC dataset were linked to the 2011 Census data. Whilst the linkage rate is slightly lower than results from other bronze linkage projects using the 2011 Census data, the overall linkage accuracy for this project was estimated to be high. This was deemed to be the most appropriate balance of linkage rate and linkage accuracy. There is potential to raise the linkage rate; however, any small increase in the linkage rate would be outweighed by the loss in link accuracy. As the focus of this project is to analyse individual characteristics, including those for sub-populations, linkage accuracy was treated as more important than a slight increase in the linkage rate.
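The linkage rate quoted above follows directly from the two record counts. A short check, using only the figures given in the paragraph (the variable names are illustrative):

```python
linked = 41_849      # AEDC records linked to the 2011 Census
received = 61_593    # records received from the 2012 Queensland AEDC dataset

link_rate = linked / received
unlinked = received - linked

print(f"Linkage rate: {link_rate:.1%}")
print(f"Unlinked records: {unlinked:,}")
```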
14 While of a high quality, these links still have a small chance of being false. This chance of error is influenced by a few factors. The first factor is the amount of missing or invalid information for the linking variables used. Matches can only be made on valid responses, and any of the unique links could potentially have been duplicated among the records with missing or invalid information had that information been present.
15 The second factor is persons in the AEDC dataset who were missing from the Census data. While both sources of data are population counts, children may not have been included in the 2011 Census because they were not residents of Australia at the time, were abroad temporarily at the time of collection, or were missed for another reason. Those people who were missing from the 2011 Census could have created duplicate records for the links that were considered unique.
16 Another factor impacting on potential error is the quality of the variables used for linking. While inaccurate responses for variables have a small impact here, the larger impact comes from the efficacy of variables to match records uniquely out of a pool of possible links. Variables that are more likely to contain unique responses, such as date of birth, are more effective for linking than variables that are less likely to be unique, such as sex.
CREATION OF DATA ITEMS
Parental Education and Occupation
17 Records within a family were linked to enable the identification of parental characteristics. This was only undertaken for parents, natural or adopted children, step children, and foster children who were at home on Census night.
Mother's age at time of child's birth
18 Mother's age at the time of child's birth was only calculated where both the child and the mother were at home on Census night, and the child was identified as the natural or adopted child of both parents or lone female parent, or the step-child of the male parent. It could not be calculated for children living in a female same-sex couple family.
WEIGHTING
19 Weighting is the process of adjusting a sample to infer results for the relevant population. The estimates in this publication are obtained by assigning a 'weight' to each linked record. The weight is a value which indicates how many children's records are represented by the linked record. Weights aim to adjust for the fact that the linked records may not be representative of all the records. Weighting was used to ensure better representation of population sub-groups and to enhance the reliability of the linked data for analysis.
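To illustrate how a weight enters an estimate, the following minimal sketch computes a weighted proportion. The record layout, the `vulnerable` flag and the weights are hypothetical values chosen for the example; they are not taken from the integrated dataset.

```python
def weighted_proportion(records, flag, weight="weight"):
    """Share of the weighted total accounted for by records where `flag` is true."""
    total = sum(r[weight] for r in records)
    flagged = sum(r[weight] for r in records if r[flag])
    return flagged / total

# Hypothetical linked records: the first stands in for two children.
linked_records = [
    {"vulnerable": True, "weight": 2.0},
    {"vulnerable": False, "weight": 1.0},
    {"vulnerable": False, "weight": 1.0},
]
estimate = weighted_proportion(linked_records, "vulnerable")
print(estimate)
```

Here the unweighted share of vulnerable records is one in three, while the weighted estimate is 0.5, because the vulnerable record represents two children.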
20 A 'two-step linking propensity calibration' procedure was used, which involves estimating the link rate using a logistic regression model.
21 The first step of the calibration process used methodology developed to adjust for non-response in sample surveys. The concepts of non-response and non-links differ in that the former is a result of an action by a person selected in a sample, and the latter is the failure to link a record likely as a result of the quality of its linking variables. However, both situations may result in under/over representation, and as such the methodology developed to adjust for non-response is suitable to apply to adjust for non-links. The Integrated AEDC and Census dataset is unique in that many characteristics of the non-links are known, and these characteristics can therefore be used as inputs into a non-links adjustment.
22 The propensity of an AEDC record to be linked to a Census record was modelled, and each record was assigned an initial weight. Records in the linked dataset which share characteristics with unlinked records are given higher weights by this model, such that unlinked records are adequately represented on the linked file. This step was conducted using the following variables:
- Sex
- Indigenous status
- Place of birth (Australia, Other English speaking country, Other country)
- Hybrid postcode-remoteness (postcode for those with more than 800 population, remoteness otherwise)
- Socio-Economic Indexes for Areas (SEIFA) Index of Relative Socio-Economic Advantage and Disadvantage
- School size.
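The idea behind the first step can be sketched as follows. Note the substitution: the procedure described above estimates link propensity with a logistic regression model, whereas this sketch uses the observed link rate within each cell of categorical variables as a simpler stand-in. The variable names (`sex`, `remote`, `linked`) and the records are hypothetical.

```python
from collections import defaultdict

def initial_weights(records, cell_vars):
    """Weight each linked record by 1 / (observed link rate in its cell)."""
    totals = defaultdict(int)   # all records per cell
    linked = defaultdict(int)   # linked records per cell
    for rec in records:
        cell = tuple(rec[v] for v in cell_vars)
        totals[cell] += 1
        linked[cell] += int(rec["linked"])

    weights = {}
    for rec in records:
        if rec["linked"]:
            cell = tuple(rec[v] for v in cell_vars)
            # Cells with a low link rate give their linked records a higher weight.
            weights[rec["id"]] = totals[cell] / linked[cell]
    return weights

# Hypothetical records: the remote cell links far less often than the non-remote cell.
records = [
    {"id": 1, "sex": "F", "remote": False, "linked": True},
    {"id": 2, "sex": "F", "remote": False, "linked": True},
    {"id": 3, "sex": "F", "remote": True, "linked": True},
    {"id": 4, "sex": "F", "remote": True, "linked": False},
    {"id": 5, "sex": "F", "remote": True, "linked": False},
    {"id": 6, "sex": "F", "remote": True, "linked": False},
]
weights = initial_weights(records, ("sex", "remote"))
print(weights)
```

In this example the single linked remote record receives a weight of 4, so that it also represents the three unlinked remote records, and the weights sum to the total number of records.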
23 The second step of the calibration process used the weighted file produced in the first step, and benchmarked it to Remoteness, State, Sex and Indigenous status.
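Benchmarking of this kind can be sketched as a scaling of weights so that they sum to known totals. The sketch below uses a single categorical benchmark and hypothetical totals; the source does not state whether the four benchmark variables were applied as separate margins or as a cross-classification, so this is an assumption made for illustration.

```python
from collections import defaultdict

def benchmark(weights, cells, benchmarks):
    """Scale weights within each benchmark cell so they sum to the known total."""
    current = defaultdict(float)
    for rec_id, w in weights.items():
        current[cells[rec_id]] += w
    return {
        rec_id: w * benchmarks[cells[rec_id]] / current[cells[rec_id]]
        for rec_id, w in weights.items()
    }

# Hypothetical first-step weights, benchmark cells and benchmark totals.
step_one_weights = {1: 1.0, 2: 1.0, 3: 4.0}
cells = {1: "Major city", 2: "Major city", 3: "Remote"}
benchmarks = {"Major city": 3.0, "Remote": 3.0}

adjusted = benchmark(step_one_weights, cells, benchmarks)
print(adjusted)
```

After scaling, the weights in each cell add up to that cell's benchmark total, while the relative sizes of the weights within a cell are preserved.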
24 A subsequent benchmarking step was run, to ensure adequate representation of vulnerable children.
25 The weights have a mean value of 1.46 and range between 1 and 9.12.
USE OF THE DATA
26 As the data in this publication are based on weighted estimates, there may be differences between figures in this publication and those published elsewhere.
27 While every effort was made to assure the quality of the statistics presented in this publication, they should be considered experimental and treated with caution.
28 The AEDC checklist contains a number of questions about a child's background and home life that are not necessarily known to teachers. As such, these items have comparatively high rates of unknown responses. The following table shows the answer rates for these questions across Queensland.
ANSWER RATES FOR SELECTED QUESTIONS
| Question | No | Yes | Don't know |
|---|---|---|---|
| Has emotional problem | 95.4% | 3.0% | 1.6% |
| Has behaviour problem | 94.9% | 4.0% | 1.2% |
| Child regularly read to at home | 5.4% | 90.1% | 4.6% |
POPULATIONS
29 The 'Socioeconomic factors and early child development' article uses all AEDC results in Queensland for 2012 which were successfully linked to the 2011 Census. This represents weighted totals of children rated for developmental vulnerability on each domain as outlined in the following table.
WEIGHTED TOTAL NUMBER OF CHILDREN WITH A DEVELOPMENTAL RATING, BY DOMAIN
| | Physical health and wellbeing | Social competence | Emotional maturity | Language and cognition | Communication and general knowledge |
|---|---|---|---|---|---|
| Population | 57,909 | 57,895 | 57,723 | 57,854 | 57,898 |