6227.0.55.002 - Experimental estimates of education and training performance measures based on data pooling, Survey of Education and Work, 2007 to 2010, Sep 2011  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 09/09/2011  First Issue

SIGNIFICANCE TESTING

This paper uses data from the Survey of Education and Work (SEW) to determine whether there have been statistically significant changes over time in key performance measures. A particular focus is the degree to which the larger samples made possible by pooling data from successive surveys can reduce sampling error and thereby improve the detection of change where it exists.
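As a rough illustration of why pooling can reduce sampling error, the sketch below assumes that the pooled estimate is a simple average of independent survey estimates of equal precision, and that the underlying population value is stable over the pooled period. These are simplifying assumptions made for illustration only and are not a description of the pooling method used in this paper. The function name `pooled_se` and the input figures are hypothetical.

```python
import math


def pooled_se(se_single, n_surveys):
    """Approximate SE of the average of n_surveys survey estimates.

    Assumes the surveys are independent and each has the same
    standard error (se_single). Both are illustrative assumptions.
    """
    return se_single / math.sqrt(n_surveys)


# Hypothetical figure: a single-year SE of 2.0 percentage points,
# pooled over four annual surveys.
se_four_years = pooled_se(2.0, 4)
print(se_four_years)
```

Under these assumptions, pooling four surveys halves the standard error, so smaller differences can be detected than with a single survey.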

When comparing estimates between surveys, or between populations within a survey, it is necessary to determine whether differences are 'real' differences between the corresponding population characteristics or simply the result of sampling variability between the survey samples. One way to examine this is to determine whether the difference between the estimates is statistically significant. This is done by calculating the standard error (SE) of the difference between two estimates (x and y), and using that to calculate the test statistic:

    |x - y| / SE(x - y)


If the value of this test statistic is greater than 1.96 (the critical value for a two-sided test at the 5% significance level), there is good evidence of a statistically significant difference between the two population estimates with respect to that characteristic. Otherwise, it cannot be stated with confidence that there is a real difference between the population estimates.
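The test described above can be sketched in a few lines of Python. This is a minimal sketch, not the ABS implementation. It assumes the two estimates are independent, so that the SE of the difference can be approximated by the square root of the sum of the squared SEs. That approximation, the function names and the input figures are assumptions introduced for illustration.

```python
import math


def z_statistic(x, y, se_x, se_y):
    """Return |x - y| / SE(x - y).

    SE(x - y) is approximated as sqrt(se_x^2 + se_y^2), which
    assumes the two estimates are independent (an assumption).
    """
    se_diff = math.sqrt(se_x ** 2 + se_y ** 2)
    return abs(x - y) / se_diff


def is_significant(x, y, se_x, se_y, critical_value=1.96):
    """True if the test statistic exceeds the critical value."""
    return z_statistic(x, y, se_x, se_y) > critical_value


# Hypothetical estimates (per cent) and standard errors; these are
# not SEW results.
z = z_statistic(62.0, 58.5, 1.2, 1.1)
print(round(z, 2), is_significant(62.0, 58.5, 1.2, 1.1))
```

Where the two estimates are correlated, for example because the samples overlap, the SE of the difference would need to take the covariance into account.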

The ABS advises care in interpreting the results of significance testing, particularly for comparisons of fine level aggregates. First, significance testing only provides information on the probability of a specific event (eg. an observed difference between two means) occurring by chance when the null hypothesis is true. It does not test whether the null hypothesis is actually true. Indeed, some level of difference should always be expected between any two subpopulation sample estimates because of demographic and socio-economic heterogeneity in the samples. Hence if 100 statistical significance tests are conducted on fine level aggregates (eg. at state by SEIFA decile level) using a significance level of 5% (alpha = 0.05, that is, a confidence level of 1-alpha = 95%), then about 5% of the tests are expected to show statistically significant differences by chance alone, even where no real differences exist. The results of such analyses should therefore not be read as automatically conclusive (ie. as showing a definite difference). Instead, the calculated p-value or z-score, the confidence intervals of the estimates and the heterogeneity of Australian population characteristics at CD by SEIFA decile level should all be considered in reaching a balanced judgement on the likely level of evidence (weak, good, very good etc) for a real difference between two subpopulations, as against the possibility of a false positive finding.
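The scale of the false positive problem can be illustrated with a short calculation. The sketch below assumes that all of the tests are independent and that the null hypothesis is true in every case. Both are simplifying assumptions, and the function names are hypothetical. It also shows how a z-score can be converted to a two-sided p-value under a standard normal approximation.

```python
import math


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant results when no real
    differences exist (null hypothesis true for every test)."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives, assuming the tests
    are independent (an illustrative assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def two_sided_p_value(z):
    """Two-sided p-value for a z-score under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


print(expected_false_positives(100))
print(round(prob_at_least_one_false_positive(100), 3))
print(round(two_sided_p_value(1.96), 3))
```

Under these assumptions, a set of 100 tests is almost certain to produce at least one statistically significant result by chance, which is why an isolated significant finding among many fine level comparisons should be treated with caution.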

For more information, see ABS Research Paper: Socio-Economic Indexes for Individuals and Families (Methodology Advisory Committee), Jun 2007 (cat. no. 1352.0.55.086)