Module 3: Interpreting Data
3. Questioning the data
3.2 How were the data produced?
Consider the following case study done on a sensitive political issue.
Public schools versus non-public schools
Coleman, Hoffer and Kilgore (1982) [2] analysed data from the study, High School and Beyond. The study examined the gains in scores on Reading, Vocabulary and Mathematics tests between Years 10 and 12 of students attending Public, Catholic and Other-Private high schools. Their question was "whether private schools bring about - for comparable students - higher achievement in basic cognitive skills". The students involved in the study were required to complete a battery of tests. Table 3.2.1 provides a simplified version of their results.
Table 3.2.1. Estimated Year 10 to 12 Gains in Test Scores and Learning Rates, with Corrections for Dropouts Missing from Senior Distribution
¹ Estimated learning rate was based on the average number of items learned and the number remaining to be learned.
Test your knowledge
Question
Select an interpretation of the gains in scores that you could make from the table of data above.
- Each sector assists students to develop their cognitive outcomes more in mathematics than in reading.
- Catholic sector schools contribute to similar gains in reading, mathematics and vocabulary.
- Non-public schools (private and Catholic) do not contribute as much as public schools in gains in scores.
- Public schools do not contribute as much as non-public schools to gains in scores.
Do the data support this interpretation?
In looking at these data you need to ask, "Are the data produced good enough to support this interpretation?" Coleman et al. were criticised as being pro-private school (Goldberger and Cain 1982). Evidence for this claim came from an analysis of the way that the data were produced. The major criticisms included:
- Although the authors of this paper implied that the same number of schools from each sector was involved in the study, the sample sizes varied widely between sectors. The numbers of schools involved in the study were as follows:
This might lead you to question the representativeness of the private-sector sample.
- Also, in the private schools, only 79% of Year 10 students participated compared with a participation rate of 90% for public schools and 95% for Catholic schools. Such variation might lead you to wonder whether some selection of students for the survey had occurred at the participating private schools so that the "better" students were selected for the study. Such an approach would increase the likelihood of an unrepresentative sample for the private school sector.
- The study was designed to support the argument that certain types of schools improved the learning rates for students more than other schools did. However, the questions on the test were elementary. Did they really assess learning at high school? In other words, was the study measuring what it claimed to measure?
- When students were stratified according to curriculum stream (i.e., academic, general, vocational), scores for academic public school students were about the same as those of students in the private and Catholic sectors, where the emphasis is on an academic curriculum. Thus, socioeconomic differences could also be a contributing factor that needs to be taken into account. (This is due to the perception that upper-class or middle-class homes will be more likely to put children into academic studies, an assumption which itself needs testing!)
How valid would the inferences made from the data be? The questions that have been raised above are typical of the approach we need to take when examining data.
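The participation-rate criticism above can be illustrated with a short simulation. This sketch is not a reanalysis of the study: the scores are randomly generated, the function name `simulate_participation` and its parameters are hypothetical, and only the 79% participation rate is taken from the text. It compares a worst case, in which only the highest-scoring students sit the test, with participation that is unrelated to ability.

```python
import random
import statistics


def simulate_participation(n_students=1000, participation=0.79,
                           selective=True, seed=0):
    """Return (mean of all students, mean of tested students).

    All scores are hypothetical and drawn at random; they are not
    data from the High School and Beyond study.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(50, 10) for _ in range(n_students)]
    n_tested = round(n_students * participation)
    if selective:
        # Worst case: only the highest-scoring students are tested.
        tested = sorted(scores, reverse=True)[:n_tested]
    else:
        # Participation is unrelated to ability.
        tested = rng.sample(scores, n_tested)
    return statistics.mean(scores), statistics.mean(tested)


true_mean, selective_mean = simulate_participation(selective=True)
_, random_mean = simulate_participation(selective=False)

print(f"All students:            {true_mean:.1f}")
print(f"Tested (selective 79%):  {selective_mean:.1f}")
print(f"Tested (random 79%):     {random_mean:.1f}")
```

Under selective participation the tested group's mean is higher than the mean for all students, while random non-participation leaves it close to the true value. A low participation rate is therefore a warning sign rather than proof of bias: what matters is whether the missing students differ from those tested.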
How can you decide whether or not the data are good data? Here are some guidelines:
1. Ask yourself if there are other variables that have not been considered but which could affect the data.
2. Is a context provided for the data, i.e., is the source of the data clearly described? This would allow you to decide if the variables were well defined and if the measurements were valid, reliable and accurate.
3. How was the sample used in the study selected? Was there any bias in the way that the sample was selected?
4. Were the data produced by a recognised and respected agency such as the Australian Bureau of Statistics? (Although this might not always guarantee that the data are good, in general you would have greater confidence in the reliability of such data than in data produced by an 'unknown' organisation or by a body with 'vested interests'.)
5. Are the variables well defined and do they actually measure the property that is the focus of the study?
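Guideline 1 can be made concrete with a small calculation. The sketch below uses invented numbers (the mean scores, the student counts and the helper `overall_mean` are all hypothetical, not figures from the study) to show how an unconsidered variable such as curriculum stream can produce a difference between sectors even when students in the same stream score identically.

```python
# Hypothetical (mean score, number of students) for each curriculum stream.
# These figures are invented for illustration; they are not study data.
sectors = {
    "public":  {"academic": (60, 300), "general": (50, 400), "vocational": (45, 300)},
    "private": {"academic": (60, 800), "general": (50, 150), "vocational": (45, 50)},
}


def overall_mean(streams):
    """Weighted mean score across streams, weighted by student count."""
    total = sum(mean * n for mean, n in streams.values())
    count = sum(n for _, n in streams.values())
    return total / count


for sector, streams in sectors.items():
    print(f"{sector}: overall mean = {overall_mean(streams):.2f}")
```

In this made-up example every stream has the same mean score in both sectors, yet the overall means differ (51.50 versus 57.75) purely because the private sector has a larger share of academic-stream students. Comparing like with like, stream by stream, removes the apparent gap.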