DATA QUALITY
RELIABILITY OF SURVEY ESTIMATES

All sample surveys are subject to error, which can be broadly categorised as either sampling error or non-sampling error.
Sampling error occurs because only a small proportion of the total population is used to produce estimates that represent the whole population. Sampling error can be reliably measured as it is calculated based on the scientific methods used to design surveys.

Non-sampling error may occur in any data collection, whether it is based on a sample or a full count (e.g. Census). Non-sampling error may occur at any stage throughout the survey process. Examples of non-sampling error include:
Sampling and non-sampling errors should be considered when interpreting results of the survey. Sampling errors are considered to occur randomly, whereas non-sampling errors may occur randomly and/or systematically.

SAMPLING ERROR
Measure of sampling variability

One measure of the likely difference between the sample estimate and the figure that would have been obtained if all dwellings had been included is given by the standard error (SE), which indicates the extent to which an estimate might have varied because only a sample of dwellings was included. There are about two chances in three that this difference will be less than one SE, and about 19 chances in 20 that it will be less than two SEs. The latter corresponds to the margin of error (MoE) at the 95% confidence level, which is expressed as 1.96 times the SE. The 95% confidence interval is the estimate +/- MoE, i.e. the range from the estimate minus 1.96 times the SE to the estimate plus 1.96 times the SE.

Another measure of the likely difference is the relative standard error (RSE), which is obtained by expressing the SE as a percentage of the estimate to which it relates. The RSE is a useful measure in that it provides an immediate indication of the percentage errors likely to have occurred due to sampling, and thus avoids the need to refer also to the size of the estimate. More detail on the calculation of SEs, MoEs and RSEs can be found in the Technical Note.

For proportion estimates, the RSE is a percentage error of a percentage. The MoE has also been published for proportion estimates to show the absolute size of the sampling error. This will assist users in assessing the reliability of these estimates.

Estimates with RSEs less than 25% are considered sufficiently reliable for most purposes. However, estimates with RSEs of 25% or more are included in Australian Bureau of Statistics (ABS) publications of results from this survey. Estimates with RSEs greater than 25% but less than or equal to 50% are annotated by an asterisk to indicate they are subject to high SEs relative to the size of the estimate and should be used with caution.
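The relationships between the SE, the MoE, the 95% confidence interval and the RSE described above can be illustrated with a short sketch. The class name, attribute names and the input figures are hypothetical and chosen only for illustration; they are not survey results, and the Technical Note remains the authoritative source for how SEs are actually derived.

```python
from dataclasses import dataclass
from typing import Tuple

Z_95 = 1.96  # multiplier quoted in the text for the 95% confidence level


@dataclass
class SamplingError:
    """Hypothetical container for a survey estimate and its standard error."""

    estimate: float
    se: float

    @property
    def moe(self) -> float:
        # Margin of error at the 95% confidence level: 1.96 times the SE.
        return Z_95 * self.se

    @property
    def confidence_interval(self) -> Tuple[float, float]:
        # Estimate minus MoE to estimate plus MoE.
        return (self.estimate - self.moe, self.estimate + self.moe)

    @property
    def rse(self) -> float:
        # SE expressed as a percentage of the estimate it relates to.
        return 100.0 * self.se / self.estimate


# Illustrative (made-up) figures: an estimate of 20,000 with an SE of 2,000.
example = SamplingError(estimate=20_000.0, se=2_000.0)
low, high = example.confidence_interval
print(f"MoE: {example.moe:.0f}, 95% CI: {low:.0f} to {high:.0f}, RSE: {example.rse:.1f}%")
```

With these made-up inputs the MoE is about 3,920 and the RSE is 10%, so the estimate would fall in the "sufficiently reliable for most purposes" range.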
Estimates with RSEs of greater than 50%, annotated by a double asterisk, are considered too unreliable for most purposes. These estimates can be aggregated with other estimates to reduce the overall sampling error.

Relative standard errors for estimates are published in 'direct' form. For the 2012-13 Australian Aboriginal and Torres Strait Islander Health Survey (AATSIHS), RSEs were calculated for each separate estimate and published individually using a replicate weights technique (Jackknife method). This direct calculation of RSEs can result in larger estimates having larger RSEs than smaller ones, since these larger estimates may have more inherent variability. More information about the replicate weights technique can be found in the Technical Note.

Standard errors of proportions, differences and sums

Proportions formed from the ratio of two estimates are also subject to sampling error. The size of the error depends on the accuracy of both the estimates. The difference between, or sum of, two survey estimates (of numbers or percentages) is itself an estimate and is therefore also subject to sampling error. The SE of the difference between, or sum of, two survey estimates depends on their SEs and the relationship between them. The formulas to approximate the RSE for proportions and the SE of the difference between, or sum of, two estimates can be found in the Technical Note.

Testing for statistically significant differences

When comparing estimates between surveys or between populations within a survey, it is useful to determine whether apparent differences are 'real' differences between the corresponding population characteristics, or simply the product of differences between the survey samples. One way to examine this is to determine whether the difference between the estimates is statistically significant.
Statistical significance is assessed by calculating the standard error of the difference between two estimates (x and y) and using that to calculate the test statistic with the formula below:

$$\frac{x - y}{SE(x - y)}$$

If the absolute value of the test statistic is greater than 1.96, then there is evidence, at the 95% confidence level, of a statistically significant difference between the two populations with respect to that characteristic. Otherwise, it cannot be stated with confidence that there is a real difference between the populations.

NON-SAMPLING ERROR

Lack of precision due to sampling variability should not be confused with inaccuracies that may occur for other reasons, such as errors in response and recording. Inaccuracies of this type are referred to as non-sampling error. This type of error is not specific to sample surveys and can occur in a census enumeration. The major sources of non-sampling error are:
These sources of error are discussed below.

Errors related to scope and coverage

Some dwellings may have been inadvertently included or excluded because, for example, the distinctions between whether they were private or non-private dwellings may have been unclear. All efforts were made to overcome such situations by constant updating of lists both before and during the survey. Also, some persons may have been inadvertently included or excluded because of difficulties in applying the scope rules concerning the identification of usual residents and Indigenous status of those residents.

Response errors

Response errors may have arisen from three main sources:
Errors may be caused by misleading or ambiguous questions, inadequate or inconsistent definitions of terminology used, or poor overall survey design (for example, context effects where responses to a question are directly influenced by the preceding questions). To overcome problems of this kind, individual questions, and the questionnaire overall, were thoroughly tested before being finalised for use in the survey. Testing for NATSIHS and NATSINPAS took two forms:
As a result of both forms of testing, modifications were made to question design, wording, ordering and associated prompt cards, and some changes were made to survey procedures. In considering modifications, it was sometimes necessary to balance better response to a particular item/topic against increased interview time or effects on other parts of the survey. For example, questions for collecting data on usual intake of fruit and vegetables referred to consumption in the form of 'serves', which required the use of a prompt card to define a serve, and a fair amount of recall and calculation on the part of the respondent.

Although every effort was made to minimise response errors due to questionnaire design and content issues, some errors will inevitably have occurred in the final survey enumeration.

As the survey is quite lengthy, reporting errors may also have resulted from interviewer and/or respondent fatigue (i.e. loss of concentration), particularly for those respondents reporting for both themselves and a child. Inaccurate reporting may also occur if respondents provide deliberately incorrect responses. While efforts were made to minimise errors arising from fatigue, or from deliberate misreporting or non-reporting by respondents, through emphasising the importance of the data and checks on consistency within the survey instrument, some instances will inevitably have occurred.

Reference periods used in relation to each topic were selected to suit the nature of the information being sought, in particular to strike the right balance between minimising recall errors and ensuring the period was meaningful and representative (from both respondent and data use perspectives), and would yield sufficient observations in the survey to support reliable estimates. It is possible that the reference periods did not suit every person for every topic and that difficulty with recall may have led to inaccurate reporting in some instances.
Lack of uniformity in interviewing standards may also result in non-sampling errors. Training and retraining programs, and checking of interviewers' work, were methods employed to achieve and maintain uniform interviewing practices and a high level of accuracy in recording answers on the survey questionnaire (see: Interviews within the Data collection page). The operation of the Computer Assisted Instrument (CAI) itself, and the built-in checks within it, ensure that data recording standards are maintained. Respondent perception of the personal characteristics of the interviewer can also be a source of error, as the age, sex, appearance or manner of the interviewer may influence the answers obtained.

Non-response bias

Non-response may occur when people cannot or will not cooperate in the survey, or cannot be contacted by interviewers. Non-response can introduce a bias to the results obtained insofar as non-respondents may have different characteristics and behaviour patterns in relation to their health from those of persons who did respond. The magnitude of the bias depends on the extent of the differences and the level of non-response.

The 2012-13 NATSIHS and NATSINPAS achieved overall response rates of 80.2% and 79.2% respectively (fully or adequately responding households, after sample loss). Data to accurately quantify the nature and extent of the differences in health characteristics between respondents in the survey and non-respondents are not available. Under- or over-representation of particular demographic groups in the sample is compensated for at the State, section of State, sex and age group levels in the weighting process. Other disparities are not adjusted for.

Households with incomplete interviews were treated as fully responding for estimation purposes where the only questions that were not answered were legitimate 'don't know' or refusal options, or any or all questions on income, or where weight and height were not obtained.
These non-response items were coded to 'not stated'. To improve the sample representativeness of the surveys generally, as well as for the biomedical component (NATSIHMS) and the second day dietary recall and pedometer component (in NATSINPAS), households that contained at least one fully responding person but also contained a selected adult or child who did not respond to the survey were also retained and considered adequately responding. This process added 371 and 180 households respectively to the NATSIHS and NATSINPAS samples. Note that this is consistent with the approach taken for previous ABS Aboriginal and Torres Strait Islander surveys.

Processing errors

Processing errors may occur at any stage between the initial collection of the data and the final compilation of statistics. These may be due to a failure of computer editing programs to detect errors in the data, or may occur during the manipulation of raw data to produce the final survey data files, for example in the course of deriving new data items from raw survey data, or during the estimation procedures or weighting of the data file. To minimise the likelihood of these errors occurring, a number of quality assurance processes were employed.
OTHER FACTORS AFFECTING ESTIMATES

In addition to data quality issues, a number of general and topic-specific factors should be considered in interpreting the results of this survey. The general factors affect all estimates obtained, but may affect topics to a greater or lesser degree depending on the nature of the topic and the uses to which the estimates are put. This section outlines these general factors. Additional issues relating to the interpretation of individual topics are discussed in the topic descriptions provided in other sections of this Users' Guide.

Scope

The scope of the survey defines the boundaries of the population to which the estimates relate. The most important aspect of the survey scope affecting the interpretation of estimates from this survey is that institutionalised persons (including inpatients of hospitals, nursing homes and other health institutions) and other persons resident in non-private dwellings (e.g. hotels, motels, boarding houses) were excluded from the survey.

Coverage

The NATSIHS and NATSINPAS include all geographic areas except migratory. Due to the close timing of the NATSIHS to the National Health Survey (NHS) and the National Nutrition and Physical Activity Survey (NNPAS), the latter two surveys excluded all persons living in very remote areas and discrete Aboriginal and Torres Strait Islander communities, as well as a small number of persons living within SA1s that include Aboriginal and Torres Strait Islander communities. These exclusions were made to minimise overlap of the Aboriginal and Torres Strait Islander household sample of both the NHS and NATSIHS and the NNPAS and NATSINPAS.

Undercoverage

UNDERCOVERAGE, by state or territory
Of the national rate, 6% is due to planned frame exclusions and overlap with the Monthly Population Survey, where analysis has shown that the impact of any bias is minimal. More information on these exclusions is provided below.

Undercoverage may occur due to a number of factors, including:
Each of these factors is outlined in more detail in the following paragraphs. To assist interpretation, a diagrammatical representation of the potential sources of undercoverage is shown below.

Frame exclusions

Frame exclusions were incorporated into the AATSIHS to manage the cost of enumerating areas with a small number of Aboriginal and Torres Strait Islander persons. For more information, see the Scope page of this Users' Guide.

Non-response

Non-response may occur when people cannot or will not cooperate, or cannot be contacted. Unit and item non-response by persons/households selected in the survey can affect both sampling and non-sampling error. The loss of information on persons and/or households (unit non-response) and on particular questions (item non-response) reduces the effective sample and increases both sampling error and the likelihood of incurring response bias.

To reduce the level and impact of non-response, the following methods were adopted in this survey:
In the AATSIHS non-response accounts for a portion of overall undercoverage. The two components of non-response were:
Non-identification as being of Aboriginal and/or Torres Strait Islander origin Non-identification of Aboriginal and Torres Strait Islander households during the screening process may have occurred due to:
Known undercoverage, due to other issues arising in the field, included sample being excluded due to:
Biomedical quality control and quality assurance

For participants living in non-remote areas, most of the AATSIHS blood and urine samples were collected at Sonic Healthcare collection clinics or via a home visit using standard operating procedures for phlebotomy collection. In some areas, other pathology service providers were used (including IMVS Pathology for regional areas in South Australia and Northern Territory); however, the same collection procedures were used. For participants living in remote areas, local temporary collection sites were set up in community areas and samples were collected by Aboriginal Health Workers and/or a Sonic Healthcare professional.

All samples were transported to a central Sonic Healthcare laboratory at Douglass Hanly Moir (DHM) Pathology in Sydney, Australia. Sample quality was monitored during remote transportation using temperature logging information. All samples were analysed at DHM on machines accredited by the National Association of Testing Authorities (NATA).

DHM conducted Internal Quality Control (QC) analysis for all the instruments used to analyse the AATSIHS blood and urine samples, and the results were reported to the ABS. Periodic analysis of external Quality Assurance (QA) samples provided by the Royal College of Pathologists of Australasia (RCPA) was conducted at DHM, with results independently assessed against set targets. The ABS monitored the analysis and delivery of results through key performance indicators which met contractual agreements with Sonic Healthcare.

The results from the QC and QA reports indicate that the accuracy and precision of instruments used to analyse the AATSIHS samples fell within expected limits against set targets. More information on the quality assurance methods and procedures can be found in the Biomedical Measures chapter of this Users' Guide.