1001.0 - Annual Report - ABS Annual Report, 2000-01  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 27/09/2001   
Section 2 - Special Articles, Chapter 3 - Research and Analysis in the ABS

INTRODUCTION

The ABS has a long history of significant investment in research and analysis of data. This investment is consistent with the legislation controlling the operations of the ABS, for example Section 6(1)(b) of the Australian Bureau of Statistics Act 1975 which states that the functions of the ABS include 'to collect, compile, analyse and disseminate statistics and related information'. Among the strategies in the latest ABS Corporate Plan is 'strengthening our analytical capability so that we can develop new measures from existing data, add value to the data we collect, and improve quality'.

The ABS pursues a diverse program of research into such matters as:

  • concepts and frameworks that can give the best statistical expression to important aspects of the Australian economy, society and environment;
  • ways of gathering data from households and businesses that deliver the most valuable information while imposing the lowest possible cost; and
  • emerging techniques and technologies that can enhance the value of statistical products or improve the efficiency of statistical processes.

In 1999, the ABS made a significant resource commitment to expanding and re-focussing its research and analysis capacity. This was in response to the increasing complexity of the demands of users and the availability to analysts of a wider range of datasets, both ABS and non-ABS.

Key aims of the expanded program include developing new socioeconomic indicators and other analytical products from existing datasets, and providing analytical services to the producers and users of socioeconomic data. This article concentrates on the analytical methods being developed and applied by the ABS to fill gaps in its suite of statistical products or to otherwise enhance the service that the ABS gives to its users.


UNDERSTANDING AND MEETING DECISION-MAKERS’ NEEDS

Through considering the potential demand for information in particular fields and carefully analysing the available datasets, both ABS and non-ABS, the ABS devises ‘information development plans’ that specify the statistical development activity to be undertaken. Important aspects to be addressed in such plans include the identification of:
  • Data Gaps - where important demands for statistics cannot be met immediately from existing data repositories. The ABS, in consultation with other stakeholders, assesses priorities, costs and benefits to decide which gaps should be filled, by whom and how. In some cases, the ABS might establish a new statistical collection or add a topic to an existing collection. But these are not the only ways of filling gaps in the suite of statistics; it may be possible to transform the raw contents of existing data repositories into statistical products. Often this entails the application of analytical techniques such as modelling to extract information from the raw data; and
  • Data Overlaps - where there are multiple sources of statistics bearing upon a given policy decision or research question. In these circumstances, the ABS might undertake a data confrontation project. The focus of such a project is to understand any differences between the statistical pictures painted by the multiple sources and, if possible, to construct estimates that best exploit the information that is latent in all the sources.

Finding analytical solutions to data gaps and overlaps is a key component of the ABS research and analysis program.


THE RESEARCH AND ANALYSIS PROGRAM

The current ABS research and analysis program addresses a broad range of issues, as outlined below.

New measures of socioeconomic concepts

Policy-makers and researchers appeal to a wide variety of socioeconomic concepts, not all of which can be measured directly in statistical surveys. In many cases, the survey data must be transformed or modelled to meet users’ needs. The ABS already publishes statistical measures for many concepts that arise in economic or social theory and underlie policy design - these include aggregate economic activity, productivity, inflation, income distribution, life expectancy, the energy intensity of production and so on.

Much of the ABS research and analysis program aims to expand the range of measures and to enhance the measures that are already published. Some projects currently underway are discussed below.

Human capital refers to the stock of knowledge and skills embodied in a nation’s people. The concept arises in theories of economic growth, educational choice and the labour market. At present, only proxy measures of human capital are available to Australian policy-makers and researchers - they can examine, for example, the numbers of people with particular educational qualifications. During 2001, the ABS has been developing experimental measures of the value of human capital for Australia. The key notion (one of several that might have been used) is that the economic value of individuals’ human capital can be expressed as the present-day value of the lifetime income streams that they can earn by applying their knowledge and skills. Initially, the main product of this research will be an aggregate time series which can be used to analyse trends and augment the existing estimates of Australia’s wealth. In the future, it may be possible to embed the human capital stock estimates in an integrated set of accounts that also display flows (such as migration, natural population growth and education) and link these with the labour accounts and other systems of socioeconomic data.
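
The lifetime-income notion can be illustrated with a small calculation. The sketch below is not the ABS method, which the article does not specify; it assumes a hypothetical person with a constant annual income, a fixed discount rate and a fixed number of remaining working years, all chosen purely for illustration.

```python
def present_value(incomes, discount_rate):
    """Discount a stream of future annual incomes back to today.

    incomes[0] is received one year from now, incomes[1] two years
    from now, and so on.
    """
    return sum(
        income / (1.0 + discount_rate) ** (year + 1)
        for year, income in enumerate(incomes)
    )


# Hypothetical person: 3 remaining working years at 50,000 a year,
# discounted at 5% a year (illustrative values only).
stream = [50_000.0] * 3
value = present_value(stream, 0.05)
print(round(value, 2))
```

An aggregate measure of the kind described would sum such values over the population, which is why the choice of income profiles and discount rate matters so much to the result.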

Productivity improvement is the growth in a nation’s output over and above that explained by growth in the inputs to production. Measures of productivity are important to an understanding of long term improvements in Australians’ living standards and to changes in such key variables as Australia’s international competitiveness. The ABS currently publishes estimates of multifactor productivity for the market sector of the economy (Cat. no. 5204.0) which take into account labour and capital inputs. In recent years, considerable research effort has been devoted to enhancing the quality of the estimates. For example, the ABS has developed better measures of output for service industries such as Finance, Property and Business Services and Health. The input estimates have also been improved - the productivity calculations now take account of the flow of capital services into production rather than the former proxy measure based on the stock of capital; and experimental estimates of labour inputs have been developed that incorporate adjustments for changes in the quality of labour. Future work will be focussed on exploring ways of constructing productivity estimates for individual industries, based on the annual series of input-output tables that is now compiled.
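
The idea of productivity growth as the part of output growth not explained by input growth can be sketched as a simple growth-accounting residual. The example below is a simplification, not the ABS compilation method: it assumes two inputs (labour and capital), fixed income shares that sum to one, and log differences as the measure of growth. All index values are hypothetical.

```python
import math


def mfp_growth(output, labour, capital, labour_share):
    """Log growth in output not explained by share-weighted input growth.

    output, labour and capital are (previous, current) pairs of index
    levels; labour_share is labour's assumed share of total income.
    """
    d_out = math.log(output[1] / output[0])
    d_lab = math.log(labour[1] / labour[0])
    d_cap = math.log(capital[1] / capital[0])
    capital_share = 1.0 - labour_share
    return d_out - labour_share * d_lab - capital_share * d_cap


# Hypothetical indexes: output up 4%, labour up 1%, capital services up 3%.
residual = mfp_growth((100.0, 104.0), (100.0, 101.0), (100.0, 103.0), 0.6)
print(f"{residual:.4f}")
```

Because the residual absorbs any mismeasurement of output or inputs, improvements such as capital services measures and quality-adjusted labour inputs change the productivity estimate directly.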

Other projects related to measuring socioeconomic concepts include:
  • applying ‘hedonic’ methods to compile computer price indexes. These methods allow one to take account of the rapid changes in the features and power of computers; and
  • assessing the feasibility of compiling ‘spatial price indexes’ or some other way of comparing price levels in different locations. The Consumer Price Index allows one to compare the rates of price change in the eight capital cities, but does not allow a comparison of price levels. A key issue that arises when compiling spatial indexes is the dissimilarity of some goods and services purchased in different places (such as heaters in Hobart and air conditioners in Darwin).
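
One way to picture a hedonic adjustment for computers is an imputation: fit a relationship between price and a characteristic in the base period, then compare a new model's price with the base-period price predicted for the same characteristic. The sketch below assumes a single characteristic (processor speed) and a log-linear relationship; both are illustrative assumptions, and the data are hypothetical rather than ABS figures.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def quality_adjusted_relative(base_models, new_speed, new_price):
    """Price of a new model relative to the base-period price
    predicted for a machine of the same speed."""
    log_speeds = [math.log(speed) for speed, _ in base_models]
    log_prices = [math.log(price) for _, price in base_models]
    intercept, slope = fit_line(log_speeds, log_prices)
    predicted_base_price = math.exp(intercept + slope * math.log(new_speed))
    return new_price / predicted_base_price


# Hypothetical base-period models: (speed in MHz, price in dollars).
base = [(400, 1500.0), (500, 1800.0), (600, 2100.0)]
relative = quality_adjusted_relative(base, new_speed=800, new_price=2100.0)
print(f"{relative:.3f}")
```

A relative below 1 indicates a quality-adjusted price fall even though the sticker price of the top model is unchanged, which is the effect an unadjusted comparison would miss.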

Statistics for ‘small domains’

Statistics compiled at the whole-of-Australia level satisfy the needs of many decision-makers and researchers. But other users need data dissected by geographic areas (for example, States or regions), by subpopulations (for example, age-sex groups or household types) or by industry or other dimensions. The ABS already publishes a wide variety of disaggregated data - the outputs from the Census, for example, include estimates for small geographic areas and population subgroups; many economic survey estimates are dissected by State and industry; and many social survey estimates are dissected by State and population subgroup. It is not possible to run a census for all socioeconomic topics, owing to the prohibitive financial cost and the load on households and businesses that provide the data. While ABS sample surveys can deliver somewhat disaggregated estimates, there is an ever growing user demand for more and finer dissections.

Current research is ascertaining whether these user demands can be satisfied by applying modelling and other analytical techniques to existing datasets. Several projects are underway which aim to dissect published estimates across new dimensions or to generate finer dissections across existing dimensions. Two of these projects are discussed below.

Household wealth estimates are provided in the household balance sheet, which shows the value of the household sector’s net worth dissected by a number of assets and liabilities, such as dwellings, shares and loans (Cat. no. 5204.0). There is considerable interest in understanding how wealth is distributed across household types. This includes questions about whether wealth is more (or less) evenly distributed than income, and how the composition of assets and liabilities evolves during households’ life cycles. For some years, the ABS has been working toward a more integrated suite of income and wealth statistics (Cat. no. 6549.0). The ABS runs regular surveys of households’ incomes - but not of households’ wealth, owing to the cost, the burden on data providers and concerns about the quality of the data that would be obtained. During 2001-02, the ABS will be developing experimental estimates of household wealth dissected by household type using a range of indicators such as the stock indicators that are available for some assets and liabilities and the income/expense flow indicators that are available for other assets and liabilities. In the future, these estimates may be used to refine and extend the household balance sheet, and may be integrated with existing estimates of household income distribution.

Crime statistics are compiled from a variety of sources, including the National Crime and Safety Survey (NCSS) which presents estimates of crime victimisation and other variables at the State level (Cat. no. 4509.0). There is considerable interest in analysing the pattern of crime victimisation for smaller areas, but the NCSS sample is not large enough to support further geographic disaggregations. During 2001, the ABS is experimenting with model-based approaches for estimation of small area statistics. One procedure being investigated is to model the relationship between crime victimisation (from NCSS) and the characteristics of people and locations (from the Census) at a fairly coarse level of geography, and then to apply that model to the Census data at a finer level of geography.
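
The two-step procedure described above can be sketched with a simple synthetic estimator. The article does not say what form the ABS model takes; the example assumes that victimisation rates vary only by age group, that the rates estimated at the State level apply unchanged in every small area, and that all figures are hypothetical.

```python
def synthetic_estimate(coarse_rates, fine_populations):
    """Apply coarse-level rates per group to a small area's population
    counts by group, returning the implied overall rate for the area."""
    expected_victims = sum(
        coarse_rates[group] * count
        for group, count in fine_populations.items()
    )
    total_population = sum(fine_populations.values())
    return expected_victims / total_population


# Hypothetical State-level victimisation rates by age group (survey-based).
state_rates = {"15-24": 0.10, "25-54": 0.05, "55+": 0.02}

# Hypothetical Census counts by age group for two small areas.
area_a = {"15-24": 500, "25-54": 300, "55+": 200}
area_b = {"15-24": 100, "25-54": 300, "55+": 600}

rate_a = synthetic_estimate(state_rates, area_a)
rate_b = synthetic_estimate(state_rates, area_b)
print(rate_a, rate_b)
```

The small-area estimates differ only because the areas' age profiles differ, which shows how strongly such estimates depend on the assumed relationship holding at the finer level of geography.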

Both the wealth and the crime estimates rely on assumptions about the relationships between variables. Some of these assumptions can be tested, but for others only broad plausibility checks are possible. Thus it will be important that the models, assumptions and experimental findings of these and similar projects are subjected to stringent peer review before the estimates can be regarded as anything other than experimental.

More generally, modelling the relationships between sample survey data and a Census or administrative databank may be a fruitful method of estimating for small domains. Similar strategies may help the ABS address the needs of users who want estimates at higher frequencies than are published at present (such as monthly rather than quarterly, or annual rather than triennial).

Using alternative data sources

Traditionally, national statistical agencies such as the ABS have relied largely on their own censuses and sample surveys when compiling economic and social statistics. Such direct collections will remain a major element of ABS operations. However, other government agencies and businesses are accumulating large databanks that potentially have considerable value for statistical purposes. These databanks provide an opportunity to replace or supplement directly collected data. Alternatively they might allow the extension of existing estimates to smaller geographic areas or subpopulations or to more frequent time periods. There may also be the opportunity to use such data to support the measurement of socioeconomic concepts for which a direct collection would be too difficult or expensive.

Before by-product data can be used to generate ABS statistical products, many questions about analytical methods and quality must be addressed. The questions that arise in a few exploratory projects are discussed below.

Scanner datasets record the prices and quantities of goods purchased through certain outlets, such as supermarkets. In the short to medium run, ABS analyses are concentrating on using the datasets to model and refine current compilation practices for the Consumer Price Index (CPI), such as the selection of outlets and commodities for which CPI price collectors gather data and the index formulae applied at various levels of aggregation. In the longer term, the possibility of using scanner data directly in the compilation of price indexes or indicators of the value and volume of business activity may be considered. Before any such direct use could be contemplated, the ABS would have to achieve a thorough understanding of such issues as: what proportions of outlets, commodities and transactions are covered by the scanner data and how the coverage may vary over time; whether the barcodes are assigned in such a way as to permit consistent tracking of commodities; and what costs may be entailed by acquiring and processing the very large scanner datasets.

Administrative databanks have been assembled by many government agencies. Their main purpose is to assist management of the agencies’ business and customers, although some agencies such as the Australian Taxation Office and Centrelink are now extracting performance information and statistics. The ABS has explored, and continues to explore, possibilities for using administrative by-product data to enhance the national statistical service. It may be possible, for example, to develop a better statistical picture of business demographics (such as patterns of business formation, growth, decline and closure) or of household income and labour dynamics (such as patterns of labour market experience, earnings and other sources of income). Statistics of this kind can be difficult and expensive to collect directly by survey. Before any such use of administrative data could be contemplated, there are a number of issues which would have to be addressed, including:
  • how the confidentiality of individuals’ data can be protected; and
  • how analyses can deal with the fact that the data may be partial (because the databank covers only the customers of an agency, not the whole population) or otherwise imperfect (because the variables in the databank were designed for business management purposes, not for statistical purposes).

Other projects in the research and analysis program, where consideration is being given to alternative data sources, include:
  • assessing the feasibility of deriving a more timely indicator of the pulse of economic activity from Electronic Funds Transfer Point-of-Sale (EFTPOS) and other by-product datasets;
  • using administrative data to estimate the value and volume of services (such as education, health and police services); and
  • using administrative data to enhance the range and quality of Indigenous statistics.

Over the years, the ABS has developed a large array of tools (mathematics, procedures and software) to analyse datasets collected through the Bureau’s own censuses and sample surveys. The question arises, however, whether those tools are the most appropriate for dealing with very large by-product datasets. Issues that are, or need to be, addressed include:
  • how might traditional models and methods have to change to deal with datasets that have not been assembled using ABS classifications, definitions and collection methods? For example, what methods are needed to assess the quality of the datasets (and especially to detect any drift in quality as time passes)?
  • how might traditional research strategies have to change? For example, might the bulk of the exploratory analyses be done on sampled datasets, and the preferred or final model be validated against the full dataset? and
  • what computing tools are needed to store, transport, browse and transform these very large (and rapidly growing) datasets?

As well as seeking to add value to (or extract the value latent in) particular by-product datasets available today, the ABS research program is gathering intelligence about possible future changes to the national data environment and to statistical systems.

Drawing statistical threads together

The ABS publishes a rich suite of statistical products describing major aspects of Australian life - the economy, society and environment. There is, however, a growing demand for statistical products that draw information together, regardless of source. Such ‘integrating’ statistical products help decision-makers and the community form a more comprehensive view of some aspect of life; they also help researchers analyse the interactions between key variables.

One important objective of the ABS analysis program is to develop products that draw statistical threads together. Areas of current research include:

Compendium publications present data relating to a whole field of national life. Prominent examples are Australian Social Trends (Cat. no. 4102.0), and Australia’s Environment: Issues and Trends (Cat. no. 4613.0). During 2001-02, the ABS is developing a new publication, Measuring Australia’s Progress (MAP), a major project that will present an array of indicators encapsulating key aspects of national progress, together with analyses of historical trends and linkages. A first issue of MAP is scheduled for release in April 2002. Publications of this kind do more than just repackage tables available in other ABS publications. They bring in data relating to an economic, social or environmental issue regardless of the source. They can also cast light on gaps, overlaps and inconsistencies in the data, and hence act as a spur to further statistical development activity.

Satellite accounts are a more systematic way of drawing statistical threads together. They are an adjunct to the Australian national accounts and can be used to:
  • highlight a particular aspect of economic life (such as tourism, which is not recognised as an industry in standard classifications);
  • display the results of different statistical treatments (such as household accounts, which treat the household as a producing, not just a consuming, unit; or income accounts adjusted for the depletion of natural resources);
  • analyse macro-micro links (such as dissected wealth matrices that link the aggregate balance sheet for the household sector with the distribution of wealth across types of household); and
  • analyse links between the economy and society and the environment (such as the link between the volume and composition of production and the emission of atmospheric pollutants).

Satellite accounts are a very powerful tool for analysis, but constructing them is expensive and time consuming. Key determinants in deciding whether to develop a new satellite account include: the strength of user demand; the availability of international standard concepts, frameworks and procedures; and the quantity and quality of data.

In late 2000, the ABS published a tourism satellite account for Australia, which was the culmination of extensive research and statistical analysis, and resulted in a product that presented aggregate activity, the supply and use of commodities and employment associated with tourism activities (Cat. no. 5249.0). On a similar basis the ABS has recently released Water Account for Australia (Cat. no. 4610.0), and Energy Accounts (Cat. no. 4604.0). The ABS has also developed experimental estimates of national income adjusted for the depletion and some discoveries of subsoil assets. Satellite accounts that are being developed or contemplated include: household wealth distribution; non-profit institutions; household production and consumption; and information technology and telecommunications.

Confronting multiple data sources is an important aspect of the ABS analysis program because it allows one to understand the nature of differences between the statistical pictures painted by the multiple sources, and in doing so it may also allow one to construct estimates that make best use of the information embodied in all the sources. During 2000-01, the ABS has undertaken several projects of this kind, for example:
  • experimental indexes of socioeconomic disadvantage for Indigenous areas. These indexes distil information available from the Census and some surveys (such as the National Aboriginal and Torres Strait Islander Survey) to provide summary comparisons of socioeconomic conditions in Aboriginal and Torres Strait Islander regions and some other Indigenous geographic areas. The experimental indexes are intended partly as a testbed for better understanding the nature of Indigenous data and the need for further statistical development. Later stages of the project may incorporate Indigenous-coded data available from administrative sources in some States - these will both test the validity of the experimental indexes and provide more comprehensive coverage of some aspects for which Census and survey data are rather thin (such as Indigenous health status); and
  • comparisons of survey vehicles for disability data. The ABS has used a variety of means to collect data on the incidence and severity of disability - ranging from the comprehensive Survey of Disability and Aged Care (SDAC) to smaller modules incorporated in surveys that focus on other topics (Cat. no. 4430.0). Given the multiple sources, the significant issues include: what is the most cost-effective way of collecting data on disability; and whether it is possible to combine the various survey results to assemble a time series of disability estimates.

Other projects of this kind include using datasets from multiple sources to better understand life-long patterns of both formal and informal education and training; and the melding of police and ABS survey data to analyse the prevalence of crime.
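
Where several sources estimate the same quantity, one textbook way to combine them is an inverse-variance weighted average. The sketch below is an illustration only: it assumes the sources are independent, unbiased and measure exactly the same concept, which are the very conditions a data confrontation project sets out to examine. The function name and the figures are hypothetical.

```python
def combine_estimates(estimates):
    """Inverse-variance weighted average of (estimate, standard_error) pairs.

    Returns (combined_estimate, combined_standard_error), assuming the
    sources are independent and unbiased.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    combined = sum(
        weight * estimate
        for weight, (estimate, _) in zip(weights, estimates)
    ) / total_weight
    return combined, (1.0 / total_weight) ** 0.5


# Two hypothetical survey estimates of the same rate (per cent),
# the second with twice the standard error of the first.
estimate, standard_error = combine_estimates([(10.0, 1.0), (20.0, 2.0)])
print(estimate, round(standard_error, 4))
```

The more precise source dominates the combined figure, and a large gap between sources relative to their standard errors is itself a signal that they may not be measuring the same thing.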


ISSUES REGARDING ANALYTICAL WORK

Emerging technical issues

Much of the ABS analytical program relies on traditional techniques from mathematical statistics, econometrics, time series analysis and other disciplines. However, some of the technical matters arising in the research and analysis program are novel (not having been addressed until fairly recently in the literature) or at least are new to the ABS (not yet having been applied to the development of the Bureau’s statistical products). Some of the technical issues that have emerged in the research program to date, and are relevant to a wide range of ABS work, are discussed below.

Analyses that take account of complex survey design. ABS surveys are designed to yield accurate estimates of headline variables (say, the unemployment rate or the quarterly change in retail sales) while keeping the cost and the load on data providers as low as possible; they are not usually designed to deliver, for example, sets of household or business data that are most convenient for modelling. To enhance the accuracy and reduce the cost of surveys, ABS mathematical statisticians apply a variety of techniques such as stratification and clustering of the sample. Thus the sampling designs for some ABS surveys can be quite complex. When it later comes to analysis, however, many standard techniques for fitting and testing models ignore the complex sample design - in effect, it is assumed that the data have been drawn by simple random sampling. This expedient can lead to invalid inferences about the explanatory power of one’s models.
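
The cost of ignoring clustering can be gauged with Kish's design effect approximation for equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the cluster size and rho the intraclass correlation. The sketch below uses hypothetical values and is not drawn from any particular ABS survey design.

```python
import math


def design_effect(cluster_size, intraclass_correlation):
    """Kish's approximation for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * intraclass_correlation


def adjusted_standard_error(srs_standard_error, deff):
    """Inflate a standard error computed as if the data came from
    simple random sampling."""
    return srs_standard_error * math.sqrt(deff)


# Hypothetical design: 10 persons per cluster, intraclass correlation 0.05.
deff = design_effect(10, 0.05)
naive_se = 0.020  # standard error computed under simple random sampling
print(round(deff, 2), round(adjusted_standard_error(naive_se, deff), 4))
```

Even a modest intraclass correlation makes the naive standard error too small, so significance tests that ignore the design will tend to overstate a model's explanatory power.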

Multilevel analyses. Some ABS projects are trying to estimate socioeconomic variables or analyse data patterns at multiple geographic levels (say, both States and statistical local areas) or for multiple units (say, both persons and households). The relationships between variables can be quite complex. For example, the probability of falling victim to a crime may be influenced both by the characteristics of individual people and by the characteristics of the areas in which they live. Moreover, the strength of the various influences may rise or fall as one changes the unit of analysis from individuals to households or as one moves from coarse to fine geography. For problems of this kind, so-called ‘multilevel-effects modelling’ may provide a natural framework in which to test hypotheses and develop new analytical products.

Analyses of huge datasets. The datasets being used in some analytical projects, especially the transactional and customer databanks, are very large. As discussed earlier, exploiting the statistical potential of such datasets may prompt some reconsideration of ABS research strategies, analytical techniques and software tools.

An important task for a national statistical agency is to find methodologies that both deliver defensible statistical products and are robust and economical enough to permit their application to satisfying a wide range of users’ needs. The ABS is building its knowledge by designing pilot projects that will exercise promising new analytical approaches and will deliver product prototypes for users’ scrutiny. Joint projects with researchers in universities and other organisations will also improve our understanding of the emerging issues.

Quality and validation of analytical products

Over the years, the ABS has developed an array of tools for monitoring, managing and enhancing the quality of estimates derived from its traditional data collections. For example, the Census includes a rigorous program of pre-enumeration testing and post-enumeration validation, and survey outputs are accompanied by statements regarding sampling and non-sampling errors. But quality management for analytical products can pose different and difficult problems.

First, many analytical products are derived from models or other elaborate transformations of data. The assumptions underlying the models may be contested by other analysts, or they may be inconsistent with the applications that some users have in mind. To ensure that its analytical methods are professionally defensible, the ABS submits its projects to peer review by experts in both the subject matter and the statistical techniques. To ensure that users understand the applications for which ABS analytical products have been developed (and applications which the products will not support), such products are accompanied by detailed information about the models or transformations that have been applied and the assumptions that have been invoked.

Second, the quality of the administrative and business by-product datasets being used in some analytical projects is not yet well understood. Some projects may be applying ABS survey data to purposes for which they were not originally designed. To assist users, the ABS is developing quality statements for its analytical products. The statements describe the sampling and non-sampling errors in the original datasets and how those errors may flow through the models and transformations to the analytical product; it may be possible in some cases to estimate confidence bounds around key transformed statistics. The quality statements also give users information about the consequences of the models themselves being inappropriate to the task at hand.
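
How sampling error flows through a transformation can be illustrated for the simplest case, a ratio of two survey estimates. The sketch below uses the first-order (delta method) approximation and assumes the two estimates are independent; both the function name and the figures are hypothetical, and a real quality statement would also need to address non-sampling error and model error.

```python
import math


def ratio_with_standard_error(x, se_x, y, se_y):
    """Ratio x / y and its approximate standard error, assuming the
    two estimates are independent (first-order approximation)."""
    ratio = x / y
    relative_variance = (se_x / x) ** 2 + (se_y / y) ** 2
    return ratio, abs(ratio) * math.sqrt(relative_variance)


# Hypothetical estimates: numerator 100 (SE 3), denominator 50 (SE 2).
ratio, se = ratio_with_standard_error(100.0, 3.0, 50.0, 2.0)
lower, upper = ratio - 1.96 * se, ratio + 1.96 * se
print(round(ratio, 3), round(se, 3), round(lower, 3), round(upper, 3))
```

The interval is an approximate 95 per cent bound under a normality assumption; for more elaborate models, replication or simulation methods may be needed instead.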

Third, most analytical products are initially released to users as ‘prototypes’ or ‘experimental statistics’. The intention is that the user community should have time to scrutinise and comment on the prototypes before they become part of the suite of official statistics. The ABS has found that this practice can elicit valuable suggestions for improving the content and presentation of new products.

Collaborative research and development

Developing a new analytical product can demand an understanding of abstruse concepts (such as human capital theory or environmental economics), sophisticated techniques (such as multilevel modelling or time series signal extraction) and refractory data. The ABS does not yet have the expertise or resources needed to cover all emerging areas of user demand; and unless some skill is likely to have fairly wide application to the Bureau’s work, it may not be sensible to develop in-house expertise. Thus, the ABS is keen to establish collaborative working relationships with analysts and users in government agencies, universities and other organisations. Such relationships may take a variety of forms, such as:
  • Joint projects. The project team would include members from both the ABS and a partner organisation. Joint projects are among the most valuable ways of pooling expertise and transferring knowledge. Before such an arrangement is established, issues to be addressed include: how to define a research topic that will both enhance the national statistical service and interest the partner organisation; how to protect data confidentiality; and how to negotiate rights over publications and other products flowing from the joint work; and
  • Project board membership and peer review. All substantial analytical projects in the ABS are governed by a project board. A partner organisation may provide board members who will oversee key decisions such as the scope and aims of the research project, the choice of data and methods, and quality standards. Alternatively the partner may provide members of the peer review panel that examines the prototype.


CONCLUSION

The analysis and research program of the ABS has already delivered a number of significant outcomes, such as the tourism satellite accounts and improvements in the quality of a number of existing products. However, the full impact of the significant increase in resources in this area will only begin to emerge in 2001-02, with the release of a number of major products including Measuring Australia’s Progress.

The ABS envisages a significantly increased role for research and analytical techniques in the future so as to meet the increasingly complex needs of users. Examples where the use of analytical techniques has the potential to fill gaps in the suite of statistical products include:
  • developing measures of concepts that are not susceptible to direct observation;
  • providing estimates for smaller areas or subpopulations or for more frequent time periods;
  • tapping the latent value of administrative and business databanks to enhance (or reduce the cost of) statistics; and
  • drawing together the statistical threads to encapsulate a major aspect of the Australian economy, society and environment.

If the ABS analysis program is to deliver the best possible value to users, it must be based on a thorough understanding of the decisions that users are making and the research that they are undertaking. It also demands an understanding of the quality (and other characteristics) of statistical products that are most crucial to supporting users’ activities. The ABS is keen to collaborate with analysts and other users to develop that understanding.


