6401.0.60.004 - Information Paper: An Implementation Plan to Maximise the Use of Transactions Data in the CPI, 2017
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 16/06/2017  First Issue

METHODS FOR COMPILING TRANSACTIONS DATA


BACKGROUND

2.1 Multilateral methods possess a number of desirable qualities, both theoretical and practical, to produce temporal price indexes from transactions data. This section details the practical and methodological decisions for aggregating transactions data in four sub-sections: aggregation structure, multilateral method, extension method and multilateral window length. This publication returns to the framework established in the Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) which linked the ABS Data Quality Framework (DQF) to six main criteria for an NSO to evaluate different multilateral methods (Table 2.1).

Table 2.1: Framework for assessing multilateral methods

Consideration | Quality dimensions

Resources: does this method help facilitate more effective use of human and information resources? | Institutional Environment, Timeliness
Theoretical properties: what conceptual properties does the index method have, and how well do these align with the CPI purpose? | Accuracy
Transitivity: to what extent is the index transitive? | Accuracy, Coherence
Characteristicity: to what extent are price comparisons relevant to the time periods being compared? | Accuracy, Relevance
Flexibility: what scope is there to use or adapt the method for new statistical products or data sources? | Coherence, Institutional Environment
Interpretability: how easy is it to understand the method and the price movements it calculates? | Interpretability



2.2 For an NSO to implement new methods using transactions data, it is necessary to describe how changes will be harmonised with existing data sources and methods used in the CPI. At the lowest level, an NSO must define a homogeneous item and how unit values will be aggregated across time, retailer and region. At a higher level, elementary aggregation must occur, where prices are combined to produce price indexes that are then combined with other components in the CPI. Operational decisions about how multilateral methods will be implemented into the wider CPI collection are discussed below in the aggregation structure sub-section.

2.3 The ABS has previously conducted research into four multilateral methods for implementation into the CPI. Empirical findings showed that different multilateral methods typically produced similar results in practice, which is consistent with other research findings (Ivancic, Fox and Diewert 2011; Chessa, Verburg and Willenborg 2017). The multilateral method sub-section will detail the preferred method for the Australian CPI based on the framework described in Table 2.1.

2.4 Practical challenges exist when applying multilateral methods in a production setting. When a multilateral index is extended by an additional period (e.g. quarter), previous price movements are revised, which is unacceptable for NSOs. To deal with this revisions problem, the ABS will implement an extension method to compile the CPI which is described in the extension methods sub-section.

2.5 Finally, the decision to implement a multilateral method requires an NSO to specify the number of time periods used for each set of price comparisons. Most research has recommended a minimum of one year plus one period (i.e. five quarters) to account for seasonal availability of products. The multilateral window length sub-section will detail the preferred estimation window size for the Australian CPI.


AGGREGATION STRUCTURE

Product definition

2.6 The definition of a homogeneous product where the calculation of a unit value occurs will largely remain consistent with current practices in the CPI. The ABS will continue to define products using product classifications provided by Australian proprietors known as the stock keeping unit (SKU). The unit value will continue to be calculated using expenditure and quantity information across all stores from the same proprietor for each capital city in Australia (e.g. Company 1 for Sydney).

2.7 The unit value will be calculated on a quarterly frequency to align with the publication frequency of the Australian CPI. This differs slightly from current CPI practice, where unit values are derived at both monthly and quarterly frequencies for practical reasons (e.g. consistency with other modes of collection). Research has shown that the unit value calculation should align with the publication frequency of the CPI (Diewert, Fox and de Haan 2016).
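The unit value calculation described in paragraphs 2.6 and 2.7 can be sketched as follows. This is a minimal illustration only: the record layout (the keys quarter, proprietor, city, sku, expenditure and quantity), the function name and the sample figures are assumptions made for the example, not the ABS production system.

```python
from collections import defaultdict


def unit_values(transactions):
    """Return {(quarter, proprietor, city, sku): unit value}.

    Each transaction is a dict with the hypothetical keys
    quarter, proprietor, city, sku, expenditure and quantity.
    """
    # key -> [total expenditure, total quantity] across all stores
    totals = defaultdict(lambda: [0.0, 0.0])
    for row in transactions:
        key = (row["quarter"], row["proprietor"], row["city"], row["sku"])
        totals[key][0] += row["expenditure"]
        totals[key][1] += row["quantity"]
    # Unit value = total expenditure / total quantity; cells with no
    # quantity sold are skipped because no unit value can be formed.
    return {k: e / q for k, (e, q) in totals.items() if q > 0}


# Invented figures: two stores of the same proprietor in the same city.
sample = [
    {"quarter": "2017Q1", "proprietor": "Company 1", "city": "Sydney",
     "sku": "A", "expenditure": 200.0, "quantity": 100.0},
    {"quarter": "2017Q1", "proprietor": "Company 1", "city": "Sydney",
     "sku": "A", "expenditure": 90.0, "quantity": 50.0},
]
uv = unit_values(sample)
```

Pooling expenditure and quantity before dividing means stores with larger sales volumes contribute proportionally more to the unit value than a simple average of store prices would allow.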

2.8 The calculation of the unit value should occur across products that are considered equivalent from the perspective of a consumer. Research by other NSOs has shown that matched model multilateral indexes can have a downward bias if price increases are missed when the same item is ‘relaunched’ using a different product identifier (Chessa 2016). The issue of relaunches is a known problem when identifying products using barcodes for certain commodities, while the choice of a broader product definition such as SKU (which is an aggregation of multiple barcodes) should mitigate this problem. The ABS will continue to monitor the suitability of defining products using the SKU.


Elementary aggregation

2.9 Following the definition of a product, elementary aggregation (i.e. aggregating prices to form price indexes) can be performed using a multilateral method. The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) presented empirical evidence using a modified aggregation structure that aggregated prices directly to the EC level for each respondent using a multilateral method. Peer review received by the ABS highlighted that some ECs contain relatively heterogeneous items, and that performance of the multilateral method could be improved by compiling multilateral methods below the EC level. Compiling multilateral methods below the EC level is consistent with practices adopted by other NSOs (Dalén 2017; Chessa 2016).

2.10 The ABS has conducted further research into compiling multilateral methods below the EC level using respondent classifications provided within transactions datasets. Figure 2.1 details an aggregation structure for implementation in the CPI, which uses respondent classes as elementary aggregates (EAs) when these are available from transactions datasets. The Törnqvist index formula will be used to aggregate respondent EAs together to compile ‘Respondent x EC’ price indexes in order to capture changes in consumer expenditure patterns over time. ‘Respondent x EC’ indexes will be weighted by expenditure (market) share using the Lowe Index formula, with weights being reviewed on an annual basis using both transactions and other data sources.
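The two aggregation steps in paragraph 2.10 can be illustrated with a short sketch. It assumes that each respondent EA already has a price relative between two periods and an expenditure figure for each period, and that each ‘Respondent x EC’ index has a fixed expenditure weight. The function names, dictionary layout and numbers are hypothetical.

```python
import math


def tornqvist(relatives, exp0, exp1):
    """Törnqvist index across EAs.

    relatives: EA -> price relative between the two periods
    exp0, exp1: EA -> expenditure in each period (same keys as relatives)
    """
    tot0, tot1 = sum(exp0.values()), sum(exp1.values())
    log_index = 0.0
    for ea, rel in relatives.items():
        # Weight each log price relative by the average expenditure share.
        avg_share = 0.5 * (exp0[ea] / tot0 + exp1[ea] / tot1)
        log_index += avg_share * math.log(rel)
    return math.exp(log_index)


def lowe(indexes, weights):
    """Fixed-weight arithmetic average of component indexes."""
    total = sum(weights.values())
    return sum(weights[r] * indexes[r] for r in indexes) / total


# Invented figures: two EAs for one respondent, then two respondents.
respondent_ec = tornqvist({"EA1": 1.10, "EA2": 1.00},
                          {"EA1": 50.0, "EA2": 50.0},
                          {"EA1": 50.0, "EA2": 50.0})
ec_index = lowe({"R1": 1.04, "R2": 1.00}, {"R1": 60.0, "R2": 40.0})
```

The Törnqvist step uses expenditure shares from both periods, so it responds to shifts in spending between EAs, while the Lowe step holds the respondent weights fixed until they are next reviewed.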

Figure 2.1 Aggregation structure


2.11 The aggregation structure in Figure 2.1 produced very similar time series compared to aggregation direct to the EC level as presented in the Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003), since both aggregation structures use expenditure weights at the product level. Despite a negligible empirical difference, the structure in Figure 2.1 is preferred as it ensures coherence with traditional elementary aggregation in the CPI whilst utilising each respondent’s expenditure information using the Törnqvist index formula.

2.12 The structure described in Figure 2.1 includes contributions from transactions data respondents only. This structure will be used to compile price indexes for 28 ECs (list provided in paragraph 2.20). The motivation to compile these ECs using transactions data only is based on evidence of high expenditure (market) share, as well as the resources required to maintain a high quality non-transactions data index component. Moving forward, the ABS will monitor the suitability of the ECs using transactions data only.


MULTILATERAL METHOD

2.13 As discussed in section one, multilateral methods offer advantages to NSOs where price indexes can be compiled using a census of all products whilst producing weighted price indexes that are free of chain drift. The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) conducted research into the following four methods for potential implementation into the Australian CPI (footnote 1). These methods were:
  • Weighted Time Product Dummy (TPD)
  • Geary-Khamis (GK)
  • Quality adjusted unit value using TPD (QAUV_TPD)
  • GEKS-Törnqvist(footnote 2)

2.14 One way to assess the accuracy/performance of multilateral methods is to evaluate them against a set of desirable properties. This is known as the test approach. The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) identified that no multilateral method passed all the tests proposed by Diewert (1999) and Balk (1996, 2001), meaning that the importance placed on each test would dictate the preferred multilateral method.

2.15 The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) also assessed multilateral methods from the economic approach to index numbers. This approach assumes consumers optimise their basket of purchases to minimise cost for a given level of utility. It identified that the GEKS-Törnqvist method is exact for a "flexible" functional form - that is, it expresses the price differences experienced by optimising consumers without imposing restrictive assumptions about how they can substitute between products (Diewert 1999). In contrast, the GK method and other additive methods are consistent only with either perfect substitution or perfect non-substitution, so they may suffer from substitution bias if consumer preferences are more complex. At the time of the information paper, the TPD and QAUV_TPD methods had not been assessed rigorously from the economic approach.

2.16 Recent work by Diewert and Fox (2017) has assessed the TPD, GK, QAUV_TPD and the GEKS-Törnqvist from the economic approach to index numbers. This work established that the TPD is an approximately additive method that is consistent with linear and Cobb-Douglas preferences, while the QAUV_TPD shares the same economic assessment as the GK method. Using a simulated dataset to mimic the characteristics observed in transactions data, the authors show that in certain circumstances the TPD and GK can diverge from indexes that are free from substitution bias.

2.17 In terms of the other criteria described in Table 2.1, all four multilateral methods have the flexibility to deal with different types of data sources (e.g. data without weighting information, data with characteristic information). With respect to interpretability, the GEKS-Törnqvist has a slight advantage in that it is based on traditional price index theory - as the multilateral movements are derived by combining superlative bilateral indexes.
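As a rough illustration of how GEKS-Törnqvist movements are derived by combining bilateral indexes, the sketch below computes a matched-model Törnqvist index for every pair of periods in a window and takes the geometric mean of the indirect comparisons through each link period. The data layout (product mapped to a price and quantity pair), the function names and the figures are assumptions for the example and do not describe the ABS production implementation.

```python
import math


def tornqvist(p0, p1):
    """Bilateral Törnqvist over products present in both periods.

    p0 and p1 map product -> (price, quantity).
    """
    matched = [k for k in p0 if k in p1]
    e0 = {k: p0[k][0] * p0[k][1] for k in matched}
    e1 = {k: p1[k][0] * p1[k][1] for k in matched}
    t0, t1 = sum(e0.values()), sum(e1.values())
    return math.exp(sum(
        0.5 * (e0[k] / t0 + e1[k] / t1) * math.log(p1[k][0] / p0[k][0])
        for k in matched))


def geks_tornqvist(window):
    """GEKS index levels relative to the first period of the window."""
    n = len(window)
    levels = []
    for t in range(n):
        # Compare period 0 with period t indirectly via every link period l,
        # then take the geometric mean of those n indirect comparisons.
        log_sum = sum(
            math.log(tornqvist(window[0], window[l])
                     * tornqvist(window[l], window[t]))
            for l in range(n))
        levels.append(math.exp(log_sum / n))
    return levels


# Invented figures: product A goes on sale in period 1, and period 3
# repeats the prices and quantities of period 0.
window = [
    {"A": (2.0, 10.0), "B": (1.0, 20.0)},
    {"A": (1.0, 40.0), "B": (1.0, 18.0)},
    {"A": (2.0, 2.0), "B": (1.1, 19.0)},
    {"A": (2.0, 10.0), "B": (1.0, 20.0)},
]
levels = geks_tornqvist(window)
```

Because period 3 repeats period 0 exactly, the GEKS level for period 3 returns to 1, which is the behaviour expected of a transitive index within a single window.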

2.18 Testing the different multilateral methods in practice, the ABS found little difference and no clear indication of substitution bias for the TPD, GK and QAUV_TPD. The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) found evidence that when the multilateral methods temporarily diverged, it was due to the GEKS-Törnqvist's use of average matched expenditure shares to weight the importance of products. Other authors (Chessa, Verburg and Willenborg 2017; Diewert 2013) have drawn similar conclusions, demonstrating that the GEKS-Törnqvist can be sensitive to products that have periods with atypical prices and very small quantities (clearance prices). In these instances, the other multilateral methods may have an advantage over the GEKS-Törnqvist.

2.19 Weighing up the above considerations, the ABS's preferred method for compiling price indexes using transactions data is the GEKS-Törnqvist. While the different multilateral methods produce similar results, the two main criteria that differentiate the GEKS-Törnqvist from the other multilateral methods are its theoretical properties (economic approach to index numbers) and interpretability (based on bilateral index number theory). To remedy the sensitivity of the GEKS-Törnqvist to products with atypical prices and small quantities (clearance prices), the ABS will refine its methods to detect and exclude these products from index compilation. The exclusion of products at clearance prices is consistent with current practices adopted in the CPI.

2.20 The list below details the ECs which will use the GEKS-Törnqvist as the aggregation method where transactions data are available. These 28 ECs account for approximately 17 per cent of the CPI weight as of March quarter 2017.

ECs using multilateral methods:
  • Beef and veal
  • Bread
  • Breakfast cereals
  • Cakes and biscuits
  • Cheese
  • Cleaning and maintenance products
  • Coffee, tea and cocoa
  • Eggs
  • Fish and other seafood
  • Food additives and condiments
  • Fruit
  • Ice cream and other dairy products
  • Jams, honey and spreads
  • Lamb and goat
  • Milk
  • Oils and fats
  • Other cereal products
  • Other food products n.e.c.
  • Other meats
  • Other non-durable household products
  • Personal care products
  • Pets and related products
  • Pork
  • Poultry
  • Snacks and confectionery
  • Tobacco
  • Vegetables
  • Waters, soft drinks and juices


EXTENSION METHOD

2.21 When multilateral methods are used to produce a temporal index, each bilateral price comparison depends on prices observed in other periods of the multilateral comparison window. As a result, incorporating a new period into the multilateral comparison window may revise previous price indexes, which is unacceptable for CPI purposes. To resolve this, researchers have developed methods for using the latest multilateral index incorporating the latest data to update the published index series.

2.22 The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) considered four methods for extending the index series. These can be characterised into the following two groups (footnote 3):
  • The direct (annual) extension method (footnote 4) proposed by Chessa (2016). This involves extending the multilateral estimation window from some (annually) fixed base period as each new period becomes available, and using the price change between the base period and the new period to extend the series.
  • Rolling window methods inspired by Ivancic, Diewert and Fox (2011), which all involve calculating a new multilateral index using a window of fixed length as each new period becomes available. Having chosen some splice period common to the current and previous windows, the series is extended using the ratio of the price change between the splice period and the current period (using the current window) and the price change between the splice period and the previous period (using the previous window). Choosing the splice period to be the previous period yields a movement splice (Ivancic, Diewert and Fox 2011); choosing the start of the current window yields a window splice (Krsinich 2016); choosing the midpoint of the current window yields a half splice (de Haan 2015). Algebraically, the published index movement from the previous period (t -1) to the current period (t) can be expressed as:

$$ \frac{I^{t}}{I^{t-1}} = \frac{P_{C}^{s,t}}{P_{P}^{s,t-1}} $$

where:

$I^{t}$ = published index level in period t

$P_{C}^{s,t}$ = price movement between the splice period s and the current period t based on the current multilateral window

$P_{P}^{s,t-1}$ = price movement between the splice period s and the previous period t-1 based on the previous multilateral window
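The rolling window splices in paragraph 2.22 can be sketched in a few lines. The sketch assumes that the index levels for the previous and current windows have already been calculated with a multilateral method and are held in two lists of equal length, with the current window shifted forward by one period. The function names and the index levels are hypothetical.

```python
def splice_movement(prev, curr, j):
    """Published movement from t-1 to t using splice position j.

    prev: index levels for periods t-T-1 .. t-1 (previous window)
    curr: index levels for periods t-T .. t     (current window)
    j:    position of the splice period within curr (0 .. T-1);
          the same period sits at position j + 1 within prev.
    """
    T = len(curr) - 1
    if len(prev) != T + 1 or not 0 <= j < T:
        raise ValueError("windows must have equal length and 0 <= j < T")
    # Movement splice period -> t in the current window, divided by the
    # movement splice period -> t-1 in the previous window.
    return (curr[T] / curr[j]) / (prev[T] / prev[j + 1])


def extend(published_last, prev, curr, method="half"):
    """Extend the published series by one period with the chosen splice."""
    T = len(curr) - 1
    j = {"movement": T - 1, "window": 0, "half": T // 2}[method]
    return published_last * splice_movement(prev, curr, j)


# Invented index levels for two overlapping five-period windows.
prev = [100.0, 102.0, 101.0, 103.0, 104.0]
curr = [100.0, 99.5, 101.5, 102.5, 105.0]
movement = splice_movement(prev, curr, 3)  # splice on t-1
window = splice_movement(prev, curr, 0)    # splice on start of window
half = splice_movement(prev, curr, 2)      # splice on midpoint
```

When the current window implies exactly the same movements over the overlap as the previous window, all three choices give the same answer. They differ only to the extent that the new period causes earlier movements to be implicitly revised.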

2.23 The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) found that the indexes extended using the direct method can be influenced by the choice of link month. Although these indexes seem plausible in the long term, their price movements soon after the link period are based on only a few periods of data, which can make them more volatile than index movements later in the window. Lamboray (2017) suggests a hybrid fixed-base rolling window approach which may address this issue; this is an area for further research. For the moment, however, the ABS has decided not to adopt the direct (annual) extension method.

2.24 The Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) compared rolling window splicing methods and found that the movement, window and half splice indexes were often similar. On balance, however, the half splice indexes were most often (but not always) the closest to the reference "full" index which incorporated no splicing, with other splicing methods displaying a degree of drift.

2.25 A likely cause of this drift is systematic price changes immediately after products appear or before they disappear that reflect the product's age rather than actual inflation, for instance where products tend to appear at prices that are (in retrospect) higher than normal or disappear at prices that are lower than normal. This can result in downward quality adjustment bias. As Krsinich (2016) and de Haan (2015) argue, the window splice mitigates this bias for new products by implicitly revising the contributions of new products to the index as more of their prices become available. However, this implicit revision makes the window splice sensitive to the price changes of disappearing products, as the current window has less information about their normal prices than the previous window. Conversely, the movement splice mitigates quality adjustment bias for disappearing products but will be sensitive to new products. The half splice is a reasonable compromise between the two.

2.26 Since the Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003), Diewert and Fox (2017) have tested a "mean splice" rolling window method - initially proposed by Ivancic, Diewert and Fox (2011) - which involves extending the index using the geometric mean of the indexes produced from all possible choices of splice period. Using the notation above, the mean splice extension can be expressed algebraically as

$$ \frac{I^{t}}{I^{t-1}} = \prod_{s=t-T}^{t-1} \left( \frac{P_{C}^{s,t}}{P_{P}^{s,t-1}} \right)^{1/T} $$

where $I^{t}$ is the published index level in period t, $P_{C}^{s,t}$ is the price movement between splice period s and t based on the current window, $P_{P}^{s,t-1}$ is the price movement between s and t-1 based on the previous window, and the multilateral window length is T+1 periods, so the current and previous windows overlap between t-T and t-1. It can be shown that the mean splice effectively makes a small implicit revision to price movements early in the current window and a large implicit revision to price movements later in the current window. This mitigates the effect of both new and disappearing products, similar to the half splice.
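The mean splice can be sketched in the same way, as the geometric mean of the spliced movements over every period the two windows share. As before, the two lists of window index levels are assumed to have been calculated already, and the function name and figures are hypothetical.

```python
import math


def mean_splice_movement(prev, curr):
    """Geometric mean of the spliced movements over all overlap periods.

    prev: index levels for periods t-T-1 .. t-1 (previous window)
    curr: index levels for periods t-T .. t     (current window)
    """
    T = len(curr) - 1
    if len(prev) != T + 1:
        raise ValueError("windows must have equal length")
    # Position j in curr and position j + 1 in prev are the same period.
    logs = [
        math.log((curr[T] / curr[j]) / (prev[T] / prev[j + 1]))
        for j in range(T)
    ]
    return math.exp(sum(logs) / T)


# Invented index levels for two overlapping five-period windows.
prev = [100.0, 102.0, 101.0, 103.0, 104.0]
curr = [100.0, 99.5, 101.5, 102.5, 105.0]
mean_splice = mean_splice_movement(prev, curr)
```

Averaging over all T splice periods removes the need to nominate a single splice period, so the result always lies between the smallest and largest of the single-period spliced movements.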

2.27 Empirical testing suggests that the half and mean splice methods produce comparable indexes. A typical example is shown in Figure 2.2 below which compares GEKS-Törnqvist indexes with different splicing methods using a nine quarter estimation window. Since these results are at the respondent level, all splicing methods have been standardised (i.e. period 0 corresponds to an index level of 100), and are then expressed relative to the mean splice index (e.g. Movement splice (MS) = MS index less mean splice index). The results show that the mean and half splice (HS) are the closest in proximity (within 0.5 index points), while the movement (MS) and window splice (WS) deviate from the mean splice by a larger amount in opposite directions. Figure 2.3 shows an example where the half splice (HS) deviates by a larger magnitude from the mean splice.

Figure 2.2: Cakes and biscuits EC


Figure 2.3: Fruit EC


2.28 In summary, the ABS will use the mean splice. This is motivated by several factors:
  • Conceptually, it seems more natural to make the results independent of the choice of splice period by using all the periods they have in common, rather than choosing a single splice period.
  • Empirically, the mean splice appears more robust - while the half splice mitigates systematic quality adjustment bias, choosing an alternative splice period close to the midpoint can give quite different results.
  • The mean splice has appealing properties in the long term - this is an area for further study.


MULTILATERAL WINDOW LENGTH

2.29 The decision to implement a multilateral method requires an NSO to specify the number of time periods used for price comparisons. Most research using rolling window approaches has recommended a minimum of one year and one period (i.e. five quarters, 13 months) to account for products' seasonal availability, though there is currently no consensus on the optimal length of the multilateral window.

2.30 The choice of multilateral window length is a trade-off between two criteria described earlier in Table 2.1 - characteristicity and transitivity. If the multilateral window is too long, the index could suffer from a loss of characteristicity, where price change in the past may disproportionately impact recent inflation estimates. If the multilateral window is too short, the index may suffer from the 'chain drift' problem. Empirical testing of different window sizes is necessary to assist with this decision.
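The chain drift problem mentioned in paragraph 2.30 can be seen in a small constructed example. The sketch chains period-on-period Törnqvist indexes through a temporary sale and compares the result with the direct comparison between the first and last periods, which have identical prices and quantities. The data layout, function name and figures are invented for illustration and are not drawn from ABS transactions data.

```python
import math


def tornqvist(p0, p1):
    """Bilateral Törnqvist over products present in both periods.

    p0 and p1 map product -> (price, quantity).
    """
    matched = [k for k in p0 if k in p1]
    e0 = {k: p0[k][0] * p0[k][1] for k in matched}
    e1 = {k: p1[k][0] * p1[k][1] for k in matched}
    t0, t1 = sum(e0.values()), sum(e1.values())
    return math.exp(sum(
        0.5 * (e0[k] / t0 + e1[k] / t1) * math.log(p1[k][0] / p0[k][0])
        for k in matched))


# Product A is discounted in period 1 and sells heavily, sells very little
# once the price returns in period 2, and is back to normal in period 3.
periods = [
    {"A": (2.0, 10.0), "B": (1.0, 20.0)},
    {"A": (1.0, 40.0), "B": (1.0, 18.0)},
    {"A": (2.0, 2.0), "B": (1.1, 19.0)},
    {"A": (2.0, 10.0), "B": (1.0, 20.0)},
]

# Chained index: multiply the period-on-period movements together.
chained = 1.0
for a, b in zip(periods, periods[1:]):
    chained *= tornqvist(a, b)

# Direct index: compare the first and last periods only.
direct = tornqvist(periods[0], periods[-1])
```

With these invented figures the chained index ends well below 1 even though prices and quantities in the last period are identical to those in the first, while the direct comparison shows no price change. The heavy expenditure weight on the discounted product when its price falls is not matched when its price recovers.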

2.31 Results presented in the Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) used a window size of two years and one period (i.e. nine quarters, 25 months) as the preferred window length - this was based on empirically testing various estimation windows compared to each other (as well as their proximity to different "full" price series). Empirical testing by the ABS showed that varying the length of the estimation window generally made little difference to the price series generated. When the series did diverge, the use of a shorter window (i.e. one year and one period) tended to display more downward (upward) drift if the series showed a decreasing (increasing) price trend. This publication continues to recommend a window size of two years and one period for the length of the multilateral window.

1 Multilateral Methods of the Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) provides a more detailed explanation of these multilateral methods.
2 This method is also known as CCDI attributed to the authors Caves, Christensen and Diewert (1982) and Inklaar and Diewert (2017).
3 Multilateral Extension Methods of the Information paper: Making Greater Use of Transactions Data to compile the Consumer Price Index (cat. no. 6401.0.60.003) provides a more detailed explanation of these extension methods.
4 This method is named the Fixed Base Moving Expansion (FBME) by Chessa, Verburg and Willenborg (2017).