DATA LINKING METHODOLOGY
The statistical linking methodology applied in this project is probabilistic linking (Fellegi & Sunter, 1969). This method links records from two datasets using several variables common to both. A key feature of the methodology is that it can combine a variety of linking variables and record comparison methods into a single numerical measure of how well two particular records match. This measure allows all possible links to be ranked and the link or non-link status to be assigned optimally (Solon & Bishop, 2009).
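As an illustration of how such a single numerical measure can be formed, the sketch below computes a Fellegi-Sunter style match weight for a pair of records. It is a minimal example under stated assumptions, not the implementation used in this project: the field names, the m- and u-probabilities (the probability that a field agrees given the records are a true match, and given they are not) and the use of exact agreement as the only comparison method are all hypothetical.

```python
import math

# Hypothetical (m, u) probabilities per linking variable, for illustration only:
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M_U = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postcode": (0.90, 0.02),
}


def field_weight(agrees, m, u):
    """Log-likelihood-ratio weight for one field comparison."""
    if agrees:
        return math.log2(m / u)            # positive: agreement favours a match
    return math.log2((1 - m) / (1 - u))    # negative: disagreement favours a non-match


def match_weight(rec_a, rec_b, m_u=M_U):
    """Sum the field weights into a single score for the record pair.

    Assumes exact comparison of each field; other comparison methods
    (e.g. approximate string matching) are not shown here.
    """
    return sum(
        field_weight(rec_a[field] == rec_b[field], m, u)
        for field, (m, u) in m_u.items()
    )


# Rank candidate pairs from most to least likely to be a match.
a = {"surname": "SMITH", "birth_year": 1970, "postcode": "2600"}
candidates = [
    {"surname": "SMITH", "birth_year": 1970, "postcode": "2600"},
    {"surname": "SMITH", "birth_year": 1970, "postcode": "2913"},
    {"surname": "JONES", "birth_year": 1985, "postcode": "3000"},
]
ranked = sorted(candidates, key=lambda b: match_weight(a, b), reverse=True)
```

Pairs that agree on more fields, and on more discriminating fields, receive higher weights, so sorting by the weight gives the ranking of possible links described above. Thresholds on the weight could then be used to assign link or non-link status.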
The probabilistic linking methodology used here can be summarised in the following steps: