Geostatistical Interpolation Introduction

Data typically cannot be collected over the entire study area (e.g., because of the cost of sampling and laboratory measurements), hence the need for spatial interpolation to fill in spatial gaps. Kriging was developed by the mathematician Georges Matheron, building on the Master’s thesis of D. G. Krige of South Africa. Kriging came into use in the 1960s, first in the mining industry, and has since become a valuable tool in geostatistics more generally.

Spatial interpolation capitalizes on the correlation between observations (green dots), as captured by the variogram, to make predictions at any unsampled location (blue dot, denoted u). The predicted value is typically calculated as a weighted mean of the nearest observations (e.g., data located within the search window depicted by the black circle) and, because of the spatial correlation, the observations closest to u (i.e., smallest distance h) tend to receive the largest weights. Interpolators differ mainly in how these weights are calculated (Goovaerts, 1997).

Screenshot
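The weighting idea described above can be sketched in a few lines of Python. This is a minimal illustration of ordinary kriging at a single location, not Vesta's implementation: the exponential variogram model, its parameters (nugget, sill, vrange), the function names, and the sample data are all assumptions made for the example.

```python
import numpy as np

def exp_variogram(h, nugget=0.0, sill=1.0, vrange=10.0):
    """Exponential semivariogram (an assumed model); gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / vrange))
    return np.where(h > 0, gamma, 0.0)

def ordinary_kriging(coords, values, u, radius=np.inf, **vg):
    """Predict at location u as a weighted mean of observations within `radius`."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    u = np.asarray(u, dtype=float)

    # Keep only the observations inside the search window around u.
    d_u = np.linalg.norm(coords - u, axis=1)
    keep = d_u <= radius
    coords, values, d_u = coords[keep], values[keep], d_u[keep]
    n = len(values)
    if n == 0:
        raise ValueError("no observations inside the search window")

    # Kriging system: variogram between data pairs, bordered by the
    # unbiasedness constraint (weights sum to 1).
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, **vg)
    A[n, n] = 0.0

    # Right-hand side: variogram between each observation and u.
    b = np.ones(n + 1)
    b[:n] = exp_variogram(d_u, **vg)

    weights = np.linalg.solve(A, b)[:n]  # last unknown is the Lagrange multiplier
    return float(weights @ values), weights

# Hypothetical observations (x, y) and measured values.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
values = [10.0, 20.0, 30.0]
estimate, weights = ordinary_kriging(coords, values, u=(0.2, 0.0))
print(estimate, weights)
```

In this sketch the weights depend only on the variogram and the data geometry, not on the measured values, which is why the choice of variogram model drives the result.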

Note that kriging requires that all observations have different geographical coordinates (i.e., no co-located data) to ensure that the kriging system is solvable. Data import checks for observations that share the same coordinates and, if any are found, issues a warning about potential problems in later data analysis.
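Co-located observations produce identical rows in the kriging matrix, which makes the system singular. A check of this kind could look like the sketch below; the function name `find_colocated` and the optional `tol` grid size are hypothetical, and this is not a description of how Vesta performs the check.

```python
from collections import defaultdict

def find_colocated(coords, tol=0.0):
    """Return groups of indices whose coordinates coincide.

    With tol > 0, coordinates are first snapped to a grid of cell size `tol`,
    so points falling in the same cell are reported together (an assumption
    made for this sketch, not an exact distance test).
    """
    groups = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (round(x / tol), round(y / tol)) if tol > 0 else (x, y)
        groups[key].append(i)
    return [idx for idx in groups.values() if len(idx) > 1]

# Hypothetical coordinates: observations 0 and 2 coincide, as do 1 and 4.
coords = [(0.0, 0.0), (1.0, 2.0), (0.0, 0.0), (3.0, 4.0), (1.0, 2.0)]
duplicates = find_colocated(coords)
if duplicates:
    print("Warning: co-located observations at indices", duplicates)
```

Common remedies once duplicates are found include averaging the co-located values or keeping only one of them; which is appropriate depends on the data.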

In addition to an estimated value, kriging provides a measure of prediction uncertainty in the form of the variance of prediction errors, known as the kriging variance.
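As an illustration, the expressions below give the estimate and the kriging variance for ordinary kriging, one common variant; the text above does not say which variant is meant, so this choice is an assumption. The symbols are introduced for this sketch: z(u_α) are the n observations, λ_α their weights, γ the variogram, and μ a Lagrange multiplier enforcing that the weights sum to one.

```latex
\begin{aligned}
z^{*}(u) &= \sum_{\alpha=1}^{n} \lambda_{\alpha}\, z(u_{\alpha}),
  \qquad \sum_{\alpha=1}^{n} \lambda_{\alpha} = 1, \\
\sum_{\beta=1}^{n} \lambda_{\beta}\, \gamma(u_{\alpha} - u_{\beta}) + \mu
  &= \gamma(u_{\alpha} - u), \qquad \alpha = 1, \dots, n, \\
\sigma^{2}_{OK}(u) &= \sum_{\alpha=1}^{n} \lambda_{\alpha}\, \gamma(u_{\alpha} - u) + \mu .
\end{aligned}
```

The variance depends on the variogram and on the positions of the observations relative to u, but not on the observed values themselves; it is therefore small close to data and grows as u moves away from them.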

Change of Spatial Support (COS)

Data are often available over different spatial supports, for example, some data are recorded for ZIP codes and others for census tracts.

How can we undertake correlation analysis for data with different spatial supports? Spatial interpolation is needed to derive all variables over the same geographical support; this change of spatial support makes the datasets spatially “compatible”. Depending on the shape and size of the geographies involved, we distinguish three situations (see below and Goovaerts, 2014):

  1. up-scaling, such as when aggregating groundwater arsenic concentrations from individual wells to township level
  2. down-scaling, such as when assigning a population density to each groundwater well where arsenic levels are measured
  3. side-scaling between two sets of overlapping geographies, such as when deriving an incidence rate of prostate cancer for each block group using township data

Screenshot

More information on this process can be found on the Change of Spatial Support page.

These problems can be tackled using geostatistical methods implemented in Vesta as illustrated here for the City of Flint, Michigan.

For example, the workspace below shows a series of residential tax parcels in Flint (white polygons) overlaid on satellite imagery. Dots represent a subset of residences that were sampled for water lead levels. These observations are interpolated using kriging in Vesta to estimate the water lead level for all residences.

Screenshot

In the workspace below, the same observations are used to estimate water lead levels at the scale of census tracts (up-scaling) in order to smooth out local fluctuations and explore relationships with census variables, such as poverty level or ethnic composition.

Screenshot

A third example (shown below) illustrates the down-scaling approach where kriging in Vesta is used to estimate poverty level at the residential level (left map) using census tract data (right map).

Screenshot

The last example below is an application of side-scaling where kriging in Vesta is used to estimate poverty level at the voting precinct level (right map) using census tract data (left map).

Screenshot