In public health, visualizing, analyzing, and interpreting the spatial distribution of diseases is critical for effective prevention, control, and resource allocation. Data mapping reveals spatial patterns such as clusters of cancer cases or areas with elevated risk factors. With these patterns in view, public health officials can identify high-risk areas, allocate resources more effectively, and design targeted interventions to reduce the burden of disease.
The Power of Geostatistics in Public Health
Individual humans represent the basic unit of spatial analysis in health research. However, to protect patient privacy, publicly available data are often aggregated enough to prevent the disclosure or reconstruction of patient identity. The information available for human health studies thus falls within two main categories: individual-level data (e.g., residences of patients with an early- or late-stage breast cancer diagnosis) and aggregated data (e.g., percentage of late-stage diagnoses at the census tract level).
Although neither type of data falls within the category of “geostatistical data” as classically defined in the spatial statistics literature (Cressie, 1993), geostatistics offers a promising alternative to common methods for analyzing spatial point processes and lattice data (Goovaerts and Gebreab, 2008). The discipline was originally developed to improve the evaluation of recoverable reserves in mining deposits.
In particular, geostatistics incorporates spatial autocorrelation into the mapping process and provides a measure of uncertainty for the predicted risk. That uncertainty can be substantial when the risk is based on observed rates computed from sparsely populated geographical units or recorded for minority populations. In the case of breast cancer recorded at the census tract level, very high incidence rates (red polygons in the map above) correspond to smaller census tracts where few cases make any rate calculation unreliable. Mapping observed or raw rates can then be misleading, and some form of noise-filtering or smoothing is warranted (Goovaerts, 2005).
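The small-number problem can be made concrete with a short calculation. The sketch below assumes that case counts follow a Poisson distribution with mean equal to risk times population, so the variance of an observed rate is the risk divided by the population. The function name, the two tract populations, and the risk value are hypothetical and chosen only for illustration.

```python
import math


def rate_standard_error(risk, population):
    """Standard error of an observed rate, assuming the case count is
    Poisson distributed with mean risk * population."""
    return math.sqrt(risk / population)


# Two hypothetical tracts sharing the same underlying risk of
# 100 cases per 100,000 people.
risk = 100 / 100_000
se_small = rate_standard_error(risk, 500)      # sparsely populated tract
se_large = rate_standard_error(risk, 50_000)   # densely populated tract

print(f"small tract: +/- {se_small * 1e5:.0f} per 100,000")
print(f"large tract: +/- {se_large * 1e5:.0f} per 100,000")
```

Under these assumptions the standard error in the small tract is larger than the risk itself, which is why a single case more or less can push a small tract to either end of the color scale.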
Disease Mapping: A Practical Application of Geostatistics
The first benefit of geostatistical mapping techniques is that they model the spatial structure in the data directly instead of relying on simplistic models based on adjacency relationships. Besides supplying the kriging (i.e., interpolation) weights, semivariogram analysis can help detect and interpret multiple scales of variability.
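To illustrate what a semivariogram measures, here is a minimal sketch of the classical (unweighted) empirical estimator: half the average squared difference between pairs of values, grouped by separation distance. It is not the population-weighted estimator used for disease rates in the cited papers, and the function name, inputs, and toy data are assumptions made for this example.

```python
import numpy as np


def empirical_semivariogram(coords, values, bin_edges):
    """Classical semivariogram estimate for each distance bin.

    Returns (gamma, counts); gamma is NaN for bins with no pairs.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Every unordered pair of locations (i < j).
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    # Bin index of each pair; pairs outside the edges fall out of range.
    which = np.digitize(dist, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sq_diff[mask].mean()
    return gamma, counts


# Toy data: three locations on a line, one unit apart.
gamma, counts = empirical_semivariogram(
    coords=[[0, 0], [1, 0], [2, 0]],
    values=[1.0, 2.0, 4.0],
    bin_edges=[0.5, 1.5, 2.5],
)
print(gamma, counts)
```

A semivariogram that keeps rising across several distance classes, or that levels off and then rises again, is the kind of signal that points to more than one scale of variability.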
The second benefit of geostatistical interpolation is its ability to account for population size in the spatial filtering of the noise attached to unreliable disease rates. In the example of breast cancer mortality rates in Michigan, geostatistical noise-filtering smoothed the extreme low and high rates observed in sparsely populated counties of the North. The greater uncertainty of rates estimated for small populations is captured by the error variance, which increases as population density decreases.
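One way to see how population size enters the filtering is to look at the kriging weights. The sketch below follows the general form of the Poisson kriging system described by Goovaerts (2005), in which a noise term equal to the mean rate divided by the population is added to the diagonal of the covariance matrix. The exponential covariance model, its sill and range, the coordinates, and all function names are assumptions made for this illustration, not values from the Michigan study.

```python
import numpy as np


def exp_cov(h, sill, range_):
    """Exponential covariance model (assumed here for illustration)."""
    return sill * np.exp(-3.0 * np.asarray(h, dtype=float) / range_)


def poisson_kriging_weights(coords, pop, target, mean_rate, sill, range_):
    """Solve an ordinary-kriging-type system with a population-dependent
    noise term mean_rate / pop added on the diagonal."""
    coords = np.asarray(coords, dtype=float)
    pop = np.asarray(pop, dtype=float)
    n = len(pop)

    # Pairwise distances between the observed units.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = exp_cov(d, sill, range_) + np.diag(mean_rate / pop)
    A[:n, n] = 1.0   # Lagrange multiplier column
    A[n, :n] = 1.0   # unbiasedness constraint: weights sum to 1

    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b = np.append(exp_cov(d0, sill, range_), 1.0)

    return np.linalg.solve(A, b)[:n]


# Two hypothetical neighbors at the same distance from the target,
# differing only in population size.
weights = poisson_kriging_weights(
    coords=[[-1.0, 0.0], [1.0, 0.0]],
    pop=[500, 50_000],
    target=[0.0, 0.0],
    mean_rate=0.001,
    sill=1e-7,
    range_=10.0,
)
print(weights)
```

Although both neighbors are equally close to the target, the one with the smaller population receives the smaller weight, because its rate carries more noise.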
Third, data measured at various scales and over different spatial supports can be incorporated using geostatistics (Goovaerts, 2006). This situation is frequently encountered in health studies, where data are typically available over a wide range of scales, spanning from individual-level to different levels of aggregation (e.g., census tracts, ZIP codes, counties). Kriging also offers a flexible framework to derive disease risks over spatial supports that differ from the original measurements. Geostatistics can handle three different types of change of support: upscaling (e.g., individual-level to census tract, or census tract to county), downscaling (e.g., choropleth to isopleth maps), and side-scaling (e.g., ZIP codes to census tracts).
Area-to-point (ATP) kriging, a form of downscaling, creates maps where disease rates vary continuously in space (isopleth maps), reducing the visual bias associated with the interpretation of choropleth maps; see the map of breast cancer mortality rates in Michigan above. Side-scaling is useful for exploring relationships between health outcomes and putative factors recorded over different geographies.
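Of the three changes of support, upscaling of observed rates is the easiest to illustrate because it needs no spatial model: the rate of the larger unit is total cases over total population, which equals the population-weighted average of the smaller units' rates. The sketch below shows only this arithmetic identity, not kriging, and the tract counts and function name are hypothetical.

```python
def upscale_rate(cases, populations):
    """Rate of the aggregated unit: total cases over total population."""
    return sum(cases) / sum(populations)


# Three hypothetical census tracts that make up one county.
tract_cases = [2, 30, 8]
tract_pops = [500, 40_000, 9_500]

county_rate = upscale_rate(tract_cases, tract_pops)

# The same value obtained as a population-weighted mean of tract rates.
tract_rates = [c / p for c, p in zip(tract_cases, tract_pops)]
weighted_mean = sum(r * p for r, p in zip(tract_rates, tract_pops)) / sum(tract_pops)

print(f"county rate: {county_rate * 1e5:.0f} per 100,000")
```

Downscaling and side-scaling have no such shortcut, since they require an assumption about how risk varies within each unit; that is the role the spatial model plays in ATP kriging.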
Geostatistics: An Influential Tool for Public Health
Overall, geostatistical mapping has great potential for improving our understanding of the complex relationships between environmental exposures, social determinants, and health outcomes. By identifying spatial patterns and risk factors, geostatistical methods can help inform public health interventions and policies aimed at reducing health disparities and improving the health of populations.
- Cressie, N. 1993. Statistics for Spatial Data. New York: Wiley.
- Goovaerts, P. 2005. Geostatistical analysis of disease data: estimation of cancer mortality risk from empirical frequencies using Poisson kriging. International Journal of Health Geographics, 4:31.
- Goovaerts, P. 2006. Geostatistical analysis of disease data: accounting for spatial support and population density in the isopleth mapping of cancer mortality risk using area-to-point Poisson kriging. International Journal of Health Geographics, 5:52.
- Goovaerts, P. and S. Gebreab. 2008. How does Poisson kriging compare to the popular BYM model for mapping disease risks? International Journal of Health Geographics, 7:6.