4 Space-Time Data Visualization Techniques To Consider

Spatial and space-time data visualization are not new concepts, and there are many proven techniques that streamline the visualization process of geospatial data. Before doing anything with data, it always helps to have a plan, and that plan begins with the identification of the questions you want to answer and the purpose behind diving into the data. This blog will dive into geospatial data visualization and 4 visualization techniques to consider for your next project.

What is Geospatial Data?

Understanding geospatial data is easiest in the context of the four W’s: What, Where, When, and Why.

What data are you working with? Observations on deaths from disease? Species of trees in a forest? Concentrations of soil cadmium? Knowing what you are observing drives the nature of your visualizations and questions asked.
Where are these observations occurring, and at what spatial scale?
When do the observations occur?
Why do they take on the values you observe at given locations and specific times?

Geospatial data come in two types: vector and raster.
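To make the vector/raster distinction concrete, here is a minimal sketch in plain Python. All names, coordinates, and values are hypothetical and chosen only for illustration; real projects would normally use a GIS library rather than raw dictionaries.

```python
# Hypothetical toy data; names, coordinates, and values are illustrative only.

# Vector: discrete features, each with explicit geometry plus attributes.
tree_points = [
    {"geometry": ("Point", (-83.74, 42.28)), "species": "oak"},
    {"geometry": ("Point", (-83.73, 42.29)), "species": "maple"},
]

# Raster: a regular grid of cell values. Location is implied by the grid
# origin (upper-left corner), the cell size, and the row/column index.
cadmium_grid = {
    "origin": (-84.0, 43.0),  # (x, y) of the upper-left corner
    "cell_size": 0.5,         # width and height of one cell
    "values": [
        [0.2, 0.4, 0.3],
        [0.5, 0.9, 0.6],
    ],
}


def cell_value(grid, x, y):
    """Return the raster value of the cell that contains point (x, y)."""
    x0, y0 = grid["origin"]
    col = int((x - x0) // grid["cell_size"])
    row = int((y0 - y) // grid["cell_size"])
    n_rows, n_cols = len(grid["values"]), len(grid["values"][0])
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        raise ValueError("point lies outside the raster extent")
    return grid["values"][row][col]


# Look up the raster value underneath the first vector feature.
x, y = tree_points[0]["geometry"][1]
value_under_first_tree = cell_value(cadmium_grid, x, y)
```

The vector features carry their own coordinates, while the raster stores only values and derives location from the grid definition.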

Reasoning With Geospatial Data

To obtain knowledge and insights by visualizing geospatial data, two approaches are frequently considered: data-driven and hypothesis-driven.

In practice, both approaches are used in an analysis. The data-driven approach assumes you have no prior thoughts on what might be creating patterns in your data. By “patterns,” we mean relationships observed on both maps and statistical graphics (such as scatterplots). Inspection of patterns often yields thoughts and questions regarding the origin of those patterns – a means of hypothesis generation.

In the hypothesis-driven approach, a prior hypothesis originating from knowledge of underlying processes might suggest the patterns one expects to observe, and visualization then becomes the first step in hypothesis testing.

What To Understand About Visualization

In an earlier blog, we discussed rapid visualization using statistical and cartographic brushing, and how that can lead to hypothesis generation. This article will focus on different maps and several statistical graphics that can be used to generate insights and knowledge.

Visualization quickly becomes an iterative, interactive activity where one looks at maps and graphics, has an idea, and then explores that new idea using additional visualizations. That’s why it’s critical to apply strong inference to any visualization exercise you undertake.

Map views depend on the kind of data being presented. You’ll usually start with a background map showing relevant features to provide the viewer with references, then add layers showing the data of interest. Layers can be added to a map showing point data, area data, and continuous data to understand air pollution levels, disease outcomes, population centers, and so on.

Map visualizations oftentimes lead directly to hypothesis generation, but the pattern on maps can be highly misleading.

Is a given map pattern significant in some sense, or is it attributable entirely or in part to chance? Always remember: The visual assessment of patterns on maps is highly subjective. For this reason, it is useful to assess the statistical significance of map patterns using cluster analysis and other pattern recognition techniques.
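One way to ask whether a map pattern could be due to chance is a permutation test. The sketch below uses Moran's I, a common global measure of spatial autocorrelation, as a stand-in for the pattern recognition techniques mentioned above; it is not necessarily the method any particular tool uses. The neighbor structure and values are hypothetical.

```python
import random


def morans_i(values, neighbors):
    """Moran's I with binary weights.

    neighbors maps each area index to a list of adjacent area indices.
    Assumes the values are not all identical.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(len(nbrs) for nbrs in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    return (n / total_weight) * cross / sum(d * d for d in dev)


def permutation_p_value(values, neighbors, n_perm=999, seed=0):
    """One-sided pseudo p-value: how often does a random reshuffling of the
    values across areas give a Moran's I at least as large as observed?"""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, neighbors) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical example: six areas arranged in a row, low values at one end
# and high values at the other.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
rates = [1, 2, 3, 7, 8, 9]
observed_i, p_value = permutation_p_value(rates, chain)
```

A small p-value suggests the observed clustering is unlikely under random placement of the same values, which is a more defensible statement than a visual impression alone.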

Four Spatial and Space-Time Data Visualization Techniques

The next step is to view relationships among the data using statistical graphics, including boxplots, histograms, scatterplots, and three-dimensional plots, among others.

1. Boxplots

Boxplots, or box-and-whisker plots, show the distribution of data based on the five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. The “box” represents the interquartile range (IQR = Q3 - Q1), which contains the middle 50% of the data, while the “whiskers” extend to the minimum and maximum values, excluding outliers. Outliers, often marked as individual points, are values that fall more than 1.5 times the IQR below Q1 or above Q3. Boxplots are useful for comparing distributions and identifying outliers across categories.
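The quantities behind a boxplot can be sketched in a few lines of Python. This is a minimal illustration that assumes the common 1.5 × IQR convention for flagging outliers and the "inclusive" quartile method from the standard library; other software may compute quartiles slightly differently. The function name and sample data are hypothetical.

```python
import statistics


def boxplot_stats(values):
    """Compute the quartiles, whisker ends, and outliers for a boxplot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme values that are not outliers.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one unusually large value.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

In this sample the value 50 lies above the upper fence, so it is reported as an outlier and the upper whisker stops at 9.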

Box plots in Vesta are time-enabled. Shown here is the map of breast cancer late-stage diagnosis rates in Michigan in 1994 (year shown at figure bottom). The time series of box plots by year from 1986 to 2006 is shown on the right, with the statistics for 1994 shown in the upper right margin. Vesta box plots rapidly communicate that the variability in late-stage diagnosis is decreasing through time (because the length of the whiskers is decreasing) and that the rate, overall, is decreasing as well (because the mean in each year is trending down). One also sees that, after reaching a minimum in 2002, the rate of late-stage breast cancer diagnosis increases a bit. Careful monitoring and perhaps intervention might be warranted in the future.

2. Histograms

Histograms display the frequency distribution of continuous or discrete data. The data range is divided into intervals (or “bins”), and bars are drawn to represent the count or proportion of data within each bin. Taller bars indicate more observations in that interval, which helps reveal distribution characteristics such as skewness, modality, and spread. It is especially useful to brush-select data on the histogram and see where they appear on the map.

Histograms in Vesta are time-enabled with cartographic and statistical brushing. Here we see the breast cancer data with the map on the left and the histogram for 1986 on the right. The highest values on the histogram have been selected by clicking on the corresponding bar in the histogram (shown in gold). The counties with those high values are shown on the map in purple. This operation can be accomplished for any year in the dataset, supporting rapid hypothesis generation based on an understanding of when and where high (or low, or medium) values occur.
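The idea behind histogram brushing can be sketched as follows: keep track of which regions fall into each bin, so that selecting a bar returns the regions to highlight on the map. This is an illustrative sketch only, not Vesta's implementation; the function names, county labels, and rates are hypothetical.

```python
def histogram_bins(values_by_region, n_bins):
    """Assign each region to an equal-width bin; return a list of bins,
    each holding the names of the regions whose value falls in it."""
    values = list(values_by_region.values())
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for name, value in values_by_region.items():
        if width == 0:
            index = 0  # all values identical: everything goes in one bin
        else:
            # The maximum value is placed in the last bin.
            index = min(int((value - lo) / width), n_bins - 1)
        bins[index].append(name)
    return bins


def brush_highest_bin(bins):
    """Return the regions in the highest non-empty bin (the 'brushed' bar)."""
    for members in reversed(bins):
        if members:
            return members
    return []


# Hypothetical rates for five counties.
rates = {"A": 10.0, "B": 12.0, "C": 30.0, "D": 28.0, "E": 11.0}
bins = histogram_bins(rates, n_bins=4)
selected = brush_highest_bin(bins)  # regions to highlight on the map
```

Bar heights are simply the lengths of the bins, and the same lookup works for any bar, not just the highest one.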

3. Scatterplots

Scatterplots visualize relationships between two quantitative variables, where each point represents an observation’s value on both the X and Y axes. They are particularly useful for identifying correlations, trends, and clusters. When color-coded, scatterplots can also display categories or additional variables, making it possible to examine multi-dimensional relationships in the data.

Vesta scatterplots are time- and brushing-enabled. Could poverty be driving high breast cancer late-stage diagnosis rates? Here we see that in 1986 the 3 counties with the highest rates are not those with the highest poverty rates. In fact, they have medium poverty compared to the state (inspect the scattergram of poverty vs. rate on the right). Statistics (extreme upper right) are shown for all 68 counties (poverty mean 0.10607) and for the 3 selected ones (poverty mean 0.11867), and the two are not that different. Also, the overall state-wide correlation between rate and poverty is only r=0.017053.
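The two numbers used in this comparison, the Pearson correlation r and the mean of a covariate for the brushed observations, can be computed with a short sketch like the one below. The data are hypothetical placeholders and do not reproduce the Michigan figures; the helper names are also illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_x_of_top_y(xs, ys, k):
    """Mean of x over the k observations with the highest y values."""
    top = sorted(range(len(ys)), key=lambda i: ys[i], reverse=True)[:k]
    return sum(xs[i] for i in top) / k


# Hypothetical poverty proportions and diagnosis rates for four counties.
poverty = [0.10, 0.20, 0.05, 0.15]
rate = [30, 10, 40, 20]

r = pearson_r(poverty, rate)
overall_poverty_mean = sum(poverty) / len(poverty)
selected_poverty_mean = mean_x_of_top_y(poverty, rate, k=2)
```

If the selected mean sits close to the overall mean and r is near zero, as in the article's example, the covariate is a weak candidate explanation for the high rates.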

4. Three-Dimensional (3D) Plots

3D plots add a third axis (usually the Z-axis) to provide depth, making it possible to visualize relationships between three quantitative variables simultaneously. Points, lines, or surfaces are drawn in a 3D coordinate system, which helps in identifying patterns or correlations in multi-dimensional data. While effective, 3D plots can sometimes be harder to interpret due to perspective distortion or overlap, so they’re often used interactively or with rotation options.

Vesta 3D plots are time-enabled, brushing-enabled, and can be rotated. Here the researcher suspects the high rates in Michigan’s thumb area may be explained by a long driving distance to breast cancer screening facilities, as well as poverty. She identifies the county with the highest rate and sees that it is unremarkable in terms of both driving distance and poverty.
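The check described here, finding the highest-rate county and asking whether it stands out on the other two axes, can be approximated numerically with percentile ranks. This is a rough sketch with hypothetical county labels, variable names, and values.

```python
def percentile_rank(values, target):
    """Percent of values that are less than or equal to target."""
    return 100.0 * sum(v <= target for v in values) / len(values)


# Hypothetical data: one record per county with three variables.
counties = {
    "A": {"rate": 0.42, "distance_km": 18.0, "poverty": 0.11},
    "B": {"rate": 0.30, "distance_km": 35.0, "poverty": 0.18},
    "C": {"rate": 0.25, "distance_km": 10.0, "poverty": 0.07},
    "D": {"rate": 0.28, "distance_km": 22.0, "poverty": 0.12},
    "E": {"rate": 0.33, "distance_km": 27.0, "poverty": 0.09},
}

# Identify the county with the highest rate (the "brushed" point).
top_county = max(counties, key=lambda name: counties[name]["rate"])

# Where does that county fall on the other two axes?
profile = {
    var: percentile_rank(
        [record[var] for record in counties.values()],
        counties[top_county][var],
    )
    for var in ("distance_km", "poverty")
}
```

Percentile ranks near the middle of the range for both covariates would match the visual impression that the county is unremarkable on those axes.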

Each of these techniques provides unique insights, and the choice depends on the data type, the number of variables, and the specific insights needed. As noted earlier, multiple map layers can then be associated with the statistical graphics, and statistical and cartographic brushing used to rapidly explore data relationships.

3 Takeaways Of Spatial Data Visualization Techniques

There are 3 main lessons to walk away with:

Visual map pattern analysis is subjective but useful for establishing context and generating hypotheses.
Statistical inference and modeling can be used to assess the significance of map patterns, to make predictions, and for forecasting.
Interactive graphics and maps using linked windows and brushing, as shown in the video above, are extremely useful for rapid hypothesis generation and are a first step in statistical analysis and modeling.

Take a look at our GIS solutions and get in touch with our team for more information.