Local Cluster Analysis (Previously Local Moran's I)

Local Indicators of Spatial Autocorrelation (LISAs) can be used to identify points or locations that are part of statistically significant clusters of similar values, or that are spatial outliers (areas distinct from their neighbors). The Local Moran’s I is the most commonly used LISA statistic (Anselin, 1995). It is essentially a moving-window approach that computes the Moran’s I coefficient for each data point. It can be used as an indicator of local spatial clusters and as a diagnostic for outliers in global spatial patterns.

The univariate Local Moran statistic is currently implemented in Vesta. This method tests for spatial autocorrelation in a single variable for each time period.

Test Statistics

The local Moran's I statistic compares the value recorded at location u₀ (the kernel value) with the values at the J(u₀) neighboring locations u_j. This statistic is computed as:

\[
I(u_0) = \frac{z(u_0) - m}{s} \times \frac{1}{J(u_0)} \sum_{j=1}^{J(u_0)} \frac{z(u_j) - m}{s}
\]

where m and s are the mean and standard deviation of the set of z-data. This local statistic is simply the product of the kernel value and the average of neighboring values; it can detect both positive and negative autocorrelation. It exceeds zero if the kernel and neighborhood-averaged values jointly exceed the global mean m (High–High, HH cluster) or are jointly below m (Low–Low, LL cluster). LISA values are negative if the kernel and neighborhood mean values are on opposite sides of the global mean m, which indicates a spatial outlier (High–Low, HL, or Low–High, LH).
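The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Vesta's implementation: the function name `local_moran`, the list-of-index-lists format for neighborhoods, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def local_moran(z, neighbors):
    """Local Moran's I for each location (illustrative sketch).

    z         : list of observed values, one per location
    neighbors : neighbors[i] lists the indices of the locations in the
                neighborhood of location i (hypothetical input format)
    """
    m = mean(z)
    s = pstdev(z)  # assumption: population standard deviation
    std = [(v - m) / s for v in z]  # standardized values
    result = []
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            result.append(None)  # no neighbors, statistic undefined
            continue
        nbr_avg = sum(std[j] for j in nbrs) / len(nbrs)
        result.append(std[i] * nbr_avg)  # kernel value times neighborhood average
    return result

# Five locations along a line; the middle one is much higher than the rest.
z = [1.0, 1.0, 10.0, 1.0, 1.0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
lisa = local_moran(z, neighbors)
```

In this small example the middle location lies above the global mean while both of its neighbors lie below it, so its statistic is negative, which is the outlier pattern described above. The two end locations and their single neighbors are all below the mean, so their statistics are positive.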

To test whether any test statistic, I(u₀), is significantly greater or smaller than 0 (i.e., presence of positive or negative spatial autocorrelation), one needs to know its probability distribution under the null hypothesis of spatial independence. The common way to generate such a reference distribution is to shuffle the set of z-values randomly and then use the shuffled values to compute the neighborhood average while the kernel value remains the same (conditional randomization). In other words, the LISA statistic is computed for randomly distributed values in adjacent locations. This operation is repeated K times (K = 100 is the default value in Vesta) to compute the p-value of the test.
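The conditional randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Vesta's code: the function name `conditional_p_value`, the two-sided comparison on absolute values, and the (count + 1) / (K + 1) pseudo p-value convention are choices made for the example, since the source does not specify them.

```python
import random
from statistics import mean, pstdev

def conditional_p_value(z, neighbors, i, k=100, seed=0):
    """Pseudo p-value for the local Moran's I at location i (sketch)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    m, s = mean(z), pstdev(z)
    std = [(v - m) / s for v in z]
    n_nbrs = len(neighbors[i])
    observed = std[i] * sum(std[j] for j in neighbors[i]) / n_nbrs

    # The kernel value stays fixed; only the other values are reshuffled.
    others = std[:i] + std[i + 1:]
    extreme = 0
    for _ in range(k):
        # Drawing without replacement is equivalent to shuffling the
        # remaining values and reading off the neighborhood positions.
        sample = rng.sample(others, n_nbrs)
        simulated = std[i] * sum(sample) / n_nbrs
        if abs(simulated) >= abs(observed):  # assumption: two-sided test
            extreme += 1
    # Assumption: the observed statistic is counted as one of the K + 1 draws.
    return observed, (extreme + 1) / (k + 1)

z = [10.0, 12.0, 11.0, 2.0, 3.0, 1.0, 2.0, 3.0]
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2, 4], [3, 5], [4, 6], [5, 7], [6]]
observed, p_value = conditional_p_value(z, neighbors, i=0, k=100)
```

With K = 100 the smallest p-value this convention can produce is 1/101, which is one reason a larger number of randomizations gives finer resolution.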

Impact of Missing Values in Local Cluster Analyses

If your dataset has missing values, the Local Moran statistics are calculated from only those neighboring locations that have data. Because significance is evaluated with Monte Carlo randomizations, the distribution of statistic values influences whether a particular value is judged "rare"; removing one or more locations from a geography (thus creating missing values) can therefore change results for all locations. You might observe this if you decide that one or two values are outliers in your data and re-run the analysis with missing values in place of the recorded ones. Results will change for locations close to those where values are now missing, but results for other locations may change as well because the overall distribution of the statistic has changed.
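The effect can be illustrated with a small sketch. It assumes, for the purpose of the example only, that missing values are represented as `None`, that the global mean and standard deviation are computed from the non-missing values, and that a location with a missing value receives no statistic; how Vesta handles these details internally is not stated here.

```python
from statistics import mean, pstdev

def local_moran_with_missing(z, neighbors):
    """Local Moran's I that skips missing (None) values (sketch)."""
    present = [v for v in z if v is not None]
    m, s = mean(present), pstdev(present)  # assumption: non-missing values only
    result = []
    for i, nbrs in enumerate(neighbors):
        valid = [j for j in nbrs if z[j] is not None]  # neighbors with data
        if z[i] is None or not valid:
            result.append(None)
            continue
        nbr_avg = sum((z[j] - m) / s for j in valid) / len(valid)
        result.append((z[i] - m) / s * nbr_avg)
    return result

z = [10.0, 12.0, 11.0, 2.0, 3.0, 40.0]
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]

full = local_moran_with_missing(z, neighbors)
# Treat the extreme value at the last location as missing and re-run.
masked = local_moran_with_missing(z[:5] + [None], neighbors)
```

Location 0 is not a neighbor of the removed location, yet its statistic differs between `full` and `masked` because the global mean and standard deviation shift once the extreme value is dropped. This is one route by which a missing value can affect results far from where it occurs.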

Local Cluster Analysis Process

Select "Local Cluster Analysis" in the Method panel
Select "Start" to open the Local Cluster Analysis dialog
Select the dataset of interest
Select the variable of interest from the dataset
Select the preferred adjacency method
- Nearest Neighbors uses the closest neighboring points or polygon centroids; the number of nearest neighbors can be modified to either expand or reduce the size of the neighborhood.
- Polygon Adjacency uses the polygons that share a border or a corner (i.e., vertex) with the kernel polygon (1st order queen adjacency).
The number of randomizations is set to 100 by default; it can be increased for greater accuracy or decreased for faster calculations.
Select "Run" to execute the method
- Look at the bottom of the workspace for a message indicating that the task is in process or has been completed.
Results can be found in the Data panel under the respective dataset
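To make the Nearest Neighbors adjacency option more concrete, the sketch below builds a neighborhood list from point (or centroid) coordinates. It is an illustration only: the function name `k_nearest_neighbors`, the use of straight-line distance, and the way ties are broken (by input order) are assumptions, not a description of Vesta's internals.

```python
import math

def k_nearest_neighbors(points, k):
    """For each point, return the indices of its k closest other points."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by straight-line distance to point i.
        ranked = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(ranked[:k])
    return neighbors

# Four points on a line forming two well-separated pairs.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
adjacency = k_nearest_neighbors(points, k=1)
```

Increasing `k` enlarges each neighborhood, which smooths the neighborhood average; decreasing it makes the statistic more sensitive to individual neighbors.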

Local Cluster Analysis Results

The Local Moran test results describe the local spatial autocorrelation in the data, identifying clusters (similar values) and outliers (values different from their neighbors). Four quantities are calculated for each location: z-score, Local Moran's I, p-value, and Local Moran Class. Each observation/location with a significant result is classified into one of the following four categories:

HH - the location is part of a significant high-high cluster: it has a high value and is surrounded by neighbors with high values.
LL - the location is part of a significant low-low cluster: it has a low value and is surrounded by neighbors with low values.
HL - the location is an outlier with a high value surrounded by neighbors with low values.
LH - the location is an outlier with a low value surrounded by neighbors with high values.

If a location is labelled NS, its Local Moran's I is not significantly different from zero (p-value greater than 0.05).
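The classification rules above can be summarized in a short sketch. The function name `local_moran_class` and its arguments are hypothetical, and treating a value exactly equal to the global mean as "low" is an assumption made only so that the example is complete.

```python
def local_moran_class(kernel_std, neighbor_avg_std, p_value, alpha=0.05):
    """Assign HH, LL, HL, LH, or NS to one location (sketch).

    kernel_std       : standardized value at the location
    neighbor_avg_std : average standardized value of its neighbors
    p_value          : p-value from the randomization test
    """
    if p_value > alpha:
        return "NS"  # not significantly different from zero
    kernel = "H" if kernel_std > 0 else "L"      # above or below the global mean
    nbrs = "H" if neighbor_avg_std > 0 else "L"  # neighborhood above or below
    return kernel + nbrs

examples = [
    local_moran_class(1.4, 0.9, 0.01),    # high value, high neighbors
    local_moran_class(-1.1, -0.7, 0.02),  # low value, low neighbors
    local_moran_class(1.8, -0.6, 0.03),   # high value, low neighbors
    local_moran_class(-0.9, 1.2, 0.04),   # low value, high neighbors
    local_moran_class(1.4, 0.9, 0.30),    # not significant
]
```

The first letter refers to the location itself and the second to its neighborhood, so the sign of the Local Moran's I is positive for HH and LL and negative for HL and LH.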

Each of the four new variables can be visualized in a map.