Implementation of Poisson GWR

As described in the About GWR page, SpaceStat takes a unified approach to all regression methods, allowing a model to be specified with both geographic and non-geographic weights. This approach means that as long as an aspatial analysis has been performed on a point dataset (such as a set of polygon centroids), the same model can be run using both aspatial and geographically weighted regression (GWR) methods, and when all points have the same geographic weight, the results will be the same. Thus, our description of the implementation of the various forms of GWR (Poisson, described here, linear, and logistic) is very similar to the implementation pages for the aspatial form.

As described in the overview of Poisson regression, the goal of this statistical tool is to parameterize a relationship between one or more independent variables and the expected value (mean) of a discrete (integers only) dependent variable that follows a Poisson distribution. This statistical tool uses the log as its link function; taking the log of the expected value produces a linear combination of predictors, so again this is a form of linear model. The log link function also constrains the expected values to positive numbers. As shown in the overview of Poisson regression, a general formula can be written as:
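In conventional notation, that formula reads as follows. The symbols are assumptions made here and may differ from those on the overview page: mu_i is the expected count at observation i, x_ik is the value of the k-th independent variable, and beta_k is its regression coefficient.

```latex
\ln(\mu_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}
\quad\Longleftrightarrow\quad
\mu_i = \exp\!\left(\beta_0 + \sum_{k=1}^{p} \beta_k x_{ik}\right)
```

Because mu_i is an exponential of the linear predictor, it is positive for any choice of coefficients.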

Obtaining the parameter estimates

For all regression methods in SpaceStat, the regression formulation is carried out in terms of maximum likelihood (L) estimation. A "likelihood" is a probability (and must have a value within the range of 0 - 1); in this case, the probability of observing the values of the dependent variable given the independent variables and a set of regression coefficients. As indicated in the equation below, the maximum likelihood estimator uses a Poisson distribution to define a joint probability distribution from the individual dependent variable observations. In the following equations, the brackets around the beta, which symbolizes the regression coefficients, indicate that we are estimating two or more regression coefficients.
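Under the standard Poisson probability function, the joint likelihood takes the form sketched here. The notation is assumed rather than taken from the original page: y_i is the observed count and mu_i the modeled mean at observation i, for n observations.

```latex
L(\{\beta\}) = \prod_{i=1}^{n} \frac{e^{-\mu_i}\,\mu_i^{\,y_i}}{y_i!},
\qquad
\mu_i = \exp\!\left(\beta_0 + \sum_{k=1}^{p} \beta_k x_{ik}\right)
```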

The goal of maximum likelihood estimation is to maximize the Log-Likelihood (lnL), which has a value between 0 and negative infinity (negative, because you are taking the log of a value that is less than 1). Maximum likelihood estimation is an iterative process. Recall from our overview that we need to account for both geographic and non-geographic weighting factors (w) in our estimation. The weighted log-likelihood for Poisson regression can be obtained by raising each individual probability to the power of its weight factor, taking the product over observations, and then taking the logarithm.
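The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not SpaceStat's code: the function name `weighted_poisson_loglik` and the sample values are hypothetical, and each weight is assumed to already combine the geographic and non-geographic factors for its observation.

```python
import math

def weighted_poisson_loglik(y, mu, w):
    """Weighted Poisson log-likelihood: sum over i of w_i * ln P(y_i | mu_i)."""
    total = 0.0
    for y_i, mu_i, w_i in zip(y, mu, w):
        # Log of the Poisson probability; lgamma(y + 1) equals ln(y!)
        log_p = y_i * math.log(mu_i) - mu_i - math.lgamma(y_i + 1)
        # Raising p to the power w and then taking the log gives w * ln(p)
        total += w_i * log_p
    return total

# Hypothetical counts and modeled means, for illustration only
y = [2, 0, 3]
mu = [1.5, 0.5, 2.5]
unit = weighted_poisson_loglik(y, mu, [1.0, 1.0, 1.0])
doubled = weighted_poisson_loglik(y, mu, [2.0, 2.0, 2.0])
```

Doubling every weight doubles the log-likelihood without moving the location of its maximum, which is why equal geographic weights give the same coefficient estimates as the aspatial model.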

To estimate the regression coefficients, SpaceStat uses the Taylor expansion of the equation above, and then the maximum likelihood algorithm determines the direction and sign of changes in the regression coefficients which will increase the lnL. After starting from an arbitrary set of coefficient estimates, the initial function is estimated and the residuals are evaluated. From these results, the algorithm modifies the coefficient values, and generates a new set of residuals which are compared to previous values. This process continues until there is little change in the lnL. There is a possibility that this process will not lead to convergence due to what is called a "ridge-effect"; in this case, the Log-Likelihood stays constant as coefficients are varied.
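The iterative scheme can be illustrated with a small Newton-type fit, which is based on a second-order Taylor expansion of lnL. This is a sketch under stated assumptions, not SpaceStat's algorithm: it handles only an intercept and one slope so the update can be solved in closed form, the names `loglik` and `fit_poisson` are hypothetical, and the step-halving safeguard is an addition that is not described in the text above.

```python
import math

def loglik(b0, b1, x, y, w):
    """Weighted Poisson log-likelihood for ln(mu) = b0 + b1 * x."""
    total = 0.0
    for x_i, y_i, w_i in zip(x, y, w):
        eta = b0 + b1 * x_i
        total += w_i * (y_i * eta - math.exp(eta) - math.lgamma(y_i + 1))
    return total

def fit_poisson(x, y, w, tol=1e-10, max_iter=100):
    b0, b1 = 0.0, 0.0                      # arbitrary starting coefficients
    current = loglik(b0, b1, x, y, w)
    for _ in range(max_iter):
        # Gradient (g) and negated Hessian (h) of the weighted lnL
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x_i, y_i, w_i in zip(x, y, w):
            mu = math.exp(b0 + b1 * x_i)
            resid = y_i - mu               # residual at the current estimate
            g0 += w_i * resid
            g1 += w_i * resid * x_i
            h00 += w_i * mu
            h01 += w_i * mu * x_i
            h11 += w_i * mu * x_i * x_i
        # Newton direction: solve the 2x2 system h * d = g
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        # Assumed safeguard: halve the step until lnL does not decrease
        step = 1.0
        trial = loglik(b0 + d0, b1 + d1, x, y, w)
        while trial < current and step > 1e-8:
            step /= 2.0
            trial = loglik(b0 + step * d0, b1 + step * d1, x, y, w)
        if trial < current:
            break                          # no improving step was found
        b0, b1 = b0 + step * d0, b1 + step * d1
        gain, current = trial - current, trial
        if gain < tol:                     # little change in lnL: stop
            break
    return b0, b1, current

# Hypothetical data, for illustration only
x = [0, 1, 2, 3, 4]
y = [1, 2, 4, 7, 12]
b0, b1, lnl = fit_poisson(x, y, [1.0] * 5)
c0, c1, _ = fit_poisson(x, y, [3.0] * 5)   # same weight everywhere
```

Giving every observation the same weight, whatever its value, leaves the fitted coefficients unchanged, so `(b0, b1)` and `(c0, c1)` agree.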

Evaluating the full model

To evaluate the significance of the full aspatial Poisson regression model, SpaceStat presents the difference (deviance) between the log-likelihood of the full model and that of a "perfectly-fitted" model in which the Poisson mean at each observation is set equal to the observed value. There is no local version of this statistic.
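As a rough illustration, the deviance can be computed by evaluating the Poisson log-likelihood twice: once with the fitted means and once with each mean set to the observed count. The sketch below assumes the conventional factor of 2 in the definition of deviance, which the text above does not state explicitly; the function names and sample values are hypothetical.

```python
import math

def poisson_loglik(y, mu):
    """Unweighted Poisson log-likelihood of counts y given means mu."""
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        # y * ln(mu) is taken as 0 when y == 0, so a zero mean is allowed
        # in the perfectly fitted model
        term = y_i * math.log(mu_i) if y_i > 0 else 0.0
        total += term - mu_i - math.lgamma(y_i + 1)
    return total

def poisson_deviance(y, mu):
    """Twice the gap between the perfectly fitted and the fitted lnL."""
    perfect = poisson_loglik(y, [float(v) for v in y])  # mean = observed value
    fitted = poisson_loglik(y, mu)
    return 2.0 * (perfect - fitted)

# Hypothetical counts and fitted means, for illustration only
y = [2, 0, 3]
mu = [1.5, 0.5, 2.5]
dev = poisson_deviance(y, mu)
```

The deviance is zero when the fitted means reproduce the observations exactly and grows as the fit worsens.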

Significance of individual terms in the Poisson regression model

For Poisson GWR, SpaceStat presents the parameter estimates, parameter standard errors, and p-values (using a chi-squared distribution). Likelihood ratio tests are used to evaluate the significance of individual parameters in the model. The basic idea of these significance tests is the same as the test of significance of the full model (described here for aspatial Poisson regression), except that the tests are based on the difference in -2lnL between an overall model and a nested model where one term has been dropped. If the test for a particular parameter is not significant, the coefficient for that variable can be considered to not be significantly different from zero, and you can drop this variable from your model without a reduction in model performance.
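A likelihood ratio test for a single dropped term can be sketched as follows. This is an illustration rather than SpaceStat's implementation: the function name and the log-likelihood values are hypothetical, and the test is assumed to have one degree of freedom because exactly one term is dropped.

```python
import math

def likelihood_ratio_test(lnl_full, lnl_nested):
    """Return the LR statistic and its p-value for one dropped term (1 df)."""
    # Difference in -2lnL between the nested model and the overall model
    stat = (-2.0 * lnl_nested) - (-2.0 * lnl_full)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # Chi-squared upper tail with 1 df: P(X > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Illustrative log-likelihood values, not SpaceStat output
stat, p = likelihood_ratio_test(lnl_full=-120.0, lnl_nested=-121.92)
```

A large p-value suggests that dropping the term costs little in fit; a small one suggests that the term contributes to the model.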