About Forward Stepwise and Forward Selection
In the forward stepwise model selection procedure, variables are added sequentially to an "empty" (intercept-only) model. In contrast, backward procedures start with all of the variables in the model and proceed by removing them. To simplify the description, this overview describes the process for stepwise linear regression; the modifications that apply to logistic and Poisson regression are presented below.
The forward stepwise procedure
In the first round of forward stepwise iterations, each regression term (i.e., each dataset that you selected in the model definition step) is added in turn to the starting model, and the regression is recalculated to find the improvement in the residual sum of squares of each resulting model relative to the intercept-only model. For each new model, SpaceStat calculates a p-value for the change in the sum of squares; this calculation is based on an F-distribution and incorporates the degrees of freedom in the regression term and the error variance. When all of the one-term models have been created, the forward stepwise procedure selects the variable whose model has the lowest p-value as the first-round candidate for entry into the model. This variable’s p-value is then compared to the "p to enter" cut-off value you have specified in the stepwise procedure dialog box, and if it is lower than the cut-off, the candidate term enters the regression model. This one-term model provides the starting model for the next round. If in this or any subsequent round the lowest candidate p-value is not lower than the "p to enter" value specified in the regression dialog, the forward stepwise procedure stops.
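The p-value for entry can be written out as a standard partial F-test. The formula below is a sketch under assumptions, not SpaceStat's documented computation: it assumes the error variance is estimated from the larger model (the one containing the candidate term), and the symbols RSS_0, RSS_1, q, n, and k_1 are introduced here for illustration only.

```latex
% RSS_0: residual sum of squares of the starting model
% RSS_1: residual sum of squares after adding the candidate term
% q:     degrees of freedom of the candidate term
% n:     number of observations
% k_1:   number of estimated coefficients in the larger model (including the intercept)
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/q}{\mathrm{RSS}_1/(n - k_1)},
\qquad
p = \Pr\left(F_{q,\,n-k_1} \ge F\right)
```

Under this form, a term that reduces the residual sum of squares substantially relative to the remaining error variance yields a large F and therefore a small p-value.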
Each subsequent round begins in the same way as the first: the terms not already in the model are examined in turn to see to what extent adding them improves model fit. The candidate term (dataset) that provides the greatest improvement (i.e., is associated with the lowest p-value from an F-test) is again compared against the "p to enter" value to determine whether it should be added to the model. However, in addition to testing whether terms can be added, the stepwise procedure also examines every model with more than one term by looping through each of the terms in the model and, again on the basis of an F-distribution, calculating the p-value for removal. Specifically, the p-value for removal is based on comparing the increase in residual sum of squares that results from dropping the term to the error variance of the larger model. The term with the highest p-value becomes the candidate for removal from the model. If this highest p-value is larger than the "p to stay" value you supplied in the dialog, the term is removed from the model.
SpaceStat allows as many rounds of this procedure as there are possible regression terms in your starting model (i.e., as many rounds as the number of terms you defined in the model definition step). If a term that was already added is later removed from the model, SpaceStat increases the number of remaining rounds to give all of the remaining terms a chance to enter. If, however, during a particular round, the same term is removed and then added, this event ends the forward stepwise regression procedure. The integrity of the results obtained in forward stepwise regression depends on the choice of "p to enter" and "p to stay"; for the process to work, the "p to stay" value should never be made smaller than the "p to enter" value.
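The control flow described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not SpaceStat's implementation: the function name forward_stepwise and the two callbacks p_enter_fn and p_remove_fn (which stand in for the F-test calculations) are hypothetical, and the stopping rule for a term that is removed and then re-selected is one interpretation of the text.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence

# Hypothetical callback type: given the terms currently in the model and one
# focal term, return the p-value for adding (or removing) that term.
PValueFn = Callable[[FrozenSet[str], str], float]


def forward_stepwise(
    terms: Sequence[str],
    p_enter_fn: PValueFn,
    p_remove_fn: PValueFn,
    p_to_enter: float = 0.05,
    p_to_stay: float = 0.10,
) -> List[str]:
    """Return the selected terms, in the order they entered the model."""
    model: List[str] = []            # start from the intercept-only model
    rounds_left = len(terms)         # one round per candidate term
    last_removed: Optional[str] = None

    while rounds_left > 0:
        rounds_left -= 1
        outside = [t for t in terms if t not in model]
        if not outside:
            break

        # Entry step: the candidate is the outside term with the lowest p-value.
        current = frozenset(model)
        candidate = min(outside, key=lambda t: p_enter_fn(current, t))
        if p_enter_fn(current, candidate) >= p_to_enter:
            break                    # nothing passes "p to enter": stop
        if candidate == last_removed:
            break                    # same term removed and then re-selected: stop
        model.append(candidate)

        # Removal step: only checked for models with more than one term.
        if len(model) > 1:
            current = frozenset(model)
            weakest = max(model, key=lambda t: p_remove_fn(current, t))
            if p_remove_fn(current, weakest) > p_to_stay:
                model.remove(weakest)
                last_removed = weakest
                rounds_left += 1     # give the remaining terms a chance to enter

    return model


# Toy p-values (made up for illustration) that ignore the current model.
toy_p = {"a": 0.001, "b": 0.02, "c": 0.4}
selected = forward_stepwise(
    ["a", "b", "c"],
    p_enter_fn=lambda model, t: toy_p[t],
    p_remove_fn=lambda model, t: toy_p[t],
)
```

With these toy values, "a" and "b" enter and "c" is rejected at the entry step. Because no p-value can exceed 1, calling the same function with p_to_stay=1.0 never removes a term.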
Forward selection
Setting the "p to stay" value to 1 ensures that no term already in the model will be allowed to leave. This reduces to the "Forward Selection" procedure used in other software packages' stepwise regression tools.
Logistic and Poisson forward stepwise regression
Rather than using an F-test to determine whether variables enter or leave logistic and Poisson regression models, SpaceStat calculates the exact log-likelihood difference attributable to the focal variable. This log-likelihood difference is evaluated with a chi-squared test to determine significance relative to the "p to enter" and "p to stay" cut-offs you set in the regression settings.
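A minimal sketch of such a test, assuming the conventional likelihood-ratio form: the statistic is twice the difference in log-likelihood between the model with and without the focal term, referred to a chi-squared distribution whose degrees of freedom equal the number of parameters in that term. Whether SpaceStat uses exactly this form is not stated here; the function names chi2_sf and lr_test are hypothetical, and the chi-squared tail probability is computed with the closed form that holds for integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{i=0}^{df/2 - 1} h^i / i!
        term = math.exp(-h)
        total = term
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{j=1}^{(df-1)/2} h^(j-1/2) / Gamma(j+1/2)
    total = math.erfc(math.sqrt(h))
    term = math.exp(-h) * math.sqrt(h) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= h / (j + 0.5)
    return total


def lr_test(loglik_without: float, loglik_with: float, df: int = 1):
    """Likelihood-ratio statistic and p-value for one focal term."""
    # Clamp at zero in case of tiny negative differences from numerical error.
    stat = max(0.0, 2.0 * (loglik_with - loglik_without))
    return stat, chi2_sf(stat, df)


# Example with made-up log-likelihoods for a two-parameter term.
stat, p_value = lr_test(loglik_without=-120.0, loglik_with=-117.0, df=2)
```

The resulting p-value would then be compared to "p to enter" when the term is a candidate for entry, or to "p to stay" when it is a candidate for removal.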
Click here to see how to perform stepwise regression.