Deriving Difference Datasets  

Generating a difference dataset is one way to use the "Derive new dataset" option in the Tools drop-down menu (also available from the right-click menus for geographies).  A difference dataset holds, for each location, the difference between the values in two specified datasets, or between the values of one dataset at two specified times.  You can also create a difference dataset using the "Difference" method.

Here are two ways that difference datasets are commonly used:

  • To compare the values of two different variables at a given time. For example, you can use these datasets to examine differences in lung cancer rates between women and men.  When you use the "Derive new dataset" option (rather than the Difference method), you can produce datasets with multiple times and animate them.

  • To observe how one dataset changes through time. For example, you could analyze how lung cancer incidence rates differ across years, as in the example below.  Note that in this example, rates in all polygons were higher in 1980 than in 1960, so every difference is positive.  In many cases, a difference map will contain both negative and positive values.
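The idea behind both uses can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not SpaceStat code: the function name, the location labels, and the rate values are all hypothetical.

```python
# Hypothetical rates per location (illustrative numbers, not real data).
rates_1960 = {"SEA 1": 30.0, "SEA 2": 42.0, "SEA 3": 38.0}
rates_1980 = {"SEA 1": 55.0, "SEA 2": 61.0, "SEA 3": 40.0}


def difference_dataset(first, second):
    """Subtract `second` from `first`, location by location.

    Both inputs must cover exactly the same locations, mirroring the
    requirement that both datasets be in the same geography.
    """
    if first.keys() != second.keys():
        raise ValueError("both datasets must cover the same locations")
    return {location: first[location] - second[location] for location in first}


# Change through time: 1980 minus 1960 for every location.
change = difference_dataset(rates_1980, rates_1960)

# Swapping the order keeps the magnitudes and flips every sign.
reverse_change = difference_dataset(rates_1960, rates_1980)
```

With these made-up numbers every value in `change` is positive, as in the 1980 versus 1960 example; real data will often mix positive and negative differences.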

 

How to create a difference dataset starting with two different datasets (all times):

  1. Choose "Derive new dataset" from the Tools menu, or from the menu that pops up when you right-click on a geography.

  2. To subtract one dataset from a different dataset (the example in the first bullet-point, above), first enter a name for the new dataset, and then choose the geography (both datasets must be in the same geography).  

  3. To identify a dataset in the expression, put its name in quotation marks, surround it with parentheses, and precede it with a $, for example $("RWM"). This syntax encapsulates the dataset name so that it isn't interpreted as part of a mathematical expression, even if the name includes mathematical symbols. SpaceStat automatically adds this syntax if you select datasets from the drop-down menu at the lower left.  The final dialog box for subtracting the cancer rate for white females (RWF) from the rate for white males (RWM) would look like the image below, and once you click OK, a new dataset with the name "RWM-RWF" would appear under SEAs in the Data View.

  4. Once the new dataset appears in the Data View, you can view it in the table (excerpted below) or with any of the other visualization tools.  In this example, the dataset for males was entered first, so values for females were subtracted from values for males; for these datasets, that happens to produce only positive results.  Entering the datasets in the reverse order would give the same magnitudes, with the sign of each difference reversed.  
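As a rough illustration of the syntax described in step 3, the following Python sketch assembles the text of a subtraction expression for two datasets. The helper names are hypothetical, and the use of a plain minus sign with surrounding spaces is an assumption, since the dialog box image is not reproduced here; SpaceStat itself inserts the dataset references when you pick them from the drop-down menu.

```python
def dataset_ref(name):
    """Wrap a dataset name as $("name"), the reference form described above."""
    return '$("{}")'.format(name)


def difference_expression(first, second):
    """Build an expression that subtracts `second` from `first`.

    Assumption: subtraction is written with an ordinary minus sign.
    """
    return "{} - {}".format(dataset_ref(first), dataset_ref(second))


# Males minus females, as in the RWM-RWF example.
expression = difference_expression("RWM", "RWF")

# Reversing the order flips the sign of every difference.
reversed_expression = difference_expression("RWF", "RWM")

print(expression)
print(reversed_expression)
```

Because the name sits inside quotation marks, a dataset called "RWM-RWF" could itself be referenced later without its hyphen being read as a minus sign.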

How to create a difference dataset for two times in the same dataset:

  1. Choose "Derive new dataset" from the Tools menu, or from the menu that pops up when you right-click on a geography.

  2. To subtract the values of a dataset at one time from the values of the same dataset at a different time (the example in the second bullet point and map, above), first enter a name for the new dataset, and then choose the geography that contains the dataset.  

  3. To identify a dataset in the expression, put its name in quotation marks, surround it with parentheses, and precede it with a $. This syntax encapsulates the dataset name so that it isn't interpreted as part of a mathematical expression, even if the name includes mathematical symbols. SpaceStat automatically adds this syntax if you select datasets from the drop-down menu at the lower left, but you will need to modify each reference to include the specific date and/or time that you want to use.  To indicate a date, insert the year, month, and day inside the parentheses after the dataset name, separated by commas, as shown in the example dialog box (leading zeros before single-digit months or days, e.g., "1980,01,01", are allowed but not required).  If your dataset's times are given in hours, minutes, and seconds, enter those values inside the parentheses in the same way, following a comma after the dataset name.  Finally, if your dataset includes BOTH dates and times, include both in the dataset reference, with the date first, e.g., $("RWM", 1980,1,1,12,0,0).  

  4. The final dialog box for subtracting the cancer rate for white males (RWM) in 1960 from the rate for white males in 1980 would look like the image above, and once you click OK, a new dataset with the name "RWM1980-RWM1960" would appear under SEAs in the Data View.
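The date and time syntax from step 3 can be sketched in Python as follows. The helper name is hypothetical, the minus sign is assumed to be the subtraction operator, and the choice of 1 January for each year is an assumption made for illustration; use whatever dates your dataset actually contains.

```python
def timed_ref(name, date=None, time=None):
    """Build a dataset reference such as $("RWM", 1980,1,1,12,0,0).

    `date` is an optional (year, month, day) tuple and `time` an optional
    (hours, minutes, seconds) tuple. When both are given, the date comes first.
    """
    numbers = list(date or ()) + list(time or ())
    parts = ['"{}"'.format(name)]
    if numbers:
        parts.append(",".join(str(n) for n in numbers))
    return "$({})".format(", ".join(parts))


# Same dataset at two dates (assumed here to be 1 January of each year).
expression = "{} - {}".format(
    timed_ref("RWM", date=(1980, 1, 1)),
    timed_ref("RWM", date=(1960, 1, 1)),
)

# A reference with both a date and a time, matching the example in step 3.
with_time = timed_ref("RWM", date=(1980, 1, 1), time=(12, 0, 0))

print(expression)
print(with_time)
```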

 
