Multidimensional Scaling

Multi-dimensional Scaling (MDS) is a spatial analysis method that helps visualize the similarity or dissimilarity among multiple variables or observations in a reduced number of dimensions, typically two or three. It creates a spatial configuration where points that are more similar are placed closer together, making patterns and groupings easier to identify. In Vesta, MDS can be applied to variables in a dataset to explore relationships between the variables and detect clusters or patterns over time.

MDS is applicable when you want to explore and visualize the similarity or dissimilarity among multiple variables or observations. It is especially useful when you have many variables and want to reduce their complexity to a few dimensions for easier interpretation.

Scenario example

Suppose you have a dataset with several environmental contamination variables measured across many locations, and you want to understand how similar or different these locations are based on their contamination profiles. MDS can help by placing these locations in a 2D or 3D space where distances represent their similarity: closer points mean more similar contamination profiles, while distant points are more different.

Clues that MDS is appropriate:

  • You have multiple quantitative variables and want to analyze their joint patterns
  • You want to visualize similarity or dissimilarity relationships among samples or variables
  • You need a dimension reduction technique that preserves distance or dissimilarity
  • Your data has spatial or temporal components and you want to explore grouping or trends
  • You do not have a predefined dependent variable but want to explore structure

Selecting an MDS approach

When running MDS, choosing between correlation and distance depends on your analysis goal:

  • Use correlation when you want to focus on the similarity of variable patterns regardless of their scale or magnitude. Correlation measures how variables co-vary, emphasizing their relationships rather than absolute differences. This is useful when variables are on different scales or you want to find groups based on shared trends.

  • Use distance (such as Euclidean distance) when you want to capture absolute differences between observations or variables. This considers the magnitude of differences and is appropriate when the scale of measurement matters and you want to preserve the actual distances in the reduced space.
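The difference between the two options can be sketched in a few lines of Python. This is only an illustration: the source does not state the exact formulas Vesta uses, so the `1 - r` correlation dissimilarity below is an assumed, commonly used convention, and the variable names and values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_dissimilarity(x, y):
    """Assumed convention: 1 - r (near 0 when two profiles co-vary perfectly)."""
    return 1.0 - pearson(x, y)

def euclidean_distance(x, y):
    """Straight-line distance; sensitive to the magnitude of the values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Made-up values: two variables with the same trend but very different magnitudes.
lead_ppm = [1.0, 2.0, 3.0, 4.0]
zinc_ppm = [100.0, 200.0, 300.0, 400.0]

corr_d = correlation_dissimilarity(lead_ppm, zinc_ppm)
eucl_d = euclidean_distance(lead_ppm, zinc_ppm)
print("correlation dissimilarity:", round(abs(corr_d), 6))
print("Euclidean distance:", round(eucl_d, 3))
```

The two variables rise together, so the correlation-based dissimilarity is essentially zero, while the Euclidean distance is large because the magnitudes differ by a factor of 100.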

Standardizing variables for MDS is recommended when your variables are measured on different scales or units to ensure that no single variable disproportionately influences the analysis due to its scale. Standardization puts all variables on a common scale (usually mean zero and standard deviation one), allowing MDS to capture patterns based on relative differences rather than magnitude differences.

You should choose to standardize variables when:

  • Variables have different units or ranges
  • Some variables have much larger variance than others
  • You want to treat all variables equally in the similarity/distance calculation
  • You want to focus on patterns of variation rather than absolute values

If your variables are already on comparable scales or you're interested in absolute distances, standardization might not be necessary.
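As a rough illustration of why standardization matters, the sketch below z-scores two made-up variables and compares Euclidean distances before and after. The data, the column meanings, and the use of the population standard deviation are assumptions for the example, not a description of Vesta's internals.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column: (value - column mean) / column standard deviation."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]  # population std dev; an assumed choice
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def euclidean_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Made-up observations: column 0 is pH (small range), column 1 is zinc in ppm (large range).
raw = [
    [6.0, 100.0],  # observation 0
    [8.0, 110.0],  # observation 1: very different pH, similar zinc
    [6.1, 400.0],  # observation 2: similar pH, very different zinc
]
z = standardize(raw)

raw_01 = euclidean_distance(raw[0], raw[1])
raw_02 = euclidean_distance(raw[0], raw[2])
std_01 = euclidean_distance(z[0], z[1])
std_02 = euclidean_distance(z[0], z[2])

print("raw distances:         ", round(raw_01, 2), round(raw_02, 2))
print("standardized distances:", round(std_01, 2), round(std_02, 2))
```

On the raw values, the zinc column dominates and observation 2 looks far more distant than observation 1. After standardization, the large pH difference and the large zinc difference contribute about equally.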

MDS Process

  1. Click the "Analyze" button in the sidebar menu
  2. Select the "Exploratory Spatial Analysis" category, and then "Multi-dimensional scaling"
  3. Select the dataset from the dropdown menu
  4. Select all of the desired variables for MDS analysis
  5. Select the dissimilarity type (correlation or distance)
  6. Select whether or not to standardize the variables
  7. Click "Run" to execute MDS

MDS Output

MDS analysis produces several elements in the results workspace. The report summarizes the MDS analysis performed on your dataset. It typically includes the stress or goodness-of-fit measure, which indicates how well the reduced dimensions represent the original dissimilarities; the coordinates of observations in the reduced dimensional space (e.g., Component 1, Component 2, Component 3); and possibly interpretations of the spatial configuration of points. Together these help you understand the similarity relationships among observations or variables in a simplified visual form.
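To make the idea of a stress measure concrete, here is a minimal sketch of Kruskal's stress-1, one common definition. The source does not say which goodness-of-fit formula Vesta reports, so treat this as an assumed example; the dissimilarities and configurations are made up.

```python
import math

def stress1(dissimilarities, config):
    """Kruskal's stress-1: mismatch between input dissimilarities and
    the distances between points in the low-dimensional configuration."""
    n = len(config)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    fitted = {(i, j): math.dist(config[i], config[j]) for i, j in pairs}
    num = sum((dissimilarities[i][j] - fitted[i, j]) ** 2 for i, j in pairs)
    den = sum(fitted[i, j] ** 2 for i, j in pairs)
    return math.sqrt(num / den)

# Made-up dissimilarities among three objects (a 3-4-5 triangle).
dissim = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]

good_config = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]  # 2D layout that reproduces the triangle
poor_config = [(0.0,), (3.0,), (4.0,)]              # 1D layout that cannot

stress_good = stress1(dissim, good_config)
stress_poor = stress1(dissim, poor_config)
print("stress (2D):", round(stress_good, 4))
print("stress (1D):", round(stress_poor, 4))
```

A value near zero means the plotted distances match the original dissimilarities closely; larger values mean more distortion was needed to fit the points into the reduced space.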

Why are there always 3 components? The math produces as many dimensions as there are objects, but Vesta always outputs the first 3 so you can view a 2D scatter (Component 1 vs Component 2) and still have a third dimension available. So you get 3 components whether you select 1, 2, or many variables. With only 1 or 2 variables, Component 1 often captures most of the structure; Components 2 and 3 may be much smaller or mostly noise.
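The behavior described above can be reproduced with a small sketch of classical (Torgerson) MDS. The source does not state which MDS algorithm Vesta uses, so the classical variant, the function name `classical_mds`, and the made-up data are assumptions chosen for illustration.

```python
import numpy as np

def classical_mds(dissimilarity, n_components=3):
    """Classical (Torgerson) MDS: coordinates and eigenvalues of the leading components."""
    d = np.asarray(dissimilarity, dtype=float)
    n = d.shape[0]
    # Double-center the squared dissimilarities to obtain an inner-product matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # The n x n matrix yields n eigenpairs; keep only the largest n_components.
    eigenvalues, eigenvectors = np.linalg.eigh(b)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # Scale each eigenvector by sqrt(eigenvalue); clip tiny negatives caused by round-off.
    coords = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))
    return coords, eigenvalues

# Made-up data: 6 objects measured on a single variable.
values = np.array([[1.0], [2.0], [4.0], [7.0], [11.0], [16.0]])
dissimilarity = np.abs(values - values.T)  # pairwise absolute differences

coords, eigenvalues = classical_mds(dissimilarity)
print(coords.shape)
print(np.round(eigenvalues, 4))
```

Even with a single input variable, three coordinate columns come back. Only the first eigenvalue is meaningfully different from zero here, which mirrors the note that Components 2 and 3 may be much smaller or mostly noise.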

The results scatter plot shows Component 2 (x-axis) versus Component 1 (y-axis). Objects that are similar in your variables tend to appear close together; objects that differ tend to appear farther apart.

The variance explained shows how much of the total dissimilarity information is captured by each component. Higher percentages mean the component captures more of the original data structure.
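One common way to compute such percentages is to divide each component's eigenvalue by the sum of all positive eigenvalues. The sketch below assumes that convention; the source does not confirm how Vesta computes the figure, and the eigenvalues are made up for illustration.

```python
def variance_explained(eigenvalues, n_components=3):
    """Share of the total positive eigenvalue mass captured by each leading component."""
    positive = sorted((max(ev, 0.0) for ev in eigenvalues), reverse=True)
    total = sum(positive)
    return [ev / total for ev in positive[:n_components]]

# Made-up eigenvalues, for illustration only.
example_eigenvalues = [6.0, 3.0, 0.75, 0.25, 0.0]
shares = variance_explained(example_eigenvalues)

for i, share in enumerate(shares, start=1):
    print(f"Component {i}: {share:.1%}")
```

In this example the first two components together account for most of the structure, so a 2D scatter plot would be a reasonable summary; if the shares were spread more evenly, the 2D view would hide more of the original dissimilarity information.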