Centroids and Projected Centroids

The plot of centroids labeled by variables should be interpreted in the same way as the category quantifications plot in homogeneity analysis or the multiple category coordinates in nonlinear principal components analysis. By itself, such a plot shows how well variables separate groups of objects (each centroid lies at the center of gravity of the objects in its category).
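As a minimal sketch of what "center of gravity" means here, the following Python snippet averages object scores within each category of one variable. The function name `category_centroids` and the object scores are hypothetical illustrations, not output from the example data or part of the procedure itself.

```python
import numpy as np

def category_centroids(scores, categories):
    """Return {category: mean object score} for a single variable.

    scores     : (n_objects, n_dimensions) array of object scores
    categories : length n_objects sequence of category labels
    """
    scores = np.asarray(scores, dtype=float)
    categories = np.asarray(categories)
    return {
        cat: scores[categories == cat].mean(axis=0)
        for cat in np.unique(categories).tolist()
    }

# Hypothetical two-dimensional object scores for four respondents.
scores = np.array([[1.0, 2.0], [3.0, 4.0], [-1.0, 0.0], [-3.0, -2.0]])
paper = np.array(["Telegraaf", "Telegraaf", "NRC", "NRC"])

centroids = category_centroids(scores, paper)
for label, point in centroids.items():
    print(label, point)
```

Categories whose centroids lie far apart are well separated by the solution; centroids that nearly coincide indicate categories that the solution does not distinguish.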

Notice that the categories for Age in years are not separated very clearly. The younger age categories are grouped together at the left of the plot. As suggested previously, ordinal may be too strict a scaling level to impose on Age in years.

Figure 1. Centroids labeled by variables
Scatterplot of centroids with dimension 2 on the vertical axis and dimension 1 on the horizontal axis. Each point is a category of a variable.

When you request centroid plots, the procedure also produces, for each variable, an individual plot of centroids and projected centroids labeled by value labels. The projected centroids lie on a line in the object space.

Figure 2. Centroids and projected centroids for Newspaper read most often
Scatterplot of actual and projected centroids for Newspaper read most often

The actual centroids are projected onto the vectors that are defined by the component loadings. These vectors have been added to the centroid plots to aid in distinguishing the projected centroids from the actual centroids. The projected centroids fall into one of four quadrants formed by extending two perpendicular reference lines through the origin. The interpretation of the direction of single nominal, ordinal, or numerical variables is obtained from the position of the projected centroids. For example, the variable Newspaper read most often is specified as single nominal. The projected centroids show that Volkskrant and NRC are contrasted with Telegraaf.
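The projection step can be illustrated with a short Python sketch. It assumes that, without any ordering restriction, a projected centroid is the orthogonal projection of the actual centroid onto the line through the origin spanned by the variable's component loadings. The function name `project_onto_loading` and all numbers are hypothetical.

```python
import numpy as np

def project_onto_loading(centroids, loading):
    """Orthogonally project each centroid onto the line spanned by `loading`.

    centroids : (n_categories, n_dimensions) array of actual centroids
    loading   : length n_dimensions vector of component loadings
    """
    centroids = np.asarray(centroids, dtype=float)
    loading = np.asarray(loading, dtype=float)
    # Signed position of each centroid along the loading vector.
    coeff = centroids @ loading / (loading @ loading)
    # Scale the loading vector by each position to get points on the line.
    return np.outer(coeff, loading)

# Hypothetical loadings and centroids in two dimensions.
loading = np.array([1.0, 1.0])
centroids = np.array([[2.0, 0.0], [0.0, -2.0], [1.0, 1.0]])

projected = project_onto_loading(centroids, loading)
print(projected)
```

Categories whose projections fall on opposite sides of the origin are contrasted along that variable's direction, which is how the contrast between newspapers is read from the plot.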

Figure 3. Centroids and projected centroids for Age in years
Scatterplot of actual and projected centroids for age

The problem with Age in years is evident from the projected centroids. Treating Age in years as ordinal implies that the order of the age groups must be preserved. To satisfy this restriction, all age groups below age 45 are projected onto the same point. Along the direction defined by Age in years, Newspaper read most often, and Neighborhood preference, there is no separation of the younger age groups. Such a finding suggests treating the variable as nominal.
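Why an order restriction produces ties can be sketched with weighted monotone regression (pool adjacent violators), a standard way to impose a nondecreasing order. This is an illustration under that assumption, not necessarily the exact algorithm the procedure uses; the function name `monotone_fit`, the positions, and the counts are hypothetical.

```python
def monotone_fit(values, weights):
    """Weighted nondecreasing least-squares fit (pool adjacent violators)."""
    blocks = []  # each block holds [weighted mean, total weight, size]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge the last two blocks while they violate the required order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, weight2, size2 = blocks.pop()
            mean1, weight1, size1 = blocks.pop()
            total = weight1 + weight2
            pooled = (mean1 * weight1 + mean2 * weight2) / total
            blocks.append([pooled, total, size1 + size2])
    fitted = []
    for mean, _, size in blocks:
        fitted.extend([mean] * size)
    return fitted

# Hypothetical unrestricted positions of five ordered age groups along
# one direction, with the number of respondents in each group.
positions = [0.5, 0.0, -0.5, 1.0, 2.0]
counts = [10, 10, 10, 10, 10]

fitted = monotone_fit(positions, counts)
print(fitted)
```

In this sketch the first three groups are out of order, so they are pooled to a single value, which mirrors how the younger age groups collapse onto one point under the ordinal restriction.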

Figure 4. Centroids and projected centroids for Neighborhood preference
Scatterplot of actual and projected centroids for Neighborhood preference

To understand the relationships among variables, find out what the specific categories (values) are for clusters of categories in the centroid plots. The relationships among Age in years, Newspaper read most often, and Neighborhood preference can be described by looking at the upper right and lower left of the plots. In the upper right, the age groups are the older respondents; they read the newspaper Telegraaf and prefer living in a village. Looking at the lower left corner of each plot, you see that the younger to middle-aged respondents read the Volkskrant or NRC and want to live in the country or in a town. However, separating the younger groups is very difficult.

The same types of interpretations can be made about the other direction (Music preferred, Marital status, and Pets owned) by focusing on the upper left and the lower right of the centroid plots. In the upper left corner, you see that single people tend to have dogs and like new wave music. Married people and people in the other categories of Marital status have cats; the former group prefers classical music, and the latter group does not like music.
