Principal components analysis

In ecology, ordination was developed and used to overcome the subjectivity and bias associated with direct gradient analysis (DGA)

In DGA, ordination was used to arrange species and quadrats along axes of variation which were chosen by the researcher on the basis of his/her observations relative to the "important environmental factors" to which species respond

In indirect ordination (syn. ordination, indirect gradient analysis), axes representing major directions of environmental and community variation are sought from computations on the data themselves: data are summarized and patterns are sought using species responses alone; environmental interpretation is usually a subsequent and independent stage in the analysis

Specific objectives of ordination are to:

summarize community data by producing a low-dimensional ordination space (typically of 1-3 dimensions) in which similar species and samples are close together and dissimilar entities far apart
relate species and community patterns to environmental variables

These two objectives reflect the two common philosophies about ordination:

Ordination is a technique for matrix approximation
There is an underlying (latent) structure in the data

occurrences of all species under consideration are determined by a few unknown environmental variables (latent variables) according to a simple response model

Comparison to DGA: DGA is useful when important environmental variables are readily appreciated and measured and when objectives include direct, integrated use of environmental data; if relationships are not so clear, ordination is an appropriate starting point for summarizing and displaying patterns

There are many specific techniques which fall under the general heading of ordination. We will discuss three of the most common techniques: principal components analysis (PCA), reciprocal averaging (RA), and detrended correspondence analysis (DCA)

PCA

PCA dates to 1901 (Pearson, Philosophical Magazine, 6th Series 2:559-572); it was ignored until 1933 (Hotelling 1933, J. Educ. Psych. 24:417-441,498-520)

At this time, some psychologists were seeking a single measure of intelligence, and reducing multivariate information to 1 (or even a few) axes had considerable appeal

Consider the following simple example:

Species

Quadrat	X	Y
1	3	3
2	4	1
3	1	5

n_X = n_Y = 3
ΣX = 8	ΣX² = 26	mean X = 2.667	Σx² = 4.667*	s²_X = 2.333
ΣY = 9	ΣY² = 35	mean Y = 3	Σy² = 8	s²_Y = 4
ΣXY = 18

*corrected SS: Σx² = ΣX² - (ΣX)²/n
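The summary quantities for the example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function and variable names (summarize, stats_X, and so on) are illustrative assumptions, not part of the original notes.

```python
# Abundances of species X and Y in quadrats 1-3 (from the example table).
X = [3, 4, 1]
Y = [3, 1, 5]

def summarize(values):
    """Return sum, raw SS, mean, corrected SS, and sample variance."""
    n = len(values)
    total = sum(values)
    raw_ss = sum(v * v for v in values)
    mean = total / n
    corrected_ss = raw_ss - total ** 2 / n   # SS about the mean
    variance = corrected_ss / (n - 1)
    return total, raw_ss, mean, corrected_ss, variance

stats_X = summarize(X)   # about (8, 26, 2.667, 4.667, 2.333)
stats_Y = summarize(Y)   # (9, 35, 3.0, 8.0, 4.0)
cross_product = sum(x * y for x, y in zip(X, Y))   # sum of XY = 18

print([round(s, 3) for s in stats_X])
print([round(s, 3) for s in stats_Y])
print(cross_product)
```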

We can plot these data in quadrat-dimensional space

length of vector A (species X) = |A| = √(ΣX²) = √26 ≈ 5.1

length of vector B (species Y) = |B| = √(ΣY²) = √35 ≈ 5.9

therefore, we see that SS have a direct geometric interpretation

|A| |B| cos θ = ΣXY -->

cos θ = ΣXY/{√(ΣX²) √(ΣY²)} = 18/{(√26)(√35)} = 0.60 -->

θ = 53.4°
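As a check on the geometry, the vector lengths and the angle between the two species vectors can be computed directly from the raw (uncentered) data. The sketch below assumes each species is a vector with one coordinate per quadrat; the variable names are illustrative.

```python
import math

X = [3, 4, 1]   # species A: one coordinate per quadrat
Y = [3, 1, 5]   # species B: one coordinate per quadrat

length_A = math.sqrt(sum(x * x for x in X))   # sqrt(26), about 5.1
length_B = math.sqrt(sum(y * y for y in Y))   # sqrt(35), about 5.9
dot_AB = sum(x * y for x, y in zip(X, Y))     # sum of cross products = 18

cos_theta = dot_AB / (length_A * length_B)        # about 0.60
theta_deg = math.degrees(math.acos(cos_theta))    # about 53.4 degrees

print(round(length_A, 1), round(length_B, 1))
print(round(cos_theta, 2), round(theta_deg, 1))
```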

The sums of squares and cross products (SSCP) matrix describes the lengths of the vectors and the juxtaposition of all species--it contains all information about geometric relations between species in a quadrat-dimensional space:

	A	B
A	26	18
B	18	35

This interpretation assumes we start at the origin on all axes (i.e., quadrat w/o any vegetation). We can, instead, adjust data to plot around the mean (this is called centering):

Quadrat	X'	Y'
1	3-2.667= 0.333	3-3= 0
2	4-2.667= 1.333	1-3=-2
3	1-2.667=-1.667	5-3= 2

ΣX' = 0	ΣX'² = 4.667	mean X' = 0	Σx² = 4.667	s²_x = 2.333
ΣY' = 0	ΣY'² = 8	mean Y' = 0	Σy² = 8	s²_y = 4
ΣX'Y' = -6

We can re-plot these data, thereby moving the coordinate system to center it around the data

|A| = √(ΣX'²) = √4.667 ≈ 2.16

|B| = √(ΣY'²) = √8 ≈ 2.83

cos θ = ΣX'Y'/{√(ΣX'²) √(ΣY'²)} = -6/{(√4.667)(√8)} = -0.98 -->

θ = 169°

After centering, correlation coefficient (r) has a geometric interpretation--it is the cosine of the angle between the species

Note that orthogonal (uncorrelated) = right angles (i.e., cos 90° = 0)

Perfectly correlated species have angles of 0° (perfectly positively correlated, r = +1) or 180° (perfectly negatively correlated, r = -1)
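The link between the correlation coefficient and the angle between centered species vectors can be verified numerically. The following is a minimal sketch for the three-quadrat example; center is a hypothetical helper name introduced here for illustration.

```python
import math

X = [3, 4, 1]
Y = [3, 1, 5]

def center(values):
    """Subtract the mean so the values are expressed as deviations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

Xc = center(X)   # about 0.333, 1.333, -1.667
Yc = center(Y)   # 0, -2, 2

ss_x = sum(x * x for x in Xc)                  # corrected SS, about 4.667
ss_y = sum(y * y for y in Yc)                  # corrected SS, 8
sp_xy = sum(x * y for x, y in zip(Xc, Yc))     # corrected cross product, -6

# Pearson r is the cosine of the angle between the centered vectors.
r = sp_xy / math.sqrt(ss_x * ss_y)             # about -0.98
theta_deg = math.degrees(math.acos(r))         # about 169 degrees

print(round(r, 3), round(theta_deg, 1))
```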

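We can calculate a variance-covariance matrix, SSCP/(n-1). For the centered data,

Quadrat	X'	Y'
1	0.333	0
2	1.333	-2
3	-1.667	2

SSCP =
	A	B
A	4.667	-6
B	-6	8

SSCP ÷ (n-1) = SSCP ÷ (3-1) =
	A	B
A	2.333	-3
B	-3	4

(variances on the diagonal, covariance off the diagonal)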
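The variance-covariance matrix for the example can be built from the centered SSCP matrix as follows. This is a sketch under the assumption that the data are stored as one (X, Y) pair per quadrat; sscp is a hypothetical helper name, and statistics.variance from the standard library is used only as a cross-check on the diagonal.

```python
import statistics

X = [3, 4, 1]
Y = [3, 1, 5]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n
centered = [(x - mean_x, y - mean_y) for x, y in zip(X, Y)]

def sscp(rows):
    """Sums of squares and cross products for a list of (x, y) rows."""
    sxx = sum(x * x for x, _ in rows)
    syy = sum(y * y for _, y in rows)
    sxy = sum(x * y for x, y in rows)
    return [[sxx, sxy], [sxy, syy]]

S = sscp(centered)                                  # about [[4.667, -6], [-6, 8]]
cov = [[v / (n - 1) for v in row] for row in S]     # about [[2.333, -3], [-3, 4]]

# The diagonal should match the ordinary sample variances.
print([[round(v, 3) for v in row] for row in cov])
print(round(statistics.variance(X), 3), round(statistics.variance(Y), 3))
```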

As an alternative to plotting data in quadrat-dimensional space, we can plot the data in species-dimensional space

distance²(1,origin) = 0.333² + 0² = 0.111

distance²(2,origin) = 1.333² + (-2)² = 5.778

distance²(3,origin) = (-1.667)² + 2² = 6.778

Σdistance² = 12.667 = Trace(SSCP) = 4.667 + 8

this is sometimes called the total dispersion represented in the matrix (similar quadrats --> less total dispersion)

also, the amount of total dispersion accounted for by variability in species B = 8/12.667 = 63% (i.e., there is more "spread" along species B than along species A)
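The equality between the summed squared distances of quadrats from the origin and the trace of the centered SSCP matrix can be checked with a short sketch. Variable names such as total_dispersion and share_B are illustrative assumptions.

```python
X = [3, 4, 1]
Y = [3, 1, 5]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n
centered = [(x - mean_x, y - mean_y) for x, y in zip(X, Y)]

# Squared distance of each quadrat from the origin of the centered axes.
sq_dist = [x * x + y * y for x, y in centered]   # about 0.111, 5.778, 6.778
total_dispersion = sum(sq_dist)                  # about 12.667

# Trace of the centered SSCP matrix: sum of the two corrected SS.
ss_x = sum(x * x for x, _ in centered)
ss_y = sum(y * y for _, y in centered)
trace = ss_x + ss_y

share_B = ss_y / total_dispersion                # about 0.63

print([round(d, 3) for d in sq_dist])
print(round(total_dispersion, 3), round(trace, 3), round(share_B, 2))
```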

Centering allows us to draw conclusions about species and quadrats. Anything at the point of reference (i.e., at the center) is trivial; anything that deviates from it is information.

In addition to centering, a common adjustment is to standardize

this consists of dividing the centered data by standard deviations (i.e., √variance)

Quadrat	X''	Y''
1	0.333/1.528 = 0.218	0/2 = 0
2	1.333/1.528 = 0.873	-2/2 = -1
3	-1.667/1.528 = -1.091	2/2 = 1

mean X'' = 0	ΣX''² = 2	Σx² = 2	s²_x = 1
mean Y'' = 0	ΣY''² = 2	Σy² = 2	s²_y = 1
ΣX''Y'' = -1.964

SSCP =
	A	B
A	2	-1.964
B	-1.964	2

--> SSCP/(n-1) =
	A	B
A	1	-0.982
B	-0.982	1

which is a correlation matrix: r_AB = -0.98, r_AA = 1, r_BB = 1
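The standardization step can be sketched in Python to confirm that SSCP/(n-1) of the standardized data is the correlation matrix. The helper standardize is a hypothetical name, and the sample standard deviation (divisor n-1) is assumed, as in the worked example.

```python
import math

X = [3, 4, 1]
Y = [3, 1, 5]
n = len(X)

def standardize(values):
    """Center on the mean, then divide by the sample standard deviation."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    sd = math.sqrt(sum(c * c for c in centered) / (len(values) - 1))
    return [c / sd for c in centered]

Xs = standardize(X)   # about 0.218, 0.873, -1.091
Ys = standardize(Y)   # 0, -1, 1

cross = sum(a * b for a, b in zip(Xs, Ys))   # about -1.964
sscp = [
    [sum(a * a for a in Xs), cross],
    [cross, sum(b * b for b in Ys)],
]
corr = [[v / (n - 1) for v in row] for row in sscp]   # about [[1, -0.982], [-0.982, 1]]

print([[round(v, 3) for v in row] for row in corr])
```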

Now the vectors (i.e., distance from A to origin, distance from B to origin) are the same length

they contribute equally to dispersion in matrix

Standardization, therefore, weights rare species equally w/ common species

Standardization by species implies:

All species are of equal a priori interest

Therefore, each presence of a rare species is proportionally more important than that of an abundant species

To this point, we have data centered, and possibly standardized (depending on the weight we want to give rare vs. common species)

We will consider the centered (but not standardized) data:

The next step in PCA is to fit a line through these points such that the sum of the squares of the perpendicular distances from the points to the line is minimized

Axes are rotated (in this case, 127.2°)--this is called "rigid rotation"

New axes of the coordinate system are linear combinations of the axes of the original coordinate system (i.e., species); e.g., the coordinate of the third quadrat on axis 1 is (-0.605)(-1.667) + (0.796)(2) = 2.600 (θ = 127.2°; cos θ = -0.605, sin θ = 0.796; the latter two numbers are the elements of the eigenvector for axis 1--w/ real data sets, PCA is done by eigenanalysis)
Species are also plotted in the new coordinate system

Note that centering and rotating axes has not changed relative positions of points to each other

With the centered data, total dispersion in the system is 12.667, of which 63% was explained along the species Y axis (and 37% along the species X axis)

In the new coordinate system, the first axis accounts for over 99% of the dispersion in the system:

SSCP =
	Axis 1	Axis 2
Axis 1	12.561	0
Axis 2	0	0.106

Total dispersion = 12.561 + 0.106 = 12.667; variability explained by first axis = 12.561/12.667 = 99.2%

Also, note that the cross product term is 0, indicating that the new axes are orthogonal

Whereas it required 2 axes (one of which was somewhat less than twice as "important" as the other) to fully describe the dispersion in the centered data, in the new coordinate system 99% of the dispersion is along the first axis

This becomes very important when analyzing many species and quadrats--w/ only 2 species, it is not necessary to conduct a PCA (because the gradient can be interpreted directly)
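For the two-species example, the rotation can be reproduced with the closed-form eigenanalysis of a symmetric 2 x 2 matrix. This is a sketch, not necessarily how the original figures were produced; it assumes the centered SSCP matrix is analyzed, and the sign of the eigenvector is chosen to match the worked example (either sign is a valid axis direction).

```python
import math

# Centered data from the worked example: one (X', Y') pair per quadrat.
centered = [(1 / 3, 0.0), (4 / 3, -2.0), (-5 / 3, 2.0)]

sxx = sum(x * x for x, _ in centered)   # about 4.667
syy = sum(y * y for _, y in centered)   # 8
sxy = sum(x * y for x, y in centered)   # -6

# Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
half_trace = (sxx + syy) / 2
radius = math.hypot((sxx - syy) / 2, sxy)
lambda_1 = half_trace + radius          # about 12.561
lambda_2 = half_trace - radius          # about 0.106

# Eigenvector for lambda_1 is proportional to (sxy, lambda_1 - sxx) when sxy != 0.
norm = math.hypot(sxy, lambda_1 - sxx)
v1 = (sxy / norm, (lambda_1 - sxx) / norm)          # about (-0.605, 0.796)
v2 = (-v1[1], v1[0])                                # perpendicular second axis
theta_deg = math.degrees(math.atan2(v1[1], v1[0]))  # about 127.2 degrees

# Quadrat coordinates (scores) on the rotated axes.
scores_1 = [x * v1[0] + y * v1[1] for x, y in centered]
scores_2 = [x * v2[0] + y * v2[1] for x, y in centered]

# SSCP of the scores: eigenvalues on the diagonal, zero cross product.
rotated_cross = sum(a * b for a, b in zip(scores_1, scores_2))
explained = lambda_1 / (lambda_1 + lambda_2)        # about 0.992

print(round(lambda_1, 3), round(lambda_2, 3), round(theta_deg, 1))
print(round(scores_1[2], 3), round(explained, 3))
```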

Summary of PCA algorithm:

Species are "plotted" in quadrat-dimensional space
Data are centered, and possibly standardized
A "best-fit" line is projected through the data (PCA axis 1)
Another line is fit through the data, orthogonal to the first (PCA axis 2)

and so on, for up to n-1 axes (where n=number of quadrats)
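The steps above can be sketched for any number of species. The function below is an illustration under stated assumptions, not a production implementation: pca is a hypothetical name, the input is a quadrats-by-species table, and the axes are extracted by power iteration with deflation (one simple way to carry out an eigenanalysis using only the standard library). The fixed starting vector is an arbitrary choice and could, in unlucky cases, be orthogonal to the axis being sought.

```python
import math

def pca(rows, standardize=False, iterations=1000):
    """Return (eigenvalues, eigenvectors, scores) for a quadrats-by-species table."""
    n, p = len(rows), len(rows[0])
    # Center each species (column); optionally scale to unit variance.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    data = [[r[j] - means[j] for j in range(p)] for r in rows]
    if standardize:
        sds = [math.sqrt(sum(r[j] ** 2 for r in data) / (n - 1)) for j in range(p)]
        data = [[r[j] / sds[j] for j in range(p)] for r in data]
    # SSCP matrix of the adjusted data (species by species).
    m = [[sum(r[i] * r[j] for r in data) for j in range(p)] for i in range(p)]
    eigenvalues, eigenvectors = [], []
    for _axis in range(min(p, n - 1)):
        # Power iteration converges on the direction of greatest remaining spread.
        v = [float(j + 1) for j in range(p)]   # arbitrary starting vector (assumption)
        for _step in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        lam = sum(v[i] * m[i][j] * v[j] for i in range(p) for j in range(p))
        eigenvalues.append(lam)
        eigenvectors.append(v)
        # Deflation removes the fitted axis so the next one is orthogonal to it.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(r[j] * v[j] for j in range(p)) for v in eigenvectors] for r in data]
    return eigenvalues, eigenvectors, scores

eigenvalues, eigenvectors, scores = pca([[3, 3], [4, 1], [1, 5]])
print([round(e, 3) for e in eigenvalues])
print([[round(s, 3) for s in row] for row in scores])
```

For the three-quadrat example this should recover the same two axes as the rotation described earlier, up to the sign of each axis, which is arbitrary in PCA.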

Original axes vs. PCA axes:

Arrangement of points never changes; only the axes change (standardizing changes arrangement of points)
Angular relations between points as viewed from the origin are unaltered by the second transformation (rigid rotation), but they are changed by the first transformation (centering)
Original axes have a simple meaning: abundances of individual species. PCA axes have a complex meaning: linear combinations of abundances {sum of the abundance times the eigenvector (sine or cosine of the angle) for each species}.
PCA axes concentrate variance or structure of the point configuration into relatively few axes, in contrast to the high dimensionality of the original data.

Note that PCA involves analysis of community data alone-- environmental data are not included. Thus, environmental interpretation of PCA results is a separate step.
