
How Cladists Work

by Albert J. Klee, Ph.D.

THE IDEA BEHIND CLADISTICS

It is generally understood that classifications ought to reflect the phylogeny of organisms, so that each taxon (a genus, for example, is a taxon) originates from a single ancestral form. "Cladism," or "cladistic taxonomy," which emerged in the 1960s, is therefore expected to supplant Linnaean classification in the future since, in classifying species, cladists place a priority on achieving coherence with the Darwinian principle of common descent. In grouping species, cladists look for "derived similarities," meaning those features that species can be expected to share by virtue of a common evolutionary ancestry. This approach differs from that of phenetics, which groups species by overall similarity and does not address ancestry; that was the way aquarium species were classified in the old days. It also differs from classification based on ad hoc "key characters," a device familiar to aquarists who have looked up an identification in a book built on such keys.

Cladists, on the other hand, avail themselves of all the types of evidence available, including DNA sequences and hybridization studies, biochemistry, and traditional morphology. They often use computer algorithms to identify the most likely phylogeny, or "family tree," relating the species under consideration. A cladistic analysis is therefore applied to a definite set of information, and to organize this information a distinction is made between characters and character states.

Consider the color of fish scales; this may be blue in one species but red in another. Thus, "red fish scales" and "blue fish scales" are two character states of the character "scale-color." The researcher decides which character states were present before the last common ancestor of the species group, and which were present in the last common ancestor, by considering one or more outgroups. An outgroup is an organism that is considered not to be part of the group in question but is closely related to it. Choosing an outgroup is therefore an important task, since the choice can profoundly change the topology of a tree. Note that only character states present in the last common ancestor are of use in characterizing clades.
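Outgroup comparison can be sketched in a few lines of Python. This is only an illustration under simple assumptions: every character has exactly two states, the state seen in the outgroup is taken to be the ancestral one, and the species names, characters, and states are all invented for the example.

```python
# Hypothetical character matrix: one row per species, one entry per character.
# All names and states are invented for illustration.
matrix = {
    "outgroup":  {"scale_color": "blue", "fin_spines": "absent",  "barbels": "absent"},
    "species_A": {"scale_color": "red",  "fin_spines": "present", "barbels": "absent"},
    "species_B": {"scale_color": "red",  "fin_spines": "present", "barbels": "present"},
    "species_C": {"scale_color": "blue", "fin_spines": "present", "barbels": "present"},
}


def polarize(matrix, outgroup="outgroup"):
    """Label each ingroup state 'ancestral' (same as outgroup) or 'derived'."""
    reference = matrix[outgroup]
    return {
        species: {
            char: "ancestral" if state == reference[char] else "derived"
            for char, state in states.items()
        }
        for species, states in matrix.items()
        if species != outgroup
    }


def shared_derived(matrix, outgroup="outgroup"):
    """For each character, list the ingroup species showing the derived state.

    Assumes two states per character, so all 'derived' species share one state.
    """
    polarity = polarize(matrix, outgroup)
    return {
        char: sorted(sp for sp, labels in polarity.items() if labels[char] == "derived")
        for char in matrix[outgroup]
    }


groups = shared_derived(matrix)
for char, species in groups.items():
    print(char, "->", species)
```

In this made-up data set the derived scale color points to a group containing species A and B, while the derived barbels point to B and C. Conflicts of this kind are what the tree-building step has to resolve.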

CLADOGRAMS

Cladistics provides systematists with one very handy feature. Because its character states are binary (a species has a given character or it doesn't - there's no in between), cladistics lends itself very readily to computer analysis. Computers can analyze hundreds or thousands of characters in a fraction of the time a human systematist would need. They then use the results to generate branching diagrams - called "cladograms" - which graphically represent how the included species are interrelated. The following is an example of a cladogram:

Clades ideally have many "agreeing" character states that were present in the last common ancestor, enough of them to overwhelm characters caused by convergent evolution (i.e., characters that resemble each other because of environmental conditions or function, not because of common ancestry). A well-known example of convergent evolution is the character "presence of wings." Though the wings of birds and insects serve the same function, each evolved independently, as their anatomy shows. If a bird and a winged insect were both scored for the character "presence of wings," the shared score would confound the analysis, possibly resulting in a false picture of evolution.

HOW CLADOGRAMS ARE OBTAINED

Many cladograms are possible for any given set of taxa, but one is chosen based on the principle of parsimony: the preferred arrangement is the most compact one, i.e., the one requiring the fewest character state changes. Though at one time this analysis was done by hand, computers are now used to evaluate much larger data sets. Sophisticated software packages also allow a statistical evaluation of how much confidence we can place in the nodes of a cladogram.
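To make "fewest character state changes" concrete, here is a minimal Python sketch that counts changes on each candidate tree and keeps the shortest. It uses Fitch's counting method, which is one common way to score a tree under parsimony and is not necessarily what any particular cladistics package does. The taxa, the 0/1 data, and the function names are hypothetical.

```python
def fitch(tree, states):
    """Return (possible_states, changes) for one character on a rooted binary tree.

    tree   -- a leaf name (str) or a pair (left, right) of subtrees
    states -- dict mapping each leaf name to its state for this character
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The two branches can agree, so no extra change is needed at this node.
        return common, left_changes + right_changes
    # The branches disagree: count one change and keep both possibilities.
    return left_set | right_set, left_changes + right_changes + 1


def tree_length(tree, matrix):
    """Total number of state changes over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Hypothetical data: five binary characters scored for an outgroup and taxa A, B, C.
matrix = {
    "out": (0, 0, 0, 0, 0),
    "A":   (1, 1, 1, 0, 0),
    "B":   (1, 1, 1, 1, 0),
    "C":   (1, 0, 0, 1, 1),
}

# The three possible rooted arrangements of A, B and C, with the outgroup at the root.
candidates = {
    "((A,B),C)": ("out", (("A", "B"), "C")),
    "((A,C),B)": ("out", (("A", "C"), "B")),
    "((B,C),A)": ("out", (("B", "C"), "A")),
}

lengths = {name: tree_length(tree, matrix) for name, tree in candidates.items()}
best = min(lengths, key=lengths.get)
print(lengths, "most parsimonious:", best)
```

With only three ingroup taxa every arrangement can be checked. The number of possible trees grows very quickly as taxa are added, which is why real analyses rely on search strategies rather than exhaustive comparison.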

X        Computation          P(X)
0 0 0    (3/4)*(3/4)*(3/4)    0.4218
0 0 1    (3/4)*(3/4)*(1/4)    0.1406
0 1 0    (3/4)*(1/4)*(3/4)    0.1406
0 1 1    (3/4)*(1/4)*(1/4)    0.0468
1 0 0    (1/4)*(3/4)*(3/4)    0.1406
1 0 1    (1/4)*(3/4)*(1/4)    0.0468
1 1 0    (1/4)*(1/4)*(3/4)    0.0468
1 1 1    (1/4)*(1/4)*(1/4)    0.0156

 

p      Computation    L
0.0    0*1*0          0
0.1    0.1*0.9*0.1    0.009
0.2    0.2*0.8*0.2    0.032
0.3    0.3*0.7*0.3    0.063
0.4    0.4*0.6*0.4    0.096
0.5    0.5*0.5*0.5    0.125
0.6    0.6*0.4*0.6    0.144
0.7    0.7*0.3*0.7    0.147
0.8    0.8*0.2*0.8    0.128
0.9    0.9*0.1*0.9    0.081
1.0    1*0*1          0

Parsimony is only one of several methods for inferring a phylogeny from molecular data; maximum likelihood and Bayesian inference, which incorporate explicit models of sequence evolution, are also used to evaluate sequence data. As you might expect, these methods are difficult to explain to those not versed in statistics or mathematics, but I'll give it a try.

The maximum likelihood method is based on a model and on a distribution. The model gives the probability of an event as a function of a model parameter, p. The likelihood of the parameter given the data is the probability of observing the data, X, given p. The method consists of maximizing this likelihood function: the goal is to find the value of p that makes the observed data X most probable. In this example I'll use the binomial distribution, often used to calculate the probability of the outcomes of coin tossing. Suppose we toss a coin three times, where the probability of observing a head (1) is p = 1/4 and the probability of observing a tail (0) is 1 - p = 3/4. The first table above lists the probability of each possible data set (X) of three tosses.
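The same enumeration can be sketched in Python. The snippet assumes independent tosses with a head probability of 1/4, as in the text; the names `P_HEAD` and `sequence_probability` are made up for the illustration.

```python
from itertools import product

P_HEAD = 0.25  # probability of a head (1); a tail (0) then has probability 0.75


def sequence_probability(outcome, p_head):
    """Probability of one exact sequence of independent tosses (1 = head, 0 = tail)."""
    prob = 1.0
    for toss in outcome:
        prob *= p_head if toss == 1 else 1.0 - p_head
    return prob


# Every possible data set X of three tosses, in the order 000, 001, ..., 111.
table = {x: sequence_probability(x, P_HEAD) for x in product((0, 1), repeat=3)}

for x, prob in table.items():
    print(*x, prob)

# The eight data sets are the only possibilities, so their probabilities sum to 1.
total = sum(table.values())
print("total:", total)
```

Checking that the probabilities sum to 1 is a quick way to confirm that no data set has been left out.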

For example, in three tosses of a coin, the probability of obtaining a head followed by two tails (the fifth entry in the first table) is 1/4*3/4*3/4 = 0.1406. Conversely, if we observe the outcome of three throws as "1 0 1", we can compute the likelihood, L, of each candidate value of p that might have produced this outcome, as in the second table above.
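As a minimal sketch, the likelihood can be written as a Python function of p and evaluated over a grid of candidate values. The function name, the grids, and the step sizes are illustrative choices. The closed form at the end (heads observed divided by tosses made) is the standard maximum likelihood estimate for independent tosses.

```python
OBSERVED = (1, 0, 1)  # head, tail, head


def likelihood(p, observed=OBSERVED):
    """Likelihood of head probability p, given an observed sequence of tosses."""
    result = 1.0
    for toss in observed:
        result *= p if toss == 1 else 1.0 - p
    return result


# Coarse grid in steps of 0.1, matching the second table.
coarse = [k / 10 for k in range(11)]
for p in coarse:
    print(p, round(likelihood(p), 3))
best_coarse = max(coarse, key=likelihood)

# A finer grid in steps of 0.001 narrows down where the likelihood peaks.
fine = [k / 1000 for k in range(1001)]
best_fine = max(fine, key=likelihood)

# Closed form for independent tosses: heads observed / tosses made.
closed_form = sum(OBSERVED) / len(OBSERVED)

print("best on coarse grid:", best_coarse)
print("best on fine grid:", best_fine)
print("closed form:", closed_form)
```

On the coarse grid the largest likelihood falls at 0.7, whereas the finer grid and the closed form both point to a value near 2/3. The tabulated answer therefore depends on how finely p is sampled.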

For example, in three tosses of a coin, if the probability of obtaining a head is 0.8, the probability of obtaining a head, then a tail, and then a head (the ninth entry in the second table) is 0.8*0.2*0.8 = 0.128.

Among the tabulated values, "1 0 1" is most likely for p = 0.7; the exact maximum of the likelihood lies at p = 2/3, the observed proportion of heads. Substitute a set of observations on a group of fishes for the set of observations on coin tosses, and substitute a postulated clade for the probability in the second table, and you have the basic idea of one way cladists form their cladograms using the maximum likelihood principle.

CONCLUSION

Given differences in the specialties and biases of individual cladists, it is hardly surprising that different characters, weighted differently, would result in different cladograms. But new techniques in comparative morphology, molecular genetics, and refinements in cladistic methods are beginning to produce a fair measure of agreement among the cladograms proposed by various workers. When several independent cladistic studies - using different character suites - are largely in agreement, we can be reasonably confident that the cladograms accurately reflect real evolutionary pathways.

This article first appeared in PVAS’s Delta Tale, Vol 35 # 2.

 
© 2007-2012 Potomac Valley Aquarium Society, Inc.
