Molecular versus Morphological Data in Systematic Studies

NOTE: These are lecture notes for Biology 391, Organic Evolution, at The University of Tennessee at Martin.  Anyone outside of UT Martin wishing to use these notes or to contact me for additional information should first read the information obtained by clicking here.

Goals: The goals of this lecture are to introduce the use of molecular data in systematic analysis, and begin a discussion of the pros and cons of using molecular versus morphological (structural) data in systematics.

Related Textbook Material: Freeman and Herron (2001) Chapter 13

Lab Manual Questions over this material are in Lab Manual Chapter XII


The Lecture:


In the previous lecture, and in your exercise on toucan phylogeny, you were introduced to systematics, the study of phylogenetic relationships. So far our examples have used characteristics based on morphology, that is, structural aspects of organisms such as bone structure, organs, plumage color, etc. Many modern studies of systematics continue to use such characteristics, but in addition modern systematists use molecules, especially DNA, to study phylogenetic relationships.

In principle, we can use aspects of DNA to study systematics the same way we use aspects of morphology. Suppose, for example, that we determine the sequence of base pairs of the same gene in several ingroup species and in an outgroup. Suppose we find, reading just one DNA strand, the following:
outgroup:   AACTGAGCT
species 1:  ATCTGAGGG
species 2:  ATCTGAGGT
species 3:  AACTGAGCT

We can use each base position in the gene as a character, and we can use the specific base that occurs there (A,T,C, or G) as a character state. So for this example, for the first base position, all species have A. Since all species are the same, this is not useful for studying phylogeny. For the second position, the outgroup and species 3 both have A, but species 1 and 2 have T. Using outgroup comparison, we determine that for this position, A is primitive and T is derived. Since species 1 and species 2 share the derived state of T, this is evidence that they are more closely related to each other than to species 3. Note that this is exactly the same method as the one you have already used to study phylogeny -- you are using cladistics and determining derived states based on outgroup comparison. In this case, your characters are positions in genes and your character states, which can be primitive or derived, are the specific bases that occur at these positions. Note that in the sequence above there is one other phylogenetically informative position; can you find it? There is also one place where you can find a derived state but where it is NOT phylogenetically informative; can you find that?
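
The outgroup comparison described above can also be written out as a short program. The following is only a minimal sketch in Python, not a standard tool: the function name classify_sites and the three labels are made up for illustration, and a position is called "informative" here only when its derived base is shared by at least two, but not all, of the ingroup species. Try the questions above by hand first, then run the sketch to check your answers.

```python
# Sequences copied from the example above (one DNA strand each).
OUTGROUP = "AACTGAGCT"
INGROUP = {
    "species 1": "ATCTGAGGG",
    "species 2": "ATCTGAGGT",
    "species 3": "AACTGAGCT",
}


def classify_sites(outgroup, ingroup):
    """Label each base position using outgroup comparison.

    Returns a list of (position, label, derived) tuples. Positions are
    counted from 1. `derived` maps each derived base to the list of
    ingroup species that carry it. The labels are illustrative names,
    not standard terminology.
    """
    results = []
    for i, primitive in enumerate(outgroup):
        # The base found in the outgroup is taken as the primitive state.
        derived = {}
        for name, seq in ingroup.items():
            if seq[i] != primitive:
                derived.setdefault(seq[i], []).append(name)
        if not derived:
            label = "invariant"
        elif any(1 < len(names) < len(ingroup) for names in derived.values()):
            # A derived base shared by some, but not all, ingroup species.
            label = "informative"
        else:
            label = "derived but uninformative"
        results.append((i + 1, label, derived))
    return results


for position, label, derived in classify_sites(OUTGROUP, INGROUP):
    print(position, label, derived)
```

The sketch treats every position independently, exactly as we do when we score characters by hand; it does not build a tree.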

When people first started using molecular data, especially DNA sequences such as the ones above, they thought DNA would allow us to determine the phylogeny of all life. They also thought it would always be superior to morphology, since looking at DNA means looking directly at the genes, and the genes should provide the best evidence of relationships. Since then, however, people have discovered that there are advantages and disadvantages to using morphology and to using DNA data for systematics, and have come to recognize that the best studies of phylogeny will be based on a variety of different kinds of data. For the rest of this lecture, we will consider the pros and cons of morphology versus DNA in phylogenetic analysis. As we do this, we will also consider how DNA evolves, and how DNA evolution makes some aspects of DNA very useful for phylogenetic analysis (systematics) while others cause problems.

Pros and cons of using morphology and DNA in phylogenetic analysis

  1. Genetics: when we study DNA, we are looking directly at the genetic material. We know that what we are looking at is inherited; DNA in modern organisms has been passed down from ancestral organisms, so DNA should reflect ancestry and be reliable for studying phylogenetic relationships. In contrast, when we look at a structure of an organism, we don't know exactly how it is inherited. It is possible that some aspects of the structure may depend on direct environmental influences and not be inherited at all. Since we want to study inherited characteristics to study phylogenetic relationships, this could make morphology less reliable for phylogenetic studies.
  2. Existence of variation: one of the problems with studies of morphology is that we may be looking at a group of species that are extremely morphologically similar. For example, people studying some groups of frogs have complained that in terms of skeletal structure, muscles, and organs, one species is almost exactly like another. Such similarities make it very difficult to find a large enough number of characters to do a reliable study of systematics (remember that some characters may show convergent evolution, so we need a large number of characters to be confident that most of them reflect homology). In contrast, DNA sequences have turned out to be variable among species, so it has generally been possible to find a large number of characters (base pair positions) that have phylogenetically informative variation. Thus, in this aspect, DNA is superior to morphology.
  3. Cost: it is much less expensive to study morphology than DNA. This means that for the same cost we may be able to study more aspects of body structure than we can genes. In this case, morphology is superior to DNA, although DNA technology becomes less expensive and faster all the time, so this advantage to morphology is becoming less important.
  4. Ability to study characters that reflect a variety of different genes: when we study morphology, we can study different aspects of the body that are likely coded by many different genes. When we study DNA, we often have the technology and money to study only a single gene or a small number of genes. A potential problem with studying just one gene is that evolution at different base pairs in the same gene may be related so what we are calling separate characters may evolve in a way that is correlated -- so if convergent evolution occurs at one site it may also occur at another, and make our phylogeny less reliable. If we study traits coded by many genes, it is less likely that their evolution will be correlated. In this case, studies of morphology may be superior to our current studies of DNA since our current studies of DNA are typically based on just one gene.
  5. The Fossil Record: we can include fossil species in studies based on morphology, but, with very rare exceptions, we cannot get DNA from fossils. This means studies based on morphology can include more species. This may allow us to find outgroups that are more closely related to the ingroup and therefore more reliable, and to have a better representation of the ingroup. Both of these can make our determination of primitive and derived states, and therefore our phylogeny, more reliable. So for this reason, studies based on morphology can be superior to those based on molecules, if we are looking at a group for which a good fossil record exists.
  6. Determination of Homology: determining which traits are like each other and may be homologous can be a problem for both morphology and for DNA data. For morphological data, because we do not know which genes are involved in coding for particular structures, we might call similar-looking structures the same, and consider them homologous, when they are really coded by different genes and should be considered different. This could make the phylogeny unreliable. Unfortunately, similar problems can arise with DNA data. In the evolution of DNA, a fairly common occurrence is for a gene to be duplicated -- instead of one copy of the gene at one place in the chromosomes, there are then two copies of the gene at different places in the chromosomes. Both copies of the gene then evolve. When we study genes, we might accidentally compare the original form of the gene in one species with the duplicate form of the gene in another species. These aren't really homologous. Another problem with determining DNA homology is that there are often mutations that insert several base pairs into a gene or that delete several base pairs from the gene. The result of this is that the same gene can differ in length from one species to another. When this has happened, it can be hard to line up the genes correctly to be sure that we are really comparing the same position in the gene of one species with the gene of another species. This problem, called the problem of alignment, also makes it hard to determine DNA homology. Thus, determining the traits that may be homologous and are appropriate to compare can be a problem for both DNA information and for morphology.
  7. Convergent evolution: remember that convergent evolution is a problem for phylogenetic analysis. It suggests that species are closely related, because they have similar characteristics, when in fact those species are NOT closely related. Both morphological data and DNA data are subject to this problem of convergent evolution. Morphology is subject to natural selection. When species independently occur in similar environments, they are likely to evolve similar traits through natural selection. For example, dolphins and fish evolved independently in water but have many traits in common; these traits evolved independently through natural selection because they improve the ability to swim -- they are adaptations to living in the water. So we can expect convergence, caused by natural selection, in morphological traits. For reasons that we will discuss in the next lecture, much of the DNA may be neutral with regard to natural selection, which means that the different DNA forms in a population all have the same fitness and so do not evolve through natural selection. For DNA for which this is true, natural selection cannot cause convergent evolution since natural selection is not affecting this DNA. This could be a benefit to studying phylogeny based on DNA, but, unfortunately, DNA data can show convergent evolution just by chance. Many mutations that occur substitute one base pair for another in a DNA sequence (for example, a T might be substituted for a C in some gene). These are the kinds of trait most commonly studied in phylogenetic analyses based on DNA sequences. They are subject to convergence because there are only four possible bases (A, T, C, and G), so it is not unlikely that, by chance, different species that both have a mutation at some site could have the same mutation and independently evolve the same base pair at the same site. This problem of chance convergence has been commonly observed in phylogenetic studies based on DNA.
  8. The rate of evolution: some people have argued that the same gene in different species should evolve at the same rate. If it is true, this turns out to be an advantage in studies of phylogeny based on DNA. We will consider this in much more detail in the next lecture.
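
The alignment problem described in item 6 can be seen in a small made-up example. In the Python sketch below, the sequences are hypothetical: species B is simply species A with its third base deleted. The function name count_differences and the use of "-" to mark a gap are assumptions made for illustration, and the gap is placed by hand, which is not how alignment programs work.

```python
# Hypothetical sequences: species B is species A with its third base deleted.
SPECIES_A = "ATCTGAGGT"
SPECIES_B_UNALIGNED = "ATTGAGGT"
# The same species B sequence with a gap ("-") inserted by hand
# at the position of the deleted base.
SPECIES_B_ALIGNED = "AT-TGAGGT"


def count_differences(seq_a, seq_b, gap="-"):
    """Compare two sequences position by position.

    Returns (mismatches, gaps). Positions past the end of the shorter
    sequence are ignored.
    """
    mismatches = 0
    gaps = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            gaps += 1
        elif a != b:
            mismatches += 1
    return mismatches, gaps


print("without a gap:", count_differences(SPECIES_A, SPECIES_B_UNALIGNED))
print("with a gap:   ", count_differences(SPECIES_A, SPECIES_B_ALIGNED))
```

Without the gap, every position after the deletion is compared with the wrong position in the other species, so a single deletion shows up as five apparent base differences. With the gap in place there are no base differences at all, only the one gapped position.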

At this point, you should recognize that there are advantages and disadvantages to both morphological and molecular studies of phylogeny. Many people currently argue that to really know the phylogenetic tree of life, we need to study both kinds of characteristics.
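
To see why chance convergence in DNA (item 7 above) is "not unlikely," consider a rough calculation. The Python sketch below assumes that when a base is substituted, each of the other three bases is equally likely to replace it. That is a simplifying assumption made only for illustration, and the function name chance_of_same_substitution is made up.

```python
from fractions import Fraction
from itertools import product

BASES = "ACGT"


def chance_of_same_substitution(ancestral):
    """Chance that two independent substitutions at one site match.

    Assumes (for illustration only) that each of the three bases other
    than the ancestral one is equally likely to be substituted.
    """
    alternatives = [base for base in BASES if base != ancestral]
    # Every pairing of the new base in lineage 1 with the new base in lineage 2.
    outcomes = list(product(alternatives, repeat=2))
    matching = sum(1 for first, second in outcomes if first == second)
    return Fraction(matching, len(outcomes))


print(chance_of_same_substitution("C"))
```

Under this assumption, two species that each independently have a substitution at the same site end up with the same new base in 3 of the 9 possible outcomes, or one time in three.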
