Molecular versus Morphological Data in Systematic Studies
NOTE: These are lecture notes for Biology 391, Organic Evolution, at The University
of Tennessee at Martin. Anyone outside of UT Martin wishing to
use these notes or to contact me for additional information should first
read the information obtained by clicking here.
Goals: The goals of this lecture are to introduce the use of
molecular data in systematic analysis, and begin a discussion of the pros
and cons of using molecular versus morphological (structural) data in systematics.
Related Textbook Material: Freeman and Herron (2001) Chapter 13
Lab Manual Questions over this material are in Lab Manual Chapter XII
The Lecture:
In the previous lecture, and in your exercise on toucan phylogeny,
you have been introduced to systematics, the study of phylogenetic relationships.
So far our examples have used characteristics that are based
on morphology, that is, structural aspects of organisms such as
bone structure, organs, plumage color, etc. Many modern studies of systematics
continue to use such characteristics, but in addition modern systematists
use molecules, especially DNA, to study phylogenetic relationships.
In principle, we can use aspects of DNA to study systematics the same
way we use aspects of morphology. Suppose for example that we determine
the sequence of base pairs of the same gene in several species of ingroup
and in an outgroup. Suppose we find, reading just one DNA strand, the following:
| outgroup:  | AACTGAGCT |
| species 1: | ATCTGAGGG |
| species 2: | ATCTGAGGT |
| species 3: | AACTGAGCT |
We can use each base position in the gene as a character, and we can
use the specific base that occurs there (A, T, C, or G) as a character state.
So for this example, for the first base position, all species have A. Since
all species are the same, this is not useful for studying phylogeny. For
the second position, the outgroup and species 3 both have A, but species
1 and 2 have T. Using outgroup comparison, we determine that for this position,
A is primitive and T is derived. Since species 1 and species 2 share the
derived state of T, this is evidence that they are more closely related
to each other than to species 3. Note that this is exactly the same method
as the one you have already used to study phylogeny -- you are using cladistics
and determining derived states based on outgroup comparison. In this case,
your characters are positions in genes and your character states, which
can be primitive or derived, are the specific bases that occur at these
positions. Note that in the sequence above there is one other phylogenetically
informative position; can you find it? There is also one place where you
can find a derived state but where it is NOT phylogenetically informative;
can you find that?
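The outgroup comparison described above can also be sketched in a few lines of Python. This is only an illustration, not part of the lab exercise: the function name `classify_positions` and the three labels are made up here, and the sketch assumes that a position counts as phylogenetically informative when a derived base is shared by at least two, but not all, of the ingroup species. Because it reports the answers to the two questions above, try to find them by hand first.

```python
# Sequences copied from the example above (one DNA strand per taxon).
OUTGROUP = "AACTGAGCT"
INGROUP = {
    "species 1": "ATCTGAGGG",
    "species 2": "ATCTGAGGT",
    "species 3": "AACTGAGCT",
}


def classify_positions(outgroup, ingroup):
    """Label each base position using outgroup comparison.

    Returns a list of (position, label, taxa_with_derived_state) tuples,
    with positions numbered from 1 as in the lecture text.
    """
    results = []
    for i, primitive in enumerate(outgroup):
        # Group the ingroup taxa by the derived base they carry at this site.
        derived = {}
        for name, seq in ingroup.items():
            if seq[i] != primitive:
                derived.setdefault(seq[i], []).append(name)

        if not derived:
            label = "no derived state"
        elif any(1 < len(taxa) < len(ingroup) for taxa in derived.values()):
            # Assumption for this sketch: a derived base shared by some,
            # but not all, ingroup taxa is evidence of a closer relationship.
            label = "informative (shared derived state)"
        else:
            label = "derived but not informative"

        carriers = sorted(name for taxa in derived.values() for name in taxa)
        results.append((i + 1, label, carriers))
    return results


for position, label, carriers in classify_positions(OUTGROUP, INGROUP):
    print(position, label, ", ".join(carriers))
```

Each printed line gives the base position, its label, and the species that carry a derived base there.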
When people first started using molecular data, especially DNA sequences
such as the ones noted above, they thought DNA would provide evidence that
would allow us to determine the phylogeny of all life, and that it would
always be superior to morphology since when we look at DNA we're looking
directly at the genes and the genes should provide the best evidence of
relationships. Since then, however, people have discovered that there are
advantages and disadvantages to using morphology and to using DNA data
for systematics, and have come to recognize that the best studies of phylogeny
will be based on a variety of different kinds of data. For the rest of
this lecture, we will consider the pros and cons of morphology versus DNA
in phylogenetic analysis. As we do this, we will also consider how DNA
evolves, and how DNA evolution results in some aspects of DNA that make
it very useful for phylogenetic analysis (systematics) while others cause
problems.
Pros and cons of using morphology and DNA in phylogenetic analysis
-
Genetics: when we study DNA, we are looking directly at the genetic
material. We know that what we are looking at is inherited; DNA in modern
organisms has been passed down from ancestral organisms, so DNA should reflect
ancestry and be reliable for studying phylogenetic relationships. In contrast,
when we look at a structure of an organism, we don't know exactly how it
is inherited. It is possible that some aspects of the structure may depend
on direct environmental influences and not be inherited at all. Since we
want to study inherited characteristics to study phylogenetic relationships,
this could make morphology less reliable for phylogenetic studies.
-
Existence of variation: one of the problems with studies of morphology
is that we may be looking at a group of species that are extremely morphologically
similar. For example, people studying some groups of frogs have complained
that in terms of skeletal structure, muscles, and organs, one species is
almost exactly like another. Such similarities make it very difficult to
find a large enough number of characters to do a reliable study of systematics
(remember that some characters may show convergent evolution, so we need
a large number of characters to say reliably that most of them show
homology). In contrast, DNA sequences
have turned out to be variable among species so that it has generally been
possible to find a large number of characters (base pair positions) that
have phylogenetically informative variation. Thus, in this aspect, DNA
is superior to morphology.
-
Cost: it is much less expensive to study morphology than DNA. This
means that for the same cost we may be able to study more aspects of body
structure than we can genes. In this case, morphology is superior to DNA,
although DNA technology becomes less expensive and faster all the time,
so this advantage to morphology is becoming less important.
-
Ability to study characters that reflect a variety of different genes:
when we study morphology, we can study different aspects of the body that are
likely coded by many different genes. When we study DNA, we often have
the technology and money to study only a single gene or a small number
of genes. A potential problem with studying just one gene is that different
base pairs in the same gene may not evolve independently, so what we are
calling separate characters may actually be correlated. If convergent
evolution occurs at one site, it may then also occur at another
and make our phylogeny less reliable. If we study traits coded by many
genes, it is less likely that their evolution will be correlated. In this
case, studies of morphology may be superior to our current studies of DNA
since our current studies of DNA are typically based on just one gene.
-
The Fossil Record: we can include fossil species in studies based
on morphology, but, with very rare exceptions, we cannot get DNA from
fossils. This means studies based on morphology can include more species.
This may allow us to find outgroups that are more closely related to the
ingroup and therefore more reliable, and to have a better representation
of the ingroup. Both of these can make our determination of primitive and
derived states, and therefore our phylogeny, more reliable. So for this
reason, studies based on morphology can be superior to those based on molecules,
if we are looking at a group for which a good fossil record exists.
-
Determination of Homology: determining which traits are like each
other and may be homologous can be a problem for both morphology and for
DNA data. For morphological data, because we do not know which genes are
involved in coding for particular structures, we might call similar looking
structures the same, and consider them homologous, when they are really
coded by different genes and should be considered different. This could
make the phylogeny unreliable. Unfortunately, similar problems can arise
with DNA data. In the evolution of DNA, a fairly common occurrence is for
a gene to be duplicated -- instead of one copy of this gene at one place
in the chromosomes, when the gene is duplicated there are then two copies
of the gene at different places in the chromosomes. These duplicated copies
of the gene both then evolve. When we study genes, we might accidentally
compare the original form of the gene in one species with the duplicate
form of the gene in another species. Such copies, called paralogs, trace
back to the duplication rather than to the splitting of the species, so
they are not the appropriate genes to compare. Another
problem with determining DNA homology is that there are often mutations
that insert several base pairs into a gene or that delete several base
pairs from the gene. The result of this is that the same gene in different
species has different lengths. When this has happened, it can be hard to
line up the genes correctly to be sure that we are really comparing the
same position in the gene of one species with the gene of another species.
This problem, called the problem of alignment, also makes it hard
to determine DNA homology. Thus, determining the traits that may be homologous
and are appropriate to compare can be a problem for both DNA information
and for morphology.
-
Convergent evolution: remember that convergent evolution is a problem
for phylogenetic analysis. It suggests that species are closely related, because
they have similar characteristics, when in fact those species are NOT closely related.
Both morphological data and DNA data are subject to this problem of convergent
evolution. Morphology is subject to natural selection. When species independently
occur in similar environments, they are likely to evolve similar traits
through natural selection. For example, dolphins and fish evolved independently
in water but have many traits in common. These traits improve the ability to swim,
and they evolved independently in each group through natural selection -- they
are adaptations to living in the water. So we can expect convergence, caused
by natural selection, in morphological traits. For reasons that we will
discuss in the next lecture, much of the DNA may be neutral with regard
to natural selection, which means that the different DNA forms in a
population all have the same fitness and so do not evolve through natural selection.
For DNA for which this is true, natural selection cannot cause convergent
evolution since natural selection is not affecting this DNA. This could
be a benefit to studying phylogeny based on DNA, but, unfortunately, DNA
data can show convergent evolution just by chance. Many mutations that
occur substitute one base pair for another in a DNA sequence (for example,
a T might be substituted for a C in some gene). These are the kinds of
traits most commonly studied in phylogenetic analyses based on DNA sequences.
They are subject to convergence because there are only four possible bases
(A, T, C, and G), so it is not unlikely that, by chance, two species
that each have a mutation at some site will have the same mutation and
independently evolve the same base pair at the same site. This problem
of chance convergence has been commonly observed in phylogenetic
studies based on DNA.
-
The rate of evolution: some people have argued that the same gene
in different species should evolve at the same rate. If it is true, this
turns out to be an advantage in studies of phylogeny based on DNA. We will
consider this in much more detail in the next lecture.
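The chance convergence described under "Convergent evolution" above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that a substitution replaces the current base with any of the other three bases with equal probability (real substitution rates are generally not equal). The function name `chance_of_same_derived_base` is made up for this example.

```python
from fractions import Fraction
from itertools import product

BASES = "ACGT"


def chance_of_same_derived_base(ancestral):
    """Probability that two independent substitutions at one site,
    each starting from the same ancestral base, yield the same new base.

    Assumes (for illustration only) that each of the three other bases
    is an equally likely replacement.
    """
    alternatives = [b for b in BASES if b != ancestral]
    # Every equally likely pair of outcomes: (lineage 1, lineage 2).
    outcomes = list(product(alternatives, repeat=2))
    matches = sum(1 for first, second in outcomes if first == second)
    return Fraction(matches, len(outcomes))


print(chance_of_same_derived_base("C"))  # 1/3 under the equal-probability assumption
```

Under this simplified assumption, if two species each mutate at the same site, one time in three they end up with the same derived base even though they did not inherit it from a common ancestor. That is why a single shared base is weak evidence and many characters are needed.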
At this point, you should recognize that there are advantages and disadvantages
to both morphological and molecular studies of phylogeny. Many people currently
argue that to really know the phylogenetic tree of life, we need to study
both kinds of characteristics.
Study Tips:
-
make yourself a table of the pros and cons of morphological versus molecular
data in systematics (and then learn the material in the table!)
-
be sure you can explain the different reasons convergent evolution occurs
in morphological and molecular data
-
draw yourself sample DNA sequences. Use your drawings to make sure
you understand the problem of alignment of sequences from different species,
and the problem of chance convergence
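As a starting point for the last study tip, here is a minimal sketch of the alignment problem. The sequences are hypothetical: `species_a` reuses the outgroup sequence from the lecture, and `species_b` is an invented copy in which the third and fourth bases have been deleted. The helper `count_mismatches` is also made up for this illustration, and the gap positions are placed by hand rather than found by an alignment program.

```python
def count_mismatches(seq_a, seq_b):
    """Count positions where two equal-length sequences differ.

    Positions where either sequence has a gap character '-' are skipped,
    because no base is available there to compare.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != b and "-" not in (a, b)
    )


# Hypothetical example: species_b is species_a with bases 3-4 ("CT") deleted.
species_a = "AACTGAGCT"
species_b = "AAGAGCT"

# Naive comparison: line both sequences up from the left and compare
# only the part where they overlap.
naive = count_mismatches(species_a[:len(species_b)], species_b)

# Aligned comparison: gaps mark where the deletion occurred, so every
# remaining base is compared with its true counterpart.
species_b_aligned = "AA--GAGCT"
aligned = count_mismatches(species_a, species_b_aligned)

print("naive mismatches:", naive)      # 4
print("aligned mismatches:", aligned)  # 0
```

Without the gaps, every base after the deletion is compared with the wrong position, so the two genes look far more different than they really are. With real data we do not know in advance where the gaps belong, which is what makes alignment hard.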