A Population Genetics Model of Natural Selection

NOTE: These are lecture notes for Biology 391, Organic Evolution, at The University of Tennessee at Martin.  Anyone outside of UT Martin wishing to use these notes or to contact me for additional information should first read the information obtained by clicking here.

Goals: the goal of this lecture is to introduce a way of modeling natural selection acting on a single-gene trait with two alternate alleles in a diploid, sexually reproducing species. NOTE: this lecture relies on material from the previous lecture on Hardy-Weinberg Equilibrium; review that material before going through this lecture!

Related Textbook Material: Freeman and Herron (2001) Chapter 5

Lab Manual Questions over this material are in Lab Manual Chapter IV


The Lecture:

Now that you have seen what happens in a situation of no evolution (Hardy-Weinberg Equilibrium), let's develop a model of natural selection. Remember that natural selection occurs if genotypes don't have equal fitness. We will model a situation in which natural selection is the ONLY form of evolution occurring (there will be no mutation, genetic drift, non-random mating, or gene flow -- remember, these are the other forms of evolution, as introduced in the last lecture.) So for this model, we will assume (as we did for Hardy-Weinberg Equilibrium) infinite population size, no movement of individuals among populations, no mutation, and random mating. However, we will see what happens when the different genotypes for the trait differ in fitness.

First, consider the fitnesses of genotypes. Remember that fitness is a measure of how well individuals with a certain genetic trait are expected to survive and reproduce. We will still consider a single gene with two alleles (A and a) and three genotypes (AA, Aa, and aa.) Natural selection will occur if individuals with different genotypes differ in fitness. So to understand and model natural selection, a first important point is to determine how to measure fitness.

Remember that natural selection occurs because individuals with some traits (those with higher fitness) survive and therefore reproduce more than do other individuals. The important feature here is reproduction -- it is those traits that are reproduced most that necessarily become more common from generation to generation. So we will measure fitness based on the degree to which traits (in this case, different genotypes) reproduce.

A first measure of the fitness of a genotype is absolute fitness: the number of offspring an individual with a particular genotype has on average. Absolute fitness could be studied in natural populations; one could determine that, for example, individuals with a certain genotype have on average six offspring. A problem with this measure is that there is no way to determine, just looking at this number, whether having an average of six offspring means that this form has high or low fitness. If AA individuals have six offspring, Aa have three offspring, and aa have one offspring, then six is the best, and we would say it represents high fitness. On the other hand, if AA individuals have six offspring, Aa have twelve, and aa have thirty, then we would say six represents very low fitness. The important point to note here is that whether a particular absolute fitness is high or low depends on the fitnesses of other genotypes in the population. So we will develop a second measure of fitness that accounts for this.

A second measure of the fitness of a genotype is relative fitness: the degree to which individuals with a particular genotype reproduce in proportion to individuals with other genotypes. Typically, relative fitness is measured with respect to the genotype that has the highest fitness. The genotype with highest fitness has a relative fitness value of one; the other fitnesses have relative fitness values that are fractions of one, telling us how well these individuals reproduce with respect to the individuals of the genotype that reproduces best.

Mathematically, we symbolize absolute fitness as W and relative fitness as w (so be careful how you draw this to be sure you can distinguish upper case W, the absolute fitness, from lower case w, the relative fitness.) To indicate what genotype we're talking about, we give the W or w a subscript:

WAA means the absolute fitness of AA
WAa means the absolute fitness of Aa
Waa means the absolute fitness of aa

wAA means the relative fitness of AA
wAa means the relative fitness of Aa
waa means the relative fitness of aa

Based on the above definitions, the absolute fitnesses are numbers we would measure from natural populations. From the first example of absolute fitnesses above, we would say that:

WAA = 6 (we would have measured that on average an AA individual has six offspring)
WAa = 3 (again, we would have measured this)
Waa = 1 (as measured.)

Then we would calculate the relative fitnesses. First, we would note which genotype has the highest absolute fitness. In this case, it is AA (but it doesn't have to be; it depends on the trait and the situation.) We often call the absolute fitness of this genotype Wmax, where max stands for maximum. We then calculate the relative fitnesses by dividing each absolute fitness by Wmax, as follows:

wAA = WAA/Wmax = 6/6 = 1 (remember, as noted above, the highest relative fitness is always one, so this should not be a surprise.)
wAa = WAa/Wmax = 3/6 = 0.5
waa = Waa/Wmax = 1/6 = 0.17 (rounded)
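If you want to check this kind of calculation by computer, here is a minimal sketch in Python. The language and the names (relative_fitnesses, absolute) are my own choices for illustration, not part of the course materials; the numbers are the absolute fitnesses from the first example above.

```python
def relative_fitnesses(absolute):
    """Divide each absolute fitness W by the largest one (Wmax)."""
    w_max = max(absolute.values())
    return {genotype: W / w_max for genotype, W in absolute.items()}


# Absolute fitnesses (average number of offspring) from the first example.
absolute = {"AA": 6, "Aa": 3, "aa": 1}
relative = relative_fitnesses(absolute)

for genotype, w in relative.items():
    print(f"w{genotype} = {w:.2f}")
```

The same function works when a different genotype has the highest absolute fitness, because it looks up Wmax rather than assuming it belongs to AA.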

Relative fitnesses are useful because they express fitnesses of different individuals with respect to each other. For the calculations you will need to do, you will always need to use relative fitnesses. Some of the problems you will get will start with absolute fitnesses; in this case, you will need to calculate relative fitnesses. In other problems, you will be given information so that you can get relative fitnesses immediately -- for example, you may be told that one genotype survives best and another survives half as well, in which case relative fitnesses would be one and 1/2, respectively. Or you might be told that one genotype survives to reproduce best, another survives 60% as well, and a third only 5% as well, in which case relative fitnesses would be 1, 0.6, and 0.05, respectively. If you are given the relative fitnesses, you won't need the absolute fitnesses; you only use the absolute fitnesses if that is what you are given, and you use them to calculate the relative fitnesses -- the relative fitnesses are what you really want.

Now let's use our definitions of relative fitness, allele frequencies, and genotype frequencies to see how to model natural selection. We will take the same approach we did for Hardy-Weinberg Equilibrium and start with the gametes that begin one generation, then follow them as they form zygotes, grow up into adults, and finally reproduce to produce gametes to start the next generation.

Again, we will call the allele frequencies in the gametes that start a generation p and q:

Freq(A)=p
Freq(a)=q

Since the population is still assumed to be randomly mating and infinitely large, these gametes will unite to form zygotes exactly as occurred in Hardy-Weinberg Equilibrium, so, for the same reasons as for Hardy-Weinberg Equilibrium, the genotype frequencies of zygotes will be:

Freq(AA)=p^2
Freq(Aa)=2pq
Freq(aa)=q^2

Now these zygotes will grow up into adults, and here's where we see something different from Hardy-Weinberg Equilibrium. These zygotes have different fitnesses; some survive better than others. These differences are measured by the differences in relative fitness we have defined. We will use these to determine the genotype frequencies of the reproducing adults.

To do this, the first thing we need to determine is a measure of how well, on average, an individual from the population survives. This is called the average fitness of the population; it means how well an average individual of those originally produced, as zygotes, is expected to survive to reproduce. It is symbolized by a w with a line over it (called "w bar") and calculated as:

wbar = p^2 wAA + 2pq wAa + q^2 waa

Note that this calculation takes the fitness of each genotype and multiplies it by the frequency with which that genotype is produced in the zygotes. Because it is multiplied by these frequencies, it is as though we had counted up every individual in the population, measured its fitness, and then taken the average of all of those -- multiplying by the frequency accounts for the fact that some genotypes may be more common than others, and gives us the correct average fitness for the population.
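As a quick numerical check, the average fitness can be computed with a few lines of Python. This is only a sketch: the function names are mine, the relative fitnesses are the ones calculated in the example earlier in this lecture (1, 0.5, and 1/6), and the starting allele frequency p = 0.5 is a hypothetical value chosen for illustration.

```python
def zygote_frequencies(p):
    """Genotype frequencies in zygotes under random mating: p^2, 2pq, q^2."""
    q = 1.0 - p
    return {"AA": p**2, "Aa": 2 * p * q, "aa": q**2}


def mean_fitness(p, w):
    """Average fitness: each relative fitness weighted by its zygote frequency."""
    freqs = zygote_frequencies(p)
    return sum(freqs[g] * w[g] for g in freqs)


w = {"AA": 1.0, "Aa": 0.5, "aa": 1 / 6}  # relative fitnesses from the earlier example
p = 0.5                                  # hypothetical frequency of allele A in gametes

w_bar = mean_fitness(p, w)
print(f"wbar = {w_bar:.3f}")
```

With these numbers the three terms are 0.25 x 1, 0.5 x 0.5, and 0.25 x (1/6), which add up to 13/24, or about 0.542.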

We will now use this to determine the adult genotype frequencies of the surviving individuals from the population. To do this, we take the part of the average fitness that represents each genotype, and divide by the average fitness. That is:

Freq(AA) in adults = p^2 wAA / wbar
Freq(Aa) in adults = 2pq wAa / wbar
Freq(aa) in adults = q^2 waa / wbar

Now that we know the frequencies of reproducing adults, we can ask what frequencies of alleles they have, and what allele frequencies will be passed to the next generation. Note that these are the same -- the frequency of an allele in the reproducing adults IS the frequency with which that allele will be passed to the next generation (since it's the reproducing adults that pass it on.) To determine this, remember (from the Hardy-Weinberg Equilibrium lecture) that we can determine the frequency of allele A as:

Freq(A) = Freq(AA) + (1/2)Freq(Aa)

Since we have just calculated, above, the frequencies of AA and Aa, we can now use these numbers in this formula to calculate the allele frequency of A. We could calculate the frequency of allele a similarly, as

Freq(a) = Freq(aa) + (1/2)Freq(Aa)

or we can remember that Freq(A)+Freq(a)=1, so once we know Freq(A) we can calculate Freq(a)=1-Freq(A).
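To see the whole cycle in one place (gametes, zygotes, surviving adults, and the gametes for the next generation), here is a short Python sketch. The function name one_generation and the starting frequency p = 0.5 are assumptions made for illustration; the relative fitnesses are those from the example earlier in the lecture.

```python
def one_generation(p, w_AA, w_Aa, w_aa):
    """Follow one generation of selection starting from gamete frequency p of allele A."""
    q = 1.0 - p

    # Zygotes form by random mating, exactly as in Hardy-Weinberg Equilibrium.
    zygotes = {"AA": p**2, "Aa": 2 * p * q, "aa": q**2}

    # Average fitness: each genotype's relative fitness weighted by its frequency.
    w_bar = zygotes["AA"] * w_AA + zygotes["Aa"] * w_Aa + zygotes["aa"] * w_aa

    # Adult frequencies: each genotype's share of the average fitness.
    adults = {
        "AA": zygotes["AA"] * w_AA / w_bar,
        "Aa": zygotes["Aa"] * w_Aa / w_bar,
        "aa": zygotes["aa"] * w_aa / w_bar,
    }

    # Allele frequency in the adults, which is what gets passed on.
    p_next = adults["AA"] + 0.5 * adults["Aa"]
    return adults, p_next


adults, p_next = one_generation(0.5, 1.0, 0.5, 1 / 6)  # p = 0.5 is hypothetical
print({g: round(f, 4) for g, f in adults.items()})
print(f"Freq(A) next generation = {p_next:.4f}")
print(f"Freq(a) next generation = {1 - p_next:.4f}")
```

With these inputs the frequency of A rises from 0.5 to 9/13 (about 0.69) in a single generation, which is the direction you would expect when AA has the highest fitness.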

At this point, you should have all the information you need to work all the problems in Chapter VI of your lab manual, and you should try to start working on them. Here are some tips on how to work the problems:

Your first step should be to determine whether you're working a problem for a population in Hardy-Weinberg Equilibrium or a problem in which natural selection is occurring. Remember that the difference between the two models is in whether or not different genotypes have different fitness. If the genotypes all have the same fitness (all survive and reproduce equally well) that should tell you you're looking at a Hardy-Weinberg Equilibrium problem. If the genotypes differ in fitness (if they don't survive and reproduce equally well) that should tell you you're looking at a natural selection problem.

If you have a Hardy-Weinberg Equilibrium problem there are two places the problem could start. You could be given allele frequencies to start out or you could be given genotype frequencies to start out. So your next step in solving the problem is simply to figure out, and write down, which you've got: allele frequencies or genotype frequencies.

In a Hardy-Weinberg Equilibrium problem remember that if you have an allele frequency, you can figure out the other allele frequency (since p+q=1) and you can figure out any genotype frequency (Freq(AA)=p^2, Freq(Aa)=2pq, Freq(aa)=q^2.) So if you're given an allele frequency, use it, and get whatever information the question asks for. If you're given a genotype frequency rather than an allele frequency, your first step should be to calculate an allele frequency, because once you have the allele frequencies you can figure out anything else the question might require. To calculate an allele frequency from a genotype frequency, you need the genotype frequency of a homozygote (that is, either AA or aa.) Once you have that, take the square root to get the allele frequency of the allele that makes up that genotype.
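As an illustration of the square-root step, here is a small Python sketch. The function name and the value Freq(aa) = 0.09 are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def hardy_weinberg_from_aa(freq_aa):
    """Recover allele and genotype frequencies from Freq(aa), assuming Hardy-Weinberg Equilibrium."""
    q = math.sqrt(freq_aa)  # Freq(aa) = q^2, so q is its square root
    p = 1.0 - q             # because p + q = 1
    return {"p": p, "q": q, "AA": p**2, "Aa": 2 * p * q, "aa": q**2}


result = hardy_weinberg_from_aa(0.09)  # hypothetical starting value
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Here q is 0.3, so p is 0.7, and the remaining genotype frequencies follow as 0.49 for AA and 0.42 for Aa. Note that this shortcut is only valid when the genotype frequencies really are in Hardy-Weinberg proportions.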

If you have a natural selection problem then there are three places the problem could start, and you need to figure out which one you have. As with Hardy-Weinberg problems, things go more smoothly if you have allele frequencies, so your first step is to get an allele frequency (if you're not already given one.) Here are the three places a natural selection problem could start, and what your first step should be for each possible starting place (once you've completed the first step, you should be able to use the natural selection formulas to work the rest of the problem -- it is often the first step that is hardest.)

  1. You could be given an allele frequency in gametes at the start of a generation. If you are, use it to determine the other allele frequency (remember p+q=1), take the fitness information you are given (calculating relative fitnesses if you're not already given them), and use the formulas you have to get the information required by the question.
  2. You could be given a genotype frequency in zygotes. If you are, remember that genotype frequencies in zygotes are p^2, 2pq, and q^2, just like in Hardy-Weinberg Equilibrium. If you have the genotype frequency of a homozygote in zygotes, take the square root to get the allele frequency that was present at the start of the generation, and proceed as you did in the first case where you were given the allele frequency directly.
  3. You could be given a genotype frequency in adults. If you are, then you are in a situation where the population is NOT in the proportions of Hardy-Weinberg Equilibrium. To calculate the allele frequencies in these adults, which are also the allele frequencies in the gametes that will start the next generation, use the formula for the allele frequencies: Freq(A)=Freq(AA) + (1/2)Freq(Aa). Once you have the allele frequencies that start the next generation, proceed as you would in the first case where you were given the allele frequencies directly.
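The three starting places can be sketched in Python as three small helper functions, each returning the frequency of allele A that you then carry into the selection formulas. The function names and all of the example numbers are hypothetical and are only meant to show which first step goes with which kind of information.

```python
import math


def p_from_gametes(p):
    """Case 1: the allele frequency in gametes is given directly."""
    return p


def p_from_zygote_homozygote(freq_AA):
    """Case 2: zygotes are in proportions p^2, 2pq, q^2, so p is the square root of Freq(AA)."""
    return math.sqrt(freq_AA)


def p_from_adults(freq_AA, freq_Aa):
    """Case 3: adults are not in Hardy-Weinberg proportions, so count alleles instead.

    The result is the frequency of A in the gametes that start the NEXT generation.
    """
    return freq_AA + 0.5 * freq_Aa


print(p_from_gametes(0.6))             # given p = 0.6
print(p_from_zygote_homozygote(0.36))  # given Freq(AA) in zygotes = 0.36
print(p_from_adults(0.5, 0.3))         # given adult Freq(AA) = 0.5 and Freq(Aa) = 0.3
```

Taking a square root in the third case would be a mistake, because selection has already changed the genotype proportions in the adults.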


When e-mailing me problems, show your work as clearly as you can. You can use the following ways of representing things to make typing the formulas (which is tedious, as I know, just having typed out a bunch of them for this lecture) easier:

To represent p squared, type p^2

To represent w bar, the average population fitness, type wbar.

To represent things with subscripts, like the absolute and relative fitnesses, you don't have to put the subscripts as subscripts -- it's OK to type:

wAA for w with the subscript AA

It will probably help to work stuff with pen and paper first, then type it out. Get into the habit of writing down all formulas and steps clearly and legibly -- you'll make many fewer mistakes, you'll be able to go back and study from what you've done, and I'll expect you to write down all formulas and steps clearly and legibly on exams so you'll do better if you get into the habit of doing so now.

Finally, from time to time look at the questions and the answers and think about what you'd really expect to happen in the situation you're given, through natural selection. Do the numbers you're getting make sense? Can you see how they might relate to a real situation? The goal, really, is to be able to translate evolution into math and then the math back into evolution -- don't get so bogged down in algebra that you forget what you're really trying to study.


Now that you've learned the basic model, let's consider some more aspects of natural selection as it affects situations such as those you have modeled: natural selection occurring because of different fitnesses of genotypes for a single gene with two alternate alleles in a diploid, sexually reproducing population. Depending on the traits being studied, and the environment, fitness relationships among genotypes vary. They also depend on just how the traits are coded -- whether one allele is completely dominant to the other (recessive) allele or whether there is some form of codominance or incomplete dominance such that the heterozygote has a different phenotype from either homozygote. (If these terms -- dominance, recessiveness, incomplete dominance, codominance -- are not completely familiar to you, be sure to review some genetics before proceeding!)

An important point to remember is that fitness depends on the phenotype -- the appearance, structure, function of an organism. If two different genotypes result in the same phenotype (as when one allele is dominant) then both of those genotypes will have the same fitness -- they look, act, function just like each other, so they survive and reproduce just like each other.

Here are the possible ways the fitnesses of different genotypes could be related to each other:

If one allele is completely dominant to the other, recessive, allele, so that the heterozygote has the same phenotype as the dominant homozygote, there are two ways fitnesses could be related:

  1. the recessive phenotype could have highest fitness (it would have a relative fitness value of 1; the dominant homozygote and heterozygote would have fitness values that are equal to each other and that are less than 1.)
  2. the dominant phenotype could have highest fitness (in this case, the dominant homozygote and the heterozygote would both have relative fitness values equal to one, and the recessive homozygote would have a relative fitness value equal to something less than 1.)

If neither allele is completely dominant (in a case of codominance or incomplete dominance), so that the heterozygote has a different phenotype from either homozygote, there are three ways that fitnesses could be related to each other:

  1. fitness codominance occurs when the heterozygote has fitness that is intermediate between the fitnesses of the two homozygotes (one homozygote has highest fitness, the heterozygote has lower fitness than it, and the other homozygote has still lower fitness.)
  2. heterosis occurs when the heterozygote has higher fitness than either homozygote (the heterozygote would have a relative fitness value of one, the two homozygotes would have fitness values of less than one.)
  3. underdominance (negative heterosis) occurs when the heterozygote has lower fitness than does either homozygote (one homozygote would have a relative fitness value of one, the other homozygote might also have a fitness value of one or, more likely, would have a somewhat lower fitness value, and the fitness value of the heterozygote would be even lower.)

You can use the formulas you've learned to calculate the effect of natural selection over one or two generations for any of these situations. If you repeated these calculations for 50 or 100 generations (I don't recommend doing this by hand!) you could learn what happens in each of the situations described above, in the long run. It turns out that the different situations listed above do differ in the expected result of natural selection. To learn about these, you will complete a computer exercise to study the different situations. This is described in Chapter VI of your lab manual. In this assignment, you will look at specific instances of the forms of natural selection described above, and of one other form of evolution, genetic drift. These specific instances illustrate general patterns -- for example, what you learn for the specific case of heterosis you study will show you patterns that are generally true for heterosis. Your goal will be to learn these general patterns.
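If you are curious how the repeated calculation might look on a computer, here is a minimal Python sketch. It is not the lab manual's exercise: the function names, the starting frequency of 0.1, and every set of relative fitnesses below are hypothetical values I picked to give one example of each situation.

```python
def next_p(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection."""
    q = 1.0 - p
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # Adult Freq(AA) plus half of adult Freq(Aa), written over the common wbar.
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def trajectory(p0, fitnesses, generations=100):
    """List of allele frequencies from generation 0 through the last generation."""
    ps = [p0]
    for _ in range(generations):
        ps.append(next_p(ps[-1], *fitnesses))
    return ps


# Hypothetical relative fitnesses (wAA, wAa, waa), one example per situation.
scenarios = {
    "recessive favored": (0.5, 0.5, 1.0),
    "dominant favored": (1.0, 1.0, 0.5),
    "fitness codominance": (1.0, 0.75, 0.5),
    "heterosis": (0.8, 1.0, 0.6),
    "underdominance": (1.0, 0.5, 0.9),
}

for name, w in scenarios.items():
    ps = trajectory(0.1, w)
    print(f"{name:20s} Freq(A) after 100 generations: {ps[-1]:.3f}")
```

Try changing the starting frequency as well as the fitnesses; for some of these situations the long-run outcome depends on where the population starts.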
