Computer Assignment # 1: Modeling natural selection and genetic drift

Introduction to the Assignment: The goals of the exercise are to explore the outcome of natural selection on traits coded by a single gene with two alternate alleles, for situations that differ in how the traits with higher and lower fitness are genetically coded, and to explore the effects of genetic drift.  You will consider examples of species that represent different general situations (Hardy-Weinberg Equilibrium (no evolution), dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance, genetic drift).  To complete the assignment, you must first understand the different ways natural selection can affect a trait coded by a single gene with two alternate alleles, and you must understand genetic drift. This information has been presented in the lecture notes. Click here to review the lecture notes on the different forms of natural selection (the dominant having highest fitness, the recessive having highest fitness, fitness codominance, heterosis, and underdominance). Note that this section of the lecture notes also has information on this assignment. Click here to review the lecture notes on genetic drift.

The assignment will require you to model evolution based on details you will be given about particular species.  While you will model the details of these specific species, it turns out that the general trends you will find for these species represent the general expected trends for these different evolutionary situations. Your main goals are to use these examples to learn the general expected trends, and to figure out why the different situations show different evolutionary patterns.

Some of the trends you discover will relate to the pattern of evolution of an allele -- does it increase or decrease quickly or slowly, does it increase or decrease with a gradual, smooth curve or does frequency fluctuate up and down. Other trends relate to what happens at equilibrium.  "At equilibrium" means when there is no more major change in allele frequency -- after many generations, when evolution has occurred.

Note that the natural selection model you will use assumes no evolution occurs except for natural selection -- there is no mutation, no gene flow, no genetic drift, and mating is random.  The genetic drift model you will use assumes no evolution occurs except for genetic drift.

To learn the expected trends, you will run computer models of natural selection or genetic drift with numbers that fit the species being modeled.  You will then complete the assignment in three parts, which are:

  1. use the information you obtain to draw graphs and fill in tables.  You will be expected to hand in the graphs and tables, and then to discuss the material you should learn from them in your weekly lab section.
  2. test your ability to recognize the general trends by determining what situations are represented by some assigned graphs.
  3. to demonstrate your understanding of natural selection as it applies to these models, and to have a little fun (for a change?), make up your own situation (realistic or imaginary) of natural selection that fits one of the general situations of natural selection, and write a paragraph or two describing the situation.

Working in Groups: I strongly encourage you to work in groups for parts A and B of this assignment.  You may work in groups of up to four students.  To get the information you need, you should be together while you work on it, but you may want to work in a computer lab where you can divide up some of the work between different people.  You should discuss the material together first to be sure everyone can fill in Table 1 and do graph 2A.  Once you have done that, you can have different people in charge of drawing different graphs.  Note that the final table in part A requires information from all previous graphs and tables, and note that each of you will be responsible for the material from all graphs and tables on exams.  Also note that everyone must have all the graphs from part A before doing part B.

You must complete part C, the written paragraph, on your own -- come up with your own situation and give a written description of it.

Specific Instructions for the Three Parts of the Assignment:

Part A: To complete part A, you will use the models of evolution and the information about species that are linked to the page you get by clicking here, using information from the situations given on the web to draw graphs and fill in tables.  Fill in the information in the graph template or table, which you can print from the web.  Also, give each figure or table a clear, explanatory title (caption) that makes the table or figure stand on its own.  Table captions should be given at the top of the table, figure captions at the bottom of the figure.  (Note that this is the standard format for figure and table captions; for more information click here to see the web pages on general writing style.)  As you complete each figure or table, read the "what you should learn" section that follows the instructions for the figure or table.  The answers to the "what you should learn" sections are NOT to be handed in, but they are what you need to study to prepare for the lab section during which we will discuss this material.  During the lab after you have handed your assignment in, different lab groups are to come to lab prepared to discuss one or two of the graphs/tables; the lecture syllabus indicates which groups are responsible for which graphs/tables.  You will, of course, need to have learned the answers to ALL the "what you should learn" questions for the exams over this material.

I. Indicate which general situation is represented by each of the species and environments you have modelled.  The possible evolutionary situations to fill into the blanks are: Hardy-Weinberg Equilibrium (no evolution), dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance, genetic drift.  Fill this information into table 1, which you can get by clicking here. Be sure to give the table a caption (title), which should be written directly after the words "Table 1" at the top of the table.

II.  To study what happens over time, evolutionarily, use the natural selection and genetic drift computer models to calculate allele frequency for many generations given starting conditions that you assign.  Use the numbers you obtain to make graphs of allele frequency versus time.  To understand these graphs, remember that allele frequencies can vary from 0 (the allele is lost from the population) to 1 (the allele is fixed, meaning it is the only allele at this locus in the population), and that following what happens to the frequency of one allele lets you figure out what happens to the other allele too, since p+q = 1.

The graphs you will make are described below.  To make graphs, you can either copy information into a standard graphics program (graphics given in Microsoft Excel work, although for some reason this seems to work better on a PC than on a Mac) or make graphs by hand on the graph template provided by clicking here (print out a separate template for each of the required graphs.)  For each graph, you will be required to run one of the evolution models (natural selection or genetic drift) to make each required line.  Some graphs involve just one species, others involve more than one species.  Get information on appropriate fitness (for natural selection) or population size (for genetic drift) from the pages on the web about the species.  Allele frequency information to start the graphs is given on THIS page, in the instructions for the graph. FOR EACH GRAPH, run the model for 100 generations (this is the default; if you don't change the number of generations it will run for 100 generations.)  If you make the graph by hand using the graph template, plot only the values for generations 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 for each of the required situations. BE SURE to enter the figure number in the blank after the word "figure" below the graph and to give each graph a clear caption (title), which should be written directly after the figure number.
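
If you are curious what a natural selection model of this kind computes, the following is a minimal sketch, not the web model used for this assignment. It assumes the standard one-gene, two-allele recursion under random mating with selection as the only force. The function names and the genotype fitnesses `w_AA`, `w_Aa`, and `w_aa` are hypothetical, and the fitness values are placeholders rather than values for any assigned species. It also assumes the starting frequency counts as generation 0; the web model may number generations differently.

```python
def next_frequency(p, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of selection."""
    q = 1.0 - p
    # Average population fitness ("w bar").
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * p * w_AA + p * q * w_Aa) / w_bar


def run_selection(p0, w_AA, w_Aa, w_aa, generations=100):
    """Return a list where index g holds the frequency of A at generation g."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_AA, w_Aa, w_aa))
    return freqs


# Placeholder fitnesses, chosen only to make the example run.
trajectory = run_selection(p0=0.5, w_AA=1.0, w_Aa=1.0, w_aa=0.8)

# Generations to plot when drawing the graph by hand.
plot_generations = [1] + list(range(10, 101, 10))
for g in plot_generations:
    print(g, round(trajectory[g], 3))
```

The printed pairs are the points you would mark on the graph template for one line; the numbers you hand in should still come from the assigned models and species pages.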
 
In addition to making graphs, for each line on each graph you will calculate what is called the heterozygosity for the final population (the population at generation 100).  Heterozygosity is defined as the proportion of heterozygotes produced; it is calculated as the Hardy-Weinberg expected proportion, 2pq, using the last allele frequencies given (for generation 100).  Note that if one allele is fixed and one lost, heterozygosity = 0; this represents a situation of no genetic variation.  If both alleles are maintained, then heterozygosity provides a measure of genetic variation: the higher the heterozygosity, the higher the genetic variation.
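
The heterozygosity calculation described above can be sketched in a few lines. This is only an illustration; the function name `heterozygosity` is made up here, and `p` stands for whatever final allele frequency your model run reports at generation 100.

```python
def heterozygosity(p):
    """Hardy-Weinberg expected proportion of heterozygotes, 2pq."""
    q = 1.0 - p          # the two allele frequencies add up to one
    return 2.0 * p * q


print(heterozygosity(0.5))   # both alleles equally common: 0.5
print(heterozygosity(1.0))   # one allele fixed, the other lost: 0.0
```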

Note that after each graph is a series of practice questions.  These are not to be handed in, but will be discussed in lab during the lab section after you hand in your assignment.  They represent the things about the different situations of natural selection and genetic drift that you are expected to learn (and may be tested over) from doing this exercise and studying these graphs.

1.  For the red monkeyflower, make one line to represent what happens to the N allele starting with an allele frequency of 0.5; label this line N.  Make a second line to represent what happens to the n allele starting with an allele frequency of 0.5; label this line n.  Write the value of the heterozygosity remaining after 100 generations next to the graph.

Practice Questions for Graph 1:

  1. What situation of natural selection (dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance) is represented here?
  2. In this situation, does an allele get fixed?
  3. What happens to the allele that results in the highest fitness homozygote?
  4. What happens to the allele that results in the lowest fitness homozygote?
  5. Look at the two different lines for the two alleles: if you had just one of those lines, could you draw the other? How would you do this? (HINT: remember that the allele frequencies must add up to one.)


2. Draw three lines on this graph.  Start each line with an allele frequency of 0.01 for the allele you are graphing. Line "C" (label it with a C) should show the frequency of the C allele in a peppered moth population in a black, industrial woodland.  Line "T" (label it with a T) should show the frequency of the T allele in a peppered moth population in a gray, non-industrial woodland.  Line "N" (label it with an N) should show the frequency of the N allele in a red monkeyflower population.  Also indicate the value of heterozygosity for each line.

Practice Questions for Graph 2: The point of this graph is to illustrate what happens to the allele that gives a highest fitness homozygote in three different situations.

  1. Each line shows what happens to one of two alternate alleles; for each line, what would the line for the other allele look like?
  2. Which general situation (dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance) is represented by each line?
  3. In which does/do an allele get fixed?  In which does/do an allele NOT get fixed?  How is this reflected by the heterozygosity values you calculated?
  4. How do the situations differ in how rapidly an allele increases in frequency -- that is, looking at the first generations, in which is there rapid increase in the frequency of an allele and in which is there only slow increase in the frequency of an allele?
  5. Consider the difference in coding of the highest fitness phenotype in these three situations and explain why an allele is fixed in some, but not all, of these situations and why these situations differ in the rate at which the allele that gives the highest fitness homozygote increases in frequency.


3. Draw three lines on this graph.  All lines will show the "S" allele in the African Swallowtail Butterfly; they should start from three different initial allele frequencies of S: S=0.1, S=0.5, and S=0.9. Label each line with the initial allele frequency (so one line will be labeled S=0.1, one will be labeled S=0.5, and one will be labeled S=0.9.)  Also indicate the value of heterozygosity for each line.

Practice Questions for Graph 3: This graph illustrates a situation of natural selection in which the final outcome of selection depends on the starting allele frequencies.

  1. Each line shows what happens to one of two alternate alleles; for each line, what would the line for the other allele look like?
  2. Which general situation (dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance) is represented here?
  3. How does what the population of butterflies looks like (i.e. are they black and orange? black and white?  some mixture?) depend on the initial allele frequencies in this situation?
  4. Consider the starting frequencies of each phenotype to develop an explanation why the final outcome of natural selection depends on the starting allele frequencies in this situation.

4. Draw three lines on this graph.  Start each line with an allele frequency of 0.5 for the allele you are graphing. Line "T" (label it with a T) should show the frequency of the T allele in a population of peppered moths in a black, industrial woodland.  Line "g" (label it with a g) should show the frequency of the g allele in a population of yellow monkeyflowers.  Line "n" (label it with an n) should show the frequency of the n allele in a population of red monkeyflowers. Also indicate the value of heterozygosity for each line.

Practice Questions for Graph 4: This graph illustrates what happens to an allele that gives low fitness in homozygous form in three situations.

  1. Each line shows what happens to one of two alternate alleles; for each line, what would the line for the other allele look like?
  2. Which general situation (dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance) is represented by each line?
  3. Which of the general situations modelled here maintains the highest frequency of an allele that gives a low fitness homozygote?
  4. Which of the general situations modelled here maintains the lowest  frequency of an allele that gives a low fitness homozygote?
  5. Which situation maintains the highest level of genetic variation (based on your calculated heterozygosity values)?
  6. In which situation would you expect the lowest average population fitness (what we calculate as w bar) once the population has reached equilibrium (a situation of no further net change)?  To determine this, consider which would lead to the largest proportion of low fitness individuals being born each generation.
  7. Develop an explanation for why these three situations differ with respect to which maintains the highest frequency of a low fitness allele.

5. Draw three lines on this graph.  All lines will show the "S" allele in the Grizzly Bear; all should start with an allele frequency of 0.5.  Line "A" (label it with an A) should represent the Alaska population, line "Y" the Yellowstone population, and line "S" the Selkirk population.  Also indicate the value of heterozygosity for each line.

HINT FOR LINE A:  you should be able to draw the line for the Alaska population without running any computer models -- think about the conditions and whether any evolution is occurring in this population. HINTS FOR LINES Y AND S:  if you run either of these more than once, you should get different numbers -- don't think you're doing something wrong because you get something different if you run it more than once or if you get something different from your classmates.  Think about why it is that you get something different -- what kind of evolution is occurring?
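
To see why repeated runs differ, here is a minimal sketch of a drift-only model. It is not the web model for this assignment: it assumes a Wright-Fisher style scheme in which each of the 2N allele copies in the next generation is drawn at random according to the current allele frequency. The function name `run_drift` and the population size used below are hypothetical placeholders, not values from the species pages.

```python
import random


def run_drift(p0, population_size, generations=100, rng=None):
    """Return allele frequencies over time when drift is the only force."""
    if rng is None:
        rng = random.Random()
    copies = 2 * population_size          # diploid: two allele copies each
    freqs = [p0]
    for _ in range(generations):
        p = freqs[-1]
        # Each copy in the next generation is this allele with probability p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        freqs.append(count / copies)
    return freqs


# Two runs with identical settings generally end at different frequencies.
for run in (1, 2):
    final = run_drift(p0=0.5, population_size=50)[-1]
    print("run", run, "final frequency:", final)
```

Because every generation involves random sampling, no two runs are expected to trace the same line, which is the behavior the hint describes.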

Practice Questions for Graph 5:

  1. What form of evolution is occurring?
  2. This graph should look quite different from the other graphs you have made.  Why does it look different?
  3. Why is it that if you run the same model more than once you can get different results?  You will explore this more in the next exercise.
  4. In which situation do you maintain most genetic variation (as indicated by your heterozygosity value)?  Note that this could vary some; in the next exercise, you will explore it in more detail.

III.  For this exercise you will fill information into table 2, which can be obtained by clicking here.  To get this information, you will run five runs of the model for Grizzly Bears in Selkirk for each of the starting allele frequencies of the "S" allele: 0.1, 0.5, 0.9 (in other words, you'll start at 0.1 and do that five times, then start at 0.5 and do that five times, then start at 0.9 and do that five times.  This sounds like a lot but it doesn't take that long since you won't be making graphs, just getting some information as noted below.)  Then you will do the same thing for the Yellowstone population -- five runs of the model for each of the following allele frequencies: 0.1, 0.5, and 0.9.  For each run of the model you will write into the table whether an allele gets fixed, which allele gets fixed if one does, and in what generation an allele got fixed if one does. You will also calculate heterozygosity for each situation and determine average heterozygosities for each different situation.  Be sure to give the table a caption (title), which should be written directly after the words "Table 2" at the top of the table.

Practice Questions for Table 2:

  1. What form of evolution is occurring here?
  2. Is an allele more likely to be fixed within 100 generations in a larger population or a smaller population?
  3. Can you be sure, before you run a model, which allele will be fixed -- if one is fixed -- and which lost?
  4. When the allele frequencies start at 0.5, is either allele more likely to be fixed?  Why?
  5. When the allele frequencies start at 0.1 or 0.9, is either allele more likely to be fixed?  Why?
  6. How does the time it takes for an allele to become fixed compare between larger and smaller populations?
  7. What is the impact of genetic drift on genetic variation (as indicated by heterozygosity values)?  Is this affected by population size?  If so, how?  Is it affected by initial allele frequency? If so, how?
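
The bookkeeping for Table 2 (five runs per starting frequency, noting fixation and averaging heterozygosity) can be sketched as follows. This is an illustration under assumptions, not the web model: drift is simulated with Wright-Fisher style random sampling of 2N allele copies, and the function names and the population size of 25 are hypothetical placeholders, not the Selkirk or Yellowstone values.

```python
import random


def drift_run(p0, population_size, generations=100, rng=random):
    """Return (final frequency, generation of fixation or None)."""
    copies = 2 * population_size
    p = p0
    for generation in range(1, generations + 1):
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
        if p in (0.0, 1.0):               # allele lost or fixed: stop early
            return p, generation
    return p, None


def summarize(p0, population_size, runs=5):
    """Print one table row per run and return the average heterozygosity."""
    total_h = 0.0
    for run in range(1, runs + 1):
        p, fixed_at = drift_run(p0, population_size)
        h = 2 * p * (1 - p)               # heterozygosity, 2pq
        total_h += h
        if fixed_at is None:
            outcome = "no allele fixed"
        elif p == 1.0:
            outcome = f"S fixed in generation {fixed_at}"
        else:
            outcome = f"S lost (other allele fixed) in generation {fixed_at}"
        print(f"start={p0} run={run}: {outcome}, heterozygosity={h:.3f}")
    return total_h / runs


for start in (0.1, 0.5, 0.9):
    average = summarize(start, population_size=25)
    print(f"start={start}: average heterozygosity={average:.3f}")
```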

IV.  For this exercise, you will fill information into table 3, which can be obtained by clicking here.  To get this information you will model each of the situations given in the table, for each of three allele frequencies: 0.1, 0.5, 0.9, for the allele indicated in the table.  You will fill in the final allele frequency for the allele. Be sure to give the table a caption (title), which should be written directly after the words "Table 3" at the top of the table.

Practice Questions for Table 3:  This table summarizes results of natural selection for the different situations, and answers to the following questions should be related to answers to the questions from the graphs of the different situations of natural selection (Graphs 1-4).

  1. For which situation(s) of natural selection (dominant with highest fitness, recessive with highest fitness, fitness codominance, heterosis, underdominance) does the final outcome depend on the allele frequency at the start, and for which situation(s) does it not?
  2. Develop an explanation for why the outcome depends on the initial allele frequency where it does, and why it does not depend on the allele frequency where it does not.
  3. Which situations of natural selection maintain most genetic variation in a population over time?  Which situations do not maintain genetic variation?
  4. Develop an explanation for why these situations differ in whether or not they maintain genetic variation.
  5. For the situations in which one allele is lost and one fixed, what determines which is lost and which is fixed?
  6. For the situations that maintain variation, what determines which allele is most common?

V.  Summary table.  Based on all the information you have obtained from the previous graphs and tables, fill in the information required for table 4, which can be obtained by clicking here.  Note that the headings to the columns in the table indicate what should go into the column and also indicate, in parentheses, the possible answers -- please use these possible answers (not some other wording) in the table. Be sure to give the table a caption (title), which should be written directly after the words "Table 4" at the top of the table.

Part B.  You will look at several graphs of allele frequency versus generation.  The graphs can be seen by clicking here. Each of the graphs plots the frequency of one allele (called the frequency of A, but remember that this does NOT imply anything about dominance!) evolving from each of three initial allele frequencies: 0.1, 0.5, and 0.9.  All other information (fitness, population size) is the SAME within a graph (i.e. all the lines on graph 5 are drawn for the same fitnesses and population sizes; only the allele frequency is different.)  Your goals are to identify the general situation plotted (Hardy-Weinberg Equilibrium (no evolution), dominant has highest fitness, recessive has highest fitness, fitness codominance, heterosis, underdominance, genetic drift) for each, and to find values of fitness (for natural selection) or population size (for genetic drift) that will give the plot shown.  To do the latter, you should first look at the graphs you have drawn in part A of this assignment to find the graph most similar to the one shown, and then use the computer programs to find values of the fitness of each phenotype (for forms of selection) or population size (for genetic drift) that will give you a graph that is close in pattern to this one.   The values do not have to give EXACTLY this graph but should show the same basic pattern, similar times to important events such as fixation or loss, and similar equilibrium values. Remember that you do not expect drift to give you an identical graph but you should try for similarities in degree of fluctuation and time to fixation if an allele is fixed.
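
The trial-and-error search described for part B can be organized as in the sketch below. It is only an illustration for the natural selection case, not the course's programs: it assumes the standard one-gene, two-allele recursion under random mating, and the function name, the fitness names `w_AA`, `w_Aa`, `w_aa`, and the candidate values are all hypothetical. Replace the candidates with your own trial values and compare the results by eye against the assigned graph.

```python
def final_frequency(p0, w_AA, w_Aa, w_aa, generations=100):
    """Frequency of A after repeated rounds of selection alone."""
    p = p0
    for _ in range(generations):
        q = 1.0 - p
        w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
        p = (p * p * w_AA + p * q * w_Aa) / w_bar
    return p


# Hypothetical candidate fitness sets (w_AA, w_Aa, w_aa); placeholders only.
candidates = [(1.0, 0.9, 0.8), (0.8, 1.0, 0.8)]

for w_AA, w_Aa, w_aa in candidates:
    # Same fitnesses, three starting frequencies, as on each part B graph.
    finals = [final_frequency(p0, w_AA, w_Aa, w_aa) for p0 in (0.1, 0.5, 0.9)]
    print((w_AA, w_Aa, w_aa), [round(p, 3) for p in finals])
```

Final frequencies alone do not capture the shape of a curve, so you would still want to look at the full lines from the models when judging how close a match is.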

Part C. Each of you must select ONE of the graphs from part B that you have determined to represent a form of natural selection (NOT drift) to complete part C.   Note that the group does NOT have to complete all of these and that they must be done independently -- each of you must come up with your own species, traits, and answer to this question without looking at what others have done (although you may discuss them with each other.)  Present, in a TYPED, DOUBLE-SPACED paragraph, a description of a situation in a species that would result in the fitness relationships that give rise to this graph.  Please hand in two copies of your paragraph (I keep a copy of all of these; if you only hand in one copy it will not be returned to you).  Use the writing style appropriate for scientific writing as described on the web pages for writing style.  Use species and characteristics that are NOT those discussed in this exercise, in this course so far, or in the textbook.  You can use any other real or imaginary species that you like as long as it fits the assumptions of the model (diploid, sexually reproducing, has a trait controlled by a single gene with two alternate alleles, etc.) You can make up traits and reasons for them to differ in fitness. Be sure to clearly describe the species and the phenotypes associated with each genotype, and to clearly explain what aspect or aspects of the environment and/or biology of this species cause the phenotypes to have the fitnesses they do.  The information you give should be like the information given on the pages that describe the species used for part A of this assignment (the peppered moth, red monkeyflower, etc. -- give genotypes, phenotypes, fitnesses, and a clear explanation of why the phenotypes differ in fitness).  BE SURE to indicate which of the graphs (give the graph number) your situation fits. Be creative, have fun with this one!

Last Revised 9 January 2003