Goals: the goals of this lecture are to show how levels of genetic drift and gene flow will affect the potential for local adaptation in a population, and to introduce methods for distinguishing the effects of genetic drift, gene flow, and natural selection on populations.
Related Textbook Material: Freeman and Herron (2001) Chapter 6
Lab Manual Questions over this material are in Lab Manual Chapter V
For the first part of this lecture, we will consider how levels of genetic drift and gene flow affect the potential for populations to adapt. First, we will define local adaptation as the evolution, through natural selection, of traits that have high fitness in the environmental conditions specific to a population. For example, consider the peppered moth. Having black wings is a local adaptation to an environment around a factory in which tree trunks are black. Having gray wings is a local adaptation to an environment that is not polluted, so that tree trunks are gray. These adaptations are local in that they are NOT found throughout the whole species, having evolved through natural selection because they have high fitness in the specific environments of only certain populations of the species.
The degree to which local adaptation can occur in a species depends on the potential for populations to evolve differences from each other and on the potential for natural selection to occur within each population. As we saw in the previous lecture, these depend on levels of genetic drift and of gene flow. Genetic drift depends on population size and gene flow depends on the degree to which individuals move among populations.
As we have seen, if there is a large effect of genetic drift (as there is in small populations), loss of genetic variation decreases the potential for natural selection. Based on this, we would expect smaller populations to have less potential for local adaptation than do larger populations.
We have also seen, however, that the effects of genetic drift could be counteracted by gene flow, since gene flow increases genetic variation in populations. If there is gene flow of at least one individual per generation, some genetic variation will remain in populations since this amount of gene flow prevents fixed differences among populations. A small amount of gene flow, by increasing genetic variation, increases the potential for natural selection, and therefore increases the potential for local adaptation.
A large amount of gene flow, in contrast, will prevent local adaptation. High levels of gene flow cause populations to be similar to each other; so many individuals move from population to population that, within each population, many alleles come from individuals from other areas, not adapted to local conditions. So the gene flow will swamp the effects of natural selection.
At this point, you should review your understanding of gene flow, genetic drift, and their impact on natural selection and local adaptation by studying the following questions in your lab manual: Chapter VIII, questions 7 - 11.
For the rest of this lecture, we will consider how to distinguish natural selection, genetic drift, and gene flow acting on populations.
First, let's remember how selection, drift, and flow are predicted to affect populations. They affect the following aspects of populations: genetic variation within populations, genetic variation between populations (that is, how different or similar populations are from each other genetically), and whether or not populations are made of adults with genotypes in Hardy-Weinberg Proportions. The following table summarizes the predicted results of these forms of evolution.
| Form of Evolution | Impact on Genetic Variation Within Populations | Impact on Genetic Variation Among Populations | Population Predicted to be in Hardy-Weinberg Proportions? |
|---|---|---|---|
| Natural Selection | Decreases Variation | Increases Variation | No |
| Genetic Drift | Decreases Variation | Increases Variation | Yes |
| Gene Flow | Increases Variation | Decreases Variation | Yes |
Based on this information, if we have information on some genetic traits in some populations, we should be able to distinguish the effects of genetic drift, gene flow, and natural selection. If we see high variation within populations and similarity between populations, and if traits are in Hardy-Weinberg proportions, that suggests gene flow. If we see low genetic variation in populations, differences among populations, and if traits are in Hardy-Weinberg proportions, that suggests genetic drift. If we see low genetic variation in populations, differences between populations, and if traits are NOT in Hardy-Weinberg proportions, that suggests natural selection.
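The three criteria above can be written out as a small decision rule. This is only an illustrative sketch: the function name `suggest_form_of_evolution` and its three yes/no arguments are made up for this example, and real data would require statistical tests to decide each yes/no answer.

```python
def suggest_form_of_evolution(high_within_variation, populations_differ, in_hardy_weinberg):
    """Return the form of evolution suggested by the summary table.

    All three arguments are booleans (hypothetical inputs for illustration):
      high_within_variation: is genetic variation within populations high?
      populations_differ:    are populations genetically different from each other?
      in_hardy_weinberg:     are genotypes in Hardy-Weinberg proportions?
    """
    if high_within_variation and not populations_differ and in_hardy_weinberg:
        return "gene flow"
    if not high_within_variation and populations_differ and in_hardy_weinberg:
        return "genetic drift"
    if not high_within_variation and populations_differ and not in_hardy_weinberg:
        return "natural selection"
    # Any other combination is not covered by the table.
    return "no clear match"


# Low variation, differing populations, Hardy-Weinberg proportions hold:
print(suggest_form_of_evolution(False, True, True))
```

Combinations that do not appear in the table return "no clear match" rather than a guess.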
Now let's consider how we could get some genetic traits to study so that we could test for these different forms of evolution using these criteria. While there are some aspects of organisms that we can observe directly (for example, gray versus black coloration in peppered moths), most of the traits used by population geneticists to study forms of evolution in this way are molecular. They study either proteins, which are coded by genes and can indicate genetic differences, or DNA, which shows variation in genes directly.
The most commonly used molecules for studying population genetics have been different forms of proteins called allozymes: different forms of the same enzyme. These typically represent the proteins coded for by different alleles for a gene for some enzyme. Suppose for example that you have a gene that codes for an enzyme called esterase. Different alleles for this gene would produce slightly different forms of this esterase protein. There can be several forms of an enzyme such as esterase that all work -- all catalyze the appropriate reaction. However, they would differ slightly in amino acid sequence.
It is possible to distinguish between different allozymes such as these different forms of esterase using a process called electrophoresis, which means the separation of molecules based on differences in electric charge. Often, different allozymes will differ slightly in charge because of the slight differences in amino acids making up their sequence. So electrophoresis lets us tell different forms of enzyme (allozymes) coded for by different alleles for a gene apart from each other.
Using electrophoresis to distinguish allozymes based on charge has been the most commonly used method of identifying genetically based traits to distinguish between the effects of different forms of evolution. It is generally cheap and easy to do. More recently, people have developed fairly cheap and easy methods of identifying genetic traits by studying parts of the DNA directly, so these are now also being used for the same purpose.
For this class, the main important point is that it is possible to study a variety of different genetically based traits so we can get data that will allow us to tell different forms of evolution apart.
Once we have information on some genetic trait or traits, remember that to tell the forms of evolution apart we need to evaluate how much variation there is within populations, how similar or different populations are from each other, and whether or not populations are in Hardy-Weinberg proportions. These can all be tested statistically. In this class, we'll just look at one of the statistical tests: a chi-square test which will allow us to test whether genotype frequencies differ from those expected if the population is in Hardy-Weinberg proportions. The chi-square test is a widely used test that you have probably used in other contexts; it is easy to apply, and it will let you see how aspects of population genetics can be tested statistically.
First consider why we need statistics. Suppose you measured genetic variation in a bunch of individuals in a population, counted up genotypes to determine genotype frequencies, and counted up alleles to determine allele frequencies. You could see if you had the expected relationships of p², 2pq, and q² based on your allele frequencies p and q. However, because of random variation in your sample of individuals, the numbers probably wouldn't match up to EXACTLY what you'd expect even if the population really were in Hardy-Weinberg proportions. So the question is, how different do they have to be from what you'd expect for you to conclude that they're not in Hardy-Weinberg proportions? This is where the statistics come in. We can use statistics to determine how sure we can be that these genotypes are not in Hardy-Weinberg proportions. By convention (based on what has generally worked in biology and other sciences), if we can be at least 95% sure that the genotypes are not in Hardy-Weinberg proportions, we conclude that they are not. Otherwise, we conclude that we can't detect any difference between the genotype frequencies we observe and the expected Hardy-Weinberg proportions, so the population is probably in Hardy-Weinberg proportions.
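To see the random variation described above, here is a minimal simulation sketch. It assumes a population that really is in Hardy-Weinberg proportions, and draws each individual's two alleles independently with frequency p for one allele. The function name, the allele labels M and N, the value p = 0.7, and the sample size of 100 are all choices made for this illustration.

```python
import random


def sample_genotype_counts(p, n, rng):
    """Draw n individuals from a population in Hardy-Weinberg proportions.

    p   : frequency of allele M (allele N has frequency 1 - p)
    n   : number of individuals sampled
    rng : a random.Random instance, so results can be reproduced
    """
    counts = {"MM": 0, "MN": 0, "NN": 0}
    for _ in range(n):
        # Each individual gets two alleles, drawn independently.
        number_of_m = sum(1 for _ in range(2) if rng.random() < p)
        genotype = {2: "MM", 1: "MN", 0: "NN"}[number_of_m]
        counts[genotype] += 1
    return counts


rng = random.Random(1)  # fixed seed so the run is repeatable
for _ in range(3):
    # Each sample comes from the same population, yet the counts differ.
    print(sample_genotype_counts(0.7, 100, rng))
```

With p = 0.7 and 100 individuals, the expected counts are 49, 42, and 9, but individual samples will usually scatter around those numbers rather than hit them exactly.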
Now let's consider how to do the test. Your lab manual outlines the basic steps for a chi-square test at the start of chapter VIII. I will go through the same steps, and include an example, here.
NOTE: the following instructions are for doing a chi-square to test for Hardy-Weinberg proportions. Chi-square tests are used in many other situations (for example, in Biology 120 you probably used one to study the distribution of Daphnia). The basic statistical test is the same for all situations, but the details of how to get expected distributions and things like that differ. So be sure to read through this even if you have already done chi-square tests, and if you're required to do them in other classes, make sure you know the details of how to use them for each situation!
To do a chi-square test to see if genotype frequencies are in Hardy-Weinberg proportions, you first need some data -- a sample of individuals with known genotypes from a population. As an example, we'll look at a sample of 100 humans and their known genotypes for the MN blood group. There are three possible genotypes, MM, MN, and NN. In this sample, the numbers of individuals of each genotype were found to be:
| Genotype | MM | MN | NN |
|---|---|---|---|
| Number of Individuals | 47 | 46 | 7 |
To do a chi-square test to determine whether these genotypes are in Hardy-Weinberg proportions, the first step is to calculate allele frequencies for the two alleles (M and N) in the sample. Since we know the three genotypes, these can be calculated as:
Freq. (M) = p = # of M alleles / total number of alleles.
total number of alleles= #individuals x 2 = 100 x 2 = 200
total number of M alleles = 2 (#MM individuals) + (#MN individuals)
= 2(47)+46=140
So Freq(M)=p= # M/total # alleles = 140/200 = 0.7
We could calculate the frequency of N, Freq.(N)=q, similarly (as # N alleles/ total # alleles) or we can (more easily) remember that p+q=1, so Freq(N)=q=1-p = 1-0.7 = 0.3
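The first step can be sketched in a few lines of Python. The function name `allele_frequencies` and its argument names are made up for this example; the arithmetic is the same allele counting described above.

```python
def allele_frequencies(n_mm, n_mn, n_nn):
    """Return (p, q): the frequencies of alleles M and N in the sample."""
    total_alleles = 2 * (n_mm + n_mn + n_nn)  # each individual carries two alleles
    m_alleles = 2 * n_mm + n_mn               # MM has two M alleles, MN has one
    p = m_alleles / total_alleles
    q = 1 - p                                 # because p + q = 1
    return p, q


p, q = allele_frequencies(47, 46, 7)
print(round(p, 3), round(q, 3))  # 0.7 0.3
```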
The second step is to determine the genotype frequencies for MM, MN, and NN that you would expect, given these allele frequencies, if the population is in Hardy-Weinberg proportions. We will call these the "expected genotype frequencies" (the word "expected" is a standard part of the chi-square test) and they are, as always, p², 2pq, and q². So:
Expected genotype frequency of MM = p² = (0.7)² = 0.49
Expected genotype frequency of MN = 2pq = 2(0.7)(0.3) = 0.42
Expected genotype frequency of NN = q² = (0.3)² = 0.09
The third step is to determine the expected number of individuals in your sample that you would see if the population were in Hardy-Weinberg proportions. To do this, multiply the expected frequency of each genotype by the total number of individuals in the sample. So:
Expected # MM=(expected genotype frequency of MM)x(Number of individuals)=(0.49)(100)=49
Expected # MN=(expected genotype frequency of MN)x(Number of individuals)=(0.42)(100)=42
Expected # NN=(expected genotype frequency of NN)x(Number of individuals)=(0.09)(100)=9
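The second and third steps together can be sketched as follows. The function name `expected_genotype_counts` is made up for this example, and the values are rounded only for display because computer arithmetic with decimals is very slightly inexact.

```python
def expected_genotype_counts(p, n_individuals):
    """Return the expected numbers of MM, MN, and NN individuals
    under Hardy-Weinberg proportions, given the frequency p of allele M."""
    q = 1 - p
    expected_frequencies = {
        "MM": p * p,      # p squared
        "MN": 2 * p * q,  # 2pq
        "NN": q * q,      # q squared
    }
    # Multiply each expected frequency by the sample size.
    return {g: f * n_individuals for g, f in expected_frequencies.items()}


expected = expected_genotype_counts(0.7, 100)
for genotype, count in expected.items():
    print(genotype, round(count, 2))
```

Note that the expected numbers need not be whole numbers in general, even though observed counts always are.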
The fourth step is to calculate the value of chi-squared. This is a standard statistical test for which the formula is:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where "E" means expected number, "O" means observed number, and the sigma is the standard mathematical symbol indicating that you are supposed to take a sum. NOTE: since this is a standard statistical formula, which could be found in any statistics textbook, I will not expect you to memorize it; on exams, I will give you this formula, written in the format given above.
What this means is that we calculate (O-E)²/E for each of the three genotypes, and then add these numbers up. So:
chi-square = (obs #MM - exp #MM)²/(exp #MM)
+ (obs #MN - exp #MN)²/(exp #MN)
+ (obs #NN - exp #NN)²/(exp #NN)
The observed values are the numbers of individuals with the different genotypes that were counted in the original sample. The expected values are the numbers we calculated as expected if the population is in Hardy-Weinberg proportions. So:
chi-square = (47-49)²/49 + (46-42)²/42 + (7-9)²/9
chi-square = 0.08 + 0.38 + 0.44 = 0.90
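The same sum can be checked with a short Python sketch. The function name `chi_square` is made up for this example; it simply adds up (O-E)²/E over the genotype classes, with the observed and expected numbers given in the same order.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over all categories.

    observed and expected are sequences of equal length,
    listed in the same genotype order (here MM, MN, NN).
    """
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [47, 46, 7]  # MM, MN, NN counted in the sample
expected = [49, 42, 9]  # MM, MN, NN expected under Hardy-Weinberg
print(round(chi_square(observed, expected), 2))  # 0.91
```

Carrying more decimal places gives about 0.907; the hand calculation rounds each term first, which is why it comes out as 0.90.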
The fifth step is to compare the value you calculated for chi-square to what is called a critical value. The critical value is a theoretically calculated threshold: if the population really were in Hardy-Weinberg proportions, random variation in sampling would produce a chi-square value this large or larger only 5% of the time. If the number you calculated is as big as the critical value or bigger, then by the 95% convention the observed genotype frequencies differ from Hardy-Weinberg proportions by more than we would attribute to chance, and you would conclude that the population is NOT in Hardy-Weinberg proportions. If the number you calculated is smaller than the critical value, then you would conclude that you cannot detect any difference between the observed genotype frequencies and the Hardy-Weinberg expected genotype frequencies, so the population apparently IS in Hardy-Weinberg proportions.
Normally, you would look up the critical value in a statistical table; it depends on things like the number of categories you're looking at and whether or not you've estimated expected values based on your data. For this course, every chi-square test you'll do will have the same critical value, so we won't worry about looking it up: it will always be 3.841 (I got this value from a statistical table; it is the 5% critical value for one degree of freedom). NOTE that if you're doing a chi-square test in another course you'll probably have to look this value up, and it won't necessarily be the value I just gave you. Also note that I will give you this value on exams.
So, if the chi-square value you calculate is 3.841 or larger, you'll conclude that the population is NOT in Hardy-Weinberg proportions. If the chi-square value you calculate is less than 3.841, you'll conclude that the population IS in Hardy-Weinberg proportions. For the example that we did, chi-square turned out to be 0.90, which is less than 3.841, so the MN blood group genotype frequencies are apparently in Hardy-Weinberg proportions.
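The comparison in the fifth step can be sketched in Python as well. The function names below are made up for this example. The second function goes slightly beyond what this course requires: it assumes one degree of freedom (the case that the 3.841 critical value corresponds to) and uses the fact that, for one degree of freedom, the probability of a chi-square value at least as large as x is erfc(sqrt(x/2)).

```python
import math

CRITICAL_VALUE = 3.841  # critical value used throughout this course


def in_hardy_weinberg(chi_square_value, critical_value=CRITICAL_VALUE):
    """True if no departure from Hardy-Weinberg proportions is detected."""
    return chi_square_value < critical_value


def p_value_one_df(chi_square_value):
    """Probability of a chi-square value this large or larger by chance,
    assuming one degree of freedom (an assumption of this sketch)."""
    return math.erfc(math.sqrt(chi_square_value / 2))


print(in_hardy_weinberg(0.90))         # True: 0.90 is below 3.841
print(round(p_value_one_df(3.841), 3))  # 0.05, matching the 95% convention
```

For the MN blood group example, the probability computed this way is well above 0.05, which agrees with the conclusion reached by comparing 0.90 to the critical value.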
Remember that the real point to all this is to test for different forms of evolution. What forms of evolution predict a population to be in Hardy-Weinberg Proportions? What forms predict a population will not be in Hardy-Weinberg Proportions? Look back at the table in the first part of this lecture to remind yourself.
At this point, you should be able to work the problems in questions 14-18 in Chapter VIII of your lab manual.