Let us say there are two variants, or alleles, of a gene: A and a. Since every individual is diploid, there are three possible genotypes: AA, Aa, and aa.

Calculation of genotype and allele frequencies

In a population, you can count how many individuals have genotype AA, how many have Aa, and how many have aa.

Number of individuals with AA $= x$
Number of individuals with Aa $= y$
Number of individuals with aa $= z$

Then $x+y+z$ must equal the total number of individuals in the population. The proportion of individuals in the population that carry a given genotype is called the genotype frequency.

Frequency of AA $= \frac{x}{x+y+z}$
Frequency of Aa $= \frac{y}{x+y+z}$
Frequency of aa $= \frac{z}{x+y+z}$

The frequencies of AA, Aa, and aa must add up to 1.

The frequency of each allele in the population can also be calculated from the genotype counts. Each individual carries two alleles, so the population holds $2(x+y+z)$ alleles in total.

Number of A alleles $= 2x+y$
Number of a alleles $= 2z+y$

Dividing by $2(x+y+z)$ shows that the frequency of A equals the frequency of AA plus half the frequency of Aa, and likewise for a.

It is very important to distinguish between genotype frequency and allele frequency.

Example: Let us say there are 200 individuals; 38 are AA, 150 are Aa, and the remaining 12 are aa.

Frequency of AA $= \frac{38}{200} = 0.19$
Frequency of Aa $= \frac{150}{200} = 0.75$
Frequency of aa $= \frac{12}{200} = 0.06$
Total $= 0.19+0.75+0.06 = 1.00$

Frequency of A allele $= 0.19 + \frac{1}{2}(0.75) = 0.565$
Frequency of a allele $= 0.06 + \frac{1}{2}(0.75) = 0.435$
Total $= 0.565+0.435 = 1.00$

Lecture 6-2

Let $p$ = frequency of allele A and $q$ = frequency of allele a. Then $p+q=1$.

Frequency of genotype AA $= p \cdot p = p^2$
Frequency of genotype Aa $= 2pq$ [the reason it is $2pq$ is that you can get Aa in two ways: A from mom and a from dad, or a from mom and A from dad]
Frequency of genotype aa $= q \cdot q = q^2$

The frequencies of AA, Aa, and aa should add up to 1. Therefore,

$$p^2 + 2pq + q^2 = 1$$

This is called Hardy-Weinberg (HW) equilibrium. The Hardy-Weinberg principle can indicate whether evolution has occurred. The law basically states that if there is no evolution, the allele frequencies will remain at equilibrium in each succeeding generation. For equilibrium to exist, the following five conditions must be met:

1. There are no mutations; therefore, no new alleles are introduced into the population.
2. No gene flow can occur (i.e., no migration).
3. Mating is random.
4. The population is large [so genetic drift cannot cause the allele frequencies to change].
5. There is no selection for or against the alleles.

Obviously, HW equilibrium is a theoretical proposition. In the real world nothing is in equilibrium. If we stick to population genetics and evolution: mutations occur, drift is a given, populations migrate, mating is not random, and selection favors certain alleles and disfavors others. However, HW can indicate the extent to which these factors play into evolution.

There is an important point to remember. You can ALWAYS find allele frequencies from genotype frequencies, BUT you CANNOT ALWAYS find genotype frequencies from allele frequencies. Why is this true? Consider $p=0.4$ and $q=0.6$. There are infinitely many combinations of AA/Aa/aa frequencies that yield $p=0.4$ and $q=0.6$. For instance, each of the following four sets of genotype frequencies (read down the columns) satisfies $p=0.4$ and $q=0.6$:

AA: 0.10, 0.20, 0.30, 0.40
Aa: 0.60, 0.40, 0.20, 0.00
aa: 0.30, 0.40, 0.50, 0.60

You can find genotype frequencies from allele frequencies ONLY if the population is at HW equilibrium at the locus.

PROBLEM #1.
In a population, the percentage of the homozygous recessive genotype (aa) is 36%. Calculate the following:
a. The frequency of the "aa" genotype.
b. The frequency of the "a" allele.
c. The frequency of the "A" allele.
d. The frequencies of the genotypes "AA" and "Aa."

Solution:
Frequency of aa $= 36\%$ or 0.36 (given)
We know that the frequency of aa $= q^2 = 0.36$
Frequency of a allele $= q = \sqrt{0.36} = 0.60$
Since $p+q=1$, frequency of A allele $= p = 1-q = 1-0.60 = 0.40$
Frequency of AA genotype $= p^2 = 0.40^2 = 0.16$
Frequency of Aa genotype $= 2pq = 2(0.4)(0.6) = 0.48$
Check: $0.16+0.48+0.36 = 1.00$

PROBLEM #2.
If 9% of an African population is born with a severe form of sickle-cell anemia (ss), what percentage of the population will be more resistant to malaria because they are heterozygous (Ss) for the sickle-cell gene?

Solution:
ss frequency: 9% or 0.09
$q^2 = 0.09$
$q = \sqrt{0.09} = 0.3$
$p = 1-0.3 = 0.7$
Ss frequency $= 2pq = 2(0.3)(0.7) = 0.42$ or 42%

PROBLEM #3.
Within a population of butterflies, the color brown (B) is dominant over the color white (b), and 40% of all butterflies are white. Given this simple information, calculate the following:
a. The percentage of butterflies in the population that are heterozygous.
b. The frequency of homozygous dominant individuals.

Solution:
Butterflies with BB and Bb genotypes are brown. Butterflies with the bb genotype are white.
Frequency of bb $= q^2 = 0.40$
$q = \sqrt{0.40} = 0.6325$
$p = 1-0.6325 = 0.3675$
a. Frequency of Bb genotype $= 2pq = 0.4649$ or 46.5%
b. Frequency of BB genotype $= p^2 = 0.1351$ or 13.5%

[In this problem, if the frequency of brown butterflies were given instead, the allele frequencies could not be read off it directly, because brown butterflies are a mixture of BB and Bb genotypes ($p^2 + 2pq$). Think about this. Why is this true?]

PROBLEM #4.
The following M, MN, N blood group data are for a population sample of 1,000 individuals:

MM: 490
MN: 420
NN: 90

Using the data provided above, calculate the following:
a. The frequency of each allele (M, N) in the population.
b. Supposing the matings are random, the frequencies of the matings.
c. The probability of each genotype resulting from each potential cross.

Solution:
a. [I originally calculated the allele frequencies as follows. Frequency of MM genotype $= \frac{490}{1000} = 0.49 = p^2$. Therefore, frequency of M allele $= p = 0.70$. Therefore, frequency of N allele $= q = 0.30$. Though the answers turned out to be correct, this method is wrong. They were correct only because this population happened to be in HW equilibrium. Use the following method, which is the correct way to derive allele frequencies from genotype frequencies.]
$p$ and $q$ can be found as follows:
$p = 0.49 + \frac{1}{2}(0.42) = 0.49+0.21 = 0.70$
$q = 0.09 + \frac{1}{2}(0.42) = 0.09+0.21 = 0.30$

b. Frequency of MN genotype $= \frac{420}{1000} = 0.42$
Frequency of NN genotype $= \frac{90}{1000} = 0.09$
Frequency of MM × MM mating $= 0.49 \times 0.49 = 0.2401$
Frequency of MM × MN mating $= 0.49 \times 0.42 = 0.2058$ for one parental order (for example, MM mother and MN father); counting both orders gives $2 \times 0.2058 = 0.4116$.
Calculate the other frequencies similarly.

c. Resulting genotype probabilities:
MM × MM mating results in 1.0 MM genotype.
MM × MN mating results in 0.5 MM and 0.5 MN genotypes.
MM × NN mating results in 1.0 MN genotype.
MN × MN mating results in 0.25 MM, 0.5 MN, and 0.25 NN genotypes.
MN × NN mating results in 0.5 MN and 0.5 NN genotypes.
NN × NN mating results in 1.0 NN genotype.

Testing HW equilibrium

Problem 1: Are the following data at HW equilibrium?

AA: 245
Aa: 210
aa: 45
Total: 500

Step 1: Find the true (observed) genotype frequencies.
AA genotype frequency $= \frac{245}{500} = 0.49$
Aa genotype frequency $= \frac{210}{500} = 0.42$
aa genotype frequency $= \frac{45}{500} = 0.09$

Step 2: Find the true allele frequencies.
$p = 0.49+(0.5)(0.42) = 0.70$
$q = 0.09+(0.5)(0.42) = 0.30$

Step 3: Find the expected genotype frequencies from the true allele frequencies.
AA: $p^2 = 0.49$
Aa: $2pq = 0.42$
aa: $q^2 = 0.09$

Step 4: Compare the true and predicted genotype frequencies.
Since the true and predicted frequencies are identical, the population with the given data is at HW equilibrium.

Problem 2:

AA: 400
Aa: 200
aa: 400
Total: 1000

Step 1: Find the true (observed) genotype frequencies.
AA genotype frequency $= \frac{400}{1000} = 0.40$
Aa genotype frequency $= \frac{200}{1000} = 0.20$
aa genotype frequency $= \frac{400}{1000} = 0.40$

Step 2: Find the true allele frequencies.
$p = 0.40+(0.5)(0.20) = 0.50$
$q = 0.40+(0.5)(0.20) = 0.50$

Step 3: Find the expected genotype frequencies from the true allele frequencies.
AA: $p^2 = 0.25$
Aa: $2pq = 0.50$
aa: $q^2 = 0.25$

Step 4: Compare the true and predicted genotype frequencies.
Since the true and predicted genotype frequencies differ substantially (for example, 0.20 observed versus 0.50 predicted for Aa), the population with the given data is not at HW equilibrium.

Lecture 6-3

The lecture looks at two different population samples. One is Navajo, and the other is Aborigine.
Each individual population is at HW equilibrium. But when they are combined, there is a dramatic deviation from HW.

Navajo genotype counts and frequencies
MM: 305, frequency $= \frac{305}{361} = 0.845$
MN: 52, frequency $= \frac{52}{361} = 0.144$
NN: 4, frequency $= \frac{4}{361} = 0.011$
Total: 361

Navajo allele frequencies
$p = 0.845+(0.5)(0.144) = 0.917$
$q = 0.011+(0.5)(0.144) = 0.083$

HW prediction:
MM: $p^2 = 0.841$
MN: $2pq = 0.152$
NN: $q^2 = 0.007$

Looking at the heterozygote (MN) frequency, this population is very close to HW equilibrium. Now let us look at the Aborigine population.

Aborigine genotype counts and frequencies
MM: 22, frequency $= \frac{22}{730} = 0.030$
MN: 216, frequency $= \frac{216}{730} = 0.296$
NN: 492, frequency $= \frac{492}{730} = 0.674$
Total: 730

Aborigine allele frequencies
$p = 0.030+(0.5)(0.296) = 0.178$
$q = 0.674+(0.5)(0.296) = 0.822$

HW prediction:
MM: $p^2 = 0.032$
MN: $2pq = 0.293$
NN: $q^2 = 0.676$

Looking at the heterozygote (MN) frequency, this population is also very close to HW equilibrium.

If we combine these two populations and assume there is no mating between them, the genotype counts of the combined population are simply the sums of the counts in the individual populations.

Navajo + Aborigine genotype counts and frequencies
MM: 327, frequency $= \frac{327}{1091} = 0.300$
MN: 268, frequency $= \frac{268}{1091} = 0.246$
NN: 496, frequency $= \frac{496}{1091} = 0.455$
Total: 1091

Combined allele frequencies
$p = 0.300+(0.5)(0.246) = 0.423$
$q = 0.455+(0.5)(0.246) = 0.577$ (computed from unrounded frequencies, so that $p+q=1$)

HW prediction (from unrounded $p$ and $q$):
MM: $p^2 = 0.179$
MN: $2pq = 0.488$
NN: $q^2 = 0.333$

Looking at the heterozygote (MN) frequency, the combined population is far from HW equilibrium. Specifically, the true heterozygote frequency (0.246) is much lower than predicted (0.488); in other words, there is a heterozygote deficit in the population. The primary reason is the absence of random mating between the populations. This heterozygote deficit relative to HW, caused by pooling distinct subpopulations, is called the Wahlund effect. The more the allele and genotype frequencies differ among subpopulations, the greater the overall deficiency of heterozygotes. If you sample from subpopulations that have diverged sufficiently, you will observe an overall deficiency of heterozygotes.

Genome-wide association studies (GWAS) should always test for HW. Otherwise, hidden population structure of this kind may wrongly suggest that a genotype near a SNP is causing the disease.
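The Navajo, Aborigine, and pooled comparisons in this lecture can be reproduced with a short script. The following is a minimal sketch in Python using the genotype counts given in these notes; the function names `allele_freqs` and `hw_expected` are illustrative choices, not something from the lecture.

```python
def allele_freqs(n_mm, n_mn, n_nn):
    """Return (p, q), the frequencies of alleles M and N, from genotype counts."""
    total_alleles = 2 * (n_mm + n_mn + n_nn)  # each individual carries two alleles
    p = (2 * n_mm + n_mn) / total_alleles
    return p, 1 - p


def hw_expected(p, q):
    """Return the HW-expected genotype frequencies (MM, MN, NN)."""
    return p * p, 2 * p * q, q * q


# Genotype counts (MM, MN, NN) from the lecture notes.
samples = {
    "Navajo": (305, 52, 4),
    "Aborigine": (22, 216, 492),
}
# Pooling without interbreeding just adds the counts.
samples["Pooled"] = tuple(
    a + b for a, b in zip(samples["Navajo"], samples["Aborigine"])
)

for name, (n_mm, n_mn, n_nn) in samples.items():
    total = n_mm + n_mn + n_nn
    p, q = allele_freqs(n_mm, n_mn, n_nn)
    _, expected_het, _ = hw_expected(p, q)
    observed_het = n_mn / total
    print(
        f"{name}: p={p:.3f} q={q:.3f} "
        f"observed MN={observed_het:.3f} expected 2pq={expected_het:.3f}"
    )
```

Working from counts rather than rounded frequencies avoids the small rounding drift that creeps in when $p$ and $q$ are built from three-decimal genotype frequencies. For each separate sample the observed and expected heterozygote frequencies nearly match, while for the pooled sample the observed value falls far below $2pq$.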
Video quiz problem: A population has the following genotype data:

AA: 644
Aa: 435
aa: 116
Total: 1195

Calculate the Hardy-Weinberg frequencies. If these frequencies differ from those in the actual population, speculate as to the cause.

Observed genotype frequencies
AA: $\frac{644}{1195} = 0.539$
Aa: $\frac{435}{1195} = 0.364$
aa: $\frac{116}{1195} = 0.097$

Allele frequencies
$p = 0.539+(0.5)(0.364) = 0.721$
$q = 0.097+(0.5)(0.364) = 0.279$

HW prediction:
AA: $p^2 = 0.520$
Aa: $2pq = 0.402$
aa: $q^2 = 0.078$

The observed heterozygote frequency (0.364) is lower than the HW prediction (0.402). One possible cause of such a heterozygote deficit is population subdivision (the Wahlund effect).

Lecture 6-4

Sub-populations may differ in allele and genotype frequencies. One sub-population may have alleles in some genes that are not found in other populations if the mutations are recent. In the extreme case, the sub-populations are completely isolated; this is less common in modern human ethnic groups.

How to quantify?

$$F_{ST} = \frac{\text{Predicted heteroz. freq.} - \text{Observed heteroz. freq.}}{\text{Predicted heteroz. freq.}} = \frac{\text{HW } 2pq - \text{Observed heteroz. freq.}}{\text{HW } 2pq}$$

Example problems: [I did these calculations using Excel.]

Example 1: In this problem, the alleles are fixed in the populations (pop 1 has only AA, pop 2 has only aa).

Pop 1: AA 100, Aa 0, aa 0
Pop 2: AA 0, Aa 0, aa 100
Pop 1 + Pop 2: AA 100, Aa 0, aa 100, Total 200

Genotype AA frequency $= \frac{100}{200} = 0.5$
Genotype Aa frequency $= \frac{0}{200} = 0$
Genotype aa frequency $= \frac{100}{200} = 0.5$
$p = 0.5+(0.5)(0) = 0.5$
$q = 0.5+(0.5)(0) = 0.5$
HW $2pq = 2(0.5)(0.5) = 0.5$
Observed heteroz. (Aa) freq. $= 0$
$F_{ST} = \frac{0.5-0}{0.5} = 1$

Example 2:

Pop 1: AA 250, Aa 500, aa 250, Total 1000
Pop 2: AA 490, Aa 420, aa 90, Total 1000

Pop 1 + Pop 2:
AA: $250+490 = 740$, frequency $= \frac{740}{2000} = 0.37$
Aa: $500+420 = 920$, frequency $= \frac{920}{2000} = 0.46$
aa: $250+90 = 340$, frequency $= \frac{340}{2000} = 0.17$
Total: $1000+1000 = 2000$

$p = 0.37+(0.5)(0.46) = 0.6$
$q = 0.17+(0.5)(0.46) = 0.4$
HW $2pq = 0.48$
Observed heteroz. (Aa) freq. $= 0.46$
$F_{ST} = \frac{0.48-0.46}{0.48} = 0.042$

How to interpret $F_{ST}$?
If $F_{ST}=0$, there are no allele frequency differences between the two populations.
If $0<F_{ST}<1$, allele frequencies differ somewhat between the two populations.
If $F_{ST}=1$, different alleles are fixed in the two populations.

$F_{ST}$ is larger when comparing two populations that have bigger differences in allele frequencies. $F_{ST}$ would be zero when the two populations have identical allele frequencies. $F_{ST}=1$ if alleles are fixed in the populations. In the last example problem, $F_{ST}$ is small (0.042), indicating that there has been some gene flow between the populations and that mating is not entirely non-random.

Lecture 6-5

Gene flow is due to migration between populations. Gene flow is the great homogenizing force in evolution: it makes allele frequencies converge (or undoes their divergence). This gene flow is assumed to be random with respect to genotype.

Two broad categories of models of gene flow:

Continent-island model: the continent has a huge effect on the island, but the island has a negligible effect on the continent.

Island model: multiple neighboring islands affect each other's allele frequencies. Suppose we have four islands exchanging genes with each other, starting with allele frequencies of 0.9, 0.65, 0.35, and 0.1. Over time, they will converge to the mean of these allele frequencies (0.5). This assumes symmetric migration between the islands. The speed of convergence depends on the migration rate and the magnitude of the difference in allele frequencies.

Lecture 6-6

Inbreeding

The extreme form of inbreeding is self-fertilization. Over many generations of selfing, the heterozygote frequency goes down and eventually becomes zero. Inbreeding, even without selfing, reduces the heterozygote frequency.

How do we measure it? Inbreeding is measured by $F$, which is calculated in much the same way as $F_{ST}$.

Example problem:

AA: 553, frequency $= \frac{553}{1000} = 0.553$
Aa: 294, frequency $= \frac{294}{1000} = 0.294$
aa: 153, frequency $= \frac{153}{1000} = 0.153$
Total: 1000

$p = 0.553+(0.5)(0.294) = 0.7$
$q = 0.153+(0.5)(0.294) = 0.3$
HW $2pq = 2(0.7)(0.3) = 0.42$
Observed heteroz. freq. $= 0.294$

$$F = \frac{\text{HW } 2pq - \text{Observed heteroz. freq.}}{\text{HW } 2pq} = \frac{0.42-0.294}{0.42} = 0.3$$

An $F$ of 0.3 represents pretty severe inbreeding.

$F$ vs. $F_{ST}$

Both are based on the same principle. Both indicate non-random mating. One applies within a population (inbreeding) and the other between populations. Both quantify a heterozygote deficit relative to HW expectations, the same signature seen in the Wahlund effect.
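Because $F$ and $F_{ST}$ are computed with the same formula in these notes, both worked examples can be checked with one small function. This is an illustrative Python sketch, not code from the lecture (the notes mention Excel); the name `het_deficit` is an assumption. It applies (HW $2pq$ − observed heteroz. freq.) / HW $2pq$ to a set of genotype counts: pooled counts for $F_{ST}$, single-population counts for $F$.

```python
def het_deficit(n_AA, n_Aa, n_aa):
    """Relative heterozygote deficit: (HW 2pq - observed Aa freq) / HW 2pq."""
    total = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * total)  # allele frequency of A from counts
    q = 1 - p
    expected_het = 2 * p * q
    if expected_het == 0:
        # Only one allele is present, so the ratio is undefined.
        raise ValueError("2pq is zero; the deficit cannot be computed")
    observed_het = n_Aa / total
    return (expected_het - observed_het) / expected_het


# F_ST examples use the pooled (pop 1 + pop 2) counts from the notes.
fst_fixed = het_deficit(100, 0, 100)    # example 1: alleles fixed
fst_mixed = het_deficit(740, 920, 340)  # example 2
# Inbreeding example uses a single population's counts.
f_inbred = het_deficit(553, 294, 153)

print(round(fst_fixed, 3), round(fst_mixed, 3), round(f_inbred, 3))
```

The three results correspond to the values worked out by hand in the notes: 1, 0.042, and 0.3. Whether a value is read as $F_{ST}$ or as $F$ depends only on whether the counts passed in were pooled across populations or taken from one population.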