1. Consider two possible alignments of two sequences below:
a)
#1 ATCGTTCAGG--TCTTGGACATTAAGACAAAAAACATGCATAGCAT
#2 ATG-ACAGGGGGTCATGGACAATAAGTCAA----CATCCACAGAAT
b)
#1 ATCGTTCAGGTCTTGGACATTAAGACAAAAAACATGCATAGCAT----
#2 ATG-------ACAGGGGGTCATGGACAATAAGTCAACATCCACAGAAT
If the gap:change cost is 2 and gaps of any length are counted equally, what is the cost of each alignment? Which is preferable?
What are the costs if the gap:change cost is 10? Which is preferable?
What are the costs if gap:change=2 and the gap extension penalty is 0.1 per site? Which is preferable?
In alignment b, what evolutionary event does the gap in sequence #2 represent?
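The scoring scheme in problem 1 can be sketched in code. This is a minimal sketch, not a definitive scoring method: the function names and the toy alignment are hypothetical. It assumes that a change costs 1, that each gap run costs `gap_ratio` times a change regardless of its length, and that the extension penalty is charged for every gapped site, including the first. Some texts charge extension only from the second site onward, so check which convention your course uses.

```python
def gap_runs_and_sites(row):
    """Return (number of gap runs, number of gapped sites) in one aligned row."""
    runs = 0
    sites = 0
    in_gap = False
    for ch in row:
        if ch == "-":
            sites += 1
            if not in_gap:
                runs += 1  # a new gap run starts here
            in_gap = True
        else:
            in_gap = False
    return runs, sites


def alignment_cost(row1, row2, gap_ratio, ext_per_site=0.0, change_cost=1.0):
    """Cost = changes + gap runs * (gap_ratio * change_cost) + extension.

    Assumption: extension is charged for every gapped site.
    """
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have equal length")
    # Count mismatches only in columns where neither row has a gap.
    changes = sum(1 for x, y in zip(row1, row2)
                  if x != "-" and y != "-" and x != y)
    runs1, sites1 = gap_runs_and_sites(row1)
    runs2, sites2 = gap_runs_and_sites(row2)
    gap_open = gap_ratio * change_cost * (runs1 + runs2)
    gap_ext = ext_per_site * (sites1 + sites2)
    return change_cost * changes + gap_open + gap_ext


# Toy alignment (not from the problem): 1 change, 2 gap runs, 3 gapped sites.
toy1 = "ACGTT--A"
toy2 = "A-GATCCA"
print(alignment_cost(toy1, toy2, gap_ratio=2))
print(alignment_cost(toy1, toy2, gap_ratio=2, ext_per_site=0.1))
```

Calling `alignment_cost` on each pair of rows in alignments a and b, with the three parameter settings in the problem, gives the costs to compare. The alignment with the lower cost is preferred.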
2. For alignment a in problem 1 above, what is the nucleotide diversity of these two sequences (leave gapped positions out of the calculation)?
Imagine a population in which there are two alleles: sequences 1 and 2 in problem 1 above. If the frequency of allele #1 is 0.77 and the frequency of allele #2 is 0.23, what is the nucleotide diversity in the population?
3. Imagine a population in which there are four alleles of a short gene. Using the following alignment and frequencies, calculate the nucleotide diversity in the population.
seq #  alignment                                       freq
1      ATCGTTCAGG--TCTTGGACATTAAGACAAAAAACATGCATAGCAT  0.40
2      ATG-ACAGGGGGTCATGGACAATAAGTCAA----CATCCACAGAAT  0.08
3      ATA-ACAGGCGGTCATGCACAATAAGTCTA----CATCGACAGAAT  0.02
4      ATG-ACAGGGGCTCTTGGGCAATTAGTCAG----GATCCACAGAAT  0.50
Express in a sentence the meaning of this result. If another population
were found to have a nucleotide diversity of 0.001 for the same locus,
what might be the cause of the difference between the two populations?
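For problems 2 and 3, one common definition of nucleotide diversity is the frequency-weighted average of pairwise differences per site, summed over all pairs of distinct alleles. The sketch below assumes that definition, with no sample-size correction, and drops gapped columns pair by pair. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_difference(a, b):
    """Proportion of differing sites, skipping columns with a gap in either row."""
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in compared) / len(compared)


def nucleotide_diversity(seqs, freqs):
    """pi = sum over pairs i < j of 2 * x_i * x_j * pi_ij (assumed definition)."""
    if len(seqs) != len(freqs):
        raise ValueError("need one frequency per sequence")
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies should sum to 1")
    pi = 0.0
    for i, j in combinations(range(len(seqs)), 2):
        pi += 2 * freqs[i] * freqs[j] * pairwise_difference(seqs[i], seqs[j])
    return pi


# Toy data (not from the problem): two alleles differing at 1 of 4 sites.
print(nucleotide_diversity(["ACGT", "ACGA"], [0.5, 0.5]))
```

With two alleles the sum has a single term, so problem 2 reduces to twice the product of the two frequencies times the pairwise difference.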
4. Suppose that 5 different allozymes are examined by electrophoresis in two rodent populations, one of house mice (M) and one of Norway rats (R). Each gene is represented by a number (1 through 5), and each allele of the gene is represented by a small letter. Suppose the alleles exist in the populations in the following frequencies:
Allele frequency
M1a 0.96
M1b 0.03
M1c 0.01
R1a 0.92
R1b 0.04
R1c 0.04
M2a 0.64
M2b 0.31
M2c 0.05
R2a 0.50
R2b 0.31
R2c 0.19
M3a 0.15
M3b 0.82
M3c 0.03
R3a 0.03
R3b 0.96
R3c 0.01
M4a 0.01
M4b 0.01
M4c 0.98
R4a 0.03
R4b 0.03
R4c 0.94
M5a 0.34
M5b 0.22
M5c 0.34
R5a 0.68
R5b 0.01
R5c 0.31
What percent of the loci are polymorphic at the 95 percent level in mice? In rats? For each species, what percent are polymorphic using a 99 percent cutoff? Express in a sentence the meaning of these results.
What is the average heterozygosity in mice? In rats? Express in a sentence what this result means.
What forces in evolution might have caused this difference?
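The two summary statistics in problem 4 can be sketched as follows. The sketch assumes that a locus counts as polymorphic at a given level when its most common allele has a frequency no greater than the cutoff (0.95 or 0.99), and that heterozygosity at a locus is the Hardy-Weinberg expectation, one minus the sum of squared allele frequencies. The function names and toy loci are hypothetical.

```python
def is_polymorphic(freqs, cutoff):
    """True if the most common allele has frequency <= cutoff (assumed rule)."""
    return max(freqs) <= cutoff


def locus_heterozygosity(freqs):
    """Expected heterozygosity under Hardy-Weinberg: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def summarize(loci, cutoff):
    """Return (percent polymorphic loci, average heterozygosity)."""
    n = len(loci)
    n_poly = sum(is_polymorphic(f, cutoff) for f in loci.values())
    mean_h = sum(locus_heterozygosity(f) for f in loci.values()) / n
    return 100.0 * n_poly / n, mean_h


# Toy data (not from the problem): two loci with two alleles each.
toy_loci = {"L1": [0.96, 0.04], "L2": [0.50, 0.50]}
print(summarize(toy_loci, cutoff=0.95))
print(summarize(toy_loci, cutoff=0.99))
```

Averaging over all loci, including monomorphic ones, keeps the average heterozygosity comparable between species.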
5. Consider the following two short protein sequences of the same protein from two taxa (the spaces are inserted just to make counting easier):
#1 CVPCFFKRIS QGHQRNDCEG CKSAISYNGM
#2 CIPCFFKRIT QGHQRNECEG CKSALTYNGM
What is the proportion of observed differences between the sequences?
What is the probability that a substitution occurred at any one amino acid site since their divergence?
If the two taxa are two species of apes that diverged from each
other 5 million years ago, what is the rate of amino acid substitution?
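One way to approach problem 5 is with the Poisson correction, which converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = -ln(1 - p). The rate per site per year is then d divided by twice the divergence time, because substitutions accumulate along both lineages. The sketch below assumes that approach. The function names and toy sequences are hypothetical.

```python
import math


def p_distance(a, b):
    """Proportion of sites that differ; spaces used for readability are ignored."""
    a = a.replace(" ", "")
    b = b.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def poisson_distance(p):
    """Poisson-corrected substitutions per site: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


def substitution_rate(d, divergence_time):
    """Rate per site per unit time; 2T because both lineages have evolved."""
    return d / (2.0 * divergence_time)


# Toy sequences (not from the problem): 1 difference in 4 sites.
p = p_distance("ACDE", "ACDF")
d = poisson_distance(p)
print(p, d, substitution_rate(d, 1_000_000))
```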
6. For the following nucleotide sequences:
ATG CTA AAC GGA CAT TGT CAT GAT GGG CAC AGT
ATA CTA AAT GGT CTT TGC AAT CAT GGA CAT AGC
What is the proportion of observed differences between the sequences?
What is the probability that a substitution occurred at any one nucleotide site since their divergence, assuming a Jukes-Cantor model?
What is the probability that a substitution occurred at any one nucleotide site since their divergence, assuming Kimura’s 2-parameter model?
What are Ks and Ka for these two sequences? What does this suggest about their evolution?
If the two taxa are two species of apes that diverged from each other 5 million years ago, what is the total rate of nucleotide substitution, assuming a Jukes-Cantor model? Using the same model, what is the rate of synonymous substitution? What is the rate of nonsynonymous substitution? What is the rate of mutation? What has been the average length of time between nonsynonymous substitutions?
7. Consider a stretch of junk DNA (noncoding, nonfunctional) that is 1000 nucleotides long. Assume that the mutation rate is high: 2 changes per site per million generations. In a population of 10,000 individuals, how many new alleles will be created in the population each generation? What fraction of these will ultimately be fixed? What is the rate of molecular evolution?
In a population of 50 individuals with the same mutation rate, how many new alleles will be created per generation? What fraction of these will ultimately be fixed? What is the rate of molecular evolution?
Which population will have greater heterozygosity at any one time?
Why?
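The neutral-theory bookkeeping behind problem 7 can be sketched as follows, assuming a diploid population, so that there are 2N gene copies, and strictly neutral mutations. Under those assumptions the number of new mutations per generation is 2N times the per-sequence mutation rate, each new mutation fixes with probability 1/(2N), and the product, the substitution rate, does not depend on N. The function names and toy numbers are hypothetical.

```python
def new_mutations_per_generation(n_individuals, mu_per_site, n_sites, ploidy=2):
    """Expected new mutant alleles per generation across the whole population."""
    return ploidy * n_individuals * mu_per_site * n_sites


def neutral_fixation_probability(n_individuals, ploidy=2):
    """A new neutral mutation starts at frequency 1/(ploidy * N)."""
    return 1.0 / (ploidy * n_individuals)


def neutral_substitution_rate(n_individuals, mu_per_site, n_sites, ploidy=2):
    """New mutations per generation times their fixation probability."""
    new = new_mutations_per_generation(n_individuals, mu_per_site, n_sites, ploidy)
    return new * neutral_fixation_probability(n_individuals, ploidy)


# Toy numbers (not from the problem): the rate is the same for both sizes.
for n in (500, 5):
    print(n, neutral_substitution_rate(n, mu_per_site=1e-6, n_sites=100))
```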
8. In a population of 10,000 individuals with a neutral mutation rate of 1 change per site per 10 million generations, what is the expected heterozygosity for each site? What is the expected heterozygosity for a protein that is 500 amino acids long? Express the meaning of this result in a sentence.
What is the expected heterozygosity if there are 12 individuals in the population?
If a population is found to have high heterozygosity, is this evidence
of balancing selection?
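For problem 8, a common starting point is the infinite-alleles expectation for a diploid population at mutation-drift equilibrium, H = 4Nu / (4Nu + 1). The sketch below assumes that model and treats N as the effective population size. For a whole gene it further assumes that the per-gene mutation rate is the per-site rate times the number of nucleotide sites. The function name and toy numbers are hypothetical.

```python
def neutral_heterozygosity(n_effective, mu):
    """Infinite-alleles equilibrium: H = theta / (1 + theta), theta = 4 * N * mu."""
    theta = 4.0 * n_effective * mu
    return theta / (1.0 + theta)


# Toy numbers (not from the problem).
mu_site = 1e-8                  # hypothetical per-site rate per generation
n_sites = 3 * 200               # a 200-codon gene, assuming 3 sites per codon
print(neutral_heterozygosity(50_000, mu_site))            # per site
print(neutral_heterozygosity(50_000, mu_site * n_sites))  # per gene
```

Because H rises with N under this model, a large neutral population can show high heterozygosity with no selection at all.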
9. Consider a single amino acid substitution that confers
a selective advantage of 0.02 percent on the individual that carries it
and 0.04 percent on an individual in which it is homozygous. In a
population of 100, will this mutation be more strongly affected by selection
or drift? What about in a population of 1000? In a population
of 10,000? How do you know? Why does population size matter?
10. Suppose that scientists at the University of Washington (better known as U-Dub) obtain the sequence of a certain gene from 100 individuals in one population of coho salmon and another population of chinook salmon. The gene is 1000 nucleotides long. 900 of the sites are invariant in all sequences examined.
At forty sites, every individual of both species has a G, except for four coho and five chinook, all of which have T at these sites. Of these sites, 10 cause an amino acid change in the translated protein.
At sixty sites, all chinook sequences have an A; all coho sequences have a T. Of these sites, 5 cause an amino acid change.
Is this finding consistent with the neutral theory? Why or
why not? What does it suggest about the role of selection on this
sequence? If there are multiple interpretations, note both.
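Data of this shape are often summarized in the style of a McDonald-Kreitman comparison: the ratio of nonsynonymous to synonymous changes among sites that are polymorphic within species is set against the same ratio among fixed differences between species. Under strict neutrality the two ratios are expected to be similar. The sketch below computes the neutrality index, NI = (Pn/Ps) / (Dn/Ds). It is an illustration under that framing, not a full statistical test. The function name and toy counts are hypothetical.

```python
def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn / Ps) / (Dn / Ds).

    pn, ps: nonsynonymous and synonymous polymorphic sites
    dn, ds: nonsynonymous and synonymous fixed differences
    """
    if ps == 0 or dn == 0 or ds == 0:
        raise ValueError("ratio undefined when ps, dn, or ds is zero")
    return (pn / ps) / (dn / ds)


# Toy counts (not from the problem).
print(neutrality_index(pn=2, ps=8, dn=1, ds=4))  # equal ratios
print(neutrality_index(pn=4, ps=4, dn=1, ds=4))  # excess nonsynonymous polymorphism
```

An index above 1 is commonly read as an excess of nonsynonymous polymorphism, for example from slightly deleterious variants that segregate but rarely fix. An index below 1 is commonly read as an excess of nonsynonymous divergence, which is consistent with positive selection.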