C3020 — Molecular Evolution

Exercises #3: Phylogenetics

 
Consider the following sequences for five taxa (1-5) and the known outgroup O, which carries the ancestral states.
 

1 ACAAACAGTT CGATCGATTT GCAGTCTGGG

2 ACAAACAGTT TCTAGCGATT GCAGTCAGGG

3 ACAGACAGTT CGATCGATTT GCAGTCTCGG

4 ACTGACAGTT CGATCGATTT GCAGTCAGAG

5 ATTGACAGTT CGATCGATTT GCAGTCAGGA

O TTTGACAGTT CGATCGATTT GCAGTCAGGG

 
1. Make a distance matrix using raw distances (number of differences) for the five ingroup sequences.
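If you want to check your hand count, the raw distance between two aligned sequences is simply the number of positions at which they differ. The following is a minimal sketch, assuming the sequences are entered as plain strings with the spacing removed; the names seqs, raw_distance, and dist are illustrative and not part of the exercise.

```python
from itertools import combinations

# Ingroup sequences copied from the alignment above, spaces removed.
seqs = {
    "1": "ACAAACAGTTCGATCGATTTGCAGTCTGGG",
    "2": "ACAAACAGTTTCTAGCGATTGCAGTCAGGG",
    "3": "ACAGACAGTTCGATCGATTTGCAGTCTCGG",
    "4": "ACTGACAGTTCGATCGATTTGCAGTCAGAG",
    "5": "ATTGACAGTTCGATCGATTTGCAGTCAGGA",
}


def raw_distance(a, b):
    """Count the aligned positions at which sequences a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# One entry per unordered pair of taxa (the upper triangle of the matrix).
dist = {
    (i, j): raw_distance(seqs[i], seqs[j])
    for i, j in combinations(seqs, 2)
}

for (i, j), d in dist.items():
    print(f"d({i},{j}) = {d}")
```

Because raw distances are symmetric and the diagonal is zero, the ten pairwise values printed here are enough to fill in the whole 5 x 5 matrix.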

 

2. Infer the UPGMA tree for these sequences from your matrix. Label the branches with their lengths.
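As a reminder of the procedure, UPGMA repeatedly joins the two closest clusters, places the new node at half the distance between them, and recomputes distances to the new cluster as a size-weighted average. The sketch below illustrates this on a hypothetical four-taxon matrix (taxa A-D, not the exercise data); the function name upgma and the Newick-style output are assumptions made for illustration, and ties are broken arbitrarily by taking the first minimum found.

```python
from itertools import combinations


def upgma(labels, dist):
    """Return a Newick-style string for the UPGMA tree.

    labels: list of leaf names.
    dist:   dict mapping a pair (a, b) of leaf names to their distance.
    """
    # Each cluster id maps to (newick text, node height, number of leaves).
    clusters = {name: (name, 0.0, 1) for name in labels}
    d = {frozenset(pair): float(v) for pair, v in dist.items()}

    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2  # new node sits at half the distance
        (na, ha, sa), (nb, hb, sb) = clusters.pop(a), clusters.pop(b)

        # Branch length = height of the new node minus height of the child.
        new = f"({na}:{height - ha:g},{nb}:{height - hb:g})"

        # Distance to every other cluster: average weighted by cluster size.
        for c in clusters:
            d[frozenset((new, c))] = (
                sa * d[frozenset((a, c))] + sb * d[frozenset((b, c))]
            ) / (sa + sb)
        clusters[new] = (new, height, sa + sb)

    return next(iter(clusters.values()))[0] + ";"


# Hypothetical toy matrix, chosen to be ultrametric.
toy = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 10,
}
tree = upgma(["A", "B", "C", "D"], toy)
print(tree)
```

The numbers after each colon are the branch lengths the question asks you to label; in a UPGMA tree every leaf ends up at the same total distance from the root.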

 

3. Using the parsimony criterion,

 
4. Using neighbor-joining,  
5. You now have three trees: a UPGMA tree (problem #2), a parsimony tree (#3d), and an NJ tree (#4i). (Although you have not examined all possible phylogenies for either parsimony or NJ, assume that the best tree from question 3 is the most parsimonious tree and the best tree from question 4 is the best NJ tree. These are, in fact, the optimal trees for their respective methods.)

6. Morphological data do not resolve the relationships among the mammalian orders primates, artiodactyls, and rodents. You would like to use molecular data to establish which lineage diverged first from the others. In your laboratory, your research assistant obtains sequences from a cow, a human, and a mouse for three genes: psi-n-globin (a pseudogene of globin, the oxygen-transporting protein in blood), histone A1 (one of the proteins that packs DNA in chromatin), and 18S ribosomal RNA. You use parsimony for a phylogenetic analysis.