How a distance matrix can represent raw counts of mutations as in today’s example
Pairwise distances as linear distances
How a distance matrix can be represented as a 2D plot
Clustering of the taxa with the shortest pairwise distance into a clade
For example, I could give you a matrix and ask, “Which 2 taxa would form the first clade in a tree made from this matrix?”
Representing simple phylogenetic trees in Newick notation
Recalculation of distances in a distance matrix after the first clade has been formed.
For example, in class, after clade (a,b) was formed, I calculated the distance from (a,b) to e, noted how the distances from (a,b) to c and from (a,b) to d would have to be calculated, and noted that the distance from c to d would not change.
For example, given a matrix of distances you should be able to determine which 2 taxa form the first clade, and then create the updated matrix with any distances that need to be re-calculated.
You will not need to carry out the UPGMA algorithm in its entirety, calculate branch lengths, or draw a full tree from a distance matrix. We’ll save this for the next unit.
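The first-clade steps listed above (find the closest pair, form the clade, recalculate only the distances that involve the new clade) can be sketched in Python. This is a minimal sketch, not the in-class example: the matrix values, the taxon names, and the helper names `d`, `closest_pair`, and `merge_first_clade` are all made up for illustration.

```python
from itertools import combinations

# Hypothetical distance matrix (mutation counts); values are made up.
taxa = ["a", "b", "c", "d", "e"]
dist = {
    ("a", "b"): 2, ("a", "c"): 6, ("a", "d"): 10, ("a", "e"): 9,
    ("b", "c"): 8, ("b", "d"): 12, ("b", "e"): 11,
    ("c", "d"): 8, ("c", "e"): 7,
    ("d", "e"): 3,
}


def d(x, y, table):
    """Look up a distance regardless of the order of the pair."""
    return table[(x, y)] if (x, y) in table else table[(y, x)]


def closest_pair(names, table):
    """Return the two taxa with the smallest pairwise distance."""
    return min(combinations(names, 2), key=lambda p: d(p[0], p[1], table))


def merge_first_clade(names, table):
    """Form the first clade and return the updated names and distances."""
    x, y = closest_pair(names, table)
    clade = f"({x},{y})"
    rest = [n for n in names if n not in (x, y)]
    new_table = {}
    # Distances between taxa outside the clade carry over unchanged.
    for p, q in combinations(rest, 2):
        new_table[(p, q)] = d(p, q, table)
    # Distance from the new clade to each other taxon is the average
    # of the two members' distances to that taxon.
    for n in rest:
        new_table[(clade, n)] = (d(x, n, table) + d(y, n, table)) / 2
    return [clade] + rest, new_table


new_taxa, new_dist = merge_first_clade(taxa, dist)
print(new_taxa)
for pair, value in new_dist.items():
    print(pair, value)
```

With these made-up values, a and b form the first clade, the c to d distance stays at 8, and the (a,b) to e distance becomes the average of 9 and 11.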
C is 27 from A and A is 31 from B, so A is closer to C than to B.
If B and F are combined into a clade, take the average of their distances to each remaining taxon.
#UPGMA
BLAST alignment: does local alignment
Global Alignment tends to give a lower % identity
Global alignment shows a big difference in gaps
With UPGMA we will be focusing on differences
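As a rough illustration of "focusing on differences", the sketch below counts mismatched positions between aligned sequences to fill in a distance matrix of raw counts. The sequences and the function name `count_differences` are made up for this example, and it assumes the sequences are already aligned to the same length.

```python
from itertools import combinations

# Hypothetical aligned sequences of equal length (made up for illustration).
seqs = {
    "a": "ACGTACGT",
    "b": "ACGTACGA",
    "c": "ACGAACTA",
}


def count_differences(s1, s2):
    """Count the positions where two aligned sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(s1, s2) if x != y)


# One distance per pair of taxa.
matrix = {
    (p, q): count_differences(seqs[p], seqs[q])
    for p, q in combinations(seqs, 2)
}
for pair, value in matrix.items():
    print(pair, value)
```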
Each pairwise comparison has its own distance, which can be plotted as a linear distance. These distances can be plotted on a 2D plane, which makes them easier to conceptualize.
Data reduction: taking a lot of data and bringing it down to a lower dimensionality
Newick notation: (taxa, taxa) shows a clade / cluster, e.g. (a,b). The topology of any tree can be represented with Newick notation.
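A small sketch of how nested parentheses build up a Newick string. The helper name `clade` and the topology are hypothetical; the trailing semicolon is the standard Newick terminator for a complete tree.

```python
def clade(*members):
    """Wrap taxa or sub-clades in parentheses, separated by commas."""
    return "(" + ",".join(members) + ")"


first = clade("a", "b")                   # the first clade
inner = clade(first, "c")                 # c joins the (a,b) clade
tree = clade(inner, clade("d", "e")) + ";"

print(first)
print(tree)
```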
Building a tree based on the 2D structure: once a clade is made, combine its two taxa into a single unit and average their distances to each of the remaining taxa.
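UPGMA averages so that every original taxon counts equally (the average is scaled by clade size); WPGMA takes the plain average of the two merged clades' distances, which ends up weighting taxa unequally. For the first clade, made of two single taxa, both give the same number.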
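As the two methods are usually defined, the difference only shows up when a clade that already has more than one member is merged again. A minimal sketch, with made-up distances and hypothetical function names: UPGMA scales each distance by clade size, while WPGMA takes the plain mean of the two clade distances.

```python
def upgma_distance(d_ik, d_jk, size_i, size_j):
    """Size-scaled average, so every original taxon counts equally."""
    return (size_i * d_ik + size_j * d_jk) / (size_i + size_j)


def wpgma_distance(d_ik, d_jk):
    """Plain mean of the two clade distances, ignoring clade sizes."""
    return (d_ik + d_jk) / 2


# Hypothetical second merge: clade (a,b) (2 taxa) joins c (1 taxon).
# Made-up distances from each of them to taxon e:
d_ab_e = 10.0
d_c_e = 7.0

u = upgma_distance(d_ab_e, d_c_e, size_i=2, size_j=1)
w = wpgma_distance(d_ab_e, d_c_e)
print(u, w)
```

When both clades being merged are single taxa, the sizes are 1 and 1 and the two functions return the same value.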