How a distance matrix can represent raw counts of mutations as in today’s example
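As a rough sketch of the idea, the raw mutation count between two aligned sequences is just the number of positions where they differ. The sequences below are made up for illustration (they are not the class example), and the names `count_differences` and `distance_matrix` are hypothetical helpers.

```python
from itertools import combinations

# Hypothetical aligned sequences of equal length; NOT the data from class.
seqs = {
    "a": "ACGTACGT",
    "b": "ACGTACGA",
    "c": "ACGAACTA",
}

def count_differences(s1, s2):
    """Count positions where two aligned sequences differ.

    Assumption: any mismatch (including a gap character) counts as 1.
    """
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(s1, s2) if x != y)

def distance_matrix(seqs):
    """Return {(taxon1, taxon2): raw mutation count} for every pair."""
    return {
        (t1, t2): count_differences(seqs[t1], seqs[t2])
        for t1, t2 in combinations(sorted(seqs), 2)
    }

dist = distance_matrix(seqs)
for pair, d in dist.items():
    print(pair, d)
```

Only one entry per pair is stored because the matrix is symmetric (a to b is the same as b to a).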

Pairwise distances as linear distances

How a distance matrix can be represented as a 2D plot

Clustering of the taxa with the shortest pairwise distance into a clade

For example, I could give you a matrix and ask “which 2 taxa would form the first clade in a tree made from this matrix”
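A question like this comes down to finding the smallest entry in the matrix. Here is a minimal sketch, assuming the matrix is stored as a dictionary of pairs; the distances are invented for illustration and `first_clade` is a hypothetical helper name.

```python
# Hypothetical distances between taxa a-e; NOT the matrix used in class.
dist = {
    ("a", "b"): 2, ("a", "c"): 6, ("a", "d"): 7, ("a", "e"): 9,
    ("b", "c"): 6, ("b", "d"): 8, ("b", "e"): 9,
    ("c", "d"): 4, ("c", "e"): 8,
    ("d", "e"): 8,
}

def first_clade(dist):
    """Return the pair of taxa with the smallest pairwise distance.

    If two pairs tie, min() returns whichever appears first in the dict.
    """
    return min(dist, key=dist.get)

pair = first_clade(dist)
newick = "({},{})".format(*pair)
print(pair, newick)
```

With these made-up numbers, a and b are the closest pair, so they form the first clade.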

Representing simple phylogenetic trees in Newick notation

Recalculation of distances in a distance matrix after the first clade has been formed.

For example, in class after clade (a,b) is formed I calculated the distance from (a,b) to e, noted how (a,b) to c and (a,b) to d would have to be calculated, and that c to d would not change.

For example, given a matrix of distances you should be able to determine which 2 taxa form the first clade, and then create the updated matrix with any distances that need to be re-calculated
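Both steps can be sketched together: find the closest pair, then rebuild the matrix. This assumes that the distance from the new clade to each remaining taxon is the average of its two members' distances (which is what UPGMA does when both members are single taxa). The matrix values are invented and `merge_first_clade` is a hypothetical helper.

```python
# Hypothetical distances between taxa a-e; NOT the matrix used in class.
dist = {
    ("a", "b"): 2, ("a", "c"): 6, ("a", "d"): 7, ("a", "e"): 9,
    ("b", "c"): 6, ("b", "d"): 8, ("b", "e"): 9,
    ("c", "d"): 4, ("c", "e"): 8,
    ("d", "e"): 8,
}

def merge_first_clade(dist):
    """Form the first clade and return (clade_name, updated_matrix)."""
    x, y = min(dist, key=dist.get)          # closest pair of taxa
    clade = "({},{})".format(x, y)

    def d(p, q):
        # The matrix is symmetric, so look the pair up in either order.
        return dist[(p, q)] if (p, q) in dist else dist[(q, p)]

    others = sorted({t for pr in dist for t in pr} - {x, y})
    new = {}
    for t in others:
        # Recalculated: average of the two members' distances to t.
        new[(clade, t)] = (d(x, t) + d(y, t)) / 2
    for (p, q), v in dist.items():
        # Unchanged: pairs that do not involve either merged taxon.
        if p not in (x, y) and q not in (x, y):
            new[(p, q)] = v
    return clade, new

clade, new = merge_first_clade(dist)
for pair, v in new.items():
    print(pair, v)
```

With these numbers, (a,b) to d becomes (7 + 8) / 2 = 7.5, while c to d stays at 4.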

You will not need to carry out the UPGMA algorithm in its entirety, calculate branch lengths, or draw a full tree from a distance matrix. We’ll save this for the next unit.

UPGMA

  1. MSA and pairwise analysis
  2. Locate the first clade based on the greatest similarity and combine those taxa, then redo the distance matrix. The intersection with the fewest differences goes first.

2D representation of the 3 closest taxa from a matrix

C is 27 from A and A is 31 from B, so A is closer to C than to B

After the first clade is formed in UPGMA, recalculate the distances

If B and F are combined, take the average of their distances to each remaining taxon

#UPGMA

BLAST alignment: does local alignment

Global Alignment tends to give a lower % identity

Global differs most in its gaps: it has to span the full length of both sequences, so it tends to include more of them

With UPGMA we will be focusing on differences

Each pairwise comparison has its own distance, which can be plotted as a linear distance. These distances can be plotted on a 2D plane, where they are easier to conceptualize

Data Reduction– taking a lot of data and bringing it down to a lower dimensionality

Newick notation: (taxa, taxa) shows a clade / cluster, e.g. (a,b). The topology of any tree can be represented with Newick notation
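To see how nesting works, here is a small sketch that turns a tree written as nested Python tuples into a Newick string. The topology is hypothetical and `to_newick` is an illustrative helper, not a library function; branch lengths are left out.

```python
def to_newick(node):
    """Convert a nested tuple such as (("a", "b"), "c") to Newick text."""
    if isinstance(node, str):
        return node                      # a single taxon (leaf)
    # A clade: its children, comma-separated, wrapped in parentheses.
    return "(" + ",".join(to_newick(child) for child in node) + ")"

# Hypothetical topology: (a,b) joins e, and that group joins clade (c,d).
tree = ((("a", "b"), "e"), ("c", "d"))
newick = to_newick(tree) + ";"           # a Newick tree ends with a semicolon
print(newick)
```

Each pair of parentheses is one clade, so the innermost parentheses are the clades that formed first.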

Building a tree based on the 2D structure: once a clade is made, combine the two taxa into a single clade and average their distances to the rest

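UPGMA averages over every taxon in the merged clusters, so each taxon counts equally (the "unweighted" in the name). WPGMA takes the simple average of the two merged clusters' distances regardless of cluster size, which gives taxa in smaller clusters more weight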
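The names can be confusing, so here is a sketch of the two standard update rules side by side. In UPGMA every original taxon counts equally, which means the formula scales each cluster's distance by its size. In WPGMA the two merged clusters count equally no matter how many taxa they hold. The function names and the example numbers are hypothetical.

```python
def upgma_update(d_it, d_jt, n_i, n_j):
    """Distance from merged cluster (i, j) to cluster t under UPGMA.

    n_i and n_j are the numbers of taxa in clusters i and j.
    """
    return (n_i * d_it + n_j * d_jt) / (n_i + n_j)

def wpgma_update(d_it, d_jt):
    """Distance from merged cluster (i, j) to cluster t under WPGMA."""
    return (d_it + d_jt) / 2

# Hypothetical case: cluster i = (a,b) holds 2 taxa, cluster j = e holds 1.
upgma = upgma_update(6, 9, n_i=2, n_j=1)   # (2*6 + 1*9) / 3
wpgma = wpgma_update(6, 9)                 # (6 + 9) / 2
print(upgma, wpgma)
```

When both clusters are single taxa, as in the first merge, the two rules give the same answer. They only differ once clusters of unequal size are combined.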