This analysis uses multiple data sets relating to each amino acid. Because the data are multivariate, I will compress them with a dimension-reduction approach called principal component analysis (PCA). I will also run a distance-based cluster analysis and display the result as a dendrogram. Both approaches show clusters of correlated data that can be used for further analysis.
ggpubr is an extension of ggplot2. It adds to the functionality of ggplot2 to make more detailed plots with simpler commands.
library(ggplot2) # Useful for creating plots
library(ggpubr) # Works with ggplot2 for plot making, adds functionality
library(vegan) # "tools for descriptive community ecology" - from ?vegan
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.5-6
library(scatterplot3d) # Allows for 3D plotting
The data below represent different quantitative and qualitative properties of each amino acid. Each of these attributes will be used to build a data frame for further analysis. The data include properties unique to each amino acid as well as properties describing interactions with other amino acids.
## 1 letter code
aa <-c('A','C','D','E','F','G','H','I','K','L','M','N', 'P','Q','R','S','T','V','W','Y')
## molecular weight in dalton
MW.da <-c(89,121,133,146,165,75,155,131,146,131,149,132,115,147,174,105,119,117,204,181)
## vol from van der Waals radii
vol <-c(67,86,91,109,135,48,118,124,135,124,124,96,90, 114,148,73,93,105,163,141)
## bulk – a measure of the shape of the side chain
bulk <-c(11.5,13.46,11.68,13.57,19.8,3.4,13.69,21.4,15.71,21.4,16.25,12.28,17.43, 14.45,14.28,9.47,15.77,21.57,21.67,18.03)
## pol – a measure of the electric field strength around the molecule
pol <-c(0,1.48,49.7,49.9,0.35,0,51.6,0.13,49.5,0.13,1.43,3.38,1.58,3.53,52,1.67,1.66,0.13,2.1,1.61)
## isoelec point
isoelec <-c(6,5.07,2.77,3.22,5.48,5.97,7.59,6.02,9.74,5.98,5.74,5.41,6.3,5.65,10.76,5.68,6.16,5.96,5.89,5.66)
## 1st Hydrophobicity scale
H2Opho.34 <-c(1.8,2.5,-3.5,-3.5,2.8,-0.4,-3.2,4.5,-3.9,3.8,1.9,-3.5,-1.6,-3.5,-4.5,-0.8,-0.7,4.2,-0.9,-1.3)
## 2nd Hydrophobicity scale
H2Opho.35 <-c(1.6,2,-9.2,-8.2,3.7,1,-3,3.1,-8.8,2.8,3.4,-4.8,-0.2,-4.1,-12.3,0.6,1.2,2.6,1.9,-0.7)
## Surface area accessible to water in an unfolded peptide
saaH2O <-c(113,140,151,183,218,85,194,182,211,180,204,158,143,189,241,122,146,160,259,229)
## Fraction of accessible area lost when a protein folds
faal.fold <-c(0.74,0.91,0.62,0.62,0.88,0.72,0.78,0.88,0.52,0.85,0.85,0.63,0.64,0.62,0.64,0.66,0.7,0.86,0.85,0.76)
# Polar requirement
polar.req <-c(7,4.8,13,12.5,5,7.9,8.4,4.9,10.1,4.9,5.3,10,6.6,8.6,9.1,7.5,6.6,5.6,5.2,5.4)
## relative frequency of occurrence
## "The frequencies column shows the mean percentage of each amino acid
## in the protein sequences of modern organisms"
freq <-c(7.8,1.1,5.19,6.72,4.39,6.77,2.03,6.95,6.32,
10.15,2.28,4.37,4.26,3.45,5.23,6.46,5.12,7.01,1.09,3.3)
## charges
## un = Un-charged
## neg = negative
## pos = positive
charge<-c('un','un','neg','neg','un','un','pos','un','pos','un','un','un','un','un','pos','un','un','un','un','un')
## hydropathy
hydropathy<-c('hydrophobic','hydrophobic','hydrophilic','hydrophilic','hydrophobic','neutral','neutral','hydrophobic','hydrophilic','hydrophobic','hydrophobic','hydrophilic','neutral','hydrophilic','hydrophilic','neutral','neutral','hydrophobic','hydrophobic','neutral')
## vol
vol.cat<-c('verysmall','small','small','medium','verylarge','verysmall','medium','large','large','large','large','small','small','medium','large','verysmall','small','medium','verylarge','verylarge')
## pol
pol.cat<-c('nonpolar','nonpolar','polar','polar','nonpolar','nonpolar','polar','nonpolar','polar','nonpolar','nonpolar','polar','nonpolar','polar','polar','polar','polar','nonpolar','nonpolar','polar')
## chemical
chemical<-c('aliphatic','sulfur','acidic','acidic','aromatic','aliphatic','basic','aliphatic','basic','aliphatic','sulfur','amide','aliphatic','amide','basic','hydroxyl','hydroxyl', 'aliphatic','aromatic','aromatic')
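Hand-typed categorical vectors are easy to misspell, and a misspelled label silently becomes its own group in later plots. The sketch below shows one way to check labels against an allowed set before building the data frames. The helper `find_bad_labels()` and the short demo vector are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper (an assumption, not from the original analysis):
# return any labels that are not in the allowed set.
find_bad_labels <- function(x, allowed) {
  setdiff(unique(x), allowed)
}

# Short illustrative vector containing a deliberate misspelling
hydropathy_demo <- c("hydrophobic", "hydrophilic", "hydrophobc", "neutral")
allowed_levels  <- c("hydrophobic", "hydrophilic", "neutral")

bad <- find_bad_labels(hydropathy_demo, allowed_levels)
print(bad)                     # labels that need correcting
print(table(hydropathy_demo))  # counts per label; a lone count of 1 is a hint
```

The same call could be run on `hydropathy`, `charge`, `vol.cat`, `pol.cat`, and `chemical`, each with its own allowed set.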
Below I build two data frames, one with the categorical data and one without.
# with categorical data
aa_dat <- data.frame(Amino_Acid_Name = aa,
Molecular_Weight = MW.da,
Vol_VDW = vol,
Vol_Categorical = vol.cat,
Bulk = bulk,
Polarity = pol,
Polarity_Categorical = pol.cat,
Polar_Requirement = polar.req,
Residue = chemical,
Isoelectric_Point = isoelec,
Charge = charge,
First_Hydrophobicity_Scale = H2Opho.34,
Second_Hydrophobicity_Scale = H2Opho.35,
Hydropathy = hydropathy,
Surface_Area = saaH2O,
Fraction_Accessible_Area = faal.fold,
Frequency_of_Occurance = freq)
# without categorical data
aa_dat2 <- data.frame(
Molecular_Weight = MW.da,
Vol_VDW = vol,
Bulk = bulk,
Polarity = pol,
Polar_Requirement = polar.req,
Isoelectric_Point = isoelec,
First_Hydrophobicity_Scale = H2Opho.34,
Second_Hydrophobicity_Scale = H2Opho.35,
Surface_Area = saaH2O,
Fraction_Accessible_Area = faal.fold,
Frequency_of_Occurance = freq)
aa_dat
## Amino_Acid_Name Molecular_Weight Vol_VDW Vol_Categorical Bulk Polarity
## 1 A 89 67 verysmall 11.50 0.00
## 2 C 121 86 small 13.46 1.48
## 3 D 133 91 small 11.68 49.70
## 4 E 146 109 medium 13.57 49.90
## 5 F 165 135 verylarge 19.80 0.35
## 6 G 75 48 verysmall 3.40 0.00
## 7 H 155 118 medium 13.69 51.60
## 8 I 131 124 large 21.40 0.13
## 9 K 146 135 large 15.71 49.50
## 10 L 131 124 large 21.40 0.13
## 11 M 149 124 large 16.25 1.43
## 12 N 132 96 small 12.28 3.38
## 13 P 115 90 small 17.43 1.58
## 14 Q 147 114 medium 14.45 3.53
## 15 R 174 148 large 14.28 52.00
## 16 S 105 73 verysmall 9.47 1.67
## 17 T 119 93 small 15.77 1.66
## 18 V 117 105 medium 21.57 0.13
## 19 W 204 163 verylarge 21.67 2.10
## 20 Y 181 141 verylarge 18.03 1.61
## Polarity_Categorical Polar_Requirement Residue Isoelectric_Point Charge
## 1 nonpolar 7.0 aliphatic 6.00 un
## 2 nonpolar 4.8 sulfur 5.07 un
## 3 polar 13.0 acidic 2.77 neg
## 4 polar 12.5 acidic 3.22 neg
## 5 nonpolar 5.0 aromatic 5.48 un
## 6 nonpolar 7.9 aliphatic 5.97 un
## 7 polar 8.4 basic 7.59 pos
## 8 nonpolar 4.9 aliphatic 6.02 un
## 9 polar 10.1 basic 9.74 pos
## 10 nonpolar 4.9 aliphatic 5.98 un
## 11 nonpolar 5.3 sulfur 5.74 un
## 12 polar 10.0 amide 5.41 un
## 13 nonpolar 6.6 aliphatic 6.30 un
## 14 polar 8.6 amide 5.65 un
## 15 polar 9.1 basic 10.76 pos
## 16 polar 7.5 hydroxyl 5.68 un
## 17 polar 6.6 hydroxyl 6.16 un
## 18 nonpolar 5.6 aliphatic 5.96 un
## 19 nonpolar 5.2 aromatic 5.89 un
## 20 polar 5.4 aromatic 5.66 un
## First_Hydrophobicity_Scale Second_Hydrophobicity_Scale Hydropathy
## 1 1.8 1.6 hydrophobic
## 2 2.5 2.0 hydrophobic
## 3 -3.5 -9.2 hydrophilic
## 4 -3.5 -8.2 hydrophilic
## 5 2.8 3.7 hydrophobic
## 6 -0.4 1.0 neutral
## 7 -3.2 -3.0 neutral
## 8 4.5 3.1 hydrophobic
## 9 -3.9 -8.8 hydrophilic
## 10 3.8 2.8 hydrophobic
## 11 1.9 3.4 hydrophobic
## 12 -3.5 -4.8 hydrophilic
## 13 -1.6 -0.2 neutral
## 14 -3.5 -4.1 hydrophilic
## 15 -4.5 -12.3 hydrophilic
## 16 -0.8 0.6 neutral
## 17 -0.7 1.2 neutral
## 18 4.2 2.6 hydrophobic
## 19 -0.9 1.9 hydrophobic
## 20 -1.3 -0.7 neutral
## Surface_Area Fraction_Accessible_Area Frequency_of_Occurance
## 1 113 0.74 7.80
## 2 140 0.91 1.10
## 3 151 0.62 5.19
## 4 183 0.62 6.72
## 5 218 0.88 4.39
## 6 85 0.72 6.77
## 7 194 0.78 2.03
## 8 182 0.88 6.95
## 9 211 0.52 6.32
## 10 180 0.85 10.15
## 11 204 0.85 2.28
## 12 158 0.63 4.37
## 13 143 0.64 4.26
## 14 189 0.62 3.45
## 15 241 0.64 5.23
## 16 122 0.66 6.46
## 17 146 0.70 5.12
## 18 160 0.86 7.01
## 19 259 0.85 1.09
## 20 229 0.76 3.30
aa_dat2
## Molecular_Weight Vol_VDW Bulk Polarity Polar_Requirement Isoelectric_Point
## 1 89 67 11.50 0.00 7.0 6.00
## 2 121 86 13.46 1.48 4.8 5.07
## 3 133 91 11.68 49.70 13.0 2.77
## 4 146 109 13.57 49.90 12.5 3.22
## 5 165 135 19.80 0.35 5.0 5.48
## 6 75 48 3.40 0.00 7.9 5.97
## 7 155 118 13.69 51.60 8.4 7.59
## 8 131 124 21.40 0.13 4.9 6.02
## 9 146 135 15.71 49.50 10.1 9.74
## 10 131 124 21.40 0.13 4.9 5.98
## 11 149 124 16.25 1.43 5.3 5.74
## 12 132 96 12.28 3.38 10.0 5.41
## 13 115 90 17.43 1.58 6.6 6.30
## 14 147 114 14.45 3.53 8.6 5.65
## 15 174 148 14.28 52.00 9.1 10.76
## 16 105 73 9.47 1.67 7.5 5.68
## 17 119 93 15.77 1.66 6.6 6.16
## 18 117 105 21.57 0.13 5.6 5.96
## 19 204 163 21.67 2.10 5.2 5.89
## 20 181 141 18.03 1.61 5.4 5.66
## First_Hydrophobicity_Scale Second_Hydrophobicity_Scale Surface_Area
## 1 1.8 1.6 113
## 2 2.5 2.0 140
## 3 -3.5 -9.2 151
## 4 -3.5 -8.2 183
## 5 2.8 3.7 218
## 6 -0.4 1.0 85
## 7 -3.2 -3.0 194
## 8 4.5 3.1 182
## 9 -3.9 -8.8 211
## 10 3.8 2.8 180
## 11 1.9 3.4 204
## 12 -3.5 -4.8 158
## 13 -1.6 -0.2 143
## 14 -3.5 -4.1 189
## 15 -4.5 -12.3 241
## 16 -0.8 0.6 122
## 17 -0.7 1.2 146
## 18 4.2 2.6 160
## 19 -0.9 1.9 259
## 20 -1.3 -0.7 229
## Fraction_Accessible_Area Frequency_of_Occurance
## 1 0.74 7.80
## 2 0.91 1.10
## 3 0.62 5.19
## 4 0.62 6.72
## 5 0.88 4.39
## 6 0.72 6.77
## 7 0.78 2.03
## 8 0.88 6.95
## 9 0.52 6.32
## 10 0.85 10.15
## 11 0.85 2.28
## 12 0.63 4.37
## 13 0.64 4.26
## 14 0.62 3.45
## 15 0.64 5.23
## 16 0.66 6.46
## 17 0.70 5.12
## 18 0.86 7.01
## 19 0.85 1.09
## 20 0.76 3.30
The correlation matrix below describes the pairwise relationship (correlation) between the numeric factors in the data frame, computed across all 20 amino acids.
The closer to 1 in magnitude, the stronger the correlation.
A positive correlation indicates that as one factor increases, so does the other.
A negative correlation indicates that as one factor increases, the other decreases.
cor_ <- round(cor(aa_dat2[, -1]), 2) # all numeric columns except Molecular_Weight (column 1)
diag(cor_) <- NA
cor_[upper.tri(cor_)] <- NA
cor_
## Vol_VDW Bulk Polarity Polar_Requirement
## Vol_VDW NA NA NA NA
## Bulk 0.73 NA NA NA
## Polarity 0.24 -0.20 NA NA
## Polar_Requirement -0.19 -0.53 0.76 NA
## Isoelectric_Point 0.36 0.08 0.27 -0.11
## First_Hydrophobicity_Scale -0.08 0.44 -0.67 -0.79
## Second_Hydrophobicity_Scale -0.16 0.32 -0.85 -0.87
## Surface_Area 0.99 0.64 0.29 -0.11
## Fraction_Accessible_Area 0.18 0.49 -0.53 -0.81
## Frequency_of_Occurance -0.30 -0.04 -0.01 0.14
## Isoelectric_Point First_Hydrophobicity_Scale
## Vol_VDW NA NA
## Bulk NA NA
## Polarity NA NA
## Polar_Requirement NA NA
## Isoelectric_Point NA NA
## First_Hydrophobicity_Scale -0.20 NA
## Second_Hydrophobicity_Scale -0.26 0.85
## Surface_Area 0.35 -0.18
## Fraction_Accessible_Area -0.18 0.84
## Frequency_of_Occurance 0.02 0.26
## Second_Hydrophobicity_Scale Surface_Area
## Vol_VDW NA NA
## Bulk NA NA
## Polarity NA NA
## Polar_Requirement NA NA
## Isoelectric_Point NA NA
## First_Hydrophobicity_Scale NA NA
## Second_Hydrophobicity_Scale NA NA
## Surface_Area -0.23 NA
## Fraction_Accessible_Area 0.79 0.12
## Frequency_of_Occurance -0.02 -0.38
## Fraction_Accessible_Area Frequency_of_Occurance
## Vol_VDW NA NA
## Bulk NA NA
## Polarity NA NA
## Polar_Requirement NA NA
## Isoelectric_Point NA NA
## First_Hydrophobicity_Scale NA NA
## Second_Hydrophobicity_Scale NA NA
## Surface_Area NA NA
## Fraction_Accessible_Area NA NA
## Frequency_of_Occurance -0.18 NA
The code below finds the pair with the greatest positive correlation. Surface area (saaH2O) and volume (vol) have the greatest positive correlation, 0.99.
which(cor_ == max(cor_, na.rm = TRUE), arr.ind = TRUE)
## row col
## Surface_Area 8 1
The code below finds the pair with the greatest negative correlation. Polar requirement (polar.req) and the second hydrophobicity scale (H2Opho.35) have the greatest negative correlation, -0.87.
which(cor_ == min(cor_, na.rm = TRUE), arr.ind = TRUE)
## row col
## Second_Hydrophobicity_Scale 7 4
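Looking up only the maximum and minimum hides the other pairs. As a minimal sketch, the block below ranks every pair by the absolute value of its correlation, using four of the vectors defined earlier. The names `demo`, `cor_demo`, and `pairs_tbl` are illustrative assumptions.

```r
# Four of the numeric vectors from the analysis, repeated so the block runs alone
vol       <- c(67,86,91,109,135,48,118,124,135,124,124,96,90,114,148,73,93,105,163,141)
saaH2O    <- c(113,140,151,183,218,85,194,182,211,180,204,158,143,189,241,122,146,160,259,229)
polar.req <- c(7,4.8,13,12.5,5,7.9,8.4,4.9,10.1,4.9,5.3,10,6.6,8.6,9.1,7.5,6.6,5.6,5.2,5.4)
H2Opho.35 <- c(1.6,2,-9.2,-8.2,3.7,1,-3,3.1,-8.8,2.8,3.4,-4.8,-0.2,-4.1,-12.3,0.6,1.2,2.6,1.9,-0.7)

demo <- data.frame(Vol_VDW = vol,
                   Surface_Area = saaH2O,
                   Polar_Requirement = polar.req,
                   Second_Hydrophobicity_Scale = H2Opho.35)
cor_demo <- cor(demo)

# Keep each pair once by taking only the lower triangle
idx <- which(lower.tri(cor_demo), arr.ind = TRUE)
pairs_tbl <- data.frame(var1 = rownames(cor_demo)[idx[, "row"]],
                        var2 = colnames(cor_demo)[idx[, "col"]],
                        r    = cor_demo[idx])

# Strongest relationships first, regardless of sign
pairs_tbl <- pairs_tbl[order(-abs(pairs_tbl$r)), ]
print(pairs_tbl, digits = 2)
```

Sorting by magnitude puts the strongest positive and strongest negative pairs next to each other at the top of the table.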
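The function below, panel.cor, helps build a scatterplot matrix: it prints the absolute value of the correlation in a panel, with text sized in proportion to the strength of the correlation. It is passed to the plot function as the upper panel.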
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor,...)
{
usr <- par("usr"); on.exit(par(usr))
par(usr = c(0, 1, 0, 1))
r <- abs(cor(x, y))
txt <- format(c(r, 0.123456789), digits = digits)[1]
txt <- paste0(prefix, txt)
if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
text(0.5, 0.5, txt, cex = cex.cor * r)
}
Strong Correlation::
Molecular weight ~ Volume Van Der Waals
Surface Area ~ Molecular Weight
Polar Requirement ~ Polarity
Hydrophobicity ~ Polarity
Fraction Accessible ~ Polarity
Hydrophobicity ~ Polar Requirement
Fraction Accessible ~ Polar Requirement
Fraction Accessible ~ Hydrophobicity
Weak Correlation::
Polarity ~ Molecular Weight
Polar Requirement ~ Molecular Weight
Hydrophobicity ~ Molecular Weight
Fraction Accessible ~ Molecular Weight
Polarity ~ Volume Van Der Waals
Polar Requirement ~ Volume Van Der Waals
Hydrophobicity ~ Volume Van Der Waals
Fraction Accessible ~ Volume Van Der Waals
Surface Area ~ Polarity
Surface Area ~ Polar Requirement
Surface Area ~ Hydrophobicity
Fraction Accessible ~ Surface Area
Linear Relationship::
Molecular Weight ~ Volume Van Der Waals
Polarity ~ Polar Requirement
Polarity ~ Hydrophobicity
Molecular Weight ~ Surface Area
Volume Van Der Waals ~ Surface Area
Polarity ~ Surface Area
Polarity ~ Fraction Accessible
Hydrophobicity ~ Fraction Accessible
Non Linear Relationship::
Molecular Weight ~ Polarity
Volume Van Der Waals ~ Polarity
Molecular Weight ~ Polar Requirement
Volume Van Der Waals ~ Polar Requirement
Molecular Weight ~ Hydrophobicity
Volume Van Der Waals ~ Hydrophobicity
Polar Requirement ~ Hydrophobicity
Polar Requirement ~ Surface Area
Hydrophobicity ~ Surface Area
Molecular Weight ~ Fraction Accessible
The code below builds a scatterplot matrix from a new data frame containing a subset of the variables.
aa_dat3 <-data.frame(
Molecular_Weight = MW.da,
Vol_VDW = vol,
Polarity = pol,
Polar_Requirement = polar.req,
First_Hydrophobicity_Scale = H2Opho.34,
Surface_Area = saaH2O,
Fraction_Accessible_Area = faal.fold)
plot(aa_dat3,upper.panel = panel.cor,
panel = panel.smooth)
The figure below plots each amino acid's polar requirement against its frequency of occurrence. There does not appear to be a strong correlation between polar requirement and frequency (r = 0.14 in the correlation matrix above). The most frequent amino acid, leucine (L), has one of the lowest polar requirements, so frequency alone does not predict polar requirement.
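plot(polar.req ~ freq,
data = aa_dat, # main plotting
xlab = "Frequency", # x-axis label (freq is on the x-axis)
ylab = "Polar Req", # y-axis label (polar.req is on the y-axis)
main = "Amino Acid Polar Req Against Frequency", # title
col = 0) # draw invisible points; the letters are added by text() below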
text(polar.req ~ freq,
labels = aa,
data = aa_dat,
col = 1:20)
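To put a number on how weak the relationship in this plot is, the sketch below computes the Pearson correlation between frequency and polar requirement and tests it with `cor.test()` from base R's stats package. The two vectors are copied from the data section so the block runs on its own; treating a 0.05 threshold as the cutoff is an assumption.

```r
polar.req <- c(7,4.8,13,12.5,5,7.9,8.4,4.9,10.1,4.9,5.3,10,6.6,8.6,9.1,7.5,6.6,5.6,5.2,5.4)
freq      <- c(7.8,1.1,5.19,6.72,4.39,6.77,2.03,6.95,6.32,10.15,
               2.28,4.37,4.26,3.45,5.23,6.46,5.12,7.01,1.09,3.3)

# Pearson correlation between the two variables
r <- cor(freq, polar.req)

# Test of the null hypothesis that the true correlation is zero
ct <- cor.test(freq, polar.req)

print(round(r, 2))
print(ct$p.value)
```

With only 20 amino acids, a correlation this small is not distinguishable from zero at the 0.05 level.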
Below is a scatterplot demonstrating the relationship between Frequency and Polar Requirement
ggscatter(y = "Polar_Requirement",
x = "Frequency_of_Occurance",
size = "Polar_Requirement",
color = "Polar_Requirement",
data = aa_dat,
xlab = "Frequency",
ylab = "Polar Requirement")
Below is a scatterplot demonstrating the relationship between Bulkiness and Molecular Weight
ggscatter(y = "Molecular_Weight",
x = "Bulk",
size = "Molecular_Weight",
color = "Molecular_Weight",
data = aa_dat,
xlab = "Bulk",
ylab = "Molecular_Weight")
Below is a scatterplot demonstrating the relationship between Surface Area and Isoelectric Point
ggscatter(y = "Isoelectric_Point",
x = "Surface_Area",
size = "Isoelectric_Point",
color = "Isoelectric_Point",
data = aa_dat,
xlab = "Surface_Area",
ylab = "Isoelectric_Point")
Below is a scatterplot demonstrating the relationship between Isoelectric Point and Bulkiness using a loess smoother
ggscatter(data = aa_dat,
x = "Isoelectric_Point",
y = "Bulk",
size = "Vol_VDW",
color = "Surface_Area",
add = "loess",
xlab = "Isoelectric Point",
ylab = "Bulkiness")
## `geom_smooth()` using formula 'y ~ x'
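A log transformation compresses large values, which reduces right skew and can give a more normal distribution of data. log() is undefined for negative numbers, so the negative entries in the two hydrophobicity scales become NaN and trigger the warnings below; the zero polarity values become -Inf.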
Log_aa_dat2 <- log(aa_dat2)
## Warning in FUN(X[[i]], ...): NaNs produced
## Warning in FUN(X[[i]], ...): NaNs produced
ggscatter(data = Log_aa_dat2,
x = "Isoelectric_Point",
y = "Bulk",
size = "Polarity",
color = "Fraction_Accessible_Area",
add = "reg.line",
xlab = "Isoelectric Point",
ylab = "Bulkiness")
## `geom_smooth()` using formula 'y ~ x'
## Warning in sqrt(x): NaNs produced
## Warning: Removed 2 rows containing missing values (geom_point).
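The warnings above come from taking the log of values that are zero or negative. One way to avoid them, sketched below under the assumption that only strictly positive columns should be transformed, is to test each column first and log only those that pass. The names `demo`, `is_pos`, and `log_demo` are illustrative.

```r
# Three vectors from the analysis: one all-positive, one with zeros, one with negatives
bulk      <- c(11.5,13.46,11.68,13.57,19.8,3.4,13.69,21.4,15.71,21.4,
               16.25,12.28,17.43,14.45,14.28,9.47,15.77,21.57,21.67,18.03)
pol       <- c(0,1.48,49.7,49.9,0.35,0,51.6,0.13,49.5,0.13,
               1.43,3.38,1.58,3.53,52,1.67,1.66,0.13,2.1,1.61)
H2Opho.34 <- c(1.8,2.5,-3.5,-3.5,2.8,-0.4,-3.2,4.5,-3.9,3.8,
               1.9,-3.5,-1.6,-3.5,-4.5,-0.8,-0.7,4.2,-0.9,-1.3)

demo <- data.frame(Bulk = bulk,
                   Polarity = pol,
                   First_Hydrophobicity_Scale = H2Opho.34)

# TRUE for columns where every value is strictly positive
is_pos <- vapply(demo, function(col) all(col > 0), logical(1))
print(is_pos)

# Log-transform only the safe columns; leave the others on their original scale
log_demo <- demo
log_demo[is_pos] <- log(demo[is_pos])
```

Columns left untransformed stay on their original scale, so any plot mixing the two kinds of column should label its axes accordingly.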
I chose the three variables below because they all measure physical characteristics of each amino acid. After consulting the scatterplot matrix, the correlation among them appeared promising.
Bulk, Molecular Weight, and Volume positively correlate with each other as demonstrated in the 3d scatterplot below.
scatterplot3d(x = aa_dat$Molecular_Weight,
y = aa_dat$Vol_VDW,
z = aa_dat$Bulk,
type = "h",
highlight.3d = TRUE)
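The purpose of a PCA is to take a data set with many variables (multidimensional) and project it onto a small number of new axes, usually a 2D plane, that retain as much of the variation as possible. This allows for easier analysis of the data. In the PCA plot, amino acids with similar properties sit close together; the UPGMA cluster analysis later in this document groups them by distance instead.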
“The program positions points in two dimensions such that the distances in the two-dimensional space are as close as possible to the original distances in the multi-dimensional space.” (Higgs and Attwood 2005, pg 6).
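How much is lost by moving to two dimensions can be checked from the variance each principal component explains. The sketch below runs `prcomp()` on four of the numeric vectors from the data section (a subset chosen only to keep the example short) and reports the cumulative proportion of variance. The names `X`, `pca_demo`, and `coords_2d` are illustrative.

```r
MW.da <- c(89,121,133,146,165,75,155,131,146,131,149,132,115,147,174,105,119,117,204,181)
vol   <- c(67,86,91,109,135,48,118,124,135,124,124,96,90,114,148,73,93,105,163,141)
bulk  <- c(11.5,13.46,11.68,13.57,19.8,3.4,13.69,21.4,15.71,21.4,
           16.25,12.28,17.43,14.45,14.28,9.47,15.77,21.57,21.67,18.03)
pol   <- c(0,1.48,49.7,49.9,0.35,0,51.6,0.13,49.5,0.13,
           1.43,3.38,1.58,3.53,52,1.67,1.66,0.13,2.1,1.61)

X <- data.frame(Molecular_Weight = MW.da, Vol_VDW = vol, Bulk = bulk, Polarity = pol)

pca_demo <- prcomp(X, scale. = TRUE)

# Proportion of total variance carried by each component
var_explained <- pca_demo$sdev^2 / sum(pca_demo$sdev^2)
print(round(cumsum(var_explained), 3))  # second entry = share kept by a 2D plot

# Coordinates of each amino acid on the first two components
coords_2d <- pca_demo$x[, 1:2]
head(coords_2d)
```

The closer the second cumulative value is to 1, the more faithfully the 2D plot reproduces the original distances.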
The two basic ways that we discussed performing PCA are base R's prcomp() followed by biplot(), and vegan's rda() followed by vegan's ordination plotting functions. Both are shown below.
The data preparation for the PCA was done in the data prep section above.
To make the data frame for the PCA, I removed all categorical variables. I had already built this data frame so that I could convert the entire data frame to a logarithmic scale in Figure 5.
pca.out <- prcomp(aa_dat2, scale. = TRUE) # scale. is the full argument name
biplot(pca.out)
scale = TRUE rescales every variable to unit variance by dividing it by its standard deviation. This keeps variables measured on large scales from dominating the PCA, so clusters reflect all of the variables.
There are some differences between the PCA I made and the PCA in Higgs and Attwood 2009. First, the PCA I made is separated into clusters by lines. Second, our scales are different: mine runs from -2 to 2 on the y and -4 to 4 on the x, and theirs from -.6 to .6 on the y and -.6 and 6 on the x. Third, it looks like Higgs and Attwood created a 4th grouping of amino acids (W and R and G and C), while these amino acids are included in already established clusters on my PCA. This also occurred for Y and H, which are part of the same cluster in my PCA but in different clusters in Higgs and Attwood.
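To make the effect of scaling concrete, the sketch below checks that a scaled PCA is the same as an eigen-decomposition of the correlation matrix, and prints the first-component loadings of an unscaled PCA for comparison. It uses three vectors from the data section; the object names are illustrative assumptions.

```r
MW.da <- c(89,121,133,146,165,75,155,131,146,131,149,132,115,147,174,105,119,117,204,181)
bulk  <- c(11.5,13.46,11.68,13.57,19.8,3.4,13.69,21.4,15.71,21.4,
           16.25,12.28,17.43,14.45,14.28,9.47,15.77,21.57,21.67,18.03)
pol   <- c(0,1.48,49.7,49.9,0.35,0,51.6,0.13,49.5,0.13,
           1.43,3.38,1.58,3.53,52,1.67,1.66,0.13,2.1,1.61)
X <- data.frame(Molecular_Weight = MW.da, Bulk = bulk, Polarity = pol)

# Scaled PCA: each variable is divided by its standard deviation first
pca_scaled <- prcomp(X, scale. = TRUE)

# The component variances match the eigenvalues of the correlation matrix
eig <- eigen(cor(X))
same <- isTRUE(all.equal(pca_scaled$sdev^2, eig$values))
print(same)

# Unscaled PCA for comparison: compare the raw variances with the PC1 loadings
pca_raw <- prcomp(X, scale. = FALSE)
print(round(sapply(X, var), 1))
print(round(pca_raw$rotation[, 1], 2))
```

Comparing the raw variances with the unscaled loadings shows how much the variables with the largest numeric spread drive the first component when no scaling is applied.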
rda.out <- vegan::rda(aa_dat2,
scale = TRUE)
rda_scores <- scores(rda.out)
biplot(rda.out, display = "sites",
col = 0)
orditorp(rda.out,
display = "sites",
labels = aa_dat$Amino_Acid_Name, cex = 1.2)
ordihull(rda.out,
group = aa_dat$Hydropathy,
col = 1:7,
lty = 1:7,
lwd = c(3, 6))
The purpose of a cluster analysis is to group data points (here, amino acids) and display the relationships among them in a dendrogram. This gives another view of the clusters already established by the PCA, using a different method of analysis. In this case the clustering is based on Euclidean distance.
UPGMA (unweighted pair group method with arithmetic mean) is a hierarchical clustering approach. It starts from a distance matrix, merges the two closest points into a cluster, and then recalculates the distance matrix, treating the new cluster as a single point whose distance to every other point is the unweighted average of its members' distances. This is repeated until all points have been joined into one tree.
As mentioned above, the distance matrix holds the distance between every pair of amino acids. At each step the pair with the shortest distance (the most similar pair) is found and merged, and the distance matrix is recalculated before the next merge.
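The merge-and-recalculate steps can be followed by hand on a toy example. The sketch below uses three made-up points on a line (not amino acid data) and `hclust()` with `method = "average"`, which is the linkage R documents as UPGMA.

```r
# Three hypothetical points on a line: a and b are close, c is far away
x <- c(a = 0, b = 1, c = 5)

d <- dist(x)   # pairwise distances: a-b = 1, a-c = 5, b-c = 4
print(d)

# method = "average" is UPGMA
upgma <- hclust(d, method = "average")

print(upgma$merge)   # step 1 joins a and b; step 2 joins that cluster with c
print(upgma$height)  # 1.0 and 4.5

# Step 2 by hand: the distance from cluster {a, b} to c is the mean of a-c and b-c
by_hand <- mean(c(5, 4))
print(by_hand)       # 4.5
```

The second merge height equals the average of the two original distances to c, which is the recalculation step described above.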
A Euclidean distance is the straight-line distance between two points. With two variables, x and y, it is the hypotenuse of the right triangle formed by the difference in x and the difference in y.
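As a small worked example, the sketch below computes the Euclidean distance between alanine (A) and glycine (G) from just two of their properties, molecular weight and van der Waals volume, first by hand and then with `dist()`. Restricting the calculation to two variables is an assumption made to keep the arithmetic visible.

```r
# Values taken from the data section: molecular weight and volume
A <- c(Molecular_Weight = 89, Vol_VDW = 67)
G <- c(Molecular_Weight = 75, Vol_VDW = 48)

# By hand: square root of the summed squared differences (14 and 19)
by_hand <- sqrt(sum((A - G)^2))

# Same calculation with dist(), which works on the rows of a matrix
from_dist <- as.numeric(dist(rbind(A = A, G = G), method = "euclidean"))

print(by_hand)    # sqrt(14^2 + 19^2) = sqrt(557)
print(from_dist)
```

With more than two variables the formula is unchanged: one squared difference is added per variable before taking the square root.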
?par
aa_dat3 <- data.frame(
Amino_Acid_Name = aa,
Molecular_Weight = MW.da,
Vol_VDW = vol,
Bulk = bulk,
Polarity = pol,
Polar_Requirement = polar.req,
Isoelectric_Point = isoelec,
First_Hydrophobicity_Scale = H2Opho.34,
Second_Hydrophobicity_Scale = H2Opho.35,
Surface_Area = saaH2O,
Fraction_Accessible_Area = faal.fold,
Frequency_of_Occurance = freq)
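## dist() needs numeric input: drop the name column and keep the names as row labels
aa_num <- aa_dat3[, -1]
rownames(aa_num) <- aa_dat3$Amino_Acid_Name
dist_euc <- dist(aa_num,
method = "euclidean")
## method = "average" is what selects UPGMA; hclust() defaults to complete linkage
clust_euc <- hclust(dist_euc, method = "average")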
par(mfrow = c(1,1))
plot(clust_euc,
hang = -1,
cex = 0.5)
dendro_euc <- as.dendrogram(clust_euc)
plot(dendro_euc,
horiz = TRUE)