- Preprocessing and Exploratory Data Analysis
- What was used and how?
- R, dplyr, caret, ggplot2, base R, NbClust
- Results from Unsupervised Learning
June 5, 2018
| Variable | Issue |
|---|---|
| X2 | Constant: same value for every observation |
| X3 (Dates) | Redundant: carries the same information as X7 |
| X4 and X6 | Highly correlated with each other; one of the two was removed |
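
The checks in the table above can be sketched in base R as follows. This is a minimal illustration on made-up data, not the original analysis: the values, the 0.9 correlation cutoff, and the rule for which variable of a pair to drop are all assumptions. Only the column names mirror the slide.

```r
# Toy data frame; column names follow the slide, values are invented.
set.seed(1)
n  <- 50
x4 <- rnorm(n)
df <- data.frame(
  X1 = rnorm(n),
  X2 = rep(1, n),                  # constant column
  X4 = x4,
  X6 = x4 + rnorm(n, sd = 0.01),   # almost a copy of X4
  X7 = rnorm(n)
)

# 1. Drop columns that hold a single distinct value.
is_constant <- vapply(df, function(col) length(unique(col)) == 1, logical(1))
df <- df[, !is_constant, drop = FALSE]

# 2. Find variable pairs whose absolute correlation exceeds the cutoff.
cutoff  <- 0.9                     # assumed threshold
cor_mat <- abs(cor(df))
cor_mat[upper.tri(cor_mat, diag = TRUE)] <- 0   # keep each pair once
high <- which(cor_mat > cutoff, arr.ind = TRUE)

# 3. Drop the later variable of each highly correlated pair.
to_drop  <- unique(rownames(cor_mat)[high[, "row"]])
df_clean <- df[, !(names(df) %in% to_drop), drop = FALSE]

names(df_clean)
```

Since caret is already in the toolkit, `caret::findCorrelation()` applied to the correlation matrix is an alternative for step 2. It chooses which variable to drop by mean absolute correlation rather than by column order.
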
| Clustering method | How it works |
|---|---|
| Hclust: Complete Linkage | Maximal intercluster dissimilarity. Compute all pairwise dissimilarities between the observations in cluster A and the observations in cluster B, and record the largest of these dissimilarities |
| Hclust: Average Linkage | Mean intercluster dissimilarity. Compute all pairwise dissimilarities between the observations in cluster A and the observations in cluster B, and record the average of these dissimilarities |
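
To make the two linkage definitions concrete, the sketch below computes them by hand for two tiny clusters and compares the result with the final merge height reported by `hclust()`. The points are invented for illustration, and Euclidean distance is assumed as the dissimilarity measure.

```r
# Two small, hand-made clusters of 2-D points (illustrative values only).
A <- rbind(c(0, 0), c(0, 1))
B <- rbind(c(3, 0), c(4, 0))
pts <- rbind(A, B)

# Euclidean distances between every point in A (rows 1-2)
# and every point in B (rows 3-4).
d <- dist(pts)
pairwise <- as.matrix(d)[1:2, 3:4]

complete_link <- max(pairwise)    # largest pairwise dissimilarity
average_link  <- mean(pairwise)   # mean pairwise dissimilarity

# hclust() merges A and B last, so the final merge height
# equals the intercluster dissimilarity under each linkage.
hc_complete <- hclust(d, method = "complete")
hc_average  <- hclust(d, method = "average")

all.equal(max(hc_complete$height), complete_link)
all.equal(max(hc_average$height), average_link)
```

Complete linkage reports the single worst-case pair, so it tends to produce compact clusters. Average linkage takes every pair into account and is less sensitive to one distant point.
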