# Clustering - Part 2 - Assignment 2
#1. List the methods of clustering. Partitioning, density-based,
# hierarchical, and grid-based.
#2. What are partitioning methods? They use the information on the
# distances between the cases in the dataset to obtain the k “best”
# groups, according to a certain criterion.
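# A minimal sketch of a partitioning method, assuming base R's kmeans() and the
# built-in iris data; the choice of k = 3 and the seed are illustrative assumptions.
set.seed(1234)                                       # k-means starts from random centers
km_data <- scale(iris[, 1:4])                        # standardize so no variable dominates the distances
km_fit <- kmeans(km_data, centers = 3, nstart = 10)  # keep the best of 10 random starts
table(km_fit$cluster, iris$Species)                  # compare the 3 groups with the known species
km_fit$tot.withinss                                  # the criterion kmeans() tries to minimize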
#3. What are hierarchical methods? They obtain a hierarchy of alternative
# clustering solutions, known as a dendrogram. These methods can follow a divisive or an
# agglomerative approach to the task of building the hierarchy.
#4. For a dendrogram, given the relationship,
# h(d) <= h(g) <=> d is a subset of g
# which node is higher on the tree, d or g? g, because g contains d and so its
# height h(g) is at least h(d).
#5. The dendrogram follows which approaches for the task of building the hierarchy of
# clustering solutions? Divisive and agglomerative.
#6. Explain the divisive clustering approach. You start with one all-inclusive cluster and,
# at each step, split a cluster until only singleton clusters of individual points remain.
# In this case, we need to decide which cluster to split at each step and how to do the
# splitting.
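# A minimal sketch of divisive clustering, assuming the diana() function from the
# cluster package (one possible implementation, not necessarily the one the course uses).
library(cluster)
dv_data <- scale(iris[, 1:4])
dv_fit <- diana(dv_data)                             # starts from one cluster holding all cases
dv_groups <- cutree(as.hclust(dv_fit), k = 3)        # read off the solution with 3 clusters
table(dv_groups)                                     # size of each cluster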
#7. What are density-based methods? These methods try to find regions of the feature
# space where cases are packed together with high density. Because of this, they are
# frequently also used as a way of finding outliers, as these are by definition rather
# different from other cases and thus should not belong to these high-density regions of
# the feature space.
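# A simplified sketch of the density idea (not a full DBSCAN implementation): count how
# many other cases lie within a radius eps of each case and flag the sparse ones.
# The names and values of eps and min_pts are arbitrary assumptions for illustration.
dn_data <- scale(iris[, 1:4])
dn_dist <- as.matrix(dist(dn_data))                  # pairwise Euclidean distances
eps <- 0.5                                           # neighborhood radius (assumed)
min_pts <- 5                                         # minimum neighbors for a dense region (assumed)
dn_neighbors <- rowSums(dn_dist <= eps) - 1          # subtract 1 to exclude the case itself
dn_outliers <- which(dn_neighbors < min_pts)         # cases lying in low-density regions
length(dn_outliers)                                  # how many cases were flagged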
#8. What are two criteria used to evaluate a clustering solution? (i) Compactness: how
# similar the cases in each cluster are; and (ii) separation: how different a cluster is
# from the others.
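# A sketch of one measure that reflects both criteria, assuming the silhouette()
# function from the cluster package: widths near 1 mean a case is close to its own
# cluster (compactness) and far from the nearest other cluster (separation).
library(cluster)
sl_data <- scale(iris[, 1:4])
sl_dist <- dist(sl_data)
sl_groups <- cutree(hclust(sl_dist), k = 3)          # any clustering solution would do here
sl_sil <- silhouette(sl_groups, sl_dist)             # one silhouette width per case
mean(sl_sil[, "sil_width"])                          # average silhouette width of the solution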
#9. Explain agglomerative clustering. Start with as many groups as there are cases in the
# dataset. At each iteration, the pair of groups that is most similar is merged into a single
# group.
#10. What is the goal of the hierarchical clustering methods? Their goal is to obtain a
# hierarchy of possible solutions ranging from one single group to n groups, where n is
# the number of observations in the dataset.
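# A sketch of that range of solutions, assuming base R's hclust() and cutree() on a
# small sample of rows: cutting the tree at every k from 1 to n gives one partition
# per level of the hierarchy.
hc_data <- scale(iris[c(1:5, 51:55), 1:4])           # 10 cases, so n = 10
hc_tree <- hclust(dist(hc_data))
hc_cuts <- cutree(hc_tree, k = 1:nrow(hc_data))      # one column per number of groups
apply(hc_cuts, 2, function(g) length(unique(g)))     # number of groups in each solution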
#11. What is data noise? Data that carries meaningless information, such as random
# errors that obscure the underlying structure.
#12. Based on agglomerative hierarchical clustering methods, name three criteria that
# select the pair of groups that is most similar and is merged into a single group. Single,
# Complete and Average Linkage.
#13. Explain the single linkage criterion. The single linkage criterion measures the
# difference between two groups by the smallest distance between any two
# observations, one from each group.
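# A sketch of single linkage computed by hand; the two groups sg_a and sg_b are
# made-up points used only for illustration.
sg_a <- rbind(c(0, 0), c(1, 0))
sg_b <- rbind(c(4, 0), c(6, 0))
sg_cross <- as.matrix(dist(rbind(sg_a, sg_b)))[1:2, 3:4]  # distances between the groups only
min(sg_cross)                                        # single linkage: 3, from (1, 0) to (4, 0)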
#14. Which type of clustering is implemented in the function, hclust( )? Agglomerative
# hierarchical clustering.
#15. For the function, hclust( ), what are the first two arguments (in Torgo’s text)? The
# first argument is the distance matrix and the second is the merging method, whose
# default is complete linkage.
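# A sketch of the call, assuming the built-in iris data: the first argument d is a
# dissimilarity structure as produced by dist(), and method selects the linkage.
hl_dist <- dist(scale(iris[, 1:4]))
hl_default <- hclust(hl_dist)                        # method defaults to "complete"
hl_single <- hclust(hl_dist, method = "single")      # same distances, single linkage
hl_default$method                                    # reports the linkage that was used
plot(hl_default, labels = FALSE, hang = -1)          # draw the dendrogram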
#16. Explain average linkage. The average linkage criterion uses the average of the
# distances between all pairs of observations, one from each of the two groups. At each
# iteration, the pair of groups that is most similar under this measure is merged into a
# single group.
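# A sketch of average linkage computed by hand, with complete linkage alongside for
# comparison; the two groups av_a and av_b are made-up points used only for illustration.
av_a <- rbind(c(0, 0), c(1, 0))
av_b <- rbind(c(4, 0), c(6, 0))
av_cross <- as.matrix(dist(rbind(av_a, av_b)))[1:2, 3:4]  # the four between-group distances: 4, 6, 3, 5
mean(av_cross)                                       # average linkage: (4 + 6 + 3 + 5) / 4 = 4.5
max(av_cross)                                        # complete linkage: 6, from (0, 0) to (6, 0)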
#