We investigate recently released data on passenger entry counts into Singapore.
We first examine how passenger numbers fluctuate by month and by year.
Fig: Boxplots showing human traffic grouped by month
Fig: Same Y scale
Fig: Free scale
Fig: Grouped by Year
We proceed to normalise the data by transforming the data points so that every series has the same quantiles.
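As a rough illustration of forcing series to share the same quantiles, the sketch below implements a plain quantile normalisation in base R. The function name `quantile_normalise` and the toy matrix are hypothetical and not taken from the original analysis, which may have used a different routine or package.

```r
# Quantile normalisation: make every column share the same distribution.
# Assumes a numeric matrix with no missing values.
quantile_normalise <- function(m) {
  m <- as.matrix(m)
  # Rank of each value within its column (ties broken by order of appearance)
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted columns, rank by rank
  reference <- rowMeans(apply(m, 2, sort))
  # Replace each value with the reference value at its rank
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(m)
  out
}

# Toy data (hypothetical values): rows are months, columns are regions
toy <- cbind(A = c(5, 2, 3, 4), B = c(4, 1, 4, 2), C = c(3, 4, 6, 8))
normalised <- quantile_normalise(toy)
```

After this step each column of `normalised` contains the same set of values, only in a different order, so boxplots of the columns line up exactly.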
Fig: Free scale
Fig: Free scale
Fig: Free scale
newDF.wide = newDF.long %>% spread(area, count)
Next, we cluster the regions from which these passengers arrive, grouping them by how similar their arrival frequencies are.
We draw an elbow plot of the within-cluster sum of squares against the number of clusters K.
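An elbow plot of this kind can be sketched as follows, assuming k-means is the clustering method (the text does not name it explicitly). The matrix `x` is a random stand-in with one row per region, not the actual passenger data, and the range of K is an arbitrary choice.

```r
set.seed(1)

# Hypothetical stand-in: 18 regions (rows) by 24 monthly counts (columns)
x <- matrix(rnorm(18 * 24), nrow = 18)

# Total within-cluster sum of squares for each candidate K
ks <- 1:10
wss <- sapply(ks, function(k) {
  kmeans(x, centers = k, nstart = 20)$tot.withinss
})

# The "elbow" is where adding more clusters stops reducing the WSS much
plot(ks, wss, type = "b",
     xlab = "Number of clusters K",
     ylab = "Total within-cluster sum of squares")
```

With the real data, the rows of the wide table would need to be the regions, so a table with one column per region would be transposed with `t()` first.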
Fig: Elbow plot
From the graph, we see that the best k is 5.
## INDONESIA MALAYSIA PHILIPPINES THAILAND
## 1 1 1 1
## VIETNAM NORTH.EAST.ASIA CHINA HONG.KONG
## 1 1 1 1
## JAPAN SOUTH.ASIA MIDDLE.EAST OCEANIA
## 1 1 1 1
## EUROPE FRANCE GERMANY UNITED.KINGDOM
## 1 1 1 1
## NORTH.AMERICA OTHER.REGIONS
## 1 1
We present each cluster as a group in the network.
Finally, we apply SVD to the data and look for the overarching patterns.
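A minimal sketch of this step using base R's `svd()` is shown below. The matrix `x` is random placeholder data with months in rows and regions in columns, and centring the columns first is an assumption rather than something the original states.

```r
set.seed(1)

# Hypothetical stand-in: 24 months (rows) by 18 regions (columns)
x <- matrix(rnorm(24 * 18), nrow = 24)

# Centre each column so the SVD describes variation around the mean
xc <- scale(x, center = TRUE, scale = FALSE)

s <- svd(xc)

# Share of total variance captured by each component
var_explained <- s$d^2 / sum(s$d^2)

# First principal component: one score per month, one loading per region
pc1_scores <- s$u[, 1] * s$d[1]
pc1_loadings <- s$v[, 1]
```

Plotting `pc1_scores` against time shows the dominant pattern shared across regions, while `pc1_loadings` shows how strongly each region follows it.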
Fig: PC1
Fig: Applying hierarchical clustering on PCs