Analysis of SG Flight data

Introduction

We investigate recently released counts of passengers entering Singapore.

Exploratory analyses

We examine how passenger numbers fluctuate by

  1. Month
  2. Month and year
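The two groupings above can be sketched in base R as follows. This is only an illustration on simulated data: the `month` and `year` columns and all values are assumptions, and only the `count` column name is taken from the analysis itself.

```r
set.seed(3)
# Toy long-format data: 3 years of monthly counts (hypothetical values).
toy <- expand.grid(month = month.abb, year = 2015:2017)
toy$month <- factor(toy$month, levels = month.abb)
toy$count <- round(1000 + 200 * sin(2 * pi * as.integer(toy$month) / 12) +
                     rnorm(nrow(toy), sd = 30))

# 1. Distribution of counts by month, pooling all years.
boxplot(count ~ month, data = toy, xlab = "Month", ylab = "Passengers")

# 2. Counts laid out by month (rows) and year (columns).
by_month_year <- xtabs(count ~ month + year, data = toy)
print(by_month_year)
```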

Fig: Boxplots showing human traffic grouped by month

Fig: Same Y scale

Fig: Free scale

Fig: Grouped by Year

Quantile Normalisation

We proceed to quantile-normalise the data, adjusting the data points so that the distributions being compared share the same quantiles.
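A minimal sketch of quantile normalisation in base R is given here for illustration. It is not necessarily the procedure used to produce the figures: the helper `quantile_normalise()` and the toy matrix are hypothetical, and we assume a numeric matrix without missing values whose columns are the series to be aligned.

```r
# Quantile-normalise the columns of a numeric matrix (no missing values assumed).
quantile_normalise <- function(x) {
  x <- as.matrix(x)
  # Sort each column, then average across columns at each rank:
  # this gives the shared reference distribution.
  sorted <- apply(x, 2, sort)
  reference <- rowMeans(sorted)
  # Replace every value with the reference value at its within-column rank.
  ranks <- apply(x, 2, rank, ties.method = "first")
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(x)
  out
}

# Toy data: rows are time points, columns are hypothetical series.
toy <- cbind(A = c(5, 2, 3, 4), B = c(4, 1, 4, 2), C = c(3, 4, 6, 8))
normalised <- quantile_normalise(toy)
print(normalised)
```

After this step every column contains the same set of values, only in a different order, which is what makes the series directly comparable. Ties are broken by order of appearance here; averaging tied ranks is an equally common choice.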

Fig: Free scale

Fig: Free scale

Fig: Free scale

# Reshape from long to wide format: one column of counts per area
newDF.wide = newDF.long %>% spread(area, count)

Clustering

Next, we cluster the regions from which these passengers arrive, grouping them by how similar their arrival frequencies are.

1. Hierarchical Clustering
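As an illustration, hierarchical clustering of regions can be sketched with `dist()` and `hclust()` from base R. The simulated table, the standardisation step, the Euclidean distance, and the average linkage are all assumptions made for this sketch, not choices confirmed by the analysis.

```r
set.seed(1)
# Toy wide table: 24 months (rows) by 6 hypothetical regions (columns).
months <- 24
season <- sin(2 * pi * seq_len(months) / 12)
toy <- cbind(
  R1 = 100 + 20 * season + rnorm(months),
  R2 = 110 + 20 * season + rnorm(months),
  R3 = 90 + 20 * season + rnorm(months),
  R4 = 100 - 20 * season + rnorm(months),
  R5 = 105 - 20 * season + rnorm(months),
  R6 = 95 - 20 * season + rnorm(months)
)

# Regions are the objects being clustered, so they must be rows: transpose.
# scale() standardises each region's series so level differences do not dominate.
region_profiles <- t(scale(toy))
d <- dist(region_profiles)            # Euclidean distance between regions
hc <- hclust(d, method = "average")   # average linkage is an assumption
groups <- cutree(hc, k = 2)           # cut the tree into two groups
print(groups)
# plot(hc) would draw the dendrogram.
```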

2. K-means

We draw an elbow plot of the within-cluster sum of squares against the number of clusters k.
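One way such a plot could be produced is sketched here with `kmeans()` from base R. The data are simulated, and the range of k and the `nstart` value are assumptions chosen for the illustration.

```r
set.seed(42)
# Toy data: three well-separated groups of 2-D points (hypothetical).
centres <- rbind(c(0, 0), c(10, 10), c(-10, 10))
toy <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(20, centres[i, 1]), rnorm(20, centres[i, 2]))
}))

# Total within-cluster sum of squares for k = 1, ..., 8.
ks <- 1:8
wss <- sapply(ks, function(k) kmeans(toy, centers = k, nstart = 20)$tot.withinss)

plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

The "elbow" is the value of k after which adding further clusters reduces the sum of squares only slightly. In this toy example it falls at k = 3 by construction.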

Fig: Elbow plot

From the graph, we see that the best k is 5.

##       INDONESIA        MALAYSIA     PHILIPPINES        THAILAND 
##               1               1               1               1 
##         VIETNAM NORTH.EAST.ASIA           CHINA       HONG.KONG 
##               1               1               1               1 
##           JAPAN      SOUTH.ASIA     MIDDLE.EAST         OCEANIA 
##               1               1               1               1 
##          EUROPE          FRANCE         GERMANY  UNITED.KINGDOM 
##               1               1               1               1 
##   NORTH.AMERICA   OTHER.REGIONS 
##               1               1

We present each cluster as a group in the network.

Singular Value Decomposition

Finally, we apply SVD to the data and look for the overarching patterns.
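A minimal sketch of this step using `svd()` from base R is shown here. The simulated matrix and the decision to centre the columns without rescaling them are assumptions for illustration only.

```r
set.seed(7)
# Toy matrix: 24 months (rows) by 5 hypothetical regions (columns),
# all driven by one shared seasonal pattern plus small noise.
months <- 24
season <- sin(2 * pi * seq_len(months) / 12)
loadings <- c(1, 2, 3, 4, 5)
toy <- outer(season, loadings) + matrix(rnorm(months * 5, sd = 0.1), months, 5)

# Centre each column, then decompose: X = U D V'.
x <- scale(toy, center = TRUE, scale = FALSE)
s <- svd(x)

# Share of total variance carried by each component.
var_explained <- s$d^2 / sum(s$d^2)
print(round(var_explained, 3))

# Scores on the first component, i.e. the dominant pattern over time (PC1).
pc1 <- s$u[, 1] * s$d[1]
```

When the columns are centred, the scores `U D` coincide with the principal components up to sign, which is why the first singular vector can be read as PC1.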

Fig: PC1

Fig: Applying Hierarchical Clustering on PCs