1 1. Learning Outcome

1.1 1.1 Analysis goal

In this hands-on exercise, you will gain hands-on experience on how to delineate homogeneous region by using geographically referenced multivariate data. There are two major analysis, namely:

  • hierarchical cluster analysis; and
  • spatially constrained cluster analysis.

1.2 1.2 Learning Outcome

By the end of this hands-on exercise, you will able:

  • to convert GIS polygon data into R’s simple feature data.frame by using appropriate functions of sf package of R;
  • to convert simple feature data.frame into R’s SpatialPolygonDataFrame object by using appropriate sf of package of R;
  • to perform custer analysis by using hclust() of Base R;
  • to perform spatially constrained cluster analysis using skater() of Base R; and
  • to visualise the analysis output by using ggplot2 and tmap package.

2 2. Getting started

2.1 2.1 The analytical question

In geobusiness and spatial policy, it is a common practice to delineate the market or planning area into homogeneous regions by using multivariate data. In this hands-on exercise, we are interested to delineate Shan State, Myanmar into homogeneous regions by using multiple Information and Communication technology (ICT) measures, namely: Radio, Television, Land line phone, Mobile phone, Computer, and Internet at home.

2.2 2.2 The data

  • Myanmar Township Boundary Data (i.e. myanmar_township_boundaries) : This is a GIS data in ESRI shapefile format. It consists of township bondary information of Myanmar. The spatial data are casptured in polygon features.
  • Shan-ICT.csv: This is an extract of The 2014 Myanmar Population and Housing Census Myanmar at the township level.

2.3 2.3 Installing and loading R packages

The R packages needed for this exercise are as follows:

  • Spatial data handling: sf, and spdep
  • Geospatial analysis package ClustGeo
  • Attribute data handling tidyverse, especially readr, ggplot2 and dplyr
  • Choropleth mapping tmap
packages = c('rgdal', 'spdep', 'ClustGeo',  'tmap', 'sf', 'ggpubr', 'cluster', 'heatmaply', 'corrplot', 'tidyverse')
for (p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
    }
  library(p,character.only = T)
}
## Loading required package: rgdal
## Loading required package: sp
## rgdal: version: 1.4-8, (SVN revision 845)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.2.3, released 2017/11/20
##  Path to GDAL shared files: C:/Users/TYZ/Documents/R/win-library/3.6/rgdal/gdal
##  GDAL binary built with GEOS: TRUE 
##  Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
##  Path to PROJ.4 shared files: C:/Users/TYZ/Documents/R/win-library/3.6/rgdal/proj
##  Linking to sp version: 1.4-1
## Loading required package: spdep
## Loading required package: spData
## To access larger datasets in this package, install the spDataLarge
## package with: `install.packages('spDataLarge',
## repos='https://nowosad.github.io/drat/', type='source')`
## Loading required package: sf
## Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
## Loading required package: ClustGeo
## Loading required package: tmap
## Loading required package: ggpubr
## Loading required package: ggplot2
## Loading required package: cluster
## Loading required package: heatmaply
## Loading required package: plotly
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
## Loading required package: viridis
## Loading required package: viridisLite
## Registered S3 method overwritten by 'seriation':
##   method         from 
##   reorder.hclust gclus
## 
## ======================
## Welcome to heatmaply version 1.1.0
## 
## Type citation('heatmaply') for how to cite the package.
## Type ?heatmaply for the main documentation.
## 
## The github page is: https://github.com/talgalili/heatmaply/
## Please submit your suggestions and bug-reports at: https://github.com/talgalili/heatmaply/issues
## Or contact: <tal.galili@gmail.com>
## ======================
## Loading required package: corrplot
## corrplot 0.84 loaded
## Loading required package: tidyverse
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v tibble  3.0.1     v dplyr   0.8.5
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.5.0
## v purrr   0.3.4
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks plotly::filter(), stats::filter()
## x dplyr::lag()    masks stats::lag()

3 3 Data Import and Prepatation

3.1 3.1 Importing geospatial data into R environment

Import Myanmar Township Boundary GIS data and its associated attrbiute table into R environment.

The Myanmar Township Boundary GIS data is in ESRI shapefile format. It will be imported into R environment by using the st_read() function of sf.

shan_sf <- st_read(dsn = "data/geospatial", layer = "myanmar_township_boundaries") %>%
  filter(ST %in% c("Shan (East)", "Shan (North)", "Shan (South)"))
## Reading layer `myanmar_township_boundaries' from data source `D:\SMU\IS415\Session 10\In-class_Ex10\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 330 features and 14 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 92.17275 ymin: 9.671252 xmax: 101.1699 ymax: 28.54554
## CRS:            4326

The imported township boundary object is called shan_sf. It is saved in simple feature data.frame format. We can view the content of the newly created shan_sf simple features data.frame by using the code chun below. (tibble)

shan_sf
## Simple feature collection with 55 features and 14 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 96.15107 ymin: 19.29932 xmax: 101.1699 ymax: 24.15907
## CRS:            4326
## First 10 features:
##    OBJECTID           ST ST_PCODE       DT   DT_PCODE        TS  TS_PCODE
## 1       163 Shan (North)   MMR015  Mongmit MMR015D008   Mongmit MMR015017
## 2       203 Shan (South)   MMR014 Taunggyi MMR014D001   Pindaya MMR014006
## 3       240 Shan (South)   MMR014 Taunggyi MMR014D001   Ywangan MMR014007
## 4       106 Shan (South)   MMR014 Taunggyi MMR014D001  Pinlaung MMR014009
## 5        72 Shan (North)   MMR015  Mongmit MMR015D008    Mabein MMR015018
## 6        40 Shan (South)   MMR014 Taunggyi MMR014D001     Kalaw MMR014005
## 7       194 Shan (South)   MMR014 Taunggyi MMR014D001     Pekon MMR014010
## 8       159 Shan (South)   MMR014 Taunggyi MMR014D001  Lawksawk MMR014008
## 9        61 Shan (North)   MMR015  Kyaukme MMR015D003 Nawnghkio MMR015013
## 10      124 Shan (North)   MMR015  Kyaukme MMR015D003   Kyaukme MMR015012
##                  ST_2            LABEL2 SELF_ADMIN ST_RG T_NAME_WIN
## 1  Shan State (North)    Mongmit\n61072       <NA> State   rdk;rdwf
## 2  Shan State (South)    Pindaya\n77769       Danu State     yif;w,
## 3  Shan State (South)    Ywangan\n76933       Danu State      &GmiH
## 4  Shan State (South)  Pinlaung\n162537       Pa-O State  yifavmif;
## 5  Shan State (North)     Mabein\n35718       <NA> State     rbdrf;
## 6  Shan State (South)     Kalaw\n163138       <NA> State       uavm
## 7  Shan State (South)      Pekon\n94226       <NA> State     z,fcHk
## 8  Shan State (South)          Lawksawk       <NA> State   &yfapmuf
## 9  Shan State (North) Nawnghkio\n128357       <NA> State  aemifcsdK
## 10 Shan State (North)   Kyaukme\n172874       <NA> State   ausmufrJ
##                                                                   T_NAME_M3
## 1          <U+1019><U+102D><U+102F><U+1038><U+1019><U+102D><U+1010><U+103A>
## 2                          <U+1015><U+1004><U+103A><U+1038><U+1010><U+101A>
## 3                                  <U+101B><U+103D><U+102C><U+1004><U+1036>
## 4  <U+1015><U+1004><U+103A><U+101C><U+1031><U+102C><U+1004><U+103A><U+1038>
## 5                          <U+1019><U+1018><U+102D><U+1019><U+103A><U+1038>
## 6                                          <U+1000><U+101C><U+1031><U+102C>
## 7                          <U+1016><U+101A><U+103A><U+1001><U+102F><U+1036>
## 8          <U+101B><U+1015><U+103A><U+1005><U+1031><U+102C><U+1000><U+103A>
## 9  <U+1014><U+1031><U+102C><U+1004><U+103A><U+1001><U+103B><U+102D><U+102F>
## 10         <U+1000><U+103B><U+1031><U+102C><U+1000><U+103A><U+1019><U+1032>
##        AREA                       geometry
## 1  2703.611 MULTIPOLYGON (((96.96001 23...
## 2   629.025 MULTIPOLYGON (((96.7731 21....
## 3  2984.377 MULTIPOLYGON (((96.78483 21...
## 4  3396.963 MULTIPOLYGON (((96.49518 20...
## 5  5034.413 MULTIPOLYGON (((96.66306 24...
## 6  1456.624 MULTIPOLYGON (((96.49518 20...
## 7  2073.513 MULTIPOLYGON (((97.14738 19...
## 8  5145.659 MULTIPOLYGON (((96.94981 22...
## 9  3271.537 MULTIPOLYGON (((96.75648 22...
## 10 3920.869 MULTIPOLYGON (((96.95498 22...

Since shan_sf is conformed to tidy framework, we can also glimpse() to reveal the data type of it’s fields.

glimpse(shan_sf)
## Rows: 55
## Columns: 15
## $ OBJECTID   <dbl> 163, 203, 240, 106, 72, 40, 194, 159, 61, 124, 71, 155, ...
## $ ST         <fct> Shan (North), Shan (South), Shan (South), Shan (South), ...
## $ ST_PCODE   <fct> MMR015, MMR014, MMR014, MMR014, MMR015, MMR014, MMR014, ...
## $ DT         <fct> Mongmit, Taunggyi, Taunggyi, Taunggyi, Mongmit, Taunggyi...
## $ DT_PCODE   <fct> MMR015D008, MMR014D001, MMR014D001, MMR014D001, MMR015D0...
## $ TS         <fct> Mongmit, Pindaya, Ywangan, Pinlaung, Mabein, Kalaw, Peko...
## $ TS_PCODE   <fct> MMR015017, MMR014006, MMR014007, MMR014009, MMR015018, M...
## $ ST_2       <fct> Shan State (North), Shan State (South), Shan State (Sout...
## $ LABEL2     <fct> Mongmit
## 61072, Pindaya
## 77769, Ywangan
## 76933, Pinlaung
## 16...
## $ SELF_ADMIN <fct> NA, Danu, Danu, Pa-O, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ ST_RG      <fct> State, State, State, State, State, State, State, State, ...
## $ T_NAME_WIN <fct> "rdk;rdwf", "yif;w,", "&GmiH", "yifavmif;", "rbdrf;", "u...
## $ T_NAME_M3  <fct> <U+1019><U+102D><U+102F><U+1038><U+1019><U+102D><U+1010>...
## $ AREA       <dbl> 2703.611, 629.025, 2984.377, 3396.963, 5034.413, 1456.62...
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((96.96001 23..., MULTIPOLYGO...

3.2 3.2 Importing aspatial data into R environment