Hong Kong is the seventh most overpopulated city in the world, with a land area of only 1,106 km² and a population of close to 7.5 million. It is widely described as one of the most densely populated places in the world, and it faces a declining birth rate along with an ageing society. It is estimated that by 2023, 26.8% of the population will be over 65 years old. The cost of living in Hong Kong is also exorbitant, with particular concern over the affordability of housing and child care.
Hong Kong is split into 18 districts (political areas), drawn according to mountains, coastlines and roads. District elections elect a council for each of the 18 districts. It makes sense to view the demography of Hong Kong by district, since each district council, and its focus, is unique. Districts also differ in their degree of urbanisation and affluence: for example, Sham Shui Po is the poorest district in Hong Kong, with the lowest median household income of all districts. Notably, Hong Kong’s population is heavily concentrated along the northern and southern shores, where the urban and metro areas are located.
Districts in Hong Kong
In this publication, we aim to uncover the demography of Hong Kong by age group across its districts. The data is retrieved from the 2016 Population By-census - District Profiles (Constituency Areas), published on the official public platform for spatial data in Hong Kong.
The shapefile follows the district separation shown below:
geography of Hong Kong
| Major challenge | Description |
|---|---|
| Data | Large dataset with many variables; the numerous fields need to be sifted through to decide which to visualise. |
| Data | Numeric fields are stored as strings, e.g. “20 000”, so the data needs to be wrangled. |
| Data | Column headers are not intuitive, e.g. age_1; the main website must be consulted to work out what each variable means. |
| Design | The visualisation may not reflect the exact geography, since the shapefile stores constituency borders rather than the actual geographic area. |
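The string-encoded numbers mentioned in the table come in a few shapes in this dataset: counts with a space as thousands separator (" 12 501"), amounts with a comma ("18,000"), and a "-" placeholder. A minimal sketch of a converter is given here. The helper name `parse_census_number` is hypothetical and not part of the original analysis, and treating "-" as missing is an assumption that should be checked against the census notes (it may instead denote zero).

```r
# Hypothetical helper (not part of the original analysis): convert census
# count strings such as " 12 501" or "18,000" to numeric values.
parse_census_number <- function(x) {
  x <- trimws(x)
  x[x == "-"] <- NA_character_      # assumption: "-" is treated as missing
  as.numeric(gsub("[ ,]", "", x))   # drop space and comma thousands separators
}

parse_census_number(c(" 12 501", "892", "18,000", "-"))
```

Mapping "-" to `NA` before calling `as.numeric()` avoids the coercion warning that would otherwise be raised for that placeholder.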
Proposed sketch design
Steps:
1. Set working directory & load essential libraries
2. Store the dataset for Hong Kong’s population census into df
3. View the dimensions of the dataset, along with the structure of variables and an overview of each column and its records
4. Check for NA values
5. Store the shp file into hkgeo
knitr::opts_chunk$set(warning = FALSE, echo = TRUE, eval = TRUE, message = FALSE, results = 'asis')
setwd("C:/Users/X Lin/Desktop/Notes/4.1/VA/assgn 5")
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(tidyr)
library(tmap)
## Warning: package 'tmap' was built under R version 4.0.3
library(sf)
## Warning: package 'sf' was built under R version 4.0.3
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
df <- read.csv("hong kong census.csv")
dim(df)
## [1] 432 214
glimpse(df)
## Rows: 432
## Columns: 214
## $ dc_class <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",...
## $ dc <int> 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 1...
## $ dcca_class <chr> "A01", "A02", "A03", "A04", "A05", "A06", "A07", ...
## $ dcca <int> 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1...
## $ dc_eng <chr> "Central and Western ", "Central and Western ", "...
## $ ca_eng <chr> "Chung Wan", "Mid Levels East", "Castle Road", "P...
## $ dcca_eng <chr> "Central and Western - Chung Wan", "Central and W...
## $ dc_chi <chr> "ä¸è¥¿å\215\200 ", "ä¸è¥¿å\215\200 ", "ä¸è¥¿å\...
## $ ca_chi <chr> "ä¸ç’°", "å\215Šå±±æ\235±", "衛城", "å±±é ‚", ...
## $ dcca_chi <chr> "ä¸è¥¿å\215\200 - ä¸ç’°", "ä¸è¥¿å\215\200 - å\...
## $ t_pop <chr> " 12 501", " 17 009", " 20 058", " 20 263", " 18 ...
## $ pop_m <chr> " 5 892", " 7 584", " 8 402", " 8 010", " 6 837",...
## $ pop_f <chr> " 6 609", " 9 425", " 11 656", " 12 253", " 11 19...
## $ sr <chr> "892", "805", "721", "654", "611", "822", "866", ...
## $ age_1 <chr> " 1 024", " 1 453", " 2 433", " 2 374", " 2 015",...
## $ age_2 <chr> " 1 100", " 1 478", " 1 948", " 1 945", " 2 037",...
## $ age_3 <chr> " 3 650", " 6 170", " 6 564", " 6 717", " 5 492",...
## $ age_4 <chr> " 4 453", " 5 712", " 6 538", " 7 029", " 5 708",...
## $ age_5 <chr> " 2 274", " 2 196", " 2 575", " 2 198", " 2 780",...
## $ t_ma <dbl> 47.4, 42.8, 42.3, 42.7, 43.0, 43.1, 43.0, 45.5, 4...
## $ ma_m <dbl> 45.7, 44.3, 42.7, 43.8, 45.2, 44.1, 43.1, 45.1, 4...
## $ ma_f <dbl> 48.2, 41.4, 42.0, 41.9, 42.5, 42.4, 43.0, 45.9, 4...
## $ born_hk <chr> " 6 171", " 9 197", " 11 624", " 8 660", " 10 489...
## $ born_chi <chr> " 2 930", " 2 332", " 2 070", " 2 003", " 2 631",...
## $ born_else <chr> " 3 400", " 5 480", " 6 364", " 9 600", " 4 912",...
## $ ethn_chi <chr> " 9 239", " 11 513", " 13 821", " 10 499", " 13 2...
## $ ethn_phi <chr> "598", " 1 440", " 2 614", " 4 670", " 2 251", "6...
## $ ethn_ind <chr> "438", "633", "545", "574", "692", "287", "330", ...
## $ ethn_wh <chr> " 1 507", " 1 929", " 1 692", " 2 656", " 1 034",...
## $ ethn_oth <chr> "719", " 1 494", " 1 386", " 1 864", "763", "863"...
## $ ms_nm_m <chr> " 2 323", " 2 581", " 1 977", " 1 905", " 1 933",...
## $ ms_ma_m <chr> " 2 871", " 3 912", " 4 982", " 4 582", " 3 895",...
## $ ms_win_m <chr> "88", "58", "106", "78", "99", "232", "211", "163...
## $ ms_div_m <chr> "44", "173", "111", "99", "105", "237", "136", "7...
## $ ms_sep_m <chr> "38", "44", "16", "83", "-", "-", "27", "57", "26...
## $ ms_nm_f <chr> " 1 570", " 2 990", " 3 135", " 3 092", " 3 021",...
## $ ms_m_f <chr> " 3 138", " 4 858", " 5 954", " 6 796", " 5 897",...
## $ ms_win_f <chr> "733", "439", "674", "617", "588", " 1 001", "688...
## $ ms_div_f <chr> "545", "455", "547", "576", "415", "359", "396", ...
## $ ms_sep_f <chr> "127", "46", "123", "61", "64", "43", "19", "50",...
## $ ul_can <chr> " 8 117", " 10 508", " 12 210", " 8 179", " 12 21...
## $ ul_put <chr> " 1 004", "490", "421", "590", "329", "526", "552...
## $ ul_othchi <chr> "52", "71", "146", "97", "56", "493", "648", "416...
## $ ul_eng <chr> " 2 647", " 4 571", " 5 357", " 9 281", " 4 200",...
## $ ul_oth <chr> "467", "916", " 1 120", " 1 439", "597", "283", "...
## $ readchi_ablepctn <dbl> 73.8, 69.9, 71.0, 54.4, 74.0, 80.5, 92.0, 89.2, 8...
## $ readeng_ablepctn <dbl> 80.7, 90.3, 93.5, 95.3, 92.3, 77.2, 68.0, 74.8, 8...
## $ writechi_ablepctn <dbl> 72.3, 67.4, 70.0, 53.6, 73.0, 76.7, 89.9, 87.5, 8...
## $ writeeng_ablepctn <dbl> 79.7, 89.3, 92.2, 94.3, 91.7, 72.8, 66.5, 73.3, 7...
## $ edu_prepri <chr> "493", "340", "396", "355", "371", " 1 054", "682...
## $ edu_pri <chr> "998", "773", "511", "381", "828", " 1 322", " 1 ...
## $ edu_lsed <chr> " 1 304", "906", "931", " 1 057", " 1 115", " 1 8...
## $ edu_usec <chr> " 2 926", " 3 790", " 4 281", " 4 299", " 3 833",...
## $ edu_dip <chr> "652", " 1 062", "911", " 1 291", " 1 269", "885"...
## $ edu_sub <chr> "456", "690", "572", "900", "676", "880", "593", ...
## $ edu_deg <chr> " 4 648", " 7 995", " 10 023", " 9 606", " 7 925"...
## $ pls_same <chr> "646", "824", " 1 276", "840", " 1 577", " 1 359"...
## $ pls_diff_hk <chr> "566", "838", " 1 191", " 1 164", "900", "925", "...
## $ pls_diff_kln <chr> "160", "217", "225", "331", "216", "247", "380", ...
## $ pls_diff_nt <chr> "63", "110", "126", "63", "112", "99", "148", "15...
## $ s_diff_oth <chr> "28", "43", "30", "61", "23", "18", "11", "-", "5...
## $ t_lf <chr> " 7 886", " 10 760", " 12 009", " 12 853", " 10 5...
## $ lf_m <chr> " 3 867", " 5 013", " 5 287", " 5 236", " 4 029",...
## $ lf_f <chr> " 4 019", " 5 747", " 6 722", " 7 617", " 6 568",...
## $ lfpr_t <dbl> 68.7, 69.2, 68.1, 71.8, 66.2, 62.1, 65.0, 65.1, 6...
## $ lfpr_m <dbl> 72.1, 74.1, 73.5, 77.6, 66.8, 68.2, 70.6, 70.4, 6...
## $ lfpr_f <dbl> 65.7, 65.4, 64.4, 68.4, 65.8, 57.3, 60.0, 60.2, 6...
## $ t_wp <chr> " 7 647", " 10 509", " 11 655", " 12 615", " 10 3...
## $ wp_ee <chr> " 5 972", " 8 809", " 9 770", " 10 353", " 8 747"...
## $ wp_er <chr> "625", "725", "975", " 1 155", "848", "433", "275...
## $ wp_se <chr> "946", "893", "757", "850", "760", "578", "697", ...
## $ wp_fw <chr> "104", "82", "153", "257", "34", "148", "99", "10...
## $ t_nwp <chr> " 4 854", " 6 500", " 8 403", " 7 648", " 7 643",...
## $ nwp_hm <chr> "591", "991", " 1 064", " 1 402", "989", "976", "...
## $ nwp_st <chr> " 1 487", " 2 146", " 3 315", " 3 372", " 3 186",...
## $ nwp_re <chr> " 1 684", " 2 156", " 2 626", " 1 562", " 2 449",...
## $ nwp_oth <chr> " 1 092", " 1 207", " 1 398", " 1 312", " 1 019",...
## $ plw_same <chr> " 3 061", " 3 604", " 3 279", " 3 419", " 3 120",...
## $ plw_diff_hk <chr> " 1 132", " 1 701", " 2 052", " 1 454", " 1 632",...
## $ plw_diff_kln <chr> "954", " 1 459", " 1 183", "935", " 1 188", " 1 2...
## $ plw_diff_nt <chr> "266", "437", "531", "482", "435", "467", "592", ...
## $ plw_diff_oth <chr> "43", "241", "196", "24", "175", "211", "170", "1...
## $ plw_nofix <chr> "644", "640", "526", "278", "440", "743", "800", ...
## $ plw_hm <chr> " 1 289", " 2 072", " 3 398", " 5 585", " 3 122",...
## $ plw_out <chr> "258", "355", "490", "438", "277", "252", "119", ...
## $ mearn_xfdh_1 <chr> " 1 536", " 2 455", " 3 351", " 4 738", " 3 170",...
## $ mearn_xfdh_2 <chr> "616", "443", "671", " 1 160", "778", "820", "968...
## $ mearn_xfdh_3 <chr> " 1 766", " 1 374", " 1 019", "700", "923", " 2 3...
## $ mearn_xfdh_4 <chr> "956", " 1 188", "840", "576", "833", " 1 217", "...
## $ mearn_xfdh_5 <chr> "564", "669", "645", "435", "639", "806", "596", ...
## $ mearn_xfdh_6 <chr> "796", " 1 322", " 1 004", "918", " 1 054", "934"...
## $ mearn_xfdh_7 <chr> " 1 309", " 2 976", " 3 972", " 3 831", " 2 958",...
## $ t_mmearn <chr> "18,000", "25,500", "28,000", "13,000", "20,000",...
## $ mmearm_m <chr> "27,500", "50,000", "60,000", "80,000", "50,000",...
## $ mmearn_f <chr> "10,000", "12,600", "7,000", "5,000", "7,000", "1...
## $ mearn_xfdhfw_1 <chr> "645", "489", "291", "105", "357", "733", "965", ...
## $ mearn_xfdhfw_2 <chr> "491", "394", "446", "444", "613", "762", "935", ...
## $ mearn_xfdhfw_3 <chr> " 1 766", " 1 374", "996", "639", "923", " 2 309"...
## $ mearn_xfdhfw_4 <chr> "956", " 1 188", "840", "576", "833", " 1 217", "...
## $ mearn_xfdhfw_5 <chr> "564", "669", "645", "435", "639", "806", "596", ...
## $ mearn_xfdhfw_6 <chr> "796", " 1 322", " 1 004", "918", " 1 054", "934"...
## $ mearn_xfdhfw_7 <chr> " 1 309", " 2 976", " 3 972", " 3 831", " 2 958",...
## $ t_mmearn_xfdh <chr> "20,000", "40,000", "55,000", "75,000", "45,000",...
## $ mmearn_xfdh_m <chr> "27,500", "50,000", "60,000", "96,750", "51,000",...
## $ mmearn_xfdh_f <chr> "16,000", "27,000", "45,000", "45,000", "30,000",...
## $ wp_a <chr> " 1 149", " 2 676", " 3 173", " 3 358", " 2 249",...
## $ wp_b <chr> "895", " 1 573", " 1 699", " 1 624", " 1 682", " ...
## $ wp_c <chr> " 1 509", " 2 317", " 1 864", " 1 245", " 1 716",...
## $ wp_d <chr> "720", "937", "704", "511", "861", " 1 164", " 1 ...
## $ wp_e <chr> " 1 465", "498", "615", "270", "502", " 1 229", "...
## $ wp_f <chr> "221", "96", "87", "8", "63", "262", "303", "212"...
## $ wp_g <chr> "100", "62", "56", "152", "34", "321", "385", "28...
## $ wp_h <chr> " 1 588", " 2 350", " 3 450", " 5 447", " 3 282",...
## $ wp_i <chr> "-", "-", "7", "-", "-", "-", "11", "-", "-", "-"...
## $ wp_j <chr> "47", "228", "156", "239", "152", "129", "249", "...
## $ wp_k <chr> "164", "385", "165", "34", "281", "420", "384", "...
## $ wp_l <chr> " 1 651", " 1 625", " 1 634", " 1 713", " 1 429",...
## $ wp_m <chr> "388", "327", "305", "156", "201", "672", "706", ...
## $ wp_n <chr> "483", "188", "369", "213", "323", "489", "817", ...
## $ wp_o <chr> "152", "640", "463", "124", "459", "362", "191", ...
## $ wp_p <chr> " 1 123", " 2 073", " 1 912", " 2 054", " 1 343",...
## $ wp_q <chr> " 1 285", " 1 584", " 1 779", " 1 544", " 1 680",...
## $ wp_r <chr> " 1 023", " 1 111", " 1 375", "911", " 1 367", " ...
## $ wp_s <chr> " 1 296", " 2 333", " 3 448", " 5 627", " 3 118",...
## $ wp_t <chr> "35", "15", "49", "-", "36", "29", "30", "30", "7...
## $ whr_1 <chr> "581", "719", "425", "555", "525", "730", "712", ...
## $ whr_2 <chr> "866", "705", "924", "880", "758", "861", "836", ...
## $ whr_3 <chr> " 2 228", " 3 303", " 3 787", " 3 136", " 3 198",...
## $ whr_4 <chr> " 2 159", " 3 024", " 3 444", " 3 952", " 3 293",...
## $ whr_5 <chr> "842", " 1 467", " 1 235", " 1 808", " 1 503", "9...
## $ whr_6 <chr> "971", " 1 291", " 1 840", " 2 284", " 1 112", "7...
## $ dh <chr> " 5 289", " 6 521", " 6 001", " 5 357", " 5 027",...
## $ dhz_1 <chr> " 2 273", " 2 103", "806", "799", "784", " 1 588"...
## $ dhz_2 <chr> " 1 127", " 1 754", " 1 376", "940", "983", " 1 8...
## $ dhz_3 <chr> "935", " 1 008", " 1 284", "743", " 1 001", " 1 0...
## $ dhz_4 <chr> "422", "810", " 1 159", " 1 092", "925", "955", "...
## $ dhz_5 <chr> "418", "534", "886", "811", "776", "423", "341", ...
## $ dhz_6 <chr> "114", "312", "490", "972", "558", "191", "90", "...
## $ adhz <dbl> 2.2, 2.5, 3.3, 3.7, 3.4, 2.6, 2.5, 2.4, 3.0, 2.6,...
## $ dhc_1 <chr> "788", " 1 291", " 1 338", "899", "969", " 1 092"...
## $ dhc_2 <chr> " 1 022", " 1 925", " 2 492", " 2 469", " 2 125",...
## $ dhc_3 <chr> "354", "240", "431", "80", "263", "585", "613", "...
## $ dhc_4 <chr> "31", "67", "57", "53", "22", "49", "26", "66", "...
## $ dhc_5 <chr> "173", "83", "158", "110", "240", "131", "180", "...
## $ dhc_6 <chr> "391", "322", "375", "269", "347", "437", "518", ...
## $ dhc_7 <chr> " 2 273", " 2 103", "806", "799", "784", " 1 588"...
## $ dhc_8 <chr> "257", "490", "344", "678", "277", "240", "244", ...
## $ dhi_1 <chr> "777", "598", "455", "245", "392", "568", "709", ...
## $ dhi_2 <chr> "367", "158", "138", "270", "158", "251", "549", ...
## $ dhi_3 <chr> "701", "508", "338", "195", "289", " 1 029", " 1 ...
## $ dhi_4 <chr> "613", "598", "281", "208", "270", "785", " 1 158...
## $ dhi_5 <chr> "564", "412", "288", "292", "291", "600", "632", ...
## $ dhi_6 <chr> "786", "964", "631", "362", "532", "842", "829", ...
## $ dhi_7 <chr> " 1 481", " 3 283", " 3 870", " 3 785", " 3 095",...
## $ ma_hh <chr> "33,630", "60,000", "90,000", "132,250", "83,000"...
## $ dhi_e1 <chr> "186", "106", "119", "26", "23", "93", "89", "78"...
## $ dhi_e2 <chr> "161", "24", "38", "30", "16", "126", "190", "69"...
## $ dhi_e3 <chr> "560", "292", "210", "62", "144", "787", "978", "...
## $ dhi_e4 <chr> "539", "548", "198", "116", "192", "763", " 1 115...
## $ dhi_e5 <chr> "545", "380", "244", "246", "241", "546", "615", ...
## $ dhi_e6 <chr> "778", "942", "589", "327", "444", "809", "810", ...
## $ dhi_e7 <chr> " 1 434", " 3 190", " 3 734", " 3 729", " 2 970",...
## $ ma_econhh <chr> "43,000", "70,000", "104,210", "174,460", "100,00...
## $ oq_pub <chr> "-", "-", "-", "-", "-", "615", " 2 210", "-", "-...
## $ oq_s <chr> "-", "-", "-", "-", "-", "-", "-", "-", "-", "-",...
## $ oq_pri <chr> " 4 950", " 6 499", " 6 002", " 5 433", " 4 997",...
## $ oq_non <chr> "444", "141", "25", "24", "495", "46", "534", "96...
## $ oq_tem <chr> "13", "15", "-", "-", "4", "2", "-", "1", "9", "1...
## $ dh_pub <chr> "-", "-", "-", "-", "-", "615", " 2 210", "-", "-...
## $ dh_s <chr> "-", "-", "-", "-", "-", "-", "-", "-", "-", "-",...
## $ dh_pri <chr> " 4 856", " 6 384", " 5 976", " 5 339", " 4 986",...
## $ dh_non <chr> "420", "122", "25", "18", "37", "46", "110", "86"...
## $ dh_tem <chr> "13", "15", "-", "-", "4", "2", "-", "1", "9", "1...
## $ pop_pub <chr> "-", "-", "-", "-", "-", " 2 097", " 6 139", "-",...
## $ pop_s <chr> "-", "-", "-", "-", "-", "-", "-", "-", "-", "-",...
## $ pop_pri <chr> " 11 838", " 16 702", " 19 983", " 20 095", " 17 ...
## $ pop_non <chr> "650", "285", "75", "168", "784", "92", "915", "3...
## $ pop_tem <chr> "13", "22", "-", "-", "16", "2", "-", "1", "9", "...
## $ dh_r1 <chr> "636", "149", "9", "16", "36", "373", "698", "98"...
## $ dh_r2 <chr> " 1 166", " 1 097", "244", "66", "135", " 1 044",...
## $ dh_r3 <chr> " 2 162", " 1 986", "810", "260", "638", " 2 578"...
## $ dh_r4 <chr> "741", " 1 644", " 1 761", "693", " 1 124", " 1 3...
## $ dh_r5 <chr> "257", " 1 050", " 1 640", " 1 418", " 1 603", "4...
## $ dh_r6 <chr> "314", "595", " 1 537", " 2 904", " 1 491", "225"...
## $ dh_r0 <chr> "13", "-", "-", "-", "-", "2", "-", "1", "-", "1"...
## $ dh_ocm <chr> "624", " 1 385", " 1 444", "840", "975", " 1 166"...
## $ dh_ocwm <chr> " 1 975", " 2 467", " 2 443", " 1 762", " 2 483",...
## $ dh_st <chr> " 2 193", " 2 130", " 1 444", " 1 706", " 1 137",...
## $ dh_co <chr> "20", "16", "-", "-", "20", "22", "21", "48", "42...
## $ dh_rf <chr> "219", "329", "309", "380", "258", "120", "228", ...
## $ dh_emp <chr> "258", "194", "361", "669", "154", "93", "162", "...
## $ dhm_1 <chr> "165", "257", "283", "208", "105", "111", "119", ...
## $ dhm_2 <chr> "59", "27", "-", "-", "46", "65", "29", "59", "29...
## $ dhm_3 <chr> "31", "64", "8", "54", "9", "53", "98", "82", "13...
## $ dhm_4 <chr> "-", "48", "-", "44", "20", "97", "85", "207", "1...
## $ dhm_5 <chr> "39", "30", "17", "-", "27", "72", "85", "63", "7...
## $ dhm_6 <chr> "112", "362", "295", "41", "201", "519", "109", "...
## $ dhm_7 <chr> "218", "597", "841", "493", "567", "249", "21", "...
## $ dhm_loan <chr> "15,500", "20,000", "26,250", "48,000", "25,000",...
## $ dhm_lr <chr> "24.1", "18.7", "20", "16.2", "16.4", "19.2", "17...
## $ dhr_1 <chr> "147", "127", "75", "263", "61", "628", " 1 511",...
## $ dhr_2 <chr> "408", "34", "119", "31", "66", "318", " 1 015", ...
## $ dhr_3 <chr> "312", "25", "92", "23", "18", "453", "178", "269...
## $ dhr_4 <chr> " 1 604", " 2 154", " 1 519", " 2 058", " 1 166",...
## $ dm_r <chr> "13,500", "23,000", "34,000", "67,000", "32,000",...
## $ dmr_ir <dbl> 31.3, 30.9, 24.6, 34.4, 28.0, 30.9, 17.1, 38.2, 3...
## $ fa_m <int> 40, 58, 93, 183, 94, 43, 32, 37, 45, 34, 34, 35, ...
## $ pm_hk <chr> "470", "662", " 1 053", " 1 116", "807", " 1 031"...
## $ pm_kln <chr> "270", "346", "310", "265", "384", "827", "457", ...
## $ pm_nt <chr> "308", "336", "428", "177", "518", "767", "581", ...
## $ pm_oth <chr> "46", "16", "75", "79", "51", "149", "119", "51",...
## $ pm_samearea <chr> " 1 070", " 1 836", " 2 825", " 2 524", " 1 599",...
## $ pm_same <chr> " 8 353", " 10 453", " 11 744", " 11 936", " 11 8...
## $ pm_out <chr> " 1 770", " 2 907", " 2 828", " 3 489", " 2 174",...
colSums(is.na(df)) #check NA
## dc_class dc dcca_class dcca
## 0 0 0 0
## dc_eng ca_eng dcca_eng dc_chi
## 0 0 0 0
## ca_chi dcca_chi t_pop pop_m
## 0 0 0 0
## pop_f sr age_1 age_2
## 0 0 0 0
## age_3 age_4 age_5 t_ma
## 0 0 0 0
## ma_m ma_f born_hk born_chi
## 0 0 0 0
## born_else ethn_chi ethn_phi ethn_ind
## 0 0 0 0
## ethn_wh ethn_oth ms_nm_m ms_ma_m
## 0 0 0 0
## ms_win_m ms_div_m ms_sep_m ms_nm_f
## 0 0 0 0
## ms_m_f ms_win_f ms_div_f ms_sep_f
## 0 0 0 0
## ul_can ul_put ul_othchi ul_eng
## 0 0 0 0
## ul_oth readchi_ablepctn readeng_ablepctn writechi_ablepctn
## 0 0 0 0
## writeeng_ablepctn edu_prepri edu_pri edu_lsed
## 0 0 0 0
## edu_usec edu_dip edu_sub edu_deg
## 0 0 0 0
## pls_same pls_diff_hk pls_diff_kln pls_diff_nt
## 0 0 0 0
## s_diff_oth t_lf lf_m lf_f
## 0 0 0 0
## lfpr_t lfpr_m lfpr_f t_wp
## 0 0 0 0
## wp_ee wp_er wp_se wp_fw
## 0 0 0 0
## t_nwp nwp_hm nwp_st nwp_re
## 0 0 0 0
## nwp_oth plw_same plw_diff_hk plw_diff_kln
## 0 0 0 0
## plw_diff_nt plw_diff_oth plw_nofix plw_hm
## 0 0 0 0
## plw_out mearn_xfdh_1 mearn_xfdh_2 mearn_xfdh_3
## 0 0 0 0
## mearn_xfdh_4 mearn_xfdh_5 mearn_xfdh_6 mearn_xfdh_7
## 0 0 0 0
## t_mmearn mmearm_m mmearn_f mearn_xfdhfw_1
## 0 0 0 0
## mearn_xfdhfw_2 mearn_xfdhfw_3 mearn_xfdhfw_4 mearn_xfdhfw_5
## 0 0 0 0
## mearn_xfdhfw_6 mearn_xfdhfw_7 t_mmearn_xfdh mmearn_xfdh_m
## 0 0 0 0
## mmearn_xfdh_f wp_a wp_b wp_c
## 0 0 0 0
## wp_d wp_e wp_f wp_g
## 0 0 0 0
## wp_h wp_i wp_j wp_k
## 0 0 0 0
## wp_l wp_m wp_n wp_o
## 0 0 0 0
## wp_p wp_q wp_r wp_s
## 0 0 0 0
## wp_t whr_1 whr_2 whr_3
## 0 0 0 0
## whr_4 whr_5 whr_6 dh
## 0 0 0 0
## dhz_1 dhz_2 dhz_3 dhz_4
## 0 0 0 0
## dhz_5 dhz_6 adhz dhc_1
## 0 0 0 0
## dhc_2 dhc_3 dhc_4 dhc_5
## 0 0 0 0
## dhc_6 dhc_7 dhc_8 dhi_1
## 0 0 0 0
## dhi_2 dhi_3 dhi_4 dhi_5
## 0 0 0 0
## dhi_6 dhi_7 ma_hh dhi_e1
## 0 0 0 0
## dhi_e2 dhi_e3 dhi_e4 dhi_e5
## 0 0 0 0
## dhi_e6 dhi_e7 ma_econhh oq_pub
## 0 0 0 0
## oq_s oq_pri oq_non oq_tem
## 0 0 0 0
## dh_pub dh_s dh_pri dh_non
## 0 0 0 0
## dh_tem pop_pub pop_s pop_pri
## 0 0 0 0
## pop_non pop_tem dh_r1 dh_r2
## 0 0 0 0
## dh_r3 dh_r4 dh_r5 dh_r6
## 0 0 0 0
## dh_r0 dh_ocm dh_ocwm dh_st
## 0 0 0 0
## dh_co dh_rf dh_emp dhm_1
## 0 0 0 0
## dhm_2 dhm_3 dhm_4 dhm_5
## 0 0 0 0
## dhm_6 dhm_7 dhm_loan dhm_lr
## 0 0 0 0
## dhr_1 dhr_2 dhr_3 dhr_4
## 0 0 0 0
## dm_r dmr_ir fa_m pm_hk
## 0 0 0 0
## pm_kln pm_nt pm_oth pm_samearea
## 0 0 0 0
## pm_same pm_out
## 0 0
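The NA check reports zero for every column, but the `glimpse()` output shows that several character columns contain "-" as a placeholder (for example `ms_sep_m` and `wp_i`). These entries are not `NA` until the column is converted to numeric, so `is.na()` does not see them. A small sketch of how such placeholders could be counted follows; the toy data frame is illustrative and only stands in for a few columns of `df`.

```r
# Toy data frame standing in for a few columns of df (values are illustrative)
toy <- data.frame(
  ms_sep_m = c("38", "44", "-", "-"),
  wp_i     = c("-", "-", "7", "-"),
  t_ma     = c(47.4, 42.8, 42.3, 42.7),
  stringsAsFactors = FALSE
)

# Count "-" placeholders per column; is.na() reports none of them
dash_counts <- sapply(toy, function(col) sum(trimws(as.character(col)) == "-"))
dash_counts
```

Applied to the real data, the same `sapply()` call on `df` would show which fields need a decision on how "-" should be handled before plotting.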
hkgeo <- st_read("DC 2015 poly Shapefile/DC_2015_poly Shapefile/GIH3_DC_2015_POLY.shp")
## Reading layer `GIH3_DC_2015_POLY' from data source `C:\Users\X Lin\Desktop\Notes\4.1\VA\assgn 5\DC 2015 poly Shapefile\DC_2015_poly Shapefile\GIH3_DC_2015_POLY.shp' using driver `ESRI Shapefile'
## Simple feature collection with 431 features and 11 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 799186.8 ymin: 799837 xmax: 869854 ymax: 847618
## projected CRS: Hong Kong 1980 Grid System
hkgeo
## Simple feature collection with 431 features and 11 fields
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 799186.8 ymin: 799837 xmax: 869854 ymax: 847618
## projected CRS: Hong Kong 1980 Grid System
## First 10 features:
## DCCA_2015_ DCCA_20151 CACODE ENAME
## 1 310 126 Q13 Kwan Po
## 2 309 133 Q25 Kwong Ming
## 3 308 128 G17 Hok Yuen Laguna Verde
## 4 306 96 E17 East Tsim Sha Tsui & King's Park
## 5 304 95 E01 Tsim Sha Tsui West
## 6 302 135 E16 Yau Ma Tei North
## 7 300 129 G23 Oi Man
## 8 299 130 J27 Tsui Ping
## 9 298 141 Q14 Nam On
## 10 295 142 J32 Ting On
## CNAME E00_CENTRO E00_CENT_1
## 1 <U+8ECD><U+5BF6> 845169.5 819389.1
## 2 <U+5EE3><U+660E> 844703.6 819318.0
## 3 <U+9DB4><U+5712><U+6D77><U+9038> 838028.9 819106.4
## 4 <U+5C16><U+6771><U+53CA><U+4EAC><U+58EB><U+67CF> 835609.4 818067.6
## 5 <U+5C16><U+6C99><U+5480><U+897F> 834482.3 818170.5
## 6 <U+6CB9><U+9EBB><U+5730><U+5317> 835393.3 819717.5
## 7 <U+611B><U+6C11> 836637.1 819242.4
## 8 <U+7FE0><U+5C4F> 842073.0 819391.3
## 9 <U+5357><U+5B89> 845169.5 819389.1
## 10 <U+5B9A><U+5B89> 840698.4 819706.8
## DISTRICT_T DISTRICT_E SHAPE_AREA SHAPE_LEN
## 1 <U+897F><U+8CA2><U+5340> Sai Kung 308388.62 4277.714
## 2 <U+897F><U+8CA2><U+5340> Sai Kung 273555.30 2349.297
## 3 <U+4E5D><U+9F8D><U+57CE><U+5340> Kowloon City 960988.59 4492.008
## 4 <U+6CB9><U+5C16><U+65FA><U+5340> Yau Tsim Mong 2371378.23 8441.048
## 5 <U+6CB9><U+5C16><U+65FA><U+5340> Yau Tsim Mong 4397466.69 10279.687
## 6 <U+6CB9><U+5C16><U+65FA><U+5340> Yau Tsim Mong 135793.47 1936.740
## 7 <U+4E5D><U+9F8D><U+57CE><U+5340> Kowloon City 368492.01 4528.530
## 8 <U+89C0><U+5858><U+5340> Kwun Tong 362912.12 4135.462
## 9 <U+897F><U+8CA2><U+5340> Sai Kung 166679.30 2411.632
## 10 <U+89C0><U+5858><U+5340> Kwun Tong 82454.86 1848.950
## geometry
## 1 POLYGON ((845283.5 819328.1...
## 2 POLYGON ((844917.5 819236.8...
## 3 POLYGON ((838847.8 819020.6...
## 4 POLYGON ((836250.3 819250.7...
## 5 POLYGON ((834521 819759.8, ...
## 6 POLYGON ((835998.2 819775.8...
## 7 POLYGON ((836568.8 819776.7...
## 8 POLYGON ((842202.9 819491.2...
## 9 POLYGON ((845508.3 819708.5...
## 10 POLYGON ((841037.1 819565.3...
Steps:
1. Subset df into a smaller dataset comprising only the fields we are interested in: total population, constituency area name, and age groups. The subset will be saved as hkpop
2. Change the data structure for population to numeric (currently stored as strings)
3. View the dimensions of the dataset, along with the structure of variables and an overview of each column and its records
4. Left join the shapefile with hkpop into a new dataframe hkpop_cleaned
hkpop <- df %>% select(ca_eng, t_pop, age_1, age_2,age_3,age_4,age_5)
# Change strings to numbers: remove the space used as a thousands separator
# (e.g. " 12 501"), then convert each column to numeric
num_cols <- c("t_pop", "age_1", "age_2", "age_3", "age_4", "age_5")
hkpop[num_cols] <- lapply(hkpop[num_cols], function(x) as.numeric(gsub(" ", "", x)))
hkpop_cleaned <- left_join(hkgeo, hkpop, by = c('ENAME'= 'ca_eng'))
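The shapefile holds 431 features while the census table holds 432 rows, so the two name columns may not match one-to-one. A left join keeps polygons with no matching census row (their population fields become `NA`) and repeats a polygon if its name appears more than once in the census table. A minimal sketch of a pre-join check is shown here using base R; the name vectors are illustrative stand-ins for `hkgeo$ENAME` and `hkpop$ca_eng`, and trimming whitespace is a precaution rather than something the source confirms is needed.

```r
# Illustrative name vectors; in the report these would be
# hkgeo$ENAME and hkpop$ca_eng
geo_names <- c("Kwan Po", "Kwong Ming", "Oi Man", "Tsui Ping")
pop_names <- c("Kwan Po", "Kwong Ming ", "Oi Man", "Oi Man")

# Trim stray whitespace before comparing the keys
geo_key <- trimws(geo_names)
pop_key <- trimws(pop_names)

# Polygons that would receive NA values after the left join
unmatched <- setdiff(geo_key, pop_key)

# Census names that occur more than once and would duplicate polygons
dup_keys <- unique(pop_key[duplicated(pop_key)])

unmatched
dup_keys
```

If either result is non-empty on the real data, joining on a code such as `CACODE` and `dcca_class` might be more robust than joining on names, although whether those two fields correspond exactly is an assumption to verify.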
Steps:
Using tmap,
1. Plot the total population as totalpop using tm_polygons. This map will be made interactive.
2. Plot the total population faceted by district as totalpop_faceted. This map will not be interactive.
3. For totalpop_faceted, tm_fill instead of tm_polygons will be used
4. Plot the population by age group as agepop
5. For agepop, labels will be changed to the specific age groups since the column headers are not intuitive, e.g. age_1, age_2.
knitr::opts_chunk$set(warning = FALSE, error = TRUE, message = FALSE, results = 'asis')
totalpop <- tm_shape(hkpop_cleaned) + tm_polygons("t_pop", textNA = "NA", title = "Total Population", palette = "Greens") + tm_layout(main.title = "Distribution of Population in Hong Kong SAR", main.title.position = "center", main.title.size = 1.5)
totalpop_faceted <- tm_shape(hkpop_cleaned) + tm_fill("t_pop", textNA = "NA", title = "Total Population") + tm_layout(main.title = "Distribution of Population in Hong Kong SAR by District", main.title.position = "center", main.title.size = 1.5) + tm_facets(by = "DISTRICT_E")
tmap_mode("plot")
agepop <- tm_shape(hkpop_cleaned) + tm_polygons(c("age_1","age_2", "age_3", "age_4", "age_5"), n = 5, palette = "Blues") + tm_layout(main.title = "Age distribution in Hong Kong SAR", main.title.position = "center", panel.labels = c("Below 15", "15-24", "25-44","45-64", "Over 65"))
tmap_mode("view")
totalpop
Heavily populated districts:
1. Kwun Tong
Kwun Tong was the first satellite town in the urban area in the early days of Hong Kong and played a dominant role in shaping the territory’s economic development. Today, Kwun Tong continues to be an attractive region conducive to the long-term development of Hong Kong. Numerous housing development projects have been rolled out in Kwun Tong to provide residential facilities for a growing population in the booming business district.
2. Yau Tsim Mong
The Yau Tsim Mong District comprises urban areas and is home to the Hong Kong Polytechnic University. The bustling district is also a major transport intersection, with a recently completed high-speed rail link connecting Hong Kong to Shenzhen and Guangzhou. Home to Hong Kong’s largest shopping malls, cultural institutions and night markets, it is no wonder that the district is densely populated.
3. Wong Tai Sin
Wong Tai Sin is a working-class neighborhood in Kowloon, where 80% of residents live in subsidised housing. The district also has the highest percentage of elderly residents, owing to convenient transportation and a full range of facilities that make the region very livable.
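The map shades individual constituency areas, so a district-level ranking such as the one above is easier to verify by summing the constituency totals per district. A minimal sketch with base R `aggregate()` follows; the toy values are illustrative and are not census figures. On the real data, the input would be `hkpop_cleaned` with its geometry dropped first, for example via `sf::st_drop_geometry()`.

```r
# Toy constituency-level rows (illustrative values, not census figures)
toy <- data.frame(
  DISTRICT_E = c("Kwun Tong", "Kwun Tong", "Sai Kung", "Sai Kung", "Kowloon City"),
  t_pop      = c(17000, 18500, 16000, NA, 20000)
)

# Sum population per district; rows with NA population are dropped
# by the formula interface's default na.action
district_totals <- aggregate(t_pop ~ DISTRICT_E, data = toy, FUN = sum)

# Sort from most to least populated
district_totals <- district_totals[order(-district_totals$t_pop), ]
district_totals
```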
tmap_mode("plot")
totalpop_faceted
Several observations can be noted from the following maps:
1. Children (and naturally, families) are more concentrated around the North region, particularly in the area of Yuen Long.
This can be attributed to the fact that there is more space available in the North, which makes the area appealing to young couples looking for a home in which to start a family. In 2019, a survey by the Hong Kong Baptist University suggested that the North District ranks highest on the city’s Happiness scale, possibly owing to more families and greater green space compared to the city.
2. The working population (aged 25 to 64) is most concentrated in the city center
The Central District is the central business district of Hong Kong, and evidently a large proportion of the working population would be based nearer to where multinational companies have their headquarters. The government headquarters is also located in Central. The proximity of Central to Victoria Harbour allows the region to serve effectively as the centre of trade and financial activities.
3. The older population (aged above 65) dominates in peripheral regions of Hong Kong
Many of Hong Kong’s elderly reside on Lantau Island (in the south-west of the territory). As the largest island in Hong Kong and originally home to fishing villages, Lantau Island is an ideal retirement area for the elderly. Parks cover more than half of the island, and the pace of life here is much slower than in the city. Privately owned residential developments are also located on Lantau Island, giving it a reputation as an expatriate enclave as well.
agepop