Part I
In our last chapter we had learnt how to work with tables in R. In this chapter, we will go a notch higher and for the budding spatial scientist, see how we can work with geospatial data in R.
We had last worked with county_sex data. This time round we will load the Kenya shapefile and view it in R.
The Kenya administrative shapefile can be found on this link - https://data.humdata.org/dataset/2c0b7571-4bef-4347-9b81-b2174c13f9ef/resource/03df9cbb-0b4f-4f22-9eb7-3cbd0157fd3d/download/ken_adm_iebc_20191031_shp.zip
Load it and extract it within your directory.
We would like to view the Kenya shapefile in our wonderful R canvas. Before we do so, we would like to call the following packages. Remember from chapter one the install.packages() functions is used to call packages into R. As a very simple exercise, call the following packages - rgdal, sp and sf.
Here is a brief overview of what these packages do from the geniuses who unpack the greek from their descriptions.
rgdal- provides bindings to the Geospatial Data Abstraction Library
sp - Classes and methods for spatial data; the classes document where the spatial location information resides, for 2D or 3D data. Utility functions are provided, e.g. for plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc
sf - simple feature for R, Support for simple features, a standardized way to encode spatial vector data. Binds to ‘GDAL’ for reading and writing data, to ‘GEOS’ for geometrical operations, and to ‘PROJ’ for projection conversions and datum transformations. Uses by default the ‘s2’ package for spherical geometry operations on ellipsoidal (long/lat) coordinates.
Now use library() to load the functions into our R workspace. Remember that these are external packages, and not inbuilt and that is why we need to keep on calling them back to work!
#load the rgdal package
library(rgdal)
## Loading required package: sp
## Please note that rgdal will be retired by the end of 2023,
## plan transition to sf/stars/terra functions using GDAL and PROJ
## at your earliest convenience.
##
## rgdal: version: 1.5-28, (SVN revision 1158)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.2.1, released 2020/12/29
## Path to GDAL shared files: C:/Users/User/Documents/R/win-library/4.1/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
## Path to PROJ shared files: C:/Users/User/Documents/R/win-library/4.1/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.4-6
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
## Overwritten PROJ_LIB was C:/Users/User/Documents/R/win-library/4.1/rgdal/proj
#load sp package
library(sp)
#load sf package
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1; sf_use_s2() is TRUE
Let’s load the Kenya administrative shapefile from our directory. To load shapefiles into R, we use the readOGR function.
#save the shapefile of admin1 (counties) into ke_adm object
ke_adm <- readOGR(dsn = './ken_adm', layer = 'ken_admbnda_adm1_iebc_20191031')
## OGR data source with driver: ESRI Shapefile
## Source: "E:\Documents\R Studio is here\R Studio files projects\gis_coding\ken_adm", layer: "ken_admbnda_adm1_iebc_20191031"
## with 47 features
## It has 12 fields
As you can see, the ke_adm has been added into our R environment. However, taking advantage of the help details, here is a shortcut I discovered.
ke_adm2 <- readOGR(dsn = './ken_adm/ken_admbnda_adm1_iebc_20191031.shp')
## OGR data source with driver: ESRI Shapefile
## Source: "E:\Documents\R Studio is here\R Studio files projects\gis_coding\ken_adm\ken_admbnda_adm1_iebc_20191031.shp", layer: "ken_admbnda_adm1_iebc_20191031"
## with 47 features
## It has 12 fields
Using the dsn = .shp' (note the extension) calls the shapefile without much fuss. You may have also seen that we used the./when calling the file instead of a long method of starting from the very beginning, likeE:/know/better/…. If your file is in the same project folder, you can use./` to get to the sub-directory within your project folder.
#remove ke_adm2 as we already have ke_adm
remove(ke_adm2)
Time to display our shapefile for seeing is believing. For this, we will load the tmap package.
#load tm-map package
library(tmap)
Next, we will use the tm_map tools to draw a map using the ke_adm shapefile. tm_map tools are quite a lot and lengthy to even scratch the surface here, so I would recommend you go through the below code slowly as you make good use of the help menu.
But first, lets read the attribute data of our shapefile to know which attributes to call into the tm_map tools. You will see why.
#read the first 6 attribute values for our ke_adm shapefile which will be used in some tm_map
#functions.
head(ke_adm@data)
## Shape_Leng Shape_Area ADM1_EN ADM1_PCODE ADM1_REF ADM1ALT1EN
## 0 5.932315 0.8847323 Baringo KE030 <NA> <NA>
## 1 2.922220 0.1980989 Bomet KE036 <NA> <NA>
## 2 3.062486 0.2450581 Bungoma KE039 <NA> <NA>
## 3 2.670396 0.1471779 Busia KE040 <NA> <NA>
## 4 3.888933 0.2444318 Elgeyo-Marakwet KE028 <NA> <NA>
## 5 3.386653 0.2294135 Embu KE014 <NA> <NA>
## ADM1ALT2EN ADM0_EN ADM0_PCODE date validOn validTo
## 0 <NA> Kenya KE 2017/11/03 2019/10/31 <NA>
## 1 <NA> Kenya KE 2017/11/03 2019/10/31 <NA>
## 2 <NA> Kenya KE 2017/11/03 2019/10/31 <NA>
## 3 <NA> Kenya KE 2017/11/03 2019/10/31 <NA>
## 4 <NA> Kenya KE 2017/11/03 2019/10/31 <NA>
## 5 <NA> Kenya KE 2017/11/03 2019/10/31 <NA>
You may ask where on earth @data came from. Unlike reading data frames where one will use the dataframe name, when reading attributes of shapefiles in R, @data is appended after the file name. If you would like to read the attributes of a column only, you add the $columnname after the appended @data. For example, after the head(ke_adm@data) has magically shown me the first 6 rows of my shapefile attribues, if I would like to view the attribute names of the ADM1_EN column, all I have to do is type ke_adm@data$ADM1_EN.
tm_map gets into action.
# plot the ke_adm shapefile on R. See help for details on the functions and the arguments used
tm_shape(ke_adm, 'Kenya', projection = '+init=epsg:4326', bbox = ke_adm) +
tm_fill(col = 'ADM1_EN', alpha = 0.5, palette('Okabe-Ito'), legend.show = F) + tm_text('ADM1_EN', size = 'Shape_Area', col = 'steelblue', legend.size.show = F, legend.col.show = F)
## Warning in CPL_crs_from_input(x): GDAL Message 1: +init=epsg:XXXX syntax is
## deprecated. It might return a CRS with a non-EPSG compliant axis order.
## Warning: Number of levels of the variable "ADM1_EN" is 47, which is
## larger than max.categories (which is 30), so levels are combined. Set
## tmap_options(max.categories = 47) in the layer function to show all levels.
Before going any further, is it possible to save a shapefile from the R environment? Yes.
#save the ke_adm shapefile from the R environment but with another name
writeOGR(ke_adm, dsn = './ken_adm', layer = 'ke_admins', driver = 'ESRI Shapefile', overwrite_layer = T)
Lets read the metadata of our Kenya shapefile and see if its projection reads as correctly as what we had specified for it +init=epsg:4326.
library(sp)
library(sf)
#check projection for our shapefile
st_crs(ke_adm)
## Coordinate Reference System:
## User input: WGS 84
## wkt:
## GEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]]
Not sure whether a pre-projected file can be changed to a different projection in tm_shape but you can try yourself.
Lets recall the kenya administrative shapefile layer we wrote into our directory - the ke_admins shapefile layer we had created above using this code: writeOGR(ke_adm, dsn = './ken_adm', layer = 'ke_admins', driver = 'ESRI Shapefile'). We would like to see if files saved from R inherit (or maintain) the same projection of the file in the R environment. Hypothetically, they should.
library(rgdal)
ke_admins <- readOGR(dsn = './ken_adm/ke_admins.shp')
## OGR data source with driver: ESRI Shapefile
## Source: "E:\Documents\R Studio is here\R Studio files projects\gis_coding\ken_adm\ke_admins.shp", layer: "ke_admins"
## with 47 features
## It has 12 fields
st_crs(ke_admins)
## Coordinate Reference System:
## User input: WGS 84
## wkt:
## GEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]]
Gracia! The same projection we had assigned to our ke_adm file was also carried over to our created shapefile of ke_admins.
#we don wanna get confused with too many names of the same ke_ shapefile
rm(ke_admins)
Part II
Let’s do something a bit more exciting. In our last chapter, we had analysed a population by sex and county dataset. To analyse the socioeconomic situation of our country, we would like to analyze the distribution of house assets per county. Unlike other times where we sourced files into R from a file directory in our laptop, this time we will be bold and source them directly from the web. Viruses don’t seem to stand a chance with R, maybe.
#simply put in the url and just like when calling it from the directory, put in quotes(' ')
house_assets <- read.csv('https://open.africa/dataset/9b94fe50-9d75-4b92-be00-6354c6e6cc88/resource/95eb7a07-71dd-48be-8c82-dd321ab329f1/download/percentage-distribution-of-conventional-households-by-ownership-of-selected-household-assets-201.csv', header = T)
You can also source the file from the web by using the following convention - read.csv(url(), header = T).
#get an overview of the house_assets dataframe
head(house_assets)
## County...Sub.County Conventional.Households Stand...alone.Radio
## 1 Kenya 12,043,016 56.9
## 2 Rural 7,379,282 58.5
## 3 Urban 4,663,734 54.4
## 4 MOMBASA 376,295 47.4
## 5 CHANGAMWE 46,439 51.6
## 6 JOMVU 53,214 49.4
## Desk.Top.Computer..Laptop..Tablet Functional.Television Analogue.Television
## 1 8.8 40.7 4.7
## 2 3.0 26.9 3.4
## 3 18.0 62.5 6.7
## 4 13.0 57.2 7.1
## 5 9.6 59.2 6.8
## 6 9.0 53.5 5.8
## Internet Bicycle Motor.Cycle Refrigerator Car
## 1 17.9 15.0 9.2 8.8 6.3
## 2 6.9 15.6 10.8 2.4 3.5
## 3 35.4 13.9 6.7 19.1 10.8
## 4 28.2 12.1 5.5 22.0 6.8
## 5 24.8 10.1 4.9 14.4 3.9
## 6 22.3 8.6 5.3 14.0 4.2
## Truck..Lorry..Bus..Three.Wheeler.truck Tuk.Tuk
## 1 0.9 0.5
## 2 0.7 0.4
## 3 1.2 0.7
## 4 0.9 1.5
## 5 0.8 1.1
## 6 0.7 0.8
If not a citizen of this country, the names under ’County…Sub.County` may seem like they are all belong to one group - Counties or sub-counties depending on which you choose. But they are not. This field is simply a mixture of Counties and the sub counties that make them. For example, Changamwe and Jomvu are sub-counties of Mombasa. Hard to tell if you are not a citizen.
However, not all values under the County_and_Sub_county variable are counties. Furthermore we are more interested in the highest five counties only. To find out the top 5 most populous counties, we will go back to our county_sex_new dataset. To arrange the values in descending order, that is from most populous to least populous counties, we will have to first install the dplyr package.
county_sex2 <- read.csv('county_sex2.csv', header = T)
county_sex_new <- county_sex2[-c(1), ]
rownames(county_sex_new) <- seq(length = nrow(county_sex_new))
county_sex_new
## X name Male Female Intersex Total
## 1 2 Mombasa 610257 598046 30 1208333
## 2 3 Kwale 425121 441681 18 866820
## 3 4 Kilifi 704089 749673 25 1453787
## 4 5 Tana River 158550 157391 2 315943
## 5 6 Lamu 76103 67813 4 143920
## 6 7 Taita-Taveta 173337 167327 7 340671
## 7 8 Garissa 458975 382344 34 841353
## 8 9 Wajir 415374 365840 49 781263
## 9 10 Mandera 434976 432444 37 867457
## 10 11 Marsabit 243548 216219 18 459785
## 11 12 Isiolo 139510 128483 9 268002
## 12 13 Meru 767698 777975 41 1545714
## 13 14 Tharaka-Nithi 193764 199406 7 393177
## 14 15 Embu 304208 304367 24 608599
## 15 16 Kitui 549003 587151 33 1136187
## 16 17 Machakos 710707 711191 34 1421932
## 17 18 Makueni 489691 497942 20 987653
## 18 19 Nyandarua 315022 323247 20 638289
## 19 20 Nyeri 374288 384845 31 759164
## 20 21 Kirinyaga 302011 308369 31 610411
## 21 22 Murang'a 523940 532669 31 1056640
## 22 23 Kiambu 1187146 1230454 135 2417735
## 23 24 Turkana 478087 448868 21 926976
## 24 25 West Pokot 307013 314213 15 621241
## 25 26 Samburu 156774 153546 7 310327
## 26 27 Trans Nzoia 489107 501206 28 990341
## 27 28 Uasin Gishu 580269 582889 28 1163186
## 28 29 Elgeyo-Marakwet 227317 227151 12 454480
## 29 30 Nandi 441259 444430 22 885711
## 30 31 Baringo 336322 330428 13 666763
## 31 32 Laikipia 259440 259102 18 518560
## 32 33 Nakuru 1077272 1084835 95 2162202
## 33 34 Narok 579042 578805 26 1157873
## 34 35 Kajiado 557098 560704 38 1117840
## 35 36 Kericho 450741 451008 28 901777
## 36 37 Bomet 434287 441379 23 875689
## 37 38 Kakamega 897133 970406 40 1867579
## 38 39 Vihiga 283678 306323 12 590013
## 39 40 Bungoma 812146 858389 35 1670570
## 40 41 Busia 426252 467401 28 893681
## 41 42 Siaya 471669 521496 18 993183
## 42 43 Kisumu 560942 594609 23 1155574
## 43 44 Homa Bay 539560 592367 23 1131950
## 44 45 Migori 536187 580214 35 1116436
## 45 46 Kisii 605784 661038 38 1266860
## 46 47 Nyamira 290907 314656 13 605576
## 47 48 Nairobi 2192452 2204376 245 4397073
#load dplyr after installing the package
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
top_five <- arrange(county_sex_new, desc(Total))%>%
slice_max(Total, n = 5)%>%
group_by(name)
top_five
## # A tibble: 5 x 6
## # Groups: name [5]
## X name Male Female Intersex Total
## <int> <chr> <int> <int> <int> <int>
## 1 48 Nairobi 2192452 2204376 245 4397073
## 2 23 Kiambu 1187146 1230454 135 2417735
## 3 33 Nakuru 1077272 1084835 95 2162202
## 4 38 Kakamega 897133 970406 40 1867579
## 5 40 Bungoma 812146 858389 35 1670570
The sign %>% may seem scary to the uninitiated R coder. It sounds like human beings have resorted to using symbospeech like in a novel I once read back with the above writing seeming to be the precursor to this scenario. You can simply interpret the pipe (%>%) sign as ‘then’. That is, after giving a command, like in the above case to arrange the Total variable of county_sex_new in descending order, then (%>%) proceed to….That’s the function of the pipe operator, to proceed the ongoing operation to something else. Think of it like a pipeline transporting crude oil from the storage tank to the refinery. In this case, the county_sex_new with an arranged Total variable will be ‘piped’ to be ‘sliced’. Read about slice in the help menu.
Run it please.
top_five <- arrange(county_sex_new, desc(Total))%>%
slice_max(Total, n = 5)%>%
group_by(name)
top_five
## # A tibble: 5 x 6
## # Groups: name [5]
## X name Male Female Intersex Total
## <int> <chr> <int> <int> <int> <int>
## 1 48 Nairobi 2192452 2204376 245 4397073
## 2 23 Kiambu 1187146 1230454 135 2417735
## 3 33 Nakuru 1077272 1084835 95 2162202
## 4 38 Kakamega 897133 970406 40 1867579
## 5 40 Bungoma 812146 858389 35 1670570
There goes a neat table of our top five most populous counties. In chapter 1, such a method was not provided, and in fact Excel was used to order the counties in terms of their population density, from highest to lowest. See here - https://rpubs.com/sammigachuhi/decodegisplease.
However, in coding, you are required to be a bit more sophisticated.
See more of dplyr here: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
Since we are only interested in the above five counties to display the percentage of total households owning particular assets per county, we will subset our shapefile to the above five counties, and thereafter join these attributes to them.
#to find out the names of all counties under the ADM1_EN variable and see if the names of the
#5 most densely populated counties area also of the same length as in ke_adm
list(ke_adm@data$ADM1_EN)
## [[1]]
## [1] "Baringo" "Bomet" "Bungoma" "Busia"
## [5] "Elgeyo-Marakwet" "Embu" "Garissa" "Homa Bay"
## [9] "Isiolo" "Kajiado" "Kakamega" "Kericho"
## [13] "Kiambu" "Kilifi" "Kirinyaga" "Kisii"
## [17] "Kisumu" "Kitui" "Kwale" "Laikipia"
## [21] "Lamu" "Machakos" "Makueni" "Mandera"
## [25] "Marsabit" "Meru" "Migori" "Mombasa"
## [29] "Murang'a" "Nairobi" "Nandi" "Narok"
## [33] "Nyamira" "Nyandarua" "Nyeri" "Samburu"
## [37] "Siaya" "Taita Taveta" "Tana River" "Tharaka-Nithi"
## [41] "Trans Nzoia" "Turkana" "Uasin Gishu" "Vihiga"
## [45] "Wajir" "West Pokot" "Nakuru"
#using the list above use the correct names to subset ke_adm shapefile to the 5 most densely
#populated counties
adm_subset <- subset(ke_adm, ADM1_EN%in% c('Nairobi', 'Kiambu', 'Nakuru', 'Kakamega', 'Bungoma'))
Remember R is case sensitive and even a mispelling, or (-) appearing where it shouldn’t promote social distancing can bring errors.
But there is a problem. Our house_assets dataframe contains the county and sub county names all in uppercase.
list(house_assets$County...Sub.County[1:7])
## [[1]]
## [1] "Kenya" "Rural" "Urban" "MOMBASA" "CHANGAMWE" "JOMVU"
## [7] "KISAUNI"
Is there a way to capitalize only the first letter and the rest remain lowercase at the touch of a key? Maybe.
*Unlike below, it is preferable you create a new object, like ‘house_assets2’ and then append a new column like `house_assets2\(names <- tolower(house_assets\)County…Sub.County).
#convert all the column values of county and sub counties to lowercase
house_assets$names <- tolower(house_assets$County...Sub.County)
Next, to capitalize the first letter of the lowercase values in the new column of names. To do this, you will have to install the package stringr and load it to R.
library(stringr)
## Warning: package 'stringr' was built under R version 4.1.3
house_assets$admin_name <- str_to_title(house_assets$names)
The names that will work out for our join operation are at the faaaaaaar end under the admin_name column header.
However, we have a pesky problem, house_assets names the capital as ‘Nairobi City’ instead of ‘Nairobi’ as named in the adm_subset layer. The creators sincerely wanted to deny us sleep.
#replace 'Nairobi City' with 'Nairobi'
house_assets[house_assets == 'Nairobi City'] <- 'Nairobi'
#remember adm_subset is the subsetted shapefile
#merge is used to merge the common fields between spatial data adm_subset and dataframe house_assets
county_hs_assets <- merge(adm_subset, house_assets, by.x = 'ADM1_EN', by.y = 'admin_name', duplicateGeoms = T)
#view if county_hs_assets contains the data from house_assets appended to adm_subset
head(county_hs_assets@data)
## ADM1_EN Shape_Leng Shape_Area ADM1_PCODE ADM1_REF ADM1ALT1EN ADM1ALT2EN
## 1 Bungoma 3.062486 0.2450581 KE039 <NA> <NA> <NA>
## 2 Kakamega 4.125304 0.2443814 KE037 <NA> <NA> <NA>
## 3 Kiambu 3.193175 0.2066285 KE022 <NA> <NA> <NA>
## 4 Kiambu 3.193175 0.2066285 KE022 <NA> <NA> <NA>
## 5 Nairobi 1.658061 0.0574956 KE047 <NA> <NA> <NA>
## 6 Nakuru 10.598716 1.2836187 KE032 <NA> <NA> <NA>
## ADM0_EN ADM0_PCODE date validOn validTo County...Sub.County
## 1 Kenya KE 2017/11/03 2019/10/31 <NA> BUNGOMA
## 2 Kenya KE 2017/11/03 2019/10/31 <NA> KAKAMEGA
## 3 Kenya KE 2017/11/03 2019/10/31 <NA> KIAMBU
## 4 Kenya KE 2017/11/03 2019/10/31 <NA> KIAMBU
## 5 Kenya KE 2017/11/03 2019/10/31 <NA> NAIROBI CITY
## 6 Kenya KE 2017/11/03 2019/10/31 <NA> NAKURU
## Conventional.Households Stand...alone.Radio Desk.Top.Computer..Laptop..Tablet
## 1 357,714 62.1 4.1
## 2 432,284 62.0 4.3
## 3 46,816 61.2 28.4
## 4 792,333 61.4 19.6
## 5 1,494,676 53.4 22.7
## 6 598,237 61.1 9.6
## Functional.Television Analogue.Television Internet Bicycle Motor.Cycle
## 1 25.4 3.6 7.2 26.2 11.9
## 2 29.2 4.3 8.6 24.4 11.5
## 3 73.8 7.0 49.7 19.2 5.0
## 4 70.0 6.3 39.1 16.3 6.9
## 5 68.7 7.5 42.1 12.5 4.3
## 6 52.5 5.7 20.9 19.1 10.9
## Refrigerator Car Truck..Lorry..Bus..Three.Wheeler.truck Tuk.Tuk names
## 1 2.8 3.1 0.6 0.4 bungoma
## 2 3.3 3.3 0.6 0.4 kakamega
## 3 32.2 21.1 2.3 0.5 kiambu
## 4 19.6 12.4 1.5 0.5 kiambu
## 5 23.5 12.9 1.2 0.4 nairobi city
## 6 8.9 7.1 1.0 0.4 nakuru
Even R wants to deny us sleep, there are two Kiambu values under the County...Sub.county variable and we are only dealing with counties. Since we don’t have two counties with the name ‘Kiambu’ but only one, definitely one of the two ‘Kiambu’ values is a namesake county. This is something you’ll most likely always run into and which may often eat into your sleeping time, like mine today.
Since we know that we are dealing with statistics at a county level, therefore the row with 792, 333 households is the correct one(the 4th row). Let’s delete the row with the other ‘Kiambu’ value (3rd row) with 46, 816 households within its borders.
#don't forget the comma after the row index!
county_hs_assets@data [-3, ]
## ADM1_EN Shape_Leng Shape_Area ADM1_PCODE ADM1_REF ADM1ALT1EN ADM1ALT2EN
## 1 Bungoma 3.062486 0.2450581 KE039 <NA> <NA> <NA>
## 2 Kakamega 4.125304 0.2443814 KE037 <NA> <NA> <NA>
## 4 Kiambu 3.193175 0.2066285 KE022 <NA> <NA> <NA>
## 5 Nairobi 1.658061 0.0574956 KE047 <NA> <NA> <NA>
## 6 Nakuru 10.598716 1.2836187 KE032 <NA> <NA> <NA>
## ADM0_EN ADM0_PCODE date validOn validTo County...Sub.County
## 1 Kenya KE 2017/11/03 2019/10/31 <NA> BUNGOMA
## 2 Kenya KE 2017/11/03 2019/10/31 <NA> KAKAMEGA
## 4 Kenya KE 2017/11/03 2019/10/31 <NA> KIAMBU
## 5 Kenya KE 2017/11/03 2019/10/31 <NA> NAIROBI CITY
## 6 Kenya KE 2017/11/03 2019/10/31 <NA> NAKURU
## Conventional.Households Stand...alone.Radio Desk.Top.Computer..Laptop..Tablet
## 1 357,714 62.1 4.1
## 2 432,284 62.0 4.3
## 4 792,333 61.4 19.6
## 5 1,494,676 53.4 22.7
## 6 598,237 61.1 9.6
## Functional.Television Analogue.Television Internet Bicycle Motor.Cycle
## 1 25.4 3.6 7.2 26.2 11.9
## 2 29.2 4.3 8.6 24.4 11.5
## 4 70.0 6.3 39.1 16.3 6.9
## 5 68.7 7.5 42.1 12.5 4.3
## 6 52.5 5.7 20.9 19.1 10.9
## Refrigerator Car Truck..Lorry..Bus..Three.Wheeler.truck Tuk.Tuk names
## 1 2.8 3.1 0.6 0.4 bungoma
## 2 3.3 3.3 0.6 0.4 kakamega
## 4 19.6 12.4 1.5 0.5 kiambu
## 5 23.5 12.9 1.2 0.4 nairobi city
## 6 8.9 7.1 1.0 0.4 nakuru
Finally, lets draw a map and call this a day.
library(tmap)
tm_shape(county_hs_assets, name = 'counties', projection = '+init=epsg:4326', bbox = county_hs_assets) + tm_fill(col = 'Internet', alpha = 0.6, palette = 'YlOrRd', title = 'Internet') + tm_text('ADM1_EN', size = 'Internet', col = 'black') + tm_borders(col = 'grey', lwd = 0.3, lty = 'solid', alpha = 0.8) + tm_layout(main.title = 'Household internet access for 5 highest populated counties', title.size = 3, bg.color = 'grey85')
We have handled a fair number of errors in this practical. Time to take a break or break.