[P-03] Tugas Praktikum 3 STA581 SAINS DATA

INSTRUKSI TUGAS

  • Tugas Praktikum 3.a:

    Praktikkan penggunaan inner_join(), left_join(), right_join(), full_join(), semi_join(), anti_join() untuk menggabungkan 2 table atau lebih dari database chinook.db atau Northwind_large.sqlite

  • Tugas Praktikum 3.b:

    Buat satu Peta dari daerah Anda masing-masing (Kecamatan, Kabupaten). Anda bebas menggunakan data apapun.

PENGGUNAAN FUNGSI JOIN

Berikut library yang digunakan:

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2     v purrr   0.3.4
## v tibble  3.0.4     v dplyr   1.0.3
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(RSQLite)
library(DBI)

Database yang akan digunakan pada praktikum ini adalah database Northwind_large.sqlite. Langkah pertama sebelum mempraktikkan beberapa penggunaan fungsi join adalah melakukan koneksi ke database yang akan digunakan dan melakukan database queries

#Koneksi ke database yang akan digunakan
NW <- DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/db/Northwind_large.sqlite")

#Melihat tabel-tabel yang terdapat di dalam database
RSQLite::dbListTables(NW)
##  [1] "Category"             "Customer"             "CustomerCustomerDemo"
##  [4] "CustomerDemographic"  "Employee"             "EmployeeTerritory"   
##  [7] "Order"                "OrderDetail"          "Product"             
## [10] "ProductDetails_V"     "Region"               "Shipper"             
## [13] "Supplier"             "Territory"
#Mengakses tabel pada database
(NW.S <- dplyr::tbl(NW,"Supplier"))
## # Source:   table<Supplier> [?? x 12]
## # Database: sqlite 3.34.1 [C:\sqlite\db\Northwind_large.sqlite]
##       Id CompanyName ContactName ContactTitle Address City  Region PostalCode
##    <int> <chr>       <chr>       <chr>        <chr>   <chr> <chr>  <chr>     
##  1     1 Exotic Liq~ Charlotte ~ Purchasing ~ 49 Gil~ Lond~ Briti~ EC1 4SD   
##  2     2 New Orlean~ Shelley Bu~ Order Admin~ P.O. B~ New ~ North~ 70117     
##  3     3 Grandma Ke~ Regina Mur~ Sales Repre~ 707 Ox~ Ann ~ North~ 48104     
##  4     4 Tokyo Trad~ Yoshi Naga~ Marketing M~ 9-8 Se~ Tokyo Easte~ 100       
##  5     5 Cooperativ~ Antonio de~ Export Admi~ Calle ~ Ovie~ South~ 33007     
##  6     6 Mayumi's    Mayumi Ohno Marketing R~ 92 Set~ Osaka Easte~ 545       
##  7     7 Pavlova, L~ Ian Devling Marketing M~ 74 Ros~ Melb~ Victo~ 3058      
##  8     8 Specialty ~ Peter Wils~ Sales Repre~ 29 Kin~ Manc~ Briti~ M14 GSD   
##  9     9 PB Knäckeb~ Lars Peter~ Sales Agent  Kaload~ Göte~ North~ S-345 67  
## 10    10 Refrescos ~ Carlos Diaz Marketing M~ Av. da~ Sao ~ South~ 5442      
## # ... with more rows, and 4 more variables: Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>
(NW.E <- dplyr::tbl(NW, "Employee"))
## # Source:   table<Employee> [?? x 18]
## # Database: sqlite 3.34.1 [C:\sqlite\db\Northwind_large.sqlite]
##      Id LastName FirstName Title TitleOfCourtesy BirthDate HireDate Address
##   <int> <chr>    <chr>     <chr> <chr>           <chr>     <chr>    <chr>  
## 1     1 Davolio  Nancy     Sale~ Ms.             1980-12-~ 2024-05~ 507 - ~
## 2     2 Fuller   Andrew    Vice~ Dr.             1984-02-~ 2024-08~ 908 W.~
## 3     3 Leverli~ Janet     Sale~ Ms.             1995-08-~ 2024-04~ 722 Mo~
## 4     4 Peacock  Margaret  Sale~ Mrs.            1969-09-~ 2025-05~ 4110 O~
## 5     5 Buchanan Steven    Sale~ Mr.             1987-03-~ 2025-10~ 14 Gar~
## 6     6 Suyama   Michael   Sale~ Mr.             1995-07-~ 2025-10~ Covent~
## 7     7 King     Robert    Sale~ Mr.             1992-05-~ 2026-01~ Edgeha~
## 8     8 Callahan Laura     Insi~ Ms.             1990-01-~ 2026-03~ 4726 -~
## 9     9 Dodswor~ Anne      Sale~ Ms.             1998-01-~ 2026-11~ 7 Houn~
## # ... with 10 more variables: City <chr>, Region <chr>, PostalCode <chr>,
## #   Country <chr>, HomePhone <chr>, Extension <chr>, Photo <blob>, Notes <chr>,
## #   ReportsTo <int>, PhotoPath <chr>

Tabel yang akan digunakan adalah tabel Supplier dan Employee. Pada tabel Supplier hanya akan digunakan variabel CompanyName, City, dan Country. Sementara, pada tabel Employee hanya akan digunakan variabel Title dan City.

#Tabel yang akan digunakan
(S <- tibble (data.frame (NW.S %>% select(CompanyName, City, Country))))
## # A tibble: 29 x 3
##    CompanyName                        City        Country  
##    <chr>                              <chr>       <chr>    
##  1 Exotic Liquids                     London      UK       
##  2 New Orleans Cajun Delights         New Orleans USA      
##  3 Grandma Kelly's Homestead          Ann Arbor   USA      
##  4 Tokyo Traders                      Tokyo       Japan    
##  5 Cooperativa de Quesos 'Las Cabras' Oviedo      Spain    
##  6 Mayumi's                           Osaka       Japan    
##  7 Pavlova, Ltd.                      Melbourne   Australia
##  8 Specialty Biscuits, Ltd.           Manchester  UK       
##  9 PB Knäckebröd AB                   Göteborg    Sweden   
## 10 Refrescos Americanas LTDA          Sao Paulo   Brazil   
## # ... with 19 more rows
(E <- tibble (data.frame(NW.E %>% select(Title, City))))
## # A tibble: 9 x 2
##   Title                    City    
##   <chr>                    <chr>   
## 1 Sales Representative     Seattle 
## 2 Vice President, Sales    Tacoma  
## 3 Sales Representative     Kirkland
## 4 Sales Representative     Redmond 
## 5 Sales Manager            London  
## 6 Sales Representative     London  
## 7 Sales Representative     London  
## 8 Inside Sales Coordinator Seattle 
## 9 Sales Representative     London

Penggunaan inner_join()

Fungsi inner_join() akan menghasilkan semua baris pada tabel x yang memiliki kesamaan nilai dengan tabel y, dan semua kolom dari x dan y.

inner_join(E, S, by="City")
## # A tibble: 4 x 4
##   Title                City   CompanyName    Country
##   <chr>                <chr>  <chr>          <chr>  
## 1 Sales Manager        London Exotic Liquids UK     
## 2 Sales Representative London Exotic Liquids UK     
## 3 Sales Representative London Exotic Liquids UK     
## 4 Sales Representative London Exotic Liquids UK
inner_join(S, E, by="City")
## # A tibble: 4 x 4
##   CompanyName    City   Country Title               
##   <chr>          <chr>  <chr>   <chr>               
## 1 Exotic Liquids London UK      Sales Manager       
## 2 Exotic Liquids London UK      Sales Representative
## 3 Exotic Liquids London UK      Sales Representative
## 4 Exotic Liquids London UK      Sales Representative

Penggunaan left_join()

Fungsi left_join() akan menghasilkan semua baris pada tabel x, dan semua kolom pada x dan y, dimana untuk baris pada tabel x yang tidak memiliki kesamaan dengan tabel y akan diisi dengan nilai NA pada kolom yang baru.

left_join(E, S, by="City")
## # A tibble: 9 x 4
##   Title                    City     CompanyName    Country
##   <chr>                    <chr>    <chr>          <chr>  
## 1 Sales Representative     Seattle  <NA>           <NA>   
## 2 Vice President, Sales    Tacoma   <NA>           <NA>   
## 3 Sales Representative     Kirkland <NA>           <NA>   
## 4 Sales Representative     Redmond  <NA>           <NA>   
## 5 Sales Manager            London   Exotic Liquids UK     
## 6 Sales Representative     London   Exotic Liquids UK     
## 7 Sales Representative     London   Exotic Liquids UK     
## 8 Inside Sales Coordinator Seattle  <NA>           <NA>   
## 9 Sales Representative     London   Exotic Liquids UK
left_join(S, E, by="City")
## # A tibble: 32 x 4
##    CompanyName                        City        Country   Title               
##    <chr>                              <chr>       <chr>     <chr>               
##  1 Exotic Liquids                     London      UK        Sales Manager       
##  2 Exotic Liquids                     London      UK        Sales Representative
##  3 Exotic Liquids                     London      UK        Sales Representative
##  4 Exotic Liquids                     London      UK        Sales Representative
##  5 New Orleans Cajun Delights         New Orleans USA       <NA>                
##  6 Grandma Kelly's Homestead          Ann Arbor   USA       <NA>                
##  7 Tokyo Traders                      Tokyo       Japan     <NA>                
##  8 Cooperativa de Quesos 'Las Cabras' Oviedo      Spain     <NA>                
##  9 Mayumi's                           Osaka       Japan     <NA>                
## 10 Pavlova, Ltd.                      Melbourne   Australia <NA>                
## # ... with 22 more rows

Penggunaan right_join()

Fungsi right_join() akan menghasilkan semua baris pada tabel y, dan semua kolom pada tabel x dan y, dimana untuk baris pada tabel y yang tidak memiliki kesamaan dengan tabel x akan diisi dengan nilai NA pada kolom yang baru.

right_join(E, S, by="City")
## # A tibble: 32 x 4
##    Title                City        CompanyName                        Country  
##    <chr>                <chr>       <chr>                              <chr>    
##  1 Sales Manager        London      Exotic Liquids                     UK       
##  2 Sales Representative London      Exotic Liquids                     UK       
##  3 Sales Representative London      Exotic Liquids                     UK       
##  4 Sales Representative London      Exotic Liquids                     UK       
##  5 <NA>                 New Orleans New Orleans Cajun Delights         USA      
##  6 <NA>                 Ann Arbor   Grandma Kelly's Homestead          USA      
##  7 <NA>                 Tokyo       Tokyo Traders                      Japan    
##  8 <NA>                 Oviedo      Cooperativa de Quesos 'Las Cabras' Spain    
##  9 <NA>                 Osaka       Mayumi's                           Japan    
## 10 <NA>                 Melbourne   Pavlova, Ltd.                      Australia
## # ... with 22 more rows
right_join(S, E, by="City")
## # A tibble: 9 x 4
##   CompanyName    City     Country Title                   
##   <chr>          <chr>    <chr>   <chr>                   
## 1 Exotic Liquids London   UK      Sales Manager           
## 2 Exotic Liquids London   UK      Sales Representative    
## 3 Exotic Liquids London   UK      Sales Representative    
## 4 Exotic Liquids London   UK      Sales Representative    
## 5 <NA>           Seattle  <NA>    Sales Representative    
## 6 <NA>           Tacoma   <NA>    Vice President, Sales   
## 7 <NA>           Kirkland <NA>    Sales Representative    
## 8 <NA>           Redmond  <NA>    Sales Representative    
## 9 <NA>           Seattle  <NA>    Inside Sales Coordinator

Penggunaan full_join()

Fungsi full_join() akan menghasilkan semua baris dan kolom dari tabel x dan y. Jika ada nilai yang tidak sama antara tabel x dan y, maka akan bernilai NA.

full_join(E, S, by="City")
## # A tibble: 37 x 4
##    Title                    City        CompanyName                Country
##    <chr>                    <chr>       <chr>                      <chr>  
##  1 Sales Representative     Seattle     <NA>                       <NA>   
##  2 Vice President, Sales    Tacoma      <NA>                       <NA>   
##  3 Sales Representative     Kirkland    <NA>                       <NA>   
##  4 Sales Representative     Redmond     <NA>                       <NA>   
##  5 Sales Manager            London      Exotic Liquids             UK     
##  6 Sales Representative     London      Exotic Liquids             UK     
##  7 Sales Representative     London      Exotic Liquids             UK     
##  8 Inside Sales Coordinator Seattle     <NA>                       <NA>   
##  9 Sales Representative     London      Exotic Liquids             UK     
## 10 <NA>                     New Orleans New Orleans Cajun Delights USA    
## # ... with 27 more rows
full_join(S, E, by="City")
## # A tibble: 37 x 4
##    CompanyName                        City        Country   Title               
##    <chr>                              <chr>       <chr>     <chr>               
##  1 Exotic Liquids                     London      UK        Sales Manager       
##  2 Exotic Liquids                     London      UK        Sales Representative
##  3 Exotic Liquids                     London      UK        Sales Representative
##  4 Exotic Liquids                     London      UK        Sales Representative
##  5 New Orleans Cajun Delights         New Orleans USA       <NA>                
##  6 Grandma Kelly's Homestead          Ann Arbor   USA       <NA>                
##  7 Tokyo Traders                      Tokyo       Japan     <NA>                
##  8 Cooperativa de Quesos 'Las Cabras' Oviedo      Spain     <NA>                
##  9 Mayumi's                           Osaka       Japan     <NA>                
## 10 Pavlova, Ltd.                      Melbourne   Australia <NA>                
## # ... with 27 more rows

Penggunaan semi_join()

Fungsi semi_join() akan menghasilkan semua baris pada tabel x yang memiliki kesamaan nilai dengan tabel y, dan semua kolom dari tabel x. Mirip dengan inner_join(), hanya saja pada inner_join() mengembalikan semua kolom dari tabel x dan y.

semi_join(E, S, by="City")
## # A tibble: 4 x 2
##   Title                City  
##   <chr>                <chr> 
## 1 Sales Manager        London
## 2 Sales Representative London
## 3 Sales Representative London
## 4 Sales Representative London
semi_join(S, E, by="City")
## # A tibble: 1 x 3
##   CompanyName    City   Country
##   <chr>          <chr>  <chr>  
## 1 Exotic Liquids London UK

Penggunaan anti_join()

Fungsi anti_join() akan menghasilkan semua baris dari tabel x yang TIDAK memiliki kesamaan dengan tabel y, dan semua kolom yang berasal dari tabel x.

anti_join(E, S, by="City")
## # A tibble: 5 x 2
##   Title                    City    
##   <chr>                    <chr>   
## 1 Sales Representative     Seattle 
## 2 Vice President, Sales    Tacoma  
## 3 Sales Representative     Kirkland
## 4 Sales Representative     Redmond 
## 5 Inside Sales Coordinator Seattle
anti_join(S, E, by="City")
## # A tibble: 28 x 3
##    CompanyName                        City        Country  
##    <chr>                              <chr>       <chr>    
##  1 New Orleans Cajun Delights         New Orleans USA      
##  2 Grandma Kelly's Homestead          Ann Arbor   USA      
##  3 Tokyo Traders                      Tokyo       Japan    
##  4 Cooperativa de Quesos 'Las Cabras' Oviedo      Spain    
##  5 Mayumi's                           Osaka       Japan    
##  6 Pavlova, Ltd.                      Melbourne   Australia
##  7 Specialty Biscuits, Ltd.           Manchester  UK       
##  8 PB Knäckebröd AB                   Göteborg    Sweden   
##  9 Refrescos Americanas LTDA          Sao Paulo   Brazil   
## 10 Heli Süßwaren GmbH & Co. KG        Berlin      Germany  
## # ... with 18 more rows

PEMBUATAN PETA

Berikut library yang digunakan:

library(rgdal)
## Loading required package: sp
## rgdal: version: 1.5-23, (SVN revision 1121)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.2.1, released 2020/12/29
## Path to GDAL shared files: C:/Users/Wiwik Andriyani/Documents/R/win-library/4.0/rgdal/gdal
## GDAL binary built with GEOS: TRUE 
## Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
## Path to PROJ shared files: C:/Users/Wiwik Andriyani/Documents/R/win-library/4.0/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.4-5
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
## Overwritten PROJ_LIB was C:/Users/Wiwik Andriyani/Documents/R/win-library/4.0/rgdal/proj
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(ggplot2)
library(tigris)
## To enable 
## caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
library(dplyr)
library(tidyverse)

Membaca Peta

Peta yang digunakan pada praktikum ini adalah peta level Kecamatan.

adm.Kec <- "C:/HumData/idn_admbnda_adm3_bps_20200401.shp"
adm.Kec <- st_read(adm.Kec)
## Reading layer `idn_admbnda_adm3_bps_20200401' from data source `C:\HumData\idn_admbnda_adm3_bps_20200401.shp' using driver `ESRI Shapefile'
## Simple feature collection with 7069 features and 16 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 95.01079 ymin: -11.00762 xmax: 141.0194 ymax: 6.07693
## geographic CRS: WGS 84
glimpse(adm.Kec)
## Rows: 7,069
## Columns: 17
## $ Shape_Leng <dbl> 0.2798656, 0.7514001, 0.6900061, 0.6483629, 0.2437073, 1...
## $ Shape_Area <dbl> 0.003107633, 0.016925540, 0.024636382, 0.010761277, 0.00...
## $ ADM3_EN    <chr> "2 X 11 Enam Lingkung", "2 X 11 Kayu Tanam", "Abab", "Ab...
## $ ADM3_PCODE <chr> "ID1306050", "ID1306052", "ID1612030", "ID5107050", "ID7...
## $ ADM3_REF   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ADM3ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ADM3ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ADM2_EN    <chr> "Padang Pariaman", "Padang Pariaman", "Penukal Abab Lema...
## $ ADM2_PCODE <chr> "ID1306", "ID1306", "ID1612", "ID5107", "ID7471", "ID943...
## $ ADM1_EN    <chr> "Sumatera Barat", "Sumatera Barat", "Sumatera Selatan", ...
## $ ADM1_PCODE <chr> "ID13", "ID13", "ID16", "ID51", "ID74", "ID94", "ID94", ...
## $ ADM0_EN    <chr> "Indonesia", "Indonesia", "Indonesia", "Indonesia", "Ind...
## $ ADM0_PCODE <chr> "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "I...
## $ date       <date> 2019-12-20, 2019-12-20, 2019-12-20, 2019-12-20, 2019-12...
## $ validOn    <date> 2020-04-01, 2020-04-01, 2020-04-01, 2020-04-01, 2020-04...
## $ validTo    <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((100.2811 -0..., MULTIPOLYGO...

Memasukkan Data

Sementara, untuk datanya, digunakan data jumlah penduduk tahun 2020 yang ada di Kabupaten Lombok Tengah menurut Kecamatan.

# Input Data
Pnddk.Loteng <- read.csv("C:/HumData/jumlah_penduduk_2020.csv")
Pnddk.Loteng.Lombok <- read.csv("C:/HumData/penduduk_loteng_lombok_2020.csv", header=TRUE, sep=";")

# Merge Data
merged.Loteng <- geo_join(spatial_data=adm.Kec,
                          data_frame=Pnddk.Loteng, by_sp="ADM3_PCODE",
                          by_df="KodeKecamatan", how = "inner")
merged.Loteng.Lombok <- geo_join(spatial_data=adm.Kec,
                          data_frame=Pnddk.Loteng.Lombok, by_sp="ADM3_PCODE",
                          by_df="admin3Pcode", how = "inner")

# Data yang digunakan:
Pnddk.Loteng
##    KodeKecamatan        Kecamatan Penduduk2020
## 1      ID5202010      Praya Barat        82.75
## 2      ID5202011 Praya Barat Daya        57.80
## 3      ID5202020            Pujut       116.83
## 4      ID5202030      Praya Timur        72.14
## 5      ID5202040         Janapria        84.59
## 6      ID5202050           Kopang        92.56
## 7      ID5202060            Praya       125.89
## 8      ID5202061     Praya Tengah        72.66
## 9      ID5202070          Jonggat       106.05
## 10     ID5202080      Pringgarata        75.64
## 11     ID5202090       Batukliang        87.85
## 12     ID5202091 Batukliang Utara        60.10

Menampilkan Spatial Data

Selanjutnya, disajikan dua jenis peta jumlah penduduk Kabupaten Lombok Tengah menurut Kecamatan Tahun 2020, sebagai berikut:

warna <- c("wheat1", "wheat2", "tomato2", "tomato3")

Peta.1 <- ggplot()+
  geom_sf(data=merged.Loteng,aes(fill=Penduduk2020))+
  scale_fill_gradientn(colours=warna)+
  theme(axis.text = element_text(size = 5))+
  labs(title="Jumlah Penduduk Kabupaten Lombok Tengah Tahun 2020")
Peta.1

Peta.2 <- ggplot()+
  geom_sf(data=merged.Loteng.Lombok,aes(fill=Penduduk2020))+
  scale_fill_gradientn(colours=warna)+
  labs(title="Jumlah Penduduk Kabupaten Lombok Tengah Tahun 2020 di Pulau Lombok")
Peta.2