[P-03] Tugas Praktikum 3 STA581 SAINS DATA
INSTRUKSI TUGAS
Tugas Praktikum 3.a:
Praktikkan penggunaan
inner_join(),left_join(),right_join(),full_join(),semi_join(),anti_join()untuk menggabungkan 2 table atau lebih dari databasechinook.dbatauNorthwind_large.sqliteTugas Praktikum 3.b:
Buat satu Peta dari daerah Anda masing-masing (Kecamatan, Kabupaten). Anda bebas menggunakan data apapun.
PENGGUNAAN FUNGSI JOIN
Berikut
libraryyang digunakan:
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.4 v dplyr 1.0.3
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(RSQLite)
library(DBI)Database yang akan digunakan pada praktikum ini adalah database
Northwind_large.sqlite. Langkah pertama sebelum mempraktikkan beberapa penggunaan fungsijoinadalah melakukan koneksi ke database yang akan digunakan dan melakukan database queries
#Koneksi ke database yang akan digunakan
NW <- DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/db/Northwind_large.sqlite")
#Melihat tabel-tabel yang terdapat di dalam database
RSQLite::dbListTables(NW)## [1] "Category" "Customer" "CustomerCustomerDemo"
## [4] "CustomerDemographic" "Employee" "EmployeeTerritory"
## [7] "Order" "OrderDetail" "Product"
## [10] "ProductDetails_V" "Region" "Shipper"
## [13] "Supplier" "Territory"
## # Source: table<Supplier> [?? x 12]
## # Database: sqlite 3.34.1 [C:\sqlite\db\Northwind_large.sqlite]
## Id CompanyName ContactName ContactTitle Address City Region PostalCode
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 Exotic Liq~ Charlotte ~ Purchasing ~ 49 Gil~ Lond~ Briti~ EC1 4SD
## 2 2 New Orlean~ Shelley Bu~ Order Admin~ P.O. B~ New ~ North~ 70117
## 3 3 Grandma Ke~ Regina Mur~ Sales Repre~ 707 Ox~ Ann ~ North~ 48104
## 4 4 Tokyo Trad~ Yoshi Naga~ Marketing M~ 9-8 Se~ Tokyo Easte~ 100
## 5 5 Cooperativ~ Antonio de~ Export Admi~ Calle ~ Ovie~ South~ 33007
## 6 6 Mayumi's Mayumi Ohno Marketing R~ 92 Set~ Osaka Easte~ 545
## 7 7 Pavlova, L~ Ian Devling Marketing M~ 74 Ros~ Melb~ Victo~ 3058
## 8 8 Specialty ~ Peter Wils~ Sales Repre~ 29 Kin~ Manc~ Briti~ M14 GSD
## 9 9 PB Knäckeb~ Lars Peter~ Sales Agent Kaload~ Göte~ North~ S-345 67
## 10 10 Refrescos ~ Carlos Diaz Marketing M~ Av. da~ Sao ~ South~ 5442
## # ... with more rows, and 4 more variables: Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
## # Source: table<Employee> [?? x 18]
## # Database: sqlite 3.34.1 [C:\sqlite\db\Northwind_large.sqlite]
## Id LastName FirstName Title TitleOfCourtesy BirthDate HireDate Address
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 Davolio Nancy Sale~ Ms. 1980-12-~ 2024-05~ 507 - ~
## 2 2 Fuller Andrew Vice~ Dr. 1984-02-~ 2024-08~ 908 W.~
## 3 3 Leverli~ Janet Sale~ Ms. 1995-08-~ 2024-04~ 722 Mo~
## 4 4 Peacock Margaret Sale~ Mrs. 1969-09-~ 2025-05~ 4110 O~
## 5 5 Buchanan Steven Sale~ Mr. 1987-03-~ 2025-10~ 14 Gar~
## 6 6 Suyama Michael Sale~ Mr. 1995-07-~ 2025-10~ Covent~
## 7 7 King Robert Sale~ Mr. 1992-05-~ 2026-01~ Edgeha~
## 8 8 Callahan Laura Insi~ Ms. 1990-01-~ 2026-03~ 4726 -~
## 9 9 Dodswor~ Anne Sale~ Ms. 1998-01-~ 2026-11~ 7 Houn~
## # ... with 10 more variables: City <chr>, Region <chr>, PostalCode <chr>,
## # Country <chr>, HomePhone <chr>, Extension <chr>, Photo <blob>, Notes <chr>,
## # ReportsTo <int>, PhotoPath <chr>
Tabel yang akan digunakan adalah tabel
SupplierdanEmployee. Pada tabelSupplierhanya akan digunakan variabelCompanyName,City, danCountry. Sementara, pada tabelEmployeehanya akan digunakan variabelTitledanCity.
## # A tibble: 29 x 3
## CompanyName City Country
## <chr> <chr> <chr>
## 1 Exotic Liquids London UK
## 2 New Orleans Cajun Delights New Orleans USA
## 3 Grandma Kelly's Homestead Ann Arbor USA
## 4 Tokyo Traders Tokyo Japan
## 5 Cooperativa de Quesos 'Las Cabras' Oviedo Spain
## 6 Mayumi's Osaka Japan
## 7 Pavlova, Ltd. Melbourne Australia
## 8 Specialty Biscuits, Ltd. Manchester UK
## 9 PB Knäckebröd AB Göteborg Sweden
## 10 Refrescos Americanas LTDA Sao Paulo Brazil
## # ... with 19 more rows
## # A tibble: 9 x 2
## Title City
## <chr> <chr>
## 1 Sales Representative Seattle
## 2 Vice President, Sales Tacoma
## 3 Sales Representative Kirkland
## 4 Sales Representative Redmond
## 5 Sales Manager London
## 6 Sales Representative London
## 7 Sales Representative London
## 8 Inside Sales Coordinator Seattle
## 9 Sales Representative London
Penggunaan inner_join()
Fungsi
inner_join()akan menghasilkan semua baris pada tabel x yang memiliki kesamaan nilai dengan tabel y, dan semua kolom dari x dan y.
## # A tibble: 4 x 4
## Title City CompanyName Country
## <chr> <chr> <chr> <chr>
## 1 Sales Manager London Exotic Liquids UK
## 2 Sales Representative London Exotic Liquids UK
## 3 Sales Representative London Exotic Liquids UK
## 4 Sales Representative London Exotic Liquids UK
## # A tibble: 4 x 4
## CompanyName City Country Title
## <chr> <chr> <chr> <chr>
## 1 Exotic Liquids London UK Sales Manager
## 2 Exotic Liquids London UK Sales Representative
## 3 Exotic Liquids London UK Sales Representative
## 4 Exotic Liquids London UK Sales Representative
Penggunaan left_join()
Fungsi
left_join()akan menghasilkan semua baris pada tabel x, dan semua kolom pada x dan y, dimana untuk baris pada tabel x yang tidak memiliki kesamaan dengan tabel y akan diisi dengan nilaiNApada kolom yang baru.
## # A tibble: 9 x 4
## Title City CompanyName Country
## <chr> <chr> <chr> <chr>
## 1 Sales Representative Seattle <NA> <NA>
## 2 Vice President, Sales Tacoma <NA> <NA>
## 3 Sales Representative Kirkland <NA> <NA>
## 4 Sales Representative Redmond <NA> <NA>
## 5 Sales Manager London Exotic Liquids UK
## 6 Sales Representative London Exotic Liquids UK
## 7 Sales Representative London Exotic Liquids UK
## 8 Inside Sales Coordinator Seattle <NA> <NA>
## 9 Sales Representative London Exotic Liquids UK
## # A tibble: 32 x 4
## CompanyName City Country Title
## <chr> <chr> <chr> <chr>
## 1 Exotic Liquids London UK Sales Manager
## 2 Exotic Liquids London UK Sales Representative
## 3 Exotic Liquids London UK Sales Representative
## 4 Exotic Liquids London UK Sales Representative
## 5 New Orleans Cajun Delights New Orleans USA <NA>
## 6 Grandma Kelly's Homestead Ann Arbor USA <NA>
## 7 Tokyo Traders Tokyo Japan <NA>
## 8 Cooperativa de Quesos 'Las Cabras' Oviedo Spain <NA>
## 9 Mayumi's Osaka Japan <NA>
## 10 Pavlova, Ltd. Melbourne Australia <NA>
## # ... with 22 more rows
Penggunaan right_join()
Fungsi
right_join()akan menghasilkan semua baris pada tabel y, dan semua kolom pada tabel x dan y, dimana untuk baris pada tabel y yang tidak memiliki kesamaan dengan tabel x akan diisi dengan nilaiNApada kolom yang baru.
## # A tibble: 32 x 4
## Title City CompanyName Country
## <chr> <chr> <chr> <chr>
## 1 Sales Manager London Exotic Liquids UK
## 2 Sales Representative London Exotic Liquids UK
## 3 Sales Representative London Exotic Liquids UK
## 4 Sales Representative London Exotic Liquids UK
## 5 <NA> New Orleans New Orleans Cajun Delights USA
## 6 <NA> Ann Arbor Grandma Kelly's Homestead USA
## 7 <NA> Tokyo Tokyo Traders Japan
## 8 <NA> Oviedo Cooperativa de Quesos 'Las Cabras' Spain
## 9 <NA> Osaka Mayumi's Japan
## 10 <NA> Melbourne Pavlova, Ltd. Australia
## # ... with 22 more rows
## # A tibble: 9 x 4
## CompanyName City Country Title
## <chr> <chr> <chr> <chr>
## 1 Exotic Liquids London UK Sales Manager
## 2 Exotic Liquids London UK Sales Representative
## 3 Exotic Liquids London UK Sales Representative
## 4 Exotic Liquids London UK Sales Representative
## 5 <NA> Seattle <NA> Sales Representative
## 6 <NA> Tacoma <NA> Vice President, Sales
## 7 <NA> Kirkland <NA> Sales Representative
## 8 <NA> Redmond <NA> Sales Representative
## 9 <NA> Seattle <NA> Inside Sales Coordinator
Penggunaan full_join()
Fungsi
full_join()akan menghasilkan semua baris dan kolom dari tabel x dan y. Jika ada nilai yang tidak sama antara tabel x dan y, maka akan bernilaiNA.
## # A tibble: 37 x 4
## Title City CompanyName Country
## <chr> <chr> <chr> <chr>
## 1 Sales Representative Seattle <NA> <NA>
## 2 Vice President, Sales Tacoma <NA> <NA>
## 3 Sales Representative Kirkland <NA> <NA>
## 4 Sales Representative Redmond <NA> <NA>
## 5 Sales Manager London Exotic Liquids UK
## 6 Sales Representative London Exotic Liquids UK
## 7 Sales Representative London Exotic Liquids UK
## 8 Inside Sales Coordinator Seattle <NA> <NA>
## 9 Sales Representative London Exotic Liquids UK
## 10 <NA> New Orleans New Orleans Cajun Delights USA
## # ... with 27 more rows
## # A tibble: 37 x 4
## CompanyName City Country Title
## <chr> <chr> <chr> <chr>
## 1 Exotic Liquids London UK Sales Manager
## 2 Exotic Liquids London UK Sales Representative
## 3 Exotic Liquids London UK Sales Representative
## 4 Exotic Liquids London UK Sales Representative
## 5 New Orleans Cajun Delights New Orleans USA <NA>
## 6 Grandma Kelly's Homestead Ann Arbor USA <NA>
## 7 Tokyo Traders Tokyo Japan <NA>
## 8 Cooperativa de Quesos 'Las Cabras' Oviedo Spain <NA>
## 9 Mayumi's Osaka Japan <NA>
## 10 Pavlova, Ltd. Melbourne Australia <NA>
## # ... with 27 more rows
Penggunaan semi_join()
Fungsi
semi_join()akan menghasilkan semua baris pada tabel x yang memiliki kesamaan nilai dengan tabel y, dan semua kolom dari tabel x. Mirip denganinner_join(), hanya saja padainner_join()mengembalikan semua kolom dari tabel x dan y.
## # A tibble: 4 x 2
## Title City
## <chr> <chr>
## 1 Sales Manager London
## 2 Sales Representative London
## 3 Sales Representative London
## 4 Sales Representative London
## # A tibble: 1 x 3
## CompanyName City Country
## <chr> <chr> <chr>
## 1 Exotic Liquids London UK
Penggunaan anti_join()
Fungsi
anti_join()akan menghasilkan semua baris dari tabel x yang TIDAK memiliki kesamaan dengan tabel y, dan semua kolom yang berasal dari tabel x.
## # A tibble: 5 x 2
## Title City
## <chr> <chr>
## 1 Sales Representative Seattle
## 2 Vice President, Sales Tacoma
## 3 Sales Representative Kirkland
## 4 Sales Representative Redmond
## 5 Inside Sales Coordinator Seattle
## # A tibble: 28 x 3
## CompanyName City Country
## <chr> <chr> <chr>
## 1 New Orleans Cajun Delights New Orleans USA
## 2 Grandma Kelly's Homestead Ann Arbor USA
## 3 Tokyo Traders Tokyo Japan
## 4 Cooperativa de Quesos 'Las Cabras' Oviedo Spain
## 5 Mayumi's Osaka Japan
## 6 Pavlova, Ltd. Melbourne Australia
## 7 Specialty Biscuits, Ltd. Manchester UK
## 8 PB Knäckebröd AB Göteborg Sweden
## 9 Refrescos Americanas LTDA Sao Paulo Brazil
## 10 Heli Süßwaren GmbH & Co. KG Berlin Germany
## # ... with 18 more rows
PEMBUATAN PETA
Berikut
libraryyang digunakan:
library(rgdal)
## Loading required package: sp
## rgdal: version: 1.5-23, (SVN revision 1121)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.2.1, released 2020/12/29
## Path to GDAL shared files: C:/Users/Wiwik Andriyani/Documents/R/win-library/4.0/rgdal/gdal
## GDAL binary built with GEOS: TRUE
## Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
## Path to PROJ shared files: C:/Users/Wiwik Andriyani/Documents/R/win-library/4.0/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.4-5
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading rgdal.
## Overwritten PROJ_LIB was C:/Users/Wiwik Andriyani/Documents/R/win-library/4.0/rgdal/proj
library(sf)
## Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(ggplot2)
library(tigris)
## To enable
## caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
library(dplyr)
library(tidyverse)Membaca Peta
Peta yang digunakan pada praktikum ini adalah peta level Kecamatan.
## Reading layer `idn_admbnda_adm3_bps_20200401' from data source `C:\HumData\idn_admbnda_adm3_bps_20200401.shp' using driver `ESRI Shapefile'
## Simple feature collection with 7069 features and 16 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: 95.01079 ymin: -11.00762 xmax: 141.0194 ymax: 6.07693
## geographic CRS: WGS 84
## Rows: 7,069
## Columns: 17
## $ Shape_Leng <dbl> 0.2798656, 0.7514001, 0.6900061, 0.6483629, 0.2437073, 1...
## $ Shape_Area <dbl> 0.003107633, 0.016925540, 0.024636382, 0.010761277, 0.00...
## $ ADM3_EN <chr> "2 X 11 Enam Lingkung", "2 X 11 Kayu Tanam", "Abab", "Ab...
## $ ADM3_PCODE <chr> "ID1306050", "ID1306052", "ID1612030", "ID5107050", "ID7...
## $ ADM3_REF <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ADM3ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ADM3ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
## $ ADM2_EN <chr> "Padang Pariaman", "Padang Pariaman", "Penukal Abab Lema...
## $ ADM2_PCODE <chr> "ID1306", "ID1306", "ID1612", "ID5107", "ID7471", "ID943...
## $ ADM1_EN <chr> "Sumatera Barat", "Sumatera Barat", "Sumatera Selatan", ...
## $ ADM1_PCODE <chr> "ID13", "ID13", "ID16", "ID51", "ID74", "ID94", "ID94", ...
## $ ADM0_EN <chr> "Indonesia", "Indonesia", "Indonesia", "Indonesia", "Ind...
## $ ADM0_PCODE <chr> "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "I...
## $ date <date> 2019-12-20, 2019-12-20, 2019-12-20, 2019-12-20, 2019-12...
## $ validOn <date> 2020-04-01, 2020-04-01, 2020-04-01, 2020-04-01, 2020-04...
## $ validTo <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((100.2811 -0..., MULTIPOLYGO...
Memasukkan Data
Sementara, untuk datanya, digunakan data jumlah penduduk tahun 2020 yang ada di Kabupaten Lombok Tengah menurut Kecamatan.
# Input Data
Pnddk.Loteng <- read.csv("C:/HumData/jumlah_penduduk_2020.csv")
Pnddk.Loteng.Lombok <- read.csv("C:/HumData/penduduk_loteng_lombok_2020.csv", header=TRUE, sep=";")
# Merge Data
merged.Loteng <- geo_join(spatial_data=adm.Kec,
data_frame=Pnddk.Loteng, by_sp="ADM3_PCODE",
by_df="KodeKecamatan", how = "inner")
merged.Loteng.Lombok <- geo_join(spatial_data=adm.Kec,
data_frame=Pnddk.Loteng.Lombok, by_sp="ADM3_PCODE",
by_df="admin3Pcode", how = "inner")
# Data yang digunakan:
Pnddk.Loteng## KodeKecamatan Kecamatan Penduduk2020
## 1 ID5202010 Praya Barat 82.75
## 2 ID5202011 Praya Barat Daya 57.80
## 3 ID5202020 Pujut 116.83
## 4 ID5202030 Praya Timur 72.14
## 5 ID5202040 Janapria 84.59
## 6 ID5202050 Kopang 92.56
## 7 ID5202060 Praya 125.89
## 8 ID5202061 Praya Tengah 72.66
## 9 ID5202070 Jonggat 106.05
## 10 ID5202080 Pringgarata 75.64
## 11 ID5202090 Batukliang 87.85
## 12 ID5202091 Batukliang Utara 60.10
Menampilkan Spatial Data
Selanjutnya, disajikan dua jenis peta jumlah penduduk Kabupaten Lombok Tengah menurut Kecamatan Tahun 2020, sebagai berikut:
warna <- c("wheat1", "wheat2", "tomato2", "tomato3")
Peta.1 <- ggplot()+
geom_sf(data=merged.Loteng,aes(fill=Penduduk2020))+
scale_fill_gradientn(colours=warna)+
theme(axis.text = element_text(size = 5))+
labs(title="Jumlah Penduduk Kabupaten Lombok Tengah Tahun 2020")
Peta.1Peta.2 <- ggplot()+
geom_sf(data=merged.Loteng.Lombok,aes(fill=Penduduk2020))+
scale_fill_gradientn(colours=warna)+
labs(title="Jumlah Penduduk Kabupaten Lombok Tengah Tahun 2020 di Pulau Lombok")
Peta.2