Sains Data (Database Queries with R & Spatial Data)
Tugas Praktikum 3a: Database Queries with R
Combining Multiple Tables with packages dplyr in R
Persiapan
- Penginstallan packages
RSQLitedanDBIuntuk menghubungan R ke sistem Database
# install.packages(c("RSQLite","DBI"), dependecies=True)- Pemanggilan package
dplyr,dbplyrdan/atautidyversesertaRSQLitedanDBI
library(tidyverse)## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.4 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 2.0.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(RSQLite)
library(DBI)- Melakukan koneksi terhadap database pada R
Northwind <-DBI::dbConnect(RSQLite::SQLite(),"C:/sqlite/db/Northwind_large.sqlite")class(Northwind)## [1] "SQLiteConnection"
## attr(,"package")
## [1] "RSQLite"
- Melihat daftar tabel yang terdapat pada database chinook.db
dbListTables(Northwind)## [1] "Category" "Customer" "CustomerCustomerDemo"
## [4] "CustomerDemographic" "Employee" "EmployeeTerritory"
## [7] "Order" "OrderDetail" "Product"
## [10] "ProductDetails_V" "Region" "Shipper"
## [13] "Supplier" "Territory"
Menggabungkan Beberapa Table dengan dplyr di R
Akses beberapa table untuk dilakukan penggabungan tabel
Producttable
produk <- tbl(Northwind, "Product")
glimpse(produk)## Rows: ??
## Columns: 10
## Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## $ Id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,~
## $ ProductName <chr> "Chai", "Chang", "Aniseed Syrup", "Chef Anton's Cajun ~
## $ SupplierId <int> 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 6, 7, 7, 7, ~
## $ CategoryId <int> 1, 1, 2, 2, 2, 2, 7, 2, 6, 8, 4, 4, 8, 7, 2, 3, 6, 8, ~
## $ QuantityPerUnit <chr> "10 boxes x 20 bags", "24 - 12 oz bottles", "12 - 550 ~
## $ UnitPrice <dbl> 18.00, 19.00, 10.00, 22.00, 21.35, 25.00, 30.00, 40.00~
## $ UnitsInStock <int> 39, 17, 13, 53, 0, 120, 15, 6, 29, 31, 22, 86, 24, 35,~
## $ UnitsOnOrder <int> 0, 40, 70, 0, 0, 0, 0, 0, 0, 0, 30, 0, 0, 0, 0, 0, 0, ~
## $ ReorderLevel <int> 10, 25, 25, 0, 0, 25, 10, 0, 0, 0, 30, 0, 5, 0, 5, 10,~
## $ Discontinued <int> 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, ~
Order Detailtable
supplier <- tbl(Northwind, "Supplier")
glimpse(supplier)## Rows: ??
## Columns: 12
## Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## $ Id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17~
## $ CompanyName <chr> "Exotic Liquids", "New Orleans Cajun Delights", "Grandma ~
## $ ContactName <chr> "Charlotte Cooper", "Shelley Burke", "Regina Murphy", "Yo~
## $ ContactTitle <chr> "Purchasing Manager", "Order Administrator", "Sales Repre~
## $ Address <chr> "49 Gilbert St.", "P.O. Box 78934", "707 Oxford Rd.", "9-~
## $ City <chr> "London", "New Orleans", "Ann Arbor", "Tokyo", "Oviedo", ~
## $ Region <chr> "British Isles", "North America", "North America", "Easte~
## $ PostalCode <chr> "EC1 4SD", "70117", "48104", "100", "33007", "545", "3058~
## $ Country <chr> "UK", "USA", "USA", "Japan", "Spain", "Japan", "Australia~
## $ Phone <chr> "(171) 555-2222", "(100) 555-4822", "(313) 555-5735", "(0~
## $ Fax <chr> NA, NA, "(313) 555-3349", NA, NA, NA, "(03) 444-6588", NA~
## $ HomePage <chr> NA, "#CAJUN.HTM#", NA, NA, NA, "Mayumi's (on the World Wi~
inner_join
Melakukan penggabungan tabel Product dan Supplier dengan metode inner_join, dengan key antar table adalah Id
1. Menggunakan syntax R
p1 <- inner_join(produk, supplier, by="Id")
p1## # Source: lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
2. Menggunakan syntax SQL
p2 <- tbl(Northwind, sql
("SELECT * FROM Product
INNER JOIN Supplier
ON Product.Id = Supplier.Id"))
p2## # Source: SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 16 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>, Id:1 <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
left_join
Melakukan penggabungan tabel Product dan Supplier dengan metode left_join, dengan key antar table adalah Id
1. Menggunakan syntax R
q1 <- left_join(produk, supplier, by="Id")
q1## # Source: lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
2. Menggunakan syntax SQL
show_query(q1)## <SQL>
## SELECT `LHS`.`Id` AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Product` AS `LHS`
## LEFT JOIN `Supplier` AS `RHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
syntax di atas akan sama dengan synax berikut
q2 <- tbl(Northwind, sql
("SELECT * FROM Product AS pd
LEFT JOIN Supplier AS sp
ON pd.Id = sp.Id"))
q2## # Source: SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 16 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>, Id:1 <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
right_join
Melakukan penggabungan tabel Product dan Supplier dengan metode right_join, dengan key antar table adalahId`
1. Menggunakan syntax R
r1 <- right_join(produk, supplier, by="Id")
r1## # Source: lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
2. Menggunakan syntax SQL
Perlu diperhatikan bahwa right_join tidak didukung oleh fungsi tbl sehinggga akan muncul peringatan error.
# tbl(Northwind, sql ("SELECT * FROM Product RIGHT JOIN Supplier ON Product.Id = Supplier.Id"))Untuk menangani masalah di atas, dapat di atasi dengan menggunakan perintah left_join namun dengan urutan tabel yang berbeda dari kasus sebelumnya (dibalik).
show_query(r1)## <SQL>
## SELECT `RHS`.`Id` AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Supplier` AS `RHS`
## LEFT JOIN `Product` AS `LHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
r2 <- tbl(Northwind, sql
("SELECT * FROM Supplier
LEFT JOIN Product
ON Product.Id = Supplier.Id"))
r2## # Source: SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id CompanyName ContactName ContactTitle Address City Region PostalCode
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 Exotic Liqu~ Charlotte C~ Purchasing M~ 49 Gil~ Lond~ Briti~ EC1 4SD
## 2 2 New Orleans~ Shelley Bur~ Order Admini~ P.O. B~ New ~ North~ 70117
## 3 3 Grandma Kel~ Regina Murp~ Sales Repres~ 707 Ox~ Ann ~ North~ 48104
## 4 4 Tokyo Trade~ Yoshi Nagase Marketing Ma~ 9-8 Se~ Tokyo Easte~ 100
## 5 5 Cooperativa~ Antonio del~ Export Admin~ Calle ~ Ovie~ South~ 33007
## 6 6 Mayumi's Mayumi Ohno Marketing Re~ 92 Set~ Osaka Easte~ 545
## 7 7 Pavlova, Lt~ Ian Devling Marketing Ma~ 74 Ros~ Melb~ Victo~ 3058
## 8 8 Specialty B~ Peter Wilson Sales Repres~ 29 Kin~ Manc~ Briti~ M14 GSD
## 9 9 PB Knäckebr~ Lars Peters~ Sales Agent Kaload~ Göte~ North~ S-345 67
## 10 10 Refrescos A~ Carlos Diaz Marketing Ma~ Av. da~ Sao ~ South~ 5442
## # ... with more rows, and 14 more variables: Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>, Id:1 <int>, ProductName <chr>, SupplierId <int>,
## # CategoryId <int>, QuantityPerUnit <chr>, UnitPrice <dbl>,
## # UnitsInStock <int>, UnitsOnOrder <int>, ReorderLevel <int>,
## # Discontinued <int>
full_join
Melakukan penggabungan tabel Product dan Supplier dengan metode full_join, dengan key antar table adalahId`
1. Menggunakan syntax R
s1 <- full_join(produk, supplier, by="Id")
s1## # Source: lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
2. Menggunakan syntax SQL
show_query(s1)## <SQL>
## SELECT `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM (SELECT COALESCE(`LHS`.`Id`, `RHS`.`Id`) AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Product` AS `LHS`
## LEFT JOIN `Supplier` AS `RHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
## UNION
## SELECT COALESCE(`LHS`.`Id`, `RHS`.`Id`) AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Supplier` AS `RHS`
## LEFT JOIN `Product` AS `LHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
## ) AS `q01`
syntax di atas dapat di tulis sebagai berikut.
s2 <- tbl(Northwind, sql
("SELECT * FROM Product
LEFT JOIN Supplier ON Product.Id = Supplier.Id
UNION
SELECT * FROM Product
LEFT JOIN Supplier ON Product.Id = Supplier.Id"))
s2## # Source: SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 16 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>, Id:1 <int>,
## # CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## # City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## # Fax <chr>, HomePage <chr>
semi_join
Melakukan penggabungan tabel Product dan Supplier dengan metode semi_join, dengan key antar table adalahId`
1. Menggunakan syntax R
t1 <- semi_join(produk, supplier, by="Id")
t1## # Source: lazy query [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>
2. Menggunakan syntax SQL
show_query(t1)## <SQL>
## SELECT * FROM `Product` AS `LHS`
## WHERE EXISTS (
## SELECT 1 FROM `Supplier` AS `RHS`
## WHERE (`LHS`.`Id` = `RHS`.`Id`)
## )
syntax di atas dapat di tulis sebagai berikut.
t2 <- tbl(Northwind, sql
("SELECT * FROM Product
WHERE EXISTS(SELECT 1 FROM Supplier
WHERE(Product.Id = Supplier.Id))"))
t2## # Source: SQL [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>
anti_join
Melakukan penggabungan tabel Product dan Supplier dengan metode anti_join, dengan key antar table adalahId`
1. Menggunakan syntax R
u1 <- anti_join(produk, supplier, by="Id")
u1## # Source: lazy query [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 30 Nord-Ost Matjeshering 13 8 10 - 200 g glass~ 25.9
## 2 31 Gorgonzola Telino 14 4 12 - 100 g pkgs 12.5
## 3 32 Mascarpone Fabioli 14 4 24 - 200 g pkgs. 32
## 4 33 Geitost 15 4 500 g 2.5
## 5 34 Sasquatch Ale 16 1 24 - 12 oz bottl~ 14
## 6 35 Steeleye Stout 16 1 24 - 12 oz bottl~ 18
## 7 36 Inlagd Sill 17 8 24 - 250 g jars 19
## 8 37 Gravad lax 17 8 12 - 500 g pkgs. 26
## 9 38 Côte de Blaye 18 1 12 - 75 cl bottl~ 264.
## 10 39 Chartreuse verte 18 1 750 cc per bottle 18
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>
2. Menggunakan syntax SQL
show_query(u1)## <SQL>
## SELECT * FROM `Product` AS `LHS`
## WHERE NOT EXISTS (
## SELECT 1 FROM `Supplier` AS `RHS`
## WHERE (`LHS`.`Id` = `RHS`.`Id`)
## )
syntax di atas dapat di tulis sebagai berikut.
u2 <- tbl(Northwind, sql
("SELECT * FROM Product
WHERE NOT EXISTS(SELECT 1 FROM Supplier
WHERE(Product.Id = Supplier.Id))"))
u2## # Source: SQL [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 30 Nord-Ost Matjeshering 13 8 10 - 200 g glass~ 25.9
## 2 31 Gorgonzola Telino 14 4 12 - 100 g pkgs 12.5
## 3 32 Mascarpone Fabioli 14 4 24 - 200 g pkgs. 32
## 4 33 Geitost 15 4 500 g 2.5
## 5 34 Sasquatch Ale 16 1 24 - 12 oz bottl~ 14
## 6 35 Steeleye Stout 16 1 24 - 12 oz bottl~ 18
## 7 36 Inlagd Sill 17 8 24 - 250 g jars 19
## 8 37 Gravad lax 17 8 12 - 500 g pkgs. 26
## 9 38 Côte de Blaye 18 1 12 - 75 cl bottl~ 264.
## 10 39 Chartreuse verte 18 1 750 cc per bottle 18
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>
Tugas Praktikum 3b: Spatial Data
Membuat suatu peta dari daerah masing-masing (Kecamatan, Kabupaten) apapun.
Data yang digunakan pada praktikum ini adalah sebagai berikut:
Data Indonesia-Subnational Administrative Boundaries yang dapat diakses di
HumData.Data Prodeskel (Profil Desa Kelurahan) Kabupaten Blitar khususnya Produksi Buah di per Desember 2020 yang dapat diakses di
ProduksiBuah.
Memasukkan Data .shp (Spatial Data)
Untuk memasukkan data .shp (Spatial Data) dapat dilakukan dengan mengakses file .shp yang telah di download
library(sf)## Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
Admin3Kecamatan <- "idn_admbnda_adm3_bps_20200401.shp"
# mengubah dengan fungsi `st_read`
Admin3 <- st_read(Admin3Kecamatan)## Reading layer `idn_admbnda_adm3_bps_20200401' from data source
## `C:\Users\LENOVO\Documents\R\Tugas Peta Kab\idn_admbnda_adm3_bps_20200401.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 7069 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 95.01079 ymin: -11.00762 xmax: 141.0194 ymax: 6.07693
## Geodetic CRS: WGS 84
Mengetahui isi file Admin3 .shp
glimpse(Admin3)## Rows: 7,069
## Columns: 17
## $ Shape_Leng <dbl> 0.2798656, 0.7514001, 0.6900061, 0.6483629, 0.2437073, 1.35~
## $ Shape_Area <dbl> 0.003107633, 0.016925540, 0.024636382, 0.010761277, 0.00116~
## $ ADM3_EN <chr> "2 X 11 Enam Lingkung", "2 X 11 Kayu Tanam", "Abab", "Abang~
## $ ADM3_PCODE <chr> "ID1306050", "ID1306052", "ID1612030", "ID5107050", "ID7471~
## $ ADM3_REF <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM2_EN <chr> "Padang Pariaman", "Padang Pariaman", "Penukal Abab Lematan~
## $ ADM2_PCODE <chr> "ID1306", "ID1306", "ID1612", "ID5107", "ID7471", "ID9432",~
## $ ADM1_EN <chr> "Sumatera Barat", "Sumatera Barat", "Sumatera Selatan", "Ba~
## $ ADM1_PCODE <chr> "ID13", "ID13", "ID16", "ID51", "ID74", "ID94", "ID94", "ID~
## $ ADM0_EN <chr> "Indonesia", "Indonesia", "Indonesia", "Indonesia", "Indone~
## $ ADM0_PCODE <chr> "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID",~
## $ date <date> 2019-12-20, 2019-12-20, 2019-12-20, 2019-12-20, 2019-12-20~
## $ validOn <date> 2020-04-01, 2020-04-01, 2020-04-01, 2020-04-01, 2020-04-01~
## $ validTo <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((100.2811 -0..., MULTIPOLYGON (~
Memasukkan Data .dbf atau dBase file via File .csv (Data Frame)
Pada langkah ini, yang akan digunakan adalah file dbf atau dBase yang selanjutnya dapat dibukan pada Microsoft Excel. File yang saya gunakan bernama: idn_admbnda_adm3_bps_20200401.dbf
Selanjutnya, di dalam file tersebut ditambahkan beberapa kolom sebagai Data Kuantitatif yang akan menentukan gradasi warna pada peta. Kolom ini saya isi dengan data dari dataset Prodeskel Kabupaten Blitar. File ini selanjutnya disimpan dengan ekstensi .csv. Nama file idn_admbnda_adm3_bps_20200401_blitar.csv berisi file yang telah disesuaikan.
KabBlitar <-read.csv("idn_admbnda_adm3_bps_20200401_blitar.csv", sep=";")
glimpse(KabBlitar)## Rows: 22
## Columns: 20
## $ Shape_Leng <chr> "0,477549495", "0,439669207", "0,650696187", "0,667~
## $ Shape_Area <chr> "0,008462566", "0,006455181", "0,008015827", "0,011~
## $ ADM3_EN <chr> "Bakung", "Binangun", "Doko", "Gandusari", "Garum",~
## $ ADM3_PCODE <chr> "ID3505010", "ID3505050", "ID3505130", "ID3505150",~
## $ ADM3_REF <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT1EN <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT2EN <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM2_EN <chr> "Blitar", "Blitar", "Blitar", "Blitar", "Blitar", "~
## $ ADM2_PCODE <chr> "ID3505", "ID3505", "ID3505", "ID3505", "ID3505", "~
## $ ADM1_EN <chr> "Jawa Timur", "Jawa Timur", "Jawa Timur", "Jawa Tim~
## $ ADM1_PCODE <chr> "ID35", "ID35", "ID35", "ID35", "ID35", "ID35", "ID~
## $ ADM0_EN <chr> "Indonesia", "Indonesia", "Indonesia", "Indonesia",~
## $ ADM0_PCODE <chr> "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID~
## $ date <chr> "20/12/2019", "20/12/2019", "20/12/2019", "20/12/20~
## $ validOn <chr> "01/04/2020", "01/04/2020", "01/04/2020", "01/04/20~
## $ jumlah_prod_buah <int> 571, 357, 465, 913, 574, 480, 211957, 225, 6555, 40~
## $ jumlah_luas_lahan <int> 88, 66, 131, 259, 72, 107, 1135, 66, 1044, 62, 1429~
## $ prod_buah <int> 1, 1, 1, 4, 4, 14, 6, 1, 6, 1, 14, 4, 3, 1, 13, 1, ~
## $ jumlah_prod_unggas <int> 198120, 203514, 223028, 218600, 210295, 250045, 321~
## $ prod_unggas <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 6, 4, 13, 8, 6, 14~
Merge Data .shp & .csv
Merge Data dilakukan dengan membandingkan primary key antara Spatial Data (Admin3 hasil import file .shp) dan DataFrame (KabBlitar hasil import file .csv). Proses ini menggunakan fungsi geo_join() dari package tigris
library(tigris)## To enable
## caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
merged.kab.blitar <- geo_join(spatial_data=Admin3,
data_frame=KabBlitar, by_sp="ADM3_PCODE",
by_df="ADM3_PCODE", how = "inner")## Warning: We recommend using the dplyr::*_join() family of functions instead.
Visualisasi
- Visualisasi Jumlah Produksi Buah di Kabupaten Blitar
library(ggplot2)
# Pengaturan Warna
my.col <- c("darkolivegreen2","darkolivegreen3", "darkolivegreen4")
# Plot Data
pDATA<-ggplot()+
geom_sf(data=merged.kab.blitar,aes(fill=prod_buah))+
scale_fill_gradientn(colours=my.col)+
labs(title = ~ atop(paste('Prodeskel - ',("Produksi Buah di Kab Blitar")),
paste(scriptstyle("Per Desember 2020"))),
fill="Percentage")+
theme(plot.title=element_text(hjust=0.5))
pDATA Kawasan yang tidak berwarna (seolah berlubang) merupakan daerah Kota Blitar sehingga tidak termasuk dalam visualisasi
- Visualisasi Luas Lahan (HA) Produksi Buah di Kabupaten Blitar
# Pengaturan Warna
my.col2 <- c("orange2","orange3", "orange4")
# Plot Data
pDATA2 <-ggplot()+
geom_sf(data=merged.kab.blitar,aes(fill=jumlah_luas_lahan))+
scale_fill_gradientn(colours=my.col2)+
labs(title = ~ atop(paste('Prodeskel - ',(" Luas Lahan Produksi Buah di Kab Blitar")),
paste(scriptstyle("Per Desember 2020"))),
fill="Luas Daerah (HA)")+
theme(plot.title=element_text(hjust=0.5))
pDATA2 Kawasan yang tidak berwarna (seolah berlubang) merupakan daerah Kota Blitar sehingga tidak termasuk dalam visualisasi
Referensi
Nurussadad, Abdul Aziz. Day 3 of #30DayMapChallenge: Polygons.11 Maret 2020. https://rpubs.com/nurussadad/Polygons-30DMC
Anisa, Rahma.Pengenalan R untuk Data Spatial. 10 Februari 2021. https://rpubs.com/r_anisa/Pengenalan-R-untuk-Data-Spasial
Mubarok, Muhammad Ilham. Peta Tematik dengan R. 14 April 2020. https://muhammadilhammubarok.wordpress.com/2020/04/14/peta-tematik-dengan-r/