Sains Data (Database Queries with R & Spatial Data)

Tugas Praktikum 3a: Database Queries with R

Combining Multiple Tables with packages dplyr in R

Persiapan

  1. Penginstallan packages RSQLite dan DBI untuk menghubungan R ke sistem Database
# install.packages(c("RSQLite","DBI"), dependecies=True)
  1. Pemanggilan package dplyr, dbplyr dan/atau tidyverse serta RSQLite dan DBI
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.4     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   2.0.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(RSQLite)
library(DBI)
  1. Melakukan koneksi terhadap database pada R
Northwind <-DBI::dbConnect(RSQLite::SQLite(),"C:/sqlite/db/Northwind_large.sqlite")
class(Northwind)
## [1] "SQLiteConnection"
## attr(,"package")
## [1] "RSQLite"
  1. Melihat daftar tabel yang terdapat pada database chinook.db
dbListTables(Northwind)
##  [1] "Category"             "Customer"             "CustomerCustomerDemo"
##  [4] "CustomerDemographic"  "Employee"             "EmployeeTerritory"   
##  [7] "Order"                "OrderDetail"          "Product"             
## [10] "ProductDetails_V"     "Region"               "Shipper"             
## [13] "Supplier"             "Territory"

Menggabungkan Beberapa Table dengan dplyr di R

Akses beberapa table untuk dilakukan penggabungan tabel

  1. Product table
produk <- tbl(Northwind, "Product")
glimpse(produk)
## Rows: ??
## Columns: 10
## Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## $ Id              <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,~
## $ ProductName     <chr> "Chai", "Chang", "Aniseed Syrup", "Chef Anton's Cajun ~
## $ SupplierId      <int> 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6, 6, 7, 7, 7, ~
## $ CategoryId      <int> 1, 1, 2, 2, 2, 2, 7, 2, 6, 8, 4, 4, 8, 7, 2, 3, 6, 8, ~
## $ QuantityPerUnit <chr> "10 boxes x 20 bags", "24 - 12 oz bottles", "12 - 550 ~
## $ UnitPrice       <dbl> 18.00, 19.00, 10.00, 22.00, 21.35, 25.00, 30.00, 40.00~
## $ UnitsInStock    <int> 39, 17, 13, 53, 0, 120, 15, 6, 29, 31, 22, 86, 24, 35,~
## $ UnitsOnOrder    <int> 0, 40, 70, 0, 0, 0, 0, 0, 0, 0, 30, 0, 0, 0, 0, 0, 0, ~
## $ ReorderLevel    <int> 10, 25, 25, 0, 0, 25, 10, 0, 0, 0, 30, 0, 5, 0, 5, 10,~
## $ Discontinued    <int> 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, ~
  1. Order Detail table
supplier <- tbl(Northwind, "Supplier")
glimpse(supplier)
## Rows: ??
## Columns: 12
## Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
## $ Id           <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17~
## $ CompanyName  <chr> "Exotic Liquids", "New Orleans Cajun Delights", "Grandma ~
## $ ContactName  <chr> "Charlotte Cooper", "Shelley Burke", "Regina Murphy", "Yo~
## $ ContactTitle <chr> "Purchasing Manager", "Order Administrator", "Sales Repre~
## $ Address      <chr> "49 Gilbert St.", "P.O. Box 78934", "707 Oxford Rd.", "9-~
## $ City         <chr> "London", "New Orleans", "Ann Arbor", "Tokyo", "Oviedo", ~
## $ Region       <chr> "British Isles", "North America", "North America", "Easte~
## $ PostalCode   <chr> "EC1 4SD", "70117", "48104", "100", "33007", "545", "3058~
## $ Country      <chr> "UK", "USA", "USA", "Japan", "Spain", "Japan", "Australia~
## $ Phone        <chr> "(171) 555-2222", "(100) 555-4822", "(313) 555-5735", "(0~
## $ Fax          <chr> NA, NA, "(313) 555-3349", NA, NA, NA, "(03) 444-6588", NA~
## $ HomePage     <chr> NA, "#CAJUN.HTM#", NA, NA, NA, "Mayumi's (on the World Wi~

inner_join

Melakukan penggabungan tabel Product dan Supplier dengan metode inner_join, dengan key antar table adalah Id

1. Menggunakan syntax R

p1 <- inner_join(produk, supplier, by="Id")
p1
## # Source:   lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

2. Menggunakan syntax SQL

p2 <- tbl(Northwind, sql
      ("SELECT * FROM Product
        INNER JOIN Supplier
        ON Product.Id = Supplier.Id"))
p2
## # Source:   SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 16 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>, Id:1 <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

left_join

Melakukan penggabungan tabel Product dan Supplier dengan metode left_join, dengan key antar table adalah Id

1. Menggunakan syntax R

q1 <- left_join(produk, supplier, by="Id")
q1
## # Source:   lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

2. Menggunakan syntax SQL

show_query(q1)
## <SQL>
## SELECT `LHS`.`Id` AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Product` AS `LHS`
## LEFT JOIN `Supplier` AS `RHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)

syntax di atas akan sama dengan synax berikut

q2 <- tbl(Northwind, sql
      ("SELECT * FROM Product AS pd
        LEFT JOIN Supplier AS sp
        ON pd.Id = sp.Id"))
q2
## # Source:   SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 16 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>, Id:1 <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

right_join

Melakukan penggabungan tabel Product dan Supplier dengan metode right_join, dengan key antar table adalahId`

1. Menggunakan syntax R

r1 <- right_join(produk, supplier, by="Id")
r1
## # Source:   lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

2. Menggunakan syntax SQL

Perlu diperhatikan bahwa right_join tidak didukung oleh fungsi tbl sehinggga akan muncul peringatan error.

# tbl(Northwind, sql ("SELECT * FROM Product RIGHT JOIN Supplier ON Product.Id = Supplier.Id"))

Untuk menangani masalah di atas, dapat di atasi dengan menggunakan perintah left_join namun dengan urutan tabel yang berbeda dari kasus sebelumnya (dibalik).

show_query(r1)
## <SQL>
## SELECT `RHS`.`Id` AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Supplier` AS `RHS`
## LEFT JOIN `Product` AS `LHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
r2 <- tbl(Northwind, sql 
      ("SELECT * FROM Supplier
        LEFT JOIN Product
        ON Product.Id = Supplier.Id"))
r2
## # Source:   SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id CompanyName  ContactName  ContactTitle  Address City  Region PostalCode
##    <int> <chr>        <chr>        <chr>         <chr>   <chr> <chr>  <chr>     
##  1     1 Exotic Liqu~ Charlotte C~ Purchasing M~ 49 Gil~ Lond~ Briti~ EC1 4SD   
##  2     2 New Orleans~ Shelley Bur~ Order Admini~ P.O. B~ New ~ North~ 70117     
##  3     3 Grandma Kel~ Regina Murp~ Sales Repres~ 707 Ox~ Ann ~ North~ 48104     
##  4     4 Tokyo Trade~ Yoshi Nagase Marketing Ma~ 9-8 Se~ Tokyo Easte~ 100       
##  5     5 Cooperativa~ Antonio del~ Export Admin~ Calle ~ Ovie~ South~ 33007     
##  6     6 Mayumi's     Mayumi Ohno  Marketing Re~ 92 Set~ Osaka Easte~ 545       
##  7     7 Pavlova, Lt~ Ian Devling  Marketing Ma~ 74 Ros~ Melb~ Victo~ 3058      
##  8     8 Specialty B~ Peter Wilson Sales Repres~ 29 Kin~ Manc~ Briti~ M14 GSD   
##  9     9 PB Knäckebr~ Lars Peters~ Sales Agent   Kaload~ Göte~ North~ S-345 67  
## 10    10 Refrescos A~ Carlos Diaz  Marketing Ma~ Av. da~ Sao ~ South~ 5442      
## # ... with more rows, and 14 more variables: Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>, Id:1 <int>, ProductName <chr>, SupplierId <int>,
## #   CategoryId <int>, QuantityPerUnit <chr>, UnitPrice <dbl>,
## #   UnitsInStock <int>, UnitsOnOrder <int>, ReorderLevel <int>,
## #   Discontinued <int>

full_join

Melakukan penggabungan tabel Product dan Supplier dengan metode full_join, dengan key antar table adalahId`

1. Menggunakan syntax R

s1 <- full_join(produk, supplier, by="Id")
s1
## # Source:   lazy query [?? x 21]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 15 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

2. Menggunakan syntax SQL

show_query(s1)
## <SQL>
## SELECT `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM (SELECT COALESCE(`LHS`.`Id`, `RHS`.`Id`) AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Product` AS `LHS`
## LEFT JOIN `Supplier` AS `RHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
## UNION
## SELECT COALESCE(`LHS`.`Id`, `RHS`.`Id`) AS `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `UnitsOnOrder`, `ReorderLevel`, `Discontinued`, `CompanyName`, `ContactName`, `ContactTitle`, `Address`, `City`, `Region`, `PostalCode`, `Country`, `Phone`, `Fax`, `HomePage`
## FROM `Supplier` AS `RHS`
## LEFT JOIN `Product` AS `LHS`
## ON (`LHS`.`Id` = `RHS`.`Id`)
## ) AS `q01`

syntax di atas dapat di tulis sebagai berikut.

s2 <- tbl(Northwind, sql 
      ("SELECT * FROM Product
      LEFT JOIN Supplier ON Product.Id = Supplier.Id
      UNION
      SELECT * FROM Product
      LEFT JOIN Supplier ON Product.Id = Supplier.Id"))
s2
## # Source:   SQL [?? x 22]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 16 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>, Id:1 <int>,
## #   CompanyName <chr>, ContactName <chr>, ContactTitle <chr>, Address <chr>,
## #   City <chr>, Region <chr>, PostalCode <chr>, Country <chr>, Phone <chr>,
## #   Fax <chr>, HomePage <chr>

semi_join

Melakukan penggabungan tabel Product dan Supplier dengan metode semi_join, dengan key antar table adalahId`

1. Menggunakan syntax R

t1 <- semi_join(produk, supplier, by="Id")
t1
## # Source:   lazy query [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>

2. Menggunakan syntax SQL

show_query(t1)
## <SQL>
## SELECT * FROM `Product` AS `LHS`
## WHERE EXISTS (
##   SELECT 1 FROM `Supplier` AS `RHS`
##   WHERE (`LHS`.`Id` = `RHS`.`Id`)
## )

syntax di atas dapat di tulis sebagai berikut.

t2 <- tbl(Northwind, sql 
      ("SELECT * FROM Product
      WHERE EXISTS(SELECT 1 FROM Supplier
                   WHERE(Product.Id = Supplier.Id))"))
t2
## # Source:   SQL [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1     1 Chai                           1          1 10 boxes x 20 ba~      18  
##  2     2 Chang                          1          1 24 - 12 oz bottl~      19  
##  3     3 Aniseed Syrup                  1          2 12 - 550 ml bott~      10  
##  4     4 Chef Anton's Cajun S~          2          2 48 - 6 oz jars         22  
##  5     5 Chef Anton's Gumbo M~          2          2 36 boxes               21.4
##  6     6 Grandma's Boysenberr~          3          2 12 - 8 oz jars         25  
##  7     7 Uncle Bob's Organic ~          3          7 12 - 1 lb pkgs.        30  
##  8     8 Northwoods Cranberry~          3          2 12 - 12 oz jars        40  
##  9     9 Mishi Kobe Niku                4          6 18 - 500 g pkgs.       97  
## 10    10 Ikura                          4          8 12 - 200 ml jars       31  
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>

anti_join

Melakukan penggabungan tabel Product dan Supplier dengan metode anti_join, dengan key antar table adalahId`

1. Menggunakan syntax R

u1 <- anti_join(produk, supplier, by="Id")
u1
## # Source:   lazy query [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1    30 Nord-Ost Matjeshering         13          8 10 - 200 g glass~      25.9
##  2    31 Gorgonzola Telino             14          4 12 - 100 g pkgs        12.5
##  3    32 Mascarpone Fabioli            14          4 24 - 200 g pkgs.       32  
##  4    33 Geitost                       15          4 500 g                   2.5
##  5    34 Sasquatch Ale                 16          1 24 - 12 oz bottl~      14  
##  6    35 Steeleye Stout                16          1 24 - 12 oz bottl~      18  
##  7    36 Inlagd Sill                   17          8 24 - 250 g  jars       19  
##  8    37 Gravad lax                    17          8 12 - 500 g pkgs.       26  
##  9    38 Côte de Blaye                 18          1 12 - 75 cl bottl~     264. 
## 10    39 Chartreuse verte              18          1 750 cc per bottle      18  
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>

2. Menggunakan syntax SQL

show_query(u1)
## <SQL>
## SELECT * FROM `Product` AS `LHS`
## WHERE NOT EXISTS (
##   SELECT 1 FROM `Supplier` AS `RHS`
##   WHERE (`LHS`.`Id` = `RHS`.`Id`)
## )

syntax di atas dapat di tulis sebagai berikut.

u2 <- tbl(Northwind, sql 
      ("SELECT * FROM Product
      WHERE NOT EXISTS(SELECT 1 FROM Supplier
                       WHERE(Product.Id = Supplier.Id))"))
u2
## # Source:   SQL [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\db\Northwind_large.sqlite]
##       Id ProductName           SupplierId CategoryId QuantityPerUnit   UnitPrice
##    <int> <chr>                      <int>      <int> <chr>                 <dbl>
##  1    30 Nord-Ost Matjeshering         13          8 10 - 200 g glass~      25.9
##  2    31 Gorgonzola Telino             14          4 12 - 100 g pkgs        12.5
##  3    32 Mascarpone Fabioli            14          4 24 - 200 g pkgs.       32  
##  4    33 Geitost                       15          4 500 g                   2.5
##  5    34 Sasquatch Ale                 16          1 24 - 12 oz bottl~      14  
##  6    35 Steeleye Stout                16          1 24 - 12 oz bottl~      18  
##  7    36 Inlagd Sill                   17          8 24 - 250 g  jars       19  
##  8    37 Gravad lax                    17          8 12 - 500 g pkgs.       26  
##  9    38 Côte de Blaye                 18          1 12 - 75 cl bottl~     264. 
## 10    39 Chartreuse verte              18          1 750 cc per bottle      18  
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## #   UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>

Tugas Praktikum 3b: Spatial Data

Membuat suatu peta dari daerah masing-masing (Kecamatan, Kabupaten) apapun.

Data yang digunakan pada praktikum ini adalah sebagai berikut:

  1. Data Indonesia-Subnational Administrative Boundaries yang dapat diakses di HumData.

  2. Data Prodeskel (Profil Desa Kelurahan) Kabupaten Blitar khususnya Produksi Buah di per Desember 2020 yang dapat diakses diProduksiBuah.

Memasukkan Data .shp (Spatial Data)

Untuk memasukkan data .shp (Spatial Data) dapat dilakukan dengan mengakses file .shp yang telah di download

library(sf)
## Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1
Admin3Kecamatan <- "idn_admbnda_adm3_bps_20200401.shp"

# mengubah dengan fungsi `st_read`
Admin3 <- st_read(Admin3Kecamatan)
## Reading layer `idn_admbnda_adm3_bps_20200401' from data source 
##   `C:\Users\LENOVO\Documents\R\Tugas Peta Kab\idn_admbnda_adm3_bps_20200401.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 7069 features and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 95.01079 ymin: -11.00762 xmax: 141.0194 ymax: 6.07693
## Geodetic CRS:  WGS 84

Mengetahui isi file Admin3 .shp

glimpse(Admin3)
## Rows: 7,069
## Columns: 17
## $ Shape_Leng <dbl> 0.2798656, 0.7514001, 0.6900061, 0.6483629, 0.2437073, 1.35~
## $ Shape_Area <dbl> 0.003107633, 0.016925540, 0.024636382, 0.010761277, 0.00116~
## $ ADM3_EN    <chr> "2 X 11 Enam Lingkung", "2 X 11 Kayu Tanam", "Abab", "Abang~
## $ ADM3_PCODE <chr> "ID1306050", "ID1306052", "ID1612030", "ID5107050", "ID7471~
## $ ADM3_REF   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT1EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT2EN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM2_EN    <chr> "Padang Pariaman", "Padang Pariaman", "Penukal Abab Lematan~
## $ ADM2_PCODE <chr> "ID1306", "ID1306", "ID1612", "ID5107", "ID7471", "ID9432",~
## $ ADM1_EN    <chr> "Sumatera Barat", "Sumatera Barat", "Sumatera Selatan", "Ba~
## $ ADM1_PCODE <chr> "ID13", "ID13", "ID16", "ID51", "ID74", "ID94", "ID94", "ID~
## $ ADM0_EN    <chr> "Indonesia", "Indonesia", "Indonesia", "Indonesia", "Indone~
## $ ADM0_PCODE <chr> "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID",~
## $ date       <date> 2019-12-20, 2019-12-20, 2019-12-20, 2019-12-20, 2019-12-20~
## $ validOn    <date> 2020-04-01, 2020-04-01, 2020-04-01, 2020-04-01, 2020-04-01~
## $ validTo    <date> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((100.2811 -0..., MULTIPOLYGON (~

Memasukkan Data .dbf atau dBase file via File .csv (Data Frame)

Pada langkah ini, yang akan digunakan adalah file dbf atau dBase yang selanjutnya dapat dibukan pada Microsoft Excel. File yang saya gunakan bernama: idn_admbnda_adm3_bps_20200401.dbf

Selanjutnya, di dalam file tersebut ditambahkan beberapa kolom sebagai Data Kuantitatif yang akan menentukan gradasi warna pada peta. Kolom ini saya isi dengan data dari dataset Prodeskel Kabupaten Blitar. File ini selanjutnya disimpan dengan ekstensi .csv. Nama file idn_admbnda_adm3_bps_20200401_blitar.csv berisi file yang telah disesuaikan.

KabBlitar <-read.csv("idn_admbnda_adm3_bps_20200401_blitar.csv", sep=";")
glimpse(KabBlitar)
## Rows: 22
## Columns: 20
## $ Shape_Leng         <chr> "0,477549495", "0,439669207", "0,650696187", "0,667~
## $ Shape_Area         <chr> "0,008462566", "0,006455181", "0,008015827", "0,011~
## $ ADM3_EN            <chr> "Bakung", "Binangun", "Doko", "Gandusari", "Garum",~
## $ ADM3_PCODE         <chr> "ID3505010", "ID3505050", "ID3505130", "ID3505150",~
## $ ADM3_REF           <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT1EN         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM3ALT2EN         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ ADM2_EN            <chr> "Blitar", "Blitar", "Blitar", "Blitar", "Blitar", "~
## $ ADM2_PCODE         <chr> "ID3505", "ID3505", "ID3505", "ID3505", "ID3505", "~
## $ ADM1_EN            <chr> "Jawa Timur", "Jawa Timur", "Jawa Timur", "Jawa Tim~
## $ ADM1_PCODE         <chr> "ID35", "ID35", "ID35", "ID35", "ID35", "ID35", "ID~
## $ ADM0_EN            <chr> "Indonesia", "Indonesia", "Indonesia", "Indonesia",~
## $ ADM0_PCODE         <chr> "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID", "ID~
## $ date               <chr> "20/12/2019", "20/12/2019", "20/12/2019", "20/12/20~
## $ validOn            <chr> "01/04/2020", "01/04/2020", "01/04/2020", "01/04/20~
## $ jumlah_prod_buah   <int> 571, 357, 465, 913, 574, 480, 211957, 225, 6555, 40~
## $ jumlah_luas_lahan  <int> 88, 66, 131, 259, 72, 107, 1135, 66, 1044, 62, 1429~
## $ prod_buah          <int> 1, 1, 1, 4, 4, 14, 6, 1, 6, 1, 14, 4, 3, 1, 13, 1, ~
## $ jumlah_prod_unggas <int> 198120, 203514, 223028, 218600, 210295, 250045, 321~
## $ prod_unggas        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 6, 4, 13, 8, 6, 14~

Merge Data .shp & .csv

Merge Data dilakukan dengan membandingkan primary key antara Spatial Data (Admin3 hasil import file .shp) dan DataFrame (KabBlitar hasil import file .csv). Proses ini menggunakan fungsi geo_join() dari package tigris

library(tigris)
## To enable 
## caching of data, set `options(tigris_use_cache = TRUE)` in your R script or .Rprofile.
merged.kab.blitar <- geo_join(spatial_data=Admin3, 
                              data_frame=KabBlitar, by_sp="ADM3_PCODE", 
                              by_df="ADM3_PCODE", how = "inner")
## Warning: We recommend using the dplyr::*_join() family of functions instead.

Visualisasi

  1. Visualisasi Jumlah Produksi Buah di Kabupaten Blitar
library(ggplot2)

# Pengaturan Warna
my.col <- c("darkolivegreen2","darkolivegreen3", "darkolivegreen4") 


# Plot Data
pDATA<-ggplot()+ 
  geom_sf(data=merged.kab.blitar,aes(fill=prod_buah))+
  scale_fill_gradientn(colours=my.col)+
  labs(title = ~ atop(paste('Prodeskel - ',("Produksi Buah di Kab Blitar")),
                      paste(scriptstyle("Per Desember 2020"))),
       fill="Percentage")+
  theme(plot.title=element_text(hjust=0.5))
pDATA

Kawasan yang tidak berwarna (seolah berlubang) merupakan daerah Kota Blitar sehingga tidak termasuk dalam visualisasi

  1. Visualisasi Luas Lahan (HA) Produksi Buah di Kabupaten Blitar
# Pengaturan Warna
my.col2 <- c("orange2","orange3", "orange4") 

# Plot Data
pDATA2 <-ggplot()+
  geom_sf(data=merged.kab.blitar,aes(fill=jumlah_luas_lahan))+
  scale_fill_gradientn(colours=my.col2)+
  labs(title = ~ atop(paste('Prodeskel - ',(" Luas Lahan Produksi Buah di Kab Blitar")),
                      paste(scriptstyle("Per Desember 2020"))),
       fill="Luas Daerah (HA)")+
  theme(plot.title=element_text(hjust=0.5))
pDATA2 

Kawasan yang tidak berwarna (seolah berlubang) merupakan daerah Kota Blitar sehingga tidak termasuk dalam visualisasi

Referensi

  1. Nurussadad, Abdul Aziz. Day 3 of #30DayMapChallenge: Polygons.11 Maret 2020. https://rpubs.com/nurussadad/Polygons-30DMC

  2. Anisa, Rahma.Pengenalan R untuk Data Spatial. 10 Februari 2021. https://rpubs.com/r_anisa/Pengenalan-R-untuk-Data-Spasial

  3. Mubarok, Muhammad Ilham. Peta Tematik dengan R. 14 April 2020. https://muhammadilhammubarok.wordpress.com/2020/04/14/peta-tematik-dengan-r/