Tugas Akhir Mandarel

Engine SQL

Database yang digunakan adalah database chinook . Untuk mengakses dan mengolah database ini melalui R diperlukan package DBI dan RSQLite.

Melakukan instalasi package koneksi database

install.packages(c("RSQLite", "DBI"), dependencies = TRUE)

Memanggil Library

library(tidyverse)
library(RSQLite)
library(DBI)

Secara umum, koneksi terhadap database pada R dapat menggunakan sintaks berikut: DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")

Melakukan Koneksi Database dengan drive DBMS SQLite

chinook<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/chinook.db")

Setelah melakukan dbConnect() kita dapat melakukan perintah SQL.

SELECT 
  *
FROM
  DEPENDENT
WHERE
 Sex = "M";
  ;
3 records
Essn Dependent_name Sex Bdate Relationship
333445555 Theodore M 1983-10-25 Son
987654321 Abner M 1942-02-28 Spouse
123456789 Michael M 1988-01-04 Son

Melakukan Operasi SQL di R

SELECT 
  *
FROM
  EMPLOYEE
ORDER BY
  Salary;
  ;
8 records
Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Alicia J Zelaya 999887777 1968-01-19 3321 Castle, Spring TX F 25000 987654321 4
Joyce A English 453453453 1972-07-31 5631 Rice, Houston TX F 25000 333445555 5
Ahmad V Jabbar 987987987 1969-03-29 980 Dallas, Houston TX M 25000 987654321 4
John B Smith 123456789 1965-01-09 731 Fondren, Houston TX M 30000 333445555 5
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble TX M 38000 333445555 5
Franklin T Wong 333445555 1965-12-08 638 Voss, Houston TX M 40000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire TX F 43000 888665555 4
James E Borg 888665555 1937-11-10 450 Stone, Houston TX M 55000 NA 1

Menggunakan DPLYR

Mengkoneksikan database chinook

chinook<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/chinook.db")

Melihat daftar tabel dalam database

RSQLite::dbListTables(chinook)
##  [1] "DEPARTMENT"      "DEPENDENT"       "DEPT_LOCATIONS"  "EMPLOYEE"       
##  [5] "PROJECT"         "WORKS_ON"        "albums"          "artists"        
##  [9] "customers"       "employees"       "genres"          "invoice_items"  
## [13] "invoices"        "media_types"     "playlist_track"  "playlists"      
## [17] "sqlite_sequence" "sqlite_stat1"    "tracks"

Memuat tabel dari database

dependent <- dplyr::tbl(chinook,"DEPENDENT")
dependent
## # Source:   table<DEPENDENT> [?? x 5]
## # Database: sqlite 3.36.0 [C:\sqlite\chinook.db]
##   Essn      Dependent_name Sex   Bdate      Relationship
##   <chr>     <chr>          <chr> <chr>      <chr>       
## 1 333445555 Alice          F     1986-04-04 Daughter    
## 2 333445555 Theodore       M     1983-10-25 Son         
## 3 333445555 Joy            F     1958-05-03 Spouse      
## 4 987654321 Abner          M     1942-02-28 Spouse      
## 5 123456789 Michael        M     1988-01-04 Son         
## 6 123456789 Alice          F     1988-12-30 Daughter    
## 7 123456789 Elizabeth      F     1967-05-05 Spouse

Melihat kelas dari tabel

class(dependent)
## [1] "tbl_SQLiteConnection" "tbl_dbi"              "tbl_sql"             
## [4] "tbl_lazy"             "tbl"

Data Wrangling

Dataset yang digunakan adalah warpbreaks yang tersedia pada package datasets.

library(datasets)
warpbreaks
##    breaks wool tension
## 1      26    A       L
## 2      30    A       L
## 3      54    A       L
## 4      25    A       L
## 5      70    A       L
## 6      52    A       L
## 7      51    A       L
## 8      26    A       L
## 9      67    A       L
## 10     18    A       M
## 11     21    A       M
## 12     29    A       M
## 13     17    A       M
## 14     12    A       M
## 15     18    A       M
## 16     35    A       M
## 17     30    A       M
## 18     36    A       M
## 19     36    A       H
## 20     21    A       H
## 21     24    A       H
## 22     18    A       H
## 23     10    A       H
## 24     43    A       H
## 25     28    A       H
## 26     15    A       H
## 27     26    A       H
## 28     27    B       L
## 29     14    B       L
## 30     29    B       L
## 31     19    B       L
## 32     29    B       L
## 33     31    B       L
## 34     41    B       L
## 35     20    B       L
## 36     44    B       L
## 37     42    B       M
## 38     26    B       M
## 39     19    B       M
## 40     16    B       M
## 41     39    B       M
## 42     28    B       M
## 43     21    B       M
## 44     39    B       M
## 45     29    B       M
## 46     20    B       H
## 47     21    B       H
## 48     24    B       H
## 49     17    B       H
## 50     13    B       H
## 51     15    B       H
## 52     15    B       H
## 53     16    B       H
## 54     28    B       H

Dataset ini memberikan jumlah putus lusi per alat tenun, di mana alat tenun sesuai dengan panjang benang yang tetap. Untuk mengetahui informasi terkait dataset dapat menggunakan sintaks ?sebelum nama dataset . #Mengecek informasi dataset

?warpbreaks
View(warpbreaks)

Memilih kolom pada dataset

warpbreaks %>% select(breaks,wool,tension)
##    breaks wool tension
## 1      26    A       L
## 2      30    A       L
## 3      54    A       L
## 4      25    A       L
## 5      70    A       L
## 6      52    A       L
## 7      51    A       L
## 8      26    A       L
## 9      67    A       L
## 10     18    A       M
## 11     21    A       M
## 12     29    A       M
## 13     17    A       M
## 14     12    A       M
## 15     18    A       M
## 16     35    A       M
## 17     30    A       M
## 18     36    A       M
## 19     36    A       H
## 20     21    A       H
## 21     24    A       H
## 22     18    A       H
## 23     10    A       H
## 24     43    A       H
## 25     28    A       H
## 26     15    A       H
## 27     26    A       H
## 28     27    B       L
## 29     14    B       L
## 30     29    B       L
## 31     19    B       L
## 32     29    B       L
## 33     31    B       L
## 34     41    B       L
## 35     20    B       L
## 36     44    B       L
## 37     42    B       M
## 38     26    B       M
## 39     19    B       M
## 40     16    B       M
## 41     39    B       M
## 42     28    B       M
## 43     21    B       M
## 44     39    B       M
## 45     29    B       M
## 46     20    B       H
## 47     21    B       H
## 48     24    B       H
## 49     17    B       H
## 50     13    B       H
## 51     15    B       H
## 52     15    B       H
## 53     16    B       H
## 54     28    B       H

Mengurutkan dataset ascending dan descending berdasarkan peubah tertentu

warpbreaks %>% arrange(breaks)
##    breaks wool tension
## 1      10    A       H
## 2      12    A       M
## 3      13    B       H
## 4      14    B       L
## 5      15    A       H
## 6      15    B       H
## 7      15    B       H
## 8      16    B       M
## 9      16    B       H
## 10     17    A       M
## 11     17    B       H
## 12     18    A       M
## 13     18    A       M
## 14     18    A       H
## 15     19    B       L
## 16     19    B       M
## 17     20    B       L
## 18     20    B       H
## 19     21    A       M
## 20     21    A       H
## 21     21    B       M
## 22     21    B       H
## 23     24    A       H
## 24     24    B       H
## 25     25    A       L
## 26     26    A       L
## 27     26    A       L
## 28     26    A       H
## 29     26    B       M
## 30     27    B       L
## 31     28    A       H
## 32     28    B       M
## 33     28    B       H
## 34     29    A       M
## 35     29    B       L
## 36     29    B       L
## 37     29    B       M
## 38     30    A       L
## 39     30    A       M
## 40     31    B       L
## 41     35    A       M
## 42     36    A       M
## 43     36    A       H
## 44     39    B       M
## 45     39    B       M
## 46     41    B       L
## 47     42    B       M
## 48     43    A       H
## 49     44    B       L
## 50     51    A       L
## 51     52    A       L
## 52     54    A       L
## 53     67    A       L
## 54     70    A       L
warpbreaks %>% arrange(desc(breaks))
##    breaks wool tension
## 1      70    A       L
## 2      67    A       L
## 3      54    A       L
## 4      52    A       L
## 5      51    A       L
## 6      44    B       L
## 7      43    A       H
## 8      42    B       M
## 9      41    B       L
## 10     39    B       M
## 11     39    B       M
## 12     36    A       M
## 13     36    A       H
## 14     35    A       M
## 15     31    B       L
## 16     30    A       L
## 17     30    A       M
## 18     29    A       M
## 19     29    B       L
## 20     29    B       L
## 21     29    B       M
## 22     28    A       H
## 23     28    B       M
## 24     28    B       H
## 25     27    B       L
## 26     26    A       L
## 27     26    A       L
## 28     26    A       H
## 29     26    B       M
## 30     25    A       L
## 31     24    A       H
## 32     24    B       H
## 33     21    A       M
## 34     21    A       H
## 35     21    B       M
## 36     21    B       H
## 37     20    B       L
## 38     20    B       H
## 39     19    B       L
## 40     19    B       M
## 41     18    A       M
## 42     18    A       M
## 43     18    A       H
## 44     17    A       M
## 45     17    B       H
## 46     16    B       M
## 47     16    B       H
## 48     15    A       H
## 49     15    B       H
## 50     15    B       H
## 51     14    B       L
## 52     13    B       H
## 53     12    A       M
## 54     10    A       H

Melakukan filter pada raw data

warpbreaks %>% filter(breaks>20)
##    breaks wool tension
## 1      26    A       L
## 2      30    A       L
## 3      54    A       L
## 4      25    A       L
## 5      70    A       L
## 6      52    A       L
## 7      51    A       L
## 8      26    A       L
## 9      67    A       L
## 10     21    A       M
## 11     29    A       M
## 12     35    A       M
## 13     30    A       M
## 14     36    A       M
## 15     36    A       H
## 16     21    A       H
## 17     24    A       H
## 18     43    A       H
## 19     28    A       H
## 20     26    A       H
## 21     27    B       L
## 22     29    B       L
## 23     29    B       L
## 24     31    B       L
## 25     41    B       L
## 26     44    B       L
## 27     42    B       M
## 28     26    B       M
## 29     39    B       M
## 30     28    B       M
## 31     21    B       M
## 32     39    B       M
## 33     29    B       M
## 34     21    B       H
## 35     24    B       H
## 36     28    B       H

Membuat variabel baru dengan fungsi mutate()

warpbreaks %>% mutate(durabilty=breaks*80/100)
##    breaks wool tension durabilty
## 1      26    A       L      20.8
## 2      30    A       L      24.0
## 3      54    A       L      43.2
## 4      25    A       L      20.0
## 5      70    A       L      56.0
## 6      52    A       L      41.6
## 7      51    A       L      40.8
## 8      26    A       L      20.8
## 9      67    A       L      53.6
## 10     18    A       M      14.4
## 11     21    A       M      16.8
## 12     29    A       M      23.2
## 13     17    A       M      13.6
## 14     12    A       M       9.6
## 15     18    A       M      14.4
## 16     35    A       M      28.0
## 17     30    A       M      24.0
## 18     36    A       M      28.8
## 19     36    A       H      28.8
## 20     21    A       H      16.8
## 21     24    A       H      19.2
## 22     18    A       H      14.4
## 23     10    A       H       8.0
## 24     43    A       H      34.4
## 25     28    A       H      22.4
## 26     15    A       H      12.0
## 27     26    A       H      20.8
## 28     27    B       L      21.6
## 29     14    B       L      11.2
## 30     29    B       L      23.2
## 31     19    B       L      15.2
## 32     29    B       L      23.2
## 33     31    B       L      24.8
## 34     41    B       L      32.8
## 35     20    B       L      16.0
## 36     44    B       L      35.2
## 37     42    B       M      33.6
## 38     26    B       M      20.8
## 39     19    B       M      15.2
## 40     16    B       M      12.8
## 41     39    B       M      31.2
## 42     28    B       M      22.4
## 43     21    B       M      16.8
## 44     39    B       M      31.2
## 45     29    B       M      23.2
## 46     20    B       H      16.0
## 47     21    B       H      16.8
## 48     24    B       H      19.2
## 49     17    B       H      13.6
## 50     13    B       H      10.4
## 51     15    B       H      12.0
## 52     15    B       H      12.0
## 53     16    B       H      12.8
## 54     28    B       H      22.4

Mengelompokkan data berdasarkan peubah tertentu dan melakukan summarise()

warpbreaks %>% group_by(tension)%>%summarise(mean=mean(breaks))
## # A tibble: 3 x 2
##   tension  mean
##   <fct>   <dbl>
## 1 L        36.4
## 2 M        26.4
## 3 H        21.7

Melakukan kelima perintah bersamaan

warpbreaks%>% select(breaks,wool,tension)%>% arrange(desc(breaks))%>% filter(breaks>20)%>% mutate(durabilty=breaks*80/100)%>% group_by(tension)%>%summarise(mean=mean(breaks))
## # A tibble: 3 x 2
##   tension  mean
##   <fct>   <dbl>
## 1 L        40.1
## 2 M        31.2
## 3 H        27.9