Tugas Akhir Mandarel
Engine SQL
Database yang digunakan adalah database chinook . Untuk mengakses dan mengolah database ini melalui R diperlukan package DBI dan RSQLite.
Melakukan instalasi package koneksi database
install.packages(c("RSQLite", "DBI"), dependencies = TRUE)Memanggil Library
library(tidyverse)
library(RSQLite)
library(DBI)Secara umum, koneksi terhadap database pada R dapat menggunakan sintaks berikut: DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")
Melakukan Koneksi Database dengan drive DBMS SQLite
chinook<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/chinook.db")Setelah melakukan dbConnect() kita dapat melakukan perintah SQL.
SELECT
*
FROM
DEPENDENT
WHERE
Sex = "M";
;| Essn | Dependent_name | Sex | Bdate | Relationship |
|---|---|---|---|---|
| 333445555 | Theodore | M | 1983-10-25 | Son |
| 987654321 | Abner | M | 1942-02-28 | Spouse |
| 123456789 | Michael | M | 1988-01-04 | Son |
Melakukan Operasi SQL di R
SELECT
*
FROM
EMPLOYEE
ORDER BY
Salary;
;| Fname | Minit | Lname | Ssn | Bdate | Address | Sex | Salary | Super_ssn | Dno |
|---|---|---|---|---|---|---|---|---|---|
| Alicia | J | Zelaya | 999887777 | 1968-01-19 | 3321 Castle, Spring TX | F | 25000 | 987654321 | 4 |
| Joyce | A | English | 453453453 | 1972-07-31 | 5631 Rice, Houston TX | F | 25000 | 333445555 | 5 |
| Ahmad | V | Jabbar | 987987987 | 1969-03-29 | 980 Dallas, Houston TX | M | 25000 | 987654321 | 4 |
| John | B | Smith | 123456789 | 1965-01-09 | 731 Fondren, Houston TX | M | 30000 | 333445555 | 5 |
| Ramesh | K | Narayan | 666884444 | 1962-09-15 | 975 Fire Oak, Humble TX | M | 38000 | 333445555 | 5 |
| Franklin | T | Wong | 333445555 | 1965-12-08 | 638 Voss, Houston TX | M | 40000 | 888665555 | 5 |
| Jennifer | S | Wallace | 987654321 | 1941-06-20 | 291 Berry, Bellaire TX | F | 43000 | 888665555 | 4 |
| James | E | Borg | 888665555 | 1937-11-10 | 450 Stone, Houston TX | M | 55000 | NA | 1 |
Menggunakan DPLYR
Mengkoneksikan database chinook
chinook<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/chinook.db")Melihat daftar tabel dalam database
RSQLite::dbListTables(chinook)## [1] "DEPARTMENT" "DEPENDENT" "DEPT_LOCATIONS" "EMPLOYEE"
## [5] "PROJECT" "WORKS_ON" "albums" "artists"
## [9] "customers" "employees" "genres" "invoice_items"
## [13] "invoices" "media_types" "playlist_track" "playlists"
## [17] "sqlite_sequence" "sqlite_stat1" "tracks"
Memuat tabel dari database
dependent <- dplyr::tbl(chinook,"DEPENDENT")
dependent## # Source: table<DEPENDENT> [?? x 5]
## # Database: sqlite 3.36.0 [C:\sqlite\chinook.db]
## Essn Dependent_name Sex Bdate Relationship
## <chr> <chr> <chr> <chr> <chr>
## 1 333445555 Alice F 1986-04-04 Daughter
## 2 333445555 Theodore M 1983-10-25 Son
## 3 333445555 Joy F 1958-05-03 Spouse
## 4 987654321 Abner M 1942-02-28 Spouse
## 5 123456789 Michael M 1988-01-04 Son
## 6 123456789 Alice F 1988-12-30 Daughter
## 7 123456789 Elizabeth F 1967-05-05 Spouse
Melihat kelas dari tabel
class(dependent)## [1] "tbl_SQLiteConnection" "tbl_dbi" "tbl_sql"
## [4] "tbl_lazy" "tbl"
Data Wrangling
Dataset yang digunakan adalah warpbreaks yang tersedia pada package datasets.
library(datasets)
warpbreaks## breaks wool tension
## 1 26 A L
## 2 30 A L
## 3 54 A L
## 4 25 A L
## 5 70 A L
## 6 52 A L
## 7 51 A L
## 8 26 A L
## 9 67 A L
## 10 18 A M
## 11 21 A M
## 12 29 A M
## 13 17 A M
## 14 12 A M
## 15 18 A M
## 16 35 A M
## 17 30 A M
## 18 36 A M
## 19 36 A H
## 20 21 A H
## 21 24 A H
## 22 18 A H
## 23 10 A H
## 24 43 A H
## 25 28 A H
## 26 15 A H
## 27 26 A H
## 28 27 B L
## 29 14 B L
## 30 29 B L
## 31 19 B L
## 32 29 B L
## 33 31 B L
## 34 41 B L
## 35 20 B L
## 36 44 B L
## 37 42 B M
## 38 26 B M
## 39 19 B M
## 40 16 B M
## 41 39 B M
## 42 28 B M
## 43 21 B M
## 44 39 B M
## 45 29 B M
## 46 20 B H
## 47 21 B H
## 48 24 B H
## 49 17 B H
## 50 13 B H
## 51 15 B H
## 52 15 B H
## 53 16 B H
## 54 28 B H
Dataset ini memberikan jumlah putus lusi per alat tenun, di mana alat tenun sesuai dengan panjang benang yang tetap. Untuk mengetahui informasi terkait dataset dapat menggunakan sintaks ?sebelum nama dataset . #Mengecek informasi dataset
?warpbreaks
View(warpbreaks)Memilih kolom pada dataset
warpbreaks %>% select(breaks,wool,tension)## breaks wool tension
## 1 26 A L
## 2 30 A L
## 3 54 A L
## 4 25 A L
## 5 70 A L
## 6 52 A L
## 7 51 A L
## 8 26 A L
## 9 67 A L
## 10 18 A M
## 11 21 A M
## 12 29 A M
## 13 17 A M
## 14 12 A M
## 15 18 A M
## 16 35 A M
## 17 30 A M
## 18 36 A M
## 19 36 A H
## 20 21 A H
## 21 24 A H
## 22 18 A H
## 23 10 A H
## 24 43 A H
## 25 28 A H
## 26 15 A H
## 27 26 A H
## 28 27 B L
## 29 14 B L
## 30 29 B L
## 31 19 B L
## 32 29 B L
## 33 31 B L
## 34 41 B L
## 35 20 B L
## 36 44 B L
## 37 42 B M
## 38 26 B M
## 39 19 B M
## 40 16 B M
## 41 39 B M
## 42 28 B M
## 43 21 B M
## 44 39 B M
## 45 29 B M
## 46 20 B H
## 47 21 B H
## 48 24 B H
## 49 17 B H
## 50 13 B H
## 51 15 B H
## 52 15 B H
## 53 16 B H
## 54 28 B H
Mengurutkan dataset ascending dan descending berdasarkan peubah tertentu
warpbreaks %>% arrange(breaks)## breaks wool tension
## 1 10 A H
## 2 12 A M
## 3 13 B H
## 4 14 B L
## 5 15 A H
## 6 15 B H
## 7 15 B H
## 8 16 B M
## 9 16 B H
## 10 17 A M
## 11 17 B H
## 12 18 A M
## 13 18 A M
## 14 18 A H
## 15 19 B L
## 16 19 B M
## 17 20 B L
## 18 20 B H
## 19 21 A M
## 20 21 A H
## 21 21 B M
## 22 21 B H
## 23 24 A H
## 24 24 B H
## 25 25 A L
## 26 26 A L
## 27 26 A L
## 28 26 A H
## 29 26 B M
## 30 27 B L
## 31 28 A H
## 32 28 B M
## 33 28 B H
## 34 29 A M
## 35 29 B L
## 36 29 B L
## 37 29 B M
## 38 30 A L
## 39 30 A M
## 40 31 B L
## 41 35 A M
## 42 36 A M
## 43 36 A H
## 44 39 B M
## 45 39 B M
## 46 41 B L
## 47 42 B M
## 48 43 A H
## 49 44 B L
## 50 51 A L
## 51 52 A L
## 52 54 A L
## 53 67 A L
## 54 70 A L
warpbreaks %>% arrange(desc(breaks))## breaks wool tension
## 1 70 A L
## 2 67 A L
## 3 54 A L
## 4 52 A L
## 5 51 A L
## 6 44 B L
## 7 43 A H
## 8 42 B M
## 9 41 B L
## 10 39 B M
## 11 39 B M
## 12 36 A M
## 13 36 A H
## 14 35 A M
## 15 31 B L
## 16 30 A L
## 17 30 A M
## 18 29 A M
## 19 29 B L
## 20 29 B L
## 21 29 B M
## 22 28 A H
## 23 28 B M
## 24 28 B H
## 25 27 B L
## 26 26 A L
## 27 26 A L
## 28 26 A H
## 29 26 B M
## 30 25 A L
## 31 24 A H
## 32 24 B H
## 33 21 A M
## 34 21 A H
## 35 21 B M
## 36 21 B H
## 37 20 B L
## 38 20 B H
## 39 19 B L
## 40 19 B M
## 41 18 A M
## 42 18 A M
## 43 18 A H
## 44 17 A M
## 45 17 B H
## 46 16 B M
## 47 16 B H
## 48 15 A H
## 49 15 B H
## 50 15 B H
## 51 14 B L
## 52 13 B H
## 53 12 A M
## 54 10 A H
Melakukan filter pada raw data
warpbreaks %>% filter(breaks>20)## breaks wool tension
## 1 26 A L
## 2 30 A L
## 3 54 A L
## 4 25 A L
## 5 70 A L
## 6 52 A L
## 7 51 A L
## 8 26 A L
## 9 67 A L
## 10 21 A M
## 11 29 A M
## 12 35 A M
## 13 30 A M
## 14 36 A M
## 15 36 A H
## 16 21 A H
## 17 24 A H
## 18 43 A H
## 19 28 A H
## 20 26 A H
## 21 27 B L
## 22 29 B L
## 23 29 B L
## 24 31 B L
## 25 41 B L
## 26 44 B L
## 27 42 B M
## 28 26 B M
## 29 39 B M
## 30 28 B M
## 31 21 B M
## 32 39 B M
## 33 29 B M
## 34 21 B H
## 35 24 B H
## 36 28 B H
Membuat variabel baru dengan fungsi mutate()
warpbreaks %>% mutate(durabilty=breaks*80/100)## breaks wool tension durabilty
## 1 26 A L 20.8
## 2 30 A L 24.0
## 3 54 A L 43.2
## 4 25 A L 20.0
## 5 70 A L 56.0
## 6 52 A L 41.6
## 7 51 A L 40.8
## 8 26 A L 20.8
## 9 67 A L 53.6
## 10 18 A M 14.4
## 11 21 A M 16.8
## 12 29 A M 23.2
## 13 17 A M 13.6
## 14 12 A M 9.6
## 15 18 A M 14.4
## 16 35 A M 28.0
## 17 30 A M 24.0
## 18 36 A M 28.8
## 19 36 A H 28.8
## 20 21 A H 16.8
## 21 24 A H 19.2
## 22 18 A H 14.4
## 23 10 A H 8.0
## 24 43 A H 34.4
## 25 28 A H 22.4
## 26 15 A H 12.0
## 27 26 A H 20.8
## 28 27 B L 21.6
## 29 14 B L 11.2
## 30 29 B L 23.2
## 31 19 B L 15.2
## 32 29 B L 23.2
## 33 31 B L 24.8
## 34 41 B L 32.8
## 35 20 B L 16.0
## 36 44 B L 35.2
## 37 42 B M 33.6
## 38 26 B M 20.8
## 39 19 B M 15.2
## 40 16 B M 12.8
## 41 39 B M 31.2
## 42 28 B M 22.4
## 43 21 B M 16.8
## 44 39 B M 31.2
## 45 29 B M 23.2
## 46 20 B H 16.0
## 47 21 B H 16.8
## 48 24 B H 19.2
## 49 17 B H 13.6
## 50 13 B H 10.4
## 51 15 B H 12.0
## 52 15 B H 12.0
## 53 16 B H 12.8
## 54 28 B H 22.4
Mengelompokkan data berdasarkan peubah tertentu dan melakukan summarise()
warpbreaks %>% group_by(tension)%>%summarise(mean=mean(breaks))## # A tibble: 3 x 2
## tension mean
## <fct> <dbl>
## 1 L 36.4
## 2 M 26.4
## 3 H 21.7
Melakukan kelima perintah bersamaan
warpbreaks%>% select(breaks,wool,tension)%>% arrange(desc(breaks))%>% filter(breaks>20)%>% mutate(durabilty=breaks*80/100)%>% group_by(tension)%>%summarise(mean=mean(breaks))## # A tibble: 3 x 2
## tension mean
## <fct> <dbl>
## 1 L 40.1
## 2 M 31.2
## 3 H 27.9