Tugas Akhir Mandarel
Engine SQL
Database yang digunakan pada publikasi ini adalah database sampel northwindSumber . Untuk mengakses dan mengolah database ini melalui R diperlukan package DBI dan RSQLite.
Melakukan Instalasi Package yang dibutuhkan
install.packages(c("RSQLite", "DBI"), dependencies = TRUE)
install.packages("tidyverse")Selanjutnya package tidyverse yang mencakup beberapa package dipanggil. Fungsi dalam suatu package dapat juga dipanggil tanpa perintah library() atau require() jika sintaks yang ditulis menyertakan nama package tempat fungsi tersebut berasal. Penyertaan ini dengan menggunakan tanda ::.
Memanggil library
library(tidyverse)
library(RSQLite)
library(DBI)Secara umum, koneksi terhadap database pada R dapat menggunakan sintaks berikut: DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")
Melakukan Koneksi pada Database
northwind<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/northwind_large.sqlite")Menggunakan perintah SQL pada database
Setelah melakukan dbConnect() kita dapat melakukan perintah SQL.
SELECT
*
FROM
EMPLOYEE
WHERE
Region ="North America";| Id | LastName | FirstName | Title | TitleOfCourtesy | BirthDate | HireDate | Address | City | Region | PostalCode | Country | HomePhone | Extension | Photo | Notes | ReportsTo | PhotoPath |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Davolio | Nancy | Sales Representative | Ms. | 1980-12-08 | 2024-05-01 | 507 - 20th Ave. E. Apt. 2A | Seattle | North America | 98122 | USA | (206) 555-9857 | 5467 | Education includes a BA in psychology from Colorado State University in 1970. She also completed ‘The Art of the Cold Call.’ Nancy is a member of Toastmasters International. | 2 | http://accweb/emmployees/davolio.bmp | |
| 2 | Fuller | Andrew | Vice President, Sales | Dr. | 1984-02-19 | 2024-08-14 | 908 W. Capital Way | Tacoma | North America | 98401 | USA | (206) 555-9482 | 3457 | Andrew received his BTS commercial in 1974 and a Ph.D. in international marketing from the University of Dallas in 1981. He is fluent in French and Italian and reads German. He joined the company as a sales representative, was promoted to sales manager in January 1992 and to vice president of sales in March 1993. Andrew is a member of the Sales Management Roundtable, the Seattle Chamber of Commerce, and the Pacific Rim Importers Association. | NA | http://accweb/emmployees/fuller.bmp | |
| 3 | Leverling | Janet | Sales Representative | Ms. | 1995-08-30 | 2024-04-01 | 722 Moss Bay Blvd. | Kirkland | North America | 98033 | USA | (206) 555-3412 | 3355 | Janet has a BS degree in chemistry from Boston College (1984). She has also completed a certificate program in food retailing management. Janet was hired as a sales associate in 1991 and promoted to sales representative in February 1992. | 2 | http://accweb/emmployees/leverling.bmp | |
| 4 | Peacock | Margaret | Sales Representative | Mrs. | 1969-09-19 | 2025-05-03 | 4110 Old Redmond Rd. | Redmond | North America | 98052 | USA | (206) 555-8122 | 5176 | Margaret holds a BA in English literature from Concordia College (1958) and an MA from the American Institute of Culinary Arts (1966). She was assigned to the London office temporarily from July through November 1992. | 2 | http://accweb/emmployees/peacock.bmp | |
| 8 | Callahan | Laura | Inside Sales Coordinator | Ms. | 1990-01-09 | 2026-03-05 | 4726 - 11th Ave. N.E. | Seattle | North America | 98105 | USA | (206) 555-1189 | 2344 | Laura received a BA in psychology from the University of Washington. She has also completed a course in business French. She reads and writes French. | 2 | http://accweb/emmployees/davolio.bmp |
Memutus Koneksi Database
Setelah selesai mengolah database terkait dianjurkan memutuskan koneksi dengan fungsi dbDisconnect() dan jika ingin menyambungkan kembali cukup menggunakan kembali dbConnect()
dbDisconnect(northwind)Menggunakan DPLYR
Melakukan Koneksi pada Database Northwind
northwind<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/northwind_large.sqlite")Memuat Tabel
employee <- dplyr::tbl(northwind,"Employee")
employee## # Source: table<Employee> [?? x 18]
## # Database: sqlite 3.36.0 [C:\sqlite\northwind_large.sqlite]
## Id LastName FirstName Title TitleOfCourtesy BirthDate HireDate Address
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 Davolio Nancy Sales ~ Ms. 1980-12-~ 2024-05~ 507 - 20~
## 2 2 Fuller Andrew Vice P~ Dr. 1984-02-~ 2024-08~ 908 W. C~
## 3 3 Leverling Janet Sales ~ Ms. 1995-08-~ 2024-04~ 722 Moss~
## 4 4 Peacock Margaret Sales ~ Mrs. 1969-09-~ 2025-05~ 4110 Old~
## 5 5 Buchanan Steven Sales ~ Mr. 1987-03-~ 2025-10~ 14 Garre~
## 6 6 Suyama Michael Sales ~ Mr. 1995-07-~ 2025-10~ Coventry~
## 7 7 King Robert Sales ~ Mr. 1992-05-~ 2026-01~ Edgeham ~
## 8 8 Callahan Laura Inside~ Ms. 1990-01-~ 2026-03~ 4726 - 1~
## 9 9 Dodsworth Anne Sales ~ Ms. 1998-01-~ 2026-11~ 7 Hounds~
## # ... with 10 more variables: City <chr>, Region <chr>, PostalCode <chr>,
## # Country <chr>, HomePhone <chr>, Extension <chr>, Photo <blob>, Notes <chr>,
## # ReportsTo <int>, PhotoPath <chr>
Melihat Kelas dari Tabel
class(employee)## [1] "tbl_SQLiteConnection" "tbl_dbi" "tbl_sql"
## [4] "tbl_lazy" "tbl"
Menyeleksi Kolom pada Tabel
query <- employee %>% select(-Photo,-Notes,-PhotoPath)
query## # Source: lazy query [?? x 15]
## # Database: sqlite 3.36.0 [C:\sqlite\northwind_large.sqlite]
## Id LastName FirstName Title TitleOfCourtesy BirthDate HireDate Address
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 1 Davolio Nancy Sales ~ Ms. 1980-12-~ 2024-05~ 507 - 20~
## 2 2 Fuller Andrew Vice P~ Dr. 1984-02-~ 2024-08~ 908 W. C~
## 3 3 Leverling Janet Sales ~ Ms. 1995-08-~ 2024-04~ 722 Moss~
## 4 4 Peacock Margaret Sales ~ Mrs. 1969-09-~ 2025-05~ 4110 Old~
## 5 5 Buchanan Steven Sales ~ Mr. 1987-03-~ 2025-10~ 14 Garre~
## 6 6 Suyama Michael Sales ~ Mr. 1995-07-~ 2025-10~ Coventry~
## 7 7 King Robert Sales ~ Mr. 1992-05-~ 2026-01~ Edgeham ~
## 8 8 Callahan Laura Inside~ Ms. 1990-01-~ 2026-03~ 4726 - 1~
## 9 9 Dodsworth Anne Sales ~ Ms. 1998-01-~ 2026-11~ 7 Hounds~
## # ... with 7 more variables: City <chr>, Region <chr>, PostalCode <chr>,
## # Country <chr>, HomePhone <chr>, Extension <chr>, ReportsTo <int>
Data Wrangling
Dataset yang digunakan adalah storms yang tersedia pada package default R datasets. ## Memanggil Library datasets dan memilih dataset
library(datasets)
storms## # A tibble: 10,010 x 13
## name year month day hour lat long status category wind pressure
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <int> <int>
## 1 Amy 1975 6 27 0 27.5 -79 tropical d~ -1 25 1013
## 2 Amy 1975 6 27 6 28.5 -79 tropical d~ -1 25 1013
## 3 Amy 1975 6 27 12 29.5 -79 tropical d~ -1 25 1013
## 4 Amy 1975 6 27 18 30.5 -79 tropical d~ -1 25 1013
## 5 Amy 1975 6 28 0 31.5 -78.8 tropical d~ -1 25 1012
## 6 Amy 1975 6 28 6 32.4 -78.7 tropical d~ -1 25 1012
## 7 Amy 1975 6 28 12 33.3 -78 tropical d~ -1 25 1011
## 8 Amy 1975 6 28 18 34 -77 tropical d~ -1 30 1006
## 9 Amy 1975 6 29 0 34.4 -75.8 tropical s~ 0 35 1004
## 10 Amy 1975 6 29 6 34 -74.8 tropical s~ 0 40 1002
## # ... with 10,000 more rows, and 2 more variables: ts_diameter <dbl>,
## # hu_diameter <dbl>
Data ini adalah bagian dari data track terbaik database badai Atlantik NOAA, Sumber . Data tersebut mencakup posisi dan atribut dari 198 badai tropis, yang diukur setiap enam jam selama masa badai. ## Melihat Detail dari Dataset Sampel Untuk mengetahui informasi terkait dataset dapat menggunakan sintaks ?sebelum nama dataset .
?storms
View(storms)Dataset tersebut akan dilakukan perintah arrange , mutate ,filter,select, dan summarise.
Memilih Kolom Tertentu
Sintaks dibawah jika dijalankan akan memuat semua kolom pada tabel storms kecuali kolom ts_diameter dan hu_diameter.
storms%>% select(-ts_diameter,-hu_diameter)## # A tibble: 10,010 x 11
## name year month day hour lat long status category wind pressure
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <int> <int>
## 1 Amy 1975 6 27 0 27.5 -79 tropical d~ -1 25 1013
## 2 Amy 1975 6 27 6 28.5 -79 tropical d~ -1 25 1013
## 3 Amy 1975 6 27 12 29.5 -79 tropical d~ -1 25 1013
## 4 Amy 1975 6 27 18 30.5 -79 tropical d~ -1 25 1013
## 5 Amy 1975 6 28 0 31.5 -78.8 tropical d~ -1 25 1012
## 6 Amy 1975 6 28 6 32.4 -78.7 tropical d~ -1 25 1012
## 7 Amy 1975 6 28 12 33.3 -78 tropical d~ -1 25 1011
## 8 Amy 1975 6 28 18 34 -77 tropical d~ -1 30 1006
## 9 Amy 1975 6 29 0 34.4 -75.8 tropical s~ 0 35 1004
## 10 Amy 1975 6 29 6 34 -74.8 tropical s~ 0 40 1002
## # ... with 10,000 more rows
Mengurutkan Data Berdasarkan Peubah Tertentu
Sintaks dibawah jika dijalankan akan mengurutkan data secara descending berdasarkan peubah pressure
storms %>% arrange(desc(pressure))## # A tibble: 10,010 x 13
## name year month day hour lat long status category wind pressure
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <int> <int>
## 1 AL072~ 2003 7 26 12 32.3 -82 tropical ~ -1 20 1022
## 2 AL072~ 2003 7 26 18 32.8 -82.6 tropical ~ -1 15 1022
## 3 AL072~ 2003 7 27 0 33 -83 tropical ~ -1 15 1022
## 4 Emily 1993 8 22 18 19.9 -52.6 tropical ~ -1 30 1020
## 5 Emily 1993 8 23 0 20.5 -53.6 tropical ~ -1 30 1020
## 6 Emily 1993 8 23 6 21.3 -54.8 tropical ~ -1 30 1020
## 7 Emily 1993 8 23 12 22.3 -56 tropical ~ -1 30 1020
## 8 Emily 1993 8 23 18 23.2 -57.1 tropical ~ -1 30 1020
## 9 Emily 1993 8 24 0 24.3 -57.8 tropical ~ -1 30 1020
## 10 Emily 1993 8 24 6 25.4 -58.6 tropical ~ -1 30 1020
## # ... with 10,000 more rows, and 2 more variables: ts_diameter <dbl>,
## # hu_diameter <dbl>
Melakukan Filter pada Tabel
storms %>% filter(wind>20)## # A tibble: 9,723 x 13
## name year month day hour lat long status category wind pressure
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <int> <int>
## 1 Amy 1975 6 27 0 27.5 -79 tropical d~ -1 25 1013
## 2 Amy 1975 6 27 6 28.5 -79 tropical d~ -1 25 1013
## 3 Amy 1975 6 27 12 29.5 -79 tropical d~ -1 25 1013
## 4 Amy 1975 6 27 18 30.5 -79 tropical d~ -1 25 1013
## 5 Amy 1975 6 28 0 31.5 -78.8 tropical d~ -1 25 1012
## 6 Amy 1975 6 28 6 32.4 -78.7 tropical d~ -1 25 1012
## 7 Amy 1975 6 28 12 33.3 -78 tropical d~ -1 25 1011
## 8 Amy 1975 6 28 18 34 -77 tropical d~ -1 30 1006
## 9 Amy 1975 6 29 0 34.4 -75.8 tropical s~ 0 35 1004
## 10 Amy 1975 6 29 6 34 -74.8 tropical s~ 0 40 1002
## # ... with 9,713 more rows, and 2 more variables: ts_diameter <dbl>,
## # hu_diameter <dbl>
Menambahkan Peubah Baru berdasarkan Peubah Lain dengan fungsi mutate()
storms %>% mutate(duration=day*hour)## # A tibble: 10,010 x 14
## name year month day hour lat long status category wind pressure
## <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <ord> <int> <int>
## 1 Amy 1975 6 27 0 27.5 -79 tropical d~ -1 25 1013
## 2 Amy 1975 6 27 6 28.5 -79 tropical d~ -1 25 1013
## 3 Amy 1975 6 27 12 29.5 -79 tropical d~ -1 25 1013
## 4 Amy 1975 6 27 18 30.5 -79 tropical d~ -1 25 1013
## 5 Amy 1975 6 28 0 31.5 -78.8 tropical d~ -1 25 1012
## 6 Amy 1975 6 28 6 32.4 -78.7 tropical d~ -1 25 1012
## 7 Amy 1975 6 28 12 33.3 -78 tropical d~ -1 25 1011
## 8 Amy 1975 6 28 18 34 -77 tropical d~ -1 30 1006
## 9 Amy 1975 6 29 0 34.4 -75.8 tropical s~ 0 35 1004
## 10 Amy 1975 6 29 6 34 -74.8 tropical s~ 0 40 1002
## # ... with 10,000 more rows, and 3 more variables: ts_diameter <dbl>,
## # hu_diameter <dbl>, duration <dbl>
Menggunakan fungsi summarise()
storms %>% group_by(status)%>%summarise(mean=mean(pressure))## # A tibble: 3 x 2
## status mean
## <chr> <dbl>
## 1 hurricane 969.
## 2 tropical depression 1008.
## 3 tropical storm 999.
Melakukan kelima perintah bersamaan
storms%>% select(-ts_diameter,-hu_diameter) %>% arrange(desc(pressure)) %>% filter(wind>20)%>% mutate(duration=day*hour)%>% group_by(status)%>%summarise(mean=mean(pressure))## # A tibble: 3 x 2
## status mean
## <chr> <dbl>
## 1 hurricane 969.
## 2 tropical depression 1008.
## 3 tropical storm 999.