Tugas Akhir STA261
Engine SQL
Database yang digunakan adalah database northwindLink Download . Diperlukan package tidyverse, DBI , danRSQLite.
Melakukan Instalasi Package
install.packages(c("RSQLite", "DBI"), dependencies = TRUE)
install.packages("tidyverse")Memanggil Library
Selanjutnya package tidyverse,RSQLite, dan DBI dipanggil .
library(tidyverse)
library(RSQLite)
library(DBI)Mengoneksikan Database
Koneksi terhadap database menggunakan drive DBMS SQLite dapat digunakan pada R dengan menggunakan sintaks berikut: DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")
northwind<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/northwind_large.sqlite")Menggunakan Query SQL pada Database
Setelah melakukan dbConnect(), baru kita dapat melakukan perintah SQL pada RStudio.
SELECT
*
FROM
OrderDetail
WHERE
OrderId = "10250"
ORDER BY
ProductId;| Id | OrderId | ProductId | UnitPrice | Quantity | Discount |
|---|---|---|---|---|---|
| 10250/41 | 10250 | 41 | 7.7 | 10 | 0.00 |
| 10250/51 | 10250 | 51 | 42.4 | 35 | 0.15 |
| 10250/65 | 10250 | 65 | 16.8 | 15 | 0.15 |
Memutus Koneksi Database
dbDisconnect(northwind)Menggunakan dplyr
northwind<-DBI::dbConnect(RSQLite::SQLite(), "C:/sqlite/northwind_large.sqlite")Melihat Daftar Entity/Tabel dalam Database
RSQLite::dbListTables(northwind)## [1] "Category" "Customer" "CustomerCustomerDemo"
## [4] "CustomerDemographic" "Employee" "EmployeeTerritory"
## [7] "Order" "OrderDetail" "Product"
## [10] "ProductDetails_V" "Region" "Shipper"
## [13] "Supplier" "Territory"
Memilih Entity dalam Database
product <- dplyr::tbl(northwind,"Product")
product## # Source: table<Product> [?? x 10]
## # Database: sqlite 3.36.0 [C:\sqlite\northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 ba~ 18
## 2 2 Chang 1 1 24 - 12 oz bottl~ 19
## 3 3 Aniseed Syrup 1 2 12 - 550 ml bott~ 10
## 4 4 Chef Anton's Cajun S~ 2 2 48 - 6 oz jars 22
## 5 5 Chef Anton's Gumbo M~ 2 2 36 boxes 21.4
## 6 6 Grandma's Boysenberr~ 3 2 12 - 8 oz jars 25
## 7 7 Uncle Bob's Organic ~ 3 7 12 - 1 lb pkgs. 30
## 8 8 Northwoods Cranberry~ 3 2 12 - 12 oz jars 40
## 9 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 10 10 Ikura 4 8 12 - 200 ml jars 31
## # ... with more rows, and 4 more variables: UnitsInStock <int>,
## # UnitsOnOrder <int>, ReorderLevel <int>, Discontinued <int>
Melihat Kelas dari Tabel
class(product)## [1] "tbl_SQLiteConnection" "tbl_dbi" "tbl_sql"
## [4] "tbl_lazy" "tbl"
Melakukan Wrangling pada Tabel
Sintaks berikut akan menyeleksi tabel dengan membuang kolom UnitsOnOrder dan ReorderLevel. Selanjutnya tabel difilter dan hanya menampilkan data dengan UnitPrice > 10.
query <- product %>% select(-UnitsOnOrder,-ReorderLevel) %>% filter(UnitPrice>10)
query ## # Source: lazy query [?? x 8]
## # Database: sqlite 3.36.0 [C:\sqlite\northwind_large.sqlite]
## Id ProductName SupplierId CategoryId QuantityPerUnit UnitPrice
## <int> <chr> <int> <int> <chr> <dbl>
## 1 1 Chai 1 1 10 boxes x 20 b~ 18
## 2 2 Chang 1 1 24 - 12 oz bott~ 19
## 3 4 Chef Anton's Cajun Se~ 2 2 48 - 6 oz jars 22
## 4 5 Chef Anton's Gumbo Mix 2 2 36 boxes 21.4
## 5 6 Grandma's Boysenberry~ 3 2 12 - 8 oz jars 25
## 6 7 Uncle Bob's Organic D~ 3 7 12 - 1 lb pkgs. 30
## 7 8 Northwoods Cranberry ~ 3 2 12 - 12 oz jars 40
## 8 9 Mishi Kobe Niku 4 6 18 - 500 g pkgs. 97
## 9 10 Ikura 4 8 12 - 200 ml jars 31
## 10 11 Queso Cabrales 5 4 1 kg pkg. 21
## # ... with more rows, and 2 more variables: UnitsInStock <int>,
## # Discontinued <int>
Melihat Query SQL untuk Menampilkan Tabel yang Telah Diubah
dplyr::show_query(query)## <SQL>
## SELECT *
## FROM (SELECT `Id`, `ProductName`, `SupplierId`, `CategoryId`, `QuantityPerUnit`, `UnitPrice`, `UnitsInStock`, `Discontinued`
## FROM `Product`)
## WHERE (`UnitPrice` > 10.0)
Data Wrangling
Dataset yang digunakan adalah esoph yang tersedia pada package default R datasets.
Memanggil Library Dataset
library(datasets)
esoph## agegp alcgp tobgp ncases ncontrols
## 1 25-34 0-39g/day 0-9g/day 0 40
## 2 25-34 0-39g/day 10-19 0 10
## 3 25-34 0-39g/day 20-29 0 6
## 4 25-34 0-39g/day 30+ 0 5
## 5 25-34 40-79 0-9g/day 0 27
## 6 25-34 40-79 10-19 0 7
## 7 25-34 40-79 20-29 0 4
## 8 25-34 40-79 30+ 0 7
## 9 25-34 80-119 0-9g/day 0 2
## 10 25-34 80-119 10-19 0 1
## 11 25-34 80-119 30+ 0 2
## 12 25-34 120+ 0-9g/day 0 1
## 13 25-34 120+ 10-19 1 1
## 14 25-34 120+ 20-29 0 1
## 15 25-34 120+ 30+ 0 2
## 16 35-44 0-39g/day 0-9g/day 0 60
## 17 35-44 0-39g/day 10-19 1 14
## 18 35-44 0-39g/day 20-29 0 7
## 19 35-44 0-39g/day 30+ 0 8
## 20 35-44 40-79 0-9g/day 0 35
## 21 35-44 40-79 10-19 3 23
## 22 35-44 40-79 20-29 1 14
## 23 35-44 40-79 30+ 0 8
## 24 35-44 80-119 0-9g/day 0 11
## 25 35-44 80-119 10-19 0 6
## 26 35-44 80-119 20-29 0 2
## 27 35-44 80-119 30+ 0 1
## 28 35-44 120+ 0-9g/day 2 3
## 29 35-44 120+ 10-19 0 3
## 30 35-44 120+ 20-29 2 4
## 31 45-54 0-39g/day 0-9g/day 1 46
## 32 45-54 0-39g/day 10-19 0 18
## 33 45-54 0-39g/day 20-29 0 10
## 34 45-54 0-39g/day 30+ 0 4
## 35 45-54 40-79 0-9g/day 6 38
## 36 45-54 40-79 10-19 4 21
## 37 45-54 40-79 20-29 5 15
## 38 45-54 40-79 30+ 5 7
## 39 45-54 80-119 0-9g/day 3 16
## 40 45-54 80-119 10-19 6 14
## 41 45-54 80-119 20-29 1 5
## 42 45-54 80-119 30+ 2 4
## 43 45-54 120+ 0-9g/day 4 4
## 44 45-54 120+ 10-19 3 4
## 45 45-54 120+ 20-29 2 3
## 46 45-54 120+ 30+ 4 4
## 47 55-64 0-39g/day 0-9g/day 2 49
## 48 55-64 0-39g/day 10-19 3 22
## 49 55-64 0-39g/day 20-29 3 12
## 50 55-64 0-39g/day 30+ 4 6
## 51 55-64 40-79 0-9g/day 9 40
## 52 55-64 40-79 10-19 6 21
## 53 55-64 40-79 20-29 4 17
## 54 55-64 40-79 30+ 3 6
## 55 55-64 80-119 0-9g/day 9 18
## 56 55-64 80-119 10-19 8 15
## 57 55-64 80-119 20-29 3 6
## 58 55-64 80-119 30+ 4 4
## 59 55-64 120+ 0-9g/day 5 10
## 60 55-64 120+ 10-19 6 7
## 61 55-64 120+ 20-29 2 3
## 62 55-64 120+ 30+ 5 6
## 63 65-74 0-39g/day 0-9g/day 5 48
## 64 65-74 0-39g/day 10-19 4 14
## 65 65-74 0-39g/day 20-29 2 7
## 66 65-74 0-39g/day 30+ 0 2
## 67 65-74 40-79 0-9g/day 17 34
## 68 65-74 40-79 10-19 3 10
## 69 65-74 40-79 20-29 5 9
## 70 65-74 80-119 0-9g/day 6 13
## 71 65-74 80-119 10-19 4 12
## 72 65-74 80-119 20-29 2 3
## 73 65-74 80-119 30+ 1 1
## 74 65-74 120+ 0-9g/day 3 4
## 75 65-74 120+ 10-19 1 2
## 76 65-74 120+ 20-29 1 1
## 77 65-74 120+ 30+ 1 1
## 78 75+ 0-39g/day 0-9g/day 1 18
## 79 75+ 0-39g/day 10-19 2 6
## 80 75+ 0-39g/day 30+ 1 3
## 81 75+ 40-79 0-9g/day 2 5
## 82 75+ 40-79 10-19 1 3
## 83 75+ 40-79 20-29 0 3
## 84 75+ 40-79 30+ 1 1
## 85 75+ 80-119 0-9g/day 1 1
## 86 75+ 80-119 10-19 1 1
## 87 75+ 120+ 0-9g/day 2 2
## 88 75+ 120+ 10-19 1 1
Melihat Informasi Spesifik Dataset
?trees
View(trees)Dataset esoph berasal dari studi kasus-kontrol dari (o) kanker kerongkongan di Ille-et-Vilaine, Prancis.
Dataset tersebut akan dilakukan perintah arrange , mutate ,filter,select, dan summarise.
Menyeleksi Data
esoph %>% select(-ncases)## agegp alcgp tobgp ncontrols
## 1 25-34 0-39g/day 0-9g/day 40
## 2 25-34 0-39g/day 10-19 10
## 3 25-34 0-39g/day 20-29 6
## 4 25-34 0-39g/day 30+ 5
## 5 25-34 40-79 0-9g/day 27
## 6 25-34 40-79 10-19 7
## 7 25-34 40-79 20-29 4
## 8 25-34 40-79 30+ 7
## 9 25-34 80-119 0-9g/day 2
## 10 25-34 80-119 10-19 1
## 11 25-34 80-119 30+ 2
## 12 25-34 120+ 0-9g/day 1
## 13 25-34 120+ 10-19 1
## 14 25-34 120+ 20-29 1
## 15 25-34 120+ 30+ 2
## 16 35-44 0-39g/day 0-9g/day 60
## 17 35-44 0-39g/day 10-19 14
## 18 35-44 0-39g/day 20-29 7
## 19 35-44 0-39g/day 30+ 8
## 20 35-44 40-79 0-9g/day 35
## 21 35-44 40-79 10-19 23
## 22 35-44 40-79 20-29 14
## 23 35-44 40-79 30+ 8
## 24 35-44 80-119 0-9g/day 11
## 25 35-44 80-119 10-19 6
## 26 35-44 80-119 20-29 2
## 27 35-44 80-119 30+ 1
## 28 35-44 120+ 0-9g/day 3
## 29 35-44 120+ 10-19 3
## 30 35-44 120+ 20-29 4
## 31 45-54 0-39g/day 0-9g/day 46
## 32 45-54 0-39g/day 10-19 18
## 33 45-54 0-39g/day 20-29 10
## 34 45-54 0-39g/day 30+ 4
## 35 45-54 40-79 0-9g/day 38
## 36 45-54 40-79 10-19 21
## 37 45-54 40-79 20-29 15
## 38 45-54 40-79 30+ 7
## 39 45-54 80-119 0-9g/day 16
## 40 45-54 80-119 10-19 14
## 41 45-54 80-119 20-29 5
## 42 45-54 80-119 30+ 4
## 43 45-54 120+ 0-9g/day 4
## 44 45-54 120+ 10-19 4
## 45 45-54 120+ 20-29 3
## 46 45-54 120+ 30+ 4
## 47 55-64 0-39g/day 0-9g/day 49
## 48 55-64 0-39g/day 10-19 22
## 49 55-64 0-39g/day 20-29 12
## 50 55-64 0-39g/day 30+ 6
## 51 55-64 40-79 0-9g/day 40
## 52 55-64 40-79 10-19 21
## 53 55-64 40-79 20-29 17
## 54 55-64 40-79 30+ 6
## 55 55-64 80-119 0-9g/day 18
## 56 55-64 80-119 10-19 15
## 57 55-64 80-119 20-29 6
## 58 55-64 80-119 30+ 4
## 59 55-64 120+ 0-9g/day 10
## 60 55-64 120+ 10-19 7
## 61 55-64 120+ 20-29 3
## 62 55-64 120+ 30+ 6
## 63 65-74 0-39g/day 0-9g/day 48
## 64 65-74 0-39g/day 10-19 14
## 65 65-74 0-39g/day 20-29 7
## 66 65-74 0-39g/day 30+ 2
## 67 65-74 40-79 0-9g/day 34
## 68 65-74 40-79 10-19 10
## 69 65-74 40-79 20-29 9
## 70 65-74 80-119 0-9g/day 13
## 71 65-74 80-119 10-19 12
## 72 65-74 80-119 20-29 3
## 73 65-74 80-119 30+ 1
## 74 65-74 120+ 0-9g/day 4
## 75 65-74 120+ 10-19 2
## 76 65-74 120+ 20-29 1
## 77 65-74 120+ 30+ 1
## 78 75+ 0-39g/day 0-9g/day 18
## 79 75+ 0-39g/day 10-19 6
## 80 75+ 0-39g/day 30+ 3
## 81 75+ 40-79 0-9g/day 5
## 82 75+ 40-79 10-19 3
## 83 75+ 40-79 20-29 3
## 84 75+ 40-79 30+ 1
## 85 75+ 80-119 0-9g/day 1
## 86 75+ 80-119 10-19 1
## 87 75+ 120+ 0-9g/day 2
## 88 75+ 120+ 10-19 1
Mengurutkan Data
Sintaks dibawah akan mengurutkan data secara descending (besar ke kecil)
esoph %>% arrange(desc(ncontrols))## agegp alcgp tobgp ncases ncontrols
## 1 35-44 0-39g/day 0-9g/day 0 60
## 2 55-64 0-39g/day 0-9g/day 2 49
## 3 65-74 0-39g/day 0-9g/day 5 48
## 4 45-54 0-39g/day 0-9g/day 1 46
## 5 25-34 0-39g/day 0-9g/day 0 40
## 6 55-64 40-79 0-9g/day 9 40
## 7 45-54 40-79 0-9g/day 6 38
## 8 35-44 40-79 0-9g/day 0 35
## 9 65-74 40-79 0-9g/day 17 34
## 10 25-34 40-79 0-9g/day 0 27
## 11 35-44 40-79 10-19 3 23
## 12 55-64 0-39g/day 10-19 3 22
## 13 45-54 40-79 10-19 4 21
## 14 55-64 40-79 10-19 6 21
## 15 45-54 0-39g/day 10-19 0 18
## 16 55-64 80-119 0-9g/day 9 18
## 17 75+ 0-39g/day 0-9g/day 1 18
## 18 55-64 40-79 20-29 4 17
## 19 45-54 80-119 0-9g/day 3 16
## 20 45-54 40-79 20-29 5 15
## 21 55-64 80-119 10-19 8 15
## 22 35-44 0-39g/day 10-19 1 14
## 23 35-44 40-79 20-29 1 14
## 24 45-54 80-119 10-19 6 14
## 25 65-74 0-39g/day 10-19 4 14
## 26 65-74 80-119 0-9g/day 6 13
## 27 55-64 0-39g/day 20-29 3 12
## 28 65-74 80-119 10-19 4 12
## 29 35-44 80-119 0-9g/day 0 11
## 30 25-34 0-39g/day 10-19 0 10
## 31 45-54 0-39g/day 20-29 0 10
## 32 55-64 120+ 0-9g/day 5 10
## 33 65-74 40-79 10-19 3 10
## 34 65-74 40-79 20-29 5 9
## 35 35-44 0-39g/day 30+ 0 8
## 36 35-44 40-79 30+ 0 8
## 37 25-34 40-79 10-19 0 7
## 38 25-34 40-79 30+ 0 7
## 39 35-44 0-39g/day 20-29 0 7
## 40 45-54 40-79 30+ 5 7
## 41 55-64 120+ 10-19 6 7
## 42 65-74 0-39g/day 20-29 2 7
## 43 25-34 0-39g/day 20-29 0 6
## 44 35-44 80-119 10-19 0 6
## 45 55-64 0-39g/day 30+ 4 6
## 46 55-64 40-79 30+ 3 6
## 47 55-64 80-119 20-29 3 6
## 48 55-64 120+ 30+ 5 6
## 49 75+ 0-39g/day 10-19 2 6
## 50 25-34 0-39g/day 30+ 0 5
## 51 45-54 80-119 20-29 1 5
## 52 75+ 40-79 0-9g/day 2 5
## 53 25-34 40-79 20-29 0 4
## 54 35-44 120+ 20-29 2 4
## 55 45-54 0-39g/day 30+ 0 4
## 56 45-54 80-119 30+ 2 4
## 57 45-54 120+ 0-9g/day 4 4
## 58 45-54 120+ 10-19 3 4
## 59 45-54 120+ 30+ 4 4
## 60 55-64 80-119 30+ 4 4
## 61 65-74 120+ 0-9g/day 3 4
## 62 35-44 120+ 0-9g/day 2 3
## 63 35-44 120+ 10-19 0 3
## 64 45-54 120+ 20-29 2 3
## 65 55-64 120+ 20-29 2 3
## 66 65-74 80-119 20-29 2 3
## 67 75+ 0-39g/day 30+ 1 3
## 68 75+ 40-79 10-19 1 3
## 69 75+ 40-79 20-29 0 3
## 70 25-34 80-119 0-9g/day 0 2
## 71 25-34 80-119 30+ 0 2
## 72 25-34 120+ 30+ 0 2
## 73 35-44 80-119 20-29 0 2
## 74 65-74 0-39g/day 30+ 0 2
## 75 65-74 120+ 10-19 1 2
## 76 75+ 120+ 0-9g/day 2 2
## 77 25-34 80-119 10-19 0 1
## 78 25-34 120+ 0-9g/day 0 1
## 79 25-34 120+ 10-19 1 1
## 80 25-34 120+ 20-29 0 1
## 81 35-44 80-119 30+ 0 1
## 82 65-74 80-119 30+ 1 1
## 83 65-74 120+ 20-29 1 1
## 84 65-74 120+ 30+ 1 1
## 85 75+ 40-79 30+ 1 1
## 86 75+ 80-119 0-9g/day 1 1
## 87 75+ 80-119 10-19 1 1
## 88 75+ 120+ 10-19 1 1
Melakukan Filter
Sintaks di bawah akan menampilkan data dengan kriteria tertentu .Data yang ditampilkan hanya data yang memiliki record 25-34pada kolom agegp
esoph %>% filter(agegp =="25-34")## agegp alcgp tobgp ncases ncontrols
## 1 25-34 0-39g/day 0-9g/day 0 40
## 2 25-34 0-39g/day 10-19 0 10
## 3 25-34 0-39g/day 20-29 0 6
## 4 25-34 0-39g/day 30+ 0 5
## 5 25-34 40-79 0-9g/day 0 27
## 6 25-34 40-79 10-19 0 7
## 7 25-34 40-79 20-29 0 4
## 8 25-34 40-79 30+ 0 7
## 9 25-34 80-119 0-9g/day 0 2
## 10 25-34 80-119 10-19 0 1
## 11 25-34 80-119 30+ 0 2
## 12 25-34 120+ 0-9g/day 0 1
## 13 25-34 120+ 10-19 1 1
## 14 25-34 120+ 20-29 0 1
## 15 25-34 120+ 30+ 0 2
Menambah Variabel Baru dengan fungsi mutate()
esoph %>% mutate(ntotal=ncases+ncontrols)## agegp alcgp tobgp ncases ncontrols ntotal
## 1 25-34 0-39g/day 0-9g/day 0 40 40
## 2 25-34 0-39g/day 10-19 0 10 10
## 3 25-34 0-39g/day 20-29 0 6 6
## 4 25-34 0-39g/day 30+ 0 5 5
## 5 25-34 40-79 0-9g/day 0 27 27
## 6 25-34 40-79 10-19 0 7 7
## 7 25-34 40-79 20-29 0 4 4
## 8 25-34 40-79 30+ 0 7 7
## 9 25-34 80-119 0-9g/day 0 2 2
## 10 25-34 80-119 10-19 0 1 1
## 11 25-34 80-119 30+ 0 2 2
## 12 25-34 120+ 0-9g/day 0 1 1
## 13 25-34 120+ 10-19 1 1 2
## 14 25-34 120+ 20-29 0 1 1
## 15 25-34 120+ 30+ 0 2 2
## 16 35-44 0-39g/day 0-9g/day 0 60 60
## 17 35-44 0-39g/day 10-19 1 14 15
## 18 35-44 0-39g/day 20-29 0 7 7
## 19 35-44 0-39g/day 30+ 0 8 8
## 20 35-44 40-79 0-9g/day 0 35 35
## 21 35-44 40-79 10-19 3 23 26
## 22 35-44 40-79 20-29 1 14 15
## 23 35-44 40-79 30+ 0 8 8
## 24 35-44 80-119 0-9g/day 0 11 11
## 25 35-44 80-119 10-19 0 6 6
## 26 35-44 80-119 20-29 0 2 2
## 27 35-44 80-119 30+ 0 1 1
## 28 35-44 120+ 0-9g/day 2 3 5
## 29 35-44 120+ 10-19 0 3 3
## 30 35-44 120+ 20-29 2 4 6
## 31 45-54 0-39g/day 0-9g/day 1 46 47
## 32 45-54 0-39g/day 10-19 0 18 18
## 33 45-54 0-39g/day 20-29 0 10 10
## 34 45-54 0-39g/day 30+ 0 4 4
## 35 45-54 40-79 0-9g/day 6 38 44
## 36 45-54 40-79 10-19 4 21 25
## 37 45-54 40-79 20-29 5 15 20
## 38 45-54 40-79 30+ 5 7 12
## 39 45-54 80-119 0-9g/day 3 16 19
## 40 45-54 80-119 10-19 6 14 20
## 41 45-54 80-119 20-29 1 5 6
## 42 45-54 80-119 30+ 2 4 6
## 43 45-54 120+ 0-9g/day 4 4 8
## 44 45-54 120+ 10-19 3 4 7
## 45 45-54 120+ 20-29 2 3 5
## 46 45-54 120+ 30+ 4 4 8
## 47 55-64 0-39g/day 0-9g/day 2 49 51
## 48 55-64 0-39g/day 10-19 3 22 25
## 49 55-64 0-39g/day 20-29 3 12 15
## 50 55-64 0-39g/day 30+ 4 6 10
## 51 55-64 40-79 0-9g/day 9 40 49
## 52 55-64 40-79 10-19 6 21 27
## 53 55-64 40-79 20-29 4 17 21
## 54 55-64 40-79 30+ 3 6 9
## 55 55-64 80-119 0-9g/day 9 18 27
## 56 55-64 80-119 10-19 8 15 23
## 57 55-64 80-119 20-29 3 6 9
## 58 55-64 80-119 30+ 4 4 8
## 59 55-64 120+ 0-9g/day 5 10 15
## 60 55-64 120+ 10-19 6 7 13
## 61 55-64 120+ 20-29 2 3 5
## 62 55-64 120+ 30+ 5 6 11
## 63 65-74 0-39g/day 0-9g/day 5 48 53
## 64 65-74 0-39g/day 10-19 4 14 18
## 65 65-74 0-39g/day 20-29 2 7 9
## 66 65-74 0-39g/day 30+ 0 2 2
## 67 65-74 40-79 0-9g/day 17 34 51
## 68 65-74 40-79 10-19 3 10 13
## 69 65-74 40-79 20-29 5 9 14
## 70 65-74 80-119 0-9g/day 6 13 19
## 71 65-74 80-119 10-19 4 12 16
## 72 65-74 80-119 20-29 2 3 5
## 73 65-74 80-119 30+ 1 1 2
## 74 65-74 120+ 0-9g/day 3 4 7
## 75 65-74 120+ 10-19 1 2 3
## 76 65-74 120+ 20-29 1 1 2
## 77 65-74 120+ 30+ 1 1 2
## 78 75+ 0-39g/day 0-9g/day 1 18 19
## 79 75+ 0-39g/day 10-19 2 6 8
## 80 75+ 0-39g/day 30+ 1 3 4
## 81 75+ 40-79 0-9g/day 2 5 7
## 82 75+ 40-79 10-19 1 3 4
## 83 75+ 40-79 20-29 0 3 3
## 84 75+ 40-79 30+ 1 1 2
## 85 75+ 80-119 0-9g/day 1 1 2
## 86 75+ 80-119 10-19 1 1 2
## 87 75+ 120+ 0-9g/day 2 2 4
## 88 75+ 120+ 10-19 1 1 2
Menggunakan fungsi summarise()
Sintaks di bawah akan menampilkan rata-rata dari peubah ncontrols pada tiap kategori yang berbeda pada kolom agegp
esoph %>% group_by(agegp) %>% summarise(mean=mean(ncontrols))## # A tibble: 6 x 2
## agegp mean
## <ord> <dbl>
## 1 25-34 7.73
## 2 35-44 13.3
## 3 45-54 13.3
## 4 55-64 15.1
## 5 65-74 10.7
## 6 75+ 4
Melakukan kelima perintah bersamaan
esoph %>% select(agegp,ncontrols,ncases)%>% arrange(desc(ncontrols))%>% filter(agegp =="25-34") %>% mutate(ntotal=ncontrols + ncases) %>% group_by(agegp) %>% summarise(mean=mean(ncontrols))## # A tibble: 1 x 2
## agegp mean
## <ord> <dbl>
## 1 25-34 7.73