Mempersiapkan package (pastikan package sudah terinstall dengan command <install.packages(“nama_package”)>):
library(RSQLite)
library(DBI)
Connect database
connect <- dbConnect(SQLite(), "/Users/User/Documents/DBFiles/sqlite-sakila.db")
Beberapa kasus syntax disimpan ke dalam object baru bertipe character (tanpa diakhiri tanda “;”) jika terlalu panjang. Format menyusun syntax dan mengambil database di R bisa dengan: <dbGetQuery(connector, syntax)>
Merupakan query paling dasar dalam SQL, untuk menyeleksi data. Dalam hal ini, dilakukan seleksi semua baris dan kolom dalam tabel payment.
payment <- dbGetQuery(connect, "select * from payment")
nrow(payment)
## [1] 16049
#menampilkan sebagian data teratas
head(payment, 20)
## payment_id customer_id staff_id rental_id amount payment_date
## 1 1 1 1 76 2.99 2005-05-25 11:30:37.000
## 2 2 1 1 573 0.99 2005-05-28 10:35:23.000
## 3 3 1 1 1185 5.99 2005-06-15 00:54:12.000
## 4 4 1 2 1422 0.99 2005-06-15 18:02:53.000
## 5 5 1 2 1476 9.99 2005-06-15 21:08:46.000
## 6 6 1 1 1725 4.99 2005-06-16 15:18:57.000
## 7 7 1 1 2308 4.99 2005-06-18 08:41:48.000
## 8 8 1 2 2363 0.99 2005-06-18 13:33:59.000
## 9 9 1 1 3284 3.99 2005-06-21 06:24:45.000
## 10 10 1 2 4526 5.99 2005-07-08 03:17:05.000
## 11 11 1 1 4611 5.99 2005-07-08 07:33:56.000
## 12 12 1 1 5244 4.99 2005-07-09 13:24:07.000
## 13 13 1 1 5326 4.99 2005-07-09 16:38:01.000
## 14 14 1 1 6163 7.99 2005-07-11 10:13:46.000
## 15 15 1 2 7273 2.99 2005-07-27 11:31:22.000
## 16 16 1 1 7841 4.99 2005-07-28 09:04:45.000
## 17 17 1 2 8033 4.99 2005-07-28 16:18:23.000
## 18 18 1 1 8074 0.99 2005-07-28 17:33:39.000
## 19 19 1 2 8116 0.99 2005-07-28 19:20:07.000
## 20 20 1 2 8326 2.99 2005-07-29 03:58:49.000
## last_update
## 1 2021-03-06 15:55:57
## 2 2021-03-06 15:55:57
## 3 2021-03-06 15:55:57
## 4 2021-03-06 15:55:57
## 5 2021-03-06 15:55:57
## 6 2021-03-06 15:55:57
## 7 2021-03-06 15:55:57
## 8 2021-03-06 15:55:57
## 9 2021-03-06 15:55:57
## 10 2021-03-06 15:55:57
## 11 2021-03-06 15:55:57
## 12 2021-03-06 15:55:57
## 13 2021-03-06 15:55:57
## 14 2021-03-06 15:55:57
## 15 2021-03-06 15:55:57
## 16 2021-03-06 15:55:57
## 17 2021-03-06 15:55:57
## 18 2021-03-06 15:55:57
## 19 2021-03-06 15:55:57
## 20 2021-03-06 15:55:57
Digunakan untuk mengurutkan data berdasarkan kriteria kolom tertentu. Dalam hal ini, dilakukan seleksi beberapa kolom dalam tabel payment (Payment ID, Rental ID, Amount, Payment Date) dan mengurutkannya berdasarkan transaksi dengan amount (nominal) terbesar.
orderby1 <- "select payment_id
, rental_id
, amount
, payment_date
from payment
order by amount desc"
biggestAmount.Payment <- dbGetQuery(connect, orderby1)
nrow(biggestAmount.Payment)
## [1] 16049
head(biggestAmount.Payment, 20)
## payment_id rental_id amount payment_date
## 1 342 8831 11.99 2005-07-29 22:37:41.000
## 2 3146 14763 11.99 2005-08-21 23:34:00.000
## 3 5280 16040 11.99 2005-08-23 22:19:33.000
## 4 5281 106 11.99 2005-05-25 18:18:19.000
## 5 5550 15415 11.99 2005-08-22 23:48:56.000
## 6 6409 11479 11.99 2005-08-02 22:18:13.000
## 7 8272 2166 11.99 2005-06-17 23:51:21.000
## 8 9803 14759 11.99 2005-08-21 23:28:58.000
## 9 15821 4383 11.99 2005-07-07 20:45:51.000
## 10 15850 3973 11.99 2005-07-06 22:58:31.000
## 11 44 9236 10.99 2005-07-30 13:47:43.000
## 12 69 7503 10.99 2005-07-27 20:23:12.000
## 13 324 10392 10.99 2005-08-01 06:50:26.000
## 14 551 3212 10.99 2005-06-21 01:04:35.000
## 15 793 5442 10.99 2005-07-09 21:55:19.000
## 16 908 1301 10.99 2005-06-15 09:46:33.000
## 17 1254 15812 10.99 2005-08-23 14:47:26.000
## 18 1347 8553 10.99 2005-07-29 11:15:36.000
## 19 1379 9691 10.99 2005-07-31 07:09:55.000
## 20 1511 15226 10.99 2005-08-22 17:20:17.000
Diperoleh bahwa transaksi penyewaan DVD terbesar adalah dengan nominal $11.99.
Mengambil data berbeda dalam tabel:
# Ada 200 baris dalam tabel actor
actor1 <- dbGetQuery(connect, "select * from actor")
nrow(actor1)
## [1] 200
head(actor1, 20)
## actor_id first_name last_name last_update
## 1 1 PENELOPE GUINESS 2021-03-06 15:51:59
## 2 2 NICK WAHLBERG 2021-03-06 15:51:59
## 3 3 ED CHASE 2021-03-06 15:51:59
## 4 4 JENNIFER DAVIS 2021-03-06 15:51:59
## 5 5 JOHNNY LOLLOBRIGIDA 2021-03-06 15:51:59
## 6 6 BETTE NICHOLSON 2021-03-06 15:51:59
## 7 7 GRACE MOSTEL 2021-03-06 15:51:59
## 8 8 MATTHEW JOHANSSON 2021-03-06 15:51:59
## 9 9 JOE SWANK 2021-03-06 15:51:59
## 10 10 CHRISTIAN GABLE 2021-03-06 15:51:59
## 11 11 ZERO CAGE 2021-03-06 15:51:59
## 12 12 KARL BERRY 2021-03-06 15:51:59
## 13 13 UMA WOOD 2021-03-06 15:51:59
## 14 14 VIVIEN BERGEN 2021-03-06 15:51:59
## 15 15 CUBA OLIVIER 2021-03-06 15:51:59
## 16 16 FRED COSTNER 2021-03-06 15:51:59
## 17 17 HELEN VOIGHT 2021-03-06 15:51:59
## 18 18 DAN TORN 2021-03-06 15:51:59
## 19 19 BOB FAWCETT 2021-03-06 15:51:59
## 20 20 LUCILLE TRACY 2021-03-06 15:51:59
# Mengambil jumlah actor dengan nama belakang berbeda
selectdistinct1 <- "select count(distinct last_name)
from actor
where last_name is not null"
diffLastName.Actor <- dbGetQuery(connect, selectdistinct1)
diffLastName.Actor
## count(distinct last_name)
## 1 121
Ada 121 aktor dengan nama belakang berbeda dari 200 aktor film yang terdaftar.
Digunakan untuk menyeleksi kolom/baris dengan kriteria/kekhususan tertentu. Dalam hal ini dilakukan seleksi customer mana yang tidak aktif (active=0).
active0.Customer <- dbGetQuery(connect, "select * from customer where active=0")
active0.Customer
## customer_id store_id first_name last_name
## 1 16 2 SANDRA MARTIN
## 2 64 2 JUDITH COX
## 3 124 1 SHEILA WELLS
## 4 169 2 ERICA MATTHEWS
## 5 241 2 HEIDI LARSON
## 6 271 1 PENNY NEAL
## 7 315 2 KENNETH GOODEN
## 8 368 1 HARRY ARCE
## 9 406 1 NATHAN RUNYON
## 10 446 2 THEODORE CULP
## 11 482 1 MAURICE CRAWLEY
## 12 510 2 BEN EASTER
## 13 534 1 CHRISTIAN JUNG
## 14 558 1 JIMMIE EGGLESTON
## 15 592 1 TERRANCE ROUSH
## email address_id active
## 1 SANDRA.MARTIN@sakilacustomer.org 20 0
## 2 JUDITH.COX@sakilacustomer.org 68 0
## 3 SHEILA.WELLS@sakilacustomer.org 128 0
## 4 ERICA.MATTHEWS@sakilacustomer.org 173 0
## 5 HEIDI.LARSON@sakilacustomer.org 245 0
## 6 PENNY.NEAL@sakilacustomer.org 276 0
## 7 KENNETH.GOODEN@sakilacustomer.org 320 0
## 8 HARRY.ARCE@sakilacustomer.org 373 0
## 9 NATHAN.RUNYON@sakilacustomer.org 411 0
## 10 THEODORE.CULP@sakilacustomer.org 451 0
## 11 MAURICE.CRAWLEY@sakilacustomer.org 487 0
## 12 BEN.EASTER@sakilacustomer.org 515 0
## 13 CHRISTIAN.JUNG@sakilacustomer.org 540 0
## 14 JIMMIE.EGGLESTON@sakilacustomer.org 564 0
## 15 TERRANCE.ROUSH@sakilacustomer.org 598 0
## create_date last_update
## 1 2006-02-14 22:04:36.000 2021-03-06 15:53:36
## 2 2006-02-14 22:04:36.000 2021-03-06 15:53:36
## 3 2006-02-14 22:04:36.000 2021-03-06 15:53:37
## 4 2006-02-14 22:04:36.000 2021-03-06 15:53:37
## 5 2006-02-14 22:04:36.000 2021-03-06 15:53:38
## 6 2006-02-14 22:04:36.000 2021-03-06 15:53:38
## 7 2006-02-14 22:04:37.000 2021-03-06 15:53:38
## 8 2006-02-14 22:04:37.000 2021-03-06 15:53:39
## 9 2006-02-14 22:04:37.000 2021-03-06 15:53:39
## 10 2006-02-14 22:04:37.000 2021-03-06 15:53:39
## 11 2006-02-14 22:04:37.000 2021-03-06 15:53:40
## 12 2006-02-14 22:04:37.000 2021-03-06 15:53:40
## 13 2006-02-14 22:04:37.000 2021-03-06 15:53:40
## 14 2006-02-14 22:04:37.000 2021-03-06 15:53:40
## 15 2006-02-14 22:04:37.000 2021-03-06 15:53:41
Ada 15 record customer yang tidak aktif.
Digunakan untuk mengurutkan data dengan 10 transaksi yg memiliki nominal terbesar.
orderby.limit10 <- "select payment_id
, rental_id
, amount
, payment_date
from payment
order by amount desc
limit 10"
best10Amount.Payment <- dbGetQuery(connect, orderby.limit10)
best10Amount.Payment
## payment_id rental_id amount payment_date
## 1 342 8831 11.99 2005-07-29 22:37:41.000
## 2 3146 14763 11.99 2005-08-21 23:34:00.000
## 3 5280 16040 11.99 2005-08-23 22:19:33.000
## 4 5281 106 11.99 2005-05-25 18:18:19.000
## 5 5550 15415 11.99 2005-08-22 23:48:56.000
## 6 6409 11479 11.99 2005-08-02 22:18:13.000
## 7 8272 2166 11.99 2005-06-17 23:51:21.000
## 8 9803 14759 11.99 2005-08-21 23:28:58.000
## 9 15821 4383 11.99 2005-07-07 20:45:51.000
## 10 15850 3973 11.99 2005-07-06 22:58:31.000
Digunakan sebagai pengkhususan clausa WHERE ketika meminta berdasarkan suatu nilai dengan batas bawah dan atas tertentu. Misalkan ingin menyeleksi transaksi yang bernilai $5-$10 diurutkan dari yang terbesar.
between510 <- "select payment_id
, rental_id
, amount
, payment_date
from payment
where amount
between 5 and 10
order by amount desc"
between510Amount.Payment <- dbGetQuery(connect, between510)
nrow(between510Amount.Payment)
## [1] 3843
head(between510Amount.Payment, 20)
## payment_id rental_id amount payment_date
## 1 5 1476 9.99 2005-06-15 21:08:46.000
## 2 137 12145 9.99 2005-08-18 00:10:04.000
## 3 224 11232 9.99 2005-08-02 13:04:12.000
## 4 292 8950 9.99 2005-07-30 03:17:13.000
## 5 343 9260 9.99 2005-07-30 14:38:22.000
## 6 355 14252 9.99 2005-08-21 05:44:07.000
## 7 359 346 9.99 2005-05-27 04:34:41.000
## 8 491 110 9.99 2005-05-25 18:43:49.000
## 9 503 10077 9.99 2005-07-31 20:01:06.000
## 10 507 12211 9.99 2005-08-18 02:31:18.000
## 11 572 13699 9.99 2005-08-20 09:26:14.000
## 12 593 13060 9.99 2005-08-19 09:43:25.000
## 13 619 9255 9.99 2005-07-30 14:26:46.000
## 14 689 2093 9.99 2005-06-17 18:14:08.000
## 15 764 7580 9.99 2005-07-27 23:07:40.000
## 16 826 5108 9.99 2005-07-09 06:44:30.000
## 17 866 4738 9.99 2005-07-08 13:24:58.000
## 18 889 5307 9.99 2005-07-09 15:57:15.000
## 19 902 12414 9.99 2005-08-18 09:50:40.000
## 20 1009 14647 9.99 2005-08-21 19:15:33.000
GROUP BY dipakai untuk menyeleksi gabungan baris dengan nilai yang sama dalam daftar bidang tertentu menjadi satu baris Sedangkan HAVING digunakan untuk menggantikan WHERE disertai dengan fungsi agregat, karena WHERE tidak bisa digunakan bersamaan dengan fungsi agregat. Klausa HAVING diletakkan setelah GROUP BY. Beberapa fungsi agregat yang menyertai HAVING antara lain: SUM(), COUNT(), VARIANCE(), dan lain-lain. Misalkan ingin menyeleksi total pembayaran yang terbesar dan dilakukan lebih dari 30 kali transaksi berdasarkan masing-masing customer:
groupby.having <- "select customer_id
, count(customer_id)
as frequency, sum(amount)
as total_amount
from payment
group by customer_id
having frequency > 30
order by frequency desc"
more30Transactions.Customer <- dbGetQuery(connect, groupby.having)
nrow(more30Transactions.Customer)
## [1] 134
head(more30Transactions.Customer, 20)
## customer_id frequency total_amount
## 1 148 46 216.54
## 2 526 45 221.55
## 3 144 42 195.58
## 4 236 42 175.58
## 5 75 41 155.59
## 6 197 40 154.60
## 7 469 40 177.60
## 8 137 39 194.61
## 9 178 39 194.61
## 10 468 39 175.61
## 11 5 38 144.62
## 12 295 38 162.62
## 13 410 38 167.62
## 14 459 38 186.62
## 15 176 37 173.63
## 16 198 37 141.63
## 17 257 37 151.63
## 18 366 37 152.63
## 19 29 36 140.64
## 20 267 36 159.64
Ada beberapa cara untuk membatasi banyak return record dari sebuah query, hal ini dimaksudkan untuk melihat keluaran teratas/gambaran sebagian dari keseluruhan data yang dipanggil agar tidak memenuhi output page. Beberapa diantaranya adalah LIMIT dan OFFSET. LIMIT membatasi pada record dengan jumlah tertentu yang ditampilkan, sedangkan OFFSET menampilkan data dengan melewatkan record teratas sebanyak yang ditentukan. Misalkan dalam contoh ini ingin ditampilkan biodata customer yang meliputi id, nama lengkap yang merupakan gabungan first_name dan last_name, email, kode pos, kota, dan negara yang dibatasi 20 record dengan offset 10. Sekedar catatan, offset bisa lebih besar dari nilai argumen limit.
# Jika offset < limit
query1 <- "select c.customer_id
, c.first_name || ' ' || c.last_name as name
, c.email
, a.postal_code
, t.city
, o.country
from customer as c
join address as a on a.address_id=c.address_id
join city as t on t.city_id=a.city_id
join country as o on o.country_id=t.country_id
limit 20
offset 10"
customerProfile1 <- dbGetQuery(connect, query1)
customerProfile1
## customer_id name email postal_code
## 1 11 LISA ANDERSON LISA.ANDERSON@sakilacustomer.org 1027
## 2 12 NANCY THOMAS NANCY.THOMAS@sakilacustomer.org 10672
## 3 13 KAREN JACKSON KAREN.JACKSON@sakilacustomer.org 29610
## 4 14 BETTY WHITE BETTY.WHITE@sakilacustomer.org 16266
## 5 15 HELEN HARRIS HELEN.HARRIS@sakilacustomer.org 72878
## 6 16 SANDRA MARTIN SANDRA.MARTIN@sakilacustomer.org 54308
## 7 17 DONNA THOMPSON DONNA.THOMPSON@sakilacustomer.org 81766
## 8 18 CAROL GARCIA CAROL.GARCIA@sakilacustomer.org 43331
## 9 19 RUTH MARTINEZ RUTH.MARTINEZ@sakilacustomer.org 72192
## 10 20 SHARON ROBINSON SHARON.ROBINSON@sakilacustomer.org 21954
## 11 21 MICHELLE CLARK MICHELLE.CLARK@sakilacustomer.org 34418
## 12 22 LAURA RODRIGUEZ LAURA.RODRIGUEZ@sakilacustomer.org 37551
## 13 23 SARAH LEWIS SARAH.LEWIS@sakilacustomer.org 7716
## 14 24 KIMBERLY LEE KIMBERLY.LEE@sakilacustomer.org 99865
## 15 25 DEBORAH WALKER DEBORAH.WALKER@sakilacustomer.org 99780
## 16 26 JESSICA HALL JESSICA.HALL@sakilacustomer.org 58327
## 17 27 SHIRLEY ALLEN SHIRLEY.ALLEN@sakilacustomer.org 49521
## 18 28 CYNTHIA YOUNG CYNTHIA.YOUNG@sakilacustomer.org 65599
## 19 29 ANGELA HERNANDEZ ANGELA.HERNANDEZ@sakilacustomer.org 65750
## 20 30 MELISSA KING MELISSA.KING@sakilacustomer.org 50199
## city country
## 1 Sagamihara Japan
## 2 Yamuna Nagar India
## 3 Osmaniye Turkey
## 4 Citrus Heights United States
## 5 Bhopal India
## 6 Southend-on-Sea United Kingdom
## 7 Elista Russian Federation
## 8 Kaduna Nigeria
## 9 Kimberley South Africa
## 10 Mardan Pakistan
## 11 Tangail Bangladesh
## 12 Sal Morocco
## 13 Liepaja Latvia
## 14 Crdoba Argentina
## 15 Shikarpur Pakistan
## 16 Citt del Vaticano Holy See (Vatican City State)
## 17 Davao Philippines
## 18 Munger (Monghyr) India
## 19 Shimonoseki Japan
## 20 Lungtan Taiwan
# Jika offset > limit
query2 <- "select c.customer_id
, c.first_name || ' ' || c.last_name as name
, c.email
, a.postal_code
, t.city
, o.country
from customer as c
join address as a on a.address_id=c.address_id
join city as t on t.city_id=a.city_id
join country as o on o.country_id=t.country_id
limit 20
offset 30"
customerProfile2 <- dbGetQuery(connect, query2)
customerProfile2
## customer_id name email
## 1 31 BRENDA WRIGHT BRENDA.WRIGHT@sakilacustomer.org
## 2 32 AMY LOPEZ AMY.LOPEZ@sakilacustomer.org
## 3 33 ANNA HILL ANNA.HILL@sakilacustomer.org
## 4 34 REBECCA SCOTT REBECCA.SCOTT@sakilacustomer.org
## 5 35 VIRGINIA GREEN VIRGINIA.GREEN@sakilacustomer.org
## 6 36 KATHLEEN ADAMS KATHLEEN.ADAMS@sakilacustomer.org
## 7 37 PAMELA BAKER PAMELA.BAKER@sakilacustomer.org
## 8 38 MARTHA GONZALEZ MARTHA.GONZALEZ@sakilacustomer.org
## 9 39 DEBRA NELSON DEBRA.NELSON@sakilacustomer.org
## 10 40 AMANDA CARTER AMANDA.CARTER@sakilacustomer.org
## 11 41 STEPHANIE MITCHELL STEPHANIE.MITCHELL@sakilacustomer.org
## 12 42 CAROLYN PEREZ CAROLYN.PEREZ@sakilacustomer.org
## 13 43 CHRISTINE ROBERTS CHRISTINE.ROBERTS@sakilacustomer.org
## 14 44 MARIE TURNER MARIE.TURNER@sakilacustomer.org
## 15 45 JANET PHILLIPS JANET.PHILLIPS@sakilacustomer.org
## 16 46 CATHERINE CAMPBELL CATHERINE.CAMPBELL@sakilacustomer.org
## 17 47 FRANCES PARKER FRANCES.PARKER@sakilacustomer.org
## 18 48 ANN EVANS ANN.EVANS@sakilacustomer.org
## 19 49 JOYCE EDWARDS JOYCE.EDWARDS@sakilacustomer.org
## 20 50 DIANE COLLINS DIANE.COLLINS@sakilacustomer.org
## postal_code city country
## 1 30348 Kamarhati India
## 2 65213 Jhansi India
## 3 79388 Alessandria Italy
## 4 94065 Kurashiki Japan
## 5 34021 Toulouse France
## 6 38145 Arak Iran
## 7 47929 Nanyang China
## 8 34689 Chisinau Moldova
## 9 3989 Vila Velha Brazil
## 10 94399 Nador Morocco
## 11 16744 Yerevan Armenia
## 12 61117 Pak Kret Thailand
## 13 48942 Faaa French Polynesia
## 14 76022 Lipetsk Russian Federation
## 15 37359 Antofagasta Chile
## 16 23616 Moscow Russian Federation
## 17 52535 Juazeiro do Norte Brazil
## 18 69367 Niznekamsk Russian Federation
## 19 74428 Jedda Saudi Arabia
## 20 80168 Omdurman Sudan
Jika offset > limit, offset dieksekusi dulu baru limit.
IN digunakan untuk menspesifikasikan beberapa value yang dikhususkan dengan klausa WHERE, sehingga IN terletak setelah WHERE. Argumen setelah IN memungkinkan data menyeleksi semua record yang memenuhi kriteria tersebut, sedangkan NOT IN memungkinkan data menyeleksi SELAIN semua record yang terpenuhi dalam argumen IN.
# In
query3 <- "select f.film_id
, f.title
, l.name language
, c.name category
from film f
join language l on l.language_id=f.language_id
join film_category fc on fc.film_id=f.film_id
join category c on c.category_id=fc.category_id
where category in('Sci-Fi', 'Animation', 'Family', 'Children')"
childFilms <- dbGetQuery(connect, query3)
nrow(childFilms)
## [1] 256
head(childFilms, 20)
## film_id title language category
## 1 5 AFRICAN EGG English Family
## 2 18 ALTER VICTORY English Animation
## 3 23 ANACONDA CONFESSIONS English Animation
## 4 26 ANNIE IDENTITY English Sci-Fi
## 5 31 APACHE DIVINE English Family
## 6 36 ARGONAUTS TOWN English Animation
## 7 39 ARMAGEDDON LOST English Sci-Fi
## 8 43 ATLANTIS CAUSE English Family
## 9 44 ATTACKS HATE English Sci-Fi
## 10 48 BACKLASH UNDEFEATED English Children
## 11 49 BADMAN DAWN English Sci-Fi
## 12 50 BAKED CLEOPATRA English Family
## 13 53 BANG KWAI English Family
## 14 55 BARBARELLA STREETCAR English Sci-Fi
## 15 59 BEAR GRACELAND English Children
## 16 63 BEDAZZLED MARRIED English Family
## 17 66 BENEATH RUSH English Children
## 18 68 BETRAYED REAR English Children
## 19 69 BEVERLY OUTLAW English Sci-Fi
## 20 70 BIKINI BORROWERS English Animation
# Not In
query4 <- "select f.film_id
, f.title
, l.name language
, c.name category
from film f
join language l on l.language_id=f.language_id
join film_category fc on fc.film_id=f.film_id
join category c on c.category_id=fc.category_id
where language not in('English', 'Japanese')"
languange.NotJPN_ENG <- dbGetQuery(connect, query4)
languange.NotJPN_ENG
## [1] film_id title language category
## <0 rows> (or 0-length row.names)
Field dengan nilai NULL berarti ia adalah field yang tidak memiliki nilai. Berbeda dengan field yang bernilai 0 atau spasi/karakter kosong. Adanya field dengan nilai NULL akan menyulitkan proses pengolahan data dan analisis. IS NULL digunakan untuk menyeleksi field pada kolom tertentu yang memiliki nilai NULL, sedangkan IS NOT NULL adalah sebaliknya. Sama seperti sebelumnya, IS NULL dan IS NOT NULL berada setelah klausa WHERE.
# IS NULL
query5 <- "select c.customer_id
, c.first_name || ' ' || c.last_name as customer
, s.first_name || ' ' || s.last_name as staff
, r.rental_date
, r.return_date
from customer c
join rental r on r.customer_id=c.customer_id
join staff s on s.staff_id=r.staff_id
where r.return_date is null"
returnDate.Null <- dbGetQuery(connect, query5)
nrow(returnDate.Null)
## [1] 183
head(returnDate.Null, 20)
## customer_id customer staff rental_date return_date
## 1 155 GAIL KNIGHT Mike Hillyer 2006-02-14 15:16:03.000 NA
## 2 335 GREGORY MAULDIN Mike Hillyer 2006-02-14 15:16:03.000 NA
## 3 83 LOUISE JENKINS Mike Hillyer 2006-02-14 15:16:03.000 NA
## 4 219 WILLIE HOWELL Jon Stephens 2006-02-14 15:16:03.000 NA
## 5 99 EMILY DIAZ Mike Hillyer 2006-02-14 15:16:03.000 NA
## 6 192 LAURIE LAWRENCE Jon Stephens 2006-02-14 15:16:03.000 NA
## 7 11 LISA ANDERSON Jon Stephens 2006-02-14 15:16:03.000 NA
## 8 597 FREDDIE DUGGAN Jon Stephens 2006-02-14 15:16:03.000 NA
## 9 53 HEATHER MORRIS Jon Stephens 2006-02-14 15:16:03.000 NA
## 10 521 ROLAND SOUTH Jon Stephens 2006-02-14 15:16:03.000 NA
## 11 216 NATALIE MEYER Jon Stephens 2006-02-14 15:16:03.000 NA
## 12 330 SCOTT SHELLEY Mike Hillyer 2006-02-14 15:16:03.000 NA
## 13 373 LOUIS LEONE Jon Stephens 2006-02-14 15:16:03.000 NA
## 14 163 CATHY SPENCER Jon Stephens 2006-02-14 15:16:03.000 NA
## 15 550 GUY BROWNLEE Jon Stephens 2006-02-14 15:16:03.000 NA
## 16 354 JUSTIN NGO Mike Hillyer 2006-02-14 15:16:03.000 NA
## 17 337 JERRY JORDON Mike Hillyer 2006-02-14 15:16:03.000 NA
## 18 152 ALICIA MILLS Mike Hillyer 2006-02-14 15:16:03.000 NA
## 19 200 JEANNE LAWSON Jon Stephens 2006-02-14 15:16:03.000 NA
## 20 474 DUSTIN GILLETTE Mike Hillyer 2006-02-14 15:16:03.000 NA
# IS NOT NULL
query6 <- "select c.customer_id
, c.first_name || ' ' || c.last_name as customer
, s.first_name || ' ' || s.last_name as staff
, r.rental_date
, r.return_date
from customer c
join rental r on r.customer_id=c.customer_id
join staff s on s.staff_id=r.staff_id
where r.return_date is not null"
returnDate.NotNull <- dbGetQuery(connect, query6)
nrow(returnDate.NotNull)
## [1] 15861
head(returnDate.NotNull, 20)
## customer_id customer staff rental_date
## 1 130 CHARLOTTE HUNTER Mike Hillyer 2005-05-24 22:53:30.000
## 2 459 TOMMY COLLAZO Mike Hillyer 2005-05-24 22:54:33.000
## 3 408 MANUEL MURRELL Mike Hillyer 2005-05-24 23:03:39.000
## 4 333 ANDREW PURDY Jon Stephens 2005-05-24 23:04:41.000
## 5 222 DELORES HANSEN Mike Hillyer 2005-05-24 23:05:21.000
## 6 549 NELSON CHRISTENSON Mike Hillyer 2005-05-24 23:08:07.000
## 7 269 CASSANDRA WALTERS Jon Stephens 2005-05-24 23:11:53.000
## 8 239 MINNIE ROMERO Jon Stephens 2005-05-24 23:31:46.000
## 9 126 ELLEN SIMPSON Mike Hillyer 2005-05-25 00:00:40.000
## 10 399 DANNY ISOM Jon Stephens 2005-05-25 00:02:21.000
## 11 142 APRIL BURNS Jon Stephens 2005-05-25 00:09:02.000
## 12 261 DEANNA BYRD Jon Stephens 2005-05-25 00:19:27.000
## 13 334 RAYMOND MCWHORTER Mike Hillyer 2005-05-25 00:22:55.000
## 14 446 THEODORE CULP Mike Hillyer 2005-05-25 00:31:15.000
## 15 319 RONALD WEINER Mike Hillyer 2005-05-25 00:39:22.000
## 16 316 STEVEN CURLEY Jon Stephens 2005-05-25 00:43:11.000
## 17 575 ISAAC OGLESBY Mike Hillyer 2005-05-25 01:06:36.000
## 18 19 RUTH MARTINEZ Jon Stephens 2005-05-25 01:10:47.000
## 19 456 RONNIE RICKETTS Mike Hillyer 2005-05-25 01:17:24.000
## 20 185 ROBERTA HARPER Jon Stephens 2005-05-25 01:48:41.000
## return_date
## 1 2005-05-26 22:04:30.000
## 2 2005-05-28 19:40:33.000
## 3 2005-06-01 22:12:39.000
## 4 2005-06-03 01:43:41.000
## 5 2005-06-02 04:33:21.000
## 6 2005-05-27 01:32:07.000
## 7 2005-05-29 20:34:53.000
## 8 2005-05-27 23:33:46.000
## 9 2005-05-28 00:22:40.000
## 10 2005-05-31 22:44:21.000
## 11 2005-06-02 20:56:02.000
## 12 2005-05-30 05:44:27.000
## 13 2005-05-30 04:28:55.000
## 14 2005-05-26 02:56:15.000
## 15 2005-06-03 03:30:22.000
## 16 2005-05-26 04:42:11.000
## 17 2005-05-27 00:43:36.000
## 18 2005-05-31 06:35:47.000
## 19 2005-05-31 06:00:24.000
## 20 2005-05-27 02:20:41.000
Klausa WHERE juga dapat dikombinasikan dengan operator AND dan OR. Sama seperti operator logic pada umumnya, AND menampilkan record yang memenuhi semua kriteria yang dipisahkan oleh AND, sedangkan OR menampilkan record yang memenuhi salah satu dari kriteria yang dipisahkan oleh OR.
query7 <- "select film_id
, title
, rental_duration
, rental_rate
, length
, rating
from film
--where rental_rate <= 1 or rental_rate > 4;
--where rental_duration < 4 and rental_rate > 4;
where rating in('R', 'G') and (length > 150 or rental_rate < 2) and rental_duration > 5"
query.AndOr <- dbGetQuery(connect, query7)
query.AndOr
## film_id title rental_duration rental_rate length rating
## 1 11 ALAMO VIDEOTAPE 6 0.99 126 G
## 2 24 ANALYZE HOOSIERS 6 2.99 181 R
## 3 43 ATLANTIS CAUSE 6 2.99 170 G
## 4 49 BADMAN DAWN 6 2.99 162 R
## 5 52 BALLROOM MOCKINGBIRD 6 0.99 173 G
## 6 75 BIRD INDEPENDENCE 6 4.99 163 G
## 7 89 BORROWERS BEDAZZLED 7 0.99 63 G
## 8 97 BRIDE INTRIGUE 7 0.99 56 G
## 9 100 BROOKLYN DESERT 7 4.99 161 R
## 10 118 CANYON STOCK 7 0.99 85 R
## 11 119 CAPER MOTIONS 6 0.99 176 G
## 12 128 CATCH AMISTAD 7 0.99 183 G
## 13 140 CHEAPER CLYDE 6 0.99 87 G
## 14 153 CITIZEN SHREK 7 0.99 165 G
## 15 178 CONNECTION MICROCOSMOS 6 0.99 115 G
## 16 182 CONTROL ANTHEM 7 4.99 185 G
## 17 196 CRUELTY UNFORGIVEN 7 0.99 69 G
## 18 206 DANCING FEVER 6 0.99 144 G
## 19 212 DARN FORRESTER 7 4.99 185 G
## 20 249 DRACULA CRYSTAL 7 0.99 176 G
## 21 256 DROP WATERFRONT 6 4.99 178 R
## 22 263 DURHAM PANKY 6 4.99 154 R
## 23 280 EMPIRE MALKOVICH 7 0.99 177 G
## 24 281 ENCINO ELF 6 0.99 143 G
## 25 288 ESCAPE METROPOLIS 7 2.99 167 R
## 26 293 EXORCIST STING 6 2.99 167 G
## 27 309 FEUD FROGMEN 6 0.99 98 R
## 28 318 FIREHOUSE VIETNAM 7 0.99 103 G
## 29 323 FLIGHT LIES 7 4.99 179 R
## 30 355 GHOSTBUSTERS ELF 7 0.99 101 R
## 31 357 GILBERT PELICAN 7 0.99 114 G
## 32 367 GOLDMINE TYCOON 6 0.99 153 R
## 33 373 GRADUATE LORD 7 2.99 156 G
## 34 377 GREASE YOUTH 7 0.99 135 G
## 35 379 GREEDY ROOTS 7 0.99 166 R
## 36 387 GUN BONNIE 7 0.99 100 G
## 37 416 HIGHBALL POTTER 6 0.99 110 R
## 38 424 HOLOCAUST HIGHBALL 6 0.99 149 R
## 39 425 HOLY TADPOLE 6 0.99 88 R
## 40 426 HOME PITY 7 4.99 185 R
## 41 430 HOOK CHARIOTS 7 0.99 49 G
## 42 440 HUNGER ROOF 6 0.99 105 G
## 43 467 INTRIGUE WORST 6 0.99 181 G
## 44 475 JAPANESE RUN 6 0.99 135 G
## 45 498 KILLER INNOCENT 7 2.99 161 R
## 46 510 LAWLESS VISION 6 4.99 181 G
## 47 523 LIGHTS DEER 7 0.99 174 R
## 48 535 LOVE SUICIDES 6 0.99 181 R
## 49 554 MALKOVICH PET 6 2.99 159 G
## 50 563 MASSACRE USUAL 6 4.99 165 R
## 51 566 MAUDE MOD 6 0.99 72 R
## 52 593 MONTEREY LABYRINTH 6 0.99 158 G
## 53 598 MOSQUITO ARMAGEDDON 6 0.99 57 G
## 54 609 MUSCLE BRIGHT 7 2.99 185 G
## 55 610 MUSIC BOONDOCK 7 0.99 129 R
## 56 612 MUSSOLINI SPOILERS 6 2.99 180 G
## 57 638 OPERATION OPERATION 7 2.99 156 G
## 58 697 PRIMARY GLASS 7 0.99 53 G
## 59 718 REBEL AIRPORT 7 0.99 73 G
## 60 725 REQUIEM TYCOON 6 4.99 167 R
## 61 742 ROOF CHAMPION 7 0.99 101 R
## 62 749 RULES HUMAN 6 4.99 153 R
## 63 762 SASSY PACKER 6 0.99 154 G
## 64 765 SATURN NAME 7 4.99 182 R
## 65 786 SHEPHERD MIDSUMMER 7 0.99 113 R
## 66 795 SIEGE MADRE 7 0.99 111 R
## 67 797 SILENCE KANE 7 0.99 67 R
## 68 803 SLACKER LIAISONS 7 4.99 179 R
## 69 813 SMOOCHY CONTROL 7 0.99 184 R
## 70 817 SOLDIERS EVOLUTION 7 4.99 185 R
## 71 822 SOUP WISDOM 6 0.99 169 R
## 72 830 SPIRIT FLINTSTONES 7 0.99 149 R
## 73 850 STORY SIDE 7 0.99 163 R
## 74 958 WARDROBE PHANTOM 6 2.99 178 G
## 75 961 WASH HEAVENLY 7 4.99 161 R
## 76 965 WATERSHIP FRONTIER 6 0.99 112 G
## 77 976 WIND PHANTOM 6 0.99 111 R
## 78 996 YOUNG LANGUAGE 6 0.99 183 G