Model Data Relasional adalah suatu model basis data yang terdiri dari beberapa tabel berdimensi dua yg disebut (relasi atau tabel) , dengan masing-masing relation(relasi), yang tersusun atas tuple (baris) dan atribut (kolom) untuk menggambarkan sebuah basis data. Pada model relasional, tidak ada data yang kembar. Karena ada kunci primer (Primary Key). Kunci primer adalah satu item yang dipilih dalam suatu kolom yang unik dan tidak sama yang berfungsi untuk membedakan antara satu kolom dengan kolom lainnya.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.3 v purrr 0.3.4
## v tibble 3.0.6 v stringr 1.4.0
## v tidyr 1.1.2 v forcats 0.5.1
## v readr 1.4.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(nycflights13)
## Warning: package 'nycflights13' was built under R version 4.0.4
library(ggplot2)
library(maps)
## Warning: package 'maps' was built under R version 4.0.4
##
## Attaching package: 'maps'
## The following object is masked from 'package:purrr':
##
## map
Maskapai penerbangan di ambil dari data set airlines, kita memungkinkan untuk mencari nama lengkap maskapai dan fungsi str() kita bisa mengetahui struktur dari data set airlines.
str(airlines)
## tibble [16 x 2] (S3: tbl_df/tbl/data.frame)
## $ carrier: chr [1:16] "9E" "AA" "AS" "B6" ...
## $ name : chr [1:16] "Endeavor Air Inc." "American Airlines Inc." "Alaska Airlines Inc." "JetBlue Airways" ...
airlines
## # A tibble: 16 x 2
## carrier name
## <chr> <chr>
## 1 9E Endeavor Air Inc.
## 2 AA American Airlines Inc.
## 3 AS Alaska Airlines Inc.
## 4 B6 JetBlue Airways
## 5 DL Delta Air Lines Inc.
## 6 EV ExpressJet Airlines Inc.
## 7 F9 Frontier Airlines Inc.
## 8 FL AirTran Airways Corporation
## 9 HA Hawaiian Airlines Inc.
## 10 MQ Envoy Air
## 11 OO SkyWest Airlines Inc.
## 12 UA United Air Lines Inc.
## 13 US US Airways Inc.
## 14 VX Virgin America
## 15 WN Southwest Airlines Co.
## 16 YV Mesa Airlines Inc.
Bandara di ambil dari data set airports, kita memungkinkan untuk mencari nama lengkap bandara dan fungsi str() kita bisa mengetahui struktur dari data set bandara.
str(airports)
## tibble [1,458 x 8] (S3: tbl_df/tbl/data.frame)
## $ faa : chr [1:1458] "04G" "06A" "06C" "06N" ...
## $ name : chr [1:1458] "Lansdowne Airport" "Moton Field Municipal Airport" "Schaumburg Regional" "Randall Airport" ...
## $ lat : num [1:1458] 41.1 32.5 42 41.4 31.1 ...
## $ lon : num [1:1458] -80.6 -85.7 -88.1 -74.4 -81.4 ...
## $ alt : num [1:1458] 1044 264 801 523 11 ...
## $ tz : num [1:1458] -5 -6 -6 -5 -5 -5 -5 -5 -5 -8 ...
## $ dst : chr [1:1458] "A" "A" "A" "A" ...
## $ tzone: chr [1:1458] "America/New_York" "America/Chicago" "America/Chicago" "America/New_York" ...
## - attr(*, "spec")=
## .. cols(
## .. id = col_double(),
## .. name = col_character(),
## .. city = col_character(),
## .. country = col_character(),
## .. faa = col_character(),
## .. icao = col_character(),
## .. lat = col_double(),
## .. lon = col_double(),
## .. alt = col_double(),
## .. tz = col_double(),
## .. dst = col_character(),
## .. tzone = col_character()
## .. )
airports
## # A tibble: 1,458 x 8
## faa name lat lon alt tz dst tzone
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
## 1 04G Lansdowne Airport 41.1 -80.6 1044 -5 A America/New_Yo~
## 2 06A Moton Field Municipal A~ 32.5 -85.7 264 -6 A America/Chicago
## 3 06C Schaumburg Regional 42.0 -88.1 801 -6 A America/Chicago
## 4 06N Randall Airport 41.4 -74.4 523 -5 A America/New_Yo~
## 5 09J Jekyll Island Airport 31.1 -81.4 11 -5 A America/New_Yo~
## 6 0A9 Elizabethton Municipal ~ 36.4 -82.2 1593 -5 A America/New_Yo~
## 7 0G6 Williams County Airport 41.5 -84.5 730 -5 A America/New_Yo~
## 8 0G7 Finger Lakes Regional A~ 42.9 -76.8 492 -5 A America/New_Yo~
## 9 0P2 Shoestring Aviation Air~ 39.8 -76.6 1000 -5 U America/New_Yo~
## 10 0S9 Jefferson County Intl 48.1 -123. 108 -8 A America/Los_An~
## # ... with 1,448 more rows
Pesawat terbang di ambil dari data set planes, kita memungkinkan untuk mencari nama lengkap planes dan fungsi str() kita bisa mengetahui struktur dari data set planes.
str(planes)
## tibble [3,322 x 9] (S3: tbl_df/tbl/data.frame)
## $ tailnum : chr [1:3322] "N10156" "N102UW" "N103US" "N104UW" ...
## $ year : int [1:3322] 2004 1998 1999 1999 2002 1999 1999 1999 1999 1999 ...
## $ type : chr [1:3322] "Fixed wing multi engine" "Fixed wing multi engine" "Fixed wing multi engine" "Fixed wing multi engine" ...
## $ manufacturer: chr [1:3322] "EMBRAER" "AIRBUS INDUSTRIE" "AIRBUS INDUSTRIE" "AIRBUS INDUSTRIE" ...
## $ model : chr [1:3322] "EMB-145XR" "A320-214" "A320-214" "A320-214" ...
## $ engines : int [1:3322] 2 2 2 2 2 2 2 2 2 2 ...
## $ seats : int [1:3322] 55 182 182 182 55 182 182 182 182 182 ...
## $ speed : int [1:3322] NA NA NA NA NA NA NA NA NA NA ...
## $ engine : chr [1:3322] "Turbo-fan" "Turbo-fan" "Turbo-fan" "Turbo-fan" ...
planes
## # A tibble: 3,322 x 9
## tailnum year type manufacturer model engines seats speed engine
## <chr> <int> <chr> <chr> <chr> <int> <int> <int> <chr>
## 1 N10156 2004 Fixed wing m~ EMBRAER EMB-1~ 2 55 NA Turbo-~
## 2 N102UW 1998 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 3 N103US 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 4 N104UW 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 5 N10575 2002 Fixed wing m~ EMBRAER EMB-1~ 2 55 NA Turbo-~
## 6 N105UW 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 7 N107US 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 8 N108UW 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 9 N109UW 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## 10 N110UW 1999 Fixed wing m~ AIRBUS INDUST~ A320-~ 2 182 NA Turbo-~
## # ... with 3,312 more rows
planes.se <- dplyr::select(planes, tailnum:manufacturer)
head(planes.se,n=5)
## # A tibble: 5 x 4
## tailnum year type manufacturer
## <chr> <int> <chr> <chr>
## 1 N10156 2004 Fixed wing multi engine EMBRAER
## 2 N102UW 1998 Fixed wing multi engine AIRBUS INDUSTRIE
## 3 N103US 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 4 N104UW 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 5 N10575 2002 Fixed wing multi engine EMBRAER
str(weather)
## tibble [26,115 x 15] (S3: tbl_df/tbl/data.frame)
## $ origin : chr [1:26115] "EWR" "EWR" "EWR" "EWR" ...
## $ year : int [1:26115] 2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
## $ month : int [1:26115] 1 1 1 1 1 1 1 1 1 1 ...
## $ day : int [1:26115] 1 1 1 1 1 1 1 1 1 1 ...
## $ hour : int [1:26115] 1 2 3 4 5 6 7 8 9 10 ...
## $ temp : num [1:26115] 39 39 39 39.9 39 ...
## $ dewp : num [1:26115] 26.1 27 28 28 28 ...
## $ humid : num [1:26115] 59.4 61.6 64.4 62.2 64.4 ...
## $ wind_dir : num [1:26115] 270 250 240 250 260 240 240 250 260 260 ...
## $ wind_speed: num [1:26115] 10.36 8.06 11.51 12.66 12.66 ...
## $ wind_gust : num [1:26115] NA NA NA NA NA NA NA NA NA NA ...
## $ precip : num [1:26115] 0 0 0 0 0 0 0 0 0 0 ...
## $ pressure : num [1:26115] 1012 1012 1012 1012 1012 ...
## $ visib : num [1:26115] 10 10 10 10 10 10 10 10 10 10 ...
## $ time_hour : POSIXct[1:26115], format: "2013-01-01 01:00:00" "2013-01-01 02:00:00" ...
weather
## # A tibble: 26,115 x 15
## origin year month day hour temp dewp humid wind_dir wind_speed
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EWR 2013 1 1 1 39.0 26.1 59.4 270 10.4
## 2 EWR 2013 1 1 2 39.0 27.0 61.6 250 8.06
## 3 EWR 2013 1 1 3 39.0 28.0 64.4 240 11.5
## 4 EWR 2013 1 1 4 39.9 28.0 62.2 250 12.7
## 5 EWR 2013 1 1 5 39.0 28.0 64.4 260 12.7
## 6 EWR 2013 1 1 6 37.9 28.0 67.2 240 11.5
## 7 EWR 2013 1 1 7 39.0 28.0 64.4 240 15.0
## 8 EWR 2013 1 1 8 39.9 28.0 62.2 250 10.4
## 9 EWR 2013 1 1 9 39.9 28.0 62.2 260 15.0
## 10 EWR 2013 1 1 10 41 28.0 59.6 260 13.8
## # ... with 26,105 more rows, and 5 more variables: wind_gust <dbl>,
## # precip <dbl>, pressure <dbl>, visib <dbl>, time_hour <dttm>
weather.se <- dplyr::filter(weather, temp >= 40)
print(weather.se)
## # A tibble: 19,401 x 15
## origin year month day hour temp dewp humid wind_dir wind_speed
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EWR 2013 1 1 10 41 28.0 59.6 260 13.8
## 2 EWR 2013 1 1 11 41 27.0 57.1 260 15.0
## 3 EWR 2013 1 5 10 41 21.0 44.5 310 10.4
## 4 EWR 2013 1 5 11 42.1 19.9 40.8 320 15.0
## 5 EWR 2013 1 5 12 43.0 19.9 39.4 310 13.8
## 6 EWR 2013 1 5 13 44.1 19.9 37.8 290 11.5
## 7 EWR 2013 1 5 14 44.1 19.9 37.8 310 9.21
## 8 EWR 2013 1 5 15 43.0 19.0 37.9 310 8.06
## 9 EWR 2013 1 5 16 41 19.9 42.5 310 4.60
## 10 EWR 2013 1 6 11 43.0 30.0 59.9 240 10.4
## # ... with 19,391 more rows, and 5 more variables: wind_gust <dbl>,
## # precip <dbl>, pressure <dbl>, visib <dbl>, time_hour <dttm>
head(select(weather, origin, year, temp), n = 5)
## # A tibble: 5 x 3
## origin year temp
## <chr> <int> <dbl>
## 1 EWR 2013 39.0
## 2 EWR 2013 39.0
## 3 EWR 2013 39.0
## 4 EWR 2013 39.9
## 5 EWR 2013 39.0
dengan menggunakan pipe, dapat dituliskan sebagai berikut :
weather %>%
select(origin, year, temp) %>%
head(n = 5)
## # A tibble: 5 x 3
## origin year temp
## <chr> <int> <dbl>
## 1 EWR 2013 39.0
## 2 EWR 2013 39.0
## 3 EWR 2013 39.0
## 4 EWR 2013 39.9
## 5 EWR 2013 39.0
weather.ar <- arrange(weather, temp)
head(weather.ar , n = 10)
## # A tibble: 10 x 15
## origin year month day hour temp dewp humid wind_dir wind_speed
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EWR 2013 1 23 5 10.9 -4 50.2 270 10.4
## 2 EWR 2013 1 23 6 10.9 -4 50.2 270 11.5
## 3 EWR 2013 1 23 2 12.0 -7.06 41.3 270 10.4
## 4 EWR 2013 1 23 3 12.0 -5.98 43.5 270 8.06
## 5 EWR 2013 1 23 4 12.0 -5.08 45.4 270 9.21
## 6 EWR 2013 1 23 7 12.0 -2.92 50.4 270 11.5
## 7 EWR 2013 1 24 3 12.0 1.94 63.3 30 4.60
## 8 JFK 2013 1 23 4 12.0 -7.06 41.3 280 12.7
## 9 JFK 2013 1 23 5 12.0 -5.08 45.4 280 15.0
## 10 JFK 2013 1 23 6 12.0 -5.08 45.4 290 13.8
## # ... with 5 more variables: wind_gust <dbl>, precip <dbl>, pressure <dbl>,
## # visib <dbl>, time_hour <dttm>
weather.ab <- arrange(weather, desc(temp))
head(weather.ab , n = 10)
## # A tibble: 10 x 15
## origin year month day hour temp dewp humid wind_dir wind_speed
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EWR 2013 7 18 15 100. 66.0 33.2 300 9.21
## 2 EWR 2013 7 19 16 100. 71.1 39.5 230 20.7
## 3 EWR 2013 7 19 13 99.0 72.0 42.1 260 18.4
## 4 EWR 2013 7 19 14 99.0 71.1 40.8 250 20.7
## 5 EWR 2013 7 19 15 99.0 72.0 42.1 240 16.1
## 6 LGA 2013 7 18 15 99.0 64.0 32.0 250 12.7
## 7 LGA 2013 7 19 15 99.0 69.1 38.2 240 16.1
## 8 LGA 2013 7 19 16 99.0 69.1 38.2 230 15.0
## 9 EWR 2013 7 18 12 98.1 69.1 39.2 300 9.21
## 10 EWR 2013 7 18 13 98.1 66.9 36.4 300 11.5
## # ... with 5 more variables: wind_gust <dbl>, precip <dbl>, pressure <dbl>,
## # visib <dbl>, time_hour <dttm>
weather %>%
mutate(weather_proportion = temp/dewp) %>%
head(5)
## # A tibble: 5 x 16
## origin year month day hour temp dewp humid wind_dir wind_speed wind_gust
## <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EWR 2013 1 1 1 39.0 26.1 59.4 270 10.4 NA
## 2 EWR 2013 1 1 2 39.0 27.0 61.6 250 8.06 NA
## 3 EWR 2013 1 1 3 39.0 28.0 64.4 240 11.5 NA
## 4 EWR 2013 1 1 4 39.9 28.0 62.2 250 12.7 NA
## 5 EWR 2013 1 1 5 39.0 28.0 64.4 260 12.7 NA
## # ... with 5 more variables: precip <dbl>, pressure <dbl>, visib <dbl>,
## # time_hour <dttm>, weather_proportion <dbl>
weather %>%
summarise(rata2 = mean(dewp), median = median(dewp),
minimum = min(dewp), maximum = max(dewp))
## # A tibble: 1 x 4
## rata2 median minimum maximum
## <dbl> <dbl> <dbl> <dbl>
## 1 NA NA NA NA