R Markdown
Create the code makes a table for each of the below questions.
1. Join + Filter - Which airplanes fly LGA to
XNA?
(1 POINT)
q1 <- flights %>%
filter(origin == "LGA", dest == "XNA") %>%
left_join(planes, by = "tailnum") %>%
select(origin, dest, tailnum, model, manufacturer) %>%
distinct(tailnum, .keep_all = TRUE) %>%
arrange(tailnum)
q1
## # A tibble: 70 × 5
## origin dest tailnum model manufacturer
## <chr> <chr> <chr> <chr> <chr>
## 1 LGA XNA N0EGMQ <NA> <NA>
## 2 LGA XNA N501MQ <NA> <NA>
## 3 LGA XNA N507MQ <NA> <NA>
## 4 LGA XNA N510MQ <NA> <NA>
## 5 LGA XNA N511MQ <NA> <NA>
## 6 LGA XNA N512MQ <NA> <NA>
## 7 LGA XNA N514MQ <NA> <NA>
## 8 LGA XNA N518MQ <NA> <NA>
## 9 LGA XNA N520MQ <NA> <NA>
## 10 LGA XNA N521MQ <NA> <NA>
## # ℹ 60 more rows
2. Join - Add the airline name to the flights
table
(1 POINT)
q2 <- airlines %>%
right_join(flights, by="carrier")
q2
## # A tibble: 336,776 × 20
## carrier name year month day dep_time sched_dep_time dep_delay arr_time
## <chr> <chr> <int> <int> <int> <int> <int> <dbl> <int>
## 1 9E Endeavo… 2013 1 1 810 810 0 1048
## 2 9E Endeavo… 2013 1 1 1451 1500 -9 1634
## 3 9E Endeavo… 2013 1 1 1452 1455 -3 1637
## 4 9E Endeavo… 2013 1 1 1454 1500 -6 1635
## 5 9E Endeavo… 2013 1 1 1507 1515 -8 1651
## 6 9E Endeavo… 2013 1 1 1530 1530 0 1650
## 7 9E Endeavo… 2013 1 1 1546 1540 6 1753
## 8 9E Endeavo… 2013 1 1 1550 1550 0 1844
## 9 9E Endeavo… 2013 1 1 1552 1600 -8 1749
## 10 9E Endeavo… 2013 1 1 1554 1600 -6 1701
## # ℹ 336,766 more rows
## # ℹ 11 more variables: sched_arr_time <int>, arr_delay <dbl>, flight <int>,
## # tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
## # hour <dbl>, minute <dbl>, time_hour <dttm>
3. Join + Select + Distinct() - Which airports have no
commercial flights?
(1 POINT)
q3 <- airports %>%
anti_join(flights, by = c("faa" = "dest")) %>%
anti_join(flights, by = c("faa" = "origin")) %>%
select(faa, name, lat, lon, alt, tz, dst) %>%
distinct(faa, .keep_all = TRUE)
q3
## # A tibble: 1,355 × 7
## faa name lat lon alt tz dst
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 04G Lansdowne Airport 41.1 -80.6 1044 -5 A
## 2 06A Moton Field Municipal Airport 32.5 -85.7 264 -6 A
## 3 06C Schaumburg Regional 42.0 -88.1 801 -6 A
## 4 06N Randall Airport 41.4 -74.4 523 -5 A
## 5 09J Jekyll Island Airport 31.1 -81.4 11 -5 A
## 6 0A9 Elizabethton Municipal Airport 36.4 -82.2 1593 -5 A
## 7 0G6 Williams County Airport 41.5 -84.5 730 -5 A
## 8 0G7 Finger Lakes Regional Airport 42.9 -76.8 492 -5 A
## 9 0P2 Shoestring Aviation Airfield 39.8 -76.6 1000 -5 U
## 10 0S9 Jefferson County Intl 48.1 -123. 108 -8 A
## # ℹ 1,345 more rows
4. EXTRA CREDIT - NO HELP - NO PARTIAL CREDIT
Create a table with the names of the airports with the most winds
(wind_speed > 30) The table must contain only the airport name
(airports$name) and no duplicate rows
(2 POINTS)
extra_credit <- flights %>%
left_join(airports, by =c("origin"="faa")) %>%
arrange(desc(name)) %>%
full_join(weather) %>%
filter(wind_speed>30) %>%
distinct(name, .keep_all = TRUE) %>%
select(name) %>%
filter(!is.na(name))
## Joining with `by = join_by(year, month, day, origin, hour, time_hour)`
extra_credit
## # A tibble: 3 × 1
## name
## <chr>
## 1 Newark Liberty Intl
## 2 La Guardia
## 3 John F Kennedy Intl