library(nycflights13)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.0.2     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

nycflights13

Note: you will probably have to install nycflights13 using install.packages and then load it with the library command. nycflights13 is a relational database containing the following tables (data frames).
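A minimal sketch of the install-then-load step described above. The `requireNamespace()` guard is an addition for illustration, so the package is only downloaded when it is missing:

```r
# Install nycflights13 only if it is not already available, then load it.
if (!requireNamespace("nycflights13", quietly = TRUE)) {
  install.packages("nycflights13")
}
library(nycflights13)
library(tidyverse)
```

It is usually easier to run `install.packages()` once in the console instead of in a chunk, since knitting may fail if no CRAN mirror has been chosen.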

data frame   description
airlines     Airline names
airports     Airport metadata
flights      Flights data
planes       Plane metadata
weather      Hourly weather data
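One way to see what "relational" means here is to list the tables and then link two of them through a shared column. This is only a sketch; the names `table_names` and `flights_named` are made up for the example:

```r
library(nycflights13)
library(dplyr)

# List the data frames shipped with the package.
table_names <- data(package = "nycflights13")$results[, "Item"]
table_names

# The tables share key columns; for example, `carrier` links flights to airlines.
flights_named <- flights %>%
  left_join(airlines, by = "carrier") %>%
  select(carrier, name, flight, origin, dest)
flights_named
```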

This is data about all airline flights departing from New York City airports in 2013. This project will parallel Chapter 5 of Wickham and Grolemund, R for Data Science. You should start reading this chapter now, and try to finish it by the end of this week.

flights
## # A tibble: 336,776 × 19
##     year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
##    <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
##  1  2013     1     1      517            515         2      830            819
##  2  2013     1     1      533            529         4      850            830
##  3  2013     1     1      542            540         2      923            850
##  4  2013     1     1      544            545        -1     1004           1022
##  5  2013     1     1      554            600        -6      812            837
##  6  2013     1     1      554            558        -4      740            728
##  7  2013     1     1      555            600        -5      913            854
##  8  2013     1     1      557            600        -3      709            723
##  9  2013     1     1      557            600        -3      838            846
## 10  2013     1     1      558            600        -2      753            745
## # … with 336,766 more rows, and 11 more variables: arr_delay <dbl>,
## #   carrier <chr>, flight <int>, tailnum <chr>, origin <chr>, dest <chr>,
## #   air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
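The printout above hides 11 of the 19 variables. As a small sketch, `glimpse()` from dplyr shows every column on its own line:

```r
library(nycflights13)
library(dplyr)

# glimpse() prints one line per column, so none of the 19 variables are hidden.
glimpse(flights)
```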

Task 1

  1. Change the author name in the YAML block at the top of this document to your own name.
  2. Install nycflights13 if necessary.
  3. Do the exercises in section 5.2.4 of the Wickham book. You will have to read the material before attempting the exercises, though you have probably seen all of this in the preceding sections.

Write up your solutions to the exercises in 5.2.4 in this document, including the code chunks you use to determine the answer.
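As a warm-up, assuming the first-edition section numbering in which 5.2 covers `filter()`, here is a sketch of the verb based on the chapter's own examples. It is not a solution to any exercise, and the names `jan1` and `nov_dec` are just for illustration:

```r
library(nycflights13)
library(dplyr)

# Keep only flights that departed on January 1st.
jan1 <- filter(flights, month == 1, day == 1)
jan1

# %in% is a compact way to match any of several values.
nov_dec <- filter(flights, month %in% c(11, 12))
nov_dec
```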

Task 2

  1. Do the exercises in section 5.3.1.

Write up your solutions to the exercises in 5.3.1 in this document, including the code chunks you use to determine the answer.
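Assuming section 5.3 is the one on `arrange()`, the following sketch shows how sorting on several columns works. The choice of columns is arbitrary and not taken from the exercises:

```r
library(nycflights13)
library(dplyr)

# Sort by origin airport (A to Z), then by air time from longest to shortest.
# Rows with missing air_time are placed at the end of the result.
by_origin <- arrange(flights, origin, desc(air_time))
by_origin
```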

Task 3

  1. Do the exercises in section 5.4.1.

Write up your solutions to the exercises in 5.4.1 in this document, including the code chunks you use to determine the answer.
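Assuming section 5.4 is the one on `select()`, here is a short sketch of a few ways to pick columns. The object names are made up for the example:

```r
library(nycflights13)
library(dplyr)

# Select a range of adjacent columns by name.
dates <- select(flights, year:day)
dates

# Drop that same range and keep everything else.
no_dates <- select(flights, -(year:day))
no_dates

# Move two columns to the front and keep the rest with everything().
reordered <- select(flights, time_hour, air_time, everything())
reordered
```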

Task 4

  1. Do the exercises in section 5.5.2.

Write up your solutions to the exercises in 5.5.2 in this document, including the code chunks you use to determine the answer.
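Assuming section 5.5 is the one on `mutate()`, this sketch follows the chapter's example of adding computed columns to a narrower copy of the data. `flights_sml` and `with_gain` are illustrative names:

```r
library(nycflights13)
library(dplyr)

# Work with fewer columns so the new variables are visible when printed.
flights_sml <- select(flights, year:day, ends_with("delay"), distance, air_time)

# gain: minutes made up in the air; speed: miles per hour.
with_gain <- mutate(
  flights_sml,
  gain = dep_delay - arr_delay,
  speed = distance / air_time * 60
)
with_gain
```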

Task 5

  1. Do the exercises in section 5.6.7.

Write up your solutions to the exercises in 5.6.7 in this document, including the code chunks you use to determine the answer.
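Assuming section 5.6 is the one on grouped summaries with `summarise()`, the sketch below computes one mean departure delay per day. The `.groups = "drop"` argument is an addition here and needs dplyr 1.0.0 or later:

```r
library(nycflights13)
library(dplyr)

# One row per calendar day; na.rm = TRUE skips cancelled flights,
# which have a missing dep_delay.
daily_delay <- flights %>%
  group_by(year, month, day) %>%
  summarise(delay = mean(dep_delay, na.rm = TRUE), .groups = "drop")
daily_delay
```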

Turning in your work and announcements for the remainder of the Semester

Final Project

  • There will be a final project instead of a final exam. I will provide a list of three projects from which you may choose the one you will do.
  • I will provide all the data necessary.
  • You will also post your final project on RPubs.
  • The final project will be due at midnight on the day the final exam is scheduled.