Open the assign07.qmd file and complete the exercises.
This is a somewhat open-ended assignment.
The file domestic_flights_jan_2016.csv is nearly the same as the me_flights.csv file we work with in the temporal data chapter, except that it has two additional columns for destination city and state and covers all domestic flights reported for on-time performance in the US for January 2016. You’ll find it helpful to recreate all of the calculated fields we create in the unit.
You are to write a report uploaded to RPubs that compares what you consider interesting metrics for a select group of carriers, airports, or states. You may not parrot the queries from the text or the practice questions. Your report must contain at least two questions that you ask about the flights data. Your answers to those questions must also contain a visualization of the data or a table along with a specific answer in the narrative.
The csv file contains 445,827 observations. You’ll want to subset the data to the area(s) you are looking at, then write it out to a csv file using write_csv(), and start your assignment by importing that csv instead. Do this in a separate script file that you don’t need to submit. In your narrative, describe your subset. I don’t need to see how you subsetted the data because it might cause performance issues when you render the document. Note: you will receive deductions for not using tidyverse syntax in this assignment. That includes the use of filter, mutate, and the up-to-date pipe operator |>.
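The subsetting step described above could look roughly like the following sketch. It assumes a subset by origin airport; `subset_by_origin()` is a hypothetical helper, the small tibble is made-up stand-in data so the example runs on its own, and the output file name is only an illustration. In a real script the tibble would be replaced by importing domestic_flights_jan_2016.csv with read_csv().

```r
library(readr)
library(dplyr)

# Hypothetical helper: keep only flights departing from one origin airport.
# `Origin` is one of the columns listed in the file's column specification.
subset_by_origin <- function(flights, origin_code) {
  flights |>
    filter(Origin == origin_code)
}

# Made-up stand-in rows so the sketch is self-contained.
# In the real script: flights <- read_csv("domestic_flights_jan_2016.csv")
flights <- tibble(
  Carrier = c("AA", "DL", "WN", "DL"),
  Origin  = c("PWM", "PWM", "BOS", "PWM"),
  Dest    = c("CLT", "ATL", "BWI", "DTW")
)

pwm_flights <- subset_by_origin(flights, "PWM")

# Write the subset out; the report then imports this smaller file instead.
# A temporary directory is used here only to keep the example tidy.
out_path <- file.path(tempdir(), "pwm_flights.csv")
write_csv(pwm_flights, out_path)
```

Keeping this in a separate script means the rendered report only has to read the small file, which avoids loading all 445,827 observations at render time.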
The Grading Rubric is available at the end of this document.
This is your work area. Add as many code cells as you need.
1 ✈️ Overview
This report analyzes flight cancellations for flights departing from Portland International Jetport (PWM).
The data was filtered to include only flights from PWM, and cancellations and flight mileage were summarized by carrier.
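A per-carrier summary of this kind might be sketched as follows. The column names `Cancelled` (assumed to be 1 for a cancelled flight, 0 otherwise) and `Distance` (assumed to be miles) are assumptions, since the column specification shown in this report is truncated, and the rows are made-up values for illustration, not results from the PWM data.

```r
library(dplyr)

# Made-up rows for illustration only; not values from the assignment data.
# `Cancelled` and `Distance` are assumed column names.
pwm_flights <- tibble(
  Carrier   = c("AA", "AA", "DL", "DL", "DL"),
  Cancelled = c(1, 0, 0, 1, 1),
  Distance  = c(813, 813, 1027, 1027, 273)
)

# One row per carrier: flight count, cancellations, and total mileage,
# sorted so the carrier with the most cancellations comes first.
cancellations_by_carrier <- pwm_flights |>
  group_by(Carrier) |>
  summarize(
    flights       = n(),
    cancellations = sum(Cancelled),
    total_miles   = sum(Distance),
    .groups = "drop"
  ) |>
  arrange(desc(cancellations))
```

Sorting by `cancellations` puts the answer to a "which carrier had the most" question in the first row of the table.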
Question #1: Which carrier at PWM had the most flight cancellations in January 2016?
library(readr)
library(dplyr)
library(lubridate)
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (8): FlightDate, Carrier, TailNum, Origin, Dest, DestCityName, DestStat...
dbl (12): FlightNum, CRSDepTime, DepTime, WheelsOff, WheelsOn, CRSArrTime, A...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 377 Columns: 20