MILESTONE EXAM 1 - STUDENT TEMPLATE

Elijah Adelegan Due: October 2 2024

Academic Honesty Statement

I, Elijah Adelegan, hereby state that I have not communicated with or gained information in any way from my classmates or anyone other than the Professor or TA during this exam, and that all work is my own.

Load packages

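# load required packages here
library(tidyverse)     # dplyr verbs, ggplot2, and the %>% pipe
library(nycflights13)  # flights, airlines, weather, planes, airports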

Questions

Question 1

flights %>%
  arrange(desc(dest)) %>%
  select(year, origin, dest, time_hour) %>%
  top_n(10)

SYR, SJU, ROC, PWM, PSE, BUF, BTV, BQN, BOS, XNA
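A note on the last step of that pipeline: when top_n() is called without a wt argument, dplyr orders by the last column of the table (it prints "Selecting by ..." to say which one) and filters on that, so the result is not necessarily the first rows of the arranged data. The sketch below shows the difference on a small made-up data frame; the destination codes and the score column are hypothetical and are not taken from flights.

```r
library(dplyr)

# Hypothetical toy data: dest codes and scores are invented for illustration
toy <- data.frame(
  dest  = c("ZZZ", "YYY", "XXX", "WWW"),
  score = c(1, 4, 2, 3),
  stringsAsFactors = FALSE
)

arranged <- toy %>% arrange(desc(dest))

# First two rows in the arranged order
first_two <- arranged %>% slice_head(n = 2)

# top_n() with no wt: ranks by the last column (score), keeps row order
top_two <- arranged %>% top_n(2)

print(first_two)
print(top_two)
```

If the goal is simply the first rows after sorting, slice_head() or head() expresses that directly; slice_max() is the current dplyr replacement for top_n() when ranking by a named column.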

Question 2

   name                          carrier
 1 Frontier Airlines Inc.        F9
 2 AirTran Airways Corporation   FL
 3 Hawaiian Airlines Inc.        HA
 4 Envoy Air                     MQ
 5 SkyWest Airlines Inc.         OO
 6 United Air Lines Inc.         UA
 7 US Airways Inc.               US
 8 Virgin America                VX
 9 Southwest Airlines Co.        WN
10 Mesa Airlines Inc.            YV

airlines %>% select(name, carrier) %>% top_n(10)

Question 3

A. The airline YV has the highest mean arrival delay.

flights %>% select(dep_delay, year, month, day, carrier) %>% top_n(10)

B. 9E

flights %>%
  group_by(month, dest, carrier) %>%
  summarize(avg_delay = mean(dep_delay, na.rm = TRUE)) %>%
  relocate(dest, carrier)

https://dplyr.tidyverse.org/reference/top_n.html#:~:text=Usage.%20top_n(x,%20n,%20wt)%20top_frac(x,%20n,%20wt)%20Arguments.%20x.%20A
https://r4ds.hadley.nz/data-transform
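A mean delay per carrier is usually computed by grouping on carrier alone, summarizing, and then sorting. The following is a minimal sketch of that pattern, assuming a table with carrier and arr_delay columns like those in flights; the carrier codes and delay values here are made up, so the output says nothing about the real data.

```r
library(dplyr)

# Hypothetical toy data: carrier codes and delays are invented
toy_flights <- data.frame(
  carrier   = c("AA", "AA", "AA", "BB", "BB", "CC", "CC"),
  arr_delay = c(10, NA, 20, 5, 5, 30, 40),
  stringsAsFactors = FALSE
)

by_carrier <- toy_flights %>%
  group_by(carrier) %>%
  # na.rm = TRUE drops flights with a missing arrival delay
  summarise(mean_arr_delay = mean(arr_delay, na.rm = TRUE)) %>%
  # largest mean first, so row 1 is the most delayed carrier
  arrange(desc(mean_arr_delay))

print(by_carrier)
```

On the real table the same pipeline would start from flights instead of toy_flights.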

Question 4

JFK, January 9, 2013, temperature 44.96 °F

View(weather)
arrange(flights, desc(dep_delay)) %>% select(origin, dep_delay, month, day, hour)

https://stackoverflow.com/questions/24212739/how-to-find-the-highest-value-of-a-column-in-a-data-frame-in-r#:~:text=I%20tried.%20max(ozone,%20na.rm=T)%20which%20gives%20me%20the%20highest
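Instead of looking up the matching row by eye in View(weather), the most delayed flight can be joined to the weather table directly. This is only a sketch on made-up data, assuming both tables share origin, month, day, and hour columns as they do in nycflights13; the airport codes, delays, and temperatures below are hypothetical.

```r
library(dplyr)

# Hypothetical toy tables: all values are invented for illustration
toy_flights <- data.frame(
  origin    = c("AAA", "BBB", "AAA"),
  month     = c(1, 1, 2),
  day       = c(9, 9, 3),
  hour      = c(9, 14, 7),
  dep_delay = c(120, 45, NA),
  stringsAsFactors = FALSE
)
toy_weather <- data.frame(
  origin = c("AAA", "BBB"),
  month  = c(1, 1),
  day    = c(9, 9),
  hour   = c(9, 14),
  temp   = c(40.1, 38.3),
  stringsAsFactors = FALSE
)

worst <- toy_flights %>%
  # keep the single flight with the largest departure delay
  slice_max(dep_delay, n = 1) %>%
  # attach the weather observed at that airport and hour
  left_join(toy_weather, by = c("origin", "month", "day", "hour"))

print(worst)
```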

Question 5

  1. Flights that are late are delayed more often
  2. The later in the day, the more likely a flight is to be delayed

Question 6

  A. 11 hours and 58 minutes

  B. It flew to Hawaii

  flights %>%
    arrange(desc(air_time)) %>%
    select(year, origin, dest, carrier, air_time, tailnum) %>%
    relocate(tailnum) %>%
    top_n(10)

  C. Tailnum N77066, 292 seats

  View(planes)

Question 7

The airports in the contiguous United States are mostly located in the eastern United States and around Chicago, and in the western United States around Los Angeles.

https://en.wikipedia.org/wiki/List_of_extreme_points_of_the_United_States
https://en.wikipedia.org/wiki/Contiguous_United_States

View(airports)

Question 8

The point of this visualization is that EWR is the second most delayed to RDU. JFK is the most delayed to RDU and second most on JFK. LGA is the least delayed for RDU and the second most delayed for PHL.

# flag each flight as on time or delayed on arrival
flights %>% mutate(on_time = arr_delay <= 0, delayed = arr_delay > 0)

ggplot(flights, aes(x = arr_time, y = dep_delay)) +
  geom_boxplot(aes(color = origin)) +
  facet_wrap(~origin, ncol = 1)

Extra Credit

There appears to be a relationship between temperature and delay time, with higher delays at higher temperatures.

ggplot(flights, aes(x = dep_delay, y = dest)) + geom_point(aes(color = origin))