Administrative:

Please indicate

  • Who you collaborated with: Nina, Amanda, Katherine, Brenda
  • Roughly how much time you spent on this HW so far: 9 hours
  • The URL of the published RPubs page.
  • What gave you the most trouble: overwriting my data frames and not knowing how to undo the mistake
  • Any comments you have: For Question #4, I needed to learn how to find all the N/A values and either reassign them to a state/region or remove them. Doing so would probably also address the warnings about removing rows with missing values in other questions. I have since figured this out: through a Google search I discovered the drop_na(year) function. Additionally, I have attempted to use fct_reorder to reorder the boxplots in Question #2, but I don’t know how to order them by the median plane age for each carrier.

Question 1:

Plot a “time series” of the proportion of flights that were delayed by > 30 minutes on each day. i.e.

  • the x-axis should be some notion of time
  • the y-axis should be the proportion.

Using this plot, describe the seasonality of when delays over 30 minutes tend to occur.

Delays over 30 minutes appear to be seasonal. The proportion of flights delayed on any given day tends to be highest in July, perhaps due to summer thunderstorms in Texas, and second highest in January, perhaps due to winter weather. However, this assessment of the seasonality of delays out of Houston Airport would be strengthened by data across multiple years, rather than just data on flights from 2011.
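The daily proportion described above can be computed roughly as follows. This is a minimal sketch on made-up data, not the code used for the plot: the data frame flights_toy and its columns date and dep_delay are assumptions for illustration, and flights with a missing delay are dropped before the proportion is taken.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the flights data (hypothetical column names and values).
flights_toy <- tibble(
  date = as.Date(c("2011-01-01", "2011-01-01", "2011-01-01", "2011-01-01",
                   "2011-01-02", "2011-01-02")),
  dep_delay = c(45, 10, NA, 31, 5, 60)
)

# For each day, the mean of a TRUE/FALSE condition is the proportion
# of flights that meet it (here: delayed by more than 30 minutes).
daily_delays <- flights_toy %>%
  filter(!is.na(dep_delay)) %>%
  group_by(date) %>%
  summarise(prop_delayed = mean(dep_delay > 30))

# Time on the x-axis, proportion on the y-axis.
delay_plot <- ggplot(daily_delays, aes(x = date, y = prop_delayed)) +
  geom_line() +
  labs(x = "Date", y = "Proportion of flights delayed > 30 min")
```

Taking the mean of a logical vector avoids counting delayed and total flights separately.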

Question 2:

Some people prefer flying on older planes. Even though they aren’t as nice, they tend to have more room. Which airlines should these people favor?

People who prefer older planes for the sake of more room should fly American Eagle (MQ), because its planes have the oldest median age and a much narrower range of build years than American Airlines’ planes. American Airlines (AA) has the second-oldest median plane age but a much wider range, so you could end up on a much newer plane when flying AA, whereas with American Eagle you will be guaranteed to be flying in a plane built in the early 1980s or earlier. *It should be noted that I removed 11,402 flights from the data frame because the plane used in the flight did not have a year-built value.
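One possible way to drop the flights with no year-built value and order the boxplots by each carrier's median build year is sketched below. The data frame planes_toy and its values are made up for illustration; only drop_na() and fct_reorder() come from the write-up itself. Ordering by ascending median year puts the carriers with the oldest planes first.

```r
library(dplyr)
library(tidyr)
library(forcats)
library(ggplot2)

# Toy stand-in for flights joined to plane build years (hypothetical values).
planes_toy <- tibble(
  carrier = c("AA", "AA", "AA", "MQ", "MQ", "MQ", "WN", "WN", "WN"),
  year    = c(1985, 1999, NA, 1980, 1982, 1983, 2001, 2005, 1998)
)

planes_clean <- planes_toy %>%
  # Remove rows where the build year is missing.
  drop_na(year) %>%
  # Reorder the carrier levels by the median build year within each carrier.
  mutate(carrier = fct_reorder(carrier, year, .fun = median))

# Number of rows removed, worth reporting alongside the plot.
n_removed <- nrow(planes_toy) - nrow(planes_clean)

# Boxplots now appear in order of median build year, oldest first.
age_plot <- ggplot(planes_clean, aes(x = carrier, y = year)) +
  geom_boxplot() +
  labs(x = "Carrier", y = "Year built")
```

Because the reordering happens on the factor levels before plotting, the same order would carry over to any other plot built from planes_clean.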

Question 3:

  • What states did Southwest Airlines’ flight paths tend to fly to?
  • What states did Southwest Airlines’ flights tend to fly to?

For example, Southwest Airlines Flight 60 to Dallas consists of a single flight path, but since it flew 299 times in 2013, it would be counted as 299 flights.

Notes: Southwest’s airline carrier code is WN. The N/A entries for state are all ECP flights; I think ECP is an airport in Florida.

The two plots below display how many Southwest flights go to each state and how many flight paths Southwest has to each state. Out of IAH, Southwest has the most flight paths to (in descending order) Texas, Florida, Louisiana, California, and Oklahoma. Southwest also has the most flights to these same five states, in the same order. It is not surprising that the state rankings for flight paths and for flights match. *It should be noted that I removed 729 flights from the data frame because their destination state was N/A.
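The distinction between flights and flight paths can be illustrated with a small sketch. It assumes that a flight path is identified by its flight number, so that every row counts as a flight while each distinct flight number counts once as a path. The data frame wn_toy and its column names are hypothetical.

```r
library(dplyr)

# Toy stand-in for flights joined to destination states (hypothetical values).
wn_toy <- tibble(
  carrier   = c("WN", "WN", "WN", "WN", "WN", "AA"),
  flightnum = c(60, 60, 60, 12, 33, 7),
  state     = c("TX", "TX", "TX", "TX", "FL", "TX")
)

wn_by_state <- wn_toy %>%
  # Keep Southwest only (carrier code WN).
  filter(carrier == "WN") %>%
  group_by(state) %>%
  summarise(
    flights      = n(),                  # every row is one flight
    flight_paths = n_distinct(flightnum) # each flight number counted once
  ) %>%
  arrange(desc(flights))
```

In this toy data, flight 60 flew three times, so it adds three flights but only one flight path to Texas.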

Question 4:

I want to know proportionately what regions (NE, south, west, midwest) each carrier flies to/from Houston in the month of July. Consider the month() function from the lubridate package.

As we can see below, not all carriers fly to each region of the United States. In fact, some carriers fly to only one region out of IAH. For example, American Airlines only flies to the South. The only two airlines that fly to all four regions of the country are Continental and SkyWest. However, it is important to keep in mind that in 2011 Continental and United were undergoing their merger, which may explain why United has no flights to the Northeast despite flying to the other three regions.

Below are two sets of code that produce the same plot: the first is my original attempt and the second is the more concise version recommended by Professor Kim.