library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.0 ✓ dplyr 1.0.4
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(nycflights13)
library(ggplot2)
Relationship between average speed and distance for the flights
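# arr_time is a clock time in HHMM format, not a duration, so use air_time instead.
# air_time is in minutes and distance is in miles, so avg_speed is in miles per hour.
flights <- flights %>% mutate(avg_speed = distance / (air_time / 60))
ggplot(data = flights, aes(distance, avg_speed)) +
  geom_point(shape = 21, size = 1.5, stroke = 1, color = "steelblue", fill = "white") +
  labs(x = "Distance (miles)", y = "Average speed (mph)") +
  ggtitle("Scatterplot of Average Speed vs. Distance")
## Warning: Removed 9430 rows containing missing values (geom_point).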

The average speed grows slightly with distance. This is because the time spent on takeoff and landing makes up a smaller share of the total flight time on long flights than on short ones. The longest flight here is from NYC to HNL (Honolulu).
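
The trend can be hard to judge from a dense scatterplot, so one way to check it is to summarise speed within distance bands. The following is a minimal sketch, assuming average speed is computed from `air_time` (minutes in the air). The band boundaries and the names `speed_by_distance`, `distance_band` and `longest_route` are illustrative choices, not part of the original analysis.

```r
library(dplyr)
library(nycflights13)

# Assumption: speed in miles per hour, from distance (miles) and air_time (minutes).
# The distance bands below are arbitrary and only meant to expose the trend.
speed_by_distance <- flights %>%
  filter(!is.na(air_time)) %>%
  mutate(
    avg_speed = distance / (air_time / 60),
    distance_band = cut(distance, breaks = c(0, 500, 1000, 2000, 3000, 5000))
  ) %>%
  group_by(distance_band) %>%
  summarise(
    n_flights = n(),
    median_speed = median(avg_speed),
    .groups = "drop"
  )

speed_by_distance

# Look up the longest route in the data rather than reading it off the plot.
longest_route <- flights %>%
  slice_max(distance, n = 1, with_ties = FALSE) %>%
  select(origin, dest, distance)

longest_route
```

Using the median rather than the mean keeps a handful of unusually fast or slow flights from dominating each band.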