plot <- flights1 |>
  ggplot(aes(x = dep_delay, y = arr_delay)) +
  # color shows whether each flight arrived late (arr_delay > 0) or not
  geom_point(aes(color = factor(arr_delay > 0)), alpha = 0.6, na.rm = TRUE) +
  # rename legend title and labels (FALSE = "Early", TRUE = "Delayed")
  scale_color_discrete(name = "Arrival Delays", labels = c("Early", "Delayed")) +
  geom_smooth(
    method = "lm",
    se = FALSE,       # hide the confidence interval
    linewidth = 1,    # `size` for lines is deprecated since ggplot2 3.4.0
    color = "green"   # regression line color
  ) +
  labs(
    x = "Departure Delay (minutes)",
    y = "Arrival Delay (minutes)",
    color = "Arrival Delays",
    title = "Relationship Between Departure and Arrival Delays",
    caption = "Time: 2023-01-01 06:00-07:00 Source: nycflights23 dataset"
  ) +
  # call theme_minimal() before theme() so the title centering is not overwritten
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
plot
`geom_smooth()` using formula = 'y ~ x'
Data Analysis
The scatterplot above shows the relationship between departure delays and arrival delays for flights between 06:00 and 07:00 on January 1, 2023. The x-axis represents departure delay in minutes, and the y-axis represents arrival delay in minutes. The green regression line summarizes the linear trend in this subset; because it slopes upward, there is a positive relationship between departure delay and arrival delay. In general, flights that depart late tend to arrive late as well.
Additionally, most delays fall between -20 and 20 minutes, indicating that most flights between 06:00 and 07:00 on that day had only minor delays or ran slightly early. However, there are still 3 outliers that were delayed by more than 1 hour. Finally, the pink dots (labeled "Early" in the legend) represent flights with no arrival delay, while the blue dots (labeled "Delayed") represent flights with an arrival delay. From the balance of the two colors, we can conclude that most flights in this time window arrived on time.
One interesting part of this visualization is the regression line, which shows a clear positive relationship between departure delays and arrival delays. It suggests that departure delays carry over to arrival delays: once a flight departs very late, it is difficult to recover the lost time in the air.
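The visual reading of the regression line can be backed up numerically. The following is a minimal sketch, not part of the original analysis: `summarise_delay_fit` is a hypothetical helper name, and it assumes only that the data frame passed in (for example `flights1`) has the `dep_delay` and `arr_delay` columns used in the plot. It fits the same linear model that `geom_smooth(method = "lm")` draws and reports the slope, the correlation, and the share of flights with an arrival delay.

```r
# Hypothetical helper (an illustration, not the author's code):
# summarises the linear relationship between departure and arrival delay.
summarise_delay_fit <- function(df) {
  # Keep only flights with both delays recorded, as the plot drops missing values
  complete <- df[!is.na(df$dep_delay) & !is.na(df$arr_delay), ]

  # Same model as geom_smooth(method = "lm") with formula y ~ x
  fit <- lm(arr_delay ~ dep_delay, data = complete)

  list(
    n             = nrow(complete),
    intercept     = unname(coef(fit)[1]),
    slope         = unname(coef(fit)[2]),   # extra arrival minutes per departure minute
    correlation   = cor(complete$dep_delay, complete$arr_delay),
    share_delayed = mean(complete$arr_delay > 0)  # fraction plotted as "Delayed"
  )
}

# Usage, assuming `flights1` is the filtered data frame used in the plot:
# summarise_delay_fit(flights1)
```

A positive slope would match the upward-sloping green line, and a `share_delayed` value below 0.5 would support the statement that most flights in this window arrived on time.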
Citation
All code references are based on previous Data 101 homework assignments and class notes.