── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.0 ✔ readr 2.1.6
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights23)
data(flights)
summ <- summary(flights$air_time)
summ
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   18.0    77.0   121.0   141.8   177.0   701.0   12534
My First Free Code
ggplot(flights, aes(distance, air_time)) +
  # Map arrival delay to both point size and color
  geom_point(aes(size = arr_delay, color = arr_delay), alpha = 0.4) +
  geom_smooth(se = FALSE, color = "#4a2be3") +
  scale_size_area() +
  scale_color_gradient(low = "#4877f0", high = "#f02929") +
  theme_minimal() +
  labs(
    x = "Flight Distance (miles)",
    y = "Air Time (minutes)",
    title = "Flight Delays in Terms of Distance",
    size = "Arrival Delay (minutes)",
    color = "Arrival Delay (minutes)",
    caption = "FAA Aircraft registry"
  )
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
Warning: Removed 12534 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 12534 rows containing missing values or values outside the scale range
(`geom_point()`).
Essay
The scatterplot plots flight distance against air time and encodes arrival delay in both the size and the color of each point, so larger, redder points mark longer delays. The legend explains what the colors and sizes represent. The graph shows little relationship between arrival delay (in minutes) and air time: regardless of whether a flight is short or long, a long or a short delay appears about equally likely. I would like to highlight the use of size and color together to show the variation in a single variable.
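
The visual impression of a weak relationship between air time and arrival delay could also be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper name `delay_airtime_cor` is hypothetical, and it assumes a data frame with `air_time` and `arr_delay` columns, as in `flights`. It computes a Pearson correlation over the rows where both values are present; a value near zero would be consistent with the reading above.

```r
# Hypothetical helper (not in the original analysis): Pearson correlation
# between air time and arrival delay, using only rows where both are present.
delay_airtime_cor <- function(df) {
  cor(df$air_time, df$arr_delay, use = "complete.obs")
}

# Run on the flights data only if the nycflights23 package is installed.
if (requireNamespace("nycflights23", quietly = TRUE)) {
  data("flights", package = "nycflights23", envir = environment())
  print(delay_airtime_cor(flights))
}
```

Pearson correlation only captures linear association, so it complements the plot rather than replacing it.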