Data Inspection & Cleaning

# Load the built-in airquality data set (daily New York air quality, May to September 1973)
data <- airquality
head(data)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
# Drop every row that has an NA in any column
data_clean <- na.omit(data)
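
Because na.omit() removes a row when any column is missing, it can be worth checking how many rows are lost in the cleaning step. The following is a minimal sketch using only base R; the names `na_per_column` and `rows_dropped` are illustrative and not part of the original analysis.

```r
# Rebuild the objects so this block runs on its own
data <- airquality
data_clean <- na.omit(data)

# Number of missing values in each column
na_per_column <- colSums(is.na(data))
print(na_per_column)

# Rows before and after cleaning, and how many were dropped
rows_dropped <- nrow(data) - nrow(data_clean)
cat("Rows before:", nrow(data),
    " after:", nrow(data_clean),
    " dropped:", rows_dropped, "\n")
```

If a plot only needs two columns, dropping NAs in just those columns would keep more rows; calling na.omit() on the whole data frame is the simpler choice used here.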

1. Scatter Plot

library(ggplot2)
ggplot(data_clean, aes(x = Temp, y = Ozone)) +
  geom_point() +
  labs(title = "Scatter Plot of Temperature vs. Ozone",
       x = "Temperature (°F)", y = "Ozone (ppb)")

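This graph can be used to answer: Does ozone increase as temperature rises?
-> Yes. The points trend upward, so higher temperatures tend to coincide with higher ozone levels. The plot shows an association only; it does not show that temperature causes the increase.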
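
The visual impression from the scatter plot can be backed by a number. The sketch below assumes a roughly linear relationship, which the plot suggests but does not guarantee; `temp_ozone_cor` and `temp_ozone_fit` are illustrative names.

```r
# Rebuild the cleaned data so this block runs on its own
data_clean <- na.omit(airquality)

# Pearson correlation between temperature and ozone (ranges from -1 to 1)
temp_ozone_cor <- cor(data_clean$Temp, data_clean$Ozone)
print(temp_ozone_cor)

# Slope of a simple linear fit: estimated change in ozone (ppb) per 1 °F
temp_ozone_fit <- lm(Ozone ~ Temp, data = data_clean)
print(coef(temp_ozone_fit)["Temp"])
```

A positive correlation and a positive slope agree with the upward trend in the plot. Adding `geom_smooth(method = "lm")` to the ggplot call would draw a fitted line of this kind on the figure.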

2. Box Plot

# factor(Month) treats Month as a categorical variable, so the data are split into one group (one box) per month.
ggplot(data_clean, aes(x = factor(Month), y = Ozone)) +
  geom_boxplot(fill = "lightblue") +
  labs(title = "Box Plot of Ozone Levels by Month",
       x = "Month", y = "Ozone (ppb)")

This plot lets us compare ozone levels across months and see which months have stable values and which have extreme ones.
-> Ozone levels vary by month: the summer months of July and August generally have higher median ozone levels than May and September. Some months show more variation (taller boxes, i.e. a larger interquartile range), while others are more consistent (shorter boxes).
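
The quantities the box plot draws can also be computed directly, which makes the month-to-month comparison easier to verify. This is a minimal base R sketch; `ozone_median` and `ozone_iqr` are illustrative names.

```r
# Rebuild the cleaned data so this block runs on its own
data_clean <- na.omit(airquality)

# Median ozone per month (the thick line inside each box)
ozone_median <- tapply(data_clean$Ozone, data_clean$Month, median)
print(ozone_median)

# Interquartile range per month (the height of each box)
ozone_iqr <- tapply(data_clean$Ozone, data_clean$Month, IQR)
print(ozone_iqr)
```

Both results are vectors named by month number (5 to 9), so `ozone_median["7"]` gives the July median.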

3. Histogram

ggplot(data_clean, aes(x = Wind)) +
  geom_histogram(binwidth = 2, fill = "skyblue", color = "black") +
  labs(title = "Histogram of Wind Speed",
       x = "Wind Speed (mph)", y = "Frequency")

This graph can be used to answer: Do most wind speeds fall within a specific range?
-> Yes. The distribution is roughly bell-shaped, with most wind speeds clustering around 10 mph. Values near the minimum and maximum of the range are less frequent, so the histogram has a single clear peak in the middle.
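
The reading of the histogram can be checked with a few summary numbers. In this sketch the bin edges (1, 3, ..., 21) are an assumption chosen to approximate 2 mph bins and may not coincide exactly with the bins ggplot2 draws; the 7 to 13 mph window is likewise an arbitrary illustration of "around 10 mph".

```r
# Rebuild the cleaned data so this block runs on its own
data_clean <- na.omit(airquality)

# Minimum, quartiles, mean, and maximum of wind speed
print(summary(data_clean$Wind))

# Count observations in 2 mph bins (assumed edges: 1, 3, ..., 21)
wind_bins <- cut(data_clean$Wind, breaks = seq(1, 21, by = 2))
wind_counts <- table(wind_bins)
print(wind_counts)

# Share of observations between 7 and 13 mph (illustrative window)
share_mid <- mean(data_clean$Wind >= 7 & data_clean$Wind <= 13)
print(share_mid)
```

A large value of `share_mid` together with the highest counts in the middle bins would support the description of a single central peak.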