The atmos
data set resides in the
nasaweather
package of the R programming language.
It contains a collection of atmospheric variables measured between 1995
and 2000 on a grid of 576 coordinates in the western hemisphere. The
data set comes from the 2006 ASA Data
Expo.
Some of the variables in the atmos data set are:
\(\forall x \in X, \quad \exists y \leq \epsilon\)
\[\cos (2\theta) = \cos^2 \theta - \sin^2 \theta\]
You can convert the temperature unit from Kelvin to Celsius with the formula
\[\text{celcius} = kelvin - 273.15 \]
And you can convert the result to Fahrenheit with the formula
\[\text{fahrenheit} = \text{celsius} \times \frac{9}{5} + 32 \]
library(tidyverse)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
dim(mtcars)
To analyze this data, we will use the following R packages:
library(nasaweather)
library(tidyverse)
means <- atmos %>%
filter(year == year) %>%
group_by(long, lat) %>%
summarize(temp = mean(temp, na.rm = TRUE),
pressure = mean(pressure, na.rm = TRUE),
ozone = mean(ozone, na.rm = TRUE),
cloudlow = mean(cloudlow, na.rm = TRUE),
cloudmid = mean(cloudmid, na.rm = TRUE),
cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
ungroup()
## `summarise()` has grouped output by 'long'. You can override using the
## `.groups` argument.
1:20 + 1:6
## [1] 2 4 6 8 10 12 8 10 12 14 16 18 14 16 18 20 22 24 20 22
where the year
objects equals 2000
Is the relationship between ozone and temperature useful for understanding fluctuations in ozone? A scatterplot of the variables shows a strong, but unusual relationship.
ggplot(data = means, aes(x = temp, y = ozone)) +
geom_point()
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
We suggest that ozone is highly correlated with temperature, but that a different relationship exists for each geographic region. We capture this relationship with a second order linear model of the form
\[ ozone = \alpha + \beta_{1} temperature + \sum_{locales} \beta_{i} locale_{i} + \sum_{locales} \beta_{j} interaction_{j} + \epsilon\]
This yields the following coefficients and relationships.
lm(ozone ~ temp + locale + temp:locale, data = means)
##
## Call:
## lm(formula = ozone ~ temp + locale + temp:locale, data = means)
##
## Coefficients:
## (Intercept) temp
## 1336.508 -3.559
## localenorth atlantic localesouth america
## 548.248 -1061.452
## localesouth pacific temp:localenorth atlantic
## -549.906 -1.827
## temp:localesouth america temp:localesouth pacific
## 3.496 1.785
## `geom_smooth()` using formula = 'y ~ x'
An anova test suggests that both locale and the interaction effect of locale and temperature are useful for predicting ozone (i.e., the p-value that compares the full model to the reduced models is statistically significant).
mod <- lm(ozone ~ temp, data = means)
mod2 <- lm(ozone ~ temp + locale, data = means)
mod3 <- lm(ozone ~ temp + locale + temp:locale, data = means)
anova(mod, mod2, mod3)
## Analysis of Variance Table
##
## Model 1: ozone ~ temp
## Model 2: ozone ~ temp + locale
## Model 3: ozone ~ temp + locale + temp:locale
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 574 99335
## 2 571 41425 3 57911 706.17 < 2.2e-16 ***
## 3 568 15527 3 25898 315.81 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mtcars <- read.csv("mtcars.csv")
mtcars_csv <- read.csv("mtcars.csv")
head(mtcars_csv)
## X mpg cyl disp hp drat wt qsec vs am gear carb
## 1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## 2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## 3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## 4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## 5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## 6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
knitr::kable(head(mtcars_csv))
X | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |