Data

The atmos data set resides in the ‘nasaweather’ package for the R programming language. It contains a collection of atmospheric variables measured between 1995 and 2000 on a grid of 576 coordinates in the western hemisphere. The data set comes from the 2006 ASA Data Expo.

Some of the variables in the atmos data set are:

some text here along with formula: \(\forall x \in X, \quad \exists y \leq \epsilon\)

OR we can do some text with the formula like this: \[\forall x \in X, \quad \exists y \leq \epsilon\]

You can convert the temperature unit from Kelvin to Celsius with the formula \[ \text{celsius} = \text{kelvin} - 273.15 \] and you can convert the result to Fahrenheit with the formula \[ \text{fahrenheit} = \text{celsius} \times \frac{9}{5} + 32 \]
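The two conversions can be sketched in R as follows. The function names and the sample temperatures are illustrative assumptions and do not come from the nasaweather package.

```r
# Hypothetical helper functions implementing the two formulas above
kelvin_to_celsius <- function(kelvin) kelvin - 273.15
celsius_to_fahrenheit <- function(celsius) celsius * 9 / 5 + 32

# Made-up sample temperatures in Kelvin (not taken from atmos)
temp_k <- c(273.15, 300)
temp_c <- kelvin_to_celsius(temp_k)
temp_f <- celsius_to_fahrenheit(temp_c)

print(data.frame(kelvin = temp_k, celsius = temp_c, fahrenheit = temp_f))
```

Both functions are vectorized, so they could be applied directly to a whole column of temperatures.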

Ctrl + Alt + I

2 + 2
## [1] 4

Cleaning

To analyze this data, we will use the following R packages:

# code chunk 1
library(nasaweather)
library(tidyverse)
## [1] 2000

For the remainder of the report, we will look only at data from the year 2000. We aggregate our data by location, using the R code below.

Monospaced text example. What is the result of 2 + 3? Answer: 5. And the year we analyzed is 2000.

means <- atmos %>%
  filter(year == .env$year) %>%  # .env$year is the year object (2000), not the year column
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()

## `summarise()` has grouped output by 'long'. You can override by using the `.groups` argument.

Here, the year object equals 2000.
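One detail matters when filtering on year: atmos has a column named year, and inside filter() a bare name is looked up in the data first. The sketch below uses a made-up three-row tibble (an assumption, not the atmos data) to show the difference between comparing the column with itself and comparing it with the outer year object through the .env pronoun.

```r
library(dplyr)

# Toy data standing in for atmos; the values are invented for illustration
toy <- tibble(year = c(1999, 2000, 2000), temp = c(290, 295, 300))
year <- 2000

# Both sides resolve to the column, so every row passes the filter
masked <- toy %>% filter(year == year)

# .env$year refers explicitly to the object defined outside the data
explicit <- toy %>% filter(year == .env$year)

print(nrow(masked))
print(nrow(explicit))
```

Giving the outer object a name that does not clash with a column, for example a hypothetical target_year, would avoid the ambiguity altogether.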

Ozone and temperature

Is the relationship between ozone and temperature useful for understanding fluctuations in ozone? A scatterplot of the two variables shows a strong but unusual relationship.

We suspect that group-level effects are caused by environmental conditions that vary by locale. To test this idea, we sort each data point into one of four geographic regions:

# code chunk 7
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
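To see how these rules carve up the grid, the same assignment logic can be wrapped in a small function and applied to a few sample coordinates. The function name assign_locale and the test coordinates are assumptions made for illustration; the thresholds are the ones used in the chunk above.

```r
# Hypothetical helper that reproduces the assignment rules from code chunk 7
assign_locale <- function(long, lat) {
  locale <- rep("north america", length(long))
  locale[lat < 10] <- "south pacific"
  locale[long > -80 & lat < 10] <- "south america"
  locale[long > -80 & lat > 10] <- "north atlantic"
  locale
}

# Made-up coordinates, one per quadrant around long = -80, lat = 10
sample_long <- c(-100, -100, -60, -60)
sample_lat  <- c(30, 0, 0, 30)
regions <- assign_locale(sample_long, sample_lat)
print(data.frame(long = sample_long, lat = sample_lat, locale = regions))
```

Because the rules use strict inequalities, a point with a latitude of exactly 10 keeps the default label "north america", whatever its longitude.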

Model

We suggest that ozone is highly correlated with temperature, but that a different relationship exists for each geographic region. We capture this relationship with a second-order linear model of the form \[ \text{ozone} = \alpha + \beta_{1}\,\text{temperature} + \sum_{\text{locales}} \beta_{i}\,\text{locale}_{i} + \sum_{\text{locales}} \beta_{j}\,\text{interaction}_{j} + \epsilon \] This yields the following coefficients and relationships.

# code chunk 8
lm(ozone ~ temp + locale + temp:locale, data = means)
## 
## Call:
## lm(formula = ozone ~ temp + locale + temp:locale, data = means)
## 
## Coefficients:
##               (Intercept)                       temp  
##                  1336.508                     -3.559  
##      localenorth atlantic        localesouth america  
##                   548.248                  -1061.452  
##       localesouth pacific  temp:localenorth atlantic  
##                  -549.906                     -1.827  
##  temp:localesouth america   temp:localesouth pacific  
##                     3.496                      1.785
## `geom_smooth()` using formula = 'y ~ x'
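The interaction coefficients are easier to read as one temperature slope per region. In this model, "north america" is the reference level (the first locale alphabetically), so its slope is the temp coefficient alone, and each other region's slope is that coefficient plus its interaction term. The sketch below copies the rounded coefficients from the printed output, so the results are approximate.

```r
# Coefficients copied (rounded) from the lm() output above
base_slope <- -3.559  # temp, which is the slope for the reference locale
interaction <- c("north atlantic" = -1.827,
                 "south america"  =  3.496,
                 "south pacific"  =  1.785)

# Per-region slope of ozone on temperature
slopes <- c("north america" = base_slope, base_slope + interaction)
print(slopes)
```

By this arithmetic the fitted slope is steepest in the north atlantic and close to zero in south america.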

Diagnostics

An ANOVA test suggests that both locale and the interaction effect of locale and temperature are useful for predicting ozone (i.e., the p-values that compare each larger model to the reduced model before it are statistically significant).

# code chunk 10
mod <- lm(ozone ~ temp, data = means)
mod2 <- lm(ozone ~ temp + locale, data = means)
mod3 <- lm(ozone ~ temp + locale + temp:locale, data = means)

anova(mod, mod2, mod3)
## Analysis of Variance Table
## 
## Model 1: ozone ~ temp
## Model 2: ozone ~ temp + locale
## Model 3: ozone ~ temp + locale + temp:locale
##   Res.Df   RSS Df Sum of Sq      F    Pr(>F)    
## 1    574 99335                                  
## 2    571 41425  3     57911 706.17 < 2.2e-16 ***
## 3    568 15527  3     25898 315.81 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
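The F statistics in this table can be reproduced by hand from the residual sums of squares. The sketch below assumes the usual sequential comparison, in which each drop in RSS is divided by its degrees of freedom and then by the residual mean square of the largest model. It uses the rounded values printed above, so it matches the reported F values only approximately.

```r
# Values copied from the ANOVA table above (rounded as printed)
rss    <- c(99335, 41425, 15527)
res_df <- c(574, 571, 568)

ss_drop <- -diff(rss)     # reduction in RSS from each added set of terms
df_drop <- -diff(res_df)  # degrees of freedom used by those terms

# Residual mean square of the largest model (Model 3) is the denominator
scale <- rss[3] / res_df[3]
f_stat <- (ss_drop / df_drop) / scale
print(round(f_stat, 1))
```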