The Annual Data Challenge Expo is jointly sponsored by three American Statistical Association (ASA) Sections – Statistical Computing, Statistical Graphics, and Government Statistics.
The atmos data set resides in the nasaweather package of the R programming language. It contains a collection of atmospheric variables measured between 1995 and 2000 on a grid of 576 coordinates in the western hemisphere. The data set comes from the ASA Data Expo.
Some of the variables in the atmos data set are:
You can convert the temperature unit from Kelvin to Celsius with the formula

\[ \text{celsius} = \text{kelvin} - 273.15 \]

And you can convert the result to Fahrenheit with the formula

\[ \text{fahrenheit} = \text{celsius} \times \frac{9}{5} + 32 \]

## Preparing the Data
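Before any analysis, temperatures recorded in Kelvin can be converted with the two formulas above. The following is a minimal sketch in base R; the function names `kelvin_to_celsius` and `celsius_to_fahrenheit` are illustrative and are not part of the nasaweather package.

```r
# Convert temperatures from Kelvin to degrees Celsius.
kelvin_to_celsius <- function(kelvin) {
  kelvin - 273.15
}

# Convert temperatures from degrees Celsius to degrees Fahrenheit.
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

# Both functions are vectorized, so they accept a whole column at once.
kelvin <- c(273.15, 300)
celsius <- kelvin_to_celsius(kelvin)
fahrenheit <- celsius_to_fahrenheit(celsius)
```

Applied to the data, this might look like `kelvin_to_celsius(atmos$temp)`, assuming the temperature column is recorded in Kelvin as the text describes.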
To analyze the data, we load the nasaweather and tidyverse packages and average each variable at every grid location:
library(nasaweather)  # provides the atmos data set
library(tidyverse)    # provides the dplyr verbs and the %>% pipe

# Average each variable over all years at every (long, lat) grid location.
means <- atmos %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
## `summarise()` has grouped output by 'long'. You can override using the
## `.groups` argument.
Is the relationship between ozone and temperature useful for understanding fluctuations in ozone? A scatterplot of the two variables shows a strong but unusual relationship.
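A scatterplot like the one described here could be drawn along the following lines. This is a sketch that assumes ggplot2 is installed and that the data frame has numeric columns named `temp` and `ozone`; the helper name `plot_ozone_temp` is hypothetical.

```r
library(ggplot2)

# Draw a scatterplot of ozone (y axis) against temperature (x axis).
# `df` is assumed to have numeric columns named `temp` and `ozone`.
plot_ozone_temp <- function(df) {
  ggplot(df, aes(x = temp, y = ozone)) +
    geom_point()
}
```

With the averaged data from the previous section, the call would be `plot_ozone_temp(means)`.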
We suspect that the group-level effects are caused by environmental conditions that vary by locale. To test this idea, we sort each data point into one of four geographic regions:
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
We suggest that ozone is highly correlated with temperature, but that a different relationship exists for each geographic region.
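One way to probe this suggestion is to compute the correlation between temperature and ozone within each region separately. The sketch below uses only base R; the helper name `cor_by_locale` is hypothetical, and it assumes a data frame with columns `locale`, `temp`, and `ozone`.

```r
# Compute the ozone-temperature correlation separately for each region.
# `df` is assumed to have columns `locale`, `temp`, and `ozone`.
cor_by_locale <- function(df) {
  # Split the rows into one data frame per locale.
  groups <- split(df, df$locale)
  # Pearson correlation within each group, ignoring rows with missing values.
  sapply(groups, function(g) cor(g$temp, g$ozone, use = "complete.obs"))
}
```

Calling `cor_by_locale(means)` would return a named vector with one correlation per locale. Correlation measures only the strength of a linear association, so fitting `lm(ozone ~ temp)` within each region would be a natural follow-up for comparing the relationships themselves.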