The Annual Data Challenge Expo is jointly sponsored by three American Statistical Association (ASA) Sections – Statistical Computing, Statistical Graphics, and Government Statistics.
## Data
The ‘atmos’ data set resides in the nasaweather package for the R programming language. It contains a collection of atmospheric variables measured between 1995 and 2000 on a grid of 576 coordinates in the western hemisphere. The data set comes from the ASA Data Expo.
Some of the variables in the atmos data set are:
You can convert the temperature unit from Kelvin to Celsius with the formula
$celsius = kelvin - 273.15$
And you can convert the result to Fahrenheit with the formula
\[ fahrenheit = celsius \times \frac{9}{5} + 32 \]
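As a minimal sketch, the two conversions above can be written as small R helpers. The function names `kelvin_to_celsius` and `celsius_to_fahrenheit` are illustrative and do not come from the nasaweather package; the sample temperatures are made up.

```r
# Hypothetical helper names; they implement the two formulas given above.
kelvin_to_celsius <- function(kelvin) {
  kelvin - 273.15
}

celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

# Made-up sample temperatures in Kelvin (not taken from atmos).
temps_k <- c(273.15, 300)
temps_c <- kelvin_to_celsius(temps_k)
temps_f <- celsius_to_fahrenheit(temps_c)

print(data.frame(kelvin = temps_k, celsius = temps_c, fahrenheit = temps_f))
```

Both helpers are vectorized, so they could be applied directly to a whole column such as `atmos$temp`.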
library(nasaweather)
library(tidyverse)
For the remainder of the report, we look only at data from 1995. We aggregate the data by location (longitude and latitude) using the R code below.
means <- atmos %>%
  # Keep only 1995; `year == year` would compare the column with itself
  # and keep every row.
  filter(year == 1995) %>%
  group_by(long, lat) %>%
  # Average each variable over the months at each grid point.
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
## `summarise()` has grouped output by 'long'. You can override using the
## `.groups` argument.
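The same filter-then-aggregate pattern can be sketched more compactly with `across()`, which applies one summary function to several columns, and with `.groups = "drop"`, which returns an ungrouped result without the message above. This sketch assumes dplyr 1.0 or later. The data frame `toy` and the variable `target_year` are made up for illustration and stand in for `atmos` and the chosen year.

```r
library(dplyr)

# Toy stand-in for atmos: two grid points, two years (all values are made up).
toy <- data.frame(
  long  = c(-100, -100, -100, -50, -50, -50),
  lat   = c(20, 20, 20, -5, -5, -5),
  year  = c(1995, 1995, 1996, 1995, 1995, 1996),
  temp  = c(290, 292, 300, 298, NA, 310),
  ozone = c(260, 264, 280, 250, 254, 290)
)

# Named differently from the `year` column on purpose: inside filter(),
# `year == year` would compare the column with itself.
target_year <- 1995

toy_means <- toy %>%
  filter(year == target_year) %>%
  group_by(long, lat) %>%
  # One mean per listed column, ignoring missing values.
  summarize(across(c(temp, ozone), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")

print(toy_means)
```

Listing the columns once inside `across()` avoids repeating `mean(..., na.rm = TRUE)` for every variable, at the cost of a slightly less explicit call.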
Is the relationship between ozone and temperature useful for understanding fluctuations in ozone? A scatterplot of the two variables shows a strong but unusual relationship.
ggplot(data = means, aes(x = temp, y = ozone)) + geom_point()
We suspect that group-level effects are caused by environmental conditions that vary by locale. To test this idea, we sort each data point into one of four geographic regions:
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
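The four assignments above can also be wrapped in a small function so the rule can be checked on a few hand-picked coordinates. This is only a sketch: the name `classify_locale` and the test points in `pts` are hypothetical, while the thresholds (latitude 10, longitude -80) are the ones used above.

```r
# Hypothetical helper that mirrors the sequential assignments above.
classify_locale <- function(long, lat) {
  # Default label; points with lat exactly 10 keep it.
  locale <- rep("north america", length(lat))
  locale[lat < 10] <- "south pacific"
  # Later assignments overwrite earlier ones, as in the original code.
  locale[long > -80 & lat < 10] <- "south america"
  locale[long > -80 & lat > 10] <- "north atlantic"
  locale
}

# Made-up test points, one per region.
pts <- data.frame(long = c(-100, -100, -60, -60),
                  lat  = c(30, -10, -10, 30))
pts$locale <- classify_locale(pts$long, pts$lat)

print(pts)
```

Under these assumptions the function could be applied to the aggregated data as `classify_locale(means$long, means$lat)`.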
We suggest that ozone is highly correlated with temperature, but that a different relationship exists for each geographic region.
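One way to examine a claim of this kind is to fit a separate linear model of ozone on temperature within each region and compare the slopes. The sketch below uses simulated data only: the region names, slopes, and noise level are invented for illustration and are not results from the atmos data.

```r
set.seed(42)

# Simulated data: two made-up regions with different temperature slopes.
n <- 50
sim <- data.frame(
  locale = rep(c("region a", "region b"), each = n),
  temp   = runif(2 * n, min = 270, max = 300)
)
true_slope <- c("region a" = -1, "region b" = 2)  # assumed, not estimated
sim$ozone <- 300 +
  unname(true_slope[sim$locale]) * (sim$temp - 285) +
  rnorm(2 * n, sd = 1)

# Fit ozone ~ temp separately within each region.
fits <- lapply(split(sim, sim$locale),
               function(d) lm(ozone ~ temp, data = d))

# Extract the estimated temperature slope for each region.
slopes <- sapply(fits, function(fit) coef(fit)[["temp"]])

print(slopes)
```

With the real data, the same pattern would use `split(means, means$locale)`. A single model with an interaction term, `lm(ozone ~ temp * locale, data = means)`, is an alternative that estimates the same per-region slopes in one fit.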