About the Challenge Expo

The Annual Data Challenge Expo is jointly sponsored by three American Statistical Association (ASA) Sections – Statistical Computing, Statistical Graphics, and Government Statistics.

Data

The atmos data set resides in the nasaweather package of the R programming language. It contains a collection of atmospheric variables measured between 1995 and 2000 on a grid of 576 coordinates in the western hemisphere. The data set comes from the 2006 ASA Data Expo.

Some of the variables in the atmos data set are:

  • temp - The mean monthly air temperature near the surface of the Earth (measured in kelvins (K))
  • pressure - The mean monthly air pressure at the surface of the Earth (measured in millibars (mb))
  • ozone - The mean monthly abundance of atmospheric ozone (measured in Dobson units (DU))

You can convert the temperature unit from Kelvin to Celsius with the formula

\[ celsius = kelvin - 273.15 \]

And you can convert the result to Fahrenheit with the formula

\[ fahrenheit = celsius \times \frac{9}{5} + 32 \]
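The two formulas above can be sketched as small R functions. The function names `kelvin_to_celsius` and `celsius_to_fahrenheit` are hypothetical helpers introduced here for illustration; they are not part of the nasaweather package, and the input of 300 K is only an illustrative value.

```r
# Convert a temperature from kelvins to degrees Celsius.
kelvin_to_celsius <- function(kelvin) {
  kelvin - 273.15
}

# Convert a temperature from degrees Celsius to degrees Fahrenheit.
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

# Chain the two conversions for an illustrative value of 300 K.
temp_c <- kelvin_to_celsius(300)          # 26.85 degrees Celsius
temp_f <- celsius_to_fahrenheit(temp_c)   # 80.33 degrees Fahrenheit
```

Both functions are vectorized, so they should also work on a whole column such as `atmos$temp`.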

Preparing the Data

To analyze these data, we will use the nasaweather package and the tidyverse collection of packages:

library(nasaweather)
library(tidyverse)

For the remainder of the report, we will look only at data from the year 1995. We aggregate the data by location (longitude and latitude), using the R code below.

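# Keep only the 1995 observations, then average each variable
# over the year at every grid point (long, lat).
means <- atmos %>%
  filter(year == 1995) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  # Drop the grouping so later steps operate on the whole table
  ungroup()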

Ozone and temperature

Is the relationship between ozone and temperature useful for understanding fluctuations in ozone? A scatterplot of the two variables shows a strong but unusual relationship.
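A scatterplot like the one described could be drawn with ggplot2, roughly as sketched below. The helper `plot_ozone_temp` and the axis labels are assumptions made for illustration, not necessarily the code used for this report; it expects a data frame with `temp` and `ozone` columns, such as the per-location averages stored in `means`.

```r
library(ggplot2)

# Hypothetical helper: scatterplot of mean ozone against mean temperature.
# Expects a data frame with numeric columns `temp` (K) and `ozone` (DU).
plot_ozone_temp <- function(data) {
  ggplot(data, aes(x = temp, y = ozone)) +
    geom_point() +
    labs(x = "Mean temperature (K)", y = "Mean ozone (DU)")
}

# Usage, assuming `means` holds the per-location averages computed earlier:
# plot_ozone_temp(means)
```

Wrapping the plot in a function keeps the sketch independent of how `means` was built, so the same call can be reused on a subset of locations.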

We suspect that group-level effects are caused by environmental conditions that vary by locale. To test this idea, we sort each data point into one of four geographic regions: