*italic demo* #it will show up in italics when you click knit
_italics demo_ #another way
**bold demo** #it will show up in bold when you click knit
__bold demo__ #another way
`some R code or R packages` #shows R code or R packages
superscript^2^
subscript~2~
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
You can also embed plots, for example:
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
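For reference, the kind of chunk being described looks like the sketch below, modeled on the default R Markdown template (the pressure data set ships with base R); echo = FALSE hides the code while still showing the plot in the knitted document.

```{r pressure, echo=FALSE}
plot(pressure)
```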
The atmos data set resides in the nasaweather package of the R programming language. It contains a collection of atmospheric variables measured between 1995 and 2000 on a grid of 576 coordinates in the western hemisphere. The data set comes from the 2006 ASA Data Expo.
Some of the variables in the atmos data set are:

- long and lat - the longitude and latitude of the grid point
- year - the year of the measurement
- temp - temperature, recorded in Kelvin
- pressure - atmospheric pressure
- ozone - ozone level
- cloudlow, cloudmid, and cloudhigh - cloud cover at low, middle, and high altitudes
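One quick way to confirm the full set of variables is to print the column names after loading the package; a minimal sketch:

library(nasaweather)
names(atmos)   # prints every column name in the atmos data set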
You can convert the temperature unit from Kelvin to Celsius with the formula
\[ celsius = kelvin - 273.15 \]
And you can convert the result to Fahrenheit with the formula
\[ fahrenheit = celsius \times \frac{9}{5} + 32 \]
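As a quick sanity check, the same conversions in R (the numeric value here is illustrative, not taken from the data):

kelvin <- 293.15
celsius <- kelvin - 273.15           # 20
fahrenheit <- celsius * 9 / 5 + 32   # 68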
To analyze this data, we will use the following R packages:
library(nasaweather)
library(tidyverse)
1:20 + 1:6
## [1] 2 4 6 8 10 12 8 10 12 14 16 18 14 16 18 20 22 24 20 22
You can also insert computed values directly into a sentence with inline R code: for example, the result of 2 + 3 is 5, and the year we analyzed is 1995.
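In the R Markdown source, values like these are produced with inline code rather than typed by hand; a minimal sketch, assuming a year object has been defined earlier (for example with year <- 1995):

The result of 2 + 3 is `r 2 + 3`, and the year we analyzed is `r year`.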
For the remainder of the report, we will look only at data from the year <!-- code chunk 3: Insert inline code to reference year -->1995. We aggregate our data by location, using the R code below.
means <- atmos %>%
  filter(year == !!year) %>%   # !!year uses the year object (1995), not the year column
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE),
            pressure = mean(pressure, na.rm = TRUE),
            ozone = mean(ozone, na.rm = TRUE),
            cloudlow = mean(cloudlow, na.rm = TRUE),
            cloudmid = mean(cloudmid, na.rm = TRUE),
            cloudhigh = mean(cloudhigh, na.rm = TRUE)) %>%
  ungroup()
## `summarise()` has grouped output by 'long'. You can override using the
## `.groups` argument.
where the year object equals 1995.
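The summarise() message above is informational only; one way to silence it (a sketch, shortened to a single summary column) is to set the .groups argument explicitly, which also makes the final ungroup() unnecessary:

means <- atmos %>%
  filter(year == !!year) %>%
  group_by(long, lat) %>%
  summarize(temp = mean(temp, na.rm = TRUE), .groups = "drop")   # returns an ungrouped result, no message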
Is the relationship between ozone and temperature useful for understanding fluctuations in ozone? A scatterplot of the variables shows a strong, but unusual relationship.
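The plot itself is not reproduced here; a minimal sketch of how such a scatterplot can be drawn from the aggregated means data:

ggplot(means, aes(temp, ozone)) +
  geom_point()   # one point per grid location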
We suggest that ozone is highly correlated with temperature, but that a different relationship exists for each geographic region. We capture this relationship with a second order linear model of the form
\[ ozone = \alpha + \beta_{1} temperature + \sum_{locales} \beta_{i} locale_{i} + \sum_{locales} \beta_{j} interaction_{j} + \epsilon\]
This yields the following coefficients and relationships.
# code chunk 7
means$locale <- "north america"
means$locale[means$lat < 10] <- "south pacific"
means$locale[means$long > -80 & means$lat < 10] <- "south america"
means$locale[means$long > -80 & means$lat > 10] <- "north atlantic"
# code chunk 8
lm(ozone ~ temp + locale + temp:locale, data = means)
##
## Call:
## lm(formula = ozone ~ temp + locale + temp:locale, data = means)
##
## Coefficients:
## (Intercept) temp
## 1336.508 -3.559
## localenorth atlantic localesouth america
## 548.248 -1061.452
## localesouth pacific temp:localenorth atlantic
## -549.906 -1.827
## temp:localesouth america temp:localesouth pacific
## 3.496 1.785
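Reading the output, the fitted temperature slope for a locale is the baseline temp coefficient (north america is the reference level) plus that locale's interaction term; for example, for south america:

\[ -3.559 + 3.496 = -0.063 \]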
# code chunk 9
ggplot(means, aes(temp, ozone, color = locale)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~ locale)
## `geom_smooth()` using formula = 'y ~ x'
An ANOVA test suggests that both locale and the interaction effect of locale and temperature are useful for predicting ozone (i.e., the p-values that compare each fuller model to its reduced model are statistically significant).
# code chunk 10
mod <- lm(ozone ~ temp, data = means)
mod2 <- lm(ozone ~ temp + locale, data = means)
mod3 <- lm(ozone ~ temp + locale + temp:locale, data = means)
anova(mod, mod2, mod3)
## Analysis of Variance Table
##
## Model 1: ozone ~ temp
## Model 2: ozone ~ temp + locale
## Model 3: ozone ~ temp + locale + temp:locale
##   Res.Df   RSS Df Sum of Sq      F    Pr(>F)
## 1    574 99335
## 2    571 41425  3     57911 706.17 < 2.2e-16 ***
## 3    568 15527  3     25898 315.81 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1