Problem Set #2

Rob Leteff

date()
## [1] "Sat Sep 22 18:17:45 2012"

Due Date: September 27, 2012
Total Points: 30

1 The following ten observations, taken during the years 1970-1979, are on October snow cover for Eurasia in units of millions of square kilometers. Follow the instructions and answer questions by typing the appropriate R commands.

Year Snow
1970 6.5
1971 12.0
1972 14.9
1973 10.0
1974 10.7
1975 7.9
1976 21.9
1977 12.5
1978 14.5
1979 9.2

a. Create a data frame from these data. (2)
b. What are the mean and median snow cover over this decade? (2)
c. What is the standard deviation of the snow cover over this decade? (2)
d. How many Octobers had snow cover greater than 10 million km\(^2\)? (2)

(a) Create a data frame from these data. (2)

EurSnow = data.frame(Year = 1970:1979, MilSqKm = c(6.5, 12, 14.9, 10, 10.7, 
    7.9, 21.9, 12.5, 14.5, 9.2))
EurSnow
##    Year MilSqKm
## 1  1970     6.5
## 2  1971    12.0
## 3  1972    14.9
## 4  1973    10.0
## 5  1974    10.7
## 6  1975     7.9
## 7  1976    21.9
## 8  1977    12.5
## 9  1978    14.5
## 10 1979     9.2

(b) What are the mean and median snow cover over this decade? (2)

mean(EurSnow$MilSqKm)
## [1] 12.01
median(EurSnow$MilSqKm)
## [1] 11.35

(c) What is the standard deviation of the snow cover over this decade? (2)

sd(EurSnow$MilSqKm)
## [1] 4.391
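
As a cross-check, the same value can be reproduced from the definition of the sample standard deviation, which divides the sum of squared deviations from the mean by n - 1. This is only a sketch: the names `snow`, `n`, and `manual.sd` are introduced here for illustration and do not appear in the original answer.

```r
# October snow cover values from the table above (millions of square km)
snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)
n = length(snow)

# Sample standard deviation: sqrt of the sum of squared deviations over (n - 1)
manual.sd = sqrt(sum((snow - mean(snow))^2) / (n - 1))
manual.sd

# Compare with the built-in function
all.equal(manual.sd, sd(snow))
```

Dividing by n - 1 rather than n is what sd() does, so the two results should agree.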

(d) How many Octobers had snow cover greater than 10 million km\(^2\)? (2)

sum(EurSnow$MilSqKm > 10)
## [1] 6
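
The count works because the comparison returns a logical vector and sum() treats each TRUE as 1. The same logical vector can also be used as an index to see which years qualify. The sketch below rebuilds the data frame so it runs on its own; the name `above10` is an illustrative addition.

```r
EurSnow = data.frame(Year = 1970:1979,
                     MilSqKm = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2))

above10 = EurSnow$MilSqKm > 10   # logical vector, TRUE where cover exceeds 10
sum(above10)                     # each TRUE counts as 1
EurSnow$Year[above10]            # years that satisfy the condition
```

Note that 1973, at exactly 10.0, is excluded because the comparison is strict.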

2 The data set rivers contains the lengths (miles) of 141 major rivers in North America.
a. What proportion of the rivers are shorter than 500 miles long? (2)
b. What proportion of the rivers are shorter than the mean length? (2)
c. What is the 75th percentile river length? (2)
d. What is the interquartile range in river length? (2)

(a) What proportion of the rivers are shorter than 500 miles long? (2)

short.rivers = sum(rivers < 500)
short.rivers/length(rivers)
## [1] 0.5816

(b) What proportion of the rivers are shorter than the mean length? (2)

short.riv.mean = sum(rivers < mean(rivers))
short.riv.mean/length(rivers)
## [1] 0.6667
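
An equivalent shortcut, shown here as a sketch, is to take the mean of the logical vector directly: since TRUE counts as 1 and FALSE as 0, the mean is the proportion of TRUE values. The names `prop.short` and `prop.below.mean` are illustrative.

```r
# rivers is a built-in data set in base R (datasets package)
prop.short = mean(rivers < 500)                 # proportion shorter than 500 miles
prop.below.mean = mean(rivers < mean(rivers))   # proportion shorter than the mean
c(prop.short, prop.below.mean)
```

This avoids the intermediate count and the explicit division by length(rivers).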

(c) What is the 75th percentile river length? (2)

quantile(rivers, probs = 0.75)
## 75% 
## 680

(d) What is the interquartile range in river length? (2)

quantile(rivers, probs = c(0.25, 0.75))
## 25% 75% 
## 310 680
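
The two quartiles bound the middle half of the data; the interquartile range itself is their difference, 680 - 310 = 370 miles. A short sketch of that last step follows, using the illustrative names `q` and `iqr.manual`, alongside the built-in IQR() function.

```r
q = quantile(rivers, probs = c(0.25, 0.75))
iqr.manual = unname(q[2] - q[1])   # upper quartile minus lower quartile
iqr.manual
## [1] 370

IQR(rivers)                        # built-in equivalent
## [1] 370
```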

3 Consider the SSN.txt file on Blackboard. The file contains monthly sunspot numbers since 1851.
a. Read the data into R to create a data frame. (4)
b. Create a histogram of the September sunspot numbers. (2)
c. Create a scatter plot placing the June sunspot numbers on the horizontal axis and September sunspot numbers on the vertical axis. (4)
d. Use the grammar of graphics syntax to create a plot showing the September sunspot numbers on the vertical axis and the year on the horizontal axis. (4)

(a) Read the data into R to create a data frame. (4)

sunspot = read.table("SSN.txt", header = TRUE)
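
To illustrate what header = TRUE does without needing the course file, the sketch below writes a tiny stand-in file and reads it back. The layout (a Year column followed by month columns) is an assumption based on the column names used in this problem, and the numbers are made up, not real sunspot data. The names `tmp` and `demo` are illustrative.

```r
# Made-up values in an ASSUMED layout; these are not the real sunspot numbers
tmp = tempfile(fileext = ".txt")
writeLines(c("Year Jun Sep",
             "1851 10 20",
             "1852 30 40"), tmp)

demo = read.table(tmp, header = TRUE)   # header = TRUE takes column names from line 1
str(demo)                               # check column names and types after reading
unlink(tmp)                             # remove the temporary file
```

Running str() on the real `sunspot` data frame in the same way is a quick check that the columns were read as numbers.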

(b) Create a histogram of the September sunspot numbers. (2)

hist(sunspot$Sep)

[Figure: histogram of September sunspot numbers]

(c) Create a scatter plot placing the June sunspot numbers on the horizontal axis and September sunspot numbers on the vertical axis. (4)

plot(sunspot$Jun, sunspot$Sep, xlab = "June Sunspot Activity", ylab = "September Sunspot Activity")

[Figure: scatter plot of September sunspot numbers against June sunspot numbers]

(d) Use the grammar of graphics syntax to create a plot showing the September sunspot numbers on the vertical axis and the year on the horizontal axis. (4)

require(ggplot2)
## Loading required package: ggplot2
ggplot(sunspot, aes(x = Year, y = Sep)) + geom_line() + xlab("Year") + ylab("September Sunspot Activity")

[Figure: line plot of September sunspot numbers by year]