Problem Set # 2

Param Maharaj

date()
## [1] "Wed Sep 26 18:59:54 2012"

Due Date: September 27, 2012
Total Points: 30

1 The following ten observations, taken during the years 1970-1979, are on October snow cover for Eurasia in units of millions of square kilometers. Follow the instructions and answer questions by typing the appropriate R commands.

Year Snow
1970 6.5
1971 12.0
1972 14.9
1973 10.0
1974 10.7
1975 7.9
1976 21.9
1977 12.5
1978 14.5
1979 9.2

a. Create a data frame from these data. (2)

snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)
year = c(1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979)
snowcover = data.frame(year = year, snow = snow)

b. What are the mean and median snow cover over this decade? (2)

mean(snow)
## [1] 12.01

#Mean snow cover is 12.01 million km^2

median(snow)
## [1] 11.35

#Median snow cover is 11.35 million km^2

c. What is the standard deviation of the snow cover over this decade? (2)

sd(snow)
## [1] 4.391

#The standard deviation of the snow cover is 4.390761 million km^2
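As a check on part c, the sample standard deviation can be recomputed from its definition. This is only a sketch: the name sd_manual is introduced here for illustration and is not part of the original answer.

```r
# Snow cover values copied from the table in question 1 (millions of km^2)
snow <- c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)
n <- length(snow)

# Sample standard deviation: square root of the summed squared deviations
# from the mean, divided by n - 1 (the same divisor that sd() uses)
sd_manual <- sqrt(sum((snow - mean(snow))^2) / (n - 1))

# TRUE when the hand computation agrees with the built-in function
all.equal(sd_manual, sd(snow))
```

The n - 1 divisor matters here: dividing by n instead would give a smaller value than the one sd() reports.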

d. How many Octobers had snow cover greater than 10 million km\( ^2 \)? (2)

sum(snow > 10)
## [1] 6

#There were 6 Octobers that had snow cover greater than 10 million km\( ^2 \).
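The count in part d works because a logical vector is treated as 0s and 1s when summed. The same comparison can also pick out which years exceeded the threshold, using the data frame from part a. A small sketch (snowy_years is an illustrative name, and year is built with 1970:1979 as a shorthand for the vector typed out in part a):

```r
# Rebuild the data frame from part a so the example stands alone
snowcover <- data.frame(
  year = 1970:1979,
  snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)
)

# Logical indexing: keep only the years whose snow cover is strictly above 10
snowy_years <- snowcover$year[snowcover$snow > 10]
snowy_years

# The number of selected years matches sum(snow > 10)
length(snowy_years)
```

Note that 1973, with exactly 10.0, is excluded because the comparison is strict.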

2 The data set rivers contains the lengths (miles) of 141 major rivers in North America.
a. What proportion of the rivers are shorter than 500 miles long? (2)

require(UsingR)  # rivers also ships with base R's datasets package
## Loading required package: UsingR
## Loading required package: MASS
sum(rivers < 500)/length(rivers)
## [1] 0.5816

#Proportion of rivers shorter than 500 miles long is 0.5815603

b. What proportion of the rivers are shorter than the mean length? (2)

sum(rivers < mean(rivers))/length(rivers)
## [1] 0.6667

#The proportion of rivers shorter than the mean length is 0.6667
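Both proportions above can also be written with mean(), since the mean of a logical vector is the fraction of TRUE values. A brief sketch, assuming the copy of rivers that ships with base R's datasets package (the variable names are illustrative):

```r
# Fraction of rivers shorter than 500 miles
prop_short <- mean(rivers < 500)

# Fraction of rivers shorter than the mean length
prop_below_mean <- mean(rivers < mean(rivers))

# Both agree with the sum(...)/length(...) form used in the answers
all.equal(prop_short, sum(rivers < 500) / length(rivers))
all.equal(prop_below_mean, sum(rivers < mean(rivers)) / length(rivers))
```

That about two thirds of the rivers fall below the mean suggests the length distribution is skewed to the right by a few very long rivers.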

c. What is the 75th percentile river length? (2)

quantile(rivers, 0.75)
## 75% 
## 680

#The 75th percentile river length is 680 miles.

d. What is the interquartile range in river length? (2)

IQR(rivers)
## [1] 370

#The interquartile range in river length is 370 miles.
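The interquartile range is the distance between the 75th and 25th percentiles, so parts c and d can be tied together. A minimal sketch, again assuming the base R rivers data (q and iqr_manual are illustrative names):

```r
# First and third quartiles of river length (miles)
q <- quantile(rivers, c(0.25, 0.75))

# IQR by definition: third quartile minus first quartile
iqr_manual <- unname(q["75%"] - q["25%"])

# TRUE when the hand computation agrees with IQR()
all.equal(iqr_manual, IQR(rivers))
```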

3 Consider the SSN.txt file on Blackboard. The file contains monthly sunspot numbers since 1851.
a. Read the data into R to create a data frame. (4)

loc = "http://myweb.fsu.edu/jelsner/SSN.txt"
SSN = read.table(loc, header = TRUE)

b. Create a histogram of the September sunspot numbers. (2)

hist(SSN$Sep, main = "September Sunspot Histogram", xlab = "September Sunspots")

[Figure: histogram of September sunspot numbers (chunk histSSNSept)]

c. Create a scatter plot placing the June sunspot numbers on the horizontal axis and September sunspot numbers on the vertical axis. (4)

plot(SSN$Jun, SSN$Sep, main = "September vs June Sunspot Scatterplot", ylab = "September Sunspots", 
    xlab = "June Sunspots")

[Figure: scatter plot of September vs June sunspot numbers (chunk SSNScatterPlot)]

d. Use the grammar of graphics syntax to create a plot showing the September sunspot numbers on the vertical axis and the year on the horizontal axis. (4)

require(ggplot2)
## Loading required package: ggplot2
## Attaching package: 'ggplot2'
## The following object(s) are masked from 'package:UsingR':
## 
## movies
ggplot(SSN, aes(y = Sep, x = Year)) + geom_line(col = "blue") + ylab("September Sunspots") + 
    xlab("Year")

[Figure: line plot of September sunspot numbers by year (chunk GGPlotSSN)]