date()
## [1] "Thu Sep 19 03:31:24 2013"
Due Date: September 19, 2013 Total Points: 30
1 The following ten observations, taken during the years 1970-1979, give the October snow cover for Eurasia in millions of square kilometers. Follow the instructions and answer the questions by typing the appropriate R commands.
Year  Snow
1970   6.5
1971  12.0
1972  14.9
1973  10.0
1974  10.7
1975   7.9
1976  21.9
1977  12.5
1978  14.5
1979   9.2
a. Create a data frame from these data. (2)
snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)
year = 1970:1979
snowcover = data.frame(Year = year, Snow = snow)
b. What are the mean and median snow cover over this decade? (2)
mean(snowcover$Snow)
## [1] 12.01
median(snowcover$Snow)
## [1] 11.35
c. What is the standard deviation of the snow cover over this decade? (2)
sd(snowcover$Snow)
## [1] 4.391
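As a cross-check on the value above, the sample standard deviation can be computed by hand. This is a minimal sketch rather than part of the assigned answer; `n` and `manual_sd` are names introduced here for illustration. It relies on `sd()` using the n - 1 denominator.

```r
# Same ten October snow cover values as in part a.
snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)

n = length(snow)
# Sum of squared deviations from the mean, divided by n - 1, then square-rooted.
manual_sd = sqrt(sum((snow - mean(snow))^2) / (n - 1))

manual_sd  # should agree with sd(snow)
sd(snow)
```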
d. How many Octobers had snow cover greater than 10 million km\( ^2 \)? (2)
sum(snowcover$Snow > 10)
## [1] 6
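The count works because R treats TRUE as 1 and FALSE as 0 when a logical vector is summed. A short sketch of the same idea, which also lists the years involved (the names `above` and `years_above` are introduced here for illustration):

```r
snowcover = data.frame(Year = 1970:1979,
                       Snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2))

# Logical vector: TRUE where October snow cover exceeded 10 million km^2.
above = snowcover$Snow > 10

sum(above)                           # number of TRUE values
years_above = snowcover$Year[above]  # the years those Octobers fall in
years_above
```

Because the comparison is a strict inequality, 1973 (exactly 10.0) is not counted.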
2 The data set rivers contains the lengths (miles) of 141 major rivers in North America.
a. What proportion of the rivers are shorter than 500 miles? (2)
require(UsingR)
## Loading required package: UsingR Loading required package: MASS
sum(rivers < 500)/length(rivers)
## [1] 0.5816
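An equivalent and slightly shorter way to get a proportion is to take the mean of the logical vector, since the mean of 0/1 values is the fraction of ones. A minimal sketch, assuming only the `rivers` data set that ships with R's built-in datasets package (the names `prop_sum` and `prop_mean` are illustrative):

```r
# Proportion as count divided by total, as in the answer above.
prop_sum = sum(rivers < 500) / length(rivers)

# Same proportion as the mean of the TRUE/FALSE values.
prop_mean = mean(rivers < 500)

all.equal(prop_sum, prop_mean)
```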
b. What proportion of the rivers are shorter than the mean length? (2)
sum(rivers < mean(rivers))/length(rivers)
## [1] 0.6667
c. What is the 75th percentile river length? (2)
quantile(rivers, probs = 0.75)
## 75%
## 680
d. What is the interquartile range in river length? (2)
IQR(rivers)
## [1] 370
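The interquartile range is the distance between the 75th and 25th percentiles, so it can be cross-checked against `quantile()`. A small sketch under that definition (`q` and `iqr_manual` are names introduced here):

```r
# 25th and 75th percentiles of river length, using quantile()'s default method.
q = quantile(rivers, probs = c(0.25, 0.75))

# Upper quartile minus lower quartile; unname() drops the "75%" label.
iqr_manual = unname(q[2] - q[1])

iqr_manual
IQR(rivers)
```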
3 Consider the SSN.txt file from http://myweb.fsu.edu/jelsner/data/SSN.txt. The file contains monthly sunspot numbers since 1851.
a. Import the data into R. (4)
loc = "http://myweb.fsu.edu/jelsner/data/SSN.txt"
SSN = read.table(loc, header = TRUE)
b. Create a histogram of the September sunspot numbers. (2)
require(ggplot2)
## Loading required package: ggplot2
##
## Attaching package: 'ggplot2'
##
## The following object is masked from 'package:UsingR':
##
## movies
ggplot(SSN, aes(Sep)) + geom_histogram(fill = "green") + xlab("September Sunspots") +
    ylab("Frequency")
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust
## this.
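The message above appears because no bin width was supplied. One way to avoid it is to pass `binwidth` explicitly. The sketch below is an illustration under stated assumptions: so that it runs without a network connection, it uses R's built-in `sunspot.month` series as a stand-in for SSN.txt, and the bin width of 10 is an arbitrary choice, not a value given in the assignment. The names `sep_spots`, `sep_df`, and `p` are introduced here.

```r
library(ggplot2)

# Stand-in data (assumption): September values from the built-in monthly
# sunspot series, not the SSN.txt file used in the assignment.
sep_spots = as.numeric(sunspot.month[cycle(sunspot.month) == 9])
sep_df = data.frame(Sep = sep_spots)

# binwidth = 10 is an arbitrary illustrative choice.
p = ggplot(sep_df, aes(Sep)) +
  geom_histogram(binwidth = 10, fill = "green") +
  xlab("September Sunspots") +
  ylab("Frequency")
print(p)
```

With the assignment's data, the same `geom_histogram(binwidth = ...)` call would be applied to `SSN` in place of `sep_df`.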
c. Create a boxplot of the June sunspot numbers. Label the axis. (4)
boxplot(SSN$Jun, ylab = "June Sunspots")
d. Create a scatter plot placing the June sunspot numbers on the horizontal axis and September sunspot numbers on the vertical axis. Label the axes. (4)
ggplot(SSN, aes(x = Jun, y = Sep)) + geom_point() + xlab("June Sunspots") +
    ylab("September Sunspots")