Problem Set # 2

Sean P Nickerson

Sun Sep 15 13:15:43 2013

Due Date: September 19, 2013

Total Points: 30

1 The following ten observations, taken during the years 1970-1979, are on October snow cover for Eurasia in units of millions of square kilometers. Follow the instructions and answer questions by typing the appropriate R commands.

Year  Snow
1970   6.5
1971  12.0
1972  14.9
1973  10.0
1974  10.7
1975   7.9
1976  21.9
1977  12.5
1978  14.5
1979   9.2

a. Create a data frame from these data. (2)

  # Build the data frame directly with named columns
  dfSnow <- data.frame(Year = 1970:1979,
                       Snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2))

b. What are the mean and median snow cover over this decade? (2)

  mean(dfSnow$Snow)
  ## [1] 12.01
  median(dfSnow$Snow)
  ## [1] 11.35

c. What is the standard deviation of the snow cover over this decade? (2)

  sd(dfSnow$Snow)
  ## [1] 4.391
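
As a check on the value above, the sample standard deviation can be recomputed from its definition: the square root of the sum of squared deviations from the mean, divided by n - 1. This is a minimal sketch; the names `snow`, `n`, and `sd_manual` are illustrative and not part of the original answer.

```r
# Snow cover values from the problem statement (millions of square kilometers)
snow <- c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2)

n <- length(snow)

# Sample standard deviation: divide by n - 1, not n
sd_manual <- sqrt(sum((snow - mean(snow))^2) / (n - 1))

sd_manual
all.equal(sd_manual, sd(snow))  # compares the manual value with sd()
```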

d. How many Octobers had snow cover greater than 10 million km\( ^2 \)? (2)

  sum(dfSnow$Snow > 10)  # each TRUE counts as 1
  ## [1] 6
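
To see which Octobers make up that count, the same logical condition can be used to subset the rows of the data frame. A small sketch, assuming the data from part a; the names `snowDF` and `above10` are illustrative only.

```r
# Rebuild the data from the problem statement so the example is self-contained
snowDF <- data.frame(Year = 1970:1979,
                     Snow = c(6.5, 12, 14.9, 10, 10.7, 7.9, 21.9, 12.5, 14.5, 9.2))

# Keep only rows where snow cover strictly exceeds 10 (1973, at exactly 10.0, is excluded)
above10 <- snowDF[snowDF$Snow > 10, ]

above10        # the qualifying years and their snow cover
nrow(above10)  # number of qualifying Octobers
```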

2 The data set rivers contains the lengths (miles) of 141 major rivers in North America.

a. What proportion of the rivers are shorter than 500 miles long? (2)

  tRivers <- length(rivers)  # Total number of rivers
  sRivers <- length(which(rivers < 500))  # Number of rivers shorter than 500 miles
  (sRivers/tRivers) * 100  # Percentage of rivers shorter than 500 miles
  ## [1] 58.16
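
The question asks for a proportion, while the value above is a percentage. One compact way to get the proportion directly is to take the mean of the logical vector, since `TRUE` is treated as 1 and `FALSE` as 0. A sketch; `prop_short` is an illustrative name.

```r
# rivers ships with base R (datasets package)
prop_short <- mean(rivers < 500)  # fraction of rivers shorter than 500 miles

prop_short        # proportion, between 0 and 1
prop_short * 100  # the same quantity expressed as a percentage
```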

b. What proportion of the rivers are shorter than the mean length? (2)

  tRivers <- length(rivers)  # Total number of rivers
  smRivers <- length(which(rivers < mean(rivers)))  # Number of rivers shorter than the mean length
  (smRivers/tRivers) * 100  # Percentage of rivers shorter than the mean length
  ## [1] 66.67

c. What is the 75th percentile river length? (2)

  quantile(rivers, probs = c(0.75))
  ## 75% 
  ## 680

d. What is the interquartile range in river length? (2)

  IQR(rivers)
  ## [1] 370
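
The interquartile range is the difference between the 75th and 25th percentiles, so it can be cross-checked against `quantile()`. A minimal sketch using the default quantile type; `q` and `iqr_manual` are illustrative names.

```r
# 25th and 75th percentiles of river length (miles)
q <- quantile(rivers, probs = c(0.25, 0.75))

# Difference between the upper and lower quartiles, with the name dropped
iqr_manual <- unname(q[2] - q[1])

iqr_manual
iqr_manual == IQR(rivers)  # agreement with the built-in function
```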

3 Consider the SSN.txt file from http://myweb.fsu.edu/jelsner/data/SSN.txt. The file contains monthly sunspot numbers since 1851.

a. Import the data into R. (4)

  loc <- "http://myweb.fsu.edu/jelsner/data/SSN.txt"
  ss <- read.table(loc, header = TRUE)

b. Create a histogram of the September sunspot numbers. (2)

  library(ggplot2)
  # Refer to columns by name inside aes(); ggplot looks them up in ss
  ggplot(ss, aes(x = Sep)) + geom_histogram(binwidth = 5) + xlab("September Sunspot Activity")

[Figure: histogram of September sunspot numbers]

c. Create a boxplot of the June sunspot numbers. Label the axis. (4)

  boxplot(ss$Jun, ylab = "June Sunspot Activity")

[Figure: boxplot of June sunspot numbers]

d. Create a scatter plot placing the June sunspot numbers on the horizontal axis and September sunspot numbers on the vertical axis. Label the axes. (4)

  ggplot(ss, aes(x = Jun, y = Sep)) + geom_point() + xlab("June Sunspot Activity") + 
      ylab("September Sunspot Activity")

[Figure: scatter plot of September versus June sunspot numbers]