I am playing with the knitcitations package written by Boettiger (2014).
Consider the data frame PAMTEMP from the PASWR2 package (Arnholt 2014), which contains temperature and precipitation for Pamplona, Spain from January 1, 1990 to December 31, 2010.
Create side-by-side violin plots of tmean for each month. Make sure the order of the levels of month is correct. Hint: look at the examples for PAMTEMP. Characterize the pattern of side-by-side violin plots.
library(PASWR2)
levels(PAMTEMP$month)
[1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
PAMTEMP$month <- factor(PAMTEMP$month, levels = month.abb[1:12])
levels(PAMTEMP$month)
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
ggplot(data = PAMTEMP) +
  geom_violin(aes(x = month, y = tmean, fill = month)) +
  theme_bw() +
  guides(fill = "none") +
  labs(x = "", y = "Temperature (Celsius)")
The center of each violin plot as one moves from January to July generally increases. As one moves from August to December the center of each violin plot decreases. There is a cyclical pattern of warming up and then cooling down as one goes through the year.
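The alphabetical ordering returned by the first call to levels() is R's default when a character vector becomes a factor. A minimal sketch of the reordering step follows, using a small made-up vector (the vector m is hypothetical and not taken from PAMTEMP):

```r
# Hypothetical month labels, not taken from PAMTEMP
m <- c("Mar", "Jan", "Feb", "Jan", "Dec")

# Default: levels are sorted alphabetically
f_default <- factor(m)
levels(f_default)

# Calendar order: supply the levels explicitly with the built-in month.abb
f_calendar <- factor(m, levels = month.abb)
levels(f_calendar)

# The data values are unchanged; only the order of the levels differs
identical(as.character(f_default), as.character(f_calendar))
```

Because ggplot2 lays out a discrete axis in level order, fixing the levels is what places the violins in calendar order.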
Create side-by-side violin plots of tmean for each year.
ggplot(data = PAMTEMP) +
  geom_violin(aes(x = as.factor(year), y = tmean, fill = as.factor(year))) +
  theme_bw() +
  guides(fill = "none") +
  labs(x = "", y = "Temperature (Celsius)") +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
There is no apparent pattern in the side-by-side violin plots of tmean. Temperature variation in Pamplona, Spain appears similar from year to year over the period 1990 to 2010.
Find the date of the minimum value of tmean.
PAMTEMP[which.min(PAMTEMP$tmean), ]
     tmax tmin precip day month year tmean
4285    2  -10    0.5  25   Dec 2001    -4
The minimum value of tmean is -4 \(^{\circ}\) C, which occurred on Dec 25, 2001.
Find the date of the maximum value of tmean.
PAMTEMP[which.max(PAMTEMP$tmean), ]
     tmax tmin precip day month year tmean
4873   39   23      0   5   Aug 2003    31
The maximum value of tmean is 31 \(^{\circ}\) C, which occurred on Aug 5, 2003.
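Note that which.min() and which.max() return only the first position at which the extreme occurs, so a tie on another date would go unreported. A minimal sketch of a tie check follows, using a hypothetical data frame d that is not taken from PAMTEMP:

```r
# Hypothetical data, not taken from PAMTEMP
d <- data.frame(day = 1:5, tmean = c(3, -4, 10, -4, 7))

# which.min() reports only the first position of the minimum
which.min(d$tmean)

# Comparing against min() returns every row that attains the minimum
ties <- d[which(d$tmean == min(d$tmean, na.rm = TRUE)), ]
ties
nrow(ties)
```

The same comparison could be applied to PAMTEMP$tmean to confirm that the reported dates are unique.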
How many days reported a tmax value greater than 38 \(^{\circ}\) C?
sum(PAMTEMP$tmax > 38)
[1] 15
15 days reported a tmax value greater than 38 \(^{\circ}\) C.
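Counting with sum() works because a logical vector is coerced to numbers, with TRUE counted as 1 and FALSE as 0. A minimal sketch on made-up values (the vector tmax below is hypothetical and not taken from PAMTEMP) also shows how a missing value would be handled:

```r
# Hypothetical daily maxima, not taken from PAMTEMP
tmax <- c(36.5, 38.0, 39.2, 40.1, NA, 38.4)

over <- tmax > 38         # logical vector; the NA stays NA
sum(over, na.rm = TRUE)   # number of days strictly above 38
mean(over, na.rm = TRUE)  # proportion of non-missing days above 38
```

The comparison is strict, so a day with a maximum of exactly 38 is not counted.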
Find the date of the maximum value of precip.
PAMTEMP[which.max(PAMTEMP$precip), ]
     tmax tmin precip day month year tmean
1455  8.6    4   69.2  25   Dec 1993   6.3
The maximum value of precip is 69.2 mm, which occurred on Dec 25, 1993.
Create a barplot showing the total precipitation by month for the period January 1, 1990 to December 31, 2010. Based on your barplot, which month has had the least amount of precipitation? Which month has had the greatest amount of precipitation? Hint: use the plyr package (Wickham 2011) to create an appropriate data frame.
library(plyr)
SEL <- ddply(PAMTEMP, .(year, month), summarize, TP = sum(precip))
head(SEL)
  year month       TP
1 1990   Jan  31.1008
2 1990   Feb  26.8005
3 1990   Mar   9.3001
4 1990   Apr 121.1001
5 1990   May 120.5002
6 1990   Jun  77.0006
ggplot(data = SEL, aes(x = month, y = TP, fill = month)) +
  geom_bar(stat = "identity") +
  labs(y = "Total Precipitation (1990-2010) in mm", x = "") +
  theme_bw() +
  guides(fill = "none")
For the period 1990-2010, August has the smallest total precipitation of all the months and November has the largest.
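Because SEL holds one row per year and month, geom_bar(stat = "identity") stacks those rows, and the height of each bar is the sum over all years for that month. The smallest and largest totals can also be read off numerically instead of from the plot. A minimal sketch in base R follows, using a hypothetical data frame toy with the same column layout as SEL (the values are made up):

```r
# Hypothetical monthly precipitation totals (mm), not taken from PAMTEMP
toy <- data.frame(
  year  = rep(c(2001, 2002), each = 3),
  month = factor(rep(c("Jan", "Feb", "Mar"), times = 2),
                 levels = c("Jan", "Feb", "Mar")),
  TP    = c(30, 12, 45, 20, 8, 50)
)

# Sum over years within each month; this matches the stacked bar heights
totals <- tapply(toy$TP, toy$month, sum)
totals

names(which.min(totals))  # month with the smallest total
names(which.max(totals))  # month with the largest total
```

Replacing toy with SEL would give the monthly totals behind the barplot, assuming TP contains no missing values.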
Create a barplot showing the total precipitation by year for the period January 1, 1990 to December 31, 2010. Based on your barplot, which year has had the least amount of precipitation? Which year has had the greatest amount of precipitation? Hint: use the plyr package to create an appropriate data frame.
SELY <- ddply(PAMTEMP, .(year), summarize, TP = sum(precip))
head(SELY)
  year       TP
1 1990 692.5048
2 1991 704.0052
3 1992 902.8038
4 1993 752.1041
5 1994 638.8045
6 1995 582.3028
ggplot(data = SELY, aes(x = year, y = TP, fill = as.factor(year))) +
  geom_bar(stat = "identity") +
  labs(x = "", y = "Total Precipitation (mm)") +
  theme_bw() +
  guides(fill = "none")
SELY[which.max(SELY$TP), ]
  year       TP
8 1997 929.2025
SELY[which.min(SELY$TP), ]
  year       TP
9 1998 566.2011
The greatest yearly total precipitation on record (929.2025 mm) occurred in 1997. The least yearly total precipitation on record (566.2011 mm) occurred in 1998.
Create a graph showing the maximum temperature versus year and the minimum temperature versus year. Does the graph suggest temperatures are becoming more extreme over time?
SEL <- ddply(PAMTEMP, .(year), summarize, Tmax = max(tmax), Tmin = min(tmin))
head(SEL)
  year Tmax Tmin
1 1990 37.0 -5.6
2 1991 38.2 -5.2
3 1992 36.4 -4.4
4 1993 37.6 -4.5
5 1994 37.2 -6.8
6 1995 40.0 -5.0
ggplot(data = SEL, aes(x = year, y = Tmax)) +
  geom_line(color = "red") +
  geom_line(aes(x = year, y = Tmin), color = "blue") +
  theme_bw() +
  labs(y = "Temperature (Celsius)") +
  geom_smooth(method = "lm", color = "red") +
  geom_smooth(aes(x = year, y = Tmin), method = "lm")
Based on the graph, there is too much year-to-year variability to make any statement about temperatures becoming more extreme over time.
Arnholt, Alan T. 2014. PASWR2: Probability and Statistics with R, Second Edition.
Boettiger, Carl. 2014. knitcitations: Citations for Knitr Markdown Files. http://CRAN.R-project.org/package=knitcitations.
Wickham, Hadley. 2011. “The Split-Apply-Combine Strategy for Data Analysis.” Journal of Statistical Software 40 (1): 1–29. http://www.jstatsoft.org/v40/i01/.