Part 1

Use ggplot() to produce a histogram of salinity values

temp = read.csv("Temperature.csv")
library(ggplot2)
library(ggthemes)
myplot = ggplot(temp, aes(x = Salinity))
myplot = myplot + ggtitle("Salinity Values")
myplot + geom_histogram(binwidth = 1) + theme_economist()
## Warning: Removed 798 rows containing non-finite values (stat_bin).
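
The warning above reports rows that geom_histogram() drops because Salinity is missing or otherwise non-finite. A minimal sketch of how one might check that count before plotting, using a small made-up data frame (toy and its values are hypothetical and only stand in for temp):

```r
# Hypothetical stand-in for `temp`; the real data come from Temperature.csv.
toy <- data.frame(Salinity = c(30.1, NA, 28.4, NA, 31.0))

# stat_bin() drops rows whose x value is NA, NaN, or infinite.
n_dropped <- sum(!is.finite(toy$Salinity))
n_dropped

# Keep only the rows that would actually be binned.
toy_complete <- toy[is.finite(toy$Salinity), , drop = FALSE]
nrow(toy_complete)
```

On the real data, sum(!is.finite(temp$Salinity)) should match the count in the warning. Passing na.rm = TRUE to geom_histogram() drops the same rows without the warning.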

Make a histogram of salinity values for each year of study, and then for each month

myplot = ggplot(temp, aes(x = Salinity))
myplot + facet_wrap(~ Year) + geom_histogram(binwidth = 1, fill = 'lightblue') + ggtitle("Salinity Values Per Year")
## Warning: Removed 798 rows containing non-finite values (stat_bin).

myplot = ggplot(temp, aes(x = Salinity))
myplot + facet_wrap(~ Month) + geom_histogram(binwidth = 1, fill = 'coral') + ggtitle("Salinity Values Per Month")
## Warning: Removed 798 rows containing non-finite values (stat_bin).
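
The bin width strongly affects how these histograms read, and there is no single right value. One common rule of thumb (not something the assignment prescribes) is the Freedman-Diaconis rule, 2 * IQR / n^(1/3). A sketch with made-up salinity values; fd_binwidth is a hypothetical helper name:

```r
# Hypothetical helper: Freedman-Diaconis bin width, ignoring missing values.
fd_binwidth <- function(x) {
  x <- x[is.finite(x)]
  2 * IQR(x) / length(x)^(1 / 3)
}

# Made-up values for illustration only.
salinity <- c(26, 27, 28, 29, 30, 31, 32, 33, NA)
bw <- fd_binwidth(salinity)
bw
```

The result could then be passed to the plot as geom_histogram(binwidth = fd_binwidth(temp$Salinity)).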

Make a boxplot of temperature values for each station

myplot = ggplot(temp, aes(x = factor(Station), y = Temperature))
myplot = myplot + geom_boxplot(fill = 'lightgreen') + xlab("Station") + ggtitle("Temperature values for each Station")
myplot
## Warning: Removed 927 rows containing non-finite values (stat_boxplot).

ggsave("temp_vs_station.png", myplot, width = 6, height = 4)
## Warning: Removed 927 rows containing non-finite values (stat_boxplot).

Bonus

Part 2

Make some time series plots of temperature and salinity. First create a decimal-date variable, decdate, from Year and dDay3. Using the new variable, make a scatterplot of temperature and salinity over time.

temp = read.csv("Temperature.csv")
temp$decdate <- temp$Year + temp$dDay3 / 365
myplot = ggplot(temp, aes(x = Temperature, y = Salinity, color = decdate))
myplot + geom_point()
## Warning: Removed 963 rows containing missing values (geom_point).
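
The task can also be read as asking for time on the x-axis. A minimal sketch, assuming dDay3 is the day of the year and using a small made-up data frame in place of temp (the values and the name long are illustrative only): the two measurements are stacked into long format and plotted against decdate in separate panels.

```r
library(ggplot2)

# Made-up rows with the same column names as Temperature.csv.
toy <- data.frame(
  Year        = c(1990, 1990, 1991, 1991),
  dDay3       = c(10, 200, 15, 220),
  Temperature = c(4.1, 18.3, 3.7, 19.0),
  Salinity    = c(29.5, 31.2, NA, 30.8)
)
toy$decdate <- toy$Year + toy$dDay3 / 365

# Stack the two measurements so each row holds one value and its label.
long <- rbind(
  data.frame(decdate = toy$decdate, variable = "Temperature", value = toy$Temperature),
  data.frame(decdate = toy$decdate, variable = "Salinity", value = toy$Salinity)
)

# One panel per variable, each with its own y scale, time on the x-axis.
p <- ggplot(long, aes(x = decdate, y = value)) +
  geom_point(na.rm = TRUE) +
  facet_wrap(~ variable, ncol = 1, scales = "free_y") +
  xlab("Decimal year")
p
```

Free y scales are used here because temperature and salinity are measured in different units, so a shared axis would flatten one of the two series.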

Make a scatterplot of salinity, grouped using facet_wrap() into different ‘Areas’

temp = read.csv("Temperature.csv")
temp$decdate <- temp$Year + temp$dDay3 / 365
myplot = ggplot(temp, aes(x = decdate, y = Salinity))
myplot + facet_wrap(~ Area) + geom_point(color = 'lightpink')
## Warning: Removed 798 rows containing missing values (geom_point).

Make a lineplot of salinity, grouped using facet_wrap() into different ‘Areas’

temp = read.csv("Temperature.csv")
temp$decdate <- temp$Year + temp$dDay3 / 365
myplot = ggplot(temp, aes(x = decdate, y = Salinity))
myplot + facet_wrap(~ Area) + geom_line(aes(group = Station, color = Station), na.rm = TRUE) + ggtitle("Salinity values for each Area")