In this Exercise you will
Load water quality data
Plot Q-C relationships
Use power functions to quantify the Q-C relationships
Plot discharge data on the plot to determine hydrological conditions at time of sampling
The water quality data have been downloaded and formatted for
you.
Water quality data can be downloaded from:
** You don’t have to download data from here for this exercise. It has been downloaded for you and posted in the url below. These websites are given for your information and to cite the data source **
The CEDEN website (California): http://ceden.waterboards.ca.gov/AdvancedQueryTool
Watershed urban management plans (San Diego County only): http://www.projectcleanwater.org/index.php Go to “Stormwater copermiettees > Work Products” or to: “Watersheds > Your Watershed > Data, Plans and Projects”
Here is the URL for a link to the data, which is in a .csv file:
Read this csv and put it in a data.frame in R. I called the data.frame “x”.
Hint: Look back to previous exercises to find the commands to use to read a csv file using a URL.
Format the “Date” column as a date and give the formatted date field the name “dates”. I’m not showing the code…see previous exercises.
Plot C vs Q for Electrical conductivity (EC), Total suspended solids (TSS), turbidity, and Total.coliforms I’m not showing the code to make this plot…
I’m not showing the code to make this plot… Hint: use the plot option log=“xy” as described under Key R concepts for Exercise 7.
x$EC.log = log(x$EC.umhos.cm)
x$Q.log = log(x$Q.ft3.s)
lm.EC = lm(EC.log ~ Q.log,x) # Type ?lm to see format of input
summary(lm.EC)
##
## Call:
## lm(formula = EC.log ~ Q.log, data = x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.3417 -0.1645 -0.1325 0.2430 0.4397
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.42908 0.30398 27.729 3.09e-09 ***
## Q.log -0.20220 0.07043 -2.871 0.0208 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2906 on 8 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.5075, Adjusted R-squared: 0.4459
## F-statistic: 8.243 on 1 and 8 DF, p-value: 0.0208
# Generate an array of Q with all values between 5 and 400 in order to plot the lm
Qnew = seq(5,400,by=10)
Qnew.log = data.frame(Q.log=log(Qnew)) # Create a new dataframe that has a variable "Q.log" in it
EC.pred.log = predict.lm(lm.EC,Qnew.log)
# Use exp to turn the log-EC values into non-log values:
EC.pred = exp(EC.pred.log)
Do the same thing for Turbidity, TSS and Total Coliforms. I’m not showing the code.
x$turb.log = log(x$Turbidity.NTU)
lm.turb = lm(turb.log ~ Q.log,x)
summary(lm.turb)
##
## Call:
## lm(formula = turb.log ~ Q.log, data = x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.83459 -0.31994 -0.00517 0.42493 0.65739
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.5015 0.5445 0.921 0.381064
## Q.log 0.5996 0.1227 4.887 0.000863 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.532 on 9 degrees of freedom
## Multiple R-squared: 0.7263, Adjusted R-squared: 0.6959
## F-statistic: 23.88 on 1 and 9 DF, p-value: 0.000863
Turbidity.pred = exp(predict.lm(lm.turb,Qnew.log))
TSS
##
## Call:
## lm(formula = TSS.log ~ Q.log, data = x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.01415 -0.34337 -0.02042 0.32490 0.91866
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.5106 0.5902 0.865 0.409366
## Q.log 0.6667 0.1330 5.014 0.000725 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5765 on 9 degrees of freedom
## Multiple R-squared: 0.7363, Adjusted R-squared: 0.7071
## F-statistic: 25.14 on 1 and 9 DF, p-value: 0.0007254
Total Coliforms
##
## Call:
## lm(formula = TC.log ~ Q.log, data = x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0443 -0.9515 -0.4254 0.9244 3.9137
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.5980 2.3146 2.851 0.0191 *
## Q.log 0.8407 0.5216 1.612 0.1414
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.261 on 9 degrees of freedom
## Multiple R-squared: 0.224, Adjusted R-squared: 0.1378
## F-statistic: 2.598 on 1 and 9 DF, p-value: 0.1414
I’m not showing the code for this… Hint: to add the regression line, use “lines”, with Qnew as the x-variable and the .pred series as the y-variables.
Here’s a reminder about how to add lines to a plot:
x.example = c(1,2,3,4,5)
y.example = c(2,3,1,6,7)
y2.example = c(4,1,2,8,9)
plot(x.example,y.example)
lines(x.example,y2.example)
Site code for San Diego River at Fashion Valley is 11023000. I’m not showing the code for this…look back at previous exercises. Put the discharge data into a data.frame called “xQ”. Rename the discharge column “Q.ft.s”.
## agency_cd site_no Date X_00060_00003 X_00060_00003_cd
## 1 USGS 11023000 1982-01-18 7.6 A
## 2 USGS 11023000 1982-01-19 8.8 A
## 3 USGS 11023000 1982-01-20 151.0 A
## 4 USGS 11023000 1982-01-21 574.0 A
## 5 USGS 11023000 1982-01-22 215.0 A
## 6 USGS 11023000 1982-01-23 70.0 A
## agency_cd site_no Date Q.ft.s QA
## 1 USGS 11023000 1982-01-18 7.6 A
## 2 USGS 11023000 1982-01-19 8.8 A
## 3 USGS 11023000 1982-01-20 151.0 A
## 4 USGS 11023000 1982-01-21 574.0 A
## 5 USGS 11023000 1982-01-22 215.0 A
## 6 USGS 11023000 1982-01-23 70.0 A
par(mfrow=c(1,1))
plot(as.Date(xQ$Date),xQ$Q.ft.s,type="l",xlim=as.Date(c("2006-10-01","2007-04-01")),ylim=c(0,500))
points(dates,x$Q.ft3.s)