Before starting, open the Tools menu, choose ‘Global Options’, then in the ‘R Markdown’ section, uncheck the box labeled “Show output inline for all R Markdown documents”. This tells RStudio to draw the plot in the viewer sub-window rather than inside the document.
Get your data for the standard curve into R using a Google Sheets
spreadsheet or any other method that you prefer. You’ll need to remember
how to save your data from Google as a .csv file. You can use the
read.csv() function to get it into R.
Your data sheet should have 1 row of headers with any data below those headers. To make your life easy, format your headers in a way that R can read them easily – do not use spaces or other weird characters (computer languages usually interpret spaces as a break between words). Dots (.) and underscores (_) are fine, but parentheses are not. Therefore, don’t try to incorporate units into your headings. Headings such as “Wavelength (nm)” will give you problems!
Make sure your data columns that should contain numeric values only contain numeric values throughout the entire column (except for the header row). In particular, don’t include the units alongside the data. Any ‘characters’ will cause R to interpret the data as text, rather than numbers, and you can’t add text together!
Note that you ONLY need the data for the standard curve. Don’t include any of the milk data in your Google spreadsheet.
Load your data into a variable called dat and inspect it with the str() function. Note what R calls your variables; you’ll
need this information for plotting the curve below.
knitr::opts_knit$set(global.device = TRUE)
dat <- read.csv("https://docs.google.com/spreadsheets/d/e/2PACX-1vTz088FDDnt95wcv3Hyz22bIqbU2x5BJWvHrGAeHVIbKNxY4HOKZZPpohjW7YJD3p9Jle9HpBGEWfyg/pub?gid=0&single=true&output=csv")
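To see why the header and cell formatting matter, here is a minimal sketch that reads two tiny, made-up data sheets typed directly into the code. The names good.csv, bad.csv and all of the values are hypothetical, not real measurements. The first sheet follows the advice above; the second puts units in the header and in the cells.

```r
# Hypothetical data sheet that follows the formatting advice above.
good.csv <- "protein.conc,absorbance
0.0,0.55
0.5,0.70
1.0,0.84"
good.dat <- read.csv(text = good.csv)
str(good.dat)    # both columns are read as numbers

# Hypothetical data sheet with units in the header and in the cells.
bad.csv <- "Protein (mg/ml),absorbance
0.5 mg,0.70
1.0 mg,0.84"
bad.dat <- read.csv(text = bad.csv)
names(bad.dat)             # R has replaced the space, parentheses and slash with dots
is.numeric(bad.dat[[1]])   # FALSE: the units turned the whole column into text
```

Running str() on your own dat is the quickest way to spot either problem before you try to plot.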
Assuming you have your data in a variable called dat
which has 2 series named “x.data.series” and “y.data.series” (choose
something better for your own data!), you can plot a graph with the
plot() command. You’ll need to modify it for your data, and
to adjust the labels to something useful. MAKE SURE TO MODIFY
“x.data.series” ETC. TO REFLECT THE NAMES AT THE HEAD OF YOUR DATA
TABLE!!!
This is a good chance to learn about providing arguments to
an R function. Type ?plot in the console (bottom left pane)
to view the help file. Notice it says that the usage is
plot(x, y, ...) and then it gives information about what
the x, y, and other arguments are. There are 2
ways to supply arguments in R. In the best way, you tell R explicitly
that x=dat$variable1 and y=dat$variable2 (see
the code chunk below). If you want to use a shortcut, you can just say
plot(dat$variable1, dat$variable2) and R will match the values by
position: the first value goes to the first argument listed in the help
file (x), and the second goes to the second argument (y). However, the first method is
generally better as it is very explicit.
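The difference between the two styles can be seen with a small made-up function. describe.point is hypothetical and exists only for this illustration; plot() matches its x and y arguments in the same way.

```r
# Hypothetical function with two arguments, x and y, like plot(x, y, ...).
describe.point <- function(x, y) {
  paste0("x=", x, ", y=", y)
}

# Named arguments: the order you type them in does not matter.
named.a <- describe.point(x = 1, y = 2)
named.b <- describe.point(y = 2, x = 1)

# Positional arguments: the first value becomes x, the second becomes y.
positional <- describe.point(1, 2)
swapped    <- describe.point(2, 1)   # easy to get backwards by accident

print(c(named.a, named.b, positional, swapped))
```

The two named calls and the first positional call give the same answer; only the swapped call differs, which is the kind of mistake that naming your arguments prevents.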
Type ?plot to see the help file for the plot() function.
Modify the plot() command below to accept your data series. Note that data series are referred to by the $ operator.
In the xlim and ylim arguments, change the second number (initially set to 1) to accommodate the highest values from the spectrophotometer (ylim) or from the graph you drew by hand (for the x-axis; xlim).
Change eval=FALSE to eval=TRUE once your code runs. This is necessary to knit properly.
# The following command is written on multiple lines.
# R will interpret it as one command because the command
# doesn't end until the parentheses are closed.
# It is useful to break up functions with many arguments
# like this so that they are easy to read.
plot(x=dat$protein.sample,
y=dat$Absorbance.at.595,
main="Protein standard curve",
xlab="Protein concentration (mg/ml)",
ylab="Absorbance at 595 nm",
ylim=c(0,1.2),
xlim=c(0,2.5))
recordedPlot <- recordPlot() # Don't touch this line
Calculate a linear regression of your data using a “linear model”
(lm()). The tilde (~) marks an R formula and
basically reads “is explained by”. Therefore, the following command
means “conduct a linear regression in which ‘y.data.series’ is explained
by ‘x.data.series’, and save the result in a variable called
‘lin.regression’”. Be careful to note which way around this
formula is: y ~ x, not x ~ y.
Change eval=FALSE to eval=TRUE.
lin.regression <- lm(dat$Absorbance.at.595 ~ dat$protein.sample)
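To see why the direction of the formula matters, here is a sketch with made-up numbers that lie exactly on the line y = 0.25x + 0.5. The data frame toy and its values are hypothetical. The sketch also uses the data argument of lm(), which is an alternative to writing dat$ in front of each name.

```r
# Made-up data lying exactly on the line y = 0.25 * x + 0.5.
toy <- data.frame(x = c(0, 0.5, 1, 1.5, 2))
toy$y <- 0.25 * toy$x + 0.5

# y ~ x: "y is explained by x". Recovers intercept 0.5 and slope 0.25.
fit.yx <- lm(y ~ x, data = toy)
coef(fit.yx)

# x ~ y: the reverse regression. The coefficients are different
# (intercept -2, slope 4), so this is not the standard curve you want.
fit.xy <- lm(x ~ y, data = toy)
coef(fit.xy)
```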
Examine the results of your regression. Remember from math class that
a straight line has a formula of \[Y = mX + b\] where ‘m’ is the slope and ‘b’ is the intercept (this is how
I learned the equation – you might have used different letters – it’s a
regional thing!). You can see all the details with the
summary() function.
##
## Call:
## lm(formula = professors_data$y ~ professors_data$x)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.9800 -0.6410  0.2338  0.2678  1.5452
##
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)
## (Intercept)       -0.16882    0.55270  -0.305 0.767817
## professors_data$x  0.55473    0.08908   6.228 0.000252 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8091 on 8 degrees of freedom
## Multiple R-squared: 0.829, Adjusted R-squared: 0.8076
## F-statistic: 38.78 on 1 and 8 DF, p-value: 0.0002517
This seems like a lot of output, but there are only a few things we need to worry about.
You can now use the estimates from the regression to write down the equation for a regression line through your own data. The ‘(Intercept)’ estimate is b and the estimate for the x variable is m. In the example above, it would be \[Y = mX + b\] \[Y = 0.55473 X - 0.16882\]
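If you would rather not copy the numbers from the summary() output by hand, one option is to pull them out with coef(). This is a sketch on made-up data: toy.conc, toy.abs and toy.fit are hypothetical names and values. With your own data you would use your regression result in place of toy.fit.

```r
# Made-up standard-curve data, for illustration only.
toy.conc <- c(0, 0.5, 1, 1.5, 2)
toy.abs  <- c(0.56, 0.70, 0.83, 0.98, 1.11)
toy.fit  <- lm(toy.abs ~ toy.conc)

# coef() returns the intercept first, then the slope.
b <- coef(toy.fit)[[1]]   # intercept
m <- coef(toy.fit)[[2]]   # slope

# Assemble the equation in the form Y = mX + b.
# The %+.5f format always prints the sign, so a negative
# intercept shows up as a subtraction.
regression.equation <- sprintf("Y = %.5f X %+.5f", m, b)
print(regression.equation)
```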
For your own data:
Change eval=FALSE to eval=TRUE.
summary(lin.regression)
# Type your regression equation with values here
# y=0.27412x + 0.56260
#
#
Use the formula for a straight line that you wrote above to calculate the protein concentration of your milk samples. In this case, the Y will be the absorbance and the X will be the (unknown) protein concentration. You will therefore need to solve for X.
When you have done that, enter the data below by hand. Each of the 4 variables below should look something like the following, with the values separated by commas:
skim.abs <- c(0.123, 0.133, 0.142, 0.121, 0.125)
skim.conc <- c(1.234, 1.235, 1.324, 1.442, 1.421)
where the 0.123 (skim.abs) is the absorbance (Y value)
you measured and the 1.234 (skim.conc) is the corresponding
(X) value that you calculated using the equation for the straight line
(the concentration of protein in skim milk). Make sure that you keep
them in order so that the first value for skim.abs
corresponds to the first value of skim.conc.
Change eval=FALSE to eval=TRUE.
skim.abs <- c(1.087, 1.065, 1.204, 1.097, 1.116)
skim.conc <- c(1.913, 1.832, 2.339, 1.949, 2.018)
whole.abs <- c(1.020, 1.072, 0.919, 1.049, 0.987)
whole.conc <- c(1.668, 1.858, 1.300, 1.774, 1.548)
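Solving Y = mX + b for X gives X = (Y - b) / m. Because R works on whole vectors at once, all five absorbances can be converted in one step. The sketch below uses the slope and intercept from the example equation typed in the earlier chunk (0.27412 and 0.56260). The names m, b and skim.conc.check are chosen here for illustration, and you would substitute your own values.

```r
m <- 0.27412   # slope from the example regression equation
b <- 0.56260   # intercept from the example regression equation

skim.abs <- c(1.087, 1.065, 1.204, 1.097, 1.116)

# Rearranged straight-line formula: X = (Y - b) / m
skim.conc.check <- (skim.abs - b) / m
round(skim.conc.check, 3)
```

The results should agree with the hand-entered skim.conc values to within rounding, which is a quick way to catch arithmetic slips.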
You will need to figure out how to do the following simple steps yourself. Feel free to get help from your partners, or if necessary, from your instructor.
Use the mean() function to calculate the average
protein concentrations in both the skim and whole milk. Remember that
you can use an entire variable as the argument to the
mean() function.
Change eval=FALSE to eval=TRUE.
average.whole <- mean(whole.conc)
average.skim <- mean(skim.conc)
# Scale up by 50 to get the undiluted concentrations (this assumes a 50-fold dilution)
undiluted.whole <- average.whole * 50
undiluted.skim <- average.skim * 50
Change eval=FALSE to eval=TRUE.
replayPlot(recordedPlot) # leave this line alone
abline(lin.regression)
recordedPlot <- recordPlot() # leave this line alone
Here, we will add points (of different colors) to the plot. Make sure you know which one is which color so you can describe it in the caption for this figure in your lab report.
Change eval=FALSE to eval=TRUE.
replayPlot(recordedPlot) # leave this line alone
points(x=skim.conc, y=skim.abs, col='blue', pch=16)
points(x=whole.conc, y=whole.abs, col='red', pch=16)
Click Export and then Copy to Clipboard. Then paste the graph into your lab report. If
you don’t have a graph, you may need to follow the instructions at the
very beginning of this document.
Click Knit then Knit to HTML. Save a copy of the resulting HTML file on
your computer. You won’t be turning this in, but it may be useful to
refer to.