Â
[BACKGROUND]
Biological activity can be induced by adding some concentration of compound. One of the core tenants of pharmacology is that this relationship between [Compound], or agonist, and response, follows a semi-log sigmoidal curve described by the Hill equation. This means at first, biological response to the dose (DR) is slow. Then as more compound is added, it becomes roughly linear. FInally, the response is fully saturated and plateaus around some value.
Excellent background resource: https://www.graphpad.com/guides/prism/latest/curve-fitting/reg_the_ec50.htm
To describe the curve, 3 key parameters are estimated.
First, the [compound] that evokes 50% of the response is called the EC50, which is a common measure of how potent a compound is. Lower EC50 = high potency!
Second, the curve top, or maximal saturating biological response that can be induced by the compound is a common measure of the compound’s efficacy. Here, higher max = greater efficacy.
Finally, if we allow the slope of the linear region near EC50 to vary, the slope through the EC50, called the hill slope, can be taken as an estimate of how fast the dose-response occurs. Higher hill slope = smaller [compound] range from the bottom, EC0, to the top, EC100.
The drc library in R works fairly well with R base plots to generate automated dose-response curves. dplyr will help us parse microplate result files (csv results per well in a 96 wp)
Loading data (in this case, the csv results file consists of results from 2 DR curves across two plates- one per plate)
DR_Example...Data <- read.csv("~/Documents/R - Data/ZPrime_Screen_Example - Data.csv")
#Take a peek at 1st 10 rows of data
print(head(DR_Example...Data),10)
## Row Column NucCnt AggObj_Num AggObj_IntSum AggObj_Area Cell_Area NumFields
## 1 B 2 1593 18612 33700000 44231 1476947 30
## 2 C 2 1518 18835 35100000 42458 1542905 30
## 3 D 2 1494 30457 56300000 68089 1468619 30
## 4 E 2 1504 21684 43300000 42796 1544650 30
## 5 F 2 1526 20675 41000000 38545 1603921 30
## 6 G 2 1826 22082 43600000 45807 1392844 30
## Concentration Weeks
## 1 0 4
## 2 0 4
## 3 0 4
## 4 0 4
## 5 0 4
## 6 0 4
nrow(DR_Example...Data)
## [1] 120
We can see the data frame has 120 rows in total.
As a quality control pre-processing step, we will check for wells that are outliers in terms of cell growth/health and remove them from the results.
Well_out<- boxplot.stats(DR_Example...Data$Cell_Area)$out
print(Well_out)
## [1] 974327
One outlier was identified, with a cell area of 9.7E+5
print(median(DR_Example...Data$Cell_Area))
## [1] 1438254
This outlier has about half the cell area compared to the median value accross the two plates 14E+5.
Let’s grab the row index and use it to update the data frame, removing this outlier.
out_ind <- which(DR_Example...Data$Cell_Area %in% c(Well_out))
print(out_ind)
## [1] 7
DR_Example_clean <- DR_Example...Data[-c(out_ind),]
nrow(DR_Example_clean)
## [1] 119
Sanity check: the cleaned data frame has 119 rows.
We will specify which row had no compound treatment a sour negatve control, and subtract out this non-specific signal.
NegControl = filter(DR_Example_clean, Column=='2')
Next, parse the csv file for # of plates. In this case, ‘weeks’ is an indicator of which plate is which.
plates = (unique(DR_Example_clean['Weeks']))
weeks = unique((DR_Example_clean['Weeks']))
print(weeks)
## Weeks
## 1 4
## 61 7
plate_cnt= length(unique((DR_Example_clean['Weeks'])[[1]]))
print(plate_cnt)
## [1] 2
To subtract out the nonspecific backgrouynd signal from the concnetration = wells, individually for each plate:
Parsing for unique curves (timepoints!)
curves = unique((DR_Example_clean['Weeks'])) #List of compounds tested
curve_cnt = length(unique(DR_Example_clean[['Weeks']])) #Length = # of compounds tested
Finally, let’s fit the data to DR curves!
#DR curve fitting
DR.m <- drm(AggObj_IntSum ~ Concentration,
data= DF_norm,
Weeks, #fit separate curves for each timepoint
robust = 'mean', #non-robust least squares estimation ("mean")
fct = LL.4(names = c("Hill slope", "Min", "Max", "EC50")))
print(DR.m)
##
## A 'drc' model.
##
## Call:
## drm(formula = AggObj_IntSum ~ Concentration, curveid = Weeks, data = DF_norm, fct = LL.4(names = c("Hill slope", "Min", "Max", "EC50")), robust = "mean")
##
## Coefficients:
## Hill slope:4 Hill slope:7 Min:4 Min:7 Max:4
## -7.716e-01 -5.803e-01 2.489e+07 -2.996e+07 3.010e+09
## Max:7 EC50:4 EC50:7
## 5.174e+09 6.257e+00 2.777e+00
drc makes it a breeze to quickly check best-fit parameters of interest between curves
compParm(DR.m, "EC50", "-")
##
## Comparison of parameter 'EC50'
##
## Estimate Std. Error t-value p-value
## 4-7 3.4804 7.0259 0.4954 0.6213
compParm(DR.m, "Max", "-")
##
## Comparison of parameter 'Max'
##
## Estimate Std. Error t-value p-value
## 4-7 -2164062554 1562488231 -1.385 0.1688
summary <- summary(DR.m)[[3]] #Coeffs are stored as 3rd element in list obj
here, we can see the 7-week timecourse had a lower EC50 and higher max compared to the 4 week timecourse- although the results are n.s. statistically using t tests.
We can also take a peek at the confidence intervals of best-fit parameters
confint(DR.m, level = 0.90)
## 5 % 95 %
## Hill slope:4 -1.009631e+00 -5.335364e-01
## Hill slope:7 -6.441313e-01 -5.164970e-01
## Min:4 -2.061536e+07 7.040501e+07
## Min:7 -8.427547e+07 2.436285e+07
## Max:4 6.000949e+08 5.419623e+09
## Max:7 4.220024e+09 6.127819e+09
## EC50:4 -5.272151e+00 1.778644e+01
## EC50:7 1.077937e+00 4.475646e+00
Hmm, it looks like the 90% CI is quite wide for EC50. This is quite typical of curves where [compound] was not high enough to characterize the top of the curve.
Finally, let’s plot the curves side- by side
cols <- brewer.pal(4,'Set2') #from RColorBrewer
par(mfrow=c(1,2)) #(n row, n col)
for (i in 1:curve_cnt){ #note curves list index starts at 1
DR = filter(DF_norm, Weeks==weeks[i,1])
DR.mi <- drm(AggObj_IntSum ~ Concentration,
data= DR,
robust = 'mean', #non-robust least squares estimation ("mean")
fct = LL.4(names = c("Hill slope", "Min", "Max", "EC50")))
plot(DR.mi,
type= "all",
col=cols[i], #auto color selection for curves
pch=16,
lwd=1,
cex.axis=0.8,
xlim = c(1e-4, 20),
ylim = c(0, 5E+09),
legend = TRUE,
legendText = (paste(toString(weeks[i,1]), 'weeks')),
xlab= "[Compound] (nM)",
ylab = "Intensity Sum")
plot(DR.mi,
col = cols[i],
xlim = c(1e-4, 20),
ylim = c(0, 5E+09),
add=TRUE,
type='confidence')}
 Â
As suspected, the curves does not reach a steady plateau. Therefore we can’t quantitatively compare best-fit curve EC50 and top values. However, there is a clear trend- the longer timecourse (7 weeks) has a stronger response in terms of both potency of compound (smaller EC50) and higher max (non-existent plateau). This can be easily visualized if we plot the suves on top of each other this time.
par(mfrow=c(1,1)) #(n row, n col)
for (i in 1:curve_cnt){ #note curves list index starts at 1
DR = filter(DF_norm, Weeks==weeks[i,1])
DR.mi <- drm(AggObj_IntSum ~ Concentration,
data= DR,
robust = 'mean', #non-robust least squares estimation ("mean")
fct = LL.4(names = c("Hill slope", "Min", "Max", "EC50")))
plot(DR.mi,
type= "all",
col=cols[i], #auto color selection for curves
pch=16,
lwd=1,
cex.axis=0.8,
xlim = c(1e-4, 20),
ylim = c(0, 5E+09),
legend = FALSE,
xlab= "[Compound] (nM)",
ylab = "Intensity Sum")
plot(DR.mi,
col = cols[i],
xlim = c(1e-4, 20),
ylim = c(0, 5E+09),
add=TRUE,
type='confidence')
par(new=TRUE)} #Add new=TRUE arg to overlay graphs