Introduction

Phenobarbital is a benzodiazepine that is used to prevent convulsions in newborns. Pediatricians wonder whether the pharmacokinetics of phenobarbital in newborns that are treated with therapeutic hypothermia (TH) are the same for normal (normo-thermic) newborns. They have concentration (mg/L) – time (h) data available from patients that were treated with TH (=1) and patients of similar age that are not treated with TH (=0). All patients received a dose of 40 mg in this study.

Notes:

Make sure that for all questions below, the R-code required to obtain the answers are shown (you can use code folding however)
Knit this file, so that you get a more readable case study
You can use this file to fill in your answers, don’t forget to change the authors at the top.
Write a short (1 paragraph) conclusion at the end, where you summarize your findings
Knit this file to HTML to upload to Brightspace. Other formats will not be graded! (Tip: Knit early and often to prevent knitting problems just before the deadline)

Exercises

Exercise 1

Read in the dataset Phenobarbital_TH.csv and describe the design of the study in terms of number of individuals, treatments in the study arms and number and timing of PK samples. Show the table using DT (https://rstudio.github.io/DT/).
NB: Don’t change the file; all changes should be done with R code

Solution

Pheno <- read_csv("C:/Users/fabia/OneDrive/Documenten/Fabian/Universiteit Leiden/BFW 3/Introduction to computational thinking/Phenobarbital_TH.csv")

## Rows: 210 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (4): ID, TIME, DV, TH
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

library(DT)
datatable(Pheno, colnames = c("ID", "Time (h)", "Concentration (mg/L)", "Treatment"), filter = 'top')

The concentration of phenobarbital was measured in 30 patients over the course of 48 hours. Concentration measurements were taken after administration at the following seven time points (in hours) for patients with therapeutic hypothermia (TH): 0, 0.5, 2, 4, 8, 24 and 48. For patients that do not suffer from TH the concentrations were measured at the following seven time points (again in hours): 0, 1, 2, 6, 12, 24 and 48. Patients with TH are indicated with a 1 in the treatment column, those without TH are indicated with a 0.

Exercise 2

How many unique individuals are included in the data set for both treatment arms?

Solution

Pheno %>% filter(TIME == 0) %>% group_by(TH) %>% count()

## # A tibble: 2 × 2
## # Groups:   TH [2]
##      TH     n
##   <dbl> <int>
## 1     0    15
## 2     1    15

To get only one value per individual, the time was filtered on 0, as all participants had a measurement at this time point. The output shows that 15 patients were included in both groups.

Exercise 3

Are there any differences between the pharmacokinetic profiles of phenobarbital patients treated with TH and patients that are not treated with TH? Base your assessment on a plot of the concentration-time data that includes all patients and uses different colors for patients with and without TH. Do this on a linear scale and semi-logarithmic scale.

Solution

ggplot(Pheno, aes(x = TIME, y = DV, colour = as.factor(TH), group = as.factor(ID))) + geom_line() + scale_color_manual(values = c("black", "grey")) + labs(y = "Concentration (mg/L)", x = "Time (h)", colour = "TH")

ggplot(Pheno, aes(x = TIME, y = DV, colour = as.factor(TH), group = as.factor(ID))) + geom_line() + scale_color_manual(values = c("black", "grey")) + scale_y_log10() + labs(y = "log(Concentration (mg/L))", x = "Time (h)", colour = "TH")

## Warning: Transformation introduced infinite values in continuous y-axis

It seems like the patients treated for TH remain on a higher concentration of phenobarbital after the initial uptake (the clearance appears to be slower). This can be seen in the profiles in both graphs, the grey lines remains higher over time than the black lines. This trend can be seen in both the normal and semi-logarithmic graph.

Exercise 4

Can you detect any outlying data points? Make sure that you can differentiate ID numbers and treatment arms in the visualization. Do this on a linear scale and semi-logarithmic scale.

Hint: Fancy statistical outlier analysis not needed, with the correct visualization you should be able to identify the outliers by eye (also don’t worry too much about getting the correct ones, this question and the next are more about the removal and traceability than the actual points, however, you should explain why you think the points you removed are outliers)

Solution

ggplot(Pheno, aes(x = TIME, y = DV, colour = as.factor(ID), group = as.factor(ID))) + geom_line() + facet_wrap(~as.factor(TH)) + labs(y = "Concentration (mg/L)", x = "Time (h)", colour = "ID")

ggplot(Pheno, aes(x = TIME, y = DV, colour = as.factor(ID), group = as.factor(ID))) + geom_line() + facet_wrap(~as.factor(TH)) + scale_y_log10() + labs(y = "log(Concentration (mg/L))", x = "Time (h)", colour = "ID")

## Warning: Transformation introduced infinite values in continuous y-axis

From the graphs, two individuals seem to be outliers. The first one is 2018135 in TH group 0, its peak is much higher than every other patient. It seems unrealisticly high, therefore it is considered as an outlier. The second outlier is 2017154 from TH group 1. Around 6 hours after the administration of phenobarbital it shows a peak, which is much later than all the other patients. Then, the concentration drops drastically. Therefore, this is also considered as an outlier.

Exercise 5

Make a subset of the dataset that does not include the outlying data points and use that dataset for the following questions. Make sure that it is clear in your script what subset of the data you have used (or what data points you have taken out), because traceability is very important!

Solution

Pheno2 <- Pheno %>% filter (!ID == 2018135, !ID == 2017154)

The new dataset contains every patient, except for 2018135 and 2017154.

Exercise 6

What are the average Cmax (peak concentration), average Cmin (trough concentration), and average Cav (average concentration) for both treatment arms? Does it make sense to compare these values between both groups given the study design?

Hint: You need to summarize the output from summarize. (First summarize is per patient, second is summarizing over the patients)

Solution

#mean of the maxes
Pheno2 %>% filter(TH == 0) %>% group_by(ID) %>% summarize(max1 = max(DV)) %>% summarize(mean0 = mean(max1))

## # A tibble: 1 × 1
##   mean0
##   <dbl>
## 1  41.6

Pheno2 %>% filter(TH == 1) %>% group_by(ID) %>% summarize(max2 = max(DV)) %>% summarize(mean1 = mean(max2))

## # A tibble: 1 × 1
##   mean1
##   <dbl>
## 1  42.2

#mean of the mins
Pheno2 %>% filter(TH == 0) %>% filter(!TIME == 0) %>% group_by(ID) %>% summarize(min1 = min(DV)) %>% summarize(mean0 = mean(min1))

## # A tibble: 1 × 1
##   mean0
##   <dbl>
## 1  1.02

Pheno2 %>% filter(TH == 1) %>% filter(!TIME == 0) %>% group_by(ID) %>% summarize(min2 = min(DV)) %>% summarize(mean1 = mean(min2))

## # A tibble: 1 × 1
##   mean1
##   <dbl>
## 1  5.28

#mean of the averages
Pheno2 %>% filter(TH == 0) %>% group_by(ID) %>% summarize(av1 = mean(DV)) %>% summarize(mean0 = mean(av1))

## # A tibble: 1 × 1
##   mean0
##   <dbl>
## 1  18.4

Pheno2 %>% filter(TH == 1) %>% group_by(ID) %>% summarize(av2 = mean(DV)) %>% summarize(mean1 = mean(av2))

## # A tibble: 1 × 1
##   mean1
##   <dbl>
## 1  22.6

Comparing the maximum values makes sense, as this represents the distribution volume of the groups. The other two are not very purposeful to compare, as you want to know what the rate of clearance. To test that you do not need one value, but need to know the degree of decrease of the concentration

Exercise 7

The graph of concentration-time data reveals a linear profile on a semi-logarithmic scale. In order to perform a linear regression on the data, add a new column of (natural) log-transformed values of the concentrations to the data set.

Solution

Pheno3 <- Pheno2 %>% mutate(lnDV = log(DV)) 
Pheno3$TH <- as.factor(Pheno3$TH)
Pheno3$ID <- as.factor(Pheno3$ID)
Pheno3[Pheno3 == "-Inf"] <- NA

Because the computer cannot read -Inf values, this value is changed to NA.

Exercise 8

Fit a linear model to the ln-transformed concentration data, using both time and treatment as covariates. Make models with and without interaction and explain which model is better.

The peak concentration is mainly determined by the distribution volume of phenobarbital in this population, while the slope of the profiles is mainly determined by the clearance. Do you anticipate differences in distribution volume and clearance between patients on TH treatment and patients that are not on TH treatment, given the outcome of the model fit?

Hint: Which model should you use for your anticipation?

Bonus: Using the data_grid() and gather_predictions() functions from the modelr package, can you show the difference between the (predictions of the) two models in a plot?

Solution

summary(lm(lnDV~TIME+TH, data = Pheno3))

## 
## Call:
## lm(formula = lnDV ~ TIME + TH, data = Pheno3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.98507 -0.25437  0.01175  0.25735  2.09440 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.51008    0.06974   50.34   <2e-16 ***
## TIME        -0.06637    0.00247  -26.87   <2e-16 ***
## TH1          0.42559    0.08247    5.16    7e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5342 on 165 degrees of freedom
##   (28 observations deleted due to missingness)
## Multiple R-squared:  0.8213, Adjusted R-squared:  0.8191 
## F-statistic: 379.2 on 2 and 165 DF,  p-value: < 2.2e-16

summary(lm(lnDV~TIME*TH, data = Pheno3))

## 
## Call:
## lm(formula = lnDV ~ TIME * TH, data = Pheno3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.02324 -0.16937  0.01345  0.18409  1.53444 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.783440   0.067585  55.980  < 2e-16 ***
## TIME        -0.084006   0.002990 -28.093  < 2e-16 ***
## TH1         -0.088154   0.093511  -0.943    0.347    
## TIME:TH1     0.034310   0.004171   8.226 5.66e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4508 on 164 degrees of freedom
##   (28 observations deleted due to missingness)
## Multiple R-squared:  0.8735, Adjusted R-squared:  0.8712 
## F-statistic: 377.5 on 3 and 164 DF,  p-value: < 2.2e-16

The distribution volume is determined by the intercept. The model with interaction does not show a significant difference between TH0 and TH1, while the model without interaction does. The difference in clearance is determined by the interaction value between Time and TH1. This is a significant value, so there could be a difference between the clearance of the two groups.

Exercise 9

Fit a linear model to the ln-transformed concentration data of all individual patients in both treatment arms and make a table that includes ID number, treatment arm and the concentration at time = 0 (time0).

Hints:

Have a look at rowwise (https://dplyr.tidyverse.org/articles/rowwise.html#modelling-1) or at https://purrr.tidyverse.org/ and the slides
Depending on your approach you might need to make a tibble out of the results, and join this with the original tibble

Solution

Pheno3_nest <- 
  Pheno3 %>% 
  group_by(ID) %>% 
  nest()

Pheno3_nest1 <- Pheno3_nest %>%
  rowwise() %>% 
  mutate(model = list(lm(lnDV ~ TIME, data = data)))

Pheno3_nest2 <- Pheno3_nest1 %>% 
  rowwise() %>% 
  mutate(
    coef_intercept = model$coefficients[[1]],
    coef_x = model$coefficients[[2]])

Pheno3_nest3 <-  Pheno3_nest2 %>% group_by(ID) %>% unnest(cols = data) %>% select(ID, TH, coef_intercept)

Pheno_mooi <- Pheno3_nest3[!duplicated(Pheno3_nest3), ]

datatable(Pheno_mooi)

Exercise 10

The concentration at time zero, \(time0 \approx \frac{dose}{distribution volume}\). Assume that the distribution volume in the population is normally distributed. Can you determine whether there is a statistically significant difference in the distribution volume of phenobarbital for patients with TH treatment and without TH treatment?

Solution

Pheno_mooi2 <- Pheno_mooi %>% mutate(Time0 = 40/coef_intercept)
t.test(Pheno_mooi2$Time0[1:14], Pheno_mooi2$Time0[15:28])

## 
##  Welch Two Sample t-test
## 
## data:  Pheno_mooi2$Time0[1:14] and Pheno_mooi2$Time0[15:28]
## t = 1.4916, df = 25.997, p-value = 0.1478
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.09518002  0.59869751
## sample estimates:
## mean of x mean of y 
##  10.84125  10.58949

To test the difference in the distribution volume, a t-test is performed on the Time0 of both groups. The p-value is higher than 0.05, so there is no statistical difference in distribution volume of patients with TH and without TH treatment.

Conclusion

In conclusion, the initial data seemed to show a significant difference in clearance of the drug, when looking at it in the graph and grouping it by TH group. However, there did not seem to be a difference in the distribution volume, as the peaks did not seem to be higher for the two different groups. After removing two outliers with concentration values that seemed too high, statistical tests were performed on the groups. Making a linear model out of the (log transformed) data makes it possible to compare the slope and the intercept. The intercept represents the distribution volume, the slope represents the clearance. The model without interaction shows a significant difference in intercept, but the model with does not. The difference in slope is significant, which would mean that the clearance between the groups is significant. When the intercept was tested with a t-test, it showed that there was not a significant difference in intercept. Therefore, it can be assumed that the distribution volume in the two patient groups is the same.

ICT case study: Phenobarbital

Pharmacokinetic data analysis

Amy de Boer, Charlotte Paulssen, Jurgen Zwerwer, Fabian Leferink

Introduction

Exercises

Exercise 1

Solution

Exercise 2

Solution

Exercise 3

Solution

Exercise 4

Solution

Exercise 5

Solution

Exercise 6

Solution

Exercise 7

Solution

Exercise 8

Solution

Exercise 9

Solution

Exercise 10

Solution

Conclusion