Introduction

Neurally Adjusted Ventilatory Assist, in both its invasive and non-invasive forms, can be used on all patients requiring ventilatory assistance (neonatal, pediatric and adult patients), provided that the electrical signal from the brain to the diaphragm is intact and that there is no contraindication for insertion/exchange of a nasogastric tube. In this project, data set comes from UT Medical Center contains data on heart rate, mean blood pressure, saturation, fraction of inspired oxygen, respiratory rate, number of switches to backup and percentage of time in backup. The scores of these variables were collected from babies in the first few weeks of their lives. We are going to evaluate the difference over time for each variable, as well as the effects between invasive (NAVA) and non-invasive NAVA (NIVNAVA).

Method

The first step in the analysis is to conduct preliminary exploration of data, which can help us identify definite patterns, variability and detect any strange observations that need following up. Scatterplots grouped by treatment are sketched after rearranging the original data into usable form.

Now that we have some understanding of the data, we come to realize that the difference between subjects is not of interest so that we can add a random effect for subjects, which characterizes idiosyncratic variation that is due to individual differences. In order to reduce the noise inherent to single observation, as well as to avoid problems triggered by missing data, baseline subtraction on intercepts is applied on NAVA scores, which serve as the best prediction for the observation at time 0.

From what we have achieved earlier, we can begin fitting the ANCOVA model, a common method to assess different scaling parameters (i.e. slopes) or the development of different forms or shapes between groups. We would get the constant term for the mean for NIV-NAVA (\(\mu_{ex}\)), and a separate slope (\(\beta_{1}\) for NAVA, \(\beta_{2}\) for NIVNAVA) for each treatment as a function of time, the mathematical model for our case is shown below: \[Y_{i+} = \mu_{ex} + \beta_{i} t + \epsilon \]

#data cleaning and rearrangement
origin <- read.csv("https://raw.githubusercontent.com/yesimiao/Dataset/master/NAVAorigin.csv")
id <- rep(origin[c(2:16),1], each=8)
time <- rep(seq(-1.5,2,0.5), 15)

#rearrange data from origin to NAVA
var <- matrix(rep(NA,120*7), nrow = 120, byrow = TRUE)
for (j in 1:7){
var[,j] <- as.numeric(c(t(data.frame(origin[2:16,(4+10*(j-1)):(4+10*(j-1)+7)])))) }
colnames(var) <- c( "hr","sat" , "mbp" , "rr" ,"fio2", "switch","timeinbu")
NAVA <- data.frame(id,time,var)
NAVA$treatment <- as.factor("NAVA")

#rearrange data from origin to NIVNAVA
NIVvar <- matrix(rep(NA,120*7), nrow = 120, byrow = TRUE)
for (j in 1:7){
NIVvar[,j] <- as.numeric(c(t(data.frame(origin[23:37,(4+10*(j-1)):(4+10*(j-1)+7)])))) }
colnames(NIVvar) <- c( "hr","sat" , "mbp" , "rr" ,"fio2", "switch","timeinbu")
NIVNAVA <- data.frame(id,time,NIVvar)
NIVNAVA$treatment <- as.factor("NIVNAVA")

#data to be manipulated in following procedures
data1 <- rbind(NAVA, NIVNAVA)

Using “xyplot” function from R “lattice” package, nice scatterplots grouped by treatment are provided in following figures. The circles represent scores of individual observation while the lines indicate the linear regression for each subject.

#exploratory data analysis
library(lattice)
xyplot(hr ~ time| treatment, data = data1, groups = id, type = c("p","r"), 
       panel = panel.superpose)

From scatterplot of heart rate versus time, we can see modest increasing pattern over time for most subjects with only two or three having the decreasing form. The difference between invasive and non-invasive NAVA is not that straightforward from the plot, we would conduct linear regression model to verify this.

xyplot(sat ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)

It is quite obvious that the most regression lines of NIVNAVA from plot of saturation versus time have less fluctuation and tend to be flat. Meanwhile observations in NAVA have similar pattern except subjects LW13 and AM07, who have an upward trend compared to others.

xyplot(mbp ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)

In regard to mean blood pressure, scores in NIVNAVA seem to be higher than NAVA with more variation within group. Besides, both groups present flat trend over time in general.

xyplot(rr ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)

Under both treatment and control groups, the number of decreasing observations over time is tied to the number of steady scores in case of respiratory rate. And different subjects have slightly distinguished reactions to the change of treatment.

xyplot(fio2 ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)

When it comes to the factor of $FiO_{2} $, the patterns under the two groups have significant difference from the scatterplot. To start with, NIVNAVA group have much more variation and higher mean score value than the NAVA group. Besides, some subjects in NIVNAVA show an upward trend while no subject in NAVA group has similar pattern.

xyplot(switch ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)

Greater variation and higher mean score are observed in the variable “number of switches to backup” as well. In this case, scores from most subjects indicates escalating trend especially in NIVNAVA group.

xyplot(timeinbu ~ time| treatment, data = data1, groups = id, type = c("p",'r'),
       panel = panel.superpose)

Except for subjects MW08 and LW15, the majority in both groups ascend modestly over time. Accompanied by the fact that NIVNAVA group has much variation and slightly higher mean score value.

After exploring the data visually, it is crucial for us to subtract the observations from the baselines, which helps to eliminate the impacts due to single subject. Then we can carry out linear regression analysis to find out the significant predictor by setting the subtracted scores as response variable, time as covariate and different treatment as factor. The baseline intercepts, derived from individual regression model in NAVA group, are shown below:

#find baselines
#intercept for all variables, all subjects 
idlist <- unique(NAVA$id)
ylist <- names(NAVA)[3:9]
timelist <- unique(NAVA$time)
intercept = matrix(rep(0,15*7),nrow=15, byrow=TRUE)
for (i in 1:15) {
for (j in 1:7) {
models_by_var <- lapply(ylist, function(x) {
lm(substitute(k ~ time, list(k = as.name(x))),
data=subset(NAVA,(NAVA$id==idlist[i])))
})
intercept[i,j] = models_by_var[[j]][[1]][1] }
}
colnames(intercept) <- c(ylist)
rownames(intercept) <- as.character(idlist)
intercept
##            hr      sat      mbp       rr     fio2     switch   timeinbu
## EP01 136.9524 94.66667 54.14286 70.85095 21.00952 0.42857143  0.0000000
## SM02 174.8750 93.97619 53.07143 79.51774 21.01667 0.10714286  0.7654762
## HI03 147.1254 95.49206 35.88571 56.03997 21.01397 0.14603175  0.2867302
## MS04 140.4571 94.88730 39.54286 54.85248 25.02444 0.00000000  0.0000000
## EE05 140.3095 94.94048 29.67857 41.51131 23.94405 1.17857143  9.6067857
## KJ06 159.0238 95.23810 44.28571 50.28810 21.01429 1.14285714  3.5685714
## AM07 136.9905 88.62381 29.55238 47.72517 30.81683 0.86031746  2.3155556
## MW08 153.1476 99.86825 55.42857 45.11143 21.01111 1.45079365  4.2477460
## BE09 144.6667 96.66667 43.57143 49.32000 21.15714 0.00000000  0.0000000
## EE10 163.4048 96.12500 41.89286 62.04452 21.02976 1.23809524  3.9489286
## EK11 164.5714 99.28571 31.28571 53.07714 21.00000 0.61904762  2.6980952
## ZE12 153.6190 93.67262 48.25000 71.03845 33.53214 0.13095238  0.1660714
## LW13 144.7024 88.66667 40.60714 59.58952 31.86429 1.44047619  3.7361905
## CH14 154.4048 93.59524 33.71429 76.10333 24.02857 0.19047619  0.3966667
## LW15 128.4079 99.93651 45.27619 71.28654 20.99270 0.06666667 -0.4341270

Since intercepts from NAVA group are used as baseline, intuitively speaking, the new response variable trends to be zero under NAVA treatment, which can be verified in the following boxplot. This fact supports the idea of forcing intercept term into zero when we fit the overall regression model for NAVA group.

#the updated data set after subtraction
diff_NAVA <- data.frame(id, time, NAVA[,c(3:9)] - intercept[rep(1:15,each=8),])
diff_NIVNAVA <- data.frame(id, time, NIVNAVA[,c(3:9)] - intercept[rep(1:15,each=8),])
#boxplot of new response variables
boxplot(diff_NAVA[3:9])
abline(h=0)

boxplot(diff_NIVNAVA[3:9])
abline(h=0)

On the contrary, as shown from the boxplot below, under treatment NIVNAVA, overall regression model would have distinguished intercept term due to mean score value difference between these two groups.

Based on above setup, linear regression analysis is implemented for each variable, with $_{0} $ functioned as intercept term for NIVNAVA, \(\beta_{1}\) as slope for treatment NAVA and \(\beta_{2}\) as slope for treatment NIVNAVA.

#fit linear regression to NAVA
beta1 <- NULL
pvalue1 <- NULL
for (j in 1:7) { 
models <- lapply(ylist, function(x) {
  lm(substitute(k ~ time - 1, list(k = as.name(x))), 
     data=diff_NAVA)
  })
beta1[j] = models[[j]]$coefficients
pvalue1[j] = summary(models[[j]])$coefficients[4]
} 

#fit linear regression to NIVNAVA
beta0 <- NULL
pvalue0 <- NULL
beta2 <- NULL
pvalue2 <- NULL
for (j in 1:7) { 
NIVmodels <- lapply(ylist, function(x) {
  lm(substitute(k ~ time , list(k = as.name(x))), 
     data=diff_NIVNAVA)
  })
beta0[j] = NIVmodels[[j]]$coefficients[1]
beta2[j] = NIVmodels[[j]]$coefficients[2]
pvalue0[j] = summary(NIVmodels[[j]])$coefficients[1,4]
pvalue2[j] = summary(NIVmodels[[j]])$coefficients[2,4]
} 

#summary for estimated parameters
model_summary <- round(cbind(beta0,pvalue0,beta1,pvalue1,beta2,pvalue2),4)
rownames(model_summary ) <- ylist
model_summary
##            beta0 pvalue0   beta1 pvalue1   beta2 pvalue2
## hr        2.2337  0.0089  2.6572  0.0000  0.7765  0.3613
## sat      -1.4064  0.0003  0.9094  0.0010 -0.4785  0.2077
## mbp       2.2489  0.0000 -0.1570  0.5159 -0.2224  0.6674
## rr       -1.6438  0.1147 -3.8665  0.0000  0.6322  0.5465
## fio2      5.9109  0.0000 -1.1078  0.0001 -2.0056  0.0148
## switch    0.8303  0.0000  0.3365  0.0000  0.3109  0.0397
## timeinbu  4.9652  0.0000  3.3307  0.0000  0.4982  0.6150

The table shows us the estimated parameters and corresponding p-value, among which only intercept for respiratory rate under NIVNAVA treatment, slope for mean blood pressure under NAVA treatment and slopes for heart rate, saturation, mean blood pressure, respiratory rate and percentage of time in backup under NIVNAVA treatment are not significant at 95% confidence level.

#check the goodness of fit of the models using residual analysis
par(mfrow=c(2,4))
plot((models[[1]]))
plot((models[[2]]))

plot((models[[3]]))
plot((models[[4]]))

plot((models[[5]]))
plot((models[[6]]))

par(mfrow=c(1,4))
plot((models[[7]]))

par(mfrow=c(2,4))
plot((NIVmodels[[1]]))
plot((NIVmodels[[2]]))

plot((NIVmodels[[3]]))
plot((NIVmodels[[4]]))

plot((NIVmodels[[5]]))
plot((NIVmodels[[6]]))

par(mfrow=c(1,4))
plot((NIVmodels[[7]]))

Conclusion

From the exploratory and regression analysis, we can draw the following conclusions: In regard to the variables selected as significant to the model, it’s noteworthy that all vairables but respiratory rate have significant intercepts for NIVNAVA at 95% confidence level. This result tells us that treatment under NIVNAVA affects heart rate, mean blood pressure, \(FiO_{2}\), number of switches to backup, percentage of time by increasing 2.23, 2.25, 5.91, 0.83, 4.97 units respectively. Whereas treatment under NIVNAVA affects saturation by lowering 1.41 unit.

Also, we can see that all vairables but mean blood pressure have significant slopes for NAVA at 95% confidence level. After further investigation, we can safely conclude that scores of heart rate and percentage of time in backup are increasing noteworthily, saturation and number of switches to backup rise steadily as time goes by. Meanwhile respiratory rate and \(FiO_{2}\) decline by 3.87 and 1.11 units over time.

Remarkable contrast can be observed on slopes for treatment under NIVNAVA group, with only \(FiO_{2}\) and number of switches to backup remain statistically significant. That is, \(FiO_{2}\) has a downside trend of 2 unit over time while number of switches to backup has modest rise of 0.31 unit over time.

#data cleaning and rearrangement
origin <- read.csv("https://raw.githubusercontent.com/yesimiao/Dataset/master/NAVAorigin.csv")
id <- rep(origin[c(2:16),1], each=8)
time <- rep(seq(-1.5,2,0.5), 15)

#rearrange data from origin to NAVA
var <- matrix(rep(NA,120*7), nrow = 120, byrow = TRUE)
for (j in 1:7){
var[,j] <- as.numeric(c(t(data.frame(origin[2:16,(4+10*(j-1)):(4+10*(j-1)+7)])))) }
colnames(var) <- c( "hr","sat" , "mbp" , "rr" ,"fio2", "switch","timeinbu")
NAVA <- data.frame(id,time,var)
NAVA$treatment <- as.factor("NAVA")

#rearrange data from origin to NIVNAVA
NIVvar <- matrix(rep(NA,120*7), nrow = 120, byrow = TRUE)
for (j in 1:7){
NIVvar[,j] <- as.numeric(c(t(data.frame(origin[23:37,(4+10*(j-1)):(4+10*(j-1)+7)])))) }
colnames(NIVvar) <- c( "hr","sat" , "mbp" , "rr" ,"fio2", "switch","timeinbu")
NIVNAVA <- data.frame(id,time,NIVvar)
NIVNAVA$treatment <- as.factor("NIVNAVA")

#data to be manipulated in following procedures
data1 <- rbind(NAVA, NIVNAVA)

#exploratory data analysis
library(lattice)
xyplot(hr ~ time| treatment, data = data1, groups = id, type = c("p","r"), 
       panel = panel.superpose)
xyplot(sat ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)
xyplot(mbp ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)
xyplot(rr ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)
xyplot(fio2 ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)
xyplot(switch ~ time| treatment, data = data1, groups = id, type = c("p",'r'), 
       panel = panel.superpose)
xyplot(timeinbu ~ time| treatment, data = data1, groups = id, type = c("p",'r'),
       panel = panel.superpose)

#find baselines
#intercept for all variables, all subjects 
idlist <- unique(NAVA$id)
ylist <- names(NAVA)[3:9]
timelist <- unique(NAVA$time)
intercept = matrix(rep(0,15*7),nrow=15, byrow=TRUE)
for (i in 1:15) {
for (j in 1:7) {
models_by_var <- lapply(ylist, function(x) {
lm(substitute(k ~ time, list(k = as.name(x))),
data=subset(NAVA,(NAVA$id==idlist[i])))
})
intercept[i,j] = models_by_var[[j]][[1]][1] }
}
colnames(intercept) <- c(ylist)
rownames(intercept) <- as.character(idlist)
intercept

#the updated data set after subtraction
diff_NAVA <- data.frame(id, time, NAVA[,c(3:9)] - intercept[rep(1:15,each=8),])
diff_NIVNAVA <- data.frame(id, time, NIVNAVA[,c(3:9)] - intercept[rep(1:15,each=8),])

#boxplot of new response variables
boxplot(diff_NAVA[3:9])
abline(h=0)
boxplot(diff_NIVNAVA[3:9])
abline(h=0)

#fit linear regression to NAVA
beta1 <- NULL
pvalue1 <- NULL
for (j in 1:7) { 
models <- lapply(ylist, function(x) {
  lm(substitute(k ~ time - 1, list(k = as.name(x))), 
     data=diff_NAVA)
  })
beta1[j] = models[[j]]$coefficients
pvalue1[j] = summary(models[[j]])$coefficients[4]
} 

#fit linear regression to NIVNAVA
beta0 <- NULL
pvalue0 <- NULL
beta2 <- NULL
pvalue2 <- NULL
for (j in 1:7) { 
NIVmodels <- lapply(ylist, function(x) {
  lm(substitute(k ~ time , list(k = as.name(x))), 
     data=diff_NIVNAVA)
  })
beta0[j] = NIVmodels[[j]]$coefficients[1]
beta2[j] = NIVmodels[[j]]$coefficients[2]
pvalue0[j] = summary(NIVmodels[[j]])$coefficients[1,4]
pvalue2[j] = summary(NIVmodels[[j]])$coefficients[2,4]
} 

#summary for estimated parameters
model_summary <- round(cbind(beta0,pvalue0,beta1,pvalue1,beta2,pvalue2),4)
rownames(model_summary ) <- ylist
model_summary

#check the goodness of fit of the models using residual analysis
X11()
par(mfrow=c(4,4))
plot((models[[1]]))
plot((models[[2]]))
plot((models[[3]]))
plot((models[[4]]))
X11()
par(mfrow=c(3,4))
plot((models[[5]]))
plot((models[[6]]))
plot((models[[7]]))
X11()
par(mfrow=c(4,4))
plot((NIVmodels[[1]]))
plot((NIVmodels[[2]]))
plot((NIVmodels[[3]]))
plot((NIVmodels[[4]]))
X11()
par(mfrow=c(3,4))
plot((NIVmodels[[5]]))
plot((NIVmodels[[6]]))
plot((NIVmodels[[7]]))