I used Janice’s data containing pre-post cortisol measures to generate a fictitious pre-post-3day dataset. Subsequently, I applied alternative partitioning criteria.
First: read the data, reshape and represent it graphically.
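library(readxl) # read_xlsx()
library(dplyr) # mutate(), select(), %>%
library(tidyr) # gather()
library(ggplot2) # plotting
corti<-read_xlsx("Final Cortisol Data for analysis (1).xlsx")
corti<-mutate(corti,`Sample ID`=as.factor(`Sample ID`))
td<-gather(corti,"key","value",-`Sample ID`,factor_key = T) # long format: one row per animal and time point
ggplot(td,aes(x=key,y=value,group=`Sample ID`))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+ggtitle("Cortisol Levels pre-post mix")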
#simulation
To simulate data we replicate the existing data 20 times. Then, for each observation (row) of the augmented data, we simulate the 3day measure as follows:
3day = post - coef*(post-pre), where coef is drawn from a Beta(0.7, 0.7) distribution
Notice that if coef~0 the 3day value will be close to the post value, and if coef~1.0 the 3day value will be close to the pre level.
##Example of the beta distribution
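A quick way to see the shape of the distribution used for the coefficient is to plot its density. The sketch below uses base R only; the grid of x values, the seed, and the number of draws are arbitrary choices for illustration.

```r
# Beta(0.7, 0.7): both shape parameters are below 1, so the density is U-shaped
x <- seq(0.01, 0.99, by = 0.01)   # stay away from 0 and 1, where the density is unbounded
dens <- dbeta(x, shape1 = 0.7, shape2 = 0.7)
plot(x, dens, type = "l", xlab = "coef", ylab = "density",
     main = "Beta(0.7, 0.7) density")

# The distribution is symmetric around 0.5
set.seed(1)                       # arbitrary seed, for reproducibility only
coef_draws <- rbeta(10000, 0.7, 0.7)
mean(coef_draws)                  # close to 0.5
```

Because most of the mass sits near 0 and 1, a simulated animal tends to end up either close to its post value or close to its pre value, with fewer animals in between.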
corti<-mutate(corti,dfd= `Post-Trans. [pg/ml]`-`Pre-Trans. [pg/ml]`) # acute change, post - pre
cort<-mutate(corti,`Sample ID`=as.numeric(`Sample ID`),coef=rbeta(nrow(corti),.7,.7),day3=`Post-Trans. [pg/ml]`-coef*dfd) # first replicate
for (i in 1:19){ # 19 more replicates, each with its own block of IDs
cort2<-mutate(corti,`Sample ID`=as.numeric(`Sample ID`)+nrow(corti)*i,coef=rbeta(nrow(corti),.7,.7),day3=`Post-Trans. [pg/ml]`-coef*dfd)
cort<-rbind(cort,cort2)
}
cort<-select(cort,-c(dfd,coef)) # drop the helper columns
td2<-gather(cort,"key","value",-`Sample ID`,factor_key = T)
ggplot(td2,aes(x=key,y=value,group=`Sample ID`))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+ggtitle("simulated cortisol data for 3day")
As you can see, for each observation in the original dataset there are 20 simulated values of the 3day cortisol concentration, based on a random decay coefficient drawn from the U-shaped beta distribution.
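The replication scheme can be sketched on toy numbers, without the Excel file. This is a minimal base R illustration, not the code used for the plots: the data frame `toy`, its column names, and the seed are invented for the example.

```r
# Toy stand-in for the real data: three animals with hypothetical values (pg/ml)
toy <- data.frame(id   = 1:3,
                  pre  = c(10, 20, 15),
                  post = c(30, 25, 40))
n_rep <- 20                        # replicates per animal, as in the text
set.seed(42)                       # arbitrary seed

# Stack n_rep copies of the data and give every copy its own block of IDs
sim <- toy[rep(seq_len(nrow(toy)), times = n_rep), ]
sim$id <- sim$id + nrow(toy) * rep(0:(n_rep - 1), each = nrow(toy))
rownames(sim) <- NULL

# day3 = post - coef * (post - pre), with coef ~ Beta(0.7, 0.7)
sim$coef <- rbeta(nrow(sim), 0.7, 0.7)
sim$day3 <- sim$post - sim$coef * (sim$post - sim$pre)

# Every simulated day3 value lies between the pre and post values
all(sim$day3 >= pmin(sim$pre, sim$post) & sim$day3 <= pmax(sim$pre, sim$post))
```

Building the stacked table in one step avoids growing a data frame inside a loop, which is the main difference from the `rbind` approach.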
#Apply different criteria to subset these curves
One criterion to subset these curves is the “acute” criterion = post-pre. We are pretty much set on this one.
For the recovery criterion we have two possibilities so far:
absolute recovery= 3day-pre
relative recovery= absolute/acute = (3day-pre)/(post-pre)
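To make the difference between the two recovery measures concrete, here is a small worked example. The two animals and their values are hypothetical, and the short column names are chosen only for this sketch.

```r
# Two hypothetical animals (values in pg/ml, invented for illustration)
ex <- data.frame(animal = c("A", "B"),
                 pre    = c(10, 10),
                 post   = c(30, 15),
                 day3   = c(15, 15))

ex$acute      <- ex$post - ex$pre           # acute response
ex$recoverabs <- ex$day3 - ex$pre           # absolute recovery
ex$recoverrel <- ex$recoverabs / ex$acute   # relative recovery
ex
```

Both animals sit 5 pg/ml above baseline on day 3, so the absolute measure cannot tell them apart. The relative measure can: animal A retains 25% of its acute rise, while animal B retains all of it.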
Let’s compute these and plot the curves.
cort<-mutate(cort,acute=`Post-Trans. [pg/ml]`-`Pre-Trans. [pg/ml]`,
recoverabs=day3-`Pre-Trans. [pg/ml]`,
recoverrel=recoverabs/acute,acuterel=acute/`Pre-Trans. [pg/ml]`)
cort$qacute<-findInterval(cort$acute,quantile(cort$acute,c(.25, .5, .75)))%>%as.character()
cort$qabs<-findInterval(cort$recoverabs,quantile(cort$recoverabs,c(.25, .5, .75)))%>%as.character()
cort$qrel<-findInterval(cort$recoverrel,quantile(cort$recoverrel,c(.25, .5, .75)))%>%as.character()
cort$qacrel<-findInterval(cort$acuterel,quantile(cort$acuterel,c(.25, .5, .75)))%>%as.character()
td3<-gather(cort,"key","value",-c(`Sample ID`,qacute,qabs,qrel,qacrel,recoverabs,recoverrel,acute,acuterel),factor_key = T)
ggplot(td3,aes(x=key,y=value,group=`Sample ID`,color=qacute))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+ggtitle("curves classified according to acute criteria")
In this case we would select the extreme animals, colored in magenta and in purple (groups 0 and 3).
Let’s look at an alternative representation using facets.
ggplot(td3,aes(x=key,y=value,group=`Sample ID`))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+facet_wrap(~qacute)
We can now clearly see what groups 0 and 3 are: 0 = no increase in cortisol, 3 = sharp increase.
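The group labels 0 to 3 come from `findInterval()` applied to the three quartile cut points. A minimal sketch of that coding follows; the simulated vector `acute`, its mean and standard deviation, and the seed are invented for illustration.

```r
set.seed(7)                                # arbitrary seed
acute <- rnorm(200, mean = 20, sd = 10)    # hypothetical acute responses

cuts <- quantile(acute, c(.25, .5, .75))   # the three quartile boundaries
grp  <- findInterval(acute, cuts)          # 0 = below Q1, ..., 3 = at or above Q3

table(grp)                                 # roughly a quarter of the values per group
tapply(acute, grp, range)                  # value range covered by each group
```

Group 0 therefore holds the lowest quarter of the responses and group 3 the highest quarter. The analysis code additionally converts the labels to character so that ggplot treats them as discrete groups.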
Let’s throw in the absolute measure of recovery
ggplot(td3,aes(x=key,y=value,group=`Sample ID`))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+facet_grid(qabs~qacute)+
ggtitle("groups with acute and absolute recovery measure")
Now relative recovery
ggplot(td3,aes(x=key,y=value,group=`Sample ID`))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+facet_grid(qrel~qacute)+
ggtitle("groups with acute and relative recovery measure")
In both cases we see some separation.
After looking at this, I have to concur with Andrea that the recovery seems to be more obvious using the relative measure.
But looking at these graphics, I wondered if we also need a relative measure of the acute response: (post-pre)/pre
So I implemented it, and the results are below.
ggplot(td3,aes(x=key,y=value,group=`Sample ID`))+
geom_point(stat='summary', fun=sum) +
stat_summary(fun=sum, geom="line")+facet_grid(qrel~qacrel)+
ggtitle("groups with relative acute and recovery measures")
I am not sure this added much. Let’s discuss it.