ABBREVIATIONS:
A Active Drug
B Placebo
Baseline BP B0
BP Blood Pressure [in mmHG]
CV Cardio Vascular Disease
DOD Difference of Difference (T test)
LM linear model
MDS Multi-dimension sclaing
TRT Treatment either placebo or Drug Beta blocker
RCT Randomized Clinical Trial
RM Repeated Measures
RTM Regression to the Mean
SD Standard deviation
| Given that
The study of longitudinal data is a challenge for Scientists ,Students and practitioner because one wants to make inference and modelling but often failed to account for RM in their classic statistical tests : Assumptions are broken and is a flaw in publications : As result coefficient might be still reflect true value but the SE of coefficient or test or inflated1.
The main subject here is how including a baseline as a co variate act on your “Linear parameters” model prediction.
We present a short analysis as follow:
RCT & Data. Wide vs Long format
Graphing longitudinal data [ID trajectories]2
Some word on variability in time
Guideline basic-naive linear modelling: T-Test on endpoint 4 months and LM
Linear model (with/without baseline inclusion)
Linear mixed model
Beta blocker and Reserpine like drugs3 has been proven significantly efficient in CV diseases (Stroke,Infarction,BP reduction etc), this since the 60’s which was the golden age of these discoveries.
The physiological mechanisms are:
B1 or VMAST calcic antagonist blocking agent4 reduce contractile ventricular pumping hence reducing the load on arterial and peripheral pipe systems
Less blood flow
Inhibition of renine-angiotensine II at kidney receptor site -An reduce autonmous nervous system activity.
Vaso-dilatation of vessels and much more effects
Of course multi-targeted spots in our cells receptors (via Calcium or B blockade chained) result in ends of a major drop in our BP.
The data presented here come from a study: Veterans Administration Cooperative Study Group on Anti hypertensive Agents (VA Study Group (1967, D. Freis):
Endpoints Yit: Diastolic BP Supine horizontal in [mmHg]
Beside our BP follwows cycles and might be influenced by: Natural Rhythm (“random”) , Emotions-Stress (“random”) , Cafeine-Alcholl-drugs-exercises consumption (can be controlled) among others.
Randomization between placebo and TRT were done.
Questions or interest : Does the new Drug reduce the Diastolic BP with time?
Covariates : Sex,Age and a Baseline [BP0] record.
Response: Yit are the BP at 1 months intervals: [BP0] BP1 BP2 BP3 BP4
Duration 4 months follow-up from Baseline inwards.
Note: Data from ID1 was removed to create unbalanced design and show up difficulties in running within subject ANOVA..
Note that Baseline is one a covariate once as a real measure in time=0 and we will explore how this possibilities act for inferences and modelling.
The data come form the book CLINICAL DATA ANALYSIS USING R
(Chen & Peace , 2011) data extracted from the 1967 Trial:
Some question arouse:
Does the author have copied partially the data? Does it exist an
metadata of the DOE ?
Such questions must be answered before proceeding to the analysis.
Note that 20 id in Placebo group B and 20[-1]5 id in trt group [A] seems to be a phase II trial as n is low and the reader should bear in mind at this stage of analysis non significant result might come from the too low power of the study design, especially in longitudinal data.
In present study reader must be aware that regression to the mean might occur when :
1: When part of a population with abnormal value are selected
2: Random process occur under Normal law
3: In repeated measures RTM is exclusively noticeable
Note : could be attributed to the Placebo effect from practitioner
By RCT design we expect that RTM is the same magnitude in both arms and hence is accounted for in assignment of trt.
From the author:
We won’t account for sake of simplicity in the design, study of within between subject and LMM will be exposed in other blog Using lm in Longitudinal data is a flaw as RM are correlated but coefficients are usually usable but the SE for inferences.
We also try to use different plotting packages i.e ggplot2 and lattice to present the reader the pro and cons of each one.(let at your appraisal)
## [1] "GROUP A= TRTEATMENT B= PLACEBO"
| id | trt | baseline | bp1 | bp2 | bp3 | bp4 | age | sex |
|---|---|---|---|---|---|---|---|---|
| 2 | Drug | 116 | 113 | 112 | 103 | 101 | 51 | M |
| 3 | Drug | 119 | 115 | 113 | 104 | 98 | 48 | F |
| 4 | Drug | 115 | 113 | 112 | 109 | 101 | 42 | F |
| 5 | Drug | 116 | 112 | 107 | 104 | 105 | 49 | M |
| 6 | Drug | 117 | 112 | 113 | 104 | 102 | 47 | M |
| 7 | Drug | 118 | 111 | 100 | 109 | 99 | 50 | F |
| 8 | Drug | 120 | 115 | 113 | 102 | 102 | 61 | M |
| 9 | Drug | 114 | 112 | 113 | 109 | 103 | 43 | M |
| 10 | Drug | 115 | 113 | 108 | 106 | 97 | 51 | M |
| 11 | Drug | 117 | 112 | 110 | 109 | 101 | 47 | F |
| 12 | Drug | 116 | 115 | 113 | 109 | 102 | 45 | M |
| 13 | Drug | 119 | 117 | 110 | 106 | 104 | 54 | F |
| 14 | Drug | 118 | 115 | 113 | 102 | 99 | 52 | M |
| 15 | Drug | 115 | 112 | 108 | 105 | 102 | 42 | M |
| 16 | Drug | 114 | 111 | 111 | 107 | 100 | 44 | F |
| 17 | Drug | 117 | 114 | 110 | 108 | 102 | 48 | M |
| 18 | Drug | 120 | 115 | 113 | 107 | 103 | 63 | F |
| 19 | Drug | 114 | 113 | 109 | 104 | 100 | 41 | M |
| 20 | Drug | 117 | 115 | 113 | 109 | 101 | 51 | M |
| 21 | Placebo | 114 | 115 | 113 | 111 | 113 | 39 | M |
| 22 | Placebo | 116 | 114 | 114 | 109 | 110 | 40 | F |
| 23 | Placebo | 114 | 115 | 113 | 111 | 109 | 39 | F |
| 24 | Placebo | 114 | 115 | 113 | 114 | 115 | 38 | M |
| 25 | Placebo | 116 | 113 | 113 | 109 | 109 | 39 | F |
| 26 | Placebo | 114 | 115 | 114 | 111 | 110 | 41 | M |
| 27 | Placebo | 119 | 118 | 118 | 117 | 115 | 56 | F |
| 28 | Placebo | 118 | 117 | 117 | 116 | 112 | 56 | M |
| 29 | Placebo | 114 | 113 | 113 | 109 | 108 | 38 | M |
| 30 | Placebo | 120 | 115 | 113 | 113 | 113 | 57 | M |
| 31 | Placebo | 117 | 115 | 113 | 114 | 115 | 47 | F |
| 32 | Placebo | 118 | 114 | 112 | 109 | 110 | 48 | M |
| 33 | Placebo | 121 | 119 | 117 | 114 | 115 | 61 | F |
| 34 | Placebo | 116 | 115 | 116 | 114 | 111 | 49 | M |
| 35 | Placebo | 118 | 118 | 113 | 113 | 112 | 52 | M |
| 36 | Placebo | 119 | 115 | 115 | 114 | 111 | 55 | F |
| 37 | Placebo | 116 | 114 | 113 | 109 | 109 | 45 | F |
| 38 | Placebo | 116 | 115 | 114 | 114 | 112 | 42 | M |
| 39 | Placebo | 117 | 115 | 113 | 114 | 115 | 49 | F |
| 40 | Placebo | 118 | 114 | 114 | 114 | 115 | 50 | F |
| id | trt | baseline | bp1 | bp2 | bp3 | bp4 | age | sex | |
|---|---|---|---|---|---|---|---|---|---|
| Min. : 2.0 | Drug :19 | Min. :114.0 | Min. :111.0 | Min. :100.0 | Min. :102.0 | Min. : 97.0 | Min. :38.00 | Length:39 | |
| 1st Qu.:11.5 | Placebo:20 | 1st Qu.:115.0 | 1st Qu.:113.0 | 1st Qu.:112.0 | 1st Qu.:106.5 | 1st Qu.:101.5 | 1st Qu.:42.00 | Class :character | |
| Median :21.0 | NA | Median :117.0 | Median :115.0 | Median :113.0 | Median :109.0 | Median :108.0 | Median :48.00 | Mode :character | |
| Mean :21.0 | NA | Mean :116.7 | Mean :114.3 | Mean :112.4 | Mean :109.4 | Mean :106.7 | Mean :47.95 | NA | |
| 3rd Qu.:30.5 | NA | 3rd Qu.:118.0 | 3rd Qu.:115.0 | 3rd Qu.:113.0 | 3rd Qu.:113.5 | 3rd Qu.:112.0 | 3rd Qu.:51.50 | NA | |
| Max. :40.0 | NA | Max. :121.0 | Max. :119.0 | Max. :118.0 | Max. :117.0 | Max. :115.0 | Max. :63.00 | NA |
## [1] "How to process data from wide to long format creating a time variable in one RM variable SCORE is given in the attached code: Long format are desired in eased plotting and mixed models analysis"
## [1] "Range baseline A"
## [1] Inf -Inf
## [1] "Range baseline B"
## [1] Inf -Inf
Longitudinal data show often an increased variances in Time due to trt effect, between id difference and within variances: To account for such “type” of variance is not an easy task and terminology between statistician not compatible.
Here we not a first phase a between id variance (“space between point”) B0 and at B2 we notciced some individualization
At time 2-3 the between variance as taken larger values and the 6-7 contribute to the pattern also
At the end of the study we see clustered trt individualization and a stable between patients variance.
Now i will present on lattice same plot faceted by trt:
## NULL
Wiggly pattern occur on both groups
Pronounced slopping is obvious in trt Drug group
Clear group cluster occur after 4 Months. We will try in another blog with MDS techniques to catch that similarities pattern.
Flattening up on the Placebo group is also noticeable that is in polynomial LM we might expect negative vs positive > 1 order coefficients to occurs due to different curvature.
In the A group a step-down slop occurs after 2 months of active drug Conversely flattening occurs in the Placebo group after 2 months. A linear assumptions might be a crude assumptions but for a accurate modelling ; piece linear of polynomial trend should be envisioned to capture the pattern at 2-3 months periods.
## 'data.frame': 5 obs. of 2 variables:
## $ Drug : chr "116.6842" "113.4211" "110.5789" "106.1053" ...
## $ Placebo: chr "116.7500" "115.2000" "114.0500" "112.4500" ...
| Drug | Placebo | difference | |
|---|---|---|---|
| baseline | 116.6842 | 116.7500 | 0 |
| bp1 | 113.4211 | 115.2000 | -2 |
| bp2 | 110.5789 | 114.0500 | -4 |
| bp3 | 106.1053 | 112.4500 | -6 |
| bp4 | 101.1579 | 111.9500 | -10 |
about sd within and between sphericity
## sd.TRT sd.baseline sd.bp1 sd.bp2 sd.bp3 sd.bp4
## 1 A 1.945154 1.677160 3.271354 2.558097 2.007297
## 2 B 2.124419 1.609184 1.669384 2.502104 2.438183
## 'data.frame': 195 obs. of 7 variables:
## $ id : int 2 3 4 5 6 7 8 9 10 11 ...
## $ trt : Factor w/ 2 levels "Drug","Placebo": 1 1 1 1 1 1 1 1 1 1 ...
## $ age : int 51 48 42 49 47 50 61 43 51 47 ...
## $ sex : chr "M" "F" "F" "M" ...
## $ bp : Factor w/ 5 levels "baseline","bp1",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ Score: int 116 119 115 116 117 118 120 114 115 117 ...
## $ time : num 0 0 0 0 0 0 0 0 0 0 ...
## <list_of<
## tbl_df<
## trt : factor<76d77>
## time: double
## Y sd: double
## >
## >[2]>
## [[1]]
## # A tibble: 5 × 3
## trt time `Y sd`
## <fct> <dbl> <dbl>
## 1 Drug 0 1.95
## 2 Drug 1 1.68
## 3 Drug 2 3.27
## 4 Drug 3 2.56
## 5 Drug 4 2.01
##
## [[2]]
## # A tibble: 5 × 3
## trt time `Y sd`
## <fct> <dbl> <dbl>
## 1 Placebo 0 2.12
## 2 Placebo 1 1.61
## 3 Placebo 2 1.67
## 4 Placebo 3 2.50
## 5 Placebo 4 2.44
## <list_of<
## tbl_df<
## trt : factor<76d77>
## time: double
## Y sd: double
## >
## >[2]>
## [[1]]
## # A tibble: 5 × 3
## trt time `Y sd`
## <fct> <dbl> <dbl>
## 1 Drug 0 1.95
## 2 Drug 1 1.68
## 3 Drug 2 3.27
## 4 Drug 3 2.56
## 5 Drug 4 2.01
##
## [[2]]
## # A tibble: 5 × 3
## trt time `Y sd`
## <fct> <dbl> <dbl>
## 1 Placebo 0 2.12
## 2 Placebo 1 1.61
## 3 Placebo 2 1.67
## 4 Placebo 3 2.50
## 5 Placebo 4 2.44
## [1] 0.534202
CD
Effect of Drug in lowering is BP at 4 months time is obvious in graphs:
Dotplot + Boxplot show clear clustered drops - no overlap - no outlier
Placebo[B] show no 4th quartile that is Normality assumptions seems doubty
Slope of DRUG A are steeper and more parallel Specila pattern for id 7
Placebo exhibit lower value drop and greater variability in slope :
Note: Keep scaling and range as fixed for both graph Note that on Placebo group several value of Baseline are duplicated despite aspect of that graph check it with command (duplicated(BP\(baseline and dp4[BP\)trt==“B”]))
Some figures check to asses some of the assumptions for further inferences:
## [1] "BP baseline value"
## A B
## 116.6842 116.7500
## [1] "sd B0 value"
## A B
## 2.007297 2.438183
## [1] "Difference from baseline to 4M"
Baseline value mean: around 116.7 both arms: B 116.75 / A 116.68 mmHG:
Note: We don’t know if matching occurred in DOE
SD in baseline is the same that is the one population assumptions
## A B
## 15.56211 4.77000
## [1] "A drop of 15 mmHG might be expected by trt A vs 5 mmHG"
Pooled B0 SD about: 2.2 mmHG
That is:
Drop is greater than resp.pooled sd
High effect size might be expected even in PLA group.
Design experiment plans usually a >= 10 mmHg drop as first clinical intention (i.e Captopril ECA inhibitor).
Although not planned to discuss deeply in this blog (nomrality etc) we report two type of t.test:
2 sample T test pooled variance at 4Months [Naive approach]
DoD T.test pooled variance
Lets graphs first the boxplot of the difference time 4M to baseline value:
The eyes of the “JCVD aware statistician” can tell that if no overlapping( indentation of IQR segment occur a significant difference between arm is likely: Let test it:
## [1] "T-test 4month point"
##
## Two Sample t-test
##
## data: BP$bp4 by BP$trt
## t = -15.046, df = 37, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## -12.245436 -9.338774
## sample estimates:
## mean in group A mean in group B
## 101.1579 111.9500
Now a clever way to use baseline is to make a DOD T test:
## [1] "T-test DOD difference [Baseline-4months] value score"
##
## Two Sample t-test
##
## data: BP$baseline - BP$bp4 by BP$trt
## t = 13.203, df = 37, p-value = 1.427e-15
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
## 9.080235 12.372397
## sample estimates:
## mean in group A mean in group B
## 15.52632 4.80000
## [1] "print making the difference between 2 RV (BBO-B4) remove covariance as pairing the data within groups:Mistake Often seen: don't put Paired =true between two group as group are independant each other: that is only the RM obesrvations within an ID in group that is paired"
CONFINT: [9.34-12.24]
CONFINT: [9.08-12.37]
The interpretation is left at the reader : However note that usually we expect with DOD more precise and difference in the result but here not as much as expected. even less
What could be the explanation?
Looking at correlation / covariance in both arms presented below and which value is below <0.5 might explain why such result occurs.
Differentiating is not removing as much covariance from the DoD test VAR[X-Y] = VARX + VARY - 2 COV XY
This might be a sign also to researcher, that a linear assumptions between baseline and time point 4 might be doubtful due to low correlations (See xyplot by id above) and a CPS (Compound Symmetry structure) of the COV matrix might be doubtful.
## [1] "figures now: within groups and all for baseline-4M datapoints"
## [1] 0.09885125
## [1] 0.4445472
## [1] 0.127816
| TRT | COR B0-BP4M |
|---|---|
| A | 0.10 |
| B | 0.45 |
| OVERALL | 0.12 |
| Drug | Placebo | |
|---|---|---|
| 0 | 1.945154 | 2.124419 |
| 1 | 1.677160 | 1.609184 |
| 2 | 3.271354 | 1.669384 |
| 3 | 2.558097 | 2.502104 |
| 4 | 2.007297 | 2.438183 |
Here Baseline will be considered as a Fixed controlled co-variate, not a time co-variate,that is not changing with time.
Note : Note on Contrasts:
To have a possible lowering trend value of the active trt contrast should be done as “TREATMENT” contrast to the reference Placebo group = B as reference level factor.We expect under this kind of contrast a negative regression coefficient for the active drug A if effective.
Remark:
If you want an orthogonal testing lm ANOVA type III, contrast must be set to sum!.
## [1] "confirm type of contrast set and change to PLACEBO 0 as refs"
## B
## A 0
## B 1
m1 = lm(BP\(bp4~BP\)trt)
m2 = lm(BP\(bp4~BP\)trt+BP$baseline)
On simple linear model as TRT is a factor the beta coefficients reflect the decrease in mean in BP from placebo to trt (contr trt ref PLA) and baseline , a continuous covariates represent a slope respectively.
What is special in this presented Technics here :
- Baseline Wald test and value must not be interpreted.
- Including a baseline as a [usually] a massive impact on SE if cor TRT-B0 > 0.5
##
## Call:
## lm(formula = BP$bp4 ~ BP$trt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.158 -1.950 0.050 1.446 3.842
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 101.1579 0.5136 196.94 <2e-16 ***
## BP$trtB 10.7921 0.7173 15.05 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.239 on 37 degrees of freedom
## Multiple R-squared: 0.8595, Adjusted R-squared: 0.8557
## F-statistic: 226.4 on 1 and 37 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = BP$bp4 ~ BP$trt + BP$baseline)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9210 -1.6971 -0.2505 1.5233 4.0676
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.7089 20.3743 3.078 0.00397 **
## BP$trtB 10.7704 0.6937 15.525 < 2e-16 ***
## BP$baseline 0.3295 0.1746 1.888 0.06715 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.165 on 36 degrees of freedom
## Multiple R-squared: 0.8722, Adjusted R-squared: 0.8651
## F-statistic: 122.8 on 2 and 36 DF, p-value: < 2.2e-16
\[ \operatorname{\widehat{BP\$bp4}} = 101.16 + 10.79() \] \[ \operatorname{\widehat{BP\$bp4}} = 62.71 + 10.77() + 0.33(\operatorname{BP\$baseline}) \]
## [1] "confidence intervals of coeeficients m1 vs m2"
## 2.5 % 97.5 %
## (Intercept) 100.12 102.20
## BP$trtB 9.34 12.25
## 2.5 % 97.5 %
## (Intercept) 21.39 104.03
## BP$trtB 9.36 12.18
## BP$baseline -0.02 0.68
## [1] "plotm1 fitted values"
## [1] "plotm2 fitted values"
TRT is highly significant in both models.
Recall:On simple linear model with TRT is a factor the beta coefficients reflect the decrease in mean in BP from placebo (contrast trt ref PLA)
The SE of the TRT including a baseline is a of little gain in precision hence Inference but usually when higher correlation TRT -B= exist SE are far more impressive.
Note the T value for baseline is the same for anova model comparison
Note the Anova model comparison is a limited value here anova(m1,m2)
A little gain is done here in this dataset with but with prediction points of view a lot is done by adding a continuous covariates (ANCOVA sensus.lato) vs an ANOVA: lm as E(Y|X) is always better that the mean.
However the prediction of the drug A is fairly in accordance with true value.
We notice a slight but negligible difference in type I and III due to missing ID1 unbalanced:
## Anova Table (Type III tests)
##
## Response: BP$bp4
## Sum Sq Df F value Pr(>F)
## (Intercept) 44.41 1 9.4731 0.003973 **
## BP$baseline 16.71 1 3.5634 0.067149 .
## BP$trt 1129.97 1 241.0299 < 2.2e-16 ***
## Residuals 168.77 36
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Sum Sq Mean Sq F value Pr(>F)
## BP$trt 1 1134.8 1134.8 242.067 <2e-16 ***
## BP$baseline 1 16.7 16.7 3.563 0.0671 .
## Residuals 36 168.8 4.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now lets explore an DOD Y | X = (BP4- BPbaseline) “like” regression with & without baseline in the X covariate
m3 = lm(BP\(bp4-BP\)Baseline~BP$trt)
m2 = lm(BP$bp4-BP$baseline~BP$trt+BP$baseline)
##
## Call:
## lm(formula = BP$diff4b ~ BP$trt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.474 -1.837 -0.200 1.526 5.800
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -15.5263 0.5818 -26.69 < 2e-16 ***
## BP$trtB 10.7263 0.8124 13.20 1.43e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.536 on 37 degrees of freedom
## Multiple R-squared: 0.8249, Adjusted R-squared: 0.8202
## F-statistic: 174.3 on 1 and 37 DF, p-value: 1.427e-15
##
## Call:
## lm(formula = BP$diff4b ~ BP$baseline + BP$trt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9210 -1.6971 -0.2505 1.5233 4.0676
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.7089 20.3743 3.078 0.003973 **
## BP$baseline -0.6705 0.1746 -3.841 0.000478 ***
## BP$trtB 10.7704 0.6937 15.525 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.165 on 36 degrees of freedom
## Multiple R-squared: 0.8758, Adjusted R-squared: 0.8689
## F-statistic: 126.9 on 2 and 36 DF, p-value: < 2.2e-16
## 2.5 % 97.5 %
## (Intercept) -16.71 -14.35
## BP$trtB 9.08 12.37
## 2.5 % 97.5 %
## (Intercept) 21.39 104.03
## BP$baseline -1.02 -0.32
## BP$trtB 9.36 12.18
\[ \operatorname{\widehat{BP\$bp4}} = 101.16 + 10.79() \] \[ \operatorname{\widehat{BP\$bp4}} = 62.71 + 10.77() + 0.33(\operatorname{BP\$baseline}) \]
Here nothing is really gain in R2 squared including a baseline in Xis (but nothing is really lost too)!
No co-linearity is observed in the hat matrix because it depends only on X (VIF ::car Variance inflation factor): However we usually check co-linearity with 2 continuous variables but including baseline left and right from the equation is not ill conditioned matrix result.
## [1] "VIF m4"
## BP$baseline BP$trt
## 1.000274 1.000274
Note : At this stage the proposal of robust estimator is not of interest no repeated measures designed are made here on Yis even subtraction baseline (correlation is removed).
Lets summarise the model performance
##
## Call:
## lm(formula = BP$bp4 ~ BP$trt + BP$baseline)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9210 -1.6971 -0.2505 1.5233 4.0676
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.7089 20.3743 3.078 0.00397 **
## BP$trtB 10.7704 0.6937 15.525 < 2e-16 ***
## BP$baseline 0.3295 0.1746 1.888 0.06715 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.165 on 36 degrees of freedom
## Multiple R-squared: 0.8722, Adjusted R-squared: 0.8651
## F-statistic: 122.8 on 2 and 36 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = BP$diff4b ~ BP$baseline + BP$trt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9210 -1.6971 -0.2505 1.5233 4.0676
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.7089 20.3743 3.078 0.003973 **
## BP$baseline -0.6705 0.1746 -3.841 0.000478 ***
## BP$trtB 10.7704 0.6937 15.525 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.165 on 36 degrees of freedom
## Multiple R-squared: 0.8758, Adjusted R-squared: 0.8689
## F-statistic: 126.9 on 2 and 36 DF, p-value: < 2.2e-16
Well R2 sq are similar , RMSE the same:
As advice choose the m2 model lm BP4~Baseline + trt in the original scale that the one mostly used in preliminary analysis.[see Zhang,2014]
Including baseline improved definitely the prediction of your model (R2sq, SE of Coeff) especially if correlation is above 0.5.
Let first explore the possibility of an interactions time * trt effect: As mentioned Interactions , that is slope in time with trt can be visualized as such on the interaction plot. Time*trt effect show how the magitude of drugs acts in time (and induce a increase in overall variance )
Therefore it might be assessed that trt difference between arms is effective in time elapsed: (fan out sloping)
Before you dataset must be revamped in long format and adding a static baseline in long format
| id | trt | age | sex | bp | Score | time |
|---|---|---|---|---|---|---|
| 2 | A | 51 | M | baseline | 116 | 0 |
| 3 | A | 48 | F | baseline | 119 | 0 |
| 4 | A | 42 | F | baseline | 115 | 0 |
| 5 | A | 49 | M | baseline | 116 | 0 |
| 6 | A | 47 | M | baseline | 117 | 0 |
| 7 | A | 50 | F | baseline | 118 | 0 |
Lets explore a basic longitudinal model where baseline is included as a time points:
ml1 Y~only time
ml2 Y~time+trt
ml3 time*trt (ANOCOVA)
## [1] "ml1"
##
## Call:
## lm(formula = Score ~ time, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.8923 -2.3897 0.1077 2.1051 8.1128
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 116.8974 0.4560 256.37 <2e-16 ***
## time -2.5026 0.1861 -13.44 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.676 on 193 degrees of freedom
## Multiple R-squared: 0.4836, Adjusted R-squared: 0.4809
## F-statistic: 180.7 on 1 and 193 DF, p-value: < 2.2e-16
## [1] "ml2"
##
## Call:
## lm(formula = Score ~ time + trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5895 -1.8347 -0.0851 2.4054 5.9251
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 114.5946 0.4195 273.16 <2e-16 ***
## time -2.5026 0.1474 -16.98 <2e-16 ***
## trtB 4.4905 0.4169 10.77 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.91 on 192 degrees of freedom
## Multiple R-squared: 0.6781, Adjusted R-squared: 0.6747
## F-statistic: 202.2 on 2 and 192 DF, p-value: < 2.2e-16
## [1] "ml3"
##
## Call:
## lm(formula = Score ~ time * trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5895 -1.4263 -0.2632 1.5737 4.4500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 117.2632 0.3997 293.383 <2e-16 ***
## time -3.8368 0.1632 -23.514 <2e-16 ***
## trtB -0.7132 0.5581 -1.278 0.203
## time:trtB 2.6018 0.2279 11.419 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.249 on 191 degrees of freedom
## Multiple R-squared: 0.8087, Adjusted R-squared: 0.8057
## F-statistic: 269.1 on 3 and 191 DF, p-value: < 2.2e-16
## [1] "conf:int ml1,2,3"
## 2.5 % 97.5 %
## (Intercept) 115.998108 117.796764
## time -2.869713 -2.135415
## 2.5 % 97.5 %
## (Intercept) 113.767165 115.422038
## time -2.793201 -2.211927
## trtB 3.668210 5.312843
## 2.5 % 97.5 %
## (Intercept) 116.474779 118.0515364
## time -4.158696 -3.5149879
## trtB -1.814069 0.3877528
## time:trtB 2.152397 3.0512870
## [1] "anova ssq ml1,2,3"
## Analysis of Variance Table
##
## Model 1: Score ~ time
## Model 2: Score ~ time + trt
## Model 3: Score ~ time * trt
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 193 2608.24
## 2 192 1625.85 1 982.39 194.19 < 2.2e-16 ***
## 3 191 966.25 1 659.60 130.38 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## [1] "sigma ml1,2,3"
## [1] 3.676163
## [1] 2.909975
## [1] 2.249196
##
## Call:
## lm(formula = Score ~ poly(time, 3), data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.119 -2.493 0.322 2.101 8.348
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 111.8923 0.2643 423.399 <2e-16 ***
## poly(time, 3)1 -49.4217 3.6904 -13.392 <2e-16 ***
## poly(time, 3)2 -2.6534 3.6904 -0.719 0.473
## poly(time, 3)3 -0.1519 3.6904 -0.041 0.967
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.69 on 191 degrees of freedom
## Multiple R-squared: 0.485, Adjusted R-squared: 0.4769
## F-statistic: 59.96 on 3 and 191 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = Score ~ poly(time, 2) * trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.3038 -1.5489 -0.1971 1.2729 4.2729
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 109.5895 0.2274 481.934 < 2e-16 ***
## poly(time, 2)1 -75.7716 3.1754 -23.862 < 2e-16 ***
## poly(time, 2)2 -8.3452 3.1754 -2.628 0.00929 **
## trtB 4.4905 0.3175 14.142 < 2e-16 ***
## poly(time, 2)1:trtB 51.3823 4.4342 11.588 < 2e-16 ***
## poly(time, 2)2:trtB 11.0992 4.4342 2.503 0.01316 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.216 on 189 degrees of freedom
## Multiple R-squared: 0.8162, Adjusted R-squared: 0.8113
## F-statistic: 167.8 on 5 and 189 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: Score ~ poly(time, 2) * trt
## Model 2: Score ~ poly(time, 3)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 189 928.43
## 2 191 2601.17 -2 -1672.7 170.26 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Linear mixed-effects model fit by REML
## Data: new
## AIC BIC logLik
## 863.3417 882.8554 -425.6709
##
## Random effects:
## Formula: ~1 | id
## (Intercept) Residual
## StdDev: 1.178514 1.927075
##
## Fixed effects: Score ~ time * trt
## Value Std.Error DF t-value p-value
## (Intercept) 117.26316 0.4363163 154 268.75724 0.0000
## time -3.83684 0.1398047 154 -27.44430 0.0000
## trtB -0.71316 0.6092825 37 -1.17049 0.2493
## time:trtB 2.60184 0.1952266 154 13.32729 0.0000
## Correlation:
## (Intr) time trtB
## time -0.641
## trtB -0.716 0.459
## time:trtB 0.459 -0.716 -0.641
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.23589250 -0.55319361 -0.04672961 0.53978532 2.42541707
##
## Number of Observations: 195
## Number of Groups: 39
## Linear mixed-effects model fit by REML
## Data: new
## AIC BIC logLik
## 866.1673 892.1855 -425.0837
##
## Random effects:
## Formula: ~time | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 1.113436 (Intr)
## time 0.276513 -0.097
## Residual 1.878806
##
## Fixed effects: Score ~ time * trt
## Value Std.Error DF t-value p-value
## (Intercept) 117.26316 0.4203812 154 278.94482 0.0000
## time -3.83684 0.1503418 154 -25.52079 0.0000
## trtB -0.71316 0.5870304 37 -1.21486 0.2321
## time:trtB 2.60184 0.2099409 154 12.39321 0.0000
## Correlation:
## (Intr) time trtB
## time -0.613
## trtB -0.716 0.439
## time:trtB 0.439 -0.716 -0.613
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.32273721 -0.56636844 -0.04703011 0.52025097 2.58014393
##
## Number of Observations: 195
## Number of Groups: 39
It is questionable in lmer model that dual ,interactions as fixed effects and RIAS slope model might be overfitting. Well the options here is as the grapgh show obvisouly andfixed effect interaction (interaction plot) but the respective id curve shows no obvious erratic change in slope population RIAS might be diseragred for simplicity (RAZOR AKHAM).
Wrong way to do it: Wide and bp as factor time!
##
## Call:
## lm(formula = BP$bp4 ~ BP$bp1 + BP$bp3 + BP$baseline + BP$trt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7508 -1.3071 -0.3603 1.4835 5.0010
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.9953 24.9821 0.800 0.429
## BP$bp1 0.2199 0.2498 0.881 0.385
## BP$bp3 0.3484 0.1328 2.623 0.013 *
## BP$baseline 0.1650 0.1971 0.837 0.409
## BP$trtB 8.1794 1.0747 7.611 7.64e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.983 on 34 degrees of freedom
## Multiple R-squared: 0.8988, Adjusted R-squared: 0.8869
## F-statistic: 75.47 on 4 and 34 DF, p-value: < 2.2e-16
One need to cast the data into long format , required format for longitudinal data:
## [1] 116 119 115 116 117 118 120 114 115 117 116 119 118 115 114 117 120 114 117
## [20] 114 116 114 114 116 114 119 118 114 120 117 118 121 116 118 119 116 116 117
## [39] 118
## [1] 116 119 115 116 117 118 120 114 115 117 116 119 118 115 114 117 120 114
## [19] 117 114 116 114 114 116 114 119 118 114 120 117 118 121 116 118 119 116
## [37] 116 117 118 116 119 115 116 117 118 120 114 115 117 116 119 118 115 114
## [55] 117 120 114 117 114 116 114 114 116 114 119 118 114 120 117 118 121 116
## [73] 118 119 116 116 117 118 116 119 115 116 117 118 120 114 115 117 116 119
## [91] 118 115 114 117 120 114 117 114 116 114 114 116 114 119 118 114 120 117
## [109] 118 121 116 118 119 116 116 117 118 116 119 115 116 117 118 120 114 115
## [127] 117 116 119 118 115 114 117 120 114 117 114 116 114 114 116 114 119 118
## [145] 114 120 117 118 121 116 118 119 116 116 117 118 116 119 115 116 117 118
## [163] 120 114 115 117 116 119 118 115 114 117 120 114 117 114 116 114 114 116
## [181] 114 119 118 114 120 117 118 121 116 118 119 116 116 117 118
##
## Call:
## lm(formula = Score ~ time, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.8923 -2.3897 0.1077 2.1051 8.1128
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 116.8974 0.4560 256.37 <2e-16 ***
## time -2.5026 0.1861 -13.44 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.676 on 193 degrees of freedom
## Multiple R-squared: 0.4836, Adjusted R-squared: 0.4809
## F-statistic: 180.7 on 1 and 193 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = Score ~ time + trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5895 -1.8347 -0.0851 2.4054 5.9251
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 114.5946 0.4195 273.16 <2e-16 ***
## time -2.5026 0.1474 -16.98 <2e-16 ***
## trtB 4.4905 0.4169 10.77 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.91 on 192 degrees of freedom
## Multiple R-squared: 0.6781, Adjusted R-squared: 0.6747
## F-statistic: 202.2 on 2 and 192 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = Score ~ time + trt + b0, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.1859 -2.1675 0.2599 2.1481 7.1716
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 61.70601 11.66924 5.288 3.37e-07 ***
## time -2.50256 0.14037 -17.828 < 2e-16 ***
## trtB 4.46071 0.39722 11.230 < 2e-16 ***
## b0 0.45326 0.09995 4.535 1.02e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.772 on 191 degrees of freedom
## Multiple R-squared: 0.7094, Adjusted R-squared: 0.7048
## F-statistic: 155.4 on 3 and 191 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = Score ~ time * trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.5895 -1.4263 -0.2632 1.5737 4.4500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 117.2632 0.3997 293.383 <2e-16 ***
## time -3.8368 0.1632 -23.514 <2e-16 ***
## trtB -0.7132 0.5581 -1.278 0.203
## time:trtB 2.6018 0.2279 11.419 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.249 on 191 degrees of freedom
## Multiple R-squared: 0.8087, Adjusted R-squared: 0.8057
## F-statistic: 269.1 on 3 and 191 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = Score ~ time * trt + b0, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.1859 -1.2015 -0.0998 1.1424 4.6365
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 64.37457 8.68442 7.413 3.97e-12 ***
## time -3.83684 0.14963 -25.643 < 2e-16 ***
## trtB -0.74298 0.51182 -1.452 0.148
## b0 0.45326 0.07436 6.095 5.96e-09 ***
## time:trtB 2.60184 0.20894 12.453 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.062 on 190 degrees of freedom
## Multiple R-squared: 0.84, Adjusted R-squared: 0.8366
## F-statistic: 249.3 on 4 and 190 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: Score ~ time * trt
## Model 2: Score ~ time * trt + b0
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 191 966.25
## 2 190 808.20 1 158.05 37.155 5.96e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 2.5 % 97.5 %
## (Intercept) 116.474779 118.0515364
## time -4.158696 -3.5149879
## trtB -1.814069 0.3877528
## time:trtB 2.152397 3.0512870
## 2.5 % 97.5 %
## (Intercept) 47.2442980 81.5048321
## time -4.1319830 -3.5417013
## trtB -1.7525610 0.2666053
## b0 0.3065846 0.5999406
## time:trtB 2.1897003 3.0139839
##
## Call:
## lm(formula = Score ~ poly(time, 3), data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.119 -2.493 0.322 2.101 8.348
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 111.8923 0.2643 423.399 <2e-16 ***
## poly(time, 3)1 -49.4217 3.6904 -13.392 <2e-16 ***
## poly(time, 3)2 -2.6534 3.6904 -0.719 0.473
## poly(time, 3)3 -0.1519 3.6904 -0.041 0.967
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.69 on 191 degrees of freedom
## Multiple R-squared: 0.485, Adjusted R-squared: 0.4769
## F-statistic: 59.96 on 3 and 191 DF, p-value: < 2.2e-16
##
## Call:
## lm(formula = Score ~ poly(time, 2) * trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.3038 -1.5489 -0.1971 1.2729 4.2729
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 109.5895 0.2274 481.934 < 2e-16 ***
## poly(time, 2)1 -75.7716 3.1754 -23.862 < 2e-16 ***
## poly(time, 2)2 -8.3452 3.1754 -2.628 0.00929 **
## trtB 4.4905 0.3175 14.142 < 2e-16 ***
## poly(time, 2)1:trtB 51.3823 4.4342 11.588 < 2e-16 ***
## poly(time, 2)2:trtB 11.0992 4.4342 2.503 0.01316 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.216 on 189 degrees of freedom
## Multiple R-squared: 0.8162, Adjusted R-squared: 0.8113
## F-statistic: 167.8 on 5 and 189 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Model 1: Score ~ poly(time, 2) * trt
## Model 2: Score ~ poly(time, 3)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 189 928.43
## 2 191 2601.17 -2 -1672.7 170.26 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Linear mixed-effects model fit by REML
## Data: new
## AIC BIC logLik
## 863.3417 882.8554 -425.6709
##
## Random effects:
## Formula: ~1 | id
## (Intercept) Residual
## StdDev: 1.178514 1.927075
##
## Fixed effects: Score ~ time * trt
## Value Std.Error DF t-value p-value
## (Intercept) 117.26316 0.4363163 154 268.75724 0.0000
## time -3.83684 0.1398047 154 -27.44430 0.0000
## trtB -0.71316 0.6092825 37 -1.17049 0.2493
## time:trtB 2.60184 0.1952266 154 13.32729 0.0000
## Correlation:
## (Intr) time trtB
## time -0.641
## trtB -0.716 0.459
## time:trtB 0.459 -0.716 -0.641
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.23589250 -0.55319361 -0.04672961 0.53978532 2.42541707
##
## Number of Observations: 195
## Number of Groups: 39
## Linear mixed-effects model fit by REML
## Data: new
## AIC BIC logLik
## 866.1673 892.1855 -425.0837
##
## Random effects:
## Formula: ~time | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 1.113436 (Intr)
## time 0.276513 -0.097
## Residual 1.878806
##
## Fixed effects: Score ~ time * trt
## Value Std.Error DF t-value p-value
## (Intercept) 117.26316 0.4203812 154 278.94482 0.0000
## time -3.83684 0.1503418 154 -25.52079 0.0000
## trtB -0.71316 0.5870304 37 -1.21486 0.2321
## time:trtB 2.60184 0.2099409 154 12.39321 0.0000
## Correlation:
## (Intr) time trtB
## time -0.613
## trtB -0.716 0.439
## time:trtB 0.439 -0.716 -0.613
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.32273721 -0.56636844 -0.04703011 0.52025097 2.58014393
##
## Number of Observations: 195
## Number of Groups: 39Model selection RI with interactions or RIAS
probably a reasonable assumptions for other linear model is an order 2 polynomial with trt poly order 2 fixe effect interaction interaction to account for change in simple curvature see longitudin l graph
The following model frame is one selected:
On other way a RIAS without/with intercations might be envision Despite the high value of random time effect (Slope effect) the path line graph of id dosen’t show up much variability agaisnt one individual at somme exeption.
Sevreals variance epress and some confusion might occur
Withi between longitidunal
It is ofeten reported that longitudinal variance is increasing heneec a CPS strucute of the cov matrix is innapropriate This fact can be explain by the fact that a dissference of two trt that one is working therefor an interaction with time is present the variance become increasing even seometimes xplosive( GArch model i.e). It is even proven in the ZDZt matrix of a RIAS model the variance in time slope become a squared or even 4 power of time spacing.
A time is a dynamic variable and is always a within subject factor.
Howvere a random intercept might be considered as a between id factor as by definition it is a varaince around the mean intercept and therefore characterize a between subject difference from mean population.
Maybe that accounting anfcfixed effect interaction plus a RAIAS model is too much?: It depends on the reserach thematic: For a prediction the advantage of individual time slope is an advantages If the study remain at the sample-population under study a simple RI with fixed effect interaction might more manageable!
Of course teh RIAS model account for the slope of the two treated groups as one random slope: But looking to the grapph variance is explained by group treatemnt: If you are willing to account exactly for group effect in random ness you therefore has to add an random interaction term slope with trt lme(Scoretime+trt,random=trt*time+time|id,data=new,control = lmeControl(maxIter = 2000 However no convergence is found with so low data points-
Another way to do is to consider time a ordered factor:
##
## Call:
## lm(formula = Score ~ ordered(time) * trt, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.5789 -1.4211 -0.1579 1.5789 4.5500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 109.58947 0.22898 478.600 <2e-16 ***
## ordered(time).L -12.13316 0.51201 -23.697 <2e-16 ***
## ordered(time).Q -1.33631 0.51201 -2.610 0.0098 **
## ordered(time).C -0.28294 0.51201 -0.553 0.5812
## ordered(time)^4 0.38373 0.51201 0.749 0.4545
## trtB 4.49053 0.31975 14.044 <2e-16 ***
## ordered(time).L:trtB 8.22775 0.71499 11.508 <2e-16 ***
## ordered(time).Q:trtB 1.77729 0.71499 2.486 0.0138 *
## ordered(time).C:trtB 0.50430 0.71499 0.705 0.4815
## ordered(time)^4:trtB -0.09688 0.71499 -0.135 0.8924
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.232 on 185 degrees of freedom
## Multiple R-squared: 0.8176, Adjusted R-squared: 0.8087
## F-statistic: 92.11 on 9 and 185 DF, p-value: < 2.2e-16
Result Still in accordance what was presented before : Linear if the main trend and a order 2 might be adequate for more precision on predictions.
Chen, D. (., Peace, K. E. (2011). Clinical Trial Data Analysis Using R. États-Unis: CRC Press.
Zhang, S., Paul, J.E., Nantha-Aree, M., Buckley, N., Shahzad, U., Cheng, J., Debeer, J., Winemaker, M.J., Wismer, D., Punthakee, D., Avram, V., & Thabane, L. (2014). Empirical comparison of four baseline covariate adjustment methods in analysis of continuous outcomes in randomized controlled trials. Clinical Epidemiology, 6, 227 - 235.
A first approcach would be to use a robust estimateor of the COV matrix knwon as the sandwich / Hubert estimators↩︎
Large clinical trials have shown that combined treatment with reserpine plus a thiazide diuretic reduces mortality of people with hypertension..↩︎
Back on tracking the history of this RCT (part of the job) it is not clear from source which treatment was tested either Resepinic or betaB…The intention here is to demonstrate the technical statistical aspect with repeated data not a biophysical full proof of a drug per se.↩︎
Missing value will be replace by a Propensity score matching approach in this slight unbalanced design to make an repeated Anova and result compared.↩︎
Naive because it doesn’t take into account repeated corrrelated measures hence partial information only Use LMM or GEE↩︎
Because we use only one type of information at time 4! some study are so designed but be aware that slope information might be misleading if one to cpare the baseline with each trt with low sample size such exposed in Captopril se xyplot study↩︎
[Naive model] doesn’t take accout of RM.↩︎