Interim

Author

Rawly

Data Load

Packages loaded: tidyverse 2.0.0 (dplyr 1.1.4, forcats 1.0.0, ggplot2 3.4.4, lubridate 1.9.3, purrr 1.0.2, readr 2.1.4, stringr 1.5.1, tibble 3.2.1, tidyr 1.3.0) and kableExtra.

ANALYSIS

1. Univariate Analyses

Continuous Variables

Study Duration, Ulcer Size, and Ulcer Age

Variable        Observations  Missing Values        Mean  Median         SD    Skewness
Study Duration            45               0  150.200000   182.0  79.036988  -0.2735203
Ulcer Size                45               0    1.715556     0.8   1.903222   2.1208651
Ulcer Age                 45               0   23.104444     9.0  31.619290   2.4790209
[Figures: histograms of the continuous variables]

Discrete Variables

Study Arm and Healed Status (Discrete Variables)

[Figures: count plots for Study Arm and Healed Status]

Summary Tables of Univariate Data

Summary of variables

Variable        Observations  Missing Values        Mean  Median         SD    Skewness
Study Duration            45               0  150.200000   182.0  79.036988  -0.2735203
Ulcer Size                45               0    1.715556     0.8   1.903222   2.1208651
Ulcer Age                 45               0   23.104444     9.0  31.619290   2.4790209

Variable  Category  Count  Proportion (%)
healed    No           23        51.11111
healed    Yes          22        48.88889
arm       Active       20        44.44444
arm       Placebo      25        55.55556

2. Bivariate Analysis

Table 1


                          Active               Placebo              P-value
                          (N=20)               (N=25)
Study Duration
  Mean (SD)               150 (88.9)           150 (72.1)           0.991
  Median [Min, Max]       193 [21.0, 296]      175 [40.0, 254]
Ulcer Age
  Mean (SD)               15.7 (18.6)          29.0 (38.4)          0.135
  Median [Min, Max]       8.50 [3.00, 74.5]    12.0 [4.00, 165]
Ulcer Start Size (cm2)
  Mean (SD)               1.80 (2.41)          1.65 (1.42)          0.805
  Median [Min, Max]       0.700 [0.300, 9.70]  1.20 [0.300, 5.50]
Healed Group
  No                      11 (55.0%)           12 (48.0%)           0.868
  Yes                     9 (45.0%)            13 (52.0%)

                          No                   Yes                  P-value
                          (N=23)               (N=22)
Study Duration
  Mean (SD)               210 (33.6)           87.3 (61.6)          <0.001
  Median [Min, Max]       210 [119, 296]       64.0 [21.0, 240]
Ulcer Age
  Mean (SD)               32.2 (39.2)          13.5 (17.2)          0.0454
  Median [Min, Max]       12.0 [4.00, 165]     8.00 [3.00, 74.5]
Ulcer Start Size (cm2)
  Mean (SD)               1.86 (1.67)          1.56 (2.15)          0.608
  Median [Min, Max]       1.20 [0.300, 6.00]   0.650 [0.300, 9.70]
Study Arm
  Active                  11 (47.8%)           9 (40.9%)            0.868
  Placebo                 12 (52.2%)           13 (59.1%)

Correlation Analysis

I will evaluate the following pairwise relationships:

  • Study Duration x Ulcer Age

  • Study Duration x Ulcer Start Size

  • Ulcer Start Size x Ulcer Age


[Figures: scatter plots with fitted trend lines]

T Tests

# A tibble: 3 × 9
  .y.                group1 group2     n1    n2 statistic    df     p p.signif
  <chr>              <chr>  <chr>   <int> <int>     <dbl> <dbl> <dbl> <chr>   
1 study_duration     Active Placebo    20    25   -0.0110  36.3 0.991 ns      
2 log_ulcer_age      Active Placebo    20    25   -1.39    42.9 0.17  ns      
3 log_ulcer_start_mm Active Placebo    20    25   -0.613   36.1 0.544 ns      
# A tibble: 3 × 9
  .y.                group1 group2    n1    n2 statistic    df        p p.signif
  <chr>              <chr>  <chr>  <int> <int>     <dbl> <dbl>    <dbl> <chr>   
1 study_duration     No     Yes       23    22      8.28  32.2  1.80e-9 ****    
2 log_ulcer_start_mm No     Yes       23    22      1.03  42.9  3.08e-1 ns      
3 log_ulcer_age      No     Yes       23    22      2.39  41.3  2.16e-2 *       

.y.                 group1  group2   n1  n2   statistic        df        p  p.signif
study_duration      Active  Placebo  20  25  -0.0109965  36.31761   0.9910  ns
log_ulcer_age       Active  Placebo  20  25  -1.3947724  42.94486   0.1700  ns
log_ulcer_start_mm  Active  Placebo  20  25  -0.6132606  36.09744   0.5440  ns
study_duration      No      Yes      23  22   8.2755613  32.17262  <0.0001  ****
log_ulcer_start_mm  No      Yes      23  22   1.0323140  42.93308   0.3080  ns
log_ulcer_age       No      Yes      23  22   2.3879907  41.26590   0.0216  *
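
The `study_duration` rows can be roughly cross-checked from the rounded means and SDs in Table 1. The sketch below assumes the tests are Welch (unequal-variance) t-tests, which the fractional degrees of freedom suggest; `welch_from_summary` is a hypothetical helper written for this check, not part of the original analysis. Because the inputs are rounded, the results will only approximately match the table.

```r
# Hypothetical helper: Welch t-test from summary statistics (mean, SD, n).
welch_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1                      # squared standard error, group 1
  v2 <- s2^2 / n2                      # squared standard error, group 2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)  # Welch t statistic
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p_two <- 2 * pt(-abs(t_stat), df)    # two-sided p-value
  c(statistic = t_stat, df = df, p = p_two)
}

# Study duration by arm (Active vs Placebo); values taken from Table 1.
# Both means round to 150, so the statistic here is exactly 0.
by_arm <- welch_from_summary(150, 88.9, 20, 150, 72.1, 25)

# Study duration by healed status (No vs Yes); values taken from Table 1.
by_healed <- welch_from_summary(210, 33.6, 23, 87.3, 61.6, 22)

round(rbind(by_arm, by_healed), 4)
```

The degrees of freedom agree with the reported 36.3 and 32.2 to one decimal place, and the small differences in the statistics come from the rounding of the inputs.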

Hypothesis Testing

We are interested in testing the hypothesis that patients in the “Active” arm of the study heal faster than those in the “Placebo” group, as evidenced by a shorter `study_duration`.

Hypothesis 1: Impact of Study Arm on Healing

  • Null Hypothesis ( \(H_0\) ): The proportion of patients who healed is independent of the study arm.

  • Alternative Hypothesis ( \(H_A\) ): The proportion of patients who healed is dependent on the study arm.
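
As a minimal sketch of how Hypothesis 1 could be tested, the counts from Table 1 can be entered as a 2x2 table and passed to base R's `chisq.test()`. The p-value of 0.868 reported in Table 1 appears consistent with a chi-square test using Yates' continuity correction, but the exact call used in the original analysis is not shown, so treat this as an assumption. The object names are illustrative.

```r
# 2x2 counts taken from Table 1 (rows: study arm, columns: healed status).
healed_by_arm <- matrix(
  c(11,  9,
    12, 13),
  nrow = 2, byrow = TRUE,
  dimnames = list(arm = c("Active", "Placebo"), healed = c("No", "Yes"))
)

# Pearson chi-square test of independence; correct = TRUE applies
# Yates' continuity correction (the base R default for 2x2 tables).
chi_res <- chisq.test(healed_by_arm, correct = TRUE)
round(chi_res$p.value, 3)

# Fisher's exact test as a cross-check, given the small cell counts.
fisher_res <- fisher.test(healed_by_arm)
round(fisher_res$p.value, 3)
```

Either way, the data give no evidence that healing status depends on the study arm.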

Hypothesis 2: Impact of Study Arm on Study Duration

  • \(H_0\): There is no difference in the study duration between patients in the “Active” and “Placebo” groups. ( \(\mu_{active} = \mu_{placebo}\) )
  • \(H_A\): Patients in the “Active” group have a shorter study duration than those in the “Placebo” group. ( \(\mu_{active} < \mu_{placebo}\) )

This is a one-sided test because we are only evaluating whether the Active arm performed better than Placebo.

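Since the mean study duration is the same in both arms, \(\bar{x}_{active} = 150\) (range 21 to 296) and \(\bar{x}_{placebo} = 150\) (range 40 to 254), and the Welch t-test gives a two-sided p-value of 0.991, we cannot reject the null hypothesis.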
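
The T Tests table reports a two-sided p-value, while \(H_A\) here is one-sided. The conversion can be sketched as follows, taking \(t = -0.011\) and \(df = 36.3\) from that table. Halving is valid only because the observed statistic is negative, that is, in the direction of \(H_A\).

```latex
\[
p_{\text{one-sided}}
  = P\left(T_{36.3} \le -0.011\right)
  = \frac{p_{\text{two-sided}}}{2}
  \approx \frac{0.991}{2}
  \approx 0.50
\]
```

A one-sided p-value of about 0.50 leads to the same conclusion: there is no evidence of a shorter study duration in the Active arm.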


Call:
glm(formula = healed_group ~ arm + area_cm + ulcer_age + ulcer_start_cm, 
    family = binomial, data = data)

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -0.194023   0.154862  -1.253     0.21    
armPlacebo      0.948589   0.181777   5.218 1.80e-07 ***
area_cm        -0.878282   0.104085  -8.438  < 2e-16 ***
ulcer_age      -0.035695   0.004766  -7.489 6.93e-14 ***
ulcer_start_cm  0.604841   0.078432   7.712 1.24e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 945.10  on 709  degrees of freedom
Residual deviance: 752.84  on 705  degrees of freedom
  (200 observations deleted due to missingness)
AIC: 762.84

Number of Fisher Scoring iterations: 5
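
The coefficients above are on the log-odds scale, so exponentiating them gives odds ratios. The sketch below does this by hand from the printed estimates and standard errors, with Wald 95% intervals. It assumes nothing about which level of `healed_group` is modeled as the event, since the output does not show it, and the object names are illustrative. Note also that the model was fit to 710 rows (709 null degrees of freedom plus one) rather than 45 patient-level records. If those rows are repeated measurements per patient, these intervals ignore within-patient correlation and may be too narrow.

```r
# Estimates and standard errors copied from the glm() summary above.
est <- c(armPlacebo = 0.948589, area_cm = -0.878282,
         ulcer_age = -0.035695, ulcer_start_cm = 0.604841)
se  <- c(armPlacebo = 0.181777, area_cm = 0.104085,
         ulcer_age = 0.004766, ulcer_start_cm = 0.078432)

z <- qnorm(0.975)  # critical value for a 95% Wald interval

or_table <- data.frame(
  odds_ratio = exp(est),           # odds ratio per one-unit increase
  ci_lower   = exp(est - z * se),  # lower 95% Wald bound
  ci_upper   = exp(est + z * se)   # upper 95% Wald bound
)
round(or_table, 3)
```

With the fitted model object available, `exp(coef(fit))` and `exp(confint.default(fit))` would give the same quantities directly.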

Kaplan-Meier Curves



Extra Stuff