Interim

Author

Rawly

Data Load

ANALYSIS

1. Univariate Analyses

Continuous Variables

Study Duration, Ulcer Size, and Ulcer Age

Variable	Observations	Mean	Median	SD	Skewness
Study Duration	45	150.200000	182.0	79.036988	-0.2735203
Ulcer Size	45	1.715556	0.8	1.903222	2.1208651
Ulcer Age	45	23.104444	9.0	31.619290	2.4790209

Discrete Variables

Study Arm and Healed Status (Discrete Variables)

Summary Tables of Univariate Data

Summary of variables

Variable	Observations	Mean	Median	SD	Skewness
Study Duration	45	150.200000	182.0	79.036988	-0.2735203
Ulcer Size	45	1.715556	0.8	1.903222	2.1208651
Ulcer Age	45	23.104444	9.0	31.619290	2.4790209

Variable	Category	Count	Percent
healed	No	23	51.11111
healed	Yes	22	48.88889
arm	Active	20	44.44444
arm	Placebo	25	55.55556

2. Bivariate Analysis

Table 1


	Active (N=20)	Placebo (N=25)	P-value
Study Duration
Mean (SD)	150 (88.9)	150 (72.1)	0.991
Median [Min, Max]	193 [21.0, 296]	175 [40.0, 254]
Ulcer Age
Mean (SD)	15.7 (18.6)	29.0 (38.4)	0.135
Median [Min, Max]	8.50 [3.00, 74.5]	12.0 [4.00, 165]
Ulcer Start Size (cm2)
Mean (SD)	1.80 (2.41)	1.65 (1.42)	0.805
Median [Min, Max]	0.700 [0.300, 9.70]	1.20 [0.300, 5.50]
Healed Group
No	11 (55.0%)	12 (48.0%)	0.868
Yes	9 (45.0%)	13 (52.0%)

	No (N=23)	Yes (N=22)	P-value
Study Duration
Mean (SD)	210 (33.6)	87.3 (61.6)	<0.001
Median [Min, Max]	210 [119, 296]	64.0 [21.0, 240]
Ulcer Age
Mean (SD)	32.2 (39.2)	13.5 (17.2)	0.0454
Median [Min, Max]	12.0 [4.00, 165]	8.00 [3.00, 74.5]
Ulcer Start Size (cm2)
Mean (SD)	1.86 (1.67)	1.56 (2.15)	0.608
Median [Min, Max]	1.20 [0.300, 6.00]	0.650 [0.300, 9.70]
Study Arm
Active	11 (47.8%)	9 (40.9%)	0.868
Placebo	12 (52.2%)	13 (59.1%)

Correlation Analysis

The following pairwise correlations are evaluated:

Study Duration x Ulcer Age
Study Duration x Ulcer Start Size
Ulcer Start Size x Ulcer Age
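
The three pairings above can be sketched with base R alone. This is only an illustration: `pairwise_cor()` is a hypothetical helper, the choice of Spearman correlation is an assumption (the report does not state the method, though the skewed ulcer variables make a rank-based measure a reasonable default), and the `demo` data frame is simulated purely so the code runs. It is not the study data.

```r
# Hypothetical helper: correlation test for every pair of the named columns.
pairwise_cor <- function(df, vars, method = "spearman") {
  pairs <- combn(vars, 2, simplify = FALSE)
  do.call(rbind, lapply(pairs, function(p) {
    # exact = FALSE avoids exact-p-value warnings when ranks are tied
    ct <- cor.test(df[[p[1]]], df[[p[2]]], method = method, exact = FALSE)
    data.frame(var1 = p[1], var2 = p[2],
               estimate = unname(ct$estimate), p_value = ct$p.value)
  }))
}

# Simulated placeholder data (NOT the study data), 45 rows to mirror the sample size.
set.seed(1)
demo <- data.frame(
  study_duration = runif(45, min = 21, max = 296),
  ulcer_age      = rexp(45, rate = 1 / 23),
  ulcer_start_cm = rexp(45, rate = 1 / 1.7)
)

res <- pairwise_cor(demo, c("study_duration", "ulcer_age", "ulcer_start_cm"))
print(res)
```

Replacing `demo` with the analysis data frame would return one row per pairing, with the correlation estimate and its p-value.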


T Tests

# A tibble: 3 × 9
  .y.                group1 group2     n1    n2 statistic    df     p p.signif
  <chr>              <chr>  <chr>   <int> <int>     <dbl> <dbl> <dbl> <chr>   
1 study_duration     Active Placebo    20    25   -0.0110  36.3 0.991 ns      
2 log_ulcer_age      Active Placebo    20    25   -1.39    42.9 0.17  ns      
3 log_ulcer_start_mm Active Placebo    20    25   -0.613   36.1 0.544 ns

# A tibble: 3 × 9
  .y.                group1 group2    n1    n2 statistic    df        p p.signif
  <chr>              <chr>  <chr>  <int> <int>     <dbl> <dbl>    <dbl> <chr>   
1 study_duration     No     Yes       23    22      8.28  32.2  1.80e-9 ****    
2 log_ulcer_start_mm No     Yes       23    22      1.03  42.9  3.08e-1 ns      
3 log_ulcer_age      No     Yes       23    22      2.39  41.3  2.16e-2 *

.y.	group1	group2	n1	n2	statistic	df	p	p.signif
study_duration	Active	Placebo	20	25	-0.0109965	36.31761	0.9910	ns
log_ulcer_age	Active	Placebo	20	25	-1.3947724	42.94486	0.1700	ns
log_ulcer_start_mm	Active	Placebo	20	25	-0.6132606	36.09744	0.5440	ns
study_duration	No	Yes	23	22	8.2755613	32.17262	<0.0001	****
log_ulcer_start_mm	No	Yes	23	22	1.0323140	42.93308	0.3080	ns
log_ulcer_age	No	Yes	23	22	2.3879907	41.26590	0.0216	*

Hypothesis Testing

We are interested in testing the hypothesis that patients in the “Active” arm of the study heal faster than those in the “Placebo” group, as evidenced by a shorter `study_duration`.

Hypothesis 1: Impact of Study Arm on Healing

Null Hypothesis ( $H_0$ ): The proportion of patients who healed is independent of the study arm.
Alternative Hypothesis ( $H_A$ ): The proportion of patients who healed is dependent on the study arm.
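
As a minimal sketch of how Hypothesis 1 could be checked, the healed-by-arm counts from Table 1 can be passed to base R's `chisq.test()`. The matrix below is typed in by hand from that table. Whether the original analysis used this exact call is an assumption, although the continuity-corrected result agrees with the reported p-value of 0.868.

```r
# Healed status by study arm, counts copied from Table 1.
healed_by_arm <- matrix(
  c(11, 9,    # Active:  No, Yes
    12, 13),  # Placebo: No, Yes
  nrow = 2, byrow = TRUE,
  dimnames = list(arm = c("Active", "Placebo"), healed = c("No", "Yes"))
)

# Pearson chi-squared test of independence.
# Yates' continuity correction is applied by default for 2x2 tables.
h1_test <- chisq.test(healed_by_arm)
print(h1_test)

# Healing proportion within each arm, for context.
print(prop.table(healed_by_arm, margin = 1))
```

With a p-value this large, the data give no evidence that healing status depends on study arm, so $H_0$ is not rejected.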

Hypothesis 2: Impact of Study Arm on Study Duration

$H_0$: There is no difference in the study duration between patients in the “Active” and “Placebo” groups. ( $\mu_{active} = \mu_{placebo}$ )
$H_A$: Patients in the “Active” group have a shorter study duration than those in the “Placebo” group. ( $\mu_{active} < \mu_{placebo}$ )

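This is a one-sided test because we only evaluate whether Active performed better than Placebo.

The mean study duration is the same in both arms, $\bar{x}_{active} = 150$ (range 21 to 296) and $\bar{x}_{placebo} = 150$ (range 40 to 254), and the Welch t-test gives a two-sided $p = 0.991$, so we fail to reject the null hypothesis.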
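
Because the reported p-value is two-sided while $H_A$ is one-sided, it can help to convert it. The sketch below uses only the Welch statistic and degrees of freedom printed in the T Tests table. It assumes the statistic is computed as Active minus Placebo, which matches the `group1`/`group2` ordering shown there.

```r
# Welch t statistic and degrees of freedom for study_duration
# (Active vs Placebo), copied from the T Tests table.
t_stat   <- -0.0109965
df_welch <- 36.31761

# One-sided p-value for H_A: mu_active < mu_placebo (lower tail).
p_one_sided <- pt(t_stat, df = df_welch)

# Two-sided p-value, as a cross-check against the reported 0.991.
p_two_sided <- 2 * pt(-abs(t_stat), df = df_welch)

print(c(one_sided = p_one_sided, two_sided = p_two_sided))
```

The one-sided p-value is roughly 0.50, so the conclusion is unchanged: there is no evidence that study duration is shorter in the Active arm.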


Call:
glm(formula = healed_group ~ arm + area_cm + ulcer_age + ulcer_start_cm, 
    family = binomial, data = data)

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -0.194023   0.154862  -1.253     0.21    
armPlacebo      0.948589   0.181777   5.218 1.80e-07 ***
area_cm        -0.878282   0.104085  -8.438  < 2e-16 ***
ulcer_age      -0.035695   0.004766  -7.489 6.93e-14 ***
ulcer_start_cm  0.604841   0.078432   7.712 1.24e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 945.10  on 709  degrees of freedom
Residual deviance: 752.84  on 705  degrees of freedom
  (200 observations deleted due to missingness)
AIC: 762.84

Number of Fisher Scoring iterations: 5

Kaplan-Meier Curves

