02_EDA

Author

Sergio Uribe

Modified

February 29, 2024

Packages

Dataset

Preliminary EDA

Data summary
Name df
Number of rows 45
Number of columns 21
_______________________
Column type frequency:
character 1
numeric 20
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
sex 0 1 1 1 0 2 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
id 0 1 23.00 13.13 1.00 12.00 23.00 34.00 45.00 ▇▇▇▇▇
age 0 1 27.18 6.39 19.00 22.00 26.00 32.00 48.00 ▇▆▃▂▁
right_t1 0 1 2914.66 664.07 1861.08 2545.01 2937.47 3222.95 5489.98 ▅▇▃▁▁
left_t1 0 1 2973.87 745.97 1711.80 2583.59 2888.19 3268.36 5868.12 ▃▇▂▁▁
right_t2 0 1 763.00 366.62 259.93 490.07 707.00 950.59 1875.61 ▇▅▅▁▁
left_t2 0 1 773.06 355.35 225.07 459.18 765.47 961.36 1860.44 ▇▇▇▁▁
right_t3 0 1 29.45 79.52 0.20 5.64 12.00 24.05 539.65 ▇▁▁▁▁
left_t3 0 1 48.99 123.73 0.00 2.38 12.51 35.26 793.92 ▇▁▁▁▁
t1 0 1 5888.53 1356.70 3775.51 5021.25 5726.59 6441.41 11358.10 ▅▇▂▁▁
t2 0 1 1536.06 689.20 548.93 1104.31 1452.65 1908.64 3736.05 ▆▇▅▁▁
t3 0 1 78.45 198.29 1.58 19.93 27.47 64.44 1333.57 ▇▁▁▁▁
right_t1_abs 0 1 2914.66 664.07 1861.08 2545.01 2937.47 3222.95 5489.98 ▅▇▃▁▁
right_t2_abs 0 1 2151.66 475.49 1358.05 1883.11 2078.86 2328.23 3950.22 ▃▇▂▁▁
right_t3_abs 0 1 2122.21 475.62 1349.31 1861.97 2041.48 2211.29 3933.79 ▅▇▂▁▁
left_t1_abs 0 1 2973.87 745.97 1711.80 2583.59 2888.19 3268.36 5868.12 ▃▇▂▁▁
left_t2_abs 0 1 2200.80 551.10 1389.42 1848.22 2039.43 2359.32 4252.87 ▇▇▂▁▁
left_t3_abs 0 1 2151.81 556.29 1300.68 1782.45 2009.97 2354.87 4223.79 ▅▇▂▁▁
t1_abs 0 1 5888.53 1356.70 3775.51 5021.25 5726.59 6441.41 11358.10 ▅▇▂▁▁
t2_abs 0 1 4352.47 985.46 3062.08 3743.90 4219.50 4605.16 8203.09 ▇▇▁▁▁
t3_abs 0 1 4274.02 991.59 2991.13 3618.55 4034.38 4555.02 8157.58 ▇▆▂▁▁
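
The report does not define the `_abs` columns. Their means are consistent with each `_abs` value being the baseline volume minus the losses recorded at later time points, for example mean `t2_abs` = mean `t1` - mean `t2`. The Python sketch below checks that arithmetic against the means in the summary above. This reading is an inference from the summary, not something the source states, and the names `means` and `expected` are illustrative.

```python
# Means copied from the numeric summary above (rounded to 2 decimals).
means = {
    "t1": 5888.53, "t2": 1536.06, "t3": 78.45,
    "t1_abs": 5888.53, "t2_abs": 4352.47, "t3_abs": 4274.02,
}

# Assumption (inferred, not stated in the report):
#   t1_abs = t1, t2_abs = t1 - t2, t3_abs = t1 - t2 - t3
expected = {
    "t1_abs": means["t1"],
    "t2_abs": means["t1"] - means["t2"],
    "t3_abs": means["t1"] - means["t2"] - means["t3"],
}

for name, value in expected.items():
    # Allow 0.02 of slack because the inputs are rounded.
    consistent = abs(value - means[name]) < 0.02
    print(f"{name}: expected {value:.2f}, reported {means[name]:.2f}, "
          f"consistent={consistent}")
```

The same relation holds for the per-side columns, for example mean `right_t2_abs` (2151.66) = mean `right_t1` (2914.66) - mean `right_t2` (763.00).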

Simplify the dataset

Exploratory Data Analysis

DEMOGRAPHICS

Table 1 Patients and age

Characteristic N = 45¹
Age 26.0 (22.0, 32.0)
Sex
    F 38 (84%)
    M 7 (16%)
¹ Median (IQR); n (%)

ABSOLUTE CHANGES

Absolute Change in Volume by Time

Absolute changes by time violin plot

Absolute Changes in Volume by Time
Time n median IQR max min mean sd
t1 45 5726.59 1420.16 11358.10 3775.51 5888.53 1356.70
t2 45 4219.50 861.26 8203.09 3062.08 4352.47 985.46
t3 45 4034.38 936.47 8157.58 2991.13 4274.02 991.59
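
The column that follows the median in the table above equals p75 - p25 from the data summary, so it is an interquartile range and not max - min. A short Python check using the quartiles of `t1_abs`, `t2_abs` and `t3_abs` copied from the data summary; the variable names are illustrative.

```python
# (p25, p75) for t1_abs, t2_abs, t3_abs, copied from the data summary.
quartiles = {
    "t1": (5021.25, 6441.41),
    "t2": (3743.90, 4605.16),
    "t3": (3618.55, 4555.02),
}
# Spread column as printed in the table above.
reported_spread = {"t1": 1420.16, "t2": 861.26, "t3": 936.47}

# Interquartile range per time point.
iqr = {time: round(p75 - p25, 2) for time, (p25, p75) in quartiles.items()}

for time, value in iqr.items():
    matches = abs(value - reported_spread[time]) < 0.005
    print(f"{time}: p75 - p25 = {value:.2f}, "
          f"reported = {reported_spread[time]:.2f}, match = {matches}")

# For contrast, max - min at t1 is much larger than the reported spread.
print(f"t1 max - min = {11358.10 - 3775.51:.2f}")
```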

RELATIVE CHANGES

Relative change in volume by time

Table Relative Change in Volume by Time

Relative Change in Volume by Time
time n mean sd se lower_ci upper_ci
1 week 45 100.0 0.0 0.0 100.0 100.0
4 months 45 74.6 8.3 1.2 72.1 77.0
12 months 45 73.3 9.1 1.4 70.6 75.9
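
The report does not define the `se`, `lower_ci` and `upper_ci` columns. A minimal Python sketch, assuming `se` = sd / sqrt(n) and a normal-approximation interval of mean ± 1.96 × se. The source does not say whether a normal or a t multiplier was used, and the function name `mean_ci` is illustrative. Because the inputs are the rounded means and SDs from the table, the results can differ from the tabulated bounds by about 0.1.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Return (se, lower, upper) for a mean, using a normal approximation."""
    se = sd / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    return se, mean - z * se, mean + z * se


# (mean, sd) as printed in the table above; n = 45 at every time point.
rows = {"4 months": (74.6, 8.3), "12 months": (73.3, 9.1)}

for label, (mean, sd) in rows.items():
    se, lower, upper = mean_ci(mean, sd, n=45)
    print(f"{label}: se = {se:.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

The 1 week row needs no calculation: it is the baseline, so every patient is at 100% and the SD is 0.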

Relative change by Sex

Relative change by Age

Difference in relative_change by age?

REGRESSION ANALYSIS

Table 2 Difference in relative change by age and sex?

Characteristic Beta 95% CI¹ p-value
Age 0.06 -0.38, 0.49 0.8
Sex
    F
    M 33 -1.9, 69 0.063
Age * Sex
    Age * M -1.4 -2.7, -0.20 0.024
¹ CI = Confidence Interval
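
To read the interaction term, it helps to combine the coefficients. A minimal Python sketch, assuming Table 2 comes from a linear model with an age × sex interaction and F as the reference level; the report does not show the model formula. The coefficients are the rounded values from the table, so the results are approximate, and the function names are illustrative.

```python
# Rounded coefficients from Table 2 (the intercept is not reported).
BETA_AGE = 0.06        # slope per year of age for the reference sex (F)
BETA_MALE = 33.0       # M vs F offset at age 0 (an extrapolation)
BETA_AGE_MALE = -1.4   # additional slope per year of age for M


def age_slope(is_male: bool) -> float:
    """Change in the outcome per additional year of age."""
    return BETA_AGE + (BETA_AGE_MALE if is_male else 0.0)


def male_minus_female(age: float) -> float:
    """Predicted M - F difference at a given age; the intercept cancels."""
    return BETA_MALE + BETA_AGE_MALE * age


print(f"slope F: {age_slope(False):+.2f} per year")
print(f"slope M: {age_slope(True):+.2f} per year")
for age in (20, 30, 40):
    print(f"M - F at age {age}: {male_minus_female(age):+.1f}")
```

Under these assumptions the age slope is close to zero for women and negative for men. With only 7 male patients (Table 1), the estimates for men are imprecise, as the wide confidence interval for the sex term shows.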

ANALYSIS BY SIDE

Additional

# A tibble: 11 × 1
      t3
   <dbl>
 1  43.7
 2  63.6
 3  66.1
 4  70.4
 5  73.2
 6  75.0
 7  76.3
 8  78.0
 9  80.0
10  84.4
11  87.6

10% of patients showed volume loss of less than 15%, which means that the final implant volume is more than anticipated. 20% of patients showed volume loss of more than 35%, which means that the final implant volume is considerably less than anticipated, and 5% of patients lost 45% or more of the initial volume. Furthermore, when analysing symmetry of volume loss we can see that 20% of patients had more than 7% difference in volume loss for one of the sides; this difference can cause asymmetrical malar zones.

# A tibble: 1 × 1
  change
   <dbl>
1   4.44

Correct answer: 4.4% of patients showed volume loss of less than 15%

20% of patients showed volume loss of more than 35%

# A tibble: 1 × 1
  change
   <dbl>
1   13.3

Correct answer: 13.3% of patients showed volume loss of more than 35%

5% of patients lost 45% or more of the initial volume

# A tibble: 1 × 1
  change
   <dbl>
1   4.44

Correct answer: 4.4% of patients showed volume loss of more than 45%
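
Each corrected figure above is the share of the 45 patients whose volume loss meets a threshold. A minimal Python sketch of that calculation follows. The helper `percent_meeting` and the five loss values are hypothetical. The report does not show whether its comparisons were strict ("more than") or inclusive ("or more"), which matters for any patient exactly on a threshold.

```python
def percent_meeting(values, threshold, strict=True):
    """Percentage of values above the threshold (strict) or at/above it."""
    if strict:
        hits = sum(v > threshold for v in values)
    else:
        hits = sum(v >= threshold for v in values)
    return 100 * hits / len(values)


# Hypothetical volume losses (%) for five patients, for illustration only.
losses = [12.0, 26.5, 35.0, 45.0, 51.2]

more_than_35 = percent_meeting(losses, 35)               # strictly above 35%
at_least_45 = percent_meeting(losses, 45, strict=False)  # 45% or more
more_than_45 = percent_meeting(losses, 45)               # strictly above 45%
less_than_15 = 100 - percent_meeting(losses, 15, strict=False)

print(more_than_35, at_least_45, more_than_45, less_than_15)

# Patient counts implied by the reported percentages with n = 45.
implied = {pct: round(pct * 45 / 100) for pct in (4.44, 13.3)}
print(implied)
```

In the hypothetical data the patient at exactly 45.0% is counted by the inclusive comparison but not by the strict one, which is why the two wordings can give different percentages.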

Furthermore, when analysing symmetry of volume loss we can see that 20% of patients had more than 7% difference in volume loss for one of the sides.

  1. Calculate the change in volume for each side

    # A tibble: 45 × 3
          id right_change left_change
       <dbl>        <dbl>       <dbl>
     1     1         28.3        28.0
     2     2         21.0        20.6
     3     3         13.1        12.5
     4     4         15.7        14.9
     5     5         22.8        21.9
     6     6         25.4        26.4
     7     7         14.5        16.0
     8     8         19.3        21.0
     9     9         30.6        28.8
    10    10         29.9        28.1
    # ℹ 35 more rows

Now compute the absolute difference between sides

# A tibble: 45 × 2
      id difference_between_sides
   <dbl>                    <dbl>
 1     1                    0.325
 2     2                    0.423
 3     3                    0.632
 4     4                    0.813
 5     5                    0.887
 6     6                    1.05 
 7     7                    1.51 
 8     8                    1.74 
 9     9                    1.77 
10    10                    1.80 
# ℹ 35 more rows

Now compute the percentage of patients with a difference greater than 7%

# A tibble: 1 × 1
  change
   <dbl>
1   17.8

Correct answer: Furthermore, when analysing symmetry of volume loss we can see that 17.8% of patients had more than 7% difference in volume loss for one of the sides.
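
The three steps above can be sketched in Python as follows. The sketch uses only the ten rows displayed in the per-side table, whose values are rounded to one decimal, so its differences deviate slightly from the printed ones. It assumes the comparison is made on the absolute difference in percentage points, and the variable names are illustrative.

```python
# First ten rows of the per-side table above: (id, right_change, left_change).
rows = [
    (1, 28.3, 28.0), (2, 21.0, 20.6), (3, 13.1, 12.5), (4, 15.7, 14.9),
    (5, 22.8, 21.9), (6, 25.4, 26.4), (7, 14.5, 16.0), (8, 19.3, 21.0),
    (9, 30.6, 28.8), (10, 29.9, 28.1),
]

# Absolute difference between sides, in percentage points.
differences = {pid: abs(right - left) for pid, right, left in rows}

# Share of these patients whose sides differ by more than 7 points.
asymmetric = [pid for pid, diff in differences.items() if diff > 7]
share = 100 * len(asymmetric) / len(rows)
print(f"{share:.1f}% of the ten displayed patients exceed 7 points")

# Across all 45 patients, 8 patients above the cut-off give the reported value.
share_all = 100 * 8 / 45
print(f"{share_all:.1f}%")
```

None of the ten displayed patients exceeds the cut-off; the patients who do are among the 35 rows not shown.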