2025_Daniels_ortho

Author

Sergio Uribe

CREATED

April 10, 2025

UPDATED

May 4, 2025

1 Packages

2 DATA PREPARATION

[1] "/home/sergiouribe/Insync/sergio.uribe@gmail.com/Google Drive/Research Drive/2025_Daniels Belajev_Institute_Orthodontics_Jakobsone/data 25.03.2025"

2.1 Name changes:

2.2 Data cleaning

Convert labelled values:

2.2.1 Transform variables: Descriptive

(Why is "Unknown" still shown?)

It is a missing value (NA), not a factor level, so recoding and dropping levels does not remove it.
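The puzzle above is easy to reproduce: a missing value is not a factor level, so dropping or recoding levels leaves it in place and summary tables still report it as unknown. A minimal base-R sketch with made-up values (the vector `occupation` is hypothetical, not the study data):

```r
# Hypothetical occupation answers; the NA stands for an unanswered item
occupation <- factor(c("Student", "Employed", NA, "Other", "Employed"))

levels(occupation)                  # NA is not among the levels
table(occupation, useNA = "ifany")  # but it is still counted as missing

# Make the missing value explicit, then merge it with "Other"
occ_chr <- as.character(occupation)
occ_chr[is.na(occ_chr) | occ_chr == "Other"] <- "Other/Unknown"
occupation_clean <- factor(
  occ_chr,
  levels = c("Student", "Employed", "Other/Unknown")
)

table(occupation_clean)
```

In the notebook, `fct_na_value_to_level()` plays the role of the manual replacement shown here.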

2.2.2 Transform variables: Doctor perception

Fits_6mo and Fits_12mo (merge rare levels into “Poor fit”)
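Collapsing with a catch-all branch also sweeps missing answers into the last category. The sketch below uses invented raw labels (`"good"`, `"ok"`, `"bad"`, `"worse"`) to show the effect and one way to keep `NA` separate. Whether a missing assessment should count as "Poor fit" is a decision for the analysis, not something this example settles.

```r
# Invented raw answers, including one missing assessment
raw_fit <- c("good", "ok", "bad", "worse", NA, "good")

# Catch-all recode: everything that is not "good" or "ok" becomes "Poor fit"
fit_all <- ifelse(raw_fit %in% "good", "Perfect fit",
           ifelse(raw_fit %in% "ok", "Slight gap (1-2 teeth)", "Poor fit"))

# Variant that keeps missing answers missing
fit_keep_na <- fit_all
fit_keep_na[is.na(raw_fit)] <- NA

fit_levels <- c("Perfect fit", "Slight gap (1-2 teeth)", "Poor fit")
table(factor(fit_all, levels = fit_levels))
table(factor(fit_keep_na, levels = fit_levels), useNA = "ifany")
```

The same applies to the other doctor-perception variables, which are recoded with a default branch as well.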

Patient_missed_6mo and _12mo

Parents_involved_6mo and _12mo

Shows_interest_6mo and _12mo

Poor_hygiene_6mo and _12mo

Uses_rubberbands_6mo and _12mo

Shows_dissatisfaction_6mo and _12mo

3 MODELLING

4 TABLE 1

Characteristic	N = 53¹
Gender
Vīriešu	16 (30%)
Sieviešu	37 (70%)
Age	34 (25, 39)
Occupation_type
Student	6 (11%)
Employed	41 (77%)
Unemployed	3 (5.7%)
Other/Unknown	3 (5.7%)
Education_level
Primary education	1 (1.9%)
Secondary education	11 (21%)
Vocational secondary	6 (11%)
Higher education	35 (66%)
Household_size
With parents	8 (15%)
With partner	27 (51%)
Alone	12 (23%)
Other	6 (11%)
Doctor_communication
Somewhat agree	19 (36%)
Strongly agree	31 (58%)
Other opinion	3 (5.7%)
Family_informed
Strongly agree	49 (92%)
Other opinion	4 (7.5%)
Family_pressure
Strongly disagree	36 (68%)
Somewhat disagree	6 (11%)
Neutral	8 (15%)
Other opinion	3 (5.7%)
Doctor_pressure
Strongly disagree	25 (47%)
Somewhat disagree	12 (23%)
Neutral	10 (19%)
Other opinion	6 (11%)
Financial_tracking
Strongly agree	19 (36%)
Somewhat agree	18 (34%)
Neutral	11 (21%)
Other opinion	5 (9.4%)
Extra_budget_money
Somewhat agree	19 (36%)
Other opinion	34 (64%)
Psych_emotional_symptoms
No	49 (92%)
Yes or skipped	4 (7.5%)
BFI_Conscientiousness	55 (50, 61)
BFI_Fake	11.00 (8.00, 13.00)
BFI_Neuroticism	41 (34, 46)
¹ n (%); Median (Q1, Q3)

Save as Table 1

4.1 Motivation

4.1.1 Table data doctor

Characteristic	N = 53¹
Aligns_12mo	39 (74%)
Fits_6mo
Perfect fit	33 (62%)
Slight gap (1–2 teeth)	19 (36%)
Poor fit	1 (1.9%)
Fits_12mo
Perfect fit	29 (55%)
Slight gap (1–2 teeth)	14 (26%)
Poor fit	10 (19%)
Patient_missed_6mo
Never	41 (77%)
Missed appointments	12 (23%)
Patient_missed_12mo
Never	38 (72%)
Missed appointments	15 (28%)
Parents_involved_6mo
Never	46 (87%)
Sometimes involved	7 (13%)
Parents_involved_12mo
Never	47 (89%)
Sometimes involved	6 (11%)
Shows_interest_6mo
Always	17 (32%)
Often	28 (53%)
Sometimes	8 (15%)
Shows_interest_12mo
Always	16 (30%)
Often	27 (51%)
Sometimes	10 (19%)
Poor_hygiene_6mo
Never	43 (81%)
Sometimes	10 (19%)
Poor_hygiene_12mo
Never	37 (70%)
Sometimes	16 (30%)
Uses_rubberbands_6mo
Not prescribed	29 (55%)
Uses regularly	21 (40%)
Rarely/never	3 (5.7%)
Uses_rubberbands_12mo
Not prescribed	30 (57%)
Uses regularly	17 (32%)
Rarely/never	6 (11%)
Shows_dissatisfaction_6mo
Never	44 (83%)
Sometimes	9 (17%)
Shows_dissatisfaction_12mo
Never	40 (75%)
Sometimes	13 (25%)
¹ n (%)

4.2 Change in position

5 MODELLING

Now the question is: what explains the difference between the expected and the actual movement?

Neuroticism and conscientiousness personality traits were assessed with the validated Big Five Personality Inventory (BFI).

BFI_Conscientiousness

BFI_Fake

BFI_Neuroticism

Patient cooperation was assessed by evaluating the clinical fit of the aligners to the dental arches and by comparing the planned and achieved upper first premolar expansion (%).

Fits_6mo

Fits_12mo

Patient_missed_6mo:Shows_dissatisfaction_12mo

Index of Orthodontic Treatment Need (IOTN)

The higher the score, the worse the patient's orthodontic status.

IOTN_AH

IOTN_DH

The target variable is the difference between

Expected_position_mm

and

Real_position_mm

Model the difference between expected and actual movement
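As a worked illustration of the outcome, the sketch below computes the deviation as real minus expected position, the same sign convention the notebook uses for `movement_deviation`. The three patients and their measurements are invented, and the percentage of planned movement achieved is an assumed way of expressing the comparison, not a formula taken from the study protocol.

```r
# Invented measurements (mm) for three hypothetical patients
pos <- data.frame(
  Starting_position_mm = c(34.0, 35.5, 33.0),
  Expected_position_mm = c(37.0, 38.0, 36.0),
  Real_position_mm     = c(36.4, 38.1, 34.5)
)

# Outcome used in the models: real minus expected position
pos$movement_deviation <- pos$Real_position_mm - pos$Expected_position_mm

# Assumed summary: share of the planned movement that was achieved
planned  <- pos$Expected_position_mm - pos$Starting_position_mm
achieved <- pos$Real_position_mm - pos$Starting_position_mm
pos$pct_achieved <- 100 * achieved / planned

pos
```

A negative deviation means the tooth moved less than planned; a value near zero means the plan was met.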

5.1 Change from the initial to the final position

5.2 Preparing variables

Ensure age and gender will be included

5.3 Model 1: Linear Regression

5.3.1 For the movement

Characteristic	Beta	95% CI	p-value
Age	-0.01	-0.03, 0.02	0.6
Sex
Vīriešu	—	—
Sieviešu	0.11	-0.28, 0.51	0.6
Patient_missed_6mo
Never	—	—
Missed appointments	0.38	-0.11, 0.87	0.13
Shows_dissatisfaction_12mo
Never	—	—
Sometimes	0.38	-0.03, 0.79	0.069
IOTN_AH	0.00	-0.16, 0.17	>0.9
IOTN_DH	0.01	-0.26, 0.28	>0.9
BFI_Conscientiousness	0.01	-0.02, 0.03	0.5
BFI_Neuroticism	-0.01	-0.03, 0.01	0.3
Abbreviation: CI = Confidence Interval

5.3.2 Explore BFI_Conscientiousness & BFI_Neuroticism

There are some issues:

Observed vs. model-predicted density: check; consider robust regression, transforming the outcome, or checking for outliers.
Check cases 24, 30, 38, and 47.
Consider robust regression or bootstrapping if inference is sensitive.
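A nonparametric case bootstrap is one way to check whether the inference depends on a few influential cases. The sketch below uses simulated data (`sim`, with columns `y` and `x`, is hypothetical) and base R only: it resamples rows, refits the linear model, and takes percentile limits for the slope.

```r
set.seed(123)

# Simulated stand-in for the study data: 53 rows, one predictor, two outliers
n <- 53
sim <- data.frame(x = rnorm(n, mean = 40, sd = 8))
sim$y <- -0.4 + 0.01 * sim$x + rnorm(n, sd = 0.5)
sim$y[c(24, 30)] <- sim$y[c(24, 30)] + 3

# Refit the model on one bootstrap resample and return the slope
boot_slope <- function(data) {
  idx <- sample(nrow(data), replace = TRUE)
  coef(lm(y ~ x, data = data[idx, ]))[["x"]]
}

slopes <- replicate(2000, boot_slope(sim))

# Percentile bootstrap interval next to the usual t-based interval
quantile(slopes, c(0.025, 0.975))
confint(lm(y ~ x, data = sim))["x", ]
```

If the two intervals differ markedly, the normal-theory interval is probably being driven by the flagged cases.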

5.3.3 Check outliers

5.4 Multinomial models

5.4.1 Model for Fit 6 months

# weights:  30 (18 variable)
initial  value 58.226451 
iter  10 value 27.608110
iter  20 value 22.612271
iter  30 value 22.304646
iter  40 value 22.264025
final  value 22.263629 
converged

Characteristic	log(OR)	95% CI	p-value
Slight gap (1–2 teeth)
Age	-0.05	-0.14, 0.03	0.2
Sex
Vīriešu	—	—
Sieviešu	-0.22	-2.0, 1.6	0.8
Patient_missed_6mo
Never	—	—
Missed appointments	-2.1	-4.6, 0.44	0.11
Shows_dissatisfaction_12mo
Never	—	—
Sometimes	1.1	-0.69, 3.0	0.2
IOTN_AH	0.51	-0.26, 1.3	0.2
IOTN_DH	-0.24	-1.4, 0.95	0.7
BFI_Conscientiousness	0.05	-0.04, 0.15	0.3
BFI_Neuroticism	0.22	0.09, 0.36	0.001
Poor fit
Age	-1.7	-3.2, -0.26	0.021
Sex
Vīriešu	—	—
Sieviešu	21	21, 21	<0.001
Patient_missed_6mo
Never	—	—
Missed appointments	16	16, 16	<0.001
Shows_dissatisfaction_12mo
Never	—	—
Sometimes	-3.8	-3.8, -3.8	<0.001
IOTN_AH	-0.13	-0.35, 0.10	0.3
IOTN_DH	9.9	9.7, 10	<0.001
BFI_Conscientiousness	-0.89	-4.7, 2.9	0.6
BFI_Neuroticism	0.52	-3.5, 4.6	0.8
Abbreviations: CI = Confidence Interval, OR = Odds Ratio

Outcome	Characteristic	OR	95% CI	p-value
Slight gap (1–2 teeth)	Age	0.95	0.87, 1.03	0.2
	Sex
	Vīriešu	—	—
	Sieviešu	0.80	0.14, 4.71	0.8
	Patient_missed_6mo
	Never	—	—
	Missed appointments	0.13	0.01, 1.55	0.11
	Shows_dissatisfaction_12mo
	Never	—	—
	Sometimes	3.15	0.50, 19.7	0.2
	IOTN_AH	1.67	0.77, 3.62	0.2
	IOTN_DH	0.79	0.24, 2.58	0.7
	BFI_Conscientiousness	1.06	0.96, 1.17	0.3
	BFI_Neuroticism	1.25	1.09, 1.43	0.001
Poor fit	Age	0.18	0.04, 0.77	0.021
	Sex
	Vīriešu	—	—
	Sieviešu	1,596,196,257	1,476,191,068, 1,725,957,126	<0.001
	Patient_missed_6mo
	Never	—	—
	Missed appointments	12,208,005	11,717,317, 12,719,242	<0.001
	Shows_dissatisfaction_12mo
	Never	—	—
	Sometimes	0.02	0.02, 0.02	<0.001
	IOTN_AH	0.88	0.70, 1.11	0.3
	IOTN_DH	19,221	15,791, 23,395	<0.001
	BFI_Conscientiousness	0.41	0.01, 18.4	0.6
	BFI_Neuroticism	1.68	0.03, 97.2	0.8
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
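Odds ratios in the millions or billions, with intervals that collapse to a point, are a typical sign of (quasi-)complete separation. At 6 months only one patient falls in the "Poor fit" category, so several predictors predict that outcome perfectly. A cross-tabulation shows the problem before any model is fitted. The data frame `toy` below is invented to mimic those counts, and merging the sparse category is shown as one possible remedy, not as the analysis chosen here.

```r
# Invented data mimicking the 6-month counts: 33 / 19 / 1
toy <- data.frame(
  fit = factor(
    rep(c("Perfect fit", "Slight gap", "Poor fit"), times = c(33, 19, 1)),
    levels = c("Perfect fit", "Slight gap", "Poor fit")
  ),
  missed = factor(rep(c("Never", "Missed"), length.out = 53),
                  levels = c("Never", "Missed"))
)

# Cells with zero counts flag predictors that separate an outcome level
tab <- table(toy$fit, toy$missed)
tab
any(tab == 0)

# One option: merge the sparse level into its neighbour before modelling
toy$fit2 <- toy$fit
levels(toy$fit2)[levels(toy$fit2) == "Poor fit"] <- "Slight gap"
table(toy$fit2)
```

With two outcome levels left, an ordinary logistic regression would replace the multinomial model.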

5.4.2 Model for Fit 12 months

# weights:  30 (18 variable)
initial  value 58.226451 
iter  10 value 33.433381
iter  20 value 24.267881
iter  30 value 20.780780
iter  40 value 15.727085
iter  50 value 15.633035
iter  60 value 15.554048
iter  70 value 15.530776
iter  80 value 15.523711
iter  90 value 15.521127
iter 100 value 15.519470
final  value 15.519470 
stopped after 100 iterations

Characteristic	log(OR)	95% CI	p-value
Slight gap (1–2 teeth)
Age	-0.04	-0.14, 0.06	0.4
Sex
Vīriešu	—	—
Sieviešu	-0.39	-2.9, 2.1	0.8
Patient_missed_6mo
Never	—	—
Missed appointments	0.93	-1.9, 3.7	0.5
Shows_dissatisfaction_12mo
Never	—	—
Sometimes	-1.5	-3.8, 0.89	0.2
IOTN_AH	0.28	-0.71, 1.3	0.6
IOTN_DH	-0.61	-2.4, 1.2	0.5
BFI_Conscientiousness	-0.09	-0.21, 0.03	0.15
BFI_Neuroticism	0.25	0.03, 0.47	0.028
Poor fit
Age	0.70	-0.21, 1.6	0.13
Sex
Vīriešu	—	—
Sieviešu	25	-13, 64	0.2
Patient_missed_6mo
Never	—	—
Missed appointments	-55	-81, -30	<0.001
Shows_dissatisfaction_12mo
Never	—	—
Sometimes	17	-14, 48	0.3
IOTN_AH	14	-5.3, 33	0.2
IOTN_DH	57	33, 81	<0.001
BFI_Conscientiousness	7.4	5.7, 9.0	<0.001
BFI_Neuroticism	14	13, 14	<0.001
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
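The optimizer log for this model ends with "stopped after 100 iterations", which means it hit its iteration limit rather than converging, so the 12-month estimates should be read with caution. `nnet::multinom()` passes `maxit` on to `nnet()` (default 100), and the fitted object records a convergence code. The sketch below uses simulated data (`sim` is hypothetical); with the real data, sparse categories may still prevent convergence however many iterations are allowed.

```r
library(nnet)

set.seed(42)

# Simulated three-class outcome driven by one numeric predictor
n <- 300
sim <- data.frame(x = rnorm(n))
lin <- cbind(0, 0.8 * sim$x, -0.5 + 1.2 * sim$x)  # linear predictor per class
p   <- exp(lin) / rowSums(exp(lin))               # softmax probabilities
sim$y <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))

# Allow more iterations than the default and silence the trace
fit <- multinom(y ~ x, data = sim, maxit = 500, trace = FALSE)

fit$convergence  # 0 = converged, 1 = iteration limit reached
exp(coef(fit))   # odds ratios relative to the reference class "a"
```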

5.5 Model 2: Robust Regression

Characteristic	Beta	95% CI
Age	0.00	-0.02, 0.02
Sex
Vīriešu	—	—
Sieviešu	0.04	-0.30, 0.39
BFI_Conscientiousness	0.00	-0.02, 0.02
BFI_Neuroticism	-0.01	-0.02, 0.01
Patient_missed_6mo
Never	—	—
Missed appointments	0.23	-0.20, 0.65
Shows_dissatisfaction_12mo
Never	—	—
Sometimes	0.31	-0.05, 0.67
IOTN_AH	0.02	-0.12, 0.17
IOTN_DH	-0.01	-0.24, 0.23
Abbreviation: CI = Confidence Interval
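Robust regression with `MASS::rlm()` keeps all observations but gives cases with large residuals less weight. The sketch below, on simulated data with one planted outlier (`sim` is hypothetical), compares the coefficients from `lm()` and `rlm()` and inspects the final weights stored in the `w` component of the fit.

```r
library(MASS)

set.seed(1)

# Simulated data with a single gross outlier in row 10
n <- 53
sim <- data.frame(x = rnorm(n, mean = 40, sd = 8))
sim$y <- -0.4 + 0.01 * sim$x + rnorm(n, sd = 0.4)
sim$y[10] <- sim$y[10] + 5

fit_lm  <- lm(y ~ x, data = sim)
fit_rlm <- rlm(y ~ x, data = sim)  # Huber M-estimator by default

# Coefficients side by side
rbind(lm = coef(fit_lm), rlm = coef(fit_rlm))

# Final robustness weights: values below 1 mark down-weighted cases
round(fit_rlm$w[10], 2)
which(fit_rlm$w < 1)
```

`rlm()` does not report p-values, which is consistent with the table above showing only estimates and intervals.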

5.6 Tidymodel

Tidy Output of Linear Regression Model (Sorted by p-value)
term	estimate	std.error	statistic	p.value
(Intercept)	-0.425	0.084	-5.071	0.000
Shows_dissatisfaction_12mo_Sometimes	0.165	0.089	1.861	0.069
Patient_missed_6mo_Missed.appointments	0.161	0.103	1.562	0.125
BFI_Neuroticism	-0.102	0.107	-0.959	0.343
BFI_Conscientiousness	0.063	0.102	0.612	0.544
Sex_Sieviešu	0.053	0.091	0.583	0.563
Age	-0.050	0.092	-0.539	0.592
IOTN_DH	0.010	0.122	0.084	0.934
IOTN_AH	0.001	0.124	0.008	0.994
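The coefficient plot built from this table draws error bars at ±1.5 standard errors, while its title says "95% Confidence Interval". For a linear model, a 95% interval uses the t quantile for the residual degrees of freedom, roughly ±2 standard errors here. The sketch recomputes both versions for the dissatisfaction row, assuming 53 complete cases and 9 estimated coefficients (44 residual degrees of freedom).

```r
# Values copied from the tidy table for Shows_dissatisfaction_12mo_Sometimes
estimate  <- 0.165
std_error <- 0.089

# Assumption: 53 complete cases minus 9 coefficients
df_resid <- 53 - 9
t_crit   <- qt(0.975, df_resid)

ci_95  <- estimate + c(-1, 1) * t_crit * std_error
bar_15 <- estimate + c(-1, 1) * 1.5 * std_error

round(t_crit, 3)
round(ci_95, 3)   # 95% confidence interval
round(bar_15, 3)  # the +/- 1.5 SE bars used in the plot
```

With the t-based interval the bar for dissatisfaction crosses zero, in line with p = 0.069, whereas the ±1.5 SE bar does not.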

Presentation: https://docs.google.com/presentation/d/1VqYmsV5PSVvMKO3lzGamVfkDsTMCwQqNK47oTP_s-lw/edit#slide=id.p10

5.7 Effect on subjective measurements

5.7.1 Fits 6 months

Characteristic	OR	95% CI
BFI_Neuroticism	1.17	1.08, 1.30
Gender
Vīriešu	—	—
Sieviešu	1.22	0.27, 6.16
Abbreviations: CI = Confidence Interval, OR = Odds Ratio
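The odds ratios above come from a proportional-odds model fitted with `MASS::polr()`. The sketch below fits the same kind of model to simulated data (`sim`, `score`, and `sex` are hypothetical) and derives odds ratios with Wald intervals from the coefficient table. `gtsummary` may compute its intervals differently, so small discrepancies are to be expected.

```r
library(MASS)

set.seed(7)

# Simulated ordinal outcome: a higher (centered) score shifts toward worse fit
n <- 300
sim <- data.frame(
  score = rnorm(n, mean = 0, sd = 8),
  sex   = factor(sample(c("M", "F"), n, replace = TRUE), levels = c("M", "F"))
)
latent <- 0.1 * sim$score + rlogis(n)
sim$fit <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
               labels = c("Perfect", "Slight gap", "Poor"),
               ordered_result = TRUE)

m <- polr(fit ~ score + sex, data = sim, Hess = TRUE)

# Odds ratios with Wald 95% intervals for the predictors only
ct <- coef(summary(m))[names(coef(m)), ]
or <- exp(ct[, "Value"])
lo <- exp(ct[, "Value"] - 1.96 * ct[, "Std. Error"])
hi <- exp(ct[, "Value"] + 1.96 * ct[, "Std. Error"])
round(cbind(OR = or, lower = lo, upper = hi), 2)
```

An odds ratio above 1 for `score` means that each additional point raises the odds of being in a worse fit category, which is how the neuroticism estimate in the table reads.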

--- title: "2025_Daniels_ortho" author: "Sergio Uribe" date: 2025-04-10 date-modified: last-modified language: title-block-published: "CREATED" title-block-modified: "UPDATED" format: html: toc: true number-sections: true code-fold: true code-tools: true toc-expand: 3 theme: cosmo # optional, or pick another like journal, lumen, etc. toc-expand: 3 code-fold: true code-tools: true editor: visual execute: echo: false cache: true warning: false message: false --- # Packages ```{r} # Load required libraries with pacman; installs them if not already installed pacman::p_load(MASS, # for the robust regression tidyverse, # tools for data science visdat, #NAs hrbrthemes, # nice ggplot themes janitor, # for data cleaning and tables here, # for reproducible research gtsummary, # for tables explore, # for EDA, check https://rolkra.github.io/explore/ # countrycode, # to normalize country data tidyplots, # check https://tidyplots.org/ ggeasy, # to use variable labels in ggplot, then easy_labs() easystats, # check https://easystats.github.io/easystats/ scales, viridis, flextable, ggeffects, # for the polot of fits 6 or 12 officer, writexl, # to save tables in xls format tidymodels, sjPlot, haven, # for spss data quantreg, # for the quantile regression lubridate ) ``` ```{r} theme_set(theme_minimal()) ``` # DATA PREPARATION ```{r} here::here() ``` ```{r} df <- read_spss("ORTOnew.sav") |> select(-1) # Remove names ``` ```{r} # glimpse(df) ``` ## Name changes: ```{r} df <- df |> rename( Gender = Dzimums, Age = Vecums, Occupation_type = Nodarb_veids, Education_level = Izglitības_līmenis, Household_size = Mājsaimniecībā_dzīvoju, Doctor_communication = Komunikācijā_ar_ārstu, Family_informed = Mana_ģimene_informēta, Family_pressure = Spiediens_no_ģimenes, Doctor_pressure = Spiediens_no_ārsta, Financial_tracking = Sekoju_finansem, Extra_budget_money = Naudu_arpus_budz, Psych_emotional_symptoms = Psihoemoc_trauc, BFI_Conscientiousness = BFI_Apzinīgums, BFI_Fake = BFI_Viltus, 
BFI_Neuroticism = BFI_Neirotisms, Aligns_6mo = Kape_atbislt_6men, Aligns_12mo = Kape_atbilst_12men, Fits_6mo = Kape_pieguļ_6men, Fits_12mo = Kape_pieguļ_12men, Patient_missed_6mo = Pacients_kavē_6men, Patient_missed_12mo = Pacients_kavē_12men, Parents_involved_6mo = Vecāki_iesaistīti_6men, Parents_involved_12mo = Vecāki_iesaistīti_12men, Shows_interest_6mo = Izrāda_interesi_6men, Shows_interest_12mo = Izrāda_interesi_12men, Poor_hygiene_6mo = Slikta_higiēna_6men, Poor_hygiene_12mo = Slikta_higiēna_12men, Uses_rubberbands_6mo = Izmanto_gumijas_6men, Uses_rubberbands_12mo = Izmanto_gumijas_12men, Shows_dissatisfaction_6mo = Izrāda_neapm_6men, Shows_dissatisfaction_12mo = Izrāda_neapm_12men, IOTN_AH = IOTN_AH, # Already in English IOTN_DH = IOTN_DH, # Already in English Starting_position_mm = Starting_position_mm, # Already in English Expected_position_mm = Expected_position_mm, # Already in English Real_position_mm = Real_position_mm # Already in English ) ``` ```{r} # visdat::vis_dat(df) ``` ## Data cleaning Convert labelled values:\ ```{r} df <- df %>% mutate(across(where(is.labelled), as_factor)) ``` ### Transform variables: Descriptive ```{r} df <- df %>% mutate(Gender = fct_drop(na_if(Gender, "Nevēlos atbildēt"))) ``` ```{r} df <- df %>% mutate(Occupation_type = fct_recode(Occupation_type, "Student" = "Mācos", "Employed" = "Strādāju", "Unemployed" = "Nestrādāju", "Other" = "Cits" )) %>% mutate(Occupation_type = na_if(Occupation_type, "2.1")) %>% mutate(Occupation_type = fct_drop(Occupation_type)) ``` ```{r} df <- df %>% mutate(Occupation_type = fct_collapse(Occupation_type, "Other/Unknown" = c("Other", "Unknown") )) %>% mutate(Occupation_type = fct_relevel(Occupation_type, "Student", "Employed", "Unemployed", "Other/Unknown" )) %>% mutate(Occupation_type = fct_drop(Occupation_type)) ``` (Why still unknown is shown???) 
```{r} df <- df %>% mutate( Occupation_type = as_factor(Occupation_type), Occupation_type = fct_drop(Occupation_type) ) ``` ahh, is a missing value! ```{r} df <- df %>% mutate( Occupation_type = fct_na_value_to_level(Occupation_type, level = "Other/Unknown"), Occupation_type = fct_collapse(Occupation_type, "Other/Unknown" = c("Other/Unknown", "Unknown") ), Occupation_type = fct_relevel(Occupation_type, "Student", "Employed", "Unemployed", "Other/Unknown" ) ) ``` ```{r} df <- df %>% mutate(Education_level = fct_recode(Education_level, "Primary education" = "Pamatizglitība", "Secondary education" = "Vidēja izglitība", "Vocational secondary" = "Vidēja speciāla izglitība", "Higher education" = "Augstāka izglitība" )) %>% mutate(Education_level = fct_relevel(Education_level, "Primary education", "Secondary education", "Vocational secondary", "Higher education" )) ``` ```{r} df <- df %>% mutate(Household_size = fct_collapse(Household_size, "With parents" = "Ar vecākiem", "With romantic/marital partner" = "Ar romantisko attiecību vai laulību partneri", "Alone" = "viens(-a) pats(-i)", "Other" = c("Ar draugiem vai paziņam", "5") )) %>% mutate(Household_size = fct_relevel(Household_size, "With parents", "With romantic/marital partner", "Alone", "Other" )) ``` ```{r} df <- df |> mutate( Household_size = fct_recode(Household_size, "With partner" = "With romantic/marital partner" ) ) ``` ```{r} df <- df |> mutate( Doctor_communication = as.character(Doctor_communication), Doctor_communication = case_when( Doctor_communication %in% c("Daļēji piekrītu") ~ "Somewhat agree", Doctor_communication %in% c("Noteikti piekrītu") ~ "Strongly agree", Doctor_communication %in% c("Neitrāli", "Noteikti nepiekrītu", "Daļēji nepiekrītu") ~ "Other opinion", TRUE ~ Doctor_communication ), Doctor_communication = factor(Doctor_communication, levels = c("Somewhat agree", "Strongly agree", "Other opinion") ) ) ``` ```{r} df <- df |> mutate( Family_informed = as.character(Family_informed), 
Family_informed = case_when( Family_informed == "Noteikti piekrītu" ~ "Strongly agree", Family_informed %in% c("Daļēji nepiekrītu", "Neitrāli", "Daļēji piekrītu", "Noteikti nepiekrītu") ~ "Other opinion", TRUE ~ Family_informed ), Family_informed = factor(Family_informed, levels = c("Strongly agree", "Other opinion") ) ) ``` ```{r} df <- df |> mutate( Family_pressure = as.character(Family_pressure), Family_pressure = case_when( Family_pressure == "Noteikti nepiekrītu" ~ "Strongly disagree", Family_pressure == "Daļēji nepiekrītu" ~ "Somewhat disagree", Family_pressure == "Neitrāli" ~ "Neutral", Family_pressure %in% c("Daļēji piekrītu", "Noteikti piekrītu") ~ "Other opinion", TRUE ~ Family_pressure ), Family_pressure = factor(Family_pressure, levels = c("Strongly disagree", "Somewhat disagree", "Neutral", "Other opinion") ) ) ``` ```{r} df <- df |> mutate( Doctor_pressure = as.character(Doctor_pressure), Doctor_pressure = case_when( Doctor_pressure == "Noteikti nepiekrītu" ~ "Strongly disagree", Doctor_pressure == "Daļēji nepiekrītu" ~ "Somewhat disagree", Doctor_pressure == "Neitrāli" ~ "Neutral", Doctor_pressure %in% c("Daļēji piekrītu", "Noteikti piekrītu") ~ "Other opinion", TRUE ~ Doctor_pressure ), Doctor_pressure = factor(Doctor_pressure, levels = c("Strongly disagree", "Somewhat disagree", "Neutral", "Other opinion") ) ) ``` ```{r} df <- df |> mutate( Financial_tracking = as.character(Financial_tracking), Financial_tracking = case_when( Financial_tracking %in% c("Noteikti nepiekrītu", "Daļēji nepiekrītu") ~ "Other opinion", Financial_tracking == "Neitrāli" ~ "Neutral", Financial_tracking == "Daļēji piekrītu" ~ "Somewhat agree", Financial_tracking == "Noteikti piekrītu" ~ "Strongly agree", TRUE ~ Financial_tracking ), Financial_tracking = factor(Financial_tracking, levels = c("Strongly agree", "Somewhat agree", "Neutral", "Other opinion") ) ) ``` ```{r} df <- df |> mutate( Extra_budget_money = as.character(Extra_budget_money), Extra_budget_money = case_when( 
Extra_budget_money == "Daļēji piekrītu" ~ "Somewhat agree", TRUE ~ "Other opinion" ), Extra_budget_money = factor(Extra_budget_money, levels = c("Somewhat agree", "Other opinion") ) ) ``` ```{r} df <- df |> mutate( Psych_emotional_symptoms = case_when( Psych_emotional_symptoms == "Nē" ~ "No", Psych_emotional_symptoms %in% c("Jā", "Nevēlos atbildēt") ~ "Yes or skipped", TRUE ~ Psych_emotional_symptoms ), Psych_emotional_symptoms = factor(Psych_emotional_symptoms, levels = c("No", "Yes or skipped") ) ) ``` ### Transform variables: Doctor perception ```{r} df <- df |> mutate( Aligns_12mo = case_when( Aligns_12mo == "Jā" ~ "Yes", Aligns_12mo == "Nē" ~ "No", TRUE ~ Aligns_12mo ), Aligns_12mo = factor(Aligns_12mo, levels = c("Yes", "No")) ) ``` Fits_6mo and Fits_12mo (merge rare levels into “Poor fit” ```{r} fit_levels_good <- "Kape pieguļ pilnībā visos sekstantos; nav vērojamās atstarpes starp kapes materiālu un zobu virsmām" fit_levels_ok <- "Kape pieguļ pilnībā/vērojamā neliela atstarpe starp kapi un 1/2 zobiem, kuri atrodas vienā/divos blakussekstantos." 
df <- df |> mutate( Fits_6mo = case_when( Fits_6mo == fit_levels_good ~ "Perfect fit", Fits_6mo == fit_levels_ok ~ "Slight gap (1–2 teeth)", TRUE ~ "Poor fit" ), Fits_12mo = case_when( Fits_12mo == fit_levels_good ~ "Perfect fit", Fits_12mo == fit_levels_ok ~ "Slight gap (1–2 teeth)", TRUE ~ "Poor fit" ), Fits_6mo = factor(Fits_6mo, levels = c("Perfect fit", "Slight gap (1–2 teeth)", "Poor fit")), Fits_12mo = factor(Fits_12mo, levels = c("Perfect fit", "Slight gap (1–2 teeth)", "Poor fit")) ) ``` Patient_missed_6mo and \_12mo ```{r} df <- df |> mutate( Patient_missed_6mo = case_when( Patient_missed_6mo == "Nekad" ~ "Never", TRUE ~ "Missed appointments" ), Patient_missed_12mo = case_when( Patient_missed_12mo == "Nekad" ~ "Never", TRUE ~ "Missed appointments" ), Patient_missed_6mo = factor(Patient_missed_6mo, levels = c("Never", "Missed appointments")), Patient_missed_12mo = factor(Patient_missed_12mo, levels = c("Never", "Missed appointments")) ) ``` Parents_involved_6mo and \_12mo ```{r} df <- df |> mutate( Parents_involved_6mo = case_when( Parents_involved_6mo == "Nekad" ~ "Never", TRUE ~ "Sometimes involved" ), Parents_involved_12mo = case_when( Parents_involved_12mo == "Nekad" ~ "Never", TRUE ~ "Sometimes involved" ), Parents_involved_6mo = factor(Parents_involved_6mo, levels = c("Never", "Sometimes involved")), Parents_involved_12mo = factor(Parents_involved_12mo, levels = c("Never", "Sometimes involved")) ) ``` Shows_interest_6mo and \_12mo ```{r} df <- df |> mutate( Shows_interest_6mo = case_when( Shows_interest_6mo == "Vienmēr" ~ "Always", Shows_interest_6mo == "Bieži" ~ "Often", TRUE ~ "Sometimes" ), Shows_interest_12mo = case_when( Shows_interest_12mo == "Vienmēr" ~ "Always", Shows_interest_12mo == "Bieži" ~ "Often", TRUE ~ "Sometimes" ), Shows_interest_6mo = factor(Shows_interest_6mo, levels = c("Always", "Often", "Sometimes")), Shows_interest_12mo = factor(Shows_interest_12mo, levels = c("Always", "Often", "Sometimes")) ) ``` Poor_hygiene_6mo and \_12mo 
```{r} df <- df |> mutate( Poor_hygiene_6mo = case_when( Poor_hygiene_6mo == "Nekad" ~ "Never", TRUE ~ "Sometimes" ), Poor_hygiene_12mo = case_when( Poor_hygiene_12mo == "Nekad" ~ "Never", TRUE ~ "Sometimes" ), Poor_hygiene_6mo = factor(Poor_hygiene_6mo, levels = c("Never", "Sometimes")), Poor_hygiene_12mo = factor(Poor_hygiene_12mo, levels = c("Never", "Sometimes")) ) ``` Uses_rubberbands_6mo and \_12mo ```{r} df <- df |> mutate( Uses_rubberbands_6mo = case_when( Uses_rubberbands_6mo == "Pacientam šobrīd nav paredzēta ārstēšana ar elastīgam gumijam" ~ "Not prescribed", Uses_rubberbands_6mo %in% c("Vienmēr", "Bieži") ~ "Uses regularly", TRUE ~ "Rarely/never" ), Uses_rubberbands_12mo = case_when( Uses_rubberbands_12mo == "Pacientam šobrīd nav paredzēta ārstēšana ar elastīgam gumijam" ~ "Not prescribed", Uses_rubberbands_12mo %in% c("Vienmēr", "Bieži") ~ "Uses regularly", TRUE ~ "Rarely/never" ), Uses_rubberbands_6mo = factor(Uses_rubberbands_6mo, levels = c("Not prescribed", "Uses regularly", "Rarely/never")), Uses_rubberbands_12mo = factor(Uses_rubberbands_12mo, levels = c("Not prescribed", "Uses regularly", "Rarely/never")) ) ``` Shows_dissatisfaction_6mo and \_12mo ```{r} df <- df |> mutate( Shows_dissatisfaction_6mo = case_when( Shows_dissatisfaction_6mo == "Nekad" ~ "Never", TRUE ~ "Sometimes" ), Shows_dissatisfaction_12mo = case_when( Shows_dissatisfaction_12mo == "Nekad" ~ "Never", TRUE ~ "Sometimes" ), Shows_dissatisfaction_6mo = factor(Shows_dissatisfaction_6mo, levels = c("Never", "Sometimes")), Shows_dissatisfaction_12mo = factor(Shows_dissatisfaction_12mo, levels = c("Never", "Sometimes")) ) ``` # MODELLING ```{r} # prepare the data from the 01_data_clean fit_levels_good <- "Kape pieguļ pilnībā visos sekstantos; nav vērojamās atstarpes starp kapes materiālu un zobu virsmām" fit_levels_ok <- "Kape pieguļ pilnībā/vērojamā neliela atstarpe starp kapi un 1/2 zobiem, kuri atrodas vienā/divos blakussekstantos." 
``` # TABLE 1 ```{r} df <- df |> rowid_to_column("id") ``` ```{r} df |> dplyr::select(Gender:BFI_Neuroticism) |> gtsummary::tbl_summary() ``` Save as Table 1 ```{r} read_docx() |> body_add_flextable( df |> select(Gender:BFI_Neuroticism) |> tbl_summary() |> as_flex_table() ) |> print(target = here("figures_tables", "table_1.docx")) ``` ## Motivation ### Table data doctor ```{r} df |> select(Aligns_12mo:Shows_dissatisfaction_12mo) |> gtsummary::tbl_summary() ``` ## Change in position ```{r} df |> mutate(deviation_mm = Real_position_mm - Expected_position_mm) |> ggplot(aes(x = deviation_mm)) + geom_histogram(binwidth = 0.2, fill = "steelblue", color = "white") + geom_vline(xintercept = 0, linetype = "dashed") + labs( title = "Deviation between Real and Expected Position", x = "Deviation (mm)", y = "Number of patients" ) ``` ```{r} df |> mutate( actual_movement = Starting_position_mm - Real_position_mm, expected_movement = Starting_position_mm - Expected_position_mm ) |> ggplot(aes(x = expected_movement, y = actual_movement, color = Starting_position_mm)) + geom_point(alpha = 0.7) + geom_abline(slope = 1, intercept = 0, linetype = "dashed") + labs( title = "Expected vs. Actual Tooth Movement", x = "Expected Movement (mm)", y = "Actual Movement (mm)" ) ``` ```{r} df |> mutate( actual_movement = Starting_position_mm - Real_position_mm, expected_movement = Starting_position_mm - Expected_position_mm ) |> ggplot(aes(x = expected_movement, y = actual_movement, color = BFI_Conscientiousness)) + geom_point(alpha = 0.7) + geom_abline(slope = 1, intercept = 0, linetype = "dashed") + labs( title = "Expected vs. 
Actual Tooth Movement by BFI_Conscientiousness", x = "Expected Movement (mm)", y = "Actual Movement (mm)", color = "Conscientiousness Score" ) + theme_minimal() ``` ```{r} df |> mutate( actual_movement = Starting_position_mm - Real_position_mm, expected_movement = Starting_position_mm - Expected_position_mm ) |> ggplot(aes(x = expected_movement, y = actual_movement, color = BFI_Neuroticism)) + geom_point(alpha = 0.7) + geom_abline(slope = 1, intercept = 0, linetype = "dashed") + labs( title = "Expected vs. Actual Tooth Movement by BFI_Neuroticism", x = "Expected Movement (mm)", y = "Actual Movement (mm)", color = "Neuroticism Score" ) + theme_minimal() ``` # MODELLING Now the question is: what explains the difference between the expected and the actual movement? Neuroticism and conscientiousness personality traits were assessed with the validated Big Five Personality Inventory (BFI). ``` BFI_Conscientiousness ``` ``` BFI_Fake ``` ``` BFI_Neuroticism ``` Patient cooperation was assessed by evaluating the clinical fit of the aligners to the dental arches and by comparing the planned and achieved upper 1^st^ premolar expansion (%). 
``` Fits_6mo ``` ``` Fits_12mo ``` ``` Patient_missed_6mo:Shows_dissatisfaction_12mo ``` **Index of Orthodontic Treatment Need (IOTN)** The higher the score, the worse the patient's orthodontic status. ``` IOTN_AH ``` ``` IOTN_DH ``` The target variable is the difference between ``` Expected_position_mm ``` and ``` Real_position_mm ``` Model the difference between expected and actual movement ## Change from the initial to the final position ```{r} df |> mutate(id = row_number()) |> pivot_longer( cols = c(Starting_position_mm, Real_position_mm), names_to = "Stage", values_to = "Position_mm" ) |> mutate(Stage = factor(Stage, levels = c("Starting_position_mm", "Real_position_mm"))) |> ggplot(aes(x = Stage, y = Position_mm, group = id, color = Gender)) + geom_line(alpha = 0.3, color = "steelblue") + geom_point(size = 1.8) + labs( title = "Actual Tooth Movement from Start to Real Position", x = "Stage", y = "Position (mm)" ) + theme_minimal() ``` ## Preparing variables ```{r} df <- df |> mutate( movement_deviation = Real_position_mm - Expected_position_mm ) ``` ```{r} df |> ggplot(aes(x = movement_deviation)) + geom_histogram(bins = 10) ``` Ensure age and gender will be included ```{r} df <- df |> mutate( Sex = as.factor(Gender), Age = as.numeric(Age) ) ``` ## Model 1: Linear Regression ### For the movement ```{r} m1 <- lm( movement_deviation ~ Age + Sex + Patient_missed_6mo + Shows_dissatisfaction_12mo + IOTN_AH + IOTN_DH + BFI_Conscientiousness + BFI_Neuroticism, data = df ) ``` ```{r} m1 |> gtsummary::tbl_regression() ``` ```{r} plot_model(m1, type = "est", show.values = TRUE, value.offset = 0.3, title = "Predictors of Tooth Movement Deviation", axis.labels = NULL, transform = NULL) + # no transformation: interpret raw units (mm) theme_minimal() ``` ```{r} plot_model( m1, type = "est", show.values = TRUE, value.offset = 0.3, sort.est = TRUE, title = "Predictors of Tooth Movement Deviation", axis.labels = c( "Neuroticism", "Conscientiousness", "IOTN DH", "IOTN AH", 
"Dissatisfaction at 12 mo [Sometimes]", "Missed appointments at 6 mo", "Sex [Female]", "Age" ), transform = NULL # Keep in original scale (mm deviation) ) + theme_minimal() + theme( plot.title = element_text(size = 14, face = "bold"), axis.title.x = element_text(size = 12), axis.text = element_text(size = 10) ) + labs(x = "Regression Coefficient (mm deviation)", y = NULL) ``` ### Explore BFI_Conscientiousness & BFI_Neuroticism ```{r} df |> ggplot(aes(x = BFI_Neuroticism, y = movement_deviation)) + geom_point(alpha = 0.3) + geom_smooth(method = "loess", se = T, color = "steelblue") + # geom_smooth(method = "lm", se = TRUE, fill = "lightblue", color = "steelblue") + # geom_quantile(quantiles = 0.5, color = "steelblue") + # geom_smooth(method = "gam", formula = y ~ s(x), se = FALSE, color = "purple") + # geom_smooth() + labs( x = "Neuroticism Score", y = "Difference in Expected Movement (mm)", title = "Tooth Movement Deviation by Neuroticism Level" ) + theme_minimal() ``` ```{r} df |> ggplot(aes(x = BFI_Conscientiousness, y = movement_deviation)) + geom_point(alpha = 0.3) + geom_smooth(method = "loess", se = T, color = "steelblue") + labs( x = "Conscientiousness Score", y = "Difference in Expected Movement (mm)", title = "Tooth Movement Deviation by Conscientiousness Level" ) + theme_minimal() ``` ```{r} check_model(m1) ``` There are some issues:\ - Observed vs Model-Predicted Density: check, maybe robust regression, transforming the outcome, or checking outliers? - check cases 24, 30, 38, 47) - Consider robust regression or bootstrapping if inference is sensitive. 
### Check outliers ```{r} # df |> slice(c(24, 30, 38, 47)) ``` ## Multinomial models ```{r} pacman::p_load(nnet) ``` ### Model for Fits 6 and ```{r} df$Fits_6mo <- factor(df$Fits_6mo, levels = c("Perfect fit", "Slight gap (1–2 teeth)", "Poor fit")) df$Sex <- as.factor(df$Sex) df$Patient_missed_6mo <- as.factor(df$Patient_missed_6mo) df$Shows_dissatisfaction_12mo <- as.factor(df$Shows_dissatisfaction_12mo) ``` ```{r} m_fit6 <- multinom( Fits_6mo ~ Age + Sex + Patient_missed_6mo + Shows_dissatisfaction_12mo + IOTN_AH + IOTN_DH + BFI_Conscientiousness + BFI_Neuroticism, data = df ) ``` ```{r} m_fit6 |> gtsummary::tbl_regression() ``` ```{r} plot_model(m_fit6, type = "est", sort.est = TRUE) ``` ```{r} # m_fit6 |> # tbl_regression( # exponentiate = TRUE, # label = list( # Age ~ "Age", # Sex ~ "Sex", # Patient_missed_6mo ~ "Missed appointments", # Shows_dissatisfaction_12mo ~ "Dissatisfaction (12mo)", # IOTN_AH ~ "IOTN-AH", # IOTN_DH ~ "IOTN-DH", # BFI_Conscientiousness ~ "Conscientiousness", # BFI_Neuroticism ~ "Neuroticism" # ) # ) |> # modify_header(label ~ "**Characteristic**") |> # modify_caption("**Multinomial logistic regression predicting aligner fit at 6 months**") ``` ```{r} tbl_regression(m_fit6, exponentiate = TRUE) |> as_flex_table() |> # or as_gt() flextable::autofit() ``` ```{r} # RUN ONLY ONCE tbl_regression(m_fit6, exponentiate = TRUE) |> as_tibble() |> write_xlsx(path = here("figures_tables", "fit6_model.xlsx")) ``` ### Model for Fit 12 months ```{r} df$Fits_12mo <- factor(df$Fits_12mo, levels = c("Perfect fit", "Slight gap (1–2 teeth)", "Poor fit")) # check order # Ccheck predictors df$Sex <- as.factor(df$Sex) df$Patient_missed_6mo <- as.factor(df$Patient_missed_6mo) df$Shows_dissatisfaction_12mo <- as.factor(df$Shows_dissatisfaction_12mo) ``` ```{r} m_fit12 <- multinom( Fits_12mo ~ Age + Sex + Patient_missed_6mo + Shows_dissatisfaction_12mo + IOTN_AH + IOTN_DH + BFI_Conscientiousness + BFI_Neuroticism, data = df ) ``` ```{r} m_fit12 |> 
  gtsummary::tbl_regression()
```

```{r}
tbl_regression(m_fit12, exponentiate = TRUE) |>
  as_tibble() |>
  write_xlsx(path = here("figures_tables", "fit12_model.xlsx"))
```

## Model 2 Robust regression

```{r}
m_robust <- rlm(
  movement_deviation ~ Age + Sex + BFI_Conscientiousness + BFI_Neuroticism +
    Patient_missed_6mo + Shows_dissatisfaction_12mo + IOTN_AH + IOTN_DH,
  data = df
)
```

```{r}
m_robust |>
  gtsummary::tbl_regression()
```

```{r}
plot_model(m_robust,
           type = "est",
           show.values = TRUE,
           sort.est = TRUE,
           axis.labels = c("Age", "Sex", "Conscientiousness", "Neuroticism",
                           "Missed Appointments", "Dissatisfaction",
                           "IOTN-AH", "IOTN-DH"),
           title = "Robust Regression Coefficients")
```

## Tidymodel

```{r}
model_recipe <- recipe(movement_deviation ~ Age + Sex +
                         BFI_Conscientiousness + BFI_Neuroticism +
                         # Fits_6mo + Fits_12mo +
                         Patient_missed_6mo + Shows_dissatisfaction_12mo +
                         IOTN_AH + IOTN_DH,
                       data = df) |>
  step_dummy(all_nominal_predictors()) |>
  step_normalize(all_numeric_predictors())
```

```{r}
# the model
lin_mod <- linear_reg() |>
  set_engine("lm") |>
  set_mode("regression")
```

```{r}
# the workflow
movement_wf <- workflow() |>
  add_model(lin_mod) |>
  add_recipe(model_recipe)
```

```{r}
# fit the model
movement_fit <- movement_wf |>
  fit(data = df)
```

```{r}
movement_fit |>
  tidy() |>
  arrange(p.value) |>
  knitr::kable(
    digits = 3,
    caption = "Tidy Output of Linear Regression Model (Sorted by p-value)"
  )
```

```{r}
movement_fit |>
  tidy() |>
  filter(term != "(Intercept)") |>
  mutate(
    term = recode(term,
      "BFI_Neuroticism" = "Neuroticism",
      "BFI_Conscientiousness" = "Conscientiousness",
      "IOTN_DH" = "IOTN Dental Health",
      "IOTN_AH" = "IOTN Aesthetic",
      "Age" = "Age",
      "Sex_Sieviešu" = "Sex (Female)",
      "Patient_missed_6mo_Missed.appointments" = "Missed appointments (6 mo)",
      "Shows_dissatisfaction_12mo_Sometimes" = "Dissatisfaction (12 mo)"
    )
  ) |>
  ggplot(aes(x = reorder(term, estimate), y = estimate)) +
  geom_point(size = 3) +
  geom_errorbar(aes(
    # 1.96 * SE: normal-approximation 95% interval, matching the plot title
    ymin = estimate - std.error * 1.96,
    ymax = estimate + std.error * 1.96
  ), width = 0.2) +
  coord_flip() +
  labs(
    title = "Predictors of Tooth Movement Deviation (95% Confidence Interval)",
    x = "Predictor",
    y = "Estimate (mm)"
  ) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
  theme_minimal()
```

```{r}
ggsave(
  filename = here("figures_tables", "predictors_tooth_movement.png"),
  plot = last_plot(),
  width = 10,
  height = 5,
  dpi = 300,
  bg = "transparent"
)
```

Presentation: <https://docs.google.com/presentation/d/1VqYmsV5PSVvMKO3lzGamVfkDsTMCwQqNK47oTP_s-lw/edit#slide=id.p10>

## Effect on subjective measurements

### Fits 6 months

```{r}
df$Fits_6mo <- factor(
  df$Fits_6mo,
  levels = c("Perfect fit", "Slight gap (1–2 teeth)", "Poor fit"),
  ordered = TRUE
)
```

```{r}
# Model 1: BFI_Neuroticism + Gender
model_neurotic_ord <- polr(Fits_6mo ~ BFI_Neuroticism + Gender,
                           data = df, Hess = TRUE)
```

```{r}
model_neurotic_ord |>
  gtsummary::tbl_regression(exponentiate = TRUE)
```

```{r}
# Predicted probabilities from ordinal logistic model
pred_neurotic <- ggeffects::ggpredict(model_neurotic_ord, terms = "BFI_Neuroticism")

# Plot
plot(pred_neurotic) +
  labs(
    title = "Predicted Fit Quality by Neuroticism Score",
    x = "BFI Neuroticism Score",
    y = "Predicted Probability"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Neuroticism_group = cut(
    BFI_Neuroticism,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Neuroticism_group, fill = Fits_6mo)) +
  geom_bar(position = "stack") +
  labs(
    title = "Fit Quality Distribution by Neuroticism Group",
    x = "Neuroticism Level",
    y = "Count",
    fill = "Fit Quality (6 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Neuroticism_group = cut(
    BFI_Neuroticism,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Neuroticism_group, fill = Fits_6mo)) +
  geom_bar(position = "fill") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  scale_y_continuous(labels =
    scales::percent_format()) +
  labs(
    title = "Fit Quality (6 months) Distribution by Neuroticism Group",
    x = "Neuroticism Level",
    y = "Proportion",
    fill = "Fit Quality (6 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Neuroticism_group = cut(
    BFI_Neuroticism,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Neuroticism_group, fill = Fits_12mo)) +
  geom_bar(position = "stack") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  # scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Fit Quality (12 months) Distribution by Neuroticism Group",
    x = "Neuroticism Level",
    y = "Count",
    fill = "Fit Quality (12 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Neuroticism_group = cut(
    BFI_Neuroticism,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Neuroticism_group, fill = Fits_12mo)) +
  geom_bar(position = "fill") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Fit Quality (12 months) Distribution by Neuroticism Group",
    x = "Neuroticism Level",
    y = "Proportion",
    fill = "Fit Quality (12 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Conscientiousness_group = cut(
    BFI_Conscientiousness,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Conscientiousness_group, fill = Fits_6mo)) +
  geom_bar(position = "stack") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  # scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Fit Quality (6 months) Distribution by Conscientiousness Group",
    x = "Conscientiousness Level",
    y = "Count",
    fill = "Fit Quality (6 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Conscientiousness_group = cut(
    BFI_Conscientiousness,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Conscientiousness_group, fill = Fits_6mo)) +
  geom_bar(position = "fill") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Fit Quality (6 months) Distribution by Conscientiousness Group",
    x = "Conscientiousness Level",
    y = "Proportion",
    fill = "Fit Quality (6 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Conscientiousness_group = cut(
    BFI_Conscientiousness,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Conscientiousness_group, fill = Fits_12mo)) +
  geom_bar(position = "stack") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  # scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Fit Quality (12 months) Distribution by Conscientiousness Group",
    x = "Conscientiousness Level",
    y = "Count",
    fill = "Fit Quality (12 months)"
  ) +
  theme_minimal()
```

```{r}
df |>
  mutate(Conscientiousness_group = cut(
    BFI_Conscientiousness,
    breaks = c(0, 40, 50, Inf),
    labels = c("Low", "Moderate", "High"))
  ) |>
  ggplot(aes(x = Conscientiousness_group, fill = Fits_12mo)) +
  geom_bar(position = "fill") +
  scale_fill_viridis_d(option = "D") + # You can try options: "A", "B", "C", "D", "E", "F"
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Fit Quality (12 months) Distribution by Conscientiousness Group",
    x = "Conscientiousness Level",
    y = "Proportion",
    fill = "Fit Quality (12 months)"
  ) +
  theme_minimal()
```
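
The grouped bar charts above repeat the same `cut()` plus `geom_bar()` pattern for each score and time point. A minimal sketch of how that pattern could be wrapped in one function follows. The helper name `plot_fit_by_group()`, its arguments, and the toy data frame are hypothetical and not part of the original analysis; the default cut points simply reuse the 0/40/50 breaks from the chunks above.

```r
library(ggplot2)

# Hypothetical helper (not in the original notebook): bins a numeric score
# into Low/Moderate/High and plots the distribution of an outcome per bin.
plot_fit_by_group <- function(data, score, outcome,
                              position = c("stack", "fill"),
                              breaks = c(0, 40, 50, Inf),
                              labels = c("Low", "Moderate", "High")) {
  position <- match.arg(position)
  # Same binning as the individual chunks: (0, 40], (40, 50], (50, Inf)
  data$score_group <- cut(data[[score]], breaks = breaks, labels = labels)

  p <- ggplot(data, aes(x = score_group, fill = .data[[outcome]])) +
    geom_bar(position = position) +
    scale_fill_viridis_d(option = "D") +
    labs(
      x = paste(score, "level"),
      y = if (position == "fill") "Proportion" else "Count",
      fill = outcome
    ) +
    theme_minimal()

  # Percent axis only makes sense for the proportional ("fill") version
  if (position == "fill") {
    p <- p + scale_y_continuous(labels = scales::percent_format())
  }
  p
}

# Toy data for illustration only; these values are made up.
toy <- data.frame(
  BFI_Neuroticism = c(30, 45, 60, 35),
  Fits_6mo = factor(c("Perfect fit", "Poor fit", "Perfect fit", "Perfect fit"))
)
p <- plot_fit_by_group(toy, "BFI_Neuroticism", "Fits_6mo", position = "fill")
```

With the real data, a call such as `plot_fit_by_group(df, "BFI_Conscientiousness", "Fits_12mo")` would stand in for one of the chunks above, assuming `df` contains those columns.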