This is interesting: children seem to attend mostly at three ages.
3. How many times do children go to the dentist in different age groups?
Now keep one row per unique PAC_ID:
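| AGE | 1 visit | More than 1 visit |
|----:|--------:|------------------:|
| 0 | 39 | 2 |
| 1 | 784 | 71 |
| 2 | 2012 | 290 |
| 3 | 2604 | 886 |
| 4 | 3243 | 2341 |
| 5 | 3847 | 4373 |
| 6 | 4135 | 5865 |
| 7 | 5345 | 7187 |
| 8 | 4395 | 6440 |
| 9 | 4348 | 5748 |
| 10 | 4281 | 5093 |
| 11 | 3845 | 4246 |
| 12 | 3799 | 4087 |
| 13 | 3696 | 4426 |
| 14 | 3747 | 4888 |
| 15 | 3469 | 4829 |
| 16 | 3063 | 4523 |
| 17 | 2785 | 6058 |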
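A minimal sketch of how a table like this can be built, assuming one row per manipulation and treating each distinct (PAC_ID, MAN_DAT) pair as one visit. The toy rows below are invented for illustration and are not taken from the registry; the sketch also assumes a child's AGE is constant across rows.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per manipulation (NOT the real registry)
visits <- tibble::tribble(
  ~PAC_ID, ~MAN_DAT,     ~AGE,
  "a",     "2022-01-10", 6,
  "a",     "2022-01-10", 6,  # second manipulation on the same day: same visit
  "a",     "2022-03-02", 6,
  "b",     "2022-02-01", 6,
  "c",     "2022-05-05", 7
)

visit_groups <- visits |>
  distinct(PAC_ID, MAN_DAT, AGE) |>            # one row per visit
  count(PAC_ID, AGE, name = "Num_Visits") |>   # visits per child
  mutate(Group = if_else(Num_Visits > 1, "More than 1 time", "1")) |>
  count(AGE, Group) |>                         # children per age and group
  pivot_wider(names_from = Group, values_from = n, values_fill = 0)

visit_groups
```

Deduplicating on patient and date first matters: without it, a child with several manipulations in one appointment would be counted as a repeat visitor.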
4. How many patients per clinic?
| AI_KODS | n |
|:--------|--:|
| Less than 500 patients | 38454 |
| 010019111 | 8007 |
| 010020301 | 7186 |
| 010064120 | 6063 |
| 010064521 | 4254 |
| 050024301 | 4224 |
| 010001535 | 4191 |
| 130000045 | 3147 |
| 170020401 | 2339 |
| 010064514 | 2153 |
| 210020301 | 1994 |
| 801000009 | 1977 |
| 210000058 | 1762 |
| 420200053 | 1740 |
| 170064506 | 1735 |
| 270064503 | 1673 |
| 620200014 | 1588 |
| 740200016 | 1576 |
| 400200025 | 1509 |
| 010064301 | 1459 |
| 210000043 | 1318 |
| 900200089 | 1317 |
| 740200059 | 1246 |
| 019277203 | 1127 |
| 270064004 | 1099 |
| 010000343 | 1090 |
| 090000093 | 1053 |
| 010054211 | 1024 |
| 019564503 | 1009 |
| 800600020 | 993 |
| 840200048 | 992 |
| 130024102 | 976 |
| 741400017 | 955 |
| 620200062 | 925 |
| 010064502 | 917 |
| 741400025 | 915 |
| 801600085 | 899 |
| 807600014 | 838 |
| 840200026 | 835 |
| 010001411 | 800 |
| 600200001 | 790 |
| 740200008 | 757 |
| 002000005 | 712 |
| 270000064 | 699 |
| 360800006 | 698 |
| 010001818 | 689 |
| 019164506 | 663 |
| 019464501 | 663 |
| 010064522 | 607 |
| 010001666 | 595 |
| 110000048 | 593 |
| 761200024 | 572 |
| 006000002 | 568 |
| 801200002 | 553 |
| 050000127 | 550 |
| 400200052 | 547 |
| 761200011 | 547 |
| 170077201 | 545 |
| 700200034 | 537 |
| 880200029 | 525 |
| 001000035 | 519 |
| 901200021 | 502 |
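The "Less than 500 patients" row comes from lumping small clinics together. A small sketch of that step with `forcats::fct_lump_min()`, using invented patients and a threshold of 3 instead of 500 so the effect is visible; the clinic codes are reused only as labels and the counts are hypothetical.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: one row per patient (NOT the real registry)
patients <- tibble::tibble(
  PAC_ID  = 1:10,
  AI_KODS = c(rep("010019111", 5), rep("010020301", 3), "130000045", "170020401")
)

per_clinic <- patients |>
  distinct(PAC_ID, .keep_all = TRUE) |>   # one row per patient
  mutate(AI_KODS = fct_lump_min(AI_KODS, min = 3,
                                other_level = "Fewer than 3 patients")) |>
  count(AI_KODS, sort = TRUE)

per_clinic
```

Keeping AI_KODS as character before lumping preserves the leading zeros in the clinic codes.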
4.1. Visits per clinic
PAC_ID = Patient
MAN_DAT = Date of the visit
MP_CODE = Manipulations
| AI_KODS | Total_Visits |
|:--------|-------------:|
| 010019111 | 14750 |
| 010020301 | 14692 |
| 010064120 | 13672 |
| 050024301 | 10146 |
| 010064521 | 9911 |
| 130000045 | 8972 |
| 010001535 | 8680 |
| 210020301 | 5968 |
| 010064514 | 5216 |
| 170020401 | 5046 |
| 270064503 | 4985 |
| 801000009 | 4783 |
| 270064004 | 4696 |
| 170064506 | 4208 |
| 210000058 | 3679 |
| 620200014 | 3679 |
| 420200053 | 3671 |
| 210000043 | 3279 |
| 900200089 | 3203 |
| 740200016 | 3108 |
| 740200059 | 3103 |
| 800600020 | 2958 |
| 380200022 | 2943 |
| 741400025 | 2934 |
| 400200025 | 2855 |
| 010000343 | 2649 |
| 010001881 | 2561 |
| 010064502 | 2438 |
| 019277203 | 2434 |
| 001000035 | 2276 |
| 019464501 | 2244 |
| 010064301 | 2224 |
| 270000064 | 2213 |
| 010054211 | 2177 |
| 090000093 | 2106 |
| 801600085 | 2106 |
| 019564503 | 2059 |
| 010001411 | 2043 |
| 010001818 | 2014 |
| 806077202 | 2014 |
| 840200048 | 2013 |
| 840200026 | 1973 |
| 006000002 | 1927 |
| 741400017 | 1840 |
| 600200001 | 1780 |
| 010064522 | 1677 |
| 130024102 | 1655 |
| 740200008 | 1642 |
| 360800006 | 1593 |
| 620200062 | 1583 |
| 801400003 | 1566 |
| 010001581 | 1488 |
| 380200011 | 1471 |
| 002000005 | 1357 |
| 807600014 | 1344 |
| 019164506 | 1338 |
| 681800001 | 1319 |
| 400200052 | 1272 |
| 110000048 | 1256 |
| 801200002 | 1170 |
| 130064502 | 1156 |
| 901200021 | 1152 |
| 880200029 | 1115 |
| 806900003 | 1107 |
| 800800006 | 1084 |
| 010000427 | 1044 |
| 090000016 | 1035 |
| 900200086 | 1029 |
| 010001666 | 1027 |
| 500200032 | 1014 |
| 700200034 | 1000 |
| 500200006 | 994 |
| 050000127 | 983 |
| 170077201 | 974 |
| 761200024 | 962 |
| 761200011 | 950 |
| 010044004 | 916 |
| 885100007 | 894 |
| 381600004 | 885 |
| 380200037 | 884 |
| 840200079 | 877 |
| 887600002 | 871 |
| 801600087 | 869 |
| 010001873 | 859 |
| 010001714 | 850 |
| 460800005 | 846 |
| 090020301 | 844 |
| 130000014 | 835 |
| 270000014 | 834 |
| 090000079 | 804 |
| 941600013 | 772 |
| 010077210 | 763 |
| 010064545 | 762 |
| 019164502 | 751 |
| 321400001 | 747 |
| 540200007 | 747 |
| 740200001 | 738 |
| 006000005 | 737 |
| 010001914 | 732 |
| 500200010 | 723 |
| 601000004 | 719 |
| 500200063 | 712 |
| 801000013 | 701 |
| 001000031 | 681 |
| 010000832 | 680 |
| 540200010 | 677 |
| 900200056 | 661 |
| 620200018 | 659 |
| 660200027 | 658 |
| 961600005 | 656 |
| 250000169 | 652 |
| 270000024 | 650 |
| 130000098 | 630 |
| 701400004 | 627 |
| 800600008 | 621 |
| 019364501 | 616 |
| 420200056 | 612 |
| 010001913 | 608 |
| 110000023 | 608 |
| 801800014 | 598 |
| 940200032 | 595 |
| 019464004 | 593 |
| 019464507 | 591 |
| 460200019 | 586 |
| 019577201 | 577 |
| 381600005 | 576 |
| 809600010 | 560 |
| 561800004 | 558 |
| 700200067 | 556 |
| 090024101 | 554 |
| 424700004 | 552 |
| 090000113 | 547 |
| 684900003 | 545 |
| 661400003 | 544 |
| 840200025 | 542 |
| 360200066 | 530 |
| 600200018 | 529 |
| 320200002 | 528 |
| 010000104 | 525 |
| 130064002 | 523 |
| 620200008 | 522 |
| 010064111 | 519 |
| 409500011 | 518 |
| 540200004 | 506 |
| 010000178 | 498 |
| 680200006 | 494 |
| 250000030 | 492 |
| 680200023 | 481 |
| 681000002 | 480 |
| 010001157 | 479 |
| 406477201 | 475 |
| 010001409 | 471 |
| 019164058 | 463 |
| 500200020 | 462 |
| 900200087 | 460 |
| 805200007 | 459 |
| 740200096 | 456 |
| 941600023 | 456 |
| 901200014 | 446 |
| 660200040 | 445 |
| 406464501 | 443 |
| 250000041 | 439 |
| 328277201 | 436 |
| 440200002 | 435 |
| 010064544 | 432 |
| 090077206 | 430 |
| 090077202 | 423 |
| 901200007 | 421 |
| 460200040 | 415 |
| 641000005 | 414 |
| 170000154 | 410 |
| 421200007 | 401 |
| 700800012 | 397 |
| 010001978 | 392 |
| 010064013 | 384 |
| 420200080 | 370 |
| 440200012 | 364 |
| 641000007 | 364 |
| 429300002 | 363 |
| 010001054 | 353 |
| 090065208 | 341 |
| 420200059 | 339 |
| 010001043 | 338 |
| 110000003 | 338 |
| 170000197 | 335 |
| 940200018 | 330 |
| 967300003 | 325 |
| 387500004 | 321 |
| 381600009 | 315 |
| 460800012 | 314 |
| 320200023 | 311 |
| 800600019 | 311 |
| 023000003 | 302 |
| 010000319 | 298 |
| 940200007 | 298 |
| 320200009 | 297 |
| 460200002 | 297 |
| 640600015 | 297 |
| 250000042 | 294 |
| 010001180 | 293 |
| 250000040 | 290 |
| 880200055 | 284 |
| 270000003 | 275 |
| 900200088 | 271 |
| 326100006 | 267 |
| 640600008 | 267 |
| 880200080 | 267 |
| 270064507 | 253 |
| 641000001 | 252 |
| 680200007 | 248 |
| 420200030 | 241 |
| 681000014 | 239 |
| 010000095 | 238 |
| 740600009 | 235 |
| 400200032 | 234 |
| 028000001 | 233 |
| 760200015 | 232 |
| 320200041 | 225 |
| 801600054 | 223 |
| 320200037 | 222 |
| 010065217 | 215 |
| 010001408 | 206 |
| 905700001 | 203 |
| 460200039 | 200 |
| 429300010 | 195 |
| 001000008 | 194 |
| 740600007 | 191 |
| 460200038 | 186 |
| 010001897 | 181 |
| 807600024 | 175 |
| 019677203 | 171 |
| 760200019 | 165 |
| 807477201 | 164 |
| 250000045 | 163 |
| 661400007 | 161 |
| 740200015 | 153 |
| 621200004 | 150 |
| 880200036 | 147 |
| 250000046 | 142 |
| 010054114 | 140 |
| 326100005 | 140 |
| 780200012 | 139 |
| 940200019 | 135 |
| 940200006 | 124 |
| 050000017 | 110 |
| 741000006 | 109 |
| 500200012 | 107 |
| 010077222 | 106 |
| 741400016 | 105 |
| 900200082 | 103 |
| 961000003 | 93 |
| 020000002 | 85 |
| 051000003 | 78 |
| 808477201 | 76 |
| 460200023 | 75 |
| 900200058 | 68 |
| 468900002 | 67 |
| 888300008 | 65 |
| 010000232 | 62 |
| 807635202 | 61 |
| 019464518 | 46 |
| 170000046 | 45 |
| 420200068 | 45 |
| 090000057 | 39 |
| 880200056 | 38 |
| 680200030 | 34 |
| 010011803 | 33 |
| 328200003 | 20 |
| NA | 3 |
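A sketch of the visit count behind this table: a visit is taken to be one distinct MAN_DAT per patient per clinic, so several manipulations (MP_CODE) on the same day count once. The clinic labels, patients and the code 70002 below are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: one row per manipulation (NOT the real registry)
records <- tibble::tribble(
  ~AI_KODS,   ~PAC_ID, ~MAN_DAT,     ~MP_CODE,
  "clinic_A", "a",     "2022-01-10", 70001,
  "clinic_A", "a",     "2022-01-10", 70002,  # same day, same visit
  "clinic_A", "a",     "2022-02-15", 70001,
  "clinic_A", "b",     "2022-01-10", 70001,
  "clinic_B", "a",     "2022-03-01", 70001
)

visits_per_clinic <- records |>
  group_by(AI_KODS, PAC_ID) |>
  summarise(Visits = n_distinct(MAN_DAT), .groups = "drop") |>  # visits per patient per clinic
  group_by(AI_KODS) |>
  summarise(Total_Visits = sum(Visits)) |>                      # visits per clinic
  arrange(desc(Total_Visits))

visits_per_clinic
```

Because this counts visits rather than unique patients, and does not lump small clinics, the table is longer than the patients-per-clinic table and its totals are larger.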
5. How many patients per region?
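| PAC_ATVK | n |
|:---------|--:|
| Less than 500 patients | 40279 |
| 10093 | 10861 |
| 10094 | 10852 |
| 10092 | 7303 |
| 10095 | 6119 |
| 2000 | 5374 |
| 5000 | 4669 |
| 10096 | 4018 |
| 3000 | 3860 |
| 6000 | 3098 |
| 4000 | 3007 |
| 7000 | 2602 |
| 40010 | 2093 |
| 39410 | 1571 |
| 45200 | 1346 |
| 31010 | 1337 |
| 44420 | 1295 |
| 10091 | 1293 |
| 52210 | 1208 |
| 34420 | 1182 |
| 54010 | 1118 |
| 26200 | 1111 |
| 39200 | 1100 |
| 48200 | 1059 |
| NA | 1053 |
| 41400 | 973 |
| 39400 | 882 |
| 25200 | 857 |
| 46210 | 853 |
| 41200 | 781 |
| 40200 | 715 |
| 23400 | 696 |
| 33200 | 680 |
| 44400 | 635 |
| 34210 | 621 |
| 23410 | 611 |
| 40220 | 606 |
| 37210 | 526 |
| 24200 | 518 |
| 36200 | 515 |
| 51220 | 510 |
| 44410 | 503 |
| 29200 | 500 |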
6. Why are the children coming?
| IEMESLS | n |
|:--------|--:|
| Regulāru apskati (regular check-up) | 1530903 |
| NA | 178893 |
| Akūtām sāpēm (acute pain) | 167247 |
| Traumu (trauma) | 14322 |
7. Extract the month from MAN_DAT
8. Reasons of visit per month
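| IEMESLS | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|:--------|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| Regulāru apskati | 19008 | 17999 | 23556 | 20277 | 22314 | 21213 | 19081 | 23397 | 22967 | 23318 | 24165 | 19270 |
| Akūtām sāpēm | 1960 | 1922 | 2167 | 1978 | 2205 | 1994 | 1946 | 2476 | 2559 | 2352 | 2390 | 2230 |
| NA | 1649 | 1746 | 2403 | 1953 | 2353 | 2360 | 2071 | 2853 | 2485 | 2533 | 2716 | 2285 |
| Traumu | 133 | 138 | 193 | 169 | 174 | 160 | 187 | 183 | 217 | 190 | 171 | 157 |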
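A rough sketch of how a month-by-reason table of this shape can be produced: derive the month from MAN_DAT, count distinct patient-date pairs per reason, then spread months into columns. The rows and the English reason labels are invented for illustration; the month is extracted with base `format()` here rather than lubridate.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per manipulation (NOT the real registry)
records <- tibble::tibble(
  PAC_ID  = c("a", "a", "b", "c", "c"),
  MAN_DAT = as.Date(c("2022-01-10", "2022-01-10", "2022-01-20",
                      "2022-02-03", "2022-02-03")),
  IEMESLS = c("check-up", "check-up", "pain", "check-up", "check-up")
)

by_month <- records |>
  mutate(MONTH = as.integer(format(MAN_DAT, "%m"))) |>
  distinct(MONTH, IEMESLS, PAC_ID, MAN_DAT) |>       # one row per visit and reason
  count(MONTH, IEMESLS, name = "Visits") |>
  pivot_wider(names_from = MONTH, values_from = Visits)

by_month
```

A reason that never occurs in a given month ends up as NA in that column rather than 0, unless `values_fill = 0` is passed to `pivot_wider()`.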
REMOVE OBJECT zobarst
HYGIENISTS
Filter only Zobu higiēnists
1. How many children visit 1, 2, or more times?
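| Visits | n |
|:-------|--:|
| 1 | 98086 |
| 2 | 12210 |
| 3 | 857 |
| 4 | 70 |
| 5 | 1 |
| 6 | 1 |

| Visits | n |
|:-------|--:|
| 1 | 98086 |
| 2 | 12210 |
| 3 | 857 |
| 4 | 70 |
| > 4 visits | 2 |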
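The second table is the first one with the tail collapsed into a single bin. A short sketch of that step, starting from the counts shown above (the intermediate object names are illustrative):

```r
library(dplyr)

# Counts copied from the first hygienist table above
visit_counts <- tibble::tibble(
  Visits = 1:6,
  n      = c(98086, 12210, 857, 70, 1, 1)
)

binned <- visit_counts |>
  mutate(Visits = if_else(Visits > 4, "> 4 visits", as.character(Visits))) |>
  group_by(Visits) |>
  summarise(n = sum(n))   # rows sharing a label are added together

binned
```

The row order of the result depends on how the session's locale sorts the ">" label, so it may not match the rendered table exactly.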
2. How many children per age?
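| AGE | n | percent |
|----:|--:|--------:|
| 1 | 124 | 0.1% |
| 2 | 1755 | 1.6% |
| 3 | 4256 | 3.8% |
| 4 | 5448 | 4.9% |
| 5 | 6468 | 5.8% |
| 6 | 7073 | 6.4% |
| 7 | 8908 | 8.0% |
| 8 | 9029 | 8.1% |
| 9 | 7993 | 7.2% |
| 10 | 7692 | 6.9% |
| 11 | 7459 | 6.7% |
| 12 | 7812 | 7.0% |
| 13 | 8286 | 7.4% |
| 14 | 8114 | 7.3% |
| 15 | 7535 | 6.8% |
| 16 | 6490 | 5.8% |
| 17 | 6783 | 6.1% |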
This is interesting: children seem to attend mostly at three ages.
3. How many times do children go to the hygienist in different age groups?
Now keep one row per unique PAC_ID:
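| AGE | 1 visit | More than 1 visit |
|----:|--------:|------------------:|
| 1 | 123 | 1 |
| 2 | 1716 | 39 |
| 3 | 4039 | 217 |
| 4 | 5186 | 262 |
| 5 | 6112 | 356 |
| 6 | 6662 | 411 |
| 7 | 7329 | 1579 |
| 8 | 7188 | 1841 |
| 9 | 7290 | 703 |
| 10 | 7026 | 666 |
| 11 | 6149 | 1310 |
| 12 | 5936 | 1876 |
| 13 | 6807 | 1479 |
| 14 | 7408 | 706 |
| 15 | 6918 | 617 |
| 16 | 6029 | 461 |
| 17 | 6168 | 615 |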
4. How many patients per clinic?
| AI_KODS | n |
|:--------|--:|
| Less than 500 patients | 22497 |
| 420200053 | 8016 |
| 010019111 | 6876 |
| 010020301 | 6654 |
| 010064120 | 5351 |
| 050024301 | 3354 |
| 740200059 | 3325 |
| 010064521 | 3251 |
| 010001535 | 3183 |
| 010064514 | 2968 |
| 130000045 | 2857 |
| 210000058 | 2668 |
| 620200014 | 2021 |
| 090000057 | 1894 |
| 170020401 | 1805 |
| 210000043 | 1770 |
| 250000169 | 1718 |
| 010064301 | 1576 |
| 170064506 | 1555 |
| 400200025 | 1465 |
| 761200024 | 1354 |
| 019277203 | 1319 |
| 270064503 | 1195 |
| 090024001 | 1172 |
| 019564503 | 1111 |
| 019164506 | 1103 |
| 807600014 | 1062 |
| 801600085 | 1016 |
| 028000001 | 1007 |
| 840200048 | 960 |
| 805200007 | 948 |
| 010054211 | 934 |
| 760200003 | 846 |
| 900200089 | 845 |
| 620200062 | 804 |
| 010000343 | 759 |
| 800600020 | 713 |
| 320200041 | 694 |
| 880200029 | 690 |
| 010001873 | 689 |
| 010064502 | 671 |
| 010001818 | 657 |
| 741400025 | 638 |
| 801400003 | 626 |
| 840200026 | 623 |
| 130024102 | 614 |
| 900200006 | 610 |
| 010000832 | 603 |
| 460200002 | 582 |
| 400200052 | 548 |
| 170077202 | 513 |
| 740200016 | 511 |
| NA | 4 |
4.1. Visits per clinic
PAC_ID = Patient
MAN_DAT = Date of the visit
MP_CODE = Manipulations
| AI_KODS | Total_Visits |
|:--------|-------------:|
| 420200053 | 8703 |
| 010019111 | 7756 |
| 010020301 | 7343 |
| 010064120 | 6156 |
| 740200059 | 4026 |
| 010064521 | 3943 |
| 050024301 | 3642 |
| 010001535 | 3459 |
| 010064514 | 3330 |
| 210000058 | 3143 |
| 130000045 | 3139 |
| 620200014 | 2316 |
| 090000057 | 2273 |
| 210000043 | 2027 |
| 170020401 | 1954 |
| 250000169 | 1844 |
| 400200025 | 1757 |
| 170064506 | 1724 |
| 010064301 | 1701 |
| 090024001 | 1554 |
| 019277203 | 1473 |
| 761200024 | 1465 |
| 270064503 | 1281 |
| 807600014 | 1240 |
| 019564503 | 1237 |
| 019164506 | 1197 |
| 801600085 | 1182 |
| 840200048 | 1102 |
| 010054211 | 1095 |
| 028000001 | 1090 |
| 805200007 | 1035 |
| 900200089 | 999 |
| 620200062 | 922 |
| 760200003 | 920 |
| 010000343 | 825 |
| 741400025 | 787 |
| 800600020 | 778 |
| 010001873 | 765 |
| 801400003 | 740 |
| 320200041 | 738 |
| 880200029 | 738 |
| 010064502 | 726 |
| 010001818 | 712 |
| 460200002 | 703 |
| 840200026 | 669 |
| 010000832 | 658 |
| 900200006 | 656 |
| 130024102 | 641 |
| 641000022 | 638 |
| 400200052 | 624 |
| 806900003 | 619 |
| 740200016 | 587 |
| 019464507 | 576 |
| 661400001 | 558 |
| 170077202 | 556 |
| 090000016 | 529 |
| 660200027 | 523 |
| 170077201 | 512 |
| 360200066 | 512 |
| 010044004 | 507 |
| 210020301 | 505 |
| 360800006 | 476 |
| 170000197 | 475 |
| 620200018 | 471 |
| 050000127 | 454 |
| 680200030 | 452 |
| 090077202 | 442 |
| 270000064 | 432 |
| 010064522 | 423 |
| 800800006 | 419 |
| 700200067 | 414 |
| 901200007 | 405 |
| 801600087 | 394 |
| 741400017 | 391 |
| 019364501 | 377 |
| 961600005 | 350 |
| 840200079 | 348 |
| 420200080 | 341 |
| 010001978 | 340 |
| 600200001 | 329 |
| 801800014 | 328 |
| 420200059 | 324 |
| 740200008 | 323 |
| 010077210 | 321 |
| 019464004 | 321 |
| 010064111 | 315 |
| 010001913 | 309 |
| 130000098 | 289 |
| 001000031 | 288 |
| 941600023 | 286 |
| 801200002 | 283 |
| 130064502 | 281 |
| 641600001 | 278 |
| 090024101 | 273 |
| 019164502 | 269 |
| 090065208 | 269 |
| 010000319 | 262 |
| 010001714 | 262 |
| 010001914 | 248 |
| 420200030 | 247 |
| 800600019 | 247 |
| 020000002 | 239 |
| 801000013 | 239 |
| 010001408 | 238 |
| 010064545 | 236 |
| 010001897 | 198 |
| 270064004 | 197 |
| 500200063 | 196 |
| 010064013 | 191 |
| 010001411 | 189 |
| 090000093 | 187 |
| 010001054 | 181 |
| 320200002 | 180 |
| 010001581 | 177 |
| 010001180 | 168 |
| 010064544 | 168 |
| 270000014 | 167 |
| 006000005 | 164 |
| 010001409 | 160 |
| 051000003 | 158 |
| 326100006 | 156 |
| 010000427 | 154 |
| 010001157 | 148 |
| 740600007 | 138 |
| 800600008 | 138 |
| 621200004 | 132 |
| 700800012 | 132 |
| 429300010 | 126 |
| 010000095 | 124 |
| 010065217 | 121 |
| 250000041 | 120 |
| 880200080 | 97 |
| 888300008 | 91 |
| 840200025 | 88 |
| 010001043 | 86 |
| 328200003 | 72 |
| 940200019 | 72 |
| 680200006 | 70 |
| 019164058 | 68 |
| 328277201 | 68 |
| 740200001 | 63 |
| 001000008 | 61 |
| 170000046 | 56 |
| 905700001 | 53 |
| 090077206 | 42 |
| 420200068 | 37 |
| 002000005 | 34 |
| 010054114 | 25 |
| 601000001 | 11 |
| NA | 4 |
5. How many patients per region?
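| PAC_ATVK | n |
|:---------|--:|
| Less than 500 patients | 35185 |
| 10094 | 10089 |
| 10093 | 8710 |
| 10092 | 6316 |
| 10095 | 5592 |
| 5000 | 4576 |
| 3000 | 3467 |
| 2000 | 3465 |
| 10096 | 3352 |
| 6000 | 2834 |
| 40010 | 2340 |
| 4000 | 1969 |
| 39410 | 1666 |
| 54010 | 1610 |
| 7000 | 1364 |
| 10091 | 1353 |
| 48200 | 1331 |
| 26200 | 1196 |
| 39200 | 1168 |
| 44420 | 1067 |
| 52210 | 1037 |
| 34420 | 990 |
| 45200 | 969 |
| 39400 | 945 |
| 25200 | 844 |
| 23400 | 786 |
| NA | 784 |
| 33200 | 724 |
| 44400 | 718 |
| 46210 | 706 |
| 23410 | 677 |
| 31010 | 623 |
| 40200 | 603 |
| 34210 | 593 |
| 28210 | 565 |
| 36200 | 507 |
| 41400 | 504 |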
8. Reasons of visit per month
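| IEMESLS | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|:--------|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| Regulāru apskati | 7280 | 6903 | 10201 | 8716 | 9975 | 8550 | 7095 | 10406 | 9720 | 10248 | 10276 | 8207 |
| NA | 708 | 757 | 964 | 863 | 948 | 1057 | 956 | 1447 | 1132 | 1295 | 1047 | 805 |
| Akūtām sāpēm | 268 | 335 | 505 | 444 | 380 | 635 | 467 | 672 | 578 | 524 | 543 | 451 |
| Traumu | 1 | 2 | 4 | 2 | NA | 3 | NA | 3 | 3 | 1 | 1 | 1 |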
REMOVE OBJECT hygienist
ALL 8
8. Reasons of visit per month
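| IEMESLS | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|:--------|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|
| Regulāru apskati | 27326 | 25841 | 34524 | 30038 | 32890 | 30732 | 27175 | 34888 | 33283 | 34478 | 35122 | 28253 |
| Akūtām sāpēm | 2368 | 2403 | 2879 | 2549 | 2713 | 2813 | 2579 | 3286 | 3275 | 3007 | 3066 | 2813 |
| NA | 2350 | 2459 | 3328 | 2791 | 3273 | 3357 | 2934 | 4169 | 3513 | 3735 | 3716 | 2990 |
| Traumu | 138 | 141 | 198 | 172 | 176 | 165 | 189 | 187 | 222 | 192 | 172 | 158 |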
2024 April 11
The following data questions:
1. Check whether we have left much data unanalysed:
How many rows have no SPEC_KODS data?

| NA_Count |
|---------:|
| 0 |

2. Check how many visits there are: how many times code 70001 appears (variable MP_CODE)
[1] 447849
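Both checks are simple counts. A minimal sketch on invented rows (the specialist labels and the code 70002 are hypothetical) showing one detail worth keeping in mind: `filter()` silently drops rows where MP_CODE is NA, so missing codes are never counted as 70001.

```r
library(dplyr)

# Hypothetical toy data (NOT the real registry)
records <- tibble::tibble(
  SPEC_KODS = c("dentist", NA, "hygienist", "dentist"),
  MP_CODE   = c(70001, 70001, 70002, NA)
)

# Rows with no specialist code
na_count <- records |>
  summarise(NA_Count = sum(is.na(SPEC_KODS)))

# Rows carrying code 70001; the NA row is dropped by filter()
n_70001 <- records |>
  filter(MP_CODE == 70001) |>
  nrow()

na_count
n_70001
```

Note that this counts rows with the code, which equals the number of visits only if the code is recorded at most once per visit.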
Source Code
---
title: "NVD IEVA FLPP 2024"
date-modified: last-modified
format:
  html:
    toc: true
    toc-expand: 3
    code-fold: true
    code-tools: true
editor: visual
execute:
  echo: false
  cache: false
  warning: false
  message: false
---

# Packages

```{r}
# Load required libraries with pacman; installs them if not already installed
pacman::p_load(
  tidyverse, # tools for data science
  scales,
  readr,
  naniar,    # NAs
  janitor,   # for data cleaning and tables
  here,      # for reproducible research
  gtsummary  # for tables
)
```

Set theme

```{r}
theme_set(theme_minimal())
```

# Load the data

```{r}
MAVITO_ZPN_MAN_UD1_B <- read_rds(here("analysis", "data", "MAVITO_ZPN_MAN_UD1_B.rds")) |>
  filter(!is.na(SEX_ID))
```

### Remove P02 and n25 from SPEC_KODS

```{r}
# Rows with a missing SPEC_KODS are dropped here as well
MAVITO_ZPN_MAN_UD1_B <- MAVITO_ZPN_MAN_UD1_B |>
  filter(!str_detect(SPEC_KODS, "P02|n25"))
```

### Recode Sex_ID

```{r}
MAVITO_ZPN_MAN_UD1_B <- MAVITO_ZPN_MAN_UD1_B |>
  mutate(SEX_ID = recode(SEX_ID, "1" = "Male", "2" = "Female"))
```

### Eliminate Vecums, and change vecums_2 to Age

```{r}
MAVITO_ZPN_MAN_UD1_B <- MAVITO_ZPN_MAN_UD1_B |>
  select(-VECUMS, "AGE" = "VECUMS_2")
```

### Add the MONTH

```{r}
MAVITO_ZPN_MAN_UD1_B <- MAVITO_ZPN_MAN_UD1_B |>
  mutate(MONTH = lubridate::month(MAN_DAT)) |>
  relocate(MONTH, .after = MAN_DAT)
```

Merge 18 with 17 years old

```{r}
MAVITO_ZPN_MAN_UD1_B <- MAVITO_ZPN_MAN_UD1_B %>%
  mutate(AGE = ifelse(AGE > 17, 17, AGE))
```

# EDA

## NAs

```{r}
MAVITO_ZPN_MAN_UD1_B |>
  naniar::gg_miss_var(show_pct = TRUE) +
  labs(title = "Missing data")
```

Explore the pattern of missingness

```{r}
# gg_miss_upset(MAVITO_ZPN_MAN_UD1_B)
```

## EDA

```{r}
dim(MAVITO_ZPN_MAN_UD1_B)
```

### How many unique patients?

```{r}
n_distinct(MAVITO_ZPN_MAN_UD1_B$PAC_ID) |>
  knitr::kable()
```

There are 356,864 children aged 0 to 17 in Latvia.

Source: <https://data.stat.gov.lv/pxweb/lv/OSP_PUB/START__POP__IR__IRD/IRD041>

So, this represents (in percent)

```{r}
n_distinct(MAVITO_ZPN_MAN_UD1_B$PAC_ID) / 356864 * 100
```

## How many girls and boys

```{r}
MAVITO_ZPN_MAN_UD1_B |>
  tabyl(SEX_ID) |>
  adorn_pct_formatting() |>
  knitr::kable()
```

## Age

```{r}
MAVITO_ZPN_MAN_UD1_B |>
  ggplot(aes(x = AGE)) +
  geom_histogram(bins = 10) +
  labs(title = "Age Distribution")
```

```{r}
# Calculate median ages for each SEX_ID
median_ages <- MAVITO_ZPN_MAN_UD1_B |>
  group_by(SEX_ID) |>
  summarise(median_age = median(AGE, na.rm = TRUE))

# Plotting
MAVITO_ZPN_MAN_UD1_B |>
  filter(!is.na(SEX_ID)) |>
  ggplot(aes(x = AGE)) +
  geom_histogram(bins = 10) +
  facet_grid(SEX_ID ~ .) +
  geom_vline(
    data = median_ages,
    aes(xintercept = median_age, color = as.factor(SEX_ID)),
    linetype = "dashed"
  ) +
  scale_color_discrete(name = "SEX_ID Median") +
  theme_minimal() +
  labs(x = "Age", y = "Count", title = "Histogram of Ages by SEX_ID with Median Lines")
```

```{r}
rm(median_ages)
```

### How many specialist attentions PER patient?

```{r}
# Note: n_distinct(PAC_ID) counts unique patients per specialist, not visits
MAVITO_ZPN_MAN_UD1_B |>
  group_by(SPEC_KODS) |>
  summarise(visits_count = n_distinct(PAC_ID)) |>
  arrange(desc(visits_count)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B |>
  filter(!is.na(SPEC_KODS)) |>
  group_by(SPEC_KODS) |>
  summarise(visits_count = n_distinct(PAC_ID)) |>
  arrange(desc(visits_count)) |>
  ggplot(aes(x = fct_reorder(SPEC_KODS, visits_count), y = visits_count)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = label_number()) +
  labs(title = "Visits by specialist", x = "", y = "Visits")
```

# FINAL ANALYSIS

## ZOBARSTS AND BERNU ZOBARSTS

### Filter only Zobarsts and Bernu Z

```
Zobārsts
Bērnu zobārsts
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst <- MAVITO_ZPN_MAN_UD1_B |>
  filter(str_detect(SPEC_KODS, "Zobārsts|Bērnu zobārsts"))
```

### 1. How many children visit 1, 2, or more times?

```{r}
# unique visits
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Visits = n()) |>
  ungroup() |>
  # how many PAC_IDs had exactly 1 visit, 2 visits, etc.
  group_by(Visits) |>
  summarise(n = n()) |>
  # plot
  ggplot(aes(x = Visits, y = n)) +
  geom_col()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Visits = n()) |>
  # how many PAC_IDs had exactly 1 visit, 2 visits, etc.
  group_by(Visits) |>
  summarise(n = n()) |>
  mutate(Visits = if_else(Visits > 9, "> 9 visits", as.character(Visits))) |>
  group_by(Visits) |>
  summarise(n = sum(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Visits = n()) |>
  # how many PAC_IDs had exactly 1 visit, 2 visits, etc.
  group_by(Visits) |>
  summarise(n = n()) |>
  mutate(Visits = if_else(Visits > 4, "> 4 visits", as.character(Visits))) |>
  group_by(Visits) |>
  summarise(n = sum(n)) |>
  knitr::kable()
```

### 2. How many children per age?

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  tabyl(AGE) |>
  adorn_pct_formatting() |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  ggplot(aes(x = AGE)) +
  geom_histogram(bins = 16) +
  labs(title = "Age (unique patients)", y = "N")
```

This is interesting: children seem to attend mostly at three ages.

### 3. How many times do children go to the dentist in different age groups?

```{r}
# Count the number of visits for each child
child_visit_counts <- MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Num_Visits = n(), .groups = "drop") # Count visits
```

```{r}
# join the visit counts with the MAVITO_ZPN_MAN_UD1_B_zobarst dataset
MAVITO_ZPN_MAN_UD1_B_zobarst <- MAVITO_ZPN_MAN_UD1_B_zobarst |>
  left_join(child_visit_counts, by = "PAC_ID")
```

```{r}
rm(child_visit_counts)
```

Now keep one row per unique PAC_ID

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  mutate(Num_Visits = if_else(Num_Visits > 1, "More than 1 time", as.character(Num_Visits))) |>
  tabyl(AGE, Num_Visits) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  mutate(Num_Visits = if_else(Num_Visits > 1, "Multiple visits", "One visit")) |>
  tabyl(AGE, Num_Visits) |>
  pivot_longer(!AGE, names_to = "Group", values_to = "n") |>
  ggplot(aes(x = AGE, y = n, fill = Group)) +
  geom_bar(stat = "identity", position = "identity", alpha = 0.5) +
  labs(
    title = "Distribution of Visits by Age",
    x = "Age",
    y = "Number of Children - Dentist",
    fill = "Visits"
  )
```

### 4. How many patients per clinic?

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    AI_KODS = as.character(AI_KODS), # Convert AI_KODS to character
    AI_KODS = fct_lump_min(AI_KODS, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(AI_KODS) |>
  count() |>
  arrange(desc(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    AI_KODS = as.character(AI_KODS), # Convert AI_KODS to character
    AI_KODS = fct_lump_min(AI_KODS, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(AI_KODS) |>
  count() |>
  ggplot(aes(x = fct_reorder(AI_KODS, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(title = "Patients per Clinic", x = "Clinic", y = "Patients")
```

```{r}
# Save the unlumped per-clinic counts to a CSV
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  group_by(AI_KODS) |>
  count() |>
  arrange(desc(n)) |>
  write_csv(here("data", "_zobarsts_patientes_per_clinic.csv"))
```

### 4.1. Visits per clinic

PAC_ID = Patient

MAN_DAT = Date of the visit

MP_CODE = Manipulations

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  group_by(AI_KODS, PAC_ID) |> # Group by clinic and patient
  summarise(Visits = n_distinct(MAN_DAT), .groups = "drop") |> # Unique visit dates per patient per clinic
  group_by(AI_KODS) |> # Group by clinic
  summarise(Total_Visits = sum(Visits)) |> # Sum visits for each clinic
  arrange(desc(Total_Visits)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  group_by(AI_KODS, PAC_ID) |> # Group by clinic and patient
  summarise(Visits = n_distinct(MAN_DAT), .groups = "drop") |> # Unique visit dates per patient per clinic
  group_by(AI_KODS) |> # Group by clinic
  summarise(Total_Visits = sum(Visits)) |> # Sum visits for each clinic
  arrange(desc(Total_Visits)) |>
  ggplot(aes(x = fct_reorder(AI_KODS, Total_Visits), y = Total_Visits)) +
  geom_col() +
  coord_flip() +
  theme(legend.position = "none") +
  labs(title = "Visits per clinic", x = "Clinic")
```

### 5. How many patients per region?

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    PAC_ATVK = as.character(PAC_ATVK), # Convert PAC_ATVK to character
    PAC_ATVK = fct_lump_min(PAC_ATVK, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(PAC_ATVK) |>
  count() |>
  arrange(desc(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    PAC_ATVK = as.character(PAC_ATVK), # Convert PAC_ATVK to character
    PAC_ATVK = fct_lump_min(PAC_ATVK, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(PAC_ATVK) |>
  count() |>
  ggplot(aes(x = fct_reorder(PAC_ATVK, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(title = "Patients per Region", x = "Region", y = "Patients")
```

```{r}
# Save the unlumped data to a CSV for Ilze
# Note: the file holds per-region counts despite the "per_clinic" name
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(PAC_ATVK = as.character(PAC_ATVK)) |> # Convert PAC_ATVK to character
  group_by(PAC_ATVK) |>
  count() |>
  arrange(desc(n)) |>
  write_csv(here("data", "zobarsts_patientes_per_clinic.csv"))
```

### 6. Why are the children coming?

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  group_by(IEMESLS) |>
  count() |>
  arrange(desc(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  group_by(IEMESLS) |>
  count() |>
  arrange(desc(n)) |>
  ggplot(aes(x = fct_reorder(IEMESLS, n, .desc = TRUE), y = n)) +
  geom_col() +
  labs(title = "What are the reasons for children's dental visits?", x = "", y = "Visits")
```

### 7. Extract the month from MAN_DAT

```{r}
# MAVITO_ZPN_MAN_UD1_B_zobarst <- MAVITO_ZPN_MAN_UD1_B_zobarst |>
#   mutate(MONTH = lubridate::month(MAN_DAT)) |>
#   relocate(MONTH, .after = MAN_DAT)
```

### 8. Reasons of visit per month

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  group_by(MONTH, IEMESLS) |> # Group by month and reason for the visit
  summarise(Visits = n_distinct(PAC_ID, MAN_DAT), .groups = "drop") |> # Unique patient-date pairs per reason
  arrange(MONTH, desc(Visits)) |> # Arrange by month and number of visits
  pivot_wider(names_from = MONTH, values_from = Visits) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_zobarst |>
  group_by(MONTH, IEMESLS) |> # Group by month and reason for the visit
  summarise(Visits = n_distinct(PAC_ID, MAN_DAT), .groups = "drop") |> # Unique patient-date pairs per reason
  ggplot(aes(x = as.factor(MONTH), y = Visits, color = IEMESLS, group = IEMESLS)) +
  geom_line() +
  scale_y_log10() +
  labs(title = "Reasons of visits per month - Dentists", y = "Visits (log10)", x = "Month") +
  theme(legend.position = "top")
```

### REMOVE OBJECT zobarst

```{r}
rm(MAVITO_ZPN_MAN_UD1_B_zobarst)
```

## HYGIENISTS

### Filter only Zobu higiēnists

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist <- MAVITO_ZPN_MAN_UD1_B |>
  filter(str_detect(SPEC_KODS, "Zobu higiēnists"))
```

### 1. How many children visit 1, 2, or more times?

```{r}
# unique visits
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Visits = n()) |>
  ungroup() |>
  # how many PAC_IDs had exactly 1 visit, 2 visits, etc.
  group_by(Visits) |>
  summarise(n = n()) |>
  # plot
  ggplot(aes(x = Visits, y = n)) +
  geom_col()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Visits = n()) |>
  # how many PAC_IDs had exactly 1 visit, 2 visits, etc.
  group_by(Visits) |>
  summarise(n = n()) |>
  mutate(Visits = if_else(Visits > 9, "> 9 visits", as.character(Visits))) |>
  group_by(Visits) |>
  summarise(n = sum(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Visits = n()) |>
  # how many PAC_IDs had exactly 1 visit, 2 visits, etc.
  group_by(Visits) |>
  summarise(n = n()) |>
  mutate(Visits = if_else(Visits > 4, "> 4 visits", as.character(Visits))) |>
  group_by(Visits) |>
  summarise(n = sum(n)) |>
  knitr::kable()
```

### 2. How many children per age?

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  tabyl(AGE) |>
  adorn_pct_formatting() |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  ggplot(aes(x = AGE)) +
  geom_histogram(bins = 12) +
  labs(title = "Age (unique patients)", y = "N")
```

This is interesting: children seem to attend mostly at three ages.

### 3. How many times do children go to the hygienist in different age groups?

```{r}
# Count the number of visits for each child
child_visit_counts <- MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, MAN_DAT) |>
  group_by(PAC_ID) |>
  summarise(Num_Visits = n(), .groups = "drop") # Count visits
```

```{r}
# join the visit counts with the MAVITO_ZPN_MAN_UD1_B_hygienist dataset
MAVITO_ZPN_MAN_UD1_B_hygienist <- MAVITO_ZPN_MAN_UD1_B_hygienist |>
  left_join(child_visit_counts, by = "PAC_ID")
```

```{r}
rm(child_visit_counts)
```

Now keep one row per unique PAC_ID

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  mutate(Num_Visits = if_else(Num_Visits > 1, "More than 1 time", as.character(Num_Visits))) |>
  tabyl(AGE, Num_Visits) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |>
  mutate(Num_Visits = if_else(Num_Visits > 1, "Multiple visits", "One visit")) |>
  tabyl(AGE, Num_Visits) |>
  pivot_longer(!AGE, names_to = "Group", values_to = "n") |>
  ggplot(aes(x = AGE, y = n, fill = Group)) +
  geom_bar(stat = "identity", position = "identity", alpha = 0.5) +
  labs(
    title = "Distribution of Visits by Age - Hygienist ",
    x = "Age",
    y = "Number of Children",
    fill = "Visits"
  )
```

### 4. How many patients per clinic?

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    AI_KODS = as.character(AI_KODS), # Convert AI_KODS to character
    AI_KODS = fct_lump_min(AI_KODS, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(AI_KODS) |>
  count() |>
  arrange(desc(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    AI_KODS = as.character(AI_KODS), # Convert AI_KODS to character
    AI_KODS = fct_lump_min(AI_KODS, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(AI_KODS) |>
  count() |>
  ggplot(aes(x = fct_reorder(AI_KODS, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(title = "Patients per Clinic", x = "Clinic", y = "Patients")
```

### 4.1. Visits per clinic

PAC_ID = Patient

MAN_DAT = Date of the visit

MP_CODE = Manipulations

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  group_by(AI_KODS, PAC_ID) |> # Group by clinic and patient
  summarise(Visits = n_distinct(MAN_DAT), .groups = "drop") |> # Unique visit dates per patient per clinic
  group_by(AI_KODS) |> # Group by clinic
  summarise(Total_Visits = sum(Visits)) |> # Sum visits for each clinic
  arrange(desc(Total_Visits)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  group_by(AI_KODS, PAC_ID) |> # Group by clinic and patient
  summarise(Visits = n_distinct(MAN_DAT), .groups = "drop") |> # Unique visit dates per patient per clinic
  group_by(AI_KODS) |> # Group by clinic
  summarise(Total_Visits = sum(Visits)) |> # Sum visits for each clinic
  arrange(desc(Total_Visits)) |>
  ggplot(aes(x = fct_reorder(AI_KODS, Total_Visits), y = Total_Visits)) +
  geom_col() +
  coord_flip() +
  theme(legend.position = "none") +
  labs(title = "Visits per clinic", x = "Clinic")
```

### 5. How many patients per region?

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    PAC_ATVK = as.character(PAC_ATVK), # Convert PAC_ATVK to character
    PAC_ATVK = fct_lump_min(PAC_ATVK, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(PAC_ATVK) |>
  count() |>
  arrange(desc(n)) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(
    PAC_ATVK = as.character(PAC_ATVK), # Convert PAC_ATVK to character
    PAC_ATVK = fct_lump_min(PAC_ATVK, min = 500, other_level = "Less than 500 patients")
  ) |>
  group_by(PAC_ATVK) |>
  count() |>
  ggplot(aes(x = fct_reorder(PAC_ATVK, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(title = "Patients per Region", x = "Region", y = "Patients")
```

```{r}
# Save the unlumped data to a CSV for Ilze
# Note: the file holds per-region counts despite the "per_clinic" name
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  distinct(PAC_ID, .keep_all = TRUE) |> # repeated visits not included
  mutate(PAC_ATVK = as.character(PAC_ATVK)) |> # Convert PAC_ATVK to character
  group_by(PAC_ATVK) |>
  count() |>
  arrange(desc(n)) |>
  write_csv(here("data", "hygienist_patientes_per_clinic.csv"))
```

### 8. Reasons of visit per month

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  group_by(MONTH, IEMESLS) |> # Group by month and reason for the visit
  summarise(Visits = n_distinct(PAC_ID, MAN_DAT), .groups = "drop") |> # Unique patient-date pairs per reason
  arrange(MONTH, desc(Visits)) |> # Arrange by month and number of visits
  pivot_wider(names_from = MONTH, values_from = Visits) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B_hygienist |>
  group_by(MONTH, IEMESLS) |> # Group by month and reason for the visit
  summarise(Visits = n_distinct(PAC_ID, MAN_DAT), .groups = "drop") |> # Unique patient-date pairs per reason
  ggplot(aes(x = as.factor(MONTH), y = Visits, color = IEMESLS, group = IEMESLS)) +
  geom_line() +
  scale_y_log10() +
  labs(title = "Reasons of visits per month - Hygienist", y = "Visits (log10)", x = "Month") +
  theme(legend.position = "top")
```

### REMOVE OBJECT hygienist

```{r}
rm(MAVITO_ZPN_MAN_UD1_B_hygienist)
```

## ALL 8

### 8. Reasons of visit per month

```{r}
MAVITO_ZPN_MAN_UD1_B |>
  group_by(MONTH, IEMESLS) |> # Group by month and reason for the visit
  summarise(Visits = n_distinct(PAC_ID, MAN_DAT), .groups = "drop") |> # Unique patient-date pairs per reason
  arrange(MONTH, desc(Visits)) |> # Arrange by month and number of visits
  pivot_wider(names_from = MONTH, values_from = Visits) |>
  knitr::kable()
```

```{r}
MAVITO_ZPN_MAN_UD1_B |>
  group_by(MONTH, IEMESLS) |> # Group by month and reason for the visit
  summarise(Visits = n_distinct(PAC_ID, MAN_DAT), .groups = "drop") |> # Unique patient-date pairs per reason
  ggplot(aes(x = as.factor(MONTH), y = Visits, color = IEMESLS, group = IEMESLS)) +
  geom_line() +
  scale_y_log10() +
  labs(title = "Reasons of visits per month - All", y = "Visits (log10)", x = "Month") +
  theme(legend.position = "top")
```

## 2024 April 11

The following data questions:

### 1. Check whether we have left much data unanalysed:

How many rows have no SPEC_KODS data?

```{r}
MAVITO_ZPN_MAN_UD1_B %>%
  summarise(NA_Count = sum(is.na(SPEC_KODS))) # Count NA values in SPEC_KODS
```

### 2. Check how many visits there are: how many times code 70001 appears (variable MP_CODE)

```{r}
MAVITO_ZPN_MAN_UD1_B %>%
  filter(MP_CODE == 70001) %>% # Keep rows with code 70001
  nrow() # Count rows
```