The initial approach to the data for this project was to use the variable hba1c as the pivot variable guiding the whole study, because it reflects the glycemic control the patient had at the first meeting with the surgeon. Of the 835 patients in the database, only 557 had their HbA1c recorded in the electronic medical records, as shown in the following figure.
Missing values represent 33.3% of the entries in the dataset, while patients with a valid HbA1c measurement recorded represent 66.7%.
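The percentages above follow directly from the two counts. A minimal sketch of that arithmetic, assuming the measurements sit in a numeric vector where unrecorded values are `NA` (the vector below is a stand-in built only to reproduce the counts, not the real data):

```r
# Stand-in vector: 557 recorded values and 278 missing ones (835 patients).
# The value 6.2 is a placeholder and carries no meaning.
hba1c <- c(rep(6.2, 557), rep(NA_real_, 278))

n_total   <- length(hba1c)       # all patients in the database
n_missing <- sum(is.na(hba1c))   # patients without a recorded HbA1c
n_valid   <- n_total - n_missing # patients with a recorded HbA1c

pct_missing <- round(100 * n_missing / n_total, 1)
pct_valid   <- round(100 * n_valid / n_total, 1)
```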
Once the available data was extracted, we subset it using the variable hba1c, as described in the codebook for the dataset, to generate three groups:
hba1c \(\leq 6\)
hba1c \(> 6\) and \(\leq 8.5\)
hba1c \(> 8.5\)
This resulted in three groups with the following number of patients each:
## [1] "Controlled subset"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 19.00 36.00 45.00 45.69 56.00 73.00
## [1] "Missing values?"
##
## FALSE
## 348
## [1] "Elevated subset"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 24.00 42.50 52.00 50.89 58.50 78.00
## [1] "Missing values?"
##
## FALSE
## 175
## [1] "Wildly elevated subset"
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 27.00 40.25 45.50 45.50 51.50 62.00
## [1] "Missing values?"
##
## FALSE
## 34
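The three-way split can be sketched with `cut()`. This is an illustration on toy data, not the code that produced the output above: the data frame, the `id` column and the `hba1c_group` column are hypothetical, and the group labels are taken from the subset names printed in the output.

```r
# Toy data frame; `hba1c` is the only column name taken from the report.
patients <- data.frame(
  id    = 1:7,
  hba1c = c(5.4, 6.0, 6.1, 8.5, 8.6, 11.2, NA)
)

# Right-closed intervals: (-Inf, 6], (6, 8.5], (8.5, Inf)
patients$hba1c_group <- cut(
  patients$hba1c,
  breaks = c(-Inf, 6, 8.5, Inf),
  labels = c("Controlled", "Elevated", "Wildly elevated"),
  right  = TRUE
)

# split() drops rows whose group is NA, so patients without a recorded
# HbA1c do not end up in any of the three subsets.
subsets     <- split(patients, patients$hba1c_group)
group_sizes <- table(patients$hba1c_group)
```

With `right = TRUE`, a value of exactly 6 falls in the controlled group and a value of exactly 8.5 in the elevated group, which matches the thresholds listed above.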
Missing values:
## NULL
Missing values:
## NULL
Missing values:
## NULL
Missing values:
## NULL
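The variable summaries in this part of the report use a count, a number of missing values and a cardinality. A minimal sketch of how those three figures could be computed for one variable; the helper name `summarise_variable` is hypothetical, and we assume here that the count includes missing entries while the cardinality does not:

```r
# Hypothetical helper: count, missing values and cardinality of a vector.
summarise_variable <- function(x) {
  c(
    count = length(x),                     # all entries, missing or not
    miss  = sum(is.na(x)),                 # entries recorded as NA
    card  = length(unique(x[!is.na(x)]))   # distinct non-missing values
  )
}

# Five entries, one missing, three distinct values (120, 130, 145).
summarise_variable(c(120, 130, 130, NA, 145))
```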
Where:
Count = number of entries (patients) in the variable.
Miss = number of missing values.
Card = cardinality, which refers to the number of distinct values used throughout the variable.
Abnormal low values to check (less than 100 mmHg):
## [1] 414 467 649 774
Abnormal high values to check (greater or equal to 190 mmHg):
## [1] 344 418
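Range checks of this kind can be sketched with `which()`, which returns the positions of the entries that satisfy a condition. The vector name `pressure_mmhg` and its values are hypothetical; the thresholds are the two used above.

```r
# Hypothetical readings in mmHg; NA marks an unrecorded value.
pressure_mmhg <- c(128, 95, 142, NA, 191, 110, 190)

# which() skips NA, so missing readings are never flagged.
low_rows  <- which(pressure_mmhg < 100)   # strictly below 100 mmHg
high_rows <- which(pressure_mmhg >= 190)  # 190 mmHg or above
```

When no entry meets the condition, `which()` returns `integer(0)`, the same empty result that appears in some of the checks in this report.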
Abnormal low values to check (less or equal to 60 mmHg):
## [1] 3 208 278 391 467 484 595 656 774 826
Abnormal high values to check (greater or equal to 100 mmHg):
## [1] 75 78 81 84 87 149 230 305 384 392 417 418 438 452 505 529 538
## [18] 543 544 546 551 559 710 729 760 767 779
Abnormal high values to check (greater or equal to 200 kg):
## [1] 138 155 201 206 263 323 344 407 418 431 438 505 506 530 549 559 566
## [18] 686 714 797
Abnormal low values to check (less or equal to 65 kg):
## [1] 18 348 415 421 427 436 462 467 534 546 832
ATTENTION! Outliers in the controlled subset close to ZERO and around 125 centimeters. Typos?
Abnormal high values to check (greater than 250 cm):
## integer(0)
Abnormal low values to check (less than 150 cm):
## [1] 251 269 279 310 312 359 364 424 427 473 591 642 652 660 748 773
Abnormal high values to check (greater or equal to 65):
## [1] 138 175 206 232 253 263 323 407 417 431 438 473 515 559 661 689 714
## [18] 743 780 784 797
Abnormal high values to check (greater or equal to 15):
## [1] 415
Abnormal low values to check (less or equal to 10):
## [1] 18 22 78 138 306 325 349 362 460 536 546 562 627 732 814 832
Abnormal high values to check (more than 18):
## [1] 544 706
ATTENTION!: Outliers that need to be looked at
Abnormal high values to check (greater or equal to 10):
## [1] 10 54 120 486 641 674
Abnormal high values to check (greater or equal to 75):
## [1] 22 33 46 48 54 69 100 149 197 402 504 517 548 590 618 688
Abnormal high values to check (greater or equal to 300):
## [1] 394 561
Abnormal high values to check (greater or equal to 100):
## [1] 431 665 713
Abnormal high values to check (greater or equal to 350):
## [1] 1 145 171 213 264 283 313 327 394 457 459 486 561 588 623 664 742
## [18] 768
ATTENTION! What does a very high C-reactive protein mean for a patient with wildly elevated HbA1c? Is it a typo?
Abnormal high values to check (greater or equal to 25):
## [1] 4 102 138 150 187 216 263 309 324 327 371 380 402 417 431 444 504
## [18] 505 548 559 576 603 627 653 656 661 743 750 781 802 811
ATTENTION! Albumin close to ZERO in the controlled population. Typo?
Abnormal low values to check (less or equal to 3):
## [1] 235 241 312 462 563 591 694 709 721 732 832
ATTENTION! Typo in creatinine for the controlled population. Outlier with value > 4 mg/dL.
Abnormal high values to check (greater or equal to 2):
## [1] 266 349 385 563 565 600 682 694 709 732
Abnormal high values to check (greater or equal to 180):
## [1] 40 149 344 543 636
Abnormal high values to check (greater or equal to 100):
## [1] 10 34 40 149 153 180 228 339 373 392 429 529 543 565 594 766 767
## [18] 777
Abnormal low values to check (less than 50):
## [1] 307
Abnormal low values to check (less than 80):
## [1] 2 18 140 297 302 348 415 421 427 436 453 462 463 467 483 534 536
## [18] 546 595 596 604 639 652 827 832
Why would someone weighing less than 80 kg need bariatric surgery?
Abnormal low values to check (less than 125):
## [1] 390
Abnormal low values to check (less than 30):
## [1] 2 18 140 208 348 415 421 427 436 453 462 463 467 483 494 534 546
## [18] 595 596 685 827 832
Why would a patient with BMI < 30 (kilograms per square meter) need bariatric surgery?
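As a cross-check on the flagged patients, BMI can be recomputed from weight and height, since BMI is weight in kilograms divided by the square of height in meters. A minimal sketch, assuming weight is stored in kg and height in cm as in the checks above; the vector names and values are hypothetical:

```r
# Hypothetical weights (kg) and heights (cm) for four patients.
weight_kg <- c(78, 132, 160, 95)
height_cm <- c(172, 165, 181, 158)

# BMI = weight (kg) / height (m)^2; heights are converted from cm to m.
bmi <- weight_kg / (height_cm / 100)^2

# Positions of patients below the BMI threshold questioned in the report.
low_bmi_rows <- which(bmi < 30)
```

A recomputed BMI that disagrees with the recorded one would point to a typo in weight, height or BMI for that patient.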
Abnormal low values to check (less or equal to 3):
## [1] 18 606 609 827
Abnormal high values to check (greater or equal to 15):
## [1] 258 431 507 767
Abnormal low values to check (less or equal to 9):
## [1] 22 65 101 102 115 297 392 530 534 546 562 627 693
Abnormal high values to check (greater or equal to 16.5):
## [1] 27 31 155 174 490 501 544 570 647 683 706 709 712 717 789 808
Abnormal low values to check (less or equal to 80k):
## [1] 151 290 609 744
Abnormal high values to check (greater than 350k):
## [1] 3 72 75 78 102 105 140 143 147 170 191 200 204 206 213 225 233
## [18] 255 261 285 288 302 305 306 310 316 324 325 336 339 341 351 376 379
## [35] 403 425 436 449 453 460 465 467 472 476 478 479 491 494 500 515 523
## [52] 526 527 534 538 548 554 561 562 567 571 576 582 588 592 595 596 598
## [69] 603 618 624 627 632 636 644 645 647 649 673 681 689 711 718 726 737
## [86] 741 745 748 754 758 767 773 803 825
Abnormal low values to check (less or equal to 3):
## [1] 102 596 768 825
Abnormal high values to check (greater or equal to 2):
## [1] 266 385 565 647
Missing values:
## [1] 85 140 536 788
Missing values:
## [1] 85 788
Available values:
## [1] 735 8 15
Available Values:
## [1] 14 118 122 133 217 222 234 263 291 294 318 359 387 417 419 452 516
## [18] 552 564 565 575 585 780 795 3 4 57 100 107 120 138 141 145 170
## [35] 171 229 271 278 288 309 314 332 335 337 341 358 360 362 386 398 468
## [52] 484 511 515 544 545 568 590 619 623 646 647 656 681 697 711 746 751
## [69] 765 773 777 778 781 805 807 817 826 834 64 158 213 394 473 505 525
## [86] 627 761 782
Missing Values:
## [1] 85 788
Missing Values:
## [1] 85 788 554
Missing Values:
## [1] 85 788 554
Missing Values:
## [1] 85 788
Missing Values:
## [1] 85 788
Missing Values:
## [1] 85 788
Missing Values:
## [1] 85 788 776
Missing Values:
## [1] 85 788
Missing Values:
## [1] 85 503 788 826
Missing Values:
## [1] 32 85 247 788
Missing Values:
## [1] 1 5 14 17 22 32 35 45 46 50 51 54 55 61 69 76 80
## [18] 85 90 110 111 114 117 118 122 123 124 129 132 133 137 139 140 146
## [35] 149 151 153 154 155 162 165 166 167 172 178 186 188 189 191 192 197
## [52] 201 204 208 212 216 217 222 223 224 226 228 230 234 235 238 240 244
## [69] 247 249 250 251 255 260 261 263 265 266 268 277 281 282 283 284 286
## [86] 289 291 292 294 295 296 300 301 303 304 307 308 315 318 319 322 324
## [103] 329 331 334 342 343 347 352 354 355 356 357 359 361 364 366 367 368
## [120] 370 371 373 374 377 378 379 380 381 382 383 384 385 387 388 391 392
## [137] 393 395 396 399 405 406 409 411 412 414 417 419 424 425 426 427 428
## [154] 432 433 434 435 437 439 440 442 444 447 450 452 453 454 456 458 459
## [171] 461 463 471 472 476 478 481 482 486 490 491 492 494 496 497 498 499
## [188] 500 503 506 508 509 510 513 514 516 519 523 527 532 534 535 536 538
## [205] 539 541 542 543 546 547 550 551 552 555 556 558 559 560 562 563 565
## [222] 567 569 570 573 574 575 578 580 581 582 583 584 585 587 588 589 591
## [239] 592 594 596 598 600 601 602 603 604 608 610 611 613 617 618 620 621
## [256] 622 624 625 626 629 630 631 635 639 640 641 642 649 655 659 661 662
## [273] 663 665 666 670 671 672 675 676 680 686 687 688 689 691 692 694 695
## [290] 703 705 706 712 714 715 719 720 722 726 728 729 731 736 737 740 742
## [307] 743 753 754 755 756 762 763 766 770 772 779 780 783 785 786 788 790
## [324] 791 792 793 795 796 797 798 799 800 801 802 803 808 812 814 815 816
## [341] 819 823 824 828 829 835 3 4 10 12 19 20 21 48 57 58 66
## [358] 67 70 81 99 100 102 103 107 115 119 120 138 141 145 163 164 168
## [375] 170 171 174 176 187 199 205 210 214 229 231 232 233 237 242 253 256
## [392] 258 264 270 271 276 278 288 306 309 311 314 320 325 327 328 332 335
## [409] 337 341 345 346 350 358 360 362 386 398 401 402 403 407 413 416 418
## [426] 423 431 438 448 451 464 465 468 474 475 480 484 485 487 489 504 511
## [443] 515 517 520 521 533 544 545 548 554 561 568 590 593 597 599 614 615
## [460] 619 623 628 634 637 643 644 646 647 648 652 654 656 658 664 674 678
## [477] 679 681 682 683 693 697 698 699 707 709 710 711 713 716 721 723 725
## [494] 727 734 735 745 746 750 751 765 768 771 773 775 776 777 778 784 787
## [511] 794 804 805 807 813 817 821 826 834 8 15 31 33 64 116 131 150
## [528] 158 213 241 330 353 394 441 457 473 505 525 531 576 627 636 651 708
## [545] 733 761 764 767 782 806 809
Missing Values:
## [1] 85 788
Missing Values:
## [1] 85 359 788 314 807
Missing Values:
## [1] 54 85 788
Missing Values:
## [1] 85 318 788
Missing Values:
## [1] 788
Missing Values:
## [1] 85 788 599 776
Missing Values:
## [1] 118 788
Missing Values:
## [1] 85 788
Missing Values:
## [1] 129 617 620 788
Missing Values:
## [1] 22 85 111 129 146 172 178 201 228 249 283 289 292 296 354 378 387
## [18] 427 453 459 481 486 494 496 514 536 541 547 583 596 610 617 620 631
## [35] 640 653 731 737 788 791 798 816 824 829 57 70 102 199 210 214 233
## [52] 306 311 320 345 346 358 362 398 423 465 485 521 561 614 628 716 768
## [69] 776 813 826 834 116 150 206 441 457 525 806
Missing Values:
## [1] 85 111 140 289 354 388 737 788 828 57 210 214 271 345 451 628 711
## [18] 821 131 206 525
High proportion of missing values. Should we just transform NAs into ‘No’?
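If the answer is yes, the recoding itself is short. The sketch below uses a hypothetical yes/no vector named `flag`, and rests on the assumption that an unrecorded value means the condition is absent, which is exactly the point still to be confirmed:

```r
# Hypothetical yes/no variable with many unrecorded entries.
flag <- c("Yes", NA, NA, "No", NA, "Yes")

# Share of missing entries, worth recording before they are overwritten.
prop_missing <- mean(is.na(flag))

# Assumption: a missing entry means "No".
flag_filled <- ifelse(is.na(flag), "No", flag)
```

Keeping the original vector alongside the recoded one makes it possible to undo the choice if the assumption turns out to be wrong.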
7 Social History of Patients
7.1 Vegetarians
Missing values:
7.2 Vegans
Transform NAs into No.
Patient IDs with available values for the variable vegan:
7.3 Alcohol Use
Missing values:
7.4 Alcohol, amount of consumption
NEED TO GET LABELS FOR 1, 2, 3, AND 4
ATTENTION! Again, this variable presents a significant amount of missing values. How should we proceed?
Add value No Alcohol Use to all missing values
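One way to add that value is to turn the coded variable into a factor with an explicit extra level. This is a sketch under assumptions: the vector name `alcohol_amount` is hypothetical, and the codes 1 to 4 keep their numeric labels because their meanings are not yet known.

```r
# Hypothetical coded variable: 1-4 are consumption levels, NA is unrecorded.
alcohol_amount <- c(1, NA, 3, NA, 4, 2)

# Convert codes to text and give missing entries their own label.
alcohol_label <- as.character(alcohol_amount)
alcohol_label[is.na(alcohol_label)] <- "No Alcohol Use"

# Fix the level order so the new category comes first in tables and plots.
alcohol_factor <- factor(
  alcohol_label,
  levels = c("No Alcohol Use", "1", "2", "3", "4")
)

table(alcohol_factor)
```

Once the labels for 1, 2, 3 and 4 are available, they can replace the numeric level names without changing the rest of the code.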
7.5 Tobacco Use
NEED TO GET LABELS FOR 1, 2, 3, AND 4
Missing values:
7.6 Vaping
Missing values:
7.7 Consumption of Drugs
Missing values:
7.8 Consumption of Amphetamines
7.9 Consumption of Cocaine
7.10 Consumption of Marijuana
7.11 Consumption of Opiates, Heroin, Morphine, and Oxycodone
7.12 Consumption of Benzodiazepines
7.13 Consumption of Phencyclidine
7.14 Consumption of Barbiturates