Data Structure and Data Cleaning

This analysis uses four numerical variables from the Titanic dataset: Age, SibSp, Parch, and Fare. These variables describe passenger demographics (Age), family structure (SibSp and Parch), and economic condition (Fare). Rows containing missing values were removed with the na.omit() function so that all statistical calculations are based on complete observations.

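After this cleaning process, the dataset consisted of 714 observations and 4 numerical variables. Age and Fare are continuous variables, while SibSp and Parch are discrete counts. Removing missing values reduced the dataset from the original 891 rows to 714, because Age is missing for 177 passengers (about 19.9%); the other three variables have no missing values. This step ensures that the correlation, covariance, and eigen analyses are all computed on the same set of complete observations. Note that dropping rows does not by itself remove bias: the results describe only passengers with a recorded age, and they may be biased if the missing ages are not missing at random.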
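
The cleaning step described above can be sketched on a small example. The block below is a minimal illustration, not the report's actual code: it rebuilds the first six passengers printed by head(df) in this section, and the object names `toy`, `num_vars`, and `toy_clean` are hypothetical names chosen for the sketch.

```r
# Toy data: the first six passengers shown by head(df) in this section,
# restricted to the four numerical variables used in the analysis.
# (`toy`, `num_vars`, and `toy_clean` are illustrative names.)
toy <- data.frame(
  Age   = c(22, 38, 26, 35, 35, NA),
  SibSp = c(1L, 1L, 0L, 1L, 0L, 0L),
  Parch = c(0L, 0L, 0L, 0L, 0L, 0L),
  Fare  = c(7.2500, 71.2833, 7.9250, 53.1000, 8.0500, 8.4583)
)

# Keep only the analysis variables, then drop rows with any missing value.
num_vars  <- c("Age", "SibSp", "Parch", "Fare")
toy_clean <- na.omit(toy[, num_vars])

dim(toy)        # 6 rows, 4 columns before cleaning
dim(toy_clean)  # 5 rows, 4 columns: the row with a missing Age is dropped

# Number of rows removed by na.omit()
nrow(toy) - nrow(toy_clean)
```

Applied to the full data, the same pattern would be na.omit(df[, num_vars]). Since Age is the only one of the four variables with missing values (177 of 891 rows), this is consistent with the 714 complete observations reported above.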

df <- read.csv("Titanic-Dataset.csv")
head(df)
##   PassengerId Survived Pclass
## 1           1        0      3
## 2           2        1      1
## 3           3        1      3
## 4           4        1      1
## 5           5        0      3
## 6           6        0      3
##                                                  Name    Sex Age SibSp Parch
## 1                             Braund, Mr. Owen Harris   male  22     1     0
## 2 Cumings, Mrs. John Bradley (Florence Briggs Thayer) female  38     1     0
## 3                              Heikkinen, Miss. Laina female  26     0     0
## 4        Futrelle, Mrs. Jacques Heath (Lily May Peel) female  35     1     0
## 5                            Allen, Mr. William Henry   male  35     0     0
## 6                                    Moran, Mr. James   male  NA     0     0
##             Ticket    Fare Cabin Embarked
## 1        A/5 21171  7.2500              S
## 2         PC 17599 71.2833   C85        C
## 3 STON/O2. 3101282  7.9250              S
## 4           113803 53.1000  C123        S
## 5           373450  8.0500              S
## 6           330877  8.4583              Q
summary(df)
##   PassengerId       Survived          Pclass          Name          
##  Min.   :  1.0   Min.   :0.0000   Min.   :1.000   Length:891        
##  1st Qu.:223.5   1st Qu.:0.0000   1st Qu.:2.000   Class :character  
##  Median :446.0   Median :0.0000   Median :3.000   Mode  :character  
##  Mean   :446.0   Mean   :0.3838   Mean   :2.309                     
##  3rd Qu.:668.5   3rd Qu.:1.0000   3rd Qu.:3.000                     
##  Max.   :891.0   Max.   :1.0000   Max.   :3.000                     
##                                                                     
##      Sex                 Age            SibSp           Parch       
##  Length:891         Min.   : 0.42   Min.   :0.000   Min.   :0.0000  
##  Class :character   1st Qu.:20.12   1st Qu.:0.000   1st Qu.:0.0000  
##  Mode  :character   Median :28.00   Median :0.000   Median :0.0000  
##                     Mean   :29.70   Mean   :0.523   Mean   :0.3816  
##                     3rd Qu.:38.00   3rd Qu.:1.000   3rd Qu.:0.0000  
##                     Max.   :80.00   Max.   :8.000   Max.   :6.0000  
##                     NA's   :177                                     
##     Ticket               Fare           Cabin             Embarked        
##  Length:891         Min.   :  0.00   Length:891         Length:891        
##  Class :character   1st Qu.:  7.91   Class :character   Class :character  
##  Mode  :character   Median : 14.45   Mode  :character   Mode  :character  
##                     Mean   : 32.20                                        
##                     3rd Qu.: 31.00                                        
##                     Max.   :512.33                                        
## 
tail(df)
##     PassengerId Survived Pclass                                     Name    Sex
## 886         886        0      3     Rice, Mrs. William (Margaret Norton) female
## 887         887        0      2                    Montvila, Rev. Juozas   male
## 888         888        1      1             Graham, Miss. Margaret Edith female
## 889         889        0      3 Johnston, Miss. Catherine Helen "Carrie" female
## 890         890        1      1                    Behr, Mr. Karl Howell   male
## 891         891        0      3                      Dooley, Mr. Patrick   male
##     Age SibSp Parch     Ticket   Fare Cabin Embarked
## 886  39     0     5     382652 29.125              Q
## 887  27     0     0     211536 13.000              S
## 888  19     0     0     112053 30.000   B42        S
## 889  NA     1     2 W./C. 6607 23.450              S
## 890  26     0     0     111369 30.000  C148        C
## 891  32     0     0     370376  7.750              Q
str(df)
## 'data.frame':    891 obs. of  12 variables:
##  $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Survived   : int  0 1 1 1 0 0 0 0 1 1 ...
##  $ Pclass     : int  3 1 3 1 3 3 1 3 3 2 ...
##  $ Name       : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
##  $ Sex        : chr  "male" "female" "female" "female" ...
##  $ Age        : num  22 38 26 35 35 NA 54 2 27 14 ...
##  $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
##  $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
##  $ Ticket     : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
##  $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
##  $ Cabin      : chr  "" "C85" "" "C123" ...
##  $ Embarked   : chr  "S" "C" "S" "S" ...
colnames(df)
##  [1] "PassengerId" "Survived"    "Pclass"      "Name"        "Sex"        
##  [6] "Age"         "SibSp"       "Parch"       "Ticket"      "Fare"       
## [11] "Cabin"       "Embarked"
colSums(is.na(df))
## PassengerId    Survived      Pclass        Name         Sex         Age 
##           0           0           0           0           0         177 
##       SibSp       Parch      Ticket        Fare       Cabin    Embarked 
##           0           0           0           0           0           0
colMeans(is.na(df)) * 100
## PassengerId    Survived      Pclass        Name         Sex         Age 
##     0.00000     0.00000     0.00000     0.00000     0.00000    19.86532 
##       SibSp       Parch      Ticket        Fare       Cabin    Embarked 
##     0.00000     0.00000     0.00000     0.00000     0.00000     0.00000
df[!complete.cases(df), ]
##     PassengerId Survived Pclass
## 6             6        0      3
## 18           18        1      2
## 20           20        1      3
## 27           27        0      3
## 29           29        1      3
## 30           30        0      3
## 32           32        1      1
## 33           33        1      3
## 37           37        1      3
## 43           43        0      3
## 46           46        0      3
## 47           47        0      3
## 48           48        1      3
## 49           49        0      3
## 56           56        1      1
## 65           65        0      1
## 66           66        1      3
## 77           77        0      3
## 78           78        0      3
## 83           83        1      3
## 88           88        0      3
## 96           96        0      3
## 102         102        0      3
## 108         108        1      3
## 110         110        1      3
## 122         122        0      3
## 127         127        0      3
## 129         129        1      3
## 141         141        0      3
## 155         155        0      3
## 159         159        0      3
## 160         160        0      3
## 167         167        1      1
## 169         169        0      1
## 177         177        0      3
## 181         181        0      3
## 182         182        0      2
## 186         186        0      1
## 187         187        1      3
## 197         197        0      3
## 199         199        1      3
## 202         202        0      3
## 215         215        0      3
## 224         224        0      3
## 230         230        0      3
## 236         236        0      3
## 241         241        0      3
## 242         242        1      3
## 251         251        0      3
## 257         257        1      1
## 261         261        0      3
## 265         265        0      3
## 271         271        0      1
## 275         275        1      3
## 278         278        0      2
## 285         285        0      1
## 296         296        0      1
## 299         299        1      1
## 301         301        1      3
## 302         302        1      3
## 304         304        1      2
## 305         305        0      3
## 307         307        1      1
## 325         325        0      3
## 331         331        1      3
## 335         335        1      1
## 336         336        0      3
## 348         348        1      3
## 352         352        0      1
## 355         355        0      3
## 359         359        1      3
## 360         360        1      3
## 365         365        0      3
## 368         368        1      3
## 369         369        1      3
## 376         376        1      1
## 385         385        0      3
## 389         389        0      3
## 410         410        0      3
## 411         411        0      3
## 412         412        0      3
## 414         414        0      2
## 416         416        0      3
## 421         421        0      3
## 426         426        0      3
## 429         429        0      3
## 432         432        1      3
## 445         445        1      3
## 452         452        0      3
## 455         455        0      3
## 458         458        1      1
## 460         460        0      3
## 465         465        0      3
## 467         467        0      2
## 469         469        0      3
## 471         471        0      3
## 476         476        0      1
## 482         482        0      2
## 486         486        0      3
## 491         491        0      3
## 496         496        0      3
## 498         498        0      3
## 503         503        0      3
## 508         508        1      1
## 512         512        0      3
## 518         518        0      3
## 523         523        0      3
## 525         525        0      3
## 528         528        0      1
## 532         532        0      3
## 534         534        1      3
## 539         539        0      3
## 548         548        1      2
## 553         553        0      3
## 558         558        0      1
## 561         561        0      3
## 564         564        0      3
## 565         565        0      3
## 569         569        0      3
## 574         574        1      3
## 579         579        0      3
## 585         585        0      3
## 590         590        0      3
## 594         594        0      3
## 597         597        1      2
## 599         599        0      3
## 602         602        0      3
## 603         603        0      1
## 612         612        0      3
## 613         613        1      3
## 614         614        0      3
## 630         630        0      3
## 634         634        0      1
## 640         640        0      3
## 644         644        1      3
## 649         649        0      3
## 651         651        0      3
## 654         654        1      3
## 657         657        0      3
## 668         668        0      3
## 670         670        1      1
## 675         675        0      2
## 681         681        0      3
## 693         693        1      3
## 698         698        1      3
## 710         710        1      3
## 712         712        0      1
## 719         719        0      3
## 728         728        1      3
## 733         733        0      2
## 739         739        0      3
## 740         740        0      3
## 741         741        1      1
## 761         761        0      3
## 767         767        0      1
## 769         769        0      3
## 774         774        0      3
## 777         777        0      3
## 779         779        0      3
## 784         784        0      3
## 791         791        0      3
## 793         793        0      3
## 794         794        0      1
## 816         816        0      1
## 826         826        0      3
## 827         827        0      3
## 829         829        1      3
## 833         833        0      3
## 838         838        0      3
## 840         840        1      1
## 847         847        0      3
## 850         850        1      1
## 860         860        0      3
## 864         864        0      3
## 869         869        0      3
## 879         879        0      3
## 889         889        0      3
##                                                   Name    Sex Age SibSp Parch
## 6                                     Moran, Mr. James   male  NA     0     0
## 18                        Williams, Mr. Charles Eugene   male  NA     0     0
## 20                             Masselmani, Mrs. Fatima female  NA     0     0
## 27                             Emir, Mr. Farred Chehab   male  NA     0     0
## 29                       O'Dwyer, Miss. Ellen "Nellie" female  NA     0     0
## 30                                 Todoroff, Mr. Lalio   male  NA     0     0
## 32      Spencer, Mrs. William Augustus (Marie Eugenie) female  NA     1     0
## 33                            Glynn, Miss. Mary Agatha female  NA     0     0
## 37                                    Mamee, Mr. Hanna   male  NA     0     0
## 43                                 Kraeff, Mr. Theodor   male  NA     0     0
## 46                            Rogers, Mr. William John   male  NA     0     0
## 47                                   Lennon, Mr. Denis   male  NA     1     0
## 48                           O'Driscoll, Miss. Bridget female  NA     0     0
## 49                                 Samaan, Mr. Youssef   male  NA     2     0
## 56                                   Woolner, Mr. Hugh   male  NA     0     0
## 65                               Stewart, Mr. Albert A   male  NA     0     0
## 66                            Moubarek, Master. Gerios   male  NA     1     1
## 77                                   Staneff, Mr. Ivan   male  NA     0     0
## 78                            Moutal, Mr. Rahamin Haim   male  NA     0     0
## 83                      McDermott, Miss. Brigdet Delia female  NA     0     0
## 88                       Slocovski, Mr. Selman Francis   male  NA     0     0
## 96                         Shorney, Mr. Charles Joseph   male  NA     0     0
## 102                   Petroff, Mr. Pastcho ("Pentcho")   male  NA     0     0
## 108                             Moss, Mr. Albert Johan   male  NA     0     0
## 110                                Moran, Miss. Bertha female  NA     1     0
## 122                         Moore, Mr. Leonard Charles   male  NA     0     0
## 127                                McMahon, Mr. Martin   male  NA     0     0
## 129                                  Peter, Miss. Anna female  NA     1     1
## 141                      Boulos, Mrs. Joseph (Sultana) female  NA     0     2
## 155                              Olsen, Mr. Ole Martin   male  NA     0     0
## 159                                Smiljanic, Mr. Mile   male  NA     0     0
## 160                         Sage, Master. Thomas Henry   male  NA     8     2
## 167             Chibnall, Mrs. (Edith Martha Bowerman) female  NA     0     1
## 169                                Baumann, Mr. John D   male  NA     0     0
## 177                      Lefebre, Master. Henry Forbes   male  NA     3     1
## 181                       Sage, Miss. Constance Gladys female  NA     8     2
## 182                                   Pernot, Mr. Rene   male  NA     0     0
## 186                              Rood, Mr. Hugh Roscoe   male  NA     0     0
## 187    O'Brien, Mrs. Thomas (Johanna "Hannah" Godfrey) female  NA     1     0
## 197                                Mernagh, Mr. Robert   male  NA     0     0
## 199                   Madigan, Miss. Margaret "Maggie" female  NA     0     0
## 202                                Sage, Mr. Frederick   male  NA     8     2
## 215                                Kiernan, Mr. Philip   male  NA     1     0
## 224                               Nenkoff, Mr. Christo   male  NA     0     0
## 230                            Lefebre, Miss. Mathilde female  NA     3     1
## 236                       Harknett, Miss. Alice Phoebe female  NA     0     0
## 241                              Zabour, Miss. Thamine female  NA     1     0
## 242                     Murphy, Miss. Katherine "Kate" female  NA     1     0
## 251                             Reed, Mr. James George   male  NA     0     0
## 257                     Thorne, Mrs. Gertrude Maybelle female  NA     0     0
## 261                                  Smith, Mr. Thomas   male  NA     0     0
## 265                                 Henry, Miss. Delia female  NA     0     0
## 271                              Cairns, Mr. Alexander   male  NA     0     0
## 275                         Healy, Miss. Hanora "Nora" female  NA     0     0
## 278                        Parkes, Mr. Francis "Frank"   male  NA     0     0
## 285                         Smith, Mr. Richard William   male  NA     0     0
## 296                                  Lewy, Mr. Ervin G   male  NA     0     0
## 299                              Saalfeld, Mr. Adolphe   male  NA     0     0
## 301           Kelly, Miss. Anna Katherine "Annie Kate" female  NA     0     0
## 302                                 McCoy, Mr. Bernard   male  NA     2     0
## 304                                Keane, Miss. Nora A female  NA     0     0
## 305                  Williams, Mr. Howard Hugh "Harry"   male  NA     0     0
## 307                            Fleming, Miss. Margaret female  NA     0     0
## 325                           Sage, Mr. George John Jr   male  NA     8     2
## 331                                 McCoy, Miss. Agnes female  NA     2     0
## 335 Frauenthal, Mrs. Henry William (Clara Heinsheimer) female  NA     1     0
## 336                                 Denkoff, Mr. Mitto   male  NA     0     0
## 348          Davison, Mrs. Thomas Henry (Mary E Finck) female  NA     1     0
## 352             Williams-Lambert, Mr. Fletcher Fellows   male  NA     0     0
## 355                                  Yousif, Mr. Wazli   male  NA     0     0
## 359                               McGovern, Miss. Mary female  NA     0     0
## 360                  Mockler, Miss. Helen Mary "Ellie" female  NA     0     0
## 365                                O'Brien, Mr. Thomas   male  NA     1     0
## 368                     Moussa, Mrs. (Mantoura Boulos) female  NA     0     0
## 369                                Jermyn, Miss. Annie female  NA     0     0
## 376              Meyer, Mrs. Edgar Joseph (Leila Saks) female  NA     1     0
## 385                             Plotcharsky, Mr. Vasil   male  NA     0     0
## 389                               Sadlier, Mr. Matthew   male  NA     0     0
## 410                                 Lefebre, Miss. Ida female  NA     3     1
## 411                                 Sdycoff, Mr. Todor   male  NA     0     0
## 412                                    Hart, Mr. Henry   male  NA     0     0
## 414                     Cunningham, Mr. Alfred Fleming   male  NA     0     0
## 416            Meek, Mrs. Thomas (Annie Louise Rowley) female  NA     0     0
## 421                             Gheorgheff, Mr. Stanio   male  NA     0     0
## 426                             Wiseman, Mr. Phillippe   male  NA     0     0
## 429                                   Flynn, Mr. James   male  NA     0     0
## 432  Thorneycroft, Mrs. Percival (Florence Kate White) female  NA     1     0
## 445                  Johannesen-Bratthammer, Mr. Bernt   male  NA     0     0
## 452                    Hagland, Mr. Ingvald Olai Olsen   male  NA     1     0
## 455                                Peduzzi, Mr. Joseph   male  NA     0     0
## 458                  Kenyon, Mrs. Frederick R (Marion) female  NA     1     0
## 460                              O'Connor, Mr. Maurice   male  NA     0     0
## 465                                 Maisner, Mr. Simon   male  NA     0     0
## 467                              Campbell, Mr. William   male  NA     0     0
## 469                                 Scanlan, Mr. James   male  NA     0     0
## 471                                  Keefe, Mr. Arthur   male  NA     0     0
## 476                        Clifford, Mr. George Quincy   male  NA     0     0
## 482                   Frost, Mr. Anthony Wood "Archie"   male  NA     0     0
## 486                             Lefebre, Miss. Jeannie female  NA     3     1
## 491               Hagland, Mr. Konrad Mathias Reiersen   male  NA     1     0
## 496                              Yousseff, Mr. Gerious   male  NA     0     0
## 498                    Shellard, Mr. Frederick William   male  NA     0     0
## 503                     O'Sullivan, Miss. Bridget Mary female  NA     0     0
## 508      Bradley, Mr. George ("George Arthur Brayton")   male  NA     0     0
## 512                                  Webber, Mr. James   male  NA     0     0
## 518                                  Ryan, Mr. Patrick   male  NA     0     0
## 523                                 Lahoud, Mr. Sarkis   male  NA     0     0
## 525                                  Kassem, Mr. Fared   male  NA     0     0
## 528                                 Farthing, Mr. John   male  NA     0     0
## 532                                  Toufik, Mr. Nakli   male  NA     0     0
## 534             Peter, Mrs. Catherine (Catherine Rizk) female  NA     0     2
## 539                           Risien, Mr. Samuel Beard   male  NA     0     0
## 548                         Padro y Manent, Mr. Julian   male  NA     0     0
## 553                               O'Brien, Mr. Timothy   male  NA     0     0
## 558                                Robbins, Mr. Victor   male  NA     0     0
## 561                           Morrow, Mr. Thomas Rowan   male  NA     0     0
## 564                                  Simmons, Mr. John   male  NA     0     0
## 565                     Meanwell, Miss. (Marion Ogden) female  NA     0     0
## 569                                Doharr, Mr. Tannous   male  NA     0     0
## 574                                  Kelly, Miss. Mary female  NA     0     0
## 579                   Caram, Mrs. Joseph (Maria Elias) female  NA     1     0
## 585                                Paulner, Mr. Uscher   male  NA     0     0
## 590                                Murdlin, Mr. Joseph   male  NA     0     0
## 594                                 Bourke, Miss. Mary female  NA     0     2
## 597                         Leitch, Miss. Jessie Wills female  NA     0     0
## 599                                  Boulos, Mr. Hanna   male  NA     0     0
## 602                               Slabenoff, Mr. Petco   male  NA     0     0
## 603                          Harrington, Mr. Charles H   male  NA     0     0
## 612                              Jardin, Mr. Jose Neto   male  NA     0     0
## 613                        Murphy, Miss. Margaret Jane female  NA     1     0
## 614                                   Horgan, Mr. John   male  NA     0     0
## 630                           O'Connell, Mr. Patrick D   male  NA     0     0
## 634                      Parr, Mr. William Henry Marsh   male  NA     0     0
## 640                         Thorneycroft, Mr. Percival   male  NA     1     0
## 644                                    Foo, Mr. Choong   male  NA     0     0
## 649                                 Willey, Mr. Edward   male  NA     0     0
## 651                                  Mitkoff, Mr. Mito   male  NA     0     0
## 654                      O'Leary, Miss. Hanora "Norah" female  NA     0     0
## 657                              Radeff, Mr. Alexander   male  NA     0     0
## 668                         Rommetvedt, Mr. Knud Paust   male  NA     0     0
## 670  Taylor, Mrs. Elmer Zebley (Juliet Cummins Wright) female  NA     1     0
## 675                         Watson, Mr. Ennis Hastings   male  NA     0     0
## 681                                Peters, Miss. Katie female  NA     0     0
## 693                                       Lam, Mr. Ali   male  NA     0     0
## 698                   Mullens, Miss. Katherine "Katie" female  NA     0     0
## 710  Moubarek, Master. Halim Gonios ("William George")   male  NA     1     1
## 712                                 Klaber, Mr. Herman   male  NA     0     0
## 719                                McEvoy, Mr. Michael   male  NA     0     0
## 728                           Mannion, Miss. Margareth female  NA     0     0
## 733                               Knight, Mr. Robert J   male  NA     0     0
## 739                                 Ivanoff, Mr. Kanio   male  NA     0     0
## 740                                 Nankoff, Mr. Minko   male  NA     0     0
## 741                        Hawksford, Mr. Walter James   male  NA     0     0
## 761                                 Garfirth, Mr. John   male  NA     0     0
## 767                          Brewe, Dr. Arthur Jackson   male  NA     0     0
## 769                                Moran, Mr. Daniel J   male  NA     1     0
## 774                                    Elias, Mr. Dibo   male  NA     0     0
## 777                                   Tobin, Mr. Roger   male  NA     0     0
## 779                            Kilgannon, Mr. Thomas J   male  NA     0     0
## 784                             Johnston, Mr. Andrew G   male  NA     1     2
## 791                           Keane, Mr. Andrew "Andy"   male  NA     0     0
## 793                            Sage, Miss. Stella Anna female  NA     8     2
## 794                           Hoyt, Mr. William Fisher   male  NA     0     0
## 816                                   Fry, Mr. Richard   male  NA     0     0
## 826                                    Flynn, Mr. John   male  NA     0     0
## 827                                       Lam, Mr. Len   male  NA     0     0
## 829                       McCormack, Mr. Thomas Joseph   male  NA     0     0
## 833                                     Saad, Mr. Amin   male  NA     0     0
## 838                                Sirota, Mr. Maurice   male  NA     0     0
## 840                               Marechal, Mr. Pierre   male  NA     0     0
## 847                           Sage, Mr. Douglas Bullen   male  NA     8     2
## 850       Goldenberg, Mrs. Samuel L (Edwiga Grabowska) female  NA     1     0
## 860                                   Razi, Mr. Raihed   male  NA     0     0
## 864                  Sage, Miss. Dorothy Edith "Dolly" female  NA     8     2
## 869                        van Melkebeke, Mr. Philemon   male  NA     0     0
## 879                                 Laleff, Mr. Kristo   male  NA     0     0
## 889           Johnston, Miss. Catherine Helen "Carrie" female  NA     1     2
##                 Ticket     Fare Cabin Embarked
## 6               330877   8.4583              Q
## 18              244373  13.0000              S
## 20                2649   7.2250              C
## 27                2631   7.2250              C
## 29              330959   7.8792              Q
## 30              349216   7.8958              S
## 32            PC 17569 146.5208   B78        C
## 33              335677   7.7500              Q
## 37                2677   7.2292              C
## 43              349253   7.8958              C
## 46     S.C./A.4. 23567   8.0500              S
## 47              370371  15.5000              Q
## 48               14311   7.7500              Q
## 49                2662  21.6792              C
## 56               19947  35.5000   C52        S
## 65            PC 17605  27.7208              C
## 66                2661  15.2458              C
## 77              349208   7.8958              S
## 78              374746   8.0500              S
## 83              330932   7.7875              Q
## 88     SOTON/OQ 392086   8.0500              S
## 96              374910   8.0500              S
## 102             349215   7.8958              S
## 108             312991   7.7750              S
## 110             371110  24.1500              Q
## 122          A4. 54510   8.0500              S
## 127             370372   7.7500              Q
## 129               2668  22.3583 F E69        C
## 141               2678  15.2458              C
## 155          Fa 265302   7.3125              S
## 159             315037   8.6625              S
## 160           CA. 2343  69.5500              S
## 167             113505  55.0000   E33        S
## 169           PC 17318  25.9250              S
## 177               4133  25.4667              S
## 181           CA. 2343  69.5500              S
## 182      SC/PARIS 2131  15.0500              C
## 186             113767  50.0000   A32        S
## 187             370365  15.5000              Q
## 197             368703   7.7500              Q
## 199             370370   7.7500              Q
## 202           CA. 2343  69.5500              S
## 215             367229   7.7500              Q
## 224             349234   7.8958              S
## 230               4133  25.4667              S
## 236         W./C. 6609   7.5500              S
## 241               2665  14.4542              C
## 242             367230  15.5000              Q
## 251             362316   7.2500              S
## 257           PC 17585  79.2000              C
## 261             384461   7.7500              Q
## 265             382649   7.7500              Q
## 271             113798  31.0000              S
## 275             370375   7.7500              Q
## 278             239853   0.0000              S
## 285             113056  26.0000   A19        S
## 296           PC 17612  27.7208              C
## 299              19988  30.5000  C106        S
## 301               9234   7.7500              Q
## 302             367226  23.2500              Q
## 304             226593  12.3500  E101        Q
## 305           A/5 2466   8.0500              S
## 307              17421 110.8833              C
## 325           CA. 2343  69.5500              S
## 331             367226  23.2500              Q
## 335           PC 17611 133.6500              S
## 336             349225   7.8958              S
## 348             386525  16.1000              S
## 352             113510  35.0000  C128        S
## 355               2647   7.2250              C
## 359             330931   7.8792              Q
## 360             330980   7.8792              Q
## 365             370365  15.5000              Q
## 368               2626   7.2292              C
## 369              14313   7.7500              Q
## 376           PC 17604  82.1708              C
## 385             349227   7.8958              S
## 389             367655   7.7292              Q
## 410               4133  25.4667              S
## 411             349222   7.8958              S
## 412             394140   6.8583              Q
## 414             239853   0.0000              S
## 416             343095   8.0500              S
## 421             349254   7.8958              C
## 426         A/4. 34244   7.2500              S
## 429             364851   7.7500              Q
## 432             376564  16.1000              S
## 445              65306   8.1125              S
## 452              65303  19.9667              S
## 455           A/5 2817   8.0500              S
## 458              17464  51.8625   D21        S
## 460             371060   7.7500              Q
## 465           A/S 2816   8.0500              S
## 467             239853   0.0000              S
## 469              36209   7.7250              Q
## 471             323592   7.2500              S
## 476             110465  52.0000   A14        S
## 482             239854   0.0000              S
## 486               4133  25.4667              S
## 491              65304  19.9667              S
## 496               2627  14.4583              C
## 498          C.A. 6212  15.1000              S
## 503             330909   7.6292              Q
## 508             111427  26.5500              S
## 512   SOTON/OQ 3101316   8.0500              S
## 518             371110  24.1500              Q
## 523               2624   7.2250              C
## 525               2700   7.2292              C
## 528           PC 17483 221.7792   C95        S
## 532               2641   7.2292              C
## 534               2668  22.3583              C
## 539             364498  14.5000              S
## 548      SC/PARIS 2146  13.8625              C
## 553             330979   7.8292              Q
## 558           PC 17757 227.5250              C
## 561             372622   7.7500              Q
## 564    SOTON/OQ 392082   8.0500              S
## 565  SOTON/O.Q. 392087   8.0500              S
## 569               2686   7.2292              C
## 574              14312   7.7500              Q
## 579               2689  14.4583              C
## 585               3411   8.7125              C
## 590         A./5. 3235   8.0500              S
## 594             364848   7.7500              Q
## 597             248727  33.0000              S
## 599               2664   7.2250              C
## 602             349214   7.8958              S
## 603             113796  42.4000              S
## 612 SOTON/O.Q. 3101305   7.0500              S
## 613             367230  15.5000              Q
## 614             370377   7.7500              Q
## 630             334912   7.7333              Q
## 634             112052   0.0000              S
## 640             376564  16.1000              S
## 644               1601  56.4958              S
## 649      S.O./P.P. 751   7.5500              S
## 651             349221   7.8958              S
## 654             330919   7.8292              Q
## 657             349223   7.8958              S
## 668             312993   7.7750              S
## 670              19996  52.0000  C126        S
## 675             239856   0.0000              S
## 681             330935   8.1375              Q
## 693               1601  56.4958              S
## 698              35852   7.7333              Q
## 710               2661  15.2458              C
## 712             113028  26.5500  C124        S
## 719              36568  15.5000              Q
## 728              36866   7.7375              Q
## 733             239855   0.0000              S
## 739             349201   7.8958              S
## 740             349218   7.8958              S
## 741              16988  30.0000   D45        S
## 761             358585  14.5000              S
## 767             112379  39.6000              C
## 769             371110  24.1500              Q
## 774               2674   7.2250              C
## 777             383121   7.7500   F38        Q
## 779              36865   7.7375              Q
## 784         W./C. 6607  23.4500              S
## 791              12460   7.7500              Q
## 793           CA. 2343  69.5500              S
## 794           PC 17600  30.6958              C
## 816             112058   0.0000  B102        S
## 826             368323   6.9500              Q
## 827               1601  56.4958              S
## 829             367228   7.7500              Q
## 833               2671   7.2292              C
## 838             392092   8.0500              S
## 840              11774  29.7000   C47        C
## 847           CA. 2343  69.5500              S
## 850              17453  89.1042   C92        C
## 860               2629   7.2292              C
## 864           CA. 2343  69.5500              S
## 869             345777   9.5000              S
## 879             349217   7.8958              S
## 889         W./C. 6607  23.4500              S
data_selected <- df[, c("Age", "SibSp", "Parch", "Fare")]
data_clean <- na.omit(data_selected)

str(data_clean)
## 'data.frame':    714 obs. of  4 variables:
##  $ Age  : num  22 38 26 35 35 54 2 27 14 4 ...
##  $ SibSp: int  1 1 0 1 0 0 3 0 1 1 ...
##  $ Parch: int  0 0 0 0 0 0 1 2 0 1 ...
##  $ Fare : num  7.25 71.28 7.92 53.1 8.05 ...
##  - attr(*, "na.action")= 'omit' Named int [1:177] 6 18 20 27 29 30 32 33 37 43 ...
##   ..- attr(*, "names")= chr [1:177] "6" "18" "20" "27" ...
dim(data_clean)
## [1] 714   4
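
As a quick check on the cleaning step, the sketch below counts missing values per column before applying na.omit(). It is only an illustration: it uses the six rows printed by head(df) rather than the full file, and the names toy, missing_counts, toy_clean, and dropped_rows are assumptions introduced here.

```r
# Illustrative subset: the first six rows shown by head(df), four columns only
toy <- data.frame(
  Age   = c(22, 38, 26, 35, 35, NA),
  SibSp = c(1L, 1L, 0L, 1L, 0L, 0L),
  Parch = c(0L, 0L, 0L, 0L, 0L, 0L),
  Fare  = c(7.25, 71.2833, 7.925, 53.1, 8.05, 8.4583)
)

# Number of missing values in each column
missing_counts <- colSums(is.na(toy))
print(missing_counts)

# Listwise deletion: drop every row that has at least one NA
toy_clean <- na.omit(toy)

# Row positions that na.omit() removed, read from its "na.action" attribute
dropped_rows <- as.integer(attr(toy_clean, "na.action"))
print(dropped_rows)
```

Running colSums(is.na(data_selected)) on the full data would show which columns account for the 177 omitted rows recorded in the str() output.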

Data Distribution and Visualization

The histograms provide a clear visual overview of the distributions of Age, SibSp, Parch, and Fare. The Age variable shows a unimodal distribution that is relatively close to normal, with most passengers concentrated between approximately 20 and 40 years old. This indicates that the majority of Titanic passengers were young to middle-aged adults.

In contrast, SibSp and Parch are both strongly right-skewed: most passengers have values of 0 or 1, meaning they traveled alone or with very few family members. As the number of siblings, spouses, parents, or children increases, the frequency declines sharply, suggesting that large family groups were uncommon.

The Fare variable shows the most extreme right skew, with most passengers paying low ticket prices and a small number paying very high fares, reaching values above 500. This reflects significant economic inequality among passengers and differences in ticket class. Overall, these histograms indicate that Age has a more balanced distribution, while SibSp, Parch, and especially Fare are heavily right-skewed.

par(mfrow = c(2,2))

hist(data_clean$Age,
     main = "Distribution of Age",
     xlab = "Age",
     col = "lightblue")

hist(data_clean$SibSp,
     main = "Distribution of SibSp",
     xlab = "SibSp",
     col = "lightgreen")

hist(data_clean$Parch,
     main = "Distribution of Parch",
     xlab = "Parch",
     col = "lightpink")

hist(data_clean$Fare,
     main = "Distribution of Fare",
     xlab = "Fare",
     col = "khaki")

par(mfrow = c(1,1))
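
The visual impression of skew can be backed by a number. The sketch below defines a moment-based sample skewness; the helper skewness() is hypothetical (base R has no function of that name), and the two input vectors are small illustrative samples, the second taken from the Fare values printed by head(df).

```r
# Hypothetical helper: moment-based sample skewness (not a base R function)
skewness <- function(x) {
  x <- x[!is.na(x)]
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

# Symmetric values give zero skewness
symmetric <- c(1, 2, 3, 4, 5)
print(skewness(symmetric))

# A few large values among several small ones give positive (right) skew
right_skewed <- c(7.25, 7.925, 8.05, 8.4583, 53.1, 71.2833)
print(skewness(right_skewed))
```

Applied to the cleaned data, sapply(data_clean, skewness) would give one value per variable, with larger positive values indicating a longer right tail.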

Correlation Matrix

The correlation matrix was computed to measure the linear relationships among Age, SibSp, Parch, and Fare. The results show a moderate negative correlation between Age and SibSp (r = -0.31), suggesting that younger passengers were more likely to travel with siblings or spouses.

Age also has a weak negative correlation with Parch (r = -0.19), indicating that younger passengers more often traveled with parents or children. The strongest correlation in the matrix is the positive one between SibSp and Parch (r = 0.38), which suggests that passengers traveling with siblings or spouses also tended to travel with other family members.

Fare shows weak positive correlations with SibSp (r = 0.14) and Parch (r = 0.21), while its correlation with Age is very weak (r = 0.10). Overall, every off-diagonal correlation is below 0.4 in absolute value, so the relationships among variables are weak to moderate, and no strong linear dependency is present.

cor_matrix <- cor(data_clean)
cor_matrix
##               Age      SibSp      Parch       Fare
## Age    1.00000000 -0.3082468 -0.1891193 0.09606669
## SibSp -0.30824676  1.0000000  0.3838199 0.13832879
## Parch -0.18911926  0.3838199  1.0000000 0.20511888
## Fare   0.09606669  0.1383288  0.2051189 1.00000000
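
To rank the pairwise relationships rather than read them off the matrix, the sketch below lists each variable pair once and sorts by absolute correlation. The matrix is retyped from the printed output above; cor_printed and pair_table are names introduced for this illustration.

```r
# Correlation matrix retyped from the printed output
vars <- c("Age", "SibSp", "Parch", "Fare")
cor_printed <- matrix(
  c( 1.00000000, -0.3082468, -0.1891193, 0.09606669,
    -0.30824676,  1.0000000,  0.3838199, 0.13832879,
    -0.18911926,  0.3838199,  1.0000000, 0.20511888,
     0.09606669,  0.1383288,  0.2051189, 1.00000000),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# Keep each pair once by taking the upper triangle (diagonal excluded)
upper <- which(upper.tri(cor_printed), arr.ind = TRUE)
pair_table <- data.frame(
  var1 = rownames(cor_printed)[upper[, "row"]],
  var2 = colnames(cor_printed)[upper[, "col"]],
  r    = cor_printed[upper]
)

# Sort from strongest to weakest by absolute correlation
pair_table <- pair_table[order(-abs(pair_table$r)), ]
print(pair_table, row.names = FALSE)
```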

Heatmap of the Correlation Matrix

The correlation heatmap provides a visual representation of the correlation matrix, with a palette that runs from red for the lowest values through white to blue for the highest. Note that heatmap() also reorders the rows and columns by hierarchical clustering, so the variable order may differ from the printed matrix. Apart from the diagonal, which is always 1, no correlation is high in magnitude, which suggests that multicollinearity is not a major issue in the dataset.

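heatmap(
  cor_matrix,
  symm = TRUE,
  scale = "none",     # plot the raw correlations without row/column rescaling
  zlim = c(-1, 1),    # passed on to image(): fixes the colour scale so white sits at 0
  col = colorRampPalette(c("red", "white", "blue"))(100),
  main = "Heatmap of Correlation Matrix"
)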

Variance–Covariance Matrix

The variance–covariance matrix was calculated to examine the variability of each variable and how the variables vary together in their original measurement units. The variance of Fare is by far the largest (about 2800.41), indicating a very high level of dispersion in ticket prices. The variance of Age is also relatively large (about 211.02), reflecting the wide range of passenger ages.

In contrast, SibSp and Parch have much smaller variances (about 0.86 and 0.73) due to their limited ranges. Because variances and covariances depend on the units of measurement, these magnitudes are not directly comparable across variables. The positive covariance between SibSp and Parch (about 0.30) reinforces the earlier finding that family-related variables tend to increase together.

cov_matrix <- cov(data_clean)
cov_matrix
##              Age      SibSp      Parch        Fare
## Age   211.019125 -4.1633339 -2.3441911   73.849030
## SibSp  -4.163334  0.8644973  0.3045128    6.806212
## Parch  -2.344191  0.3045128  0.7281027    9.262176
## Fare   73.849030  6.8062117  9.2621760 2800.413100
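
The covariance and correlation matrices are linked by a simple rescaling: each covariance divided by the product of the two standard deviations gives the corresponding correlation. The sketch below checks this on the printed covariance matrix; cov_printed, sds, cor_manual, and cor_builtin are names introduced for this illustration.

```r
# Covariance matrix retyped from the printed output
vars <- c("Age", "SibSp", "Parch", "Fare")
cov_printed <- matrix(
  c(211.019125, -4.1633339, -2.3441911,   73.849030,
     -4.163334,  0.8644973,  0.3045128,    6.806212,
     -2.344191,  0.3045128,  0.7281027,    9.262176,
     73.849030,  6.8062117,  9.2621760, 2800.413100),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# Standard deviations are the square roots of the diagonal variances
sds <- sqrt(diag(cov_printed))

# Divide each covariance by the product of the two standard deviations
cor_manual <- cov_printed / outer(sds, sds)

# Base R's cov2cor() performs the same rescaling
cor_builtin <- cov2cor(cov_printed)

print(round(cor_manual, 4))
```

Up to rounding in the printed values, the result should match the correlation matrix reported earlier.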

Eigen Values and Scree Plot

Eigen values were obtained from the eigen decomposition of the covariance matrix to identify how much variance is explained by each principal component. The first eigen value (about 2802.56) accounts for roughly 93% of the total variance (about 3013.02), and the second (about 209.04) for roughly 7%. Because the decomposition uses the covariance matrix of unstandardized variables, this dominance largely reflects the large variance of Fare.

The scree plot shows a sharp drop after the first component, and the first two components together account for more than 99.9% of the total variance.

eigen_result <- eigen(cov_matrix)

eigen_values <- eigen_result$values
eigen_vectors <- eigen_result$vectors

eigen_values
## [1] 2802.5636587  209.0385659    0.9438783    0.4787214
eigen_vectors
##             [,1]        [,2]         [,3]          [,4]
## [1,] 0.028477552  0.99929943 -0.024018111  0.0035788596
## [2,] 0.002386349 -0.02093144 -0.773693322  0.6332099362
## [3,] 0.003280818 -0.01253786 -0.633088089 -0.7739712590
## [4,] 0.999586200 -0.02837826  0.004609234  0.0009266652
plot(
  eigen_values,
  type = "b",
  pch = 19,
  col = "blue",
  xlab = "Principal Component",
  ylab = "Eigen Value",
  main = "Scree Plot of Eigen Values"
)
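
The share of variance carried by each component follows directly from the eigen values: each value divided by their sum. The sketch below works from the printed eigen values; eigen_values_printed, prop_var, cum_var, and variance_table are names introduced for this illustration.

```r
# Eigen values retyped from the printed output
eigen_values_printed <- c(2802.5636587, 209.0385659, 0.9438783, 0.4787214)

# Proportion of total variance per component, and the running total
prop_var <- eigen_values_printed / sum(eigen_values_printed)
cum_var  <- cumsum(prop_var)

variance_table <- data.frame(
  component  = paste0("PC", seq_along(eigen_values_printed)),
  eigenvalue = eigen_values_printed,
  proportion = round(prop_var, 4),
  cumulative = round(cum_var, 4)
)
print(variance_table, row.names = FALSE)
```

The same calculation on the eigen_values object from the analysis would be eigen_values / sum(eigen_values).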

Eigen Vectors

Eigen vectors describe how each original variable contributes to the principal components; the rows of the printed matrix follow the variable order Age, SibSp, Parch, Fare. The first principal component is almost entirely dominated by Fare (loading about 0.9996), indicating that differences in ticket prices are the main source of variability when the variables are kept in their original units.

The second principal component is primarily associated with Age (loading about 0.9993), while the third and fourth components are combinations of SibSp and Parch and contribute relatively little variance.
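
Because eigen() returns its vectors without row or column labels, it can help to attach the variable names and look at squared loadings, which sum to 1 within each component. The sketch below does this for the printed covariance matrix; cov_printed, loadings, contribution, and dominant are names introduced for this illustration.

```r
# Covariance matrix retyped from the printed output
vars <- c("Age", "SibSp", "Parch", "Fare")
cov_printed <- matrix(
  c(211.019125, -4.1633339, -2.3441911,   73.849030,
     -4.163334,  0.8644973,  0.3045128,    6.806212,
     -2.344191,  0.3045128,  0.7281027,    9.262176,
     73.849030,  6.8062117,  9.2621760, 2800.413100),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

eig <- eigen(cov_printed, symmetric = TRUE)

# Label rows with variable names and columns with component names
loadings <- eig$vectors
dimnames(loadings) <- list(vars, paste0("PC", 1:4))

# Squared loadings: share of each unit-length eigen vector due to each variable
contribution <- loadings^2
print(round(contribution, 4))

# Variable with the largest squared loading in each component
dominant <- vars[apply(contribution, 2, which.max)]
names(dominant) <- colnames(loadings)
print(dominant)
```

Squaring the loadings also removes the sign, which is arbitrary for eigen vectors and can differ between software versions.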

Overall Conclusion

Based on the entire analysis, Fare and Age are the dominant sources of variation in the dataset when the variables are measured in their original units, while SibSp and Parch play a secondary role related to family structure. The correlations among variables are weak to moderate (all below 0.4 in absolute value), indicating low multicollinearity.

Eigen analysis shows that most of the variance in the dataset can be captured using one or two principal components, which makes dimensionality reduction techniques such as Principal Component Analysis a natural next step. Since these covariance-based results are sensitive to measurement scale, standardizing the variables before such an analysis is worth considering. These results provide a statistical foundation for more advanced modeling or exploratory techniques using the Titanic dataset.
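
One possible follow-up, which is not part of the analysis above, is to repeat the eigen decomposition on the correlation matrix, so that each variable enters with unit variance and no single variable dominates through its scale. The sketch below uses the printed correlation matrix; cor_printed, eig_cor, and prop_cor are names introduced for this illustration.

```r
# Correlation matrix retyped from the printed output
vars <- c("Age", "SibSp", "Parch", "Fare")
cor_printed <- matrix(
  c( 1.00000000, -0.3082468, -0.1891193, 0.09606669,
    -0.30824676,  1.0000000,  0.3838199, 0.13832879,
    -0.18911926,  0.3838199,  1.0000000, 0.20511888,
     0.09606669,  0.1383288,  0.2051189, 1.00000000),
  nrow = 4, byrow = TRUE, dimnames = list(vars, vars)
)

# Eigen decomposition of the standardized (correlation) structure
eig_cor <- eigen(cor_printed, symmetric = TRUE)

# Eigen values of a 4 x 4 correlation matrix sum to 4 (its trace)
prop_cor <- eig_cor$values / sum(eig_cor$values)
print(round(prop_cor, 3))
```

Comparing these proportions with the covariance-based ones indicates how much of the first component's dominance is due to the scale of Fare rather than to shared structure among the variables.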