Quarto Presentation – Starter Example

Titanic Dataset

Shantee

Introduction

  • This is an introduction to creating presentation output using Quarto
  • Notice how headers are used to create pages of content
  • This is just a simple example; we will improve on the design and flow throughout the semester
  • Use the CRISP-DM model to create a relevant storyline
    Image: CRISP-DM

Data Understanding

  • We will use the Titanic dataset for our analysis
  • The dataset has information on 1309 passengers who were aboard the Titanic when it sank in April 1912
  • The dataset has the following variables:
'data.frame':   1309 obs. of  12 variables:
 $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
 $ Survived   : int  0 1 1 1 0 0 0 0 1 1 ...
 $ Pclass     : int  3 1 3 1 3 3 1 3 3 2 ...
 $ Name       : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
 $ Sex        : chr  "male" "female" "female" "female" ...
 $ Age        : num  22 38 26 35 35 NA 54 2 27 14 ...
 $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
 $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket     : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
 $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin      : chr  "" "C85" "" "C123" ...
 $ Embarked   : chr  "S" "C" "S" "S" ...
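
The listing above is the kind of output that str() prints for a data frame. As a minimal sketch, the code below rebuilds a few columns from the first four rows as they are displayed above and inspects them. The name titanic_head is hypothetical, and the file and read call used for the real data are not shown in these slides.

```r
# First four rows of selected columns, copied from the str() display above.
# (Fare values are the rounded values as displayed.)
titanic_head <- data.frame(
  PassengerId = 1:4,
  Survived    = c(0L, 1L, 1L, 1L),
  Pclass      = c(3L, 1L, 3L, 1L),
  Sex         = c("male", "female", "female", "female"),
  Age         = c(22, 38, 26, 35),
  Fare        = c(7.25, 71.28, 7.92, 53.1),
  Embarked    = c("S", "C", "S", "S"),
  stringsAsFactors = FALSE
)

# Print the compact structure: one line per variable with type and values
str(titanic_head)
```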

Passenger Statistics by Gender

  • The Titanic had more men than women – almost two-thirds were men.

female   male 
   466    843 
  • Percent distribution by gender

female   male 
  35.6   64.4 
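
The percent distribution follows directly from the counts. A minimal sketch in base R, using the two counts shown above (the variable names are illustrative):

```r
# Counts by gender, taken from the frequency table above
gender_counts <- c(female = 466, male = 843)

# prop.table() divides each count by the total; scale to percent and round
gender_pct <- round(100 * prop.table(gender_counts), 1)
print(gender_pct)
```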

Passenger Statistics by Survival

  • These data show that fewer passengers survived than did not survive.
  • However, there are quite a few passengers for whom no survival information is available.

Did Not Survive        Survived          Unsure 
            549             342             418 
  • Percent distribution by survival

Did Not Survive        Survived          Unsure 
           41.9            26.1            31.9 
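
One way a three-level survival label could be built is sketched below. It assumes that "Unsure" stands for a missing value in Survived, which the slides do not state, and it uses a short made-up vector rather than the real data.

```r
# Made-up values: 0 = did not survive, 1 = survived, NA = not recorded (assumption)
survived <- c(0L, 1L, 1L, NA, 0L, NA)

# Map each code to a label, sending missing values to "Unsure"
label <- ifelse(is.na(survived), "Unsure",
                ifelse(survived == 1L, "Survived", "Did Not Survive"))

# Fix the level order so tables always list the categories the same way
survived_f <- factor(label, levels = c("Did Not Survive", "Survived", "Unsure"))
print(table(survived_f))
```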

Passenger Statistics by Gender and Survival

  • About half of the women are known to have survived
  • Over half of the men are known to have perished
Cross-Tabulation, Row Proportions  
Sex * Survived.f  
Data Frame: titanic  

-------- ------------ ----------------- ------------- ------------- ---------------
           Survived.f   Did Not Survive      Survived        Unsure           Total
     Sex                                                                           
  female                     81 (17.4%)   233 (50.0%)   152 (32.6%)    466 (100.0%)
    male                    468 (55.5%)   109 (12.9%)   266 (31.6%)    843 (100.0%)
   Total                    549 (41.9%)   342 (26.1%)   418 (31.9%)   1309 (100.0%)
-------- ------------ ----------------- ------------- ------------- ---------------
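
Row proportions divide each cell by its row total, so each gender's row sums to 100%. The sketch below reproduces the percentages with base R's prop.table() from the counts in the table above. It is not necessarily the call that produced the table.

```r
# Counts from the cross-tabulation above (rows: Sex, columns: Survived.f)
counts <- matrix(
  c( 81, 233, 152,
    468, 109, 266),
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex = c("female", "male"),
                  Survived.f = c("Did Not Survive", "Survived", "Unsure"))
)

# margin = 1 divides every cell by its row total
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
print(row_pct)
```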

Passenger Statistics by Gender and Survival

  • Most of the non-survivors were men while most of the survivors were women.
Cross-Tabulation, Column Proportions  
Sex * Survived.f  
Data Frame: titanic  

-------- ------------ ----------------- -------------- -------------- ---------------
           Survived.f   Did Not Survive       Survived         Unsure           Total
     Sex                                                                             
  female                    81 ( 14.8%)   233 ( 68.1%)   152 ( 36.4%)    466 ( 35.6%)
    male                   468 ( 85.2%)   109 ( 31.9%)   266 ( 63.6%)    843 ( 64.4%)
   Total                   549 (100.0%)   342 (100.0%)   418 (100.0%)   1309 (100.0%)
-------- ------------ ----------------- -------------- -------------- ---------------
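
Column proportions divide each cell by its column total instead, so each survival category sums to 100%. A minimal base R sketch using the same counts (again, not necessarily how the table above was generated):

```r
# Counts from the cross-tabulation above (rows: Sex, columns: Survived.f)
counts <- matrix(
  c( 81, 233, 152,
    468, 109, 266),
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex = c("female", "male"),
                  Survived.f = c("Did Not Survive", "Survived", "Unsure"))
)

# margin = 2 divides every cell by its column total
col_pct <- round(100 * prop.table(counts, margin = 2), 1)
print(col_pct)
```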

Average Age by Gender

  • The average age of passengers on board the Titanic is 30
  • The average age of female passengers is 29
  • The average age of male passengers is 31

Average Age by Survival

Survivors tended to be younger than those who did not survive.

Survived         Average Age
Did Not Survive           31
Survived                  28
Unsure                    30

Average Age by Gender and Survival

Non-surviving females are younger than surviving females. The opposite is true among males.
The youngest group is female non-survivors.

Gender  Survived         Average Age
female  Did Not Survive           25
female  Survived                  29
female  Unsure                    30
male    Did Not Survive           32
male    Survived                  27
male    Unsure                    30
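
Group averages like these can be computed with aggregate(). The sketch below uses a small made-up data frame (toy is hypothetical and its ages are not from the Titanic data) to show the pattern, including how missing ages are handled.

```r
# Made-up example rows; one age is missing on purpose
toy <- data.frame(
  Sex        = c("female", "female", "female", "female",
                 "male", "male", "male", "male"),
  Survived.f = c("Survived", "Survived", "Did Not Survive", "Did Not Survive",
                 "Did Not Survive", "Did Not Survive", "Survived", "Survived"),
  Age        = c(30, 28, 26, NA, 40, 24, 27, 31)
)

# The formula interface drops rows with a missing Age before averaging
avg_age <- aggregate(Age ~ Sex + Survived.f, data = toy, FUN = mean)
avg_age$Age <- round(avg_age$Age)
print(avg_age)
```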

Average Age by Fare Class and Survival

  • Passengers booked fares in First, Second, or Third class
  • First class passengers tended to be older, regardless of survival status
Survived         Fare Class  Average Age
Did Not Survive  1                    44
Did Not Survive  2                    34
Did Not Survive  3                    27
Survived         1                    35
Survived         2                    26
Survived         3                    21
Unsure           1                    41
Unsure           2                    29
Unsure           3                    24

Average Age by Fare Class, Gender, and Survival

Among males, those who did not survive tended to be older than survivors, regardless of fare class.
Among females, first class passengers who did not survive were younger than survivors.

Analysis using the Embarked variable

This chart shows the distribution of the Titanic’s passengers across its boarding ports, along with the differences in the fares passengers paid. Most passengers boarded at Southampton, followed by Cherbourg (France) and then Queenstown (Ireland).

'data.frame':   1309 obs. of  15 variables:
 $ PassengerId: int  1 2 3 4 5 6 7 8 9 10 ...
 $ Survived   : int  0 1 1 1 0 0 0 0 1 1 ...
 $ Pclass     : int  3 1 3 1 3 3 1 3 3 2 ...
 $ Name       : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
 $ Sex        : chr  "male" "female" "female" "female" ...
 $ Age        : num  22 38 26 35 35 NA 54 2 27 14 ...
 $ SibSp      : int  1 1 0 1 0 0 0 3 0 1 ...
 $ Parch      : int  0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket     : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
 $ Fare       : num  7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin      : chr  "" "C85" "" "C123" ...
 $ Embarked   : chr  "S" "C" "S" "S" ...
 $ Survived.f : Factor w/ 3 levels "Did Not Survive",..: 1 2 2 2 1 1 1 1 2 2 ...
 $ Embarked.f : Factor w/ 3 levels "C","Q","S": 3 1 3 3 3 2 3 3 3 1 ...
 $ Fare_bins  : Factor w/ 4 levels "Bin1","Bin2",..: 1 4 2 4 2 2 4 3 2 3 ...
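
The listing shows two derived factors, Embarked.f and Fare_bins. The sketch below shows one way such variables could be created. The Embarked values are the first ten implied by the factor codes above; the fares are made up, and the use of quartile cut points for the four bins is an assumption, since the slides do not show how the bins were defined.

```r
# First ten boarding ports, as implied by the Embarked.f codes above
embarked <- c("S", "C", "S", "S", "S", "Q", "S", "S", "S", "C")
embarked_f <- factor(embarked, levels = c("C", "Q", "S"))

# Made-up fares, for illustration only
fare <- c(5, 10, 15, 20, 25, 30, 35, 40)

# Assumption: four bins split at the quartiles of the fare
breaks <- quantile(fare, probs = seq(0, 1, by = 0.25), names = FALSE)
fare_bins <- cut(fare, breaks = breaks,
                 labels = paste0("Bin", 1:4), include.lowest = TRUE)

print(table(embarked_f))
print(table(fare_bins))
```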

This chart shows that passengers who paid for cheaper tickets were less likely to survive, and that the chance of survival rose with the fare paid. Despite this, the number of passengers with unconfirmed outcomes was about the same regardless of ticket price. Additionally, because Southampton had so many passengers, it also had the most passenger deaths.

C: Cherbourg, France

A total of 270 passengers boarded at this port, the majority of them in first class.

Pclass  Survival         Count
1       Did Not Survive     26
1       Survived            59
1       Unsure              56
2       Did Not Survive      8
2       Survived             9
2       Unsure              11
3       Did Not Survive     41
3       Survived            25
3       Unsure              35

S: Southampton, England

This port had 916 passengers, the most of the three. In first and second class, known survivors and non-survivors were roughly balanced, but third class passengers suffered far more deaths (286 did not survive versus 67 who survived).

Pclass  Survival         Count
1       Did Not Survive     53
1       Survived            76
1       Unsure              50
2       Did Not Survive     88
2       Survived            76
2       Unsure              78
3       Did Not Survive    286
3       Survived            67
3       Unsure             142

Q: Queenstown, Ireland

This port had the fewest passengers (123) and the fewest first class passengers. Like Southampton, most of its passengers were in third class, which suffered large losses.

Pclass  Survival         Count
1       Did Not Survive      1
1       Survived             1
1       Unsure               1
2       Did Not Survive      1
2       Survived             2
2       Unsure               4
3       Did Not Survive     45
3       Survived            27
3       Unsure              41
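
The passenger totals quoted for each port can be checked against the three count tables. A minimal sketch in base R; the data frame name port_counts is illustrative, and the counts are copied from the Cherbourg, Southampton, and Queenstown tables.

```r
# Counts by port, class, and survival status, copied from the three tables
port_counts <- data.frame(
  Port     = rep(c("C", "S", "Q"), each = 9),
  Pclass   = rep(rep(1:3, each = 3), times = 3),
  Survival = rep(c("Did Not Survive", "Survived", "Unsure"), times = 9),
  Count    = c(26, 59, 56,  8,  9, 11,  41, 25,  35,   # C: Cherbourg
               53, 76, 50, 88, 76, 78, 286, 67, 142,   # S: Southampton
                1,  1,  1,  1,  2,  4,  45, 27,  41)   # Q: Queenstown
)

# Total passengers per port
port_totals <- tapply(port_counts$Count, port_counts$Port, sum)
print(port_totals)

# Passengers per port and class, summed over survival status
class_by_port <- xtabs(Count ~ Port + Pclass, data = port_counts)
print(class_by_port)
```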