Assignment 2 DATA 712

rm(list = ls())
gc()

##          used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 537935 28.8    1193727 63.8   686460 36.7
## Vcells 979941  7.5    8388608 64.0  1876069 14.4

knitr::opts_chunk$set(error = TRUE)

library(readr)

## Warning: package 'readr' was built under R version 4.4.2

d_csv <- read_csv("C:/DATA 712/titanic_data.csv", col_names = TRUE)

## Rows: 891 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): Name, Sex, Ticket, Cabin, Embarked
## dbl (7): PassengerId, Survived, Pclass, Age, SibSp, Parch, Fare
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

head(d_csv)

## # A tibble: 6 × 12
##   PassengerId Survived Pclass Name    Sex     Age SibSp Parch Ticket  Fare Cabin
##         <dbl>    <dbl>  <dbl> <chr>   <chr> <dbl> <dbl> <dbl> <chr>  <dbl> <chr>
## 1           1        0      3 Braund… male     22     1     0 A/5 2…  7.25 <NA> 
## 2           2        1      1 Cuming… fema…    38     1     0 PC 17… 71.3  C85  
## 3           3        1      3 Heikki… fema…    26     0     0 STON/…  7.92 <NA> 
## 4           4        1      1 Futrel… fema…    35     1     0 113803 53.1  C123 
## 5           5        0      3 Allen,… male     35     0     0 373450  8.05 <NA> 
## 6           6        0      3 Moran,… male     NA     0     0 330877  8.46 <NA> 
## # ℹ 1 more variable: Embarked <chr>

install.packages("tidyverse")
library(tidyverse)

if (!requireNamespace("tidyverse", quietly = TRUE)) {
    install.packages("tidyverse", repos = "https://cloud.r-project.org/")
}
chooseCRANmirror()

## Error in .chooseMirror(m, "CRAN", graphics, ind): cannot choose a CRAN mirror non-interactively
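
The error above occurs because `chooseCRANmirror()` expects an interactive session, which is not available while knitting. One common alternative, sketched here as an assumption about the intended setup rather than a record of what was actually run, is to set the `repos` option explicitly before any `install.packages()` call, using the same mirror URL as the chunk above.

```r
# Set the CRAN mirror without prompting (works in non-interactive sessions).
options(repos = c(CRAN = "https://cloud.r-project.org/"))

# Install dplyr only if it is missing; install.packages() now has a mirror to use.
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Show the mirror that will be used.
getOption("repos")[["CRAN"]]
```

With the option set, later `install.packages()` calls should no longer stop with "trying to use CRAN without setting a mirror".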

d_csv %>%
  group_by(Sex, Pclass) %>% 
  summarise(avg_fare = mean(Fare, na.rm = TRUE)) %>%  
  arrange(desc(avg_fare))

## Error in d_csv %>% group_by(Sex, Pclass) %>% summarise(avg_fare = mean(Fare, : could not find function "%>%"

install.packages("readr")

## Warning: package 'readr' is in use and will not be installed

install.packages("dplyr")

## Installing package into 'C:/Users/viole/AppData/Local/R/win-library/4.4'
## (as 'lib' is unspecified)

## Error in contrib.url(repos, "source"): trying to use CRAN without setting a mirror

library(readr)
library(dplyr)

## Warning: package 'dplyr' was built under R version 4.4.2

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

d_csv %>%
  group_by(Sex, Pclass) %>%
  summarise(survival_rate = mean(Survived, na.rm = TRUE)) %>%
  arrange(desc(survival_rate))

## `summarise()` has grouped output by 'Sex'. You can override using the `.groups`
## argument.

## # A tibble: 6 × 3
## # Groups:   Sex [2]
##   Sex    Pclass survival_rate
##   <chr>   <dbl>         <dbl>
## 1 female      1         0.968
## 2 female      2         0.921
## 3 female      3         0.5  
## 4 male        1         0.369
## 5 male        2         0.157
## 6 male        3         0.135

Based on my analysis of the Titanic dataset, ticket prices and survival rates both differ markedly by sex and passenger class. Passengers in higher classes (Pclass 1) paid more on average than those in lower classes, and within the same class women generally paid higher fares than men. Because fare rises with class, Pclass can be read as a rough proxy for a passenger's financial resources, which may have shaped their experience aboard the Titanic.
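
The fare comparison above relies on the `avg_fare` summary, which did not run earlier because `%>%` was not available before dplyr was attached. A minimal sketch of that summary with `library(dplyr)` loaded first is shown below. The helper name `fare_by_group` is hypothetical, and the `toy` rows are made-up placeholders used only so the block runs on its own; they are not Titanic records. With the real data, the call would be `fare_by_group(d_csv)`.

```r
library(dplyr)  # provides %>%, group_by(), summarise(), arrange()

# Hypothetical helper: mean fare for each Sex x Pclass combination,
# sorted from the highest to the lowest average fare.
fare_by_group <- function(df) {
  df %>%
    group_by(Sex, Pclass) %>%
    summarise(avg_fare = mean(Fare, na.rm = TRUE), .groups = "drop") %>%
    arrange(desc(avg_fare))
}

# Made-up placeholder rows (NOT real Titanic data), same column names as d_csv.
toy <- data.frame(
  Sex    = c("female", "female", "male", "male", "male"),
  Pclass = c(1, 3, 1, 3, 3),
  Fare   = c(80, 10, 60, 8, NA)
)

fare_summary <- fare_by_group(toy)
print(fare_summary)
```

Setting `.groups = "drop"` returns an ungrouped result, which also avoids the "grouped output" message that `summarise()` prints by default.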

Survival rates also showed clear patterns. Women survived at a higher rate than men in every class: 96.8%, 92.1%, and 50.0% for women in first, second, and third class, compared with 36.9%, 15.7%, and 13.5% for men. Class mattered as well: within each sex, first-class passengers had the highest survival rate and third-class passengers the lowest, although the gap between first and second class was small for women. This suggests that social and economic status may have played a critical role in survival, possibly through better cabin locations, earlier access to lifeboats, or preferential treatment during evacuation. These findings highlight the inequalities present on board and show how wealth and gender were associated with survival outcomes.
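
The patterns above are read from the combined Sex-by-Pclass table; summarising each variable on its own is a complementary check. The sketch below shows one way to do that. `survival_by` is a hypothetical helper, and the `toy` rows are made-up placeholders rather than Titanic records. On the real data the calls would be `survival_by(d_csv, Sex)` and `survival_by(d_csv, Pclass)`.

```r
library(dplyr)

# Hypothetical helper: group size and survival rate for one grouping column.
survival_by <- function(df, group_col) {
  df %>%
    group_by({{ group_col }}) %>%
    summarise(
      n = n(),
      survival_rate = mean(Survived, na.rm = TRUE),
      .groups = "drop"
    ) %>%
    arrange(desc(survival_rate))
}

# Made-up placeholder rows (NOT real Titanic data), same column names as d_csv.
toy <- data.frame(
  Sex      = c("female", "female", "male", "male", "male", "male"),
  Pclass   = c(1, 3, 1, 2, 3, 3),
  Survived = c(1, 1, 1, 0, 0, 0)
)

by_sex   <- survival_by(toy, Sex)     # one row per sex
by_class <- survival_by(toy, Pclass)  # one row per passenger class
print(by_sex)
print(by_class)
```

Reporting `n` next to each rate makes it easier to judge how much weight a group's survival rate can bear.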

Donna Parker

2025-02-25