Welcome to the year 2912, where your data science skills are needed to solve a cosmic mystery. We’ve received a transmission from four lightyears away and things aren’t looking good.
The Spaceship Titanic was an interstellar passenger liner launched a month ago. With almost 13,000 passengers on board, the vessel set out on its maiden voyage transporting emigrants from our solar system to three newly habitable exoplanets orbiting nearby stars.
While rounding Alpha Centauri en route to its first destination—the torrid 55 Cancri E—the unwary Spaceship Titanic collided with a spacetime anomaly hidden within a dust cloud. Sadly, it met a similar fate as its namesake from 1000 years before. Though the ship stayed intact, almost half of the passengers were transported to an alternate dimension!
library(readr)
test <- read_csv("test.csv")
train <- read_csv("train.csv")
library(DataExplorer)
create_report(train)
##
|
| | 0%
|
|. | 2%
|
|.. | 5% [global_options]
|
|... | 7%
|
|.... | 10% [introduce]
|
|.... | 12%
|
|..... | 14% [plot_intro]
|
|...... | 17%
|
|....... | 19% [data_structure]
|
|........ | 21%
|
|......... | 24% [missing_profile]
|
|.......... | 26%
|
|........... | 29% [univariate_distribution_header]
|
|........... | 31%
|
|............ | 33% [plot_histogram]
|
|............. | 36%
|
|.............. | 38% [plot_density]
|
|............... | 40%
|
|................ | 43% [plot_frequency_bar]
|
|................. | 45%
|
|.................. | 48% [plot_response_bar]
|
|.................. | 50%
|
|................... | 52% [plot_with_bar]
|
|.................... | 55%
|
|..................... | 57% [plot_normal_qq]
|
|...................... | 60%
|
|....................... | 62% [plot_response_qq]
|
|........................ | 64%
|
|......................... | 67% [plot_by_qq]
|
|.......................... | 69%
|
|.......................... | 71% [correlation_analysis]
|
|........................... | 74%
|
|............................ | 76% [principal_component_analysis]
|
|............................. | 79%
|
|.............................. | 81% [bivariate_distribution_header]
|
|............................... | 83%
|
|................................ | 86% [plot_response_boxplot]
|
|................................. | 88%
|
|................................. | 90% [plot_by_boxplot]
|
|.................................. | 93%
|
|................................... | 95% [plot_response_scatterplot]
|
|.................................... | 98%
|
|.....................................| 100% [plot_by_scatterplot]
## "C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/pandoc" +RTS -K512m -RTS "C:/Users/hutku/Desktop/FinalPorjesi/report.knit.md" --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output pandoc513069a82ffd.html --lua-filter "C:\Users\hutku\AppData\Local\R\win-library\4.3\rmarkdown\rmarkdown\lua\pagebreak.lua" --lua-filter "C:\Users\hutku\AppData\Local\R\win-library\4.3\rmarkdown\rmarkdown\lua\latex-div.lua" --embed-resources --standalone --variable bs3=TRUE --section-divs --table-of-contents --toc-depth 6 --template "C:\Users\hutku\AppData\Local\R\win-library\4.3\rmarkdown\rmd\h\default.html" --no-highlight --variable highlightjs=1 --variable theme=yeti --mathjax --variable "mathjax-url=https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" --include-in-header "C:\Users\hutku\AppData\Local\Temp\Rtmp27i2e6\rmarkdown-str51304df24e7a.html"
library(tidyverse)
library(explore)
train %>% describe_all()
## # A tibble: 14 × 8
## variable type na na_pct unique min mean max
## <chr> <chr> <int> <dbl> <int> <dbl> <dbl> <dbl>
## 1 PassengerId chr 0 0 8693 NA NA NA
## 2 HomePlanet chr 201 2.3 4 NA NA NA
## 3 CryoSleep lgl 217 2.5 3 0 0.36 1
## 4 Cabin chr 199 2.3 6561 NA NA NA
## 5 Destination chr 182 2.1 4 NA NA NA
## 6 Age dbl 179 2.1 81 0 28.8 79
## 7 VIP lgl 203 2.3 3 0 0.02 1
## 8 RoomService dbl 181 2.1 1274 0 225. 14327
## 9 FoodCourt dbl 183 2.1 1508 0 458. 29813
## 10 ShoppingMall dbl 208 2.4 1116 0 174. 23492
## 11 Spa dbl 183 2.1 1328 0 311. 22408
## 12 VRDeck dbl 188 2.2 1307 0 305. 24133
## 13 Name chr 200 2.3 8474 NA NA NA
## 14 Transported lgl 0 0 2 0 0.5 1