1 Section 1

This is my code.

2+2
## [1] 4

1.1 Import Data

remove(list = ls())

setwd("/Users/arvindsharma/Dropbox/WCAS/Data Analysis/Data Analysis - Spring II 2024/Data Analysis - Spring II 2024 (shared files)/W1/Week_1-2/titanic")

?read.csv
test <- read.csv(file = "test.csv") 

# explicitly specifying the default value - gives the same output

test2 <- read.csv(file = "test.csv",
                  header = TRUE)

# changing the default value - the number of rows is different now

test3 <- read.csv(file = "test.csv",
                  header = FALSE)
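
The effect of header can be reproduced without the Titanic file. The sketch below is only an illustration: it writes a tiny, made-up CSV to a temporary file and reads it both ways. The names tmp, with_header, and no_header are hypothetical and not part of the assignment.

```r
# Write a small, made-up CSV to a temporary file (illustrative data only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("PassengerId,Age", "892,34.5", "893,47", "894,62"), tmp)

# header = TRUE (the default for read.csv): the first line becomes the column names
with_header <- read.csv(file = tmp, header = TRUE)

# header = FALSE: the first line is read as a data row, columns are named V1, V2
no_header <- read.csv(file = tmp, header = FALSE)

nrow(with_header)   # 3 data rows
nrow(no_header)     # 4 rows: the header line is counted as data
names(no_header)    # "V1" "V2"
str(no_header)      # columns are no longer numeric, because header text is mixed in
```

The extra row is the reason test3 has one more row than test.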

Q1 a. What is the type of variable (quantitative / qualitative) and the level of measurement (nominal / ordinal / interval / ratio) for each of PassengerId and Age?

nrow(test3)
## [1] 419

My data has 418 rows. nrow(test3) reports 419 because header = FALSE reads the header line as a row of data.

str(test)
## 'data.frame':    418 obs. of  11 variables:
##  $ PassengerId: int  892 893 894 895 896 897 898 899 900 901 ...
##  $ Pclass     : int  3 3 2 3 3 3 3 2 3 3 ...
##  $ Name       : chr  "Kelly, Mr. James" "Wilkes, Mrs. James (Ellen Needs)" "Myles, Mr. Thomas Francis" "Wirz, Mr. Albert" ...
##  $ Sex        : chr  "male" "female" "male" "male" ...
##  $ Age        : num  34.5 47 62 27 22 14 30 26 18 21 ...
##  $ SibSp      : int  0 1 0 0 1 0 0 1 0 2 ...
##  $ Parch      : int  0 0 0 0 1 0 0 1 0 0 ...
##  $ Ticket     : chr  "330911" "363272" "240276" "315154" ...
##  $ Fare       : num  7.83 7 9.69 8.66 12.29 ...
##  $ Cabin      : chr  "" "" "" "" ...
##  $ Embarked   : chr  "Q" "S" "Q" "S" ...
head(test)
##   PassengerId Pclass                                         Name    Sex  Age
## 1         892      3                             Kelly, Mr. James   male 34.5
## 2         893      3             Wilkes, Mrs. James (Ellen Needs) female 47.0
## 3         894      2                    Myles, Mr. Thomas Francis   male 62.0
## 4         895      3                             Wirz, Mr. Albert   male 27.0
## 5         896      3 Hirvonen, Mrs. Alexander (Helga E Lindqvist) female 22.0
## 6         897      3                   Svensson, Mr. Johan Cervin   male 14.0
##   SibSp Parch  Ticket    Fare Cabin Embarked
## 1     0     0  330911  7.8292              Q
## 2     1     0  363272  7.0000              S
## 3     0     0  240276  9.6875              Q
## 4     0     0  315154  8.6625              S
## 5     1     1 3101298 12.2875              S
## 6     0     0    7538  9.2250              S
names(test)
##  [1] "PassengerId" "Pclass"      "Name"        "Sex"         "Age"        
##  [6] "SibSp"       "Parch"       "Ticket"      "Fare"        "Cabin"      
## [11] "Embarked"

Some interactive commands, such as View(), can clash with knitting an .Rmd file. Comment out such commands or remove them.
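
As an alternative to commenting the call out, one possible approach is to guard it with interactive(). This is a minimal sketch, assuming the document is knitted in a separate, non-interactive R session (as is typical with the Knit button); df is a hypothetical stand-in for test.

```r
# Small illustrative data frame (hypothetical; stands in for `test`)
df <- data.frame(PassengerId = c(892, 893), Age = c(34.5, 47))

# interactive() is typically FALSE when the document is knitted in a
# separate R session, so View() is skipped there and head() is printed instead
if (interactive()) {
  View(df)
} else {
  print(head(df))
}
```

If the document is rendered from the console with rmarkdown::render(), interactive() may be TRUE, so commenting the command out remains the safer option.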

# install.packages("swirl")
# library("swirl")
# swirl()