Finding and Counting NA Elements

Only the top 10 entries in each column are shown for demonstration.

rm(list=ls()) #Clear the environment

train <- read.csv("C:/Users/aarav/Downloads/train.csv") #Import and save dataset "train"

NAsPerVar <- data.frame( #Create dataframe to count how many NAs each variable has
  VARs = c(colnames(train)), #Every variable name in the dataframe
  NAs = rep(0, ncol(train)) #Counter for NAs per variable
)

for(y in seq_len(nrow(train))) #Loop over every row (y index)
{
  for(x in seq_len(ncol(train))) #Loop over every column (x index)
  {
    if(is.na(train[y,x])) #Check whether the entry is NA
    {
      NAsPerVar[x,2] <- NAsPerVar[x,2] + 1 #Increment the NA count for this variable
    }
  }
}
print(NAsPerVar) #Print "NAsPerVar" variable
##           VARs NAs
## 1  PassengerId   0
## 2     Survived   0
## 3       Pclass   0
## 4         Name   0
## 5          Sex   0
## 6          Age 177
## 7        SibSp   0
## 8        Parch   0
## 9       Ticket   0
## 10        Fare   0
## 11       Cabin   0
## 12    Embarked   0
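
The nested loop above can also be written without explicit indexing. The following is a minimal sketch, not the method used in this report: it runs on a small made-up data frame (`toy`, with hypothetical values) so that it works without the CSV file. It also counts empty strings separately, because `read.csv` by default leaves blank fields in text columns as `""` rather than `NA`, which is one possible reason a column such as "Cabin" can report 0 NAs.

```r
# Hypothetical stand-in for "train" so the sketch runs without the CSV file
toy <- data.frame(
  Age   = c(22, NA, 26, NA),
  Fare  = c(7.25, 71.28, NA, 8.05),
  Cabin = c("", "C85", "", "C123"),
  stringsAsFactors = FALSE
)

# is.na() on a data frame returns a logical matrix; colSums() counts the TRUEs per column
na_counts <- colSums(is.na(toy))

# Empty strings are not NA, so count them separately (NA entries are skipped via na.rm)
blank_counts <- sapply(toy, function(col) sum(col == "", na.rm = TRUE))

NAsPerVarToy <- data.frame(
  VARs      = names(na_counts),
  NAs       = as.integer(na_counts),
  Blanks    = as.integer(blank_counts),
  row.names = NULL
)
print(NAsPerVarToy)
```

If blank fields should be treated as missing at import time, `read.csv` accepts an `na.strings` argument, for example `na.strings = c("NA", "")`.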

Level of Measurement for Every Variable in “train”

“PassengerId”: Quantitative, Nominal

“Survived”: Quantitative, Symmetric Binary

“Pclass”: Quantitative, Ordinal

“Name”: Qualitative, Nominal

“Sex”: Qualitative, Symmetric Binary

“Age”: Quantitative, Ratio

“SibSp”: Quantitative, Ratio

“Parch”: Quantitative, Ratio

“Ticket”: Qualitative, Nominal

“Fare”: Quantitative, Ratio

“Cabin”: Qualitative, Nominal

“Embarked”: Qualitative, Nominal

“NumOfNAs”: Quantitative, Ratio
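
The levels of measurement listed above can be reflected in how the columns are stored in R. The sketch below is only an illustration on a made-up data frame (`toy`, hypothetical values); the factor labels and the ordering of the passenger classes (3rd < 2nd < 1st) are assumptions, not something this report specifies.

```r
# Hypothetical rows mimicking three of the columns in "train"
toy <- data.frame(
  Pclass   = c(3, 1, 2, 3),
  Sex      = c("male", "female", "female", "male"),
  Survived = c(0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Ordinal: an ordered factor keeps the ranking of the classes (assumed 3rd < 2nd < 1st)
toy$Pclass <- factor(toy$Pclass, levels = c(3, 2, 1), ordered = TRUE)

# Nominal / binary: plain factors carry labels but no ordering
toy$Sex      <- factor(toy$Sex)
toy$Survived <- factor(toy$Survived, levels = c(0, 1), labels = c("No", "Yes"))

str(toy) # Show the resulting column types
```

Storing ordinal and nominal variables as factors keeps functions such as `mean()` from silently treating their numeric codes as measured quantities.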

Visual Data

library(visdat) #Initialize "visdat" package
vis_dat(train) #Visualization of dataframe, "train"

Data Table

library(stargazer) #Initialize Stargazer package
## 
## Please cite as:
##  Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
##  R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
stargazer(train, type = "text") #Create data table
## 
## ==============================================
## Statistic    N   Mean   St. Dev.  Min    Max  
## ----------------------------------------------
## PassengerId 891 446.000 257.354    1     891  
## Survived    891  0.384   0.487     0      1   
## Pclass      891  2.309   0.836     1      3   
## Age         714 29.699   14.526  0.420 80.000 
## SibSp       891  0.523   1.103     0      8   
## Parch       891  0.382   0.806     0      6   
## Fare        891 32.204   49.693  0.000 512.329
## ----------------------------------------------
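
In the table above, "Age" has N = 714 rather than 891, which matches the 177 NAs counted earlier (891 - 177 = 714): the statistics are computed on the non-missing values only. The same kind of summary can be sketched in base R as shown below. This is an illustration on a made-up data frame (`toy`, hypothetical values), and the helper `summarise_col` is a hypothetical name, not part of stargazer.

```r
# Hypothetical numeric columns with one missing Age value
toy <- data.frame(
  Age  = c(22, NA, 26, 35),
  Fare = c(7.25, 71.28, 8.05, 53.10)
)

# Summary statistics for one column, ignoring NAs (hypothetical helper)
summarise_col <- function(x) {
  c(
    N    = sum(!is.na(x)),        # Number of non-missing values
    Mean = mean(x, na.rm = TRUE),
    SD   = sd(x, na.rm = TRUE),
    Min  = min(x, na.rm = TRUE),
    Max  = max(x, na.rm = TRUE)
  )
}

# One row per variable, one column per statistic
stats <- t(sapply(toy, summarise_col))
print(round(stats, 3))
```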