Introduction

Welcome to this R demo session! Here, I will demonstrate how to use R to deal with missing data.

Missing data can be a nontrivial problem when analysing a dataset, and accounting for it is usually not straightforward either.

If the amount of missing data is very small relative to the size of the dataset, then leaving out the few samples with missing features may be the best strategy in order not to bias the analysis. However, leaving out available datapoints deprives the analysis of some information, and depending on the situation you face, you may want to look for other fixes before wiping out potentially useful datapoints from your dataset.

Some quick fixes such as mean substitution may be fine in some cases, but such simple approaches usually introduce bias into the data. For instance, mean substitution leaves the mean unchanged (which is desirable) but decreases the variance, which may be undesirable.
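
The effect of mean substitution on the variance is easy to check. The snippet below is a small illustration on the Ozone column of the built-in airquality dataset; the variable names ozone and ozone_filled are only illustrative.

```r
# Mean substitution on the Ozone column of airquality (a built-in dataset)
ozone <- airquality$Ozone

# Replace every NA with the mean of the observed values
ozone_filled <- ozone
ozone_filled[is.na(ozone_filled)] <- mean(ozone, na.rm = TRUE)

# The mean is unchanged ...
mean(ozone, na.rm = TRUE)
mean(ozone_filled)

# ... but the variance shrinks, because the filled-in values add
# observations without adding any spread around the mean
var(ozone, na.rm = TRUE)
var(ozone_filled)
```

The sum of squared deviations stays the same while the number of observations grows, which is why the second variance is smaller.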

The mice package in R helps you impute missing values with plausible data values. These plausible values are drawn from a distribution specifically designed for each missing datapoint.

In this R Markdown document we are going to impute missing values using the airquality dataset (available in R). For the purpose of the demonstration, I am going to remove some additional datapoints from the dataset.

data <- airquality

# Examine the data
head(data)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
# Remove some datapoints
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA

As far as categorical variables are concerned, imputing them is usually not advisable. A common practice is to replace missing categorical values with the mode of the observed ones; however, it is questionable whether this is a good choice. Here the categorical variables are Month and Day (columns 5 and 6). Even though no datapoints are missing from them, we remove them from our dataset (we can add them back later if needed) and take a look at the data using summary().

data <- data[-c(5,6)]
summary(data)
##      Ozone           Solar.R           Wind             Temp      
##  Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :57.00  
##  1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:73.00  
##  Median : 31.50   Median :205.0   Median : 9.700   Median :79.00  
##  Mean   : 42.13   Mean   :185.9   Mean   : 9.806   Mean   :78.28  
##  3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00  
##  Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00  
##  NA's   :37       NA's   :7       NA's   :7        NA's   :5

Ozone is the variable with the most missing datapoints (37). Below we are going to dig deeper into the missing data patterns.

Quick classification of missing data

As we learned in class lectures, there are three types of missing data:

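  • MCAR (missing completely at random): The probability that a value is missing is unrelated to any data, observed or unobserved. This is the desirable scenario, although often not the case for our data.
  • MAR (missing at random): The probability that a value is missing depends only on the observed data, not on the missing values themselves.
  • MNAR (missing not at random): The probability that a value is missing depends on the unobserved values themselves. This is a more serious issue, and in this case it might be wise to check the data gathering process further and try to understand why the information is missing. For instance, if most of the people in a survey did not answer a certain question, why did they do that? Was the question unclear?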
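
To make the three mechanisms more concrete, here is a small simulation sketch in base R. Everything in it (the variables x and y, the 20% rate, the logistic coefficients) is made up for illustration and has nothing to do with the airquality data.

```r
set.seed(1)                   # arbitrary seed, only for reproducibility
n <- 1000
x <- rnorm(n)                 # always observed
y <- x + rnorm(n)             # the variable that will receive missing values

# MCAR: every y has the same 20% chance of being missing
miss_mcar <- runif(n) < 0.2
# MAR: the chance of missingness depends only on the observed x
miss_mar  <- runif(n) < plogis(-1.5 + 1.5 * x)
# MNAR: the chance of missingness depends on the unobserved y itself
miss_mnar <- runif(n) < plogis(-1.5 + 1.5 * y)

# Mean of the observed y under each mechanism, next to the full-data mean
c(full = mean(y),
  MCAR = mean(y[!miss_mcar]),
  MAR  = mean(y[!miss_mar]),
  MNAR = mean(y[!miss_mnar]))
```

Under MCAR the observed mean should stay close to the full-data mean, while under MAR and MNAR it tends to drift. The difference between the last two is that under MAR the drift can in principle be corrected using x, whereas under MNAR the observed data alone are not enough.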

Even assuming the data is MCAR, too much missing data can be a problem. A commonly used rule of thumb for large datasets is a maximum of 5% missing values: if more than 5% of the data is missing for a certain feature or sample, you should probably leave that feature or sample out. We therefore check for features (columns) and samples (rows) where more than 5% of the data is missing, using a simple function.

pMiss <- function(x){sum(is.na(x))/length(x)*100}
apply(data,2,pMiss)
##     Ozone   Solar.R      Wind      Temp 
## 24.183007  4.575163  4.575163  3.267974
apply(data,1,pMiss)
##   [1]  25  25  25  50 100  50  25  25  25  50  25   0   0   0   0   0   0   0
##  [19]   0   0   0   0   0   0  25  25  50   0   0   0   0  25  25  25  25  25
##  [37]  25   0  25   0   0  25  25   0  25  25   0   0   0   0   0  25  25  25
##  [55]  25  25  25  25  25  25  25   0   0   0  25   0   0   0   0   0   0  25
##  [73]   0   0  25   0   0   0   0   0   0   0  25  25   0   0   0   0   0   0
##  [91]   0   0   0   0   0  25  25  25   0   0   0  25  25   0   0   0  25   0
## [109]   0   0   0   0   0   0  25   0   0   0  25   0   0   0   0   0   0   0
## [127]   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## [145]   0   0   0   0   0  25   0   0   0

We see that Ozone is missing almost 25% of the datapoints. The other variables are below the 5% threshold.
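
As an aside, the same percentages can be obtained without apply(). The sketch below rebuilds the demo data so that it runs on its own, then uses colMeans() and rowMeans() on the logical matrix returned by is.na(); the names col_miss and row_miss are only illustrative.

```r
# Recreate the demo data (same steps as earlier in this document)
data <- airquality
data[4:10, 3] <- NA
data[1:5, 4] <- NA
data <- data[-c(5, 6)]

# Percentage of missing values per column and per row
col_miss <- colMeans(is.na(data)) * 100
row_miss <- rowMeans(is.na(data)) * 100

# Columns above the 5% rule of thumb
names(col_miss)[col_miss > 5]

# Number of rows with at least one missing value
sum(row_miss > 0)
```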

Using mice to look at the missing data pattern

The mice package provides a nice function, md.pattern(), to get a better understanding of the pattern of missing data:

library(mice)
## 
## Attaching package: 'mice'
## The following object is masked from 'package:stats':
## 
##     filter
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
md.pattern(data)

##     Temp Solar.R Wind Ozone   
## 104    1       1    1     1  0
## 34     1       1    1     0  1
## 3      1       1    0     1  1
## 1      1       1    0     0  2
## 4      1       0    1     1  1
## 1      1       0    1     0  2
## 1      1       0    0     1  2
## 3      0       1    1     1  1
## 1      0       1    0     1  2
## 1      0       0    0     0  4
##        5       7    7    37 56

The output tells us that 104 samples are complete, 34 samples are missing only the Ozone measurement, 4 samples are missing only the Solar.R value, and so on.

The accompanying plot also contains some useful information:

  • Each column represents a variable in your dataset (Temp, Solar.R, Wind, Ozone).
  • Each row represents a unique pattern of missingness, where the blue squares indicate observed (non-missing) values and the red squares indicate missing values.
  • The numbers on the left side show how many cases (rows in your dataset) have the corresponding pattern of missingness.
  • The numbers at the bottom show how many missing values there are for each variable.
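
If you want to see where these numbers come from, the pattern counts can be reproduced in base R. This is only a sketch of the idea, not how md.pattern() is implemented; the names observed, patterns and pattern_counts are illustrative.

```r
# Recreate the demo data (same steps as earlier in this document)
data <- airquality
data[4:10, 3] <- NA
data[1:5, 4] <- NA
data <- data[-c(5, 6)]

# 1 = observed, 0 = missing, matching the md.pattern() convention
observed <- ifelse(is.na(data), 0L, 1L)

# Collapse each row into a pattern string such as "0 1 1 1"
# (column order: Ozone, Solar.R, Wind, Temp) and count the patterns
patterns <- apply(observed, 1, paste, collapse = " ")
pattern_counts <- sort(table(patterns), decreasing = TRUE)
pattern_counts

# Missing values per variable (the bottom row of the md.pattern() table)
colSums(is.na(data))
```

Note that the columns here keep their original order, while md.pattern() sorts them by the number of missing values.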

A perhaps more helpful visual representation can be obtained using the VIM package as follows:

library(VIM)
## Loading required package: colorspace
## Loading required package: grid
## VIM is ready to use.
## Suggestions and bug-reports can be submitted at: https://github.com/statistikat/VIM/issues
## 
## Attaching package: 'VIM'
## The following object is masked from 'package:datasets':
## 
##     sleep
aggr_plot <- aggr(data, col=c('navyblue','red'), numbers=TRUE, sortVars=TRUE, labels=names(data), cex.axis=.7, gap=3, ylab=c("Histogram of missing data","Pattern"))

## 
##  Variables sorted by number of missings: 
##  Variable      Count
##     Ozone 0.24183007
##   Solar.R 0.04575163
##      Wind 0.04575163
##      Temp 0.03267974

The plot helps us understand that almost 70% of the samples are not missing any information, about 22% are missing only the Ozone value, and the remaining ones show other missing patterns. In my opinion, the situation looks a bit clearer through this approach.

Another (hopefully) helpful visual approach is a special box plot.

marginplot(data[c(1,2)])

Here we are constrained to plotting only two variables at a time, but we can nevertheless gather some interesting insights.

The red box plot on the left shows the distribution of Solar.R for the samples where Ozone is missing, while the blue box plot shows the distribution of the remaining datapoints. Likewise for the Ozone box plots at the bottom of the graph.

If our assumption of MCAR data is correct, then we expect the red and blue box plots to be very similar.
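
The visual comparison can be backed up with a rough numerical check. The sketch below splits Solar.R by whether Ozone is missing and runs a two-sample t-test; the names ozone_missing and res are illustrative, and a non-significant result is only compatible with MCAR, it does not prove it.

```r
# Recreate the demo data (same steps as earlier in this document)
data <- airquality
data[4:10, 3] <- NA
data[1:5, 4] <- NA
data <- data[-c(5, 6)]

# Split Solar.R by whether Ozone is missing, mirroring the left-hand box plots
ozone_missing <- is.na(data$Ozone)
tapply(data$Solar.R, ozone_missing, summary)

# Two-sample t-test of Solar.R between the two groups
# (rows where Solar.R itself is missing are dropped automatically)
res <- t.test(data$Solar.R ~ ozone_missing)
res
```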

Imputing the missing data

The mice() function takes care of the imputing process:

tempData <- mice(data,m=5,maxit=50,meth='pmm',seed=500)
## 
##  iter imp variable
##   1   1  Ozone  Solar.R  Wind  Temp
##   1   2  Ozone  Solar.R  Wind  Temp
##   1   3  Ozone  Solar.R  Wind  Temp
##   1   4  Ozone  Solar.R  Wind  Temp
##   1   5  Ozone  Solar.R  Wind  Temp
##  [... output for iterations 2 through 49 omitted ...]
##   50   1  Ozone  Solar.R  Wind  Temp
##   50   2  Ozone  Solar.R  Wind  Temp
##   50   3  Ozone  Solar.R  Wind  Temp
##   50   4  Ozone  Solar.R  Wind  Temp
##   50   5  Ozone  Solar.R  Wind  Temp
summary(tempData)
## Class: mids
## Number of multiple imputations:  5 
## Imputation methods:
##   Ozone Solar.R    Wind    Temp 
##   "pmm"   "pmm"   "pmm"   "pmm" 
## PredictorMatrix:
##         Ozone Solar.R Wind Temp
## Ozone       0       1    1    1
## Solar.R     1       0    1    1
## Wind        1       1    0    1
## Temp        1       1    1    0

A few notes on the parameters:

  • m=5 refers to the number of imputed datasets. Five is the default value.
  • maxit=50 sets the number of iterations for each imputation, which is why the log above runs up to iteration 50. The default is 5.
  • meth='pmm' refers to the imputation method (R partially matches meth to the full argument name, method). In this case we are using predictive mean matching. Other imputation methods can be used; type methods(mice) for a list of the available imputation methods.
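
To give an intuition for predictive mean matching, here is a deliberately simplified base R sketch for a single variable. It is not the mice implementation (among other things, mice also draws the regression coefficients rather than using a single fit), and the function name pmm_sketch and its donors argument are hypothetical names introduced for this example.

```r
# Simplified predictive mean matching for one variable (illustration only)
pmm_sketch <- function(y, x, donors = 5) {
  y <- as.numeric(y)
  miss <- is.na(y)
  y_obs <- y[!miss]

  # Regression fitted on the cases where y is observed
  fit <- lm(y ~ x, data = data.frame(y = y_obs, x = x[!miss]))
  pred_obs  <- predict(fit)
  pred_miss <- predict(fit, newdata = data.frame(x = x[miss]))

  # For each missing case, find the observed cases with the closest
  # predicted value and copy the observed y of one of them at random
  y[miss] <- vapply(pred_miss, function(p) {
    nearest <- order(abs(pred_obs - p))[seq_len(donors)]
    y_obs[nearest[sample.int(donors, 1)]]
  }, numeric(1))
  y
}

set.seed(500)
# Impute Ozone from Wind using the original airquality data (Wind is complete)
ozone_imputed <- pmm_sketch(airquality$Ozone, airquality$Wind)
summary(ozone_imputed)
```

Because every imputed value is copied from an observed case, the imputations always stay within the range of the observed data, which is one reason pmm tends to produce plausible values.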

If you would like to check the imputed data, for instance for the variable Ozone, you need to enter the following line of code:

tempData$imp$Ozone
##      1   2   3   4   5
## 5   13  19  12 115  63
## 10  30  12  13  21   7
## 25   8  28   6  18  28
## 26   9  32   4  18  37
## 27  37  21   4  32  32
## 32  40  39  35  32  47
## 33  44  28  36  52  20
## 34  20  23  37  37  19
## 35  32  28  16  32  35
## 36  89  80  48  49 115
## 37  18   7  16  30  22
## 39  96  77 135  76  85
## 42  50 168  64  50  41
## 43  96  78  96  96  78
## 45  63  20  18  24  31
## 46  71  37  20  20  28
## 52  20  35  37  63  63
## 53  16  78  73  48 115
## 54  59  35  46  44  23
## 55  16  39  28  40  49
## 56  24  36  52  21  44
## 57  36  20  20  18  23
## 58  11  11  24   7  23
## 59  44  13  23  23  27
## 60  23   4  19   4  32
## 61  44  16  46  37  35
## 65  30  23  65  30  30
## 72  45  37  63  63  44
## 75  39  46  32  39  28
## 83  37  40  59  32  35
## 84  40  59  28  28  35
## 102 61  85  96  79  78
## 103 31  59  20  31  36
## 107 32  24  11  21  21
## 115 52  16  11  14  13
## 119 78  96 168  76  50
## 150 14  12  13  23  11

The output shows the imputed data for each observation (first column left) within each imputed dataset (first row at the top).

If you need to check the imputation method used for each variable, mice makes it very easy to do.

tempData$meth
##   Ozone Solar.R    Wind    Temp 
##   "pmm"   "pmm"   "pmm"   "pmm"

Now we can get back the completed dataset using the complete() function. It is almost plain English:

completedData <- complete(tempData,1)

The missing values have been replaced with the imputed values in the first of the five datasets. If you wish to use another one, just change the second parameter in the complete() function.

Inspecting the distribution of original and imputed data

Let’s compare the distributions of original and imputed data using some useful plots.

First of all, we can use a scatterplot and plot Ozone against all the other variables:

xyplot(tempData,Ozone ~ Wind+Temp+Solar.R,pch=18,cex=1)

What we would like to see is that the shape of the magenta points (imputed) matches the shape of the blue ones (observed). The matching shape tells us that the imputed values are indeed “plausible values”.

Another helpful plot is the density plot:

densityplot(tempData)

The density of the imputed data for each imputed dataset is shown in magenta, while the density of the observed data is shown in blue. Again, under our previous assumptions we expect the distributions to be similar.

Another useful visual take on the distributions can be obtained using the stripplot() function that shows the distributions of the variables as individual points.

stripplot(tempData, pch = 20, cex = 1.2)

Suppose that the next step in our analysis is to fit a linear model to the data. You may ask which imputed dataset to choose. The mice package again makes it very easy to fit a model to each of the imputed datasets and then pool the results together.

modelFit1 <- with(tempData,lm(Temp~ Ozone+Solar.R+Wind))
pool(modelFit1)
## Class: mipo    m = 5 
##          term m    estimate         ubar            b            t dfcom
## 1 (Intercept) 5 72.70719792 7.093339e+00 4.431440e-01 7.625111e+00   149
## 2       Ozone 5  0.15924872 5.268255e-04 1.206098e-04 6.715573e-04   149
## 3     Solar.R 5  0.01252384 4.396201e-05 2.612657e-05 7.531389e-05   149
## 4        Wind 5 -0.34547006 4.067195e-02 2.113942e-03 4.320868e-02   149
##          df        riv     lambda        fmi
## 1 117.27936 0.07496792 0.06973968 0.08520801
## 2  49.30693 0.27472436 0.21551668 0.24551207
## 3  18.19046 0.71315866 0.41628290 0.47137535
## 4 123.65905 0.06237052 0.05870882 0.07357220
# Get the summary
summary(pool(modelFit1))
##          term    estimate   std.error statistic        df      p.value
## 1 (Intercept) 72.70719792 2.761360433 26.330209 117.27936 4.655973e-51
## 2       Ozone  0.15924872 0.025914423  6.145177  49.30693 1.367056e-07
## 3     Solar.R  0.01252384 0.008678358  1.443112  18.19046 1.659893e-01
## 4        Wind -0.34547006 0.207866970 -1.661977 123.65905 9.905045e-02

The variable modelFit1 contains the results of the fit performed on each of the imputed datasets, while the pool() function pools them all together. Among the predictors, only Ozone is statistically significant at the 5% level; Solar.R and Wind are not.

Note that the pool() output contains other columns aside from those typical of an lm() summary: fmi contains the fraction of missing information, while lambda is the proportion of the total variance that is attributable to the missing data. For more information, I suggest checking out the paper by Stef van Buuren.
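
To see how these columns relate to each other, the sketch below recomputes a few of them for the Ozone coefficient using Rubin's rules, with ubar and b copied by hand from the pool() output above. The variable names t_total and se are mine; the formulas are the standard ones for combining within-imputation and between-imputation variance.

```r
# Values for the Ozone coefficient, copied from the pool() output above
m    <- 5
ubar <- 5.268255e-04   # average within-imputation variance
b    <- 1.206098e-04   # between-imputation variance

# Rubin's rules: total variance and derived quantities
t_total <- ubar + (1 + 1 / m) * b
riv     <- (1 + 1 / m) * b / ubar      # relative increase in variance
lambda  <- (1 + 1 / m) * b / t_total   # share of variance due to missing data
se      <- sqrt(t_total)               # pooled standard error

c(t = t_total, riv = riv, lambda = lambda, std.error = se)
```

Up to rounding of the inputs, these values agree with the t, riv and lambda columns of pool() and with the std.error reported by summary(pool(modelFit1)).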

Remember that we initialized the mice function with a specific seed (seed = 500), so the results depend somewhat on that initial choice. To reduce this effect, we can impute a higher number of datasets by changing the default m=5 parameter in the mice() function as follows.

tempData2 <- mice(data,m=50,seed=245435)
## 
##  iter imp variable
##   1   1  Ozone  Solar.R  Wind  Temp
##   1   2  Ozone  Solar.R  Wind  Temp
##   1   3  Ozone  Solar.R  Wind  Temp
##   1   4  Ozone  Solar.R  Wind  Temp
##   1   5  Ozone  Solar.R  Wind  Temp
##  [... remaining output up to iteration 5, imputation 45 omitted ...]
##   5   46  Ozone  Solar.R  Wind  Temp
##   5   47  Ozone  Solar.R  Wind  Temp
##   5   48  Ozone  Solar.R  Wind  Temp
##   5   49  Ozone  Solar.R  Wind  Temp
##   5   50  Ozone  Solar.R  Wind  Temp
modelFit2 <- with(tempData2, lm(Temp ~ Ozone + Solar.R + Wind))
summary(pool(modelFit2))
##          term    estimate   std.error statistic        df      p.value
## 1 (Intercept) 72.60178955 2.915916315 24.898448 105.59368 5.380239e-46
## 2       Ozone  0.16345639 0.026054628  6.273603  99.86352 9.122568e-09
## 3     Solar.R  0.01193645 0.007134344  1.673097 120.41496 9.690433e-02
## 4        Wind -0.33592048 0.222350762 -1.510768 107.41496 1.337838e-01

After accounting for the random seed initialization, we obtain (in this case) more or less the same results as before, with Ozone the only predictor showing statistical significance.
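To make the significance reading concrete, the pooled table above can be checked against a threshold. The sketch below retypes the printed estimates and p-values into a data frame and lists the terms whose p-value falls below a cutoff. The name `pooled` and the 0.05 cutoff are assumptions made for illustration; they are not part of the original analysis.

```r
# Pooled results copied by hand from the summary(pool(modelFit2)) output above.
pooled <- data.frame(
  term     = c("(Intercept)", "Ozone", "Solar.R", "Wind"),
  estimate = c(72.60178955, 0.16345639, 0.01193645, -0.33592048),
  p.value  = c(5.380239e-46, 9.122568e-09, 9.690433e-02, 1.337838e-01),
  stringsAsFactors = FALSE
)

# Assumed significance level; 0.05 is a convention, not something the source fixes.
alpha <- 0.05

# Terms whose pooled p-value is below the threshold.
significant <- pooled$term[pooled$p.value < alpha]
print(significant)
```

Only the intercept and Ozone fall below the cutoff, so Ozone is the only predictor that passes. Since the pooled summary printed above is itself a table with `term` and `p.value` columns, the same filter should carry over to it directly.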

Comparing the results

Let’s compare the results.

Let’s use a different dataset. We’ll use the training portion of the Titanic dataset and try to impute missing values for the Age column:

library(titanic)

titanic_train$Age
##   [1] 22.00 38.00 26.00 35.00 35.00    NA 54.00  2.00 27.00 14.00  4.00 58.00
##  [13] 20.00 39.00 14.00 55.00  2.00    NA 31.00    NA 35.00 34.00 15.00 28.00
##  [25]  8.00 38.00    NA 19.00    NA    NA 40.00    NA    NA 66.00 28.00 42.00
##  [37]    NA 21.00 18.00 14.00 40.00 27.00    NA  3.00 19.00    NA    NA    NA
##  [49]    NA 18.00  7.00 21.00 49.00 29.00 65.00    NA 21.00 28.50  5.00 11.00
##  [61] 22.00 38.00 45.00  4.00    NA    NA 29.00 19.00 17.00 26.00 32.00 16.00
##  [73] 21.00 26.00 32.00 25.00    NA    NA  0.83 30.00 22.00 29.00    NA 28.00
##  [85] 17.00 33.00 16.00    NA 23.00 24.00 29.00 20.00 46.00 26.00 59.00    NA
##  [97] 71.00 23.00 34.00 34.00 28.00    NA 21.00 33.00 37.00 28.00 21.00    NA
## [109] 38.00    NA 47.00 14.50 22.00 20.00 17.00 21.00 70.50 29.00 24.00  2.00
## [121] 21.00    NA 32.50 32.50 54.00 12.00    NA 24.00    NA 45.00 33.00 20.00
## [133] 47.00 29.00 25.00 23.00 19.00 37.00 16.00 24.00    NA 22.00 24.00 19.00
## [145] 18.00 19.00 27.00  9.00 36.50 42.00 51.00 22.00 55.50 40.50    NA 51.00
## [157] 16.00 30.00    NA    NA 44.00 40.00 26.00 17.00  1.00  9.00    NA 45.00
## [169]    NA 28.00 61.00  4.00  1.00 21.00 56.00 18.00    NA 50.00 30.00 36.00
## [181]    NA    NA  9.00  1.00  4.00    NA    NA 45.00 40.00 36.00 32.00 19.00
## [193] 19.00  3.00 44.00 58.00    NA 42.00    NA 24.00 28.00    NA 34.00 45.50
## [205] 18.00  2.00 32.00 26.00 16.00 40.00 24.00 35.00 22.00 30.00    NA 31.00
## [217] 27.00 42.00 32.00 30.00 16.00 27.00 51.00    NA 38.00 22.00 19.00 20.50
## [229] 18.00    NA 35.00 29.00 59.00  5.00 24.00    NA 44.00  8.00 19.00 33.00
## [241]    NA    NA 29.00 22.00 30.00 44.00 25.00 24.00 37.00 54.00    NA 29.00
## [253] 62.00 30.00 41.00 29.00    NA 30.00 35.00 50.00    NA  3.00 52.00 40.00
## [265]    NA 36.00 16.00 25.00 58.00 35.00    NA 25.00 41.00 37.00    NA 63.00
## [277] 45.00    NA  7.00 35.00 65.00 28.00 16.00 19.00    NA 33.00 30.00 22.00
## [289] 42.00 22.00 26.00 19.00 36.00 24.00 24.00    NA 23.50  2.00    NA 50.00
## [301]    NA    NA 19.00    NA    NA  0.92    NA 17.00 30.00 30.00 24.00 18.00
## [313] 26.00 28.00 43.00 26.00 24.00 54.00 31.00 40.00 22.00 27.00 30.00 22.00
## [325]    NA 36.00 61.00 36.00 31.00 16.00    NA 45.50 38.00 16.00    NA    NA
## [337] 29.00 41.00 45.00 45.00  2.00 24.00 28.00 25.00 36.00 24.00 40.00    NA
## [349]  3.00 42.00 23.00    NA 15.00 25.00    NA 28.00 22.00 38.00    NA    NA
## [361] 40.00 29.00 45.00 35.00    NA 30.00 60.00    NA    NA 24.00 25.00 18.00
## [373] 19.00 22.00  3.00    NA 22.00 27.00 20.00 19.00 42.00  1.00 32.00 35.00
## [385]    NA 18.00  1.00 36.00    NA 17.00 36.00 21.00 28.00 23.00 24.00 22.00
## [397] 31.00 46.00 23.00 28.00 39.00 26.00 21.00 28.00 20.00 34.00 51.00  3.00
## [409] 21.00    NA    NA    NA 33.00    NA 44.00    NA 34.00 18.00 30.00 10.00
## [421]    NA 21.00 29.00 28.00 18.00    NA 28.00 19.00    NA 32.00 28.00    NA
## [433] 42.00 17.00 50.00 14.00 21.00 24.00 64.00 31.00 45.00 20.00 25.00 28.00
## [445]    NA  4.00 13.00 34.00  5.00 52.00 36.00    NA 30.00 49.00    NA 29.00
## [457] 65.00    NA 50.00    NA 48.00 34.00 47.00 48.00    NA 38.00    NA 56.00
## [469]    NA  0.75    NA 38.00 33.00 23.00 22.00    NA 34.00 29.00 22.00  2.00
## [481]  9.00    NA 50.00 63.00 25.00    NA 35.00 58.00 30.00  9.00    NA 21.00
## [493] 55.00 71.00 21.00    NA 54.00    NA 25.00 24.00 17.00 21.00    NA 37.00
## [505] 16.00 18.00 33.00    NA 28.00 26.00 29.00    NA 36.00 54.00 24.00 47.00
## [517] 34.00    NA 36.00 32.00 30.00 22.00    NA 44.00    NA 40.50 50.00    NA
## [529] 39.00 23.00  2.00    NA 17.00    NA 30.00  7.00 45.00 30.00    NA 22.00
## [541] 36.00  9.00 11.00 32.00 50.00 64.00 19.00    NA 33.00  8.00 17.00 27.00
## [553]    NA 22.00 22.00 62.00 48.00    NA 39.00 36.00    NA 40.00 28.00    NA
## [565]    NA 24.00 19.00 29.00    NA 32.00 62.00 53.00 36.00    NA 16.00 19.00
## [577] 34.00 39.00    NA 32.00 25.00 39.00 54.00 36.00    NA 18.00 47.00 60.00
## [589] 22.00    NA 35.00 52.00 47.00    NA 37.00 36.00    NA 49.00    NA 49.00
## [601] 24.00    NA    NA 44.00 35.00 36.00 30.00 27.00 22.00 40.00 39.00    NA
## [613]    NA    NA 35.00 24.00 34.00 26.00  4.00 26.00 27.00 42.00 20.00 21.00
## [625] 21.00 61.00 57.00 21.00 26.00    NA 80.00 51.00 32.00    NA  9.00 28.00
## [637] 32.00 31.00 41.00    NA 20.00 24.00  2.00    NA  0.75 48.00 19.00 56.00
## [649]    NA 23.00    NA 18.00 21.00    NA 18.00 24.00    NA 32.00 23.00 58.00
## [661] 50.00 40.00 47.00 36.00 20.00 32.00 25.00    NA 43.00    NA 40.00 31.00
## [673] 70.00 31.00    NA 18.00 24.50 18.00 43.00 36.00    NA 27.00 20.00 14.00
## [685] 60.00 25.00 14.00 19.00 18.00 15.00 31.00  4.00    NA 25.00 60.00 52.00
## [697] 44.00    NA 49.00 42.00 18.00 35.00 18.00 25.00 26.00 39.00 45.00 42.00
## [709] 22.00    NA 24.00    NA 48.00 29.00 52.00 19.00 38.00 27.00    NA 33.00
## [721]  6.00 17.00 34.00 50.00 27.00 20.00 30.00    NA 25.00 25.00 29.00 11.00
## [733]    NA 23.00 23.00 28.50 48.00 35.00    NA    NA    NA 36.00 21.00 24.00
## [745] 31.00 70.00 16.00 30.00 19.00 31.00  4.00  6.00 33.00 23.00 48.00  0.67
## [757] 28.00 18.00 34.00 33.00    NA 41.00 20.00 36.00 16.00 51.00    NA 30.50
## [769]    NA 32.00 24.00 48.00 57.00    NA 54.00 18.00    NA  5.00    NA 43.00
## [781] 13.00 17.00 29.00    NA 25.00 25.00 18.00  8.00  1.00 46.00    NA 16.00
## [793]    NA    NA 25.00 39.00 49.00 31.00 30.00 30.00 34.00 31.00 11.00  0.42
## [805] 27.00 31.00 39.00 18.00 39.00 33.00 26.00 39.00 35.00  6.00 30.50    NA
## [817] 23.00 31.00 43.00 10.00 52.00 27.00 38.00 27.00  2.00    NA    NA  1.00
## [829]    NA 62.00 15.00  0.83    NA 23.00 18.00 39.00 21.00    NA 32.00    NA
## [841] 20.00 16.00 30.00 34.50 17.00 42.00    NA 35.00 28.00    NA  4.00 74.00
## [853]  9.00 16.00 44.00 18.00 45.00 51.00 24.00    NA 41.00 21.00 48.00    NA
## [865] 24.00 42.00 27.00 31.00    NA  4.00 26.00 47.00 33.00 47.00 28.00 15.00
## [877] 20.00 19.00    NA 56.00 25.00 33.00 22.00 28.00 25.00 39.00 27.00 19.00
## [889]    NA 26.00 32.00

There’s a fair number of NA values, and it’s our job to impute them. They’re most likely missing because the creator of the dataset had no information on the person’s age. If you were to build, say, a machine learning model on this dataset, the best way to evaluate the imputation technique would be to measure classification metrics (accuracy, precision, recall, F1) after training the model (we will talk more about this when we discuss logistic regression).
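Before imputing anything, it helps to quantify how much is missing. The sketch below does this on a small vector, `age_sample`, which holds the first twelve ages from the printout above typed in by hand. The variable names are hypothetical and chosen only for this illustration.

```r
# First twelve ages from the printout above, typed in by hand for illustration.
age_sample <- c(22, 38, 26, 35, 35, NA, 54, 2, 27, 14, 4, 58)

# Number of missing entries.
n_missing <- sum(is.na(age_sample))

# Share of missing entries, as a proportion between 0 and 1.
prop_missing <- mean(is.na(age_sample))

print(n_missing)
print(prop_missing)
```

With the titanic package loaded, the same two calls applied to `titanic_train$Age` give the count and share of missing ages for the full column.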

Simple Value Imputation in R with Built-in Functions

The value_imputed variable will store a data.frame of the imputed ages. The imputation itself boils down to replacing every NA entry in the column with a value of our choice. The result is a dataset with four columns of ages: the original column plus one for each of the three strategies below.

  • Zero: constant imputation; feel free to change the value.
  • Mean (average): average age once all NAs are removed.
  • Median: median age once all NAs are removed.
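
A caveat noted in the introduction applies to the mean option: mean substitution leaves the mean unchanged but shrinks the variance. The sketch below checks this on a toy vector `x`, whose values are hypothetical and not taken from the Titanic data.

```r
# Toy vector with two missing entries (hypothetical values, not from the dataset).
x <- c(2, 4, NA, 6, 8, NA)

# Replace NAs with the mean of the observed values.
x_mean <- replace(x, is.na(x), mean(x, na.rm = TRUE))

# The mean is unchanged by mean imputation...
mean_before <- mean(x, na.rm = TRUE)
mean_after  <- mean(x_mean)

# ...but the variance shrinks, because the filled-in values sit exactly at the centre.
var_before <- var(x, na.rm = TRUE)
var_after  <- var(x_mean)

print(c(mean_before = mean_before, mean_after = mean_after))
print(c(var_before = var_before, var_after = var_after))
```

The same `replace()` pattern is what builds each imputed column for the Titanic ages.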
value_imputed <- data.frame(original = titanic_train$Age,
                            imputed_zero = replace(titanic_train$Age, is.na(titanic_train$Age), 0),
                            imputed_mean = replace(titanic_train$Age,
                                                   is.na(titanic_train$Age),
                                                   mean(titanic_train$Age, na.rm = TRUE)),
                            imputed_median = replace(titanic_train$Age,
                                                     is.na(titanic_train$Age),
                                                     median(titanic_train$Age, na.rm = TRUE)))
value_imputed
##     original imputed_zero imputed_mean imputed_median
## 1      22.00        22.00     22.00000          22.00
## 2      38.00        38.00     38.00000          38.00
## 3      26.00        26.00     26.00000          26.00
## 4      35.00        35.00     35.00000          35.00
## 5      35.00        35.00     35.00000          35.00
## 6         NA         0.00     29.69912          28.00
## 7      54.00        54.00     54.00000          54.00
## 8       2.00         2.00      2.00000           2.00
## 9      27.00        27.00     27.00000          27.00
## 10     14.00        14.00     14.00000          14.00
## 11      4.00         4.00      4.00000           4.00
## 12     58.00        58.00     58.00000          58.00
## 13     20.00        20.00     20.00000          20.00
## 14     39.00        39.00     39.00000          39.00
## 15     14.00        14.00     14.00000          14.00
## 16     55.00        55.00     55.00000          55.00
## 17      2.00         2.00      2.00000           2.00
## 18        NA         0.00     29.69912          28.00
## 19     31.00        31.00     31.00000          31.00
## 20        NA         0.00     29.69912          28.00
## 21     35.00        35.00     35.00000          35.00
## 22     34.00        34.00     34.00000          34.00
## 23     15.00        15.00     15.00000          15.00
## 24     28.00        28.00     28.00000          28.00
## 25      8.00         8.00      8.00000           8.00
## 26     38.00        38.00     38.00000          38.00
## 27        NA         0.00     29.69912          28.00
## 28     19.00        19.00     19.00000          19.00
## 29        NA         0.00     29.69912          28.00
## 30        NA         0.00     29.69912          28.00
## 31     40.00        40.00     40.00000          40.00
## 32        NA         0.00     29.69912          28.00
## 33        NA         0.00     29.69912          28.00
## 34     66.00        66.00     66.00000          66.00
## 35     28.00        28.00     28.00000          28.00
## 36     42.00        42.00     42.00000          42.00
## 37        NA         0.00     29.69912          28.00
## 38     21.00        21.00     21.00000          21.00
## 39     18.00        18.00     18.00000          18.00
## 40     14.00        14.00     14.00000          14.00
## 41     40.00        40.00     40.00000          40.00
## 42     27.00        27.00     27.00000          27.00
## 43        NA         0.00     29.69912          28.00
## 44      3.00         3.00      3.00000           3.00
## 45     19.00        19.00     19.00000          19.00
## 46        NA         0.00     29.69912          28.00
## 47        NA         0.00     29.69912          28.00
## 48        NA         0.00     29.69912          28.00
## 49        NA         0.00     29.69912          28.00
## 50     18.00        18.00     18.00000          18.00
## 51      7.00         7.00      7.00000           7.00
## 52     21.00        21.00     21.00000          21.00
## 53     49.00        49.00     49.00000          49.00
## 54     29.00        29.00     29.00000          29.00
## 55     65.00        65.00     65.00000          65.00
## 56        NA         0.00     29.69912          28.00
## 57     21.00        21.00     21.00000          21.00
## 58     28.50        28.50     28.50000          28.50
## 59      5.00         5.00      5.00000           5.00
## 60     11.00        11.00     11.00000          11.00
## 61     22.00        22.00     22.00000          22.00
## 62     38.00        38.00     38.00000          38.00
## 63     45.00        45.00     45.00000          45.00
## 64      4.00         4.00      4.00000           4.00
## 65        NA         0.00     29.69912          28.00
## 66        NA         0.00     29.69912          28.00
## 67     29.00        29.00     29.00000          29.00
## 68     19.00        19.00     19.00000          19.00
## 69     17.00        17.00     17.00000          17.00
## 70     26.00        26.00     26.00000          26.00
## 71     32.00        32.00     32.00000          32.00
## 72     16.00        16.00     16.00000          16.00
## 73     21.00        21.00     21.00000          21.00
## 74     26.00        26.00     26.00000          26.00
## 75     32.00        32.00     32.00000          32.00
## 76     25.00        25.00     25.00000          25.00
## 77        NA         0.00     29.69912          28.00
## 78        NA         0.00     29.69912          28.00
## 79      0.83         0.83      0.83000           0.83
## 80     30.00        30.00     30.00000          30.00
## 81     22.00        22.00     22.00000          22.00
## 82     29.00        29.00     29.00000          29.00
## 83        NA         0.00     29.69912          28.00
## 84     28.00        28.00     28.00000          28.00
## 85     17.00        17.00     17.00000          17.00
## 86     33.00        33.00     33.00000          33.00
## 87     16.00        16.00     16.00000          16.00
## 88        NA         0.00     29.69912          28.00
## 89     23.00        23.00     23.00000          23.00
## 90     24.00        24.00     24.00000          24.00
## 91     29.00        29.00     29.00000          29.00
## 92     20.00        20.00     20.00000          20.00
## 93     46.00        46.00     46.00000          46.00
## 94     26.00        26.00     26.00000          26.00
## 95     59.00        59.00     59.00000          59.00
## 96        NA         0.00     29.69912          28.00
## 97     71.00        71.00     71.00000          71.00
## 98     23.00        23.00     23.00000          23.00
## 99     34.00        34.00     34.00000          34.00
## 100    34.00        34.00     34.00000          34.00
## 101    28.00        28.00     28.00000          28.00
## 102       NA         0.00     29.69912          28.00
## 103    21.00        21.00     21.00000          21.00
## 104    33.00        33.00     33.00000          33.00
## 105    37.00        37.00     37.00000          37.00
## 106    28.00        28.00     28.00000          28.00
## 107    21.00        21.00     21.00000          21.00
## 108       NA         0.00     29.69912          28.00
## 109    38.00        38.00     38.00000          38.00
## 110       NA         0.00     29.69912          28.00
## 111    47.00        47.00     47.00000          47.00
## 112    14.50        14.50     14.50000          14.50
## 113    22.00        22.00     22.00000          22.00
## 114    20.00        20.00     20.00000          20.00
## 115    17.00        17.00     17.00000          17.00
## 116    21.00        21.00     21.00000          21.00
## 117    70.50        70.50     70.50000          70.50
## 118    29.00        29.00     29.00000          29.00
## 119    24.00        24.00     24.00000          24.00
## 120     2.00         2.00      2.00000           2.00
## 121    21.00        21.00     21.00000          21.00
## 122       NA         0.00     29.69912          28.00
## 123    32.50        32.50     32.50000          32.50
## 124    32.50        32.50     32.50000          32.50
## 125    54.00        54.00     54.00000          54.00
## 126    12.00        12.00     12.00000          12.00
## 127       NA         0.00     29.69912          28.00
## 128    24.00        24.00     24.00000          24.00
## 129       NA         0.00     29.69912          28.00
## 130    45.00        45.00     45.00000          45.00
## 131    33.00        33.00     33.00000          33.00
## 132    20.00        20.00     20.00000          20.00
## 133    47.00        47.00     47.00000          47.00
## 134    29.00        29.00     29.00000          29.00
## 135    25.00        25.00     25.00000          25.00
## 136    23.00        23.00     23.00000          23.00
## 137    19.00        19.00     19.00000          19.00
## 138    37.00        37.00     37.00000          37.00
## 139    16.00        16.00     16.00000          16.00
## 140    24.00        24.00     24.00000          24.00
## 141       NA         0.00     29.69912          28.00
## 142    22.00        22.00     22.00000          22.00
## 143    24.00        24.00     24.00000          24.00
## 144    19.00        19.00     19.00000          19.00
## 145    18.00        18.00     18.00000          18.00
## 146    19.00        19.00     19.00000          19.00
## 147    27.00        27.00     27.00000          27.00
## 148     9.00         9.00      9.00000           9.00
## 149    36.50        36.50     36.50000          36.50
## 150    42.00        42.00     42.00000          42.00
## 151    51.00        51.00     51.00000          51.00
## 152    22.00        22.00     22.00000          22.00
## 153    55.50        55.50     55.50000          55.50
## 154    40.50        40.50     40.50000          40.50
## 155       NA         0.00     29.69912          28.00
## 156    51.00        51.00     51.00000          51.00
## 157    16.00        16.00     16.00000          16.00
## 158    30.00        30.00     30.00000          30.00
## 159       NA         0.00     29.69912          28.00
## 160       NA         0.00     29.69912          28.00
## 161    44.00        44.00     44.00000          44.00
## 162    40.00        40.00     40.00000          40.00
## 163    26.00        26.00     26.00000          26.00
## 164    17.00        17.00     17.00000          17.00
## 165     1.00         1.00      1.00000           1.00
## 166     9.00         9.00      9.00000           9.00
## 167       NA         0.00     29.69912          28.00
## 168    45.00        45.00     45.00000          45.00
## 169       NA         0.00     29.69912          28.00
## 170    28.00        28.00     28.00000          28.00
## 171    61.00        61.00     61.00000          61.00
## 172     4.00         4.00      4.00000           4.00
## 173     1.00         1.00      1.00000           1.00
## 174    21.00        21.00     21.00000          21.00
## 175    56.00        56.00     56.00000          56.00
## 176    18.00        18.00     18.00000          18.00
## 177       NA         0.00     29.69912          28.00
## 178    50.00        50.00     50.00000          50.00
## 179    30.00        30.00     30.00000          30.00
## 180    36.00        36.00     36.00000          36.00
## 181       NA         0.00     29.69912          28.00
## 182       NA         0.00     29.69912          28.00
## 183     9.00         9.00      9.00000           9.00
## 184     1.00         1.00      1.00000           1.00
## 185     4.00         4.00      4.00000           4.00
## 186       NA         0.00     29.69912          28.00
## 187       NA         0.00     29.69912          28.00
## 188    45.00        45.00     45.00000          45.00
## 189    40.00        40.00     40.00000          40.00
## 190    36.00        36.00     36.00000          36.00
## 191    32.00        32.00     32.00000          32.00
## 192    19.00        19.00     19.00000          19.00
## 193    19.00        19.00     19.00000          19.00
## 194     3.00         3.00      3.00000           3.00
## 195    44.00        44.00     44.00000          44.00
## 196    58.00        58.00     58.00000          58.00
## 197       NA         0.00     29.69912          28.00
## 198    42.00        42.00     42.00000          42.00
## 199       NA         0.00     29.69912          28.00
## 200    24.00        24.00     24.00000          24.00
## 201    28.00        28.00     28.00000          28.00
## 202       NA         0.00     29.69912          28.00
## 203    34.00        34.00     34.00000          34.00
## 204    45.50        45.50     45.50000          45.50
## 205    18.00        18.00     18.00000          18.00
## 206     2.00         2.00      2.00000           2.00
## 207    32.00        32.00     32.00000          32.00
## 208    26.00        26.00     26.00000          26.00
## 209    16.00        16.00     16.00000          16.00
## 210    40.00        40.00     40.00000          40.00
## 211    24.00        24.00     24.00000          24.00
## 212    35.00        35.00     35.00000          35.00
## 213    22.00        22.00     22.00000          22.00
## 214    30.00        30.00     30.00000          30.00
## 215       NA         0.00     29.69912          28.00
## 216    31.00        31.00     31.00000          31.00
## 217    27.00        27.00     27.00000          27.00
## 218    42.00        42.00     42.00000          42.00
## 219    32.00        32.00     32.00000          32.00
## 220    30.00        30.00     30.00000          30.00
## 221    16.00        16.00     16.00000          16.00
## 222    27.00        27.00     27.00000          27.00
## 223    51.00        51.00     51.00000          51.00
## 224       NA         0.00     29.69912          28.00
## 225    38.00        38.00     38.00000          38.00
## 226    22.00        22.00     22.00000          22.00
## 227    19.00        19.00     19.00000          19.00
## 228    20.50        20.50     20.50000          20.50
## 229    18.00        18.00     18.00000          18.00
## 230       NA         0.00     29.69912          28.00
## 231    35.00        35.00     35.00000          35.00
## 232    29.00        29.00     29.00000          29.00
## 233    59.00        59.00     59.00000          59.00
## 234     5.00         5.00      5.00000           5.00
## 235    24.00        24.00     24.00000          24.00
## 236       NA         0.00     29.69912          28.00
## 237    44.00        44.00     44.00000          44.00
## 238     8.00         8.00      8.00000           8.00
## 239    19.00        19.00     19.00000          19.00
## 240    33.00        33.00     33.00000          33.00
## 241       NA         0.00     29.69912          28.00
## 242       NA         0.00     29.69912          28.00
## 243    29.00        29.00     29.00000          29.00
## 244    22.00        22.00     22.00000          22.00
## 245    30.00        30.00     30.00000          30.00
## 246    44.00        44.00     44.00000          44.00
## 247    25.00        25.00     25.00000          25.00
## 248    24.00        24.00     24.00000          24.00
## 249    37.00        37.00     37.00000          37.00
## 250    54.00        54.00     54.00000          54.00
## 251       NA         0.00     29.69912          28.00
## 252    29.00        29.00     29.00000          29.00
## 253    62.00        62.00     62.00000          62.00
## 254    30.00        30.00     30.00000          30.00
## 255    41.00        41.00     41.00000          41.00
## 256    29.00        29.00     29.00000          29.00
## 257       NA         0.00     29.69912          28.00
## 258    30.00        30.00     30.00000          30.00
## 259    35.00        35.00     35.00000          35.00
## 260    50.00        50.00     50.00000          50.00
## 261       NA         0.00     29.69912          28.00
## 262     3.00         3.00      3.00000           3.00
## 263    52.00        52.00     52.00000          52.00
## 264    40.00        40.00     40.00000          40.00
## 265       NA         0.00     29.69912          28.00
## 266    36.00        36.00     36.00000          36.00
## 267    16.00        16.00     16.00000          16.00
## 268    25.00        25.00     25.00000          25.00
## 269    58.00        58.00     58.00000          58.00
## 270    35.00        35.00     35.00000          35.00
## 271       NA         0.00     29.69912          28.00
## 272    25.00        25.00     25.00000          25.00
## 273    41.00        41.00     41.00000          41.00
## 274    37.00        37.00     37.00000          37.00
## 275       NA         0.00     29.69912          28.00
## 276    63.00        63.00     63.00000          63.00
## 277    45.00        45.00     45.00000          45.00
## 278       NA         0.00     29.69912          28.00
## 279     7.00         7.00      7.00000           7.00
## 280    35.00        35.00     35.00000          35.00
## 281    65.00        65.00     65.00000          65.00
## 282    28.00        28.00     28.00000          28.00
## 283    16.00        16.00     16.00000          16.00
## 284    19.00        19.00     19.00000          19.00
## 285       NA         0.00     29.69912          28.00
## 286    33.00        33.00     33.00000          33.00
## 287    30.00        30.00     30.00000          30.00
## 288    22.00        22.00     22.00000          22.00
## 289    42.00        42.00     42.00000          42.00
## 290    22.00        22.00     22.00000          22.00
## 291    26.00        26.00     26.00000          26.00
## 292    19.00        19.00     19.00000          19.00
## 293    36.00        36.00     36.00000          36.00
## 294    24.00        24.00     24.00000          24.00
## 295    24.00        24.00     24.00000          24.00
## 296       NA         0.00     29.69912          28.00
## 297    23.50        23.50     23.50000          23.50
## 298     2.00         2.00      2.00000           2.00
## 299       NA         0.00     29.69912          28.00
## 300    50.00        50.00     50.00000          50.00
## 301       NA         0.00     29.69912          28.00
## 302       NA         0.00     29.69912          28.00
## 303    19.00        19.00     19.00000          19.00
## 304       NA         0.00     29.69912          28.00
## 305       NA         0.00     29.69912          28.00
## 306     0.92         0.92      0.92000           0.92
## 307       NA         0.00     29.69912          28.00
## 308    17.00        17.00     17.00000          17.00
## 309    30.00        30.00     30.00000          30.00
## 310    30.00        30.00     30.00000          30.00
## 311    24.00        24.00     24.00000          24.00
## 312    18.00        18.00     18.00000          18.00
## 313    26.00        26.00     26.00000          26.00
## 314    28.00        28.00     28.00000          28.00
## 315    43.00        43.00     43.00000          43.00
## 316    26.00        26.00     26.00000          26.00
## 317    24.00        24.00     24.00000          24.00
## 318    54.00        54.00     54.00000          54.00
## 319    31.00        31.00     31.00000          31.00
## 320    40.00        40.00     40.00000          40.00
## 321    22.00        22.00     22.00000          22.00
## 322    27.00        27.00     27.00000          27.00
## 323    30.00        30.00     30.00000          30.00
## 324    22.00        22.00     22.00000          22.00
## 325       NA         0.00     29.69912          28.00
## 326    36.00        36.00     36.00000          36.00
## 327    61.00        61.00     61.00000          61.00
## 328    36.00        36.00     36.00000          36.00
## 329    31.00        31.00     31.00000          31.00
## 330    16.00        16.00     16.00000          16.00
## 331       NA         0.00     29.69912          28.00
## 332    45.50        45.50     45.50000          45.50
## 333    38.00        38.00     38.00000          38.00
## 334    16.00        16.00     16.00000          16.00
## 335       NA         0.00     29.69912          28.00
## 336       NA         0.00     29.69912          28.00
## 337    29.00        29.00     29.00000          29.00
## 338    41.00        41.00     41.00000          41.00
## 339    45.00        45.00     45.00000          45.00
## 340    45.00        45.00     45.00000          45.00
## 341     2.00         2.00      2.00000           2.00
## 342    24.00        24.00     24.00000          24.00
## 343    28.00        28.00     28.00000          28.00
## 344    25.00        25.00     25.00000          25.00
## 345    36.00        36.00     36.00000          36.00
## 346    24.00        24.00     24.00000          24.00
## 347    40.00        40.00     40.00000          40.00
## 348       NA         0.00     29.69912          28.00
## 349     3.00         3.00      3.00000           3.00
## 350    42.00        42.00     42.00000          42.00
## 351    23.00        23.00     23.00000          23.00
## 352       NA         0.00     29.69912          28.00
## 353    15.00        15.00     15.00000          15.00
## 354    25.00        25.00     25.00000          25.00
## 355       NA         0.00     29.69912          28.00
## 356    28.00        28.00     28.00000          28.00
## 357    22.00        22.00     22.00000          22.00
## 358    38.00        38.00     38.00000          38.00
## 359       NA         0.00     29.69912          28.00
## 360       NA         0.00     29.69912          28.00
## 361    40.00        40.00     40.00000          40.00
## 362    29.00        29.00     29.00000          29.00
## 363    45.00        45.00     45.00000          45.00
## 364    35.00        35.00     35.00000          35.00
## 365       NA         0.00     29.69912          28.00
## 366    30.00        30.00     30.00000          30.00
## 367    60.00        60.00     60.00000          60.00
## 368       NA         0.00     29.69912          28.00
## 369       NA         0.00     29.69912          28.00
## 370    24.00        24.00     24.00000          24.00
## 371    25.00        25.00     25.00000          25.00
## 372    18.00        18.00     18.00000          18.00
## 373    19.00        19.00     19.00000          19.00
## 374    22.00        22.00     22.00000          22.00
## 375     3.00         3.00      3.00000           3.00
## 376       NA         0.00     29.69912          28.00
## 377    22.00        22.00     22.00000          22.00
## 378    27.00        27.00     27.00000          27.00
## 379    20.00        20.00     20.00000          20.00
## 380    19.00        19.00     19.00000          19.00
## 381    42.00        42.00     42.00000          42.00
## 382     1.00         1.00      1.00000           1.00
## 383    32.00        32.00     32.00000          32.00
## 384    35.00        35.00     35.00000          35.00
## 385       NA         0.00     29.69912          28.00
## 386    18.00        18.00     18.00000          18.00
## 387     1.00         1.00      1.00000           1.00
## 388    36.00        36.00     36.00000          36.00
## 389       NA         0.00     29.69912          28.00
## 390    17.00        17.00     17.00000          17.00
## 391    36.00        36.00     36.00000          36.00
## 392    21.00        21.00     21.00000          21.00
## 393    28.00        28.00     28.00000          28.00
## 394    23.00        23.00     23.00000          23.00
## 395    24.00        24.00     24.00000          24.00
## 396    22.00        22.00     22.00000          22.00
## 397    31.00        31.00     31.00000          31.00
## 398    46.00        46.00     46.00000          46.00
## 399    23.00        23.00     23.00000          23.00
## 400    28.00        28.00     28.00000          28.00
## 401    39.00        39.00     39.00000          39.00
## 402    26.00        26.00     26.00000          26.00
## 403    21.00        21.00     21.00000          21.00
## 404    28.00        28.00     28.00000          28.00
## 405    20.00        20.00     20.00000          20.00
## 406    34.00        34.00     34.00000          34.00
## 407    51.00        51.00     51.00000          51.00
## 408     3.00         3.00      3.00000           3.00
## 409    21.00        21.00     21.00000          21.00
## 410       NA         0.00     29.69912          28.00
## 411       NA         0.00     29.69912          28.00
## 412       NA         0.00     29.69912          28.00
## 413    33.00        33.00     33.00000          33.00
## 414       NA         0.00     29.69912          28.00
## 415    44.00        44.00     44.00000          44.00
## 416       NA         0.00     29.69912          28.00
## 417    34.00        34.00     34.00000          34.00
## 418    18.00        18.00     18.00000          18.00
## 419    30.00        30.00     30.00000          30.00
## 420    10.00        10.00     10.00000          10.00
## 421       NA         0.00     29.69912          28.00
## 422    21.00        21.00     21.00000          21.00
## 423    29.00        29.00     29.00000          29.00
## 424    28.00        28.00     28.00000          28.00
## 425    18.00        18.00     18.00000          18.00
## 426       NA         0.00     29.69912          28.00
## 427    28.00        28.00     28.00000          28.00
## 428    19.00        19.00     19.00000          19.00
## 429       NA         0.00     29.69912          28.00
## 430    32.00        32.00     32.00000          32.00
## 431    28.00        28.00     28.00000          28.00
## 432       NA         0.00     29.69912          28.00
## 433    42.00        42.00     42.00000          42.00
## 434    17.00        17.00     17.00000          17.00
## 435    50.00        50.00     50.00000          50.00
## 436    14.00        14.00     14.00000          14.00
## 437    21.00        21.00     21.00000          21.00
## 438    24.00        24.00     24.00000          24.00
## 439    64.00        64.00     64.00000          64.00
## 440    31.00        31.00     31.00000          31.00
## 441    45.00        45.00     45.00000          45.00
## 442    20.00        20.00     20.00000          20.00
## 443    25.00        25.00     25.00000          25.00
## 444    28.00        28.00     28.00000          28.00
## 445       NA         0.00     29.69912          28.00
## 446     4.00         4.00      4.00000           4.00
## 447    13.00        13.00     13.00000          13.00
## 448    34.00        34.00     34.00000          34.00
## 449     5.00         5.00      5.00000           5.00
## 450    52.00        52.00     52.00000          52.00
## 451    36.00        36.00     36.00000          36.00
## 452       NA         0.00     29.69912          28.00
## 453    30.00        30.00     30.00000          30.00
## 454    49.00        49.00     49.00000          49.00
## 455       NA         0.00     29.69912          28.00
## 456    29.00        29.00     29.00000          29.00
## 457    65.00        65.00     65.00000          65.00
## 458       NA         0.00     29.69912          28.00
## 459    50.00        50.00     50.00000          50.00
## 460       NA         0.00     29.69912          28.00
## 461    48.00        48.00     48.00000          48.00
## 462    34.00        34.00     34.00000          34.00
## 463    47.00        47.00     47.00000          47.00
## 464    48.00        48.00     48.00000          48.00
## 465       NA         0.00     29.69912          28.00
## 466    38.00        38.00     38.00000          38.00
## 467       NA         0.00     29.69912          28.00
## 468    56.00        56.00     56.00000          56.00
## 469       NA         0.00     29.69912          28.00
## 470     0.75         0.75      0.75000           0.75
## 471       NA         0.00     29.69912          28.00
## 472    38.00        38.00     38.00000          38.00
## 473    33.00        33.00     33.00000          33.00
## 474    23.00        23.00     23.00000          23.00
## 475    22.00        22.00     22.00000          22.00
## 476       NA         0.00     29.69912          28.00
## 477    34.00        34.00     34.00000          34.00
## 478    29.00        29.00     29.00000          29.00
## 479    22.00        22.00     22.00000          22.00
## 480     2.00         2.00      2.00000           2.00
## 481     9.00         9.00      9.00000           9.00
## 482       NA         0.00     29.69912          28.00
## 483    50.00        50.00     50.00000          50.00
## 484    63.00        63.00     63.00000          63.00
## 485    25.00        25.00     25.00000          25.00
## 486       NA         0.00     29.69912          28.00
## 487    35.00        35.00     35.00000          35.00
## 488    58.00        58.00     58.00000          58.00
## 489    30.00        30.00     30.00000          30.00
## 490     9.00         9.00      9.00000           9.00
## 491       NA         0.00     29.69912          28.00
## 492    21.00        21.00     21.00000          21.00
## 493    55.00        55.00     55.00000          55.00
## 494    71.00        71.00     71.00000          71.00
## 495    21.00        21.00     21.00000          21.00
## 496       NA         0.00     29.69912          28.00
## 497    54.00        54.00     54.00000          54.00
## 498       NA         0.00     29.69912          28.00
## 499    25.00        25.00     25.00000          25.00
## 500    24.00        24.00     24.00000          24.00
## 501    17.00        17.00     17.00000          17.00
## 502    21.00        21.00     21.00000          21.00
## 503       NA         0.00     29.69912          28.00
## 504    37.00        37.00     37.00000          37.00
## 505    16.00        16.00     16.00000          16.00
## 506    18.00        18.00     18.00000          18.00
## 507    33.00        33.00     33.00000          33.00
## 508       NA         0.00     29.69912          28.00
## 509    28.00        28.00     28.00000          28.00
## 510    26.00        26.00     26.00000          26.00
## 511    29.00        29.00     29.00000          29.00
## 512       NA         0.00     29.69912          28.00
## 513    36.00        36.00     36.00000          36.00
## 514    54.00        54.00     54.00000          54.00
## 515    24.00        24.00     24.00000          24.00
## 516    47.00        47.00     47.00000          47.00
## 517    34.00        34.00     34.00000          34.00
## 518       NA         0.00     29.69912          28.00
## 519    36.00        36.00     36.00000          36.00
## 520    32.00        32.00     32.00000          32.00
## 521    30.00        30.00     30.00000          30.00
## 522    22.00        22.00     22.00000          22.00
## 523       NA         0.00     29.69912          28.00
## 524    44.00        44.00     44.00000          44.00
## 525       NA         0.00     29.69912          28.00
## 526    40.50        40.50     40.50000          40.50
## 527    50.00        50.00     50.00000          50.00
## 528       NA         0.00     29.69912          28.00
## 529    39.00        39.00     39.00000          39.00
## 530    23.00        23.00     23.00000          23.00
## 531     2.00         2.00      2.00000           2.00
## 532       NA         0.00     29.69912          28.00
## 533    17.00        17.00     17.00000          17.00
## 534       NA         0.00     29.69912          28.00
## 535    30.00        30.00     30.00000          30.00
## 536     7.00         7.00      7.00000           7.00
## 537    45.00        45.00     45.00000          45.00
## 538    30.00        30.00     30.00000          30.00
## 539       NA         0.00     29.69912          28.00
## 540    22.00        22.00     22.00000          22.00
## 541    36.00        36.00     36.00000          36.00
## 542     9.00         9.00      9.00000           9.00
## 543    11.00        11.00     11.00000          11.00
## 544    32.00        32.00     32.00000          32.00
## 545    50.00        50.00     50.00000          50.00
## 546    64.00        64.00     64.00000          64.00
## 547    19.00        19.00     19.00000          19.00
## 548       NA         0.00     29.69912          28.00
## 549    33.00        33.00     33.00000          33.00
## 550     8.00         8.00      8.00000           8.00
## 551    17.00        17.00     17.00000          17.00
## 552    27.00        27.00     27.00000          27.00
## 553       NA         0.00     29.69912          28.00
## 554    22.00        22.00     22.00000          22.00
## 555    22.00        22.00     22.00000          22.00
## 556    62.00        62.00     62.00000          62.00
## 557    48.00        48.00     48.00000          48.00
## 558       NA         0.00     29.69912          28.00
## 559    39.00        39.00     39.00000          39.00
## 560    36.00        36.00     36.00000          36.00
## 561       NA         0.00     29.69912          28.00
## 562    40.00        40.00     40.00000          40.00
## 563    28.00        28.00     28.00000          28.00
## 564       NA         0.00     29.69912          28.00
## 565       NA         0.00     29.69912          28.00
## 566    24.00        24.00     24.00000          24.00
## 567    19.00        19.00     19.00000          19.00
## 568    29.00        29.00     29.00000          29.00
## 569       NA         0.00     29.69912          28.00
## 570    32.00        32.00     32.00000          32.00
## 571    62.00        62.00     62.00000          62.00
## 572    53.00        53.00     53.00000          53.00
## 573    36.00        36.00     36.00000          36.00
## 574       NA         0.00     29.69912          28.00
## 575    16.00        16.00     16.00000          16.00
## 576    19.00        19.00     19.00000          19.00
## 577    34.00        34.00     34.00000          34.00
## 578    39.00        39.00     39.00000          39.00
## 579       NA         0.00     29.69912          28.00
## 580    32.00        32.00     32.00000          32.00
## 581    25.00        25.00     25.00000          25.00
## 582    39.00        39.00     39.00000          39.00
## 583    54.00        54.00     54.00000          54.00
## 584    36.00        36.00     36.00000          36.00
## 585       NA         0.00     29.69912          28.00
## 586    18.00        18.00     18.00000          18.00
## 587    47.00        47.00     47.00000          47.00
## 588    60.00        60.00     60.00000          60.00
## 589    22.00        22.00     22.00000          22.00
## 590       NA         0.00     29.69912          28.00
## 591    35.00        35.00     35.00000          35.00
## 592    52.00        52.00     52.00000          52.00
## 593    47.00        47.00     47.00000          47.00
## 594       NA         0.00     29.69912          28.00
## 595    37.00        37.00     37.00000          37.00
## 596    36.00        36.00     36.00000          36.00
## 597       NA         0.00     29.69912          28.00
## 598    49.00        49.00     49.00000          49.00
## 599       NA         0.00     29.69912          28.00
## 600    49.00        49.00     49.00000          49.00
## 601    24.00        24.00     24.00000          24.00
## 602       NA         0.00     29.69912          28.00
## 603       NA         0.00     29.69912          28.00
## 604    44.00        44.00     44.00000          44.00
## 605    35.00        35.00     35.00000          35.00
## 606    36.00        36.00     36.00000          36.00
## 607    30.00        30.00     30.00000          30.00
## 608    27.00        27.00     27.00000          27.00
## 609    22.00        22.00     22.00000          22.00
## 610    40.00        40.00     40.00000          40.00
## 611    39.00        39.00     39.00000          39.00
## 612       NA         0.00     29.69912          28.00
## 613       NA         0.00     29.69912          28.00
## 614       NA         0.00     29.69912          28.00
## 615    35.00        35.00     35.00000          35.00
## 616    24.00        24.00     24.00000          24.00
## 617    34.00        34.00     34.00000          34.00
## 618    26.00        26.00     26.00000          26.00
## 619     4.00         4.00      4.00000           4.00
## 620    26.00        26.00     26.00000          26.00
## 621    27.00        27.00     27.00000          27.00
## 622    42.00        42.00     42.00000          42.00
## 623    20.00        20.00     20.00000          20.00
## 624    21.00        21.00     21.00000          21.00
## 625    21.00        21.00     21.00000          21.00
## 626    61.00        61.00     61.00000          61.00
## 627    57.00        57.00     57.00000          57.00
## 628    21.00        21.00     21.00000          21.00
## 629    26.00        26.00     26.00000          26.00
## 630       NA         0.00     29.69912          28.00
## 631    80.00        80.00     80.00000          80.00
## 632    51.00        51.00     51.00000          51.00
## 633    32.00        32.00     32.00000          32.00
## 634       NA         0.00     29.69912          28.00
## 635     9.00         9.00      9.00000           9.00
## 636    28.00        28.00     28.00000          28.00
## 637    32.00        32.00     32.00000          32.00
## 638    31.00        31.00     31.00000          31.00
## 639    41.00        41.00     41.00000          41.00
## 640       NA         0.00     29.69912          28.00
## 641    20.00        20.00     20.00000          20.00
## 642    24.00        24.00     24.00000          24.00
## 643     2.00         2.00      2.00000           2.00
## 644       NA         0.00     29.69912          28.00
## 645     0.75         0.75      0.75000           0.75
## 646    48.00        48.00     48.00000          48.00
## 647    19.00        19.00     19.00000          19.00
## 648    56.00        56.00     56.00000          56.00
## 649       NA         0.00     29.69912          28.00
## 650    23.00        23.00     23.00000          23.00
## 651       NA         0.00     29.69912          28.00
## 652    18.00        18.00     18.00000          18.00
## 653    21.00        21.00     21.00000          21.00
## 654       NA         0.00     29.69912          28.00
## 655    18.00        18.00     18.00000          18.00
## 656    24.00        24.00     24.00000          24.00
## 657       NA         0.00     29.69912          28.00
## 658    32.00        32.00     32.00000          32.00
## 659    23.00        23.00     23.00000          23.00
## 660    58.00        58.00     58.00000          58.00
## 661    50.00        50.00     50.00000          50.00
## 662    40.00        40.00     40.00000          40.00
## 663    47.00        47.00     47.00000          47.00
## 664    36.00        36.00     36.00000          36.00
## 665    20.00        20.00     20.00000          20.00
## 666    32.00        32.00     32.00000          32.00
## 667    25.00        25.00     25.00000          25.00
## 668       NA         0.00     29.69912          28.00
## 669    43.00        43.00     43.00000          43.00
## 670       NA         0.00     29.69912          28.00
## 671    40.00        40.00     40.00000          40.00
## 672    31.00        31.00     31.00000          31.00
## 673    70.00        70.00     70.00000          70.00
## 674    31.00        31.00     31.00000          31.00
## 675       NA         0.00     29.69912          28.00
## 676    18.00        18.00     18.00000          18.00
## 677    24.50        24.50     24.50000          24.50
## 678    18.00        18.00     18.00000          18.00
## 679    43.00        43.00     43.00000          43.00
## 680    36.00        36.00     36.00000          36.00
## 681       NA         0.00     29.69912          28.00
## 682    27.00        27.00     27.00000          27.00
## 683    20.00        20.00     20.00000          20.00
## 684    14.00        14.00     14.00000          14.00
## 685    60.00        60.00     60.00000          60.00
## 686    25.00        25.00     25.00000          25.00
## 687    14.00        14.00     14.00000          14.00
## 688    19.00        19.00     19.00000          19.00
## 689    18.00        18.00     18.00000          18.00
## 690    15.00        15.00     15.00000          15.00
## 691    31.00        31.00     31.00000          31.00
## 692     4.00         4.00      4.00000           4.00
## 693       NA         0.00     29.69912          28.00
## 694    25.00        25.00     25.00000          25.00
## 695    60.00        60.00     60.00000          60.00
## 696    52.00        52.00     52.00000          52.00
## 697    44.00        44.00     44.00000          44.00
## 698       NA         0.00     29.69912          28.00
## 699    49.00        49.00     49.00000          49.00
## 700    42.00        42.00     42.00000          42.00
## 701    18.00        18.00     18.00000          18.00
## 702    35.00        35.00     35.00000          35.00
## 703    18.00        18.00     18.00000          18.00
## 704    25.00        25.00     25.00000          25.00
## 705    26.00        26.00     26.00000          26.00
## 706    39.00        39.00     39.00000          39.00
## 707    45.00        45.00     45.00000          45.00
## 708    42.00        42.00     42.00000          42.00
## 709    22.00        22.00     22.00000          22.00
## 710       NA         0.00     29.69912          28.00
## 711    24.00        24.00     24.00000          24.00
## 712       NA         0.00     29.69912          28.00
## 713    48.00        48.00     48.00000          48.00
## 714    29.00        29.00     29.00000          29.00
## 715    52.00        52.00     52.00000          52.00
## 716    19.00        19.00     19.00000          19.00
## 717    38.00        38.00     38.00000          38.00
## 718    27.00        27.00     27.00000          27.00
## 719       NA         0.00     29.69912          28.00
## 720    33.00        33.00     33.00000          33.00
## 721     6.00         6.00      6.00000           6.00
## 722    17.00        17.00     17.00000          17.00
## 723    34.00        34.00     34.00000          34.00
## 724    50.00        50.00     50.00000          50.00
## 725    27.00        27.00     27.00000          27.00
## 726    20.00        20.00     20.00000          20.00
## 727    30.00        30.00     30.00000          30.00
## 728       NA         0.00     29.69912          28.00
## 729    25.00        25.00     25.00000          25.00
## 730    25.00        25.00     25.00000          25.00
## 731    29.00        29.00     29.00000          29.00
## 732    11.00        11.00     11.00000          11.00
## 733       NA         0.00     29.69912          28.00
## 734    23.00        23.00     23.00000          23.00
## 735    23.00        23.00     23.00000          23.00
## 736    28.50        28.50     28.50000          28.50
## 737    48.00        48.00     48.00000          48.00
## 738    35.00        35.00     35.00000          35.00
## 739       NA         0.00     29.69912          28.00
## 740       NA         0.00     29.69912          28.00
## 741       NA         0.00     29.69912          28.00
## 742    36.00        36.00     36.00000          36.00
## 743    21.00        21.00     21.00000          21.00
## 744    24.00        24.00     24.00000          24.00
## 745    31.00        31.00     31.00000          31.00
## 746    70.00        70.00     70.00000          70.00
## 747    16.00        16.00     16.00000          16.00
## 748    30.00        30.00     30.00000          30.00
## 749    19.00        19.00     19.00000          19.00
## 750    31.00        31.00     31.00000          31.00
## 751     4.00         4.00      4.00000           4.00
## 752     6.00         6.00      6.00000           6.00
## 753    33.00        33.00     33.00000          33.00
## 754    23.00        23.00     23.00000          23.00
## 755    48.00        48.00     48.00000          48.00
## 756     0.67         0.67      0.67000           0.67
## 757    28.00        28.00     28.00000          28.00
## 758    18.00        18.00     18.00000          18.00
## 759    34.00        34.00     34.00000          34.00
## 760    33.00        33.00     33.00000          33.00
## 761       NA         0.00     29.69912          28.00
## 762    41.00        41.00     41.00000          41.00
## 763    20.00        20.00     20.00000          20.00
## 764    36.00        36.00     36.00000          36.00
## 765    16.00        16.00     16.00000          16.00
## 766    51.00        51.00     51.00000          51.00
## 767       NA         0.00     29.69912          28.00
## 768    30.50        30.50     30.50000          30.50
## 769       NA         0.00     29.69912          28.00
## 770    32.00        32.00     32.00000          32.00
## 771    24.00        24.00     24.00000          24.00
## 772    48.00        48.00     48.00000          48.00
## 773    57.00        57.00     57.00000          57.00
## 774       NA         0.00     29.69912          28.00
## 775    54.00        54.00     54.00000          54.00
## 776    18.00        18.00     18.00000          18.00
## 777       NA         0.00     29.69912          28.00
## 778     5.00         5.00      5.00000           5.00
## 779       NA         0.00     29.69912          28.00
## 780    43.00        43.00     43.00000          43.00
## 781    13.00        13.00     13.00000          13.00
## 782    17.00        17.00     17.00000          17.00
## 783    29.00        29.00     29.00000          29.00
## 784       NA         0.00     29.69912          28.00
## 785    25.00        25.00     25.00000          25.00
## 786    25.00        25.00     25.00000          25.00
## 787    18.00        18.00     18.00000          18.00
## 788     8.00         8.00      8.00000           8.00
## 789     1.00         1.00      1.00000           1.00
## 790    46.00        46.00     46.00000          46.00
## 791       NA         0.00     29.69912          28.00
## 792    16.00        16.00     16.00000          16.00
## 793       NA         0.00     29.69912          28.00
## 794       NA         0.00     29.69912          28.00
## 795    25.00        25.00     25.00000          25.00
## 796    39.00        39.00     39.00000          39.00
## 797    49.00        49.00     49.00000          49.00
## 798    31.00        31.00     31.00000          31.00
## 799    30.00        30.00     30.00000          30.00
## 800    30.00        30.00     30.00000          30.00
## 801    34.00        34.00     34.00000          34.00
## 802    31.00        31.00     31.00000          31.00
## 803    11.00        11.00     11.00000          11.00
## 804     0.42         0.42      0.42000           0.42
## 805    27.00        27.00     27.00000          27.00
## 806    31.00        31.00     31.00000          31.00
## 807    39.00        39.00     39.00000          39.00
## 808    18.00        18.00     18.00000          18.00
## 809    39.00        39.00     39.00000          39.00
## 810    33.00        33.00     33.00000          33.00
## 811    26.00        26.00     26.00000          26.00
## 812    39.00        39.00     39.00000          39.00
## 813    35.00        35.00     35.00000          35.00
## 814     6.00         6.00      6.00000           6.00
## 815    30.50        30.50     30.50000          30.50
## 816       NA         0.00     29.69912          28.00
## 817    23.00        23.00     23.00000          23.00
## 818    31.00        31.00     31.00000          31.00
## 819    43.00        43.00     43.00000          43.00
## 820    10.00        10.00     10.00000          10.00
## 821    52.00        52.00     52.00000          52.00
## 822    27.00        27.00     27.00000          27.00
## 823    38.00        38.00     38.00000          38.00
## 824    27.00        27.00     27.00000          27.00
## 825     2.00         2.00      2.00000           2.00
## 826       NA         0.00     29.69912          28.00
## 827       NA         0.00     29.69912          28.00
## 828     1.00         1.00      1.00000           1.00
## 829       NA         0.00     29.69912          28.00
## 830    62.00        62.00     62.00000          62.00
## 831    15.00        15.00     15.00000          15.00
## 832     0.83         0.83      0.83000           0.83
## 833       NA         0.00     29.69912          28.00
## 834    23.00        23.00     23.00000          23.00
## 835    18.00        18.00     18.00000          18.00
## 836    39.00        39.00     39.00000          39.00
## 837    21.00        21.00     21.00000          21.00
## 838       NA         0.00     29.69912          28.00
## 839    32.00        32.00     32.00000          32.00
## 840       NA         0.00     29.69912          28.00
## 841    20.00        20.00     20.00000          20.00
## 842    16.00        16.00     16.00000          16.00
## 843    30.00        30.00     30.00000          30.00
## 844    34.50        34.50     34.50000          34.50
## 845    17.00        17.00     17.00000          17.00
## 846    42.00        42.00     42.00000          42.00
## 847       NA         0.00     29.69912          28.00
## 848    35.00        35.00     35.00000          35.00
## 849    28.00        28.00     28.00000          28.00
## 850       NA         0.00     29.69912          28.00
## 851     4.00         4.00      4.00000           4.00
## 852    74.00        74.00     74.00000          74.00
## 853     9.00         9.00      9.00000           9.00
## 854    16.00        16.00     16.00000          16.00
## 855    44.00        44.00     44.00000          44.00
## 856    18.00        18.00     18.00000          18.00
## 857    45.00        45.00     45.00000          45.00
## 858    51.00        51.00     51.00000          51.00
## 859    24.00        24.00     24.00000          24.00
## 860       NA         0.00     29.69912          28.00
## 861    41.00        41.00     41.00000          41.00
## 862    21.00        21.00     21.00000          21.00
## 863    48.00        48.00     48.00000          48.00
## 864       NA         0.00     29.69912          28.00
## 865    24.00        24.00     24.00000          24.00
## 866    42.00        42.00     42.00000          42.00
## 867    27.00        27.00     27.00000          27.00
## 868    31.00        31.00     31.00000          31.00
## 869       NA         0.00     29.69912          28.00
## 870     4.00         4.00      4.00000           4.00
## 871    26.00        26.00     26.00000          26.00
## 872    47.00        47.00     47.00000          47.00
## 873    33.00        33.00     33.00000          33.00
## 874    47.00        47.00     47.00000          47.00
## 875    28.00        28.00     28.00000          28.00
## 876    15.00        15.00     15.00000          15.00
## 877    20.00        20.00     20.00000          20.00
## 878    19.00        19.00     19.00000          19.00
## 879       NA         0.00     29.69912          28.00
## 880    56.00        56.00     56.00000          56.00
## 881    25.00        25.00     25.00000          25.00
## 882    33.00        33.00     33.00000          33.00
## 883    22.00        22.00     22.00000          22.00
## 884    28.00        28.00     28.00000          28.00
## 885    25.00        25.00     25.00000          25.00
## 886    39.00        39.00     39.00000          39.00
## 887    27.00        27.00     27.00000          27.00
## 888    19.00        19.00     19.00000          19.00
## 889       NA         0.00     29.69912          28.00
## 890    26.00        26.00     26.00000          26.00
## 891    32.00        32.00     32.00000          32.00

Let’s look at how each imputation method changes the distribution of the variable, using a 2×2 grid of histograms:

library(ggplot2)
library(cowplot) # for plot_grid()

h1 <- ggplot(value_imputed, aes(x = original)) +
  geom_histogram(fill = "#ad1538", color = "#000000", position = "identity") +
  ggtitle("Original distribution") +
  theme_classic()
h2 <- ggplot(value_imputed, aes(x = imputed_zero)) +
  geom_histogram(fill = "#15ad4f", color = "#000000", position = "identity") +
  ggtitle("Zero-imputed distribution") +
  theme_classic()
h3 <- ggplot(value_imputed, aes(x = imputed_mean)) +
  geom_histogram(fill = "#1543ad", color = "#000000", position = "identity") +
  ggtitle("Mean-imputed distribution") +
  theme_classic()
h4 <- ggplot(value_imputed, aes(x = imputed_median)) +
  geom_histogram(fill = "#ad8415", color = "#000000", position = "identity") +
  ggtitle("Median-imputed distribution") +
  theme_classic()

plot_grid(h1, h2, h3, h4, nrow = 2, ncol = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 177 rows containing non-finite outside the scale range
## (`stat_bin()`).
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

All three imputation methods severely distort the distribution. With 177 of 891 values missing, filling every gap with a single constant creates an artificial spike at that value. Zero imputation is the worst, as it’s highly unlikely that close to 200 passengers were zero years old.
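
To put a number on the distortion, one option is to compare the standard deviation before and after each constant fill. The sketch below uses a simulated toy vector rather than the Titanic data; the names `age` and `impute_const` are made up for this illustration.

```r
# Toy example: 80 simulated ages plus 20 missing values (not the Titanic data).
set.seed(42)
age <- c(rnorm(80, mean = 30, sd = 14), rep(NA, 20))

# Hypothetical helper: replace every NA with one constant value.
impute_const <- function(x, value) replace(x, is.na(x), value)

age_mean <- impute_const(age, mean(age, na.rm = TRUE))

# Standard deviation of the observed values vs. each imputed version.
spread <- c(
  original = sd(age, na.rm = TRUE),
  zero     = sd(impute_const(age, 0)),
  mean     = sd(age_mean),
  median   = sd(impute_const(age, median(age, na.rm = TRUE)))
)
round(spread, 2)
```

Mean imputation leaves the mean unchanged but always shrinks the standard deviation, because the filled-in values add no spread while the sample size grows.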

Maybe mode imputation would provide better results, but we’ll leave that up to you.
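
As a starting point, mode imputation could be sketched in base R like this. The helper `impute_mode` and the toy vector are assumptions made for this example; base R has no built-in function for the statistical mode, so the most frequent value is taken from `table()`.

```r
# Hypothetical helper: fill NAs with the most frequent observed value.
impute_mode <- function(x) {
  counts <- table(x)  # table() ignores NA by default
  # On ties, which.max() picks the first (smallest) value.
  mode_value <- as.numeric(names(counts)[which.max(counts)])
  replace(x, is.na(x), mode_value)
}

age <- c(22, 24, 24, NA, 30, 24, NA, 35)
age_mode <- impute_mode(age)
age_mode
```

Like the other constant fills, this still places every missing value at a single point, so it should not be expected to preserve the shape of the distribution.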

How do the imputation methods available in MICE compare?

These methods predict the missing values from the other variables in the dataset, so we first select the numeric columns to use.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
titanic_numeric <- titanic_train %>%
  select(Survived, Pclass, SibSp, Parch, Age)
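
Before running the imputation, it can help to check which of the selected columns actually contain missing values. A minimal sketch on a made-up data frame (`toy_df` is an assumption for this example, not the Titanic data):

```r
# Toy data frame with missing values in one column only.
toy_df <- data.frame(
  Survived = c(0, 1, 1, 0),
  Pclass   = c(3, 1, 3, 2),
  Age      = c(22, NA, 26, NA)
)

# Count the missing values per column.
na_counts <- colSums(is.na(toy_df))
na_counts
```

The same call on `titanic_numeric` would show which variables need imputing.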

Here, we will use the following MICE imputation methods:

  • pmm: Predictive mean matching.
  • cart: Classification and regression trees.
  • lasso.norm: Lasso linear regression.
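
To give some intuition for the first method, predictive mean matching can be sketched in base R as follows. This is a simplified, single-predictor illustration and not the mice implementation (which, among other things, also accounts for uncertainty in the regression coefficients). The function `pmm_sketch`, the toy data and the donor count `k = 5` are all assumptions made for this example.

```r
# Simplified predictive mean matching (PMM) for a single variable.
# Illustration only: pmm_sketch() is a hypothetical helper, not part of mice.
pmm_sketch <- function(y, x, k = 5) {
  obs <- !is.na(y)
  k <- min(k, sum(obs))
  # 1. Fit a linear model on the observed cases only.
  fit <- lm(y ~ x, data = data.frame(y = y[obs], x = x[obs]))
  # 2. Predict the target for every case, observed or not.
  pred <- predict(fit, newdata = data.frame(x = x))
  donor_values <- y[obs]
  donor_pred <- pred[obs]
  for (i in which(!obs)) {
    # 3. Find the k observed cases whose predictions are closest.
    nearest <- order(abs(donor_pred - pred[i]))[seq_len(k)]
    # 4. Borrow the observed value of one randomly chosen donor.
    y[i] <- donor_values[nearest[sample.int(length(nearest), 1)]]
  }
  y
}

# Toy data: y depends linearly on x, then 10 values are removed.
set.seed(123)
x <- rnorm(50)
y <- 2 * x + rnorm(50)
miss <- sample(50, 10)
y_na <- replace(y, miss, NA)

y_imputed <- pmm_sketch(y_na, x)
```

Because every imputed value is copied from an observed case, PMM never produces values outside the observed range, which is one reason it tends to preserve the shape of the distribution better than a constant fill.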

Once again, the results will be stored in a data.frame:

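# Each mice() call creates m = 5 imputed datasets (the default);
# complete() returns the first of them unless told otherwise.
# No seed is set, so the imputed values will differ between runs.
mice_imputed <- data.frame(original = titanic_train$Age,
                           imputed_pmm = complete(mice(titanic_numeric, method = "pmm"))$Age,
                           imputed_cart = complete(mice(titanic_numeric, method = "cart"))$Age,
                           imputed_lasso = complete(mice(titanic_numeric, method = "lasso.norm"))$Age)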
## 
##  iter imp variable
##   1   1  Age
##   1   2  Age
##   1   3  Age
##   1   4  Age
##   1   5  Age
##   2   1  Age
##   2   2  Age
##   2   3  Age
##   2   4  Age
##   2   5  Age
##   3   1  Age
##   3   2  Age
##   3   3  Age
##   3   4  Age
##   3   5  Age
##   4   1  Age
##   4   2  Age
##   4   3  Age
##   4   4  Age
##   4   5  Age
##   5   1  Age
##   5   2  Age
##   5   3  Age
##   5   4  Age
##   5   5  Age
## 
##  iter imp variable
##   1   1  Age
##   1   2  Age
##   1   3  Age
##   1   4  Age
##   1   5  Age
##   2   1  Age
##   2   2  Age
##   2   3  Age
##   2   4  Age
##   2   5  Age
##   3   1  Age
##   3   2  Age
##   3   3  Age
##   3   4  Age
##   3   5  Age
##   4   1  Age
##   4   2  Age
##   4   3  Age
##   4   4  Age
##   4   5  Age
##   5   1  Age
##   5   2  Age
##   5   3  Age
##   5   4  Age
##   5   5  Age
## 
##  iter imp variable
##   1   1  Age
##   1   2  Age
##   1   3  Age
##   1   4  Age
##   1   5  Age
##   2   1  Age
##   2   2  Age
##   2   3  Age
##   2   4  Age
##   2   5  Age
##   3   1  Age
##   3   2  Age
##   3   3  Age
##   3   4  Age
##   3   5  Age
##   4   1  Age
##   4   2  Age
##   4   3  Age
##   4   4  Age
##   4   5  Age
##   5   1  Age
##   5   2  Age
##   5   3  Age
##   5   4  Age
##   5   5  Age
mice_imputed
##     original imputed_pmm imputed_cart imputed_lasso
## 1      22.00       22.00        22.00     22.000000
## 2      38.00       38.00        38.00     38.000000
## 3      26.00       26.00        26.00     26.000000
## 4      35.00       35.00        35.00     35.000000
## 5      35.00       35.00        35.00     35.000000
## 6         NA       50.00        28.00     23.138597
## 7      54.00       54.00        54.00     54.000000
## 8       2.00        2.00         2.00      2.000000
## 9      27.00       27.00        27.00     27.000000
## 10     14.00       14.00        14.00     14.000000
## 11      4.00        4.00         4.00      4.000000
## 12     58.00       58.00        58.00     58.000000
## 13     20.00       20.00        20.00     20.000000
## 14     39.00       39.00        39.00     39.000000
## 15     14.00       14.00        14.00     14.000000
## 16     55.00       55.00        55.00     55.000000
## 17      2.00        2.00         2.00      2.000000
## 18        NA       19.00        34.00     29.699241
## 19     31.00       31.00        31.00     31.000000
## 20        NA       16.00        63.00     28.063402
## 21     35.00       35.00        35.00     35.000000
## 22     34.00       34.00        34.00     34.000000
## 23     15.00       15.00        15.00     15.000000
## 24     28.00       28.00        28.00     28.000000
## 25      8.00        8.00         8.00      8.000000
## 26     38.00       38.00        38.00     38.000000
## 27        NA       50.00        29.00     33.361442
## 28     19.00       19.00        19.00     19.000000
## 29        NA       27.00        45.00     49.407257
## 30        NA       13.00        45.50     44.382027
## 31     40.00       40.00        40.00     40.000000
## 32        NA       17.00        51.00     40.220626
## 33        NA       32.00        19.00     27.968374
## 34     66.00       66.00        66.00     66.000000
## 35     28.00       28.00        28.00     28.000000
## 36     42.00       42.00        42.00     42.000000
## 37        NA       24.00        13.00     18.964398
## 38     21.00       21.00        21.00     21.000000
## 39     18.00       18.00        18.00     18.000000
## 40     14.00       14.00        14.00     14.000000
## 41     40.00       40.00        40.00     40.000000
## 42     27.00       27.00        27.00     27.000000
## 43        NA       50.00        25.00     58.783021
## 44      3.00        3.00         3.00      3.000000
## 45     19.00       19.00        19.00     19.000000
## 46        NA       18.00        19.00     34.664493
## 47        NA       40.00        26.00     29.418582
## 48        NA       32.00        23.00     25.814748
## 49        NA       39.00        29.00     17.092353
## 50     18.00       18.00        18.00     18.000000
## 51      7.00        7.00         7.00      7.000000
## 52     21.00       21.00        21.00     21.000000
## 53     49.00       49.00        49.00     49.000000
## 54     29.00       29.00        29.00     29.000000
## 55     65.00       65.00        65.00     65.000000
## 56        NA       25.00        62.00     34.964772
## 57     21.00       21.00        21.00     21.000000
## 58     28.50       28.50        28.50     28.500000
## 59      5.00        5.00         5.00      5.000000
## 60     11.00       11.00        11.00     11.000000
## 61     22.00       22.00        22.00     22.000000
## 62     38.00       38.00        38.00     38.000000
## 63     45.00       45.00        45.00     45.000000
## 64      4.00        4.00         4.00      4.000000
## 65        NA       71.00        40.00     42.456328
## 66        NA        3.00        27.00     38.422024
## 67     29.00       29.00        29.00     29.000000
## 68     19.00       19.00        19.00     19.000000
## 69     17.00       17.00        17.00     17.000000
## 70     26.00       26.00        26.00     26.000000
## 71     32.00       32.00        32.00     32.000000
## 72     16.00       16.00        16.00     16.000000
## 73     21.00       21.00        21.00     21.000000
## 74     26.00       26.00        26.00     26.000000
## 75     32.00       32.00        32.00     32.000000
## 76     25.00       25.00        25.00     25.000000
## 77        NA       18.00        16.00      7.928571
## 78        NA       18.00        24.00     19.626921
## 79      0.83        0.83         0.83      0.830000
## 80     30.00       30.00        30.00     30.000000
## 81     22.00       22.00        22.00     22.000000
## 82     29.00       29.00        29.00     29.000000
## 83        NA       32.00        16.00     21.627360
## 84     28.00       28.00        28.00     28.000000
## 85     17.00       17.00        17.00     17.000000
## 86     33.00       33.00        33.00     33.000000
## 87     16.00       16.00        16.00     16.000000
## 88        NA       41.00        36.00     53.314791
## 89     23.00       23.00        23.00     23.000000
## 90     24.00       24.00        24.00     24.000000
## 91     29.00       29.00        29.00     29.000000
## 92     20.00       20.00        20.00     20.000000
## 93     46.00       46.00        46.00     46.000000
## 94     26.00       26.00        26.00     26.000000
## 95     59.00       59.00        59.00     59.000000
## 96        NA       25.00        19.00     25.431443
## 97     71.00       71.00        71.00     71.000000
## 98     23.00       23.00        23.00     23.000000
## 99     34.00       34.00        34.00     34.000000
## 100    34.00       34.00        34.00     34.000000
## 101    28.00       28.00        28.00     28.000000
## 102       NA       18.00        16.00     35.232305
## 103    21.00       21.00        21.00     21.000000
## 104    33.00       33.00        33.00     33.000000
## 105    37.00       37.00        37.00     37.000000
## 106    28.00       28.00        28.00     28.000000
## 107    21.00       21.00        21.00     21.000000
## 108       NA       45.00        30.00     25.935342
## 109    38.00       38.00        38.00     38.000000
## 110       NA       25.00        33.00     12.848847
## 111    47.00       47.00        47.00     47.000000
## 112    14.50       14.50        14.50     14.500000
## 113    22.00       22.00        22.00     22.000000
## 114    20.00       20.00        20.00     20.000000
## 115    17.00       17.00        17.00     17.000000
## 116    21.00       21.00        21.00     21.000000
## 117    70.50       70.50        70.50     70.500000
## 118    29.00       29.00        29.00     29.000000
## 119    24.00       24.00        24.00     24.000000
## 120     2.00        2.00         2.00      2.000000
## 121    21.00       21.00        21.00     21.000000
## 122       NA       25.00        23.00     28.661167
## 123    32.50       32.50        32.50     32.500000
## 124    32.50       32.50        32.50     32.500000
## 125    54.00       54.00        54.00     54.000000
## 126    12.00       12.00        12.00     12.000000
## 127       NA       13.00        29.00     30.054590
## 128    24.00       24.00        24.00     24.000000
## 129       NA       35.00         0.42     18.657682
## 130    45.00       45.00        45.00     45.000000
## 131    33.00       33.00        33.00     33.000000
## 132    20.00       20.00        20.00     20.000000
## 133    47.00       47.00        47.00     47.000000
## 134    29.00       29.00        29.00     29.000000
## 135    25.00       25.00        25.00     25.000000
## 136    23.00       23.00        23.00     23.000000
## 137    19.00       19.00        19.00     19.000000
## 138    37.00       37.00        37.00     37.000000
## 139    16.00       16.00        16.00     16.000000
## 140    24.00       24.00        24.00     24.000000
## 141       NA       28.00        44.00     20.391541
## 142    22.00       22.00        22.00     22.000000
## 143    24.00       24.00        24.00     24.000000
## 144    19.00       19.00        19.00     19.000000
## 145    18.00       18.00        18.00     18.000000
## 146    19.00       19.00        19.00     19.000000
## 147    27.00       27.00        27.00     27.000000
## 148     9.00        9.00         9.00      9.000000
## 149    36.50       36.50        36.50     36.500000
## 150    42.00       42.00        42.00     42.000000
## 151    51.00       51.00        51.00     51.000000
## 152    22.00       22.00        22.00     22.000000
## 153    55.50       55.50        55.50     55.500000
## 154    40.50       40.50        40.50     40.500000
## 155       NA       41.00        33.00     40.138689
## 156    51.00       51.00        51.00     51.000000
## 157    16.00       16.00        16.00     16.000000
## 158    30.00       30.00        30.00     30.000000
## 159       NA       41.00        41.00     20.403791
## 160       NA        5.00        11.00     -1.686867
## 161    44.00       44.00        44.00     44.000000
## 162    40.00       40.00        40.00     40.000000
## 163    26.00       26.00        26.00     26.000000
## 164    17.00       17.00        17.00     17.000000
## 165     1.00        1.00         1.00      1.000000
## 166     9.00        9.00         9.00      9.000000
## 167       NA       28.00        58.00     39.490353
## 168    45.00       45.00        45.00     45.000000
## 169       NA       45.50        62.00     42.199702
## 170    28.00       28.00        28.00     28.000000
## 171    61.00       61.00        61.00     61.000000
## 172     4.00        4.00         4.00      4.000000
## 173     1.00        1.00         1.00      1.000000
## 174    21.00       21.00        21.00     21.000000
## 175    56.00       56.00        56.00     56.000000
## 176    18.00       18.00        18.00     18.000000
## 177       NA       10.00         3.00     17.414502
## 178    50.00       50.00        50.00     50.000000
## 179    30.00       30.00        30.00     30.000000
## 180    36.00       36.00        36.00     36.000000
## 181       NA       16.00        16.00     -3.147851
## 182       NA       30.00        35.00     55.961696
## 183     9.00        9.00         9.00      9.000000
## 184     1.00        1.00         1.00      1.000000
## 185     4.00        4.00         4.00      4.000000
## 186       NA       38.00        45.00     36.839671
## 187       NA       20.00        25.00     22.049281
## 188    45.00       45.00        45.00     45.000000
## 189    40.00       40.00        40.00     40.000000
## 190    36.00       36.00        36.00     36.000000
## 191    32.00       32.00        32.00     32.000000
## 192    19.00       19.00        19.00     19.000000
## 193    19.00       19.00        19.00     19.000000
## 194     3.00        3.00         3.00      3.000000
## 195    44.00       44.00        44.00     44.000000
## 196    58.00       58.00        58.00     58.000000
## 197       NA       25.00        28.00     27.222496
## 198    42.00       42.00        42.00     42.000000
## 199       NA       24.00        16.00     12.997681
## 200    24.00       24.00        24.00     24.000000
## 201    28.00       28.00        28.00     28.000000
## 202       NA        5.00         1.00     14.523671
## 203    34.00       34.00        34.00     34.000000
## 204    45.50       45.50        45.50     45.500000
## 205    18.00       18.00        18.00     18.000000
## 206     2.00        2.00         2.00      2.000000
## 207    32.00       32.00        32.00     32.000000
## 208    26.00       26.00        26.00     26.000000
## 209    16.00       16.00        16.00     16.000000
## 210    40.00       40.00        40.00     40.000000
## 211    24.00       24.00        24.00     24.000000
## 212    35.00       35.00        35.00     35.000000
## 213    22.00       22.00        22.00     22.000000
## 214    30.00       30.00        30.00     30.000000
## 215       NA        3.00        47.00     20.278079
## 216    31.00       31.00        31.00     31.000000
## 217    27.00       27.00        27.00     27.000000
## 218    42.00       42.00        42.00     42.000000
## 219    32.00       32.00        32.00     32.000000
## 220    30.00       30.00        30.00     30.000000
## 221    16.00       16.00        16.00     16.000000
## 222    27.00       27.00        27.00     27.000000
## 223    51.00       51.00        51.00     51.000000
## 224       NA       25.00        19.00     37.804621
## 225    38.00       38.00        38.00     38.000000
## 226    22.00       22.00        22.00     22.000000
## 227    19.00       19.00        19.00     19.000000
## 228    20.50       20.50        20.50     20.500000
## 229    18.00       18.00        18.00     18.000000
## 230       NA        9.00         5.00      7.609335
## 231    35.00       35.00        35.00     35.000000
## 232    29.00       29.00        29.00     29.000000
## 233    59.00       59.00        59.00     59.000000
## 234     5.00        5.00         5.00      5.000000
## 235    24.00       24.00        24.00     24.000000
## 236       NA       50.00        21.00     17.485835
## 237    44.00       44.00        44.00     44.000000
## 238     8.00        8.00         8.00      8.000000
## 239    19.00       19.00        19.00     19.000000
## 240    33.00       33.00        33.00     33.000000
## 241       NA       40.00        47.00     25.829111
## 242       NA       25.00        25.00     26.289708
## 243    29.00       29.00        29.00     29.000000
## 244    22.00       22.00        22.00     22.000000
## 245    30.00       30.00        30.00     30.000000
## 246    44.00       44.00        44.00     44.000000
## 247    25.00       25.00        25.00     25.000000
## 248    24.00       24.00        24.00     24.000000
## 249    37.00       37.00        37.00     37.000000
## 250    54.00       54.00        54.00     54.000000
## 251       NA       18.00        19.00     53.828940
## 252    29.00       29.00        29.00     29.000000
## 253    62.00       62.00        62.00     62.000000
## 254    30.00       30.00        30.00     30.000000
## 255    41.00       41.00        41.00     41.000000
## 256    29.00       29.00        29.00     29.000000
## 257       NA       35.00        40.00     37.090808
## 258    30.00       30.00        30.00     30.000000
## 259    35.00       35.00        35.00     35.000000
## 260    50.00       50.00        50.00     50.000000
## 261       NA       41.00        21.00     36.689417
## 262     3.00        3.00         3.00      3.000000
## 263    52.00       52.00        52.00     52.000000
## 264    40.00       40.00        40.00     40.000000
## 265       NA       50.00        48.00     24.460699
## 266    36.00       36.00        36.00     36.000000
## 267    16.00       16.00        16.00     16.000000
## 268    25.00       25.00        25.00     25.000000
## 269    58.00       58.00        58.00     58.000000
## 270    35.00       35.00        35.00     35.000000
## 271       NA       62.00        64.00     35.306099
## 272    25.00       25.00        25.00     25.000000
## 273    41.00       41.00        41.00     41.000000
## 274    37.00       37.00        37.00     37.000000
## 275       NA       16.00        22.00      5.522564
## 276    63.00       63.00        63.00     63.000000
## 277    45.00       45.00        45.00     45.000000
## 278       NA       25.00        31.00     45.213842
## 279     7.00        7.00         7.00      7.000000
## 280    35.00       35.00        35.00     35.000000
## 281    65.00       65.00        65.00     65.000000
## 282    28.00       28.00        28.00     28.000000
## 283    16.00       16.00        16.00     16.000000
## 284    19.00       19.00        19.00     19.000000
## 285       NA       47.00        40.00     48.073608
## 286    33.00       33.00        33.00     33.000000
## 287    30.00       30.00        30.00     30.000000
## 288    22.00       22.00        22.00     22.000000
## 289    42.00       42.00        42.00     42.000000
## 290    22.00       22.00        22.00     22.000000
## 291    26.00       26.00        26.00     26.000000
## 292    19.00       19.00        19.00     19.000000
## 293    36.00       36.00        36.00     36.000000
## 294    24.00       24.00        24.00     24.000000
## 295    24.00       24.00        24.00     24.000000
## 296       NA       47.00        31.00     43.169129
## 297    23.50       23.50        23.50     23.500000
## 298     2.00        2.00         2.00      2.000000
## 299       NA       35.00        36.00     34.449586
## 300    50.00       50.00        50.00     50.000000
## 301       NA       32.00        27.00     31.543555
## 302       NA        9.00        36.00     22.861419
## 303    19.00       19.00        19.00     19.000000
## 304       NA       25.00        50.00     37.010063
## 305       NA       18.00        45.00     13.570978
## 306     0.92        0.92         0.92      0.920000
## 307       NA       35.00        30.00     44.566861
## 308    17.00       17.00        17.00     17.000000
## 309    30.00       30.00        30.00     30.000000
## 310    30.00       30.00        30.00     30.000000
## 311    24.00       24.00        24.00     24.000000
## 312    18.00       18.00        18.00     18.000000
## 313    26.00       26.00        26.00     26.000000
## 314    28.00       28.00        28.00     28.000000
## 315    43.00       43.00        43.00     43.000000
## 316    26.00       26.00        26.00     26.000000
## 317    24.00       24.00        24.00     24.000000
## 318    54.00       54.00        54.00     54.000000
## 319    31.00       31.00        31.00     31.000000
## 320    40.00       40.00        40.00     40.000000
## 321    22.00       22.00        22.00     22.000000
## 322    27.00       27.00        27.00     27.000000
## 323    30.00       30.00        30.00     30.000000
## 324    22.00       22.00        22.00     22.000000
## 325       NA        3.00        11.00      2.604959
## 326    36.00       36.00        36.00     36.000000
## 327    61.00       61.00        61.00     61.000000
## 328    36.00       36.00        36.00     36.000000
## 329    31.00       31.00        31.00     31.000000
## 330    16.00       16.00        16.00     16.000000
## 331       NA       38.00        25.00     15.333732
## 332    45.50       45.50        45.50     45.500000
## 333    38.00       38.00        38.00     38.000000
## 334    16.00       16.00        16.00     16.000000
## 335       NA       49.00        27.00     38.640769
## 336       NA       41.00        45.00     38.717188
## 337    29.00       29.00        29.00     29.000000
## 338    41.00       41.00        41.00     41.000000
## 339    45.00       45.00        45.00     45.000000
## 340    45.00       45.00        45.00     45.000000
## 341     2.00        2.00         2.00      2.000000
## 342    24.00       24.00        24.00     24.000000
## 343    28.00       28.00        28.00     28.000000
## 344    25.00       25.00        25.00     25.000000
## 345    36.00       36.00        36.00     36.000000
## 346    24.00       24.00        24.00     24.000000
## 347    40.00       40.00        40.00     40.000000
## 348       NA       19.00        19.00     22.108909
## 349     3.00        3.00         3.00      3.000000
## 350    42.00       42.00        42.00     42.000000
## 351    23.00       23.00        23.00     23.000000
## 352       NA       47.00        39.00     46.434081
## 353    15.00       15.00        15.00     15.000000
## 354    25.00       25.00        25.00     25.000000
## 355       NA       25.00        21.00     38.389168
## 356    28.00       28.00        28.00     28.000000
## 357    22.00       22.00        22.00     22.000000
## 358    38.00       38.00        38.00     38.000000
## 359       NA       24.00        26.00     24.662871
## 360       NA       27.00        18.00      9.260164
## 361    40.00       40.00        40.00     40.000000
## 362    29.00       29.00        29.00     29.000000
## 363    45.00       45.00        45.00     45.000000
## 364    35.00       35.00        35.00     35.000000
## 365       NA       40.00        14.50     45.875044
## 366    30.00       30.00        30.00     30.000000
## 367    60.00       60.00        60.00     60.000000
## 368       NA       24.00        22.00      7.847575
## 369       NA       32.00        39.00     37.099847
## 370    24.00       24.00        24.00     24.000000
## 371    25.00       25.00        25.00     25.000000
## 372    18.00       18.00        18.00     18.000000
## 373    19.00       19.00        19.00     19.000000
## 374    22.00       22.00        22.00     22.000000
## 375     3.00        3.00         3.00      3.000000
## 376       NA       25.00        53.00     26.291744
## 377    22.00       22.00        22.00     22.000000
## 378    27.00       27.00        27.00     27.000000
## 379    20.00       20.00        20.00     20.000000
## 380    19.00       19.00        19.00     19.000000
## 381    42.00       42.00        42.00     42.000000
## 382     1.00        1.00         1.00      1.000000
## 383    32.00       32.00        32.00     32.000000
## 384    35.00       35.00        35.00     35.000000
## 385       NA       41.00        25.00     35.418128
## 386    18.00       18.00        18.00     18.000000
## 387     1.00        1.00         1.00      1.000000
## 388    36.00       36.00        36.00     36.000000
## 389       NA       41.00        19.00     30.581699
## 390    17.00       17.00        17.00     17.000000
## 391    36.00       36.00        36.00     36.000000
## 392    21.00       21.00        21.00     21.000000
## 393    28.00       28.00        28.00     28.000000
## 394    23.00       23.00        23.00     23.000000
## 395    24.00       24.00        24.00     24.000000
## 396    22.00       22.00        22.00     22.000000
## 397    31.00       31.00        31.00     31.000000
## 398    46.00       46.00        46.00     46.000000
## 399    23.00       23.00        23.00     23.000000
## 400    28.00       28.00        28.00     28.000000
## 401    39.00       39.00        39.00     39.000000
## 402    26.00       26.00        26.00     26.000000
## 403    21.00       21.00        21.00     21.000000
## 404    28.00       28.00        28.00     28.000000
## 405    20.00       20.00        20.00     20.000000
## 406    34.00       34.00        34.00     34.000000
## 407    51.00       51.00        51.00     51.000000
## 408     3.00        3.00         3.00      3.000000
## 409    21.00       21.00        21.00     21.000000
## 410       NA        4.00         0.75      2.240382
## 411       NA       50.00        19.00     35.054886
## 412       NA       13.00        29.00     45.052403
## 413    33.00       33.00        33.00     33.000000
## 414       NA       66.00        35.00     37.121316
## 415    44.00       44.00        44.00     44.000000
## 416       NA       18.00        24.50     48.544095
## 417    34.00       34.00        34.00     34.000000
## 418    18.00       18.00        18.00     18.000000
## 419    30.00       30.00        30.00     30.000000
## 420    10.00       10.00        10.00     10.000000
## 421       NA       41.00        21.00     36.152166
## 422    21.00       21.00        21.00     21.000000
## 423    29.00       29.00        29.00     29.000000
## 424    28.00       28.00        28.00     28.000000
## 425    18.00       18.00        18.00     18.000000
## 426       NA       41.00        38.00     34.540660
## 427    28.00       28.00        28.00     28.000000
## 428    19.00       19.00        19.00     19.000000
## 429       NA       13.00        20.00     33.408042
## 430    32.00       32.00        32.00     32.000000
## 431    28.00       28.00        28.00     28.000000
## 432       NA       12.00        20.00     18.540613
## 433    42.00       42.00        42.00     42.000000
## 434    17.00       17.00        17.00     17.000000
## 435    50.00       50.00        50.00     50.000000
## 436    14.00       14.00        14.00     14.000000
## 437    21.00       21.00        21.00     21.000000
## 438    24.00       24.00        24.00     24.000000
## 439    64.00       64.00        64.00     64.000000
## 440    31.00       31.00        31.00     31.000000
## 441    45.00       45.00        45.00     45.000000
## 442    20.00       20.00        20.00     20.000000
## 443    25.00       25.00        25.00     25.000000
## 444    28.00       28.00        28.00     28.000000
## 445       NA       24.00        29.00     28.639739
## 446     4.00        4.00         4.00      4.000000
## 447    13.00       13.00        13.00     13.000000
## 448    34.00       34.00        34.00     34.000000
## 449     5.00        5.00         5.00      5.000000
## 450    52.00       52.00        52.00     52.000000
## 451    36.00       36.00        36.00     36.000000
## 452       NA       41.00        40.00     32.207197
## 453    30.00       30.00        30.00     30.000000
## 454    49.00       49.00        49.00     49.000000
## 455       NA       13.00        51.00     27.241407
## 456    29.00       29.00        29.00     29.000000
## 457    65.00       65.00        65.00     65.000000
## 458       NA       49.00        35.00     46.530238
## 459    50.00       50.00        50.00     50.000000
## 460       NA       18.00        11.00     35.745600
## 461    48.00       48.00        48.00     48.000000
## 462    34.00       34.00        34.00     34.000000
## 463    47.00       47.00        47.00     47.000000
## 464    48.00       48.00        48.00     48.000000
## 465       NA       25.00        33.00     19.322652
## 466    38.00       38.00        38.00     38.000000
## 467       NA       30.00        50.00     50.200934
## 468    56.00       56.00        56.00     56.000000
## 469       NA       13.00        59.00     25.620317
## 470     0.75        0.75         0.75      0.750000
## 471       NA       41.00        43.00     57.706203
## 472    38.00       38.00        38.00     38.000000
## 473    33.00       33.00        33.00     33.000000
## 474    23.00       23.00        23.00     23.000000
## 475    22.00       22.00        22.00     22.000000
## 476       NA       71.00        39.00     41.032847
## 477    34.00       34.00        34.00     34.000000
## 478    29.00       29.00        29.00     29.000000
## 479    22.00       22.00        22.00     22.000000
## 480     2.00        2.00         2.00      2.000000
## 481     9.00        9.00         9.00      9.000000
## 482       NA       66.00        19.00     25.914125
## 483    50.00       50.00        50.00     50.000000
## 484    63.00       63.00        63.00     63.000000
## 485    25.00       25.00        25.00     25.000000
## 486       NA        2.00         0.75     -5.867558
## 487    35.00       35.00        35.00     35.000000
## 488    58.00       58.00        58.00     58.000000
## 489    30.00       30.00        30.00     30.000000
## 490     9.00        9.00         9.00      9.000000
## 491       NA       39.00        25.00     46.664719
## 492    21.00       21.00        21.00     21.000000
## 493    55.00       55.00        55.00     55.000000
## 494    71.00       71.00        71.00     71.000000
## 495    21.00       21.00        21.00     21.000000
## 496       NA       25.00        20.00     39.076812
## 497    54.00       54.00        54.00     54.000000
## 498       NA       50.00        21.00     14.470910
## 499    25.00       25.00        25.00     25.000000
## 500    24.00       24.00        24.00     24.000000
## 501    17.00       17.00        17.00     17.000000
## 502    21.00       21.00        21.00     21.000000
## 503       NA       13.00        43.00     28.932999
## 504    37.00       37.00        37.00     37.000000
## 505    16.00       16.00        16.00     16.000000
## 506    18.00       18.00        18.00     18.000000
## 507    33.00       33.00        33.00     33.000000
## 508       NA       52.00        35.00     30.212159
## 509    28.00       28.00        28.00     28.000000
## 510    26.00       26.00        26.00     26.000000
## 511    29.00       29.00        29.00     29.000000
## 512       NA       13.00        42.00     50.832488
## 513    36.00       36.00        36.00     36.000000
## 514    54.00       54.00        54.00     54.000000
## 515    24.00       24.00        24.00     24.000000
## 516    47.00       47.00        47.00     47.000000
## 517    34.00       34.00        34.00     34.000000
## 518       NA       25.00        38.00     27.024037
## 519    36.00       36.00        36.00     36.000000
## 520    32.00       32.00        32.00     32.000000
## 521    30.00       30.00        30.00     30.000000
## 522    22.00       22.00        22.00     22.000000
## 523       NA       25.00        22.00     30.992980
## 524    44.00       44.00        44.00     44.000000
## 525       NA       25.00        34.00     31.351219
## 526    40.50       40.50        40.50     40.500000
## 527    50.00       50.00        50.00     50.000000
## 528       NA       47.00        47.00     53.624024
## 529    39.00       39.00        39.00     39.000000
## 530    23.00       23.00        23.00     23.000000
## 531     2.00        2.00         2.00      2.000000
## 532       NA       41.00        34.50     36.462676
## 533    17.00       17.00        17.00     17.000000
## 534       NA       21.00        24.00     20.863092
## 535    30.00       30.00        30.00     30.000000
## 536     7.00        7.00         7.00      7.000000
## 537    45.00       45.00        45.00     45.000000
## 538    30.00       30.00        30.00     30.000000
## 539       NA       18.00        30.50     29.051316
## 540    22.00       22.00        22.00     22.000000
## 541    36.00       36.00        36.00     36.000000
## 542     9.00        9.00         9.00      9.000000
## 543    11.00       11.00        11.00     11.000000
## 544    32.00       32.00        32.00     32.000000
## 545    50.00       50.00        50.00     50.000000
## 546    64.00       64.00        64.00     64.000000
## 547    19.00       19.00        19.00     19.000000
## 548       NA       25.00        23.00     36.466799
## 549    33.00       33.00        33.00     33.000000
## 550     8.00        8.00         8.00      8.000000
## 551    17.00       17.00        17.00     17.000000
## 552    27.00       27.00        27.00     27.000000
## 553       NA       13.00        24.00     31.901357
## 554    22.00       22.00        22.00     22.000000
## 555    22.00       22.00        22.00     22.000000
## 556    62.00       62.00        62.00     62.000000
## 557    48.00       48.00        48.00     48.000000
## 558       NA       71.00        39.00     69.165379
## 559    39.00       39.00        39.00     39.000000
## 560    36.00       36.00        36.00     36.000000
## 561       NA       41.00        43.00     17.500269
## 562    40.00       40.00        40.00     40.000000
## 563    28.00       28.00        28.00     28.000000
## 564       NA       50.00        30.00     18.340103
## 565       NA       13.00        24.00     33.991401
## 566    24.00       24.00        24.00     24.000000
## 567    19.00       19.00        19.00     19.000000
## 568    29.00       29.00        29.00     29.000000
## 569       NA       50.00        18.00     18.403709
## 570    32.00       32.00        32.00     32.000000
## 571    62.00       62.00        62.00     62.000000
## 572    53.00       53.00        53.00     53.000000
## 573    36.00       36.00        36.00     36.000000
## 574       NA       32.00        29.00     47.034050
## 575    16.00       16.00        16.00     16.000000
## 576    19.00       19.00        19.00     19.000000
## 577    34.00       34.00        34.00     34.000000
## 578    39.00       39.00        39.00     39.000000
## 579       NA       40.00        28.00     23.189149
## 580    32.00       32.00        32.00     32.000000
## 581    25.00       25.00        25.00     25.000000
## 582    39.00       39.00        39.00     39.000000
## 583    54.00       54.00        54.00     54.000000
## 584    36.00       36.00        36.00     36.000000
## 585       NA       18.00        21.00     67.694091
## 586    18.00       18.00        18.00     18.000000
## 587    47.00       47.00        47.00     47.000000
## 588    60.00       60.00        60.00     60.000000
## 589    22.00       22.00        22.00     22.000000
## 590       NA       25.00        17.00     30.649153
## 591    35.00       35.00        35.00     35.000000
## 592    52.00       52.00        52.00     52.000000
## 593    47.00       47.00        47.00     47.000000
## 594       NA       24.00        40.50     32.436296
## 595    37.00       37.00        37.00     37.000000
## 596    36.00       36.00        36.00     36.000000
## 597       NA       19.00        18.00     31.221822
## 598    49.00       49.00        49.00     49.000000
## 599       NA       50.00        25.00     47.436361
## 600    49.00       49.00        49.00     49.000000
## 601    24.00       24.00        24.00     24.000000
## 602       NA       13.00        18.00     18.820254
## 603       NA       71.00        47.00     58.670056
## 604    44.00       44.00        44.00     44.000000
## 605    35.00       35.00        35.00     35.000000
## 606    36.00       36.00        36.00     36.000000
## 607    30.00       30.00        30.00     30.000000
## 608    27.00       27.00        27.00     27.000000
## 609    22.00       22.00        22.00     22.000000
## 610    40.00       40.00        40.00     40.000000
## 611    39.00       39.00        39.00     39.000000
## 612       NA       18.00        20.00     25.907702
## 613       NA       12.00        25.00     25.518209
## 614       NA       18.00        28.00     33.791775
## 615    35.00       35.00        35.00     35.000000
## 616    24.00       24.00        24.00     24.000000
## 617    34.00       34.00        34.00     34.000000
## 618    26.00       26.00        26.00     26.000000
## 619     4.00        4.00         4.00      4.000000
## 620    26.00       26.00        26.00     26.000000
## 621    27.00       27.00        27.00     27.000000
## 622    42.00       42.00        42.00     42.000000
## 623    20.00       20.00        20.00     20.000000
## 624    21.00       21.00        21.00     21.000000
## 625    21.00       21.00        21.00     21.000000
## 626    61.00       61.00        61.00     61.000000
## 627    57.00       57.00        57.00     57.000000
## 628    21.00       21.00        21.00     21.000000
## 629    26.00       26.00        26.00     26.000000
## 630       NA       13.00        22.00     46.119539
## 631    80.00       80.00        80.00     80.000000
## 632    51.00       51.00        51.00     51.000000
## 633    32.00       32.00        32.00     32.000000
## 634       NA       62.00        22.00     37.860192
## 635     9.00        9.00         9.00      9.000000
## 636    28.00       28.00        28.00     28.000000
## 637    32.00       32.00        32.00     32.000000
## 638    31.00       31.00        31.00     31.000000
## 639    41.00       41.00        41.00     41.000000
## 640       NA       39.00        24.00     26.211401
## 641    20.00       20.00        20.00     20.000000
## 642    24.00       24.00        24.00     24.000000
## 643     2.00        2.00         2.00      2.000000
## 644       NA       32.00        18.00     31.775737
## 645     0.75        0.75         0.75      0.750000
## 646    48.00       48.00        48.00     48.000000
## 647    19.00       19.00        19.00     19.000000
## 648    56.00       56.00        56.00     56.000000
## 649       NA       41.00        33.00     23.171088
## 650    23.00       23.00        23.00     23.000000
## 651       NA       50.00        17.00     59.713457
## 652    18.00       18.00        18.00     18.000000
## 653    21.00       21.00        21.00     21.000000
## 654       NA       24.00        16.00     36.481209
## 655    18.00       18.00        18.00     18.000000
## 656    24.00       24.00        24.00     24.000000
## 657       NA       13.00        50.00     23.331798
## 658    32.00       32.00        32.00     32.000000
## 659    23.00       23.00        23.00     23.000000
## 660    58.00       58.00        58.00     58.000000
## 661    50.00       50.00        50.00     50.000000
## 662    40.00       40.00        40.00     40.000000
## 663    47.00       47.00        47.00     47.000000
## 664    36.00       36.00        36.00     36.000000
## 665    20.00       20.00        20.00     20.000000
## 666    32.00       32.00        32.00     32.000000
## 667    25.00       25.00        25.00     25.000000
## 668       NA       13.00        22.00     50.931164
## 669    43.00       43.00        43.00     43.000000
## 670       NA       48.00        33.00     48.648657
## 671    40.00       40.00        40.00     40.000000
## 672    31.00       31.00        31.00     31.000000
## 673    70.00       70.00        70.00     70.000000
## 674    31.00       31.00        31.00     31.000000
## 675       NA       34.00        35.00     43.780320
## 676    18.00       18.00        18.00     18.000000
## 677    24.50       24.50        24.50     24.500000
## 678    18.00       18.00        18.00     18.000000
## 679    43.00       43.00        43.00     43.000000
## 680    36.00       36.00        36.00     36.000000
## 681       NA       13.00        24.00     22.710286
## 682    27.00       27.00        27.00     27.000000
## 683    20.00       20.00        20.00     20.000000
## 684    14.00       14.00        14.00     14.000000
## 685    60.00       60.00        60.00     60.000000
## 686    25.00       25.00        25.00     25.000000
## 687    14.00       14.00        14.00     14.000000
## 688    19.00       19.00        19.00     19.000000
## 689    18.00       18.00        18.00     18.000000
## 690    15.00       15.00        15.00     15.000000
## 691    31.00       31.00        31.00     31.000000
## 692     4.00        4.00         4.00      4.000000
## 693       NA       24.00        16.00     34.746741
## 694    25.00       25.00        25.00     25.000000
## 695    60.00       60.00        60.00     60.000000
## 696    52.00       52.00        52.00     52.000000
## 697    44.00       44.00        44.00     44.000000
## 698       NA       32.00        22.00     32.130059
## 699    49.00       49.00        49.00     49.000000
## 700    42.00       42.00        42.00     42.000000
## 701    18.00       18.00        18.00     18.000000
## 702    35.00       35.00        35.00     35.000000
## 703    18.00       18.00        18.00     18.000000
## 704    25.00       25.00        25.00     25.000000
## 705    26.00       26.00        26.00     26.000000
## 706    39.00       39.00        39.00     39.000000
## 707    45.00       45.00        45.00     45.000000
## 708    42.00       42.00        42.00     42.000000
## 709    22.00       22.00        22.00     22.000000
## 710       NA        3.00         9.00     33.182082
## 711    24.00       24.00        24.00     24.000000
## 712       NA       45.50        58.00     33.466917
## 713    48.00       48.00        48.00     48.000000
## 714    29.00       29.00        29.00     29.000000
## 715    52.00       52.00        52.00     52.000000
## 716    19.00       19.00        19.00     19.000000
## 717    38.00       38.00        38.00     38.000000
## 718    27.00       27.00        27.00     27.000000
## 719       NA       41.00        36.00     36.130454
## 720    33.00       33.00        33.00     33.000000
## 721     6.00        6.00         6.00      6.000000
## 722    17.00       17.00        17.00     17.000000
## 723    34.00       34.00        34.00     34.000000
## 724    50.00       50.00        50.00     50.000000
## 725    27.00       27.00        27.00     27.000000
## 726    20.00       20.00        20.00     20.000000
## 727    30.00       30.00        30.00     30.000000
## 728       NA       32.00        32.00     19.740786
## 729    25.00       25.00        25.00     25.000000
## 730    25.00       25.00        25.00     25.000000
## 731    29.00       29.00        29.00     29.000000
## 732    11.00       11.00        11.00     11.000000
## 733       NA       25.00        52.00     32.450605
## 734    23.00       23.00        23.00     23.000000
## 735    23.00       23.00        23.00     23.000000
## 736    28.50       28.50        28.50     28.500000
## 737    48.00       48.00        48.00     48.000000
## 738    35.00       35.00        35.00     35.000000
## 739       NA       18.00        32.00     50.635961
## 740       NA       50.00        26.00     48.306316
## 741       NA       52.00        19.00     50.612894
## 742    36.00       36.00        36.00     36.000000
## 743    21.00       21.00        21.00     21.000000
## 744    24.00       24.00        24.00     24.000000
## 745    31.00       31.00        31.00     31.000000
## 746    70.00       70.00        70.00     70.000000
## 747    16.00       16.00        16.00     16.000000
## 748    30.00       30.00        30.00     30.000000
## 749    19.00       19.00        19.00     19.000000
## 750    31.00       31.00        31.00     31.000000
## 751     4.00        4.00         4.00      4.000000
## 752     6.00        6.00         6.00      6.000000
## 753    33.00       33.00        33.00     33.000000
## 754    23.00       23.00        23.00     23.000000
## 755    48.00       48.00        48.00     48.000000
## 756     0.67        0.67         0.67      0.670000
## 757    28.00       28.00        28.00     28.000000
## 758    18.00       18.00        18.00     18.000000
## 759    34.00       34.00        34.00     34.000000
## 760    33.00       33.00        33.00     33.000000
## 761       NA       13.00        16.00     31.803481
## 762    41.00       41.00        41.00     41.000000
## 763    20.00       20.00        20.00     20.000000
## 764    36.00       36.00        36.00     36.000000
## 765    16.00       16.00        16.00     16.000000
## 766    51.00       51.00        51.00     51.000000
## 767       NA       47.00        56.00     40.361236
## 768    30.50       30.50        30.50     30.500000
## 769       NA        3.00        25.00     15.651240
## 770    32.00       32.00        32.00     32.000000
## 771    24.00       24.00        24.00     24.000000
## 772    48.00       48.00        48.00     48.000000
## 773    57.00       57.00        57.00     57.000000
## 774       NA       13.00        24.00     36.102583
## 775    54.00       54.00        54.00     54.000000
## 776    18.00       18.00        18.00     18.000000
## 777       NA       50.00        24.00     12.751652
## 778     5.00        5.00         5.00      5.000000
## 779       NA       25.00        45.50     35.792110
## 780    43.00       43.00        43.00     43.000000
## 781    13.00       13.00        13.00     13.000000
## 782    17.00       17.00        17.00     17.000000
## 783    29.00       29.00        29.00     29.000000
## 784       NA       18.00        26.00      9.943468
## 785    25.00       25.00        25.00     25.000000
## 786    25.00       25.00        25.00     25.000000
## 787    18.00       18.00        18.00     18.000000
## 788     8.00        8.00         8.00      8.000000
## 789     1.00        1.00         1.00      1.000000
## 790    46.00       46.00        46.00     46.000000
## 791       NA       13.00        19.00     13.405442
## 792    16.00       16.00        16.00     16.000000
## 793       NA       16.00        11.00    -27.366029
## 794       NA       38.00        62.00     56.220349
## 795    25.00       25.00        25.00     25.000000
## 796    39.00       39.00        39.00     39.000000
## 797    49.00       49.00        49.00     49.000000
## 798    31.00       31.00        31.00     31.000000
## 799    30.00       30.00        30.00     30.000000
## 800    30.00       30.00        30.00     30.000000
## 801    34.00       34.00        34.00     34.000000
## 802    31.00       31.00        31.00     31.000000
## 803    11.00       11.00        11.00     11.000000
## 804     0.42        0.42         0.42      0.420000
## 805    27.00       27.00        27.00     27.000000
## 806    31.00       31.00        31.00     31.000000
## 807    39.00       39.00        39.00     39.000000
## 808    18.00       18.00        18.00     18.000000
## 809    39.00       39.00        39.00     39.000000
## 810    33.00       33.00        33.00     33.000000
## 811    26.00       26.00        26.00     26.000000
## 812    39.00       39.00        39.00     39.000000
## 813    35.00       35.00        35.00     35.000000
## 814     6.00        6.00         6.00      6.000000
## 815    30.50       30.50        30.50     30.500000
## 816       NA       71.00        62.00     37.351594
## 817    23.00       23.00        23.00     23.000000
## 818    31.00       31.00        31.00     31.000000
## 819    43.00       43.00        43.00     43.000000
## 820    10.00       10.00        10.00     10.000000
## 821    52.00       52.00        52.00     52.000000
## 822    27.00       27.00        27.00     27.000000
## 823    38.00       38.00        38.00     38.000000
## 824    27.00       27.00        27.00     27.000000
## 825     2.00        2.00         2.00      2.000000
## 826       NA       25.00        30.00     35.616908
## 827       NA       25.00        49.00     18.377707
## 828     1.00        1.00         1.00      1.000000
## 829       NA       45.00        26.00     15.378832
## 830    62.00       62.00        62.00     62.000000
## 831    15.00       15.00        15.00     15.000000
## 832     0.83        0.83         0.83      0.830000
## 833       NA       50.00        25.00     36.482607
## 834    23.00       23.00        23.00     23.000000
## 835    18.00       18.00        18.00     18.000000
## 836    39.00       39.00        39.00     39.000000
## 837    21.00       21.00        21.00     21.000000
## 838       NA       18.00        28.00     25.091181
## 839    32.00       32.00        32.00     32.000000
## 840       NA       35.00        44.00     40.287568
## 841    20.00       20.00        20.00     20.000000
## 842    16.00       16.00        16.00     16.000000
## 843    30.00       30.00        30.00     30.000000
## 844    34.50       34.50        34.50     34.500000
## 845    17.00       17.00        17.00     17.000000
## 846    42.00       42.00        42.00     42.000000
## 847       NA        5.00        11.00     -9.862554
## 848    35.00       35.00        35.00     35.000000
## 849    28.00       28.00        28.00     28.000000
## 850       NA       48.00        39.00     38.411686
## 851     4.00        4.00         4.00      4.000000
## 852    74.00       74.00        74.00     74.000000
## 853     9.00        9.00         9.00      9.000000
## 854    16.00       16.00        16.00     16.000000
## 855    44.00       44.00        44.00     44.000000
## 856    18.00       18.00        18.00     18.000000
## 857    45.00       45.00        45.00     45.000000
## 858    51.00       51.00        51.00     51.000000
## 859    24.00       24.00        24.00     24.000000
## 860       NA       13.00        28.00     58.146066
## 861    41.00       41.00        41.00     41.000000
## 862    21.00       21.00        21.00     21.000000
## 863    48.00       48.00        48.00     48.000000
## 864       NA       14.00         9.00     -3.237722
## 865    24.00       24.00        24.00     24.000000
## 866    42.00       42.00        42.00     42.000000
## 867    27.00       27.00        27.00     27.000000
## 868    31.00       31.00        31.00     31.000000
## 869       NA       25.00        28.00     14.966908
## 870     4.00        4.00         4.00      4.000000
## 871    26.00       26.00        26.00     26.000000
## 872    47.00       47.00        47.00     47.000000
## 873    33.00       33.00        33.00     33.000000
## 874    47.00       47.00        47.00     47.000000
## 875    28.00       28.00        28.00     28.000000
## 876    15.00       15.00        15.00     15.000000
## 877    20.00       20.00        20.00     20.000000
## 878    19.00       19.00        19.00     19.000000
## 879       NA       18.00        22.00     23.425275
## 880    56.00       56.00        56.00     56.000000
## 881    25.00       25.00        25.00     25.000000
## 882    33.00       33.00        33.00     33.000000
## 883    22.00       22.00        22.00     22.000000
## 884    28.00       28.00        28.00     28.000000
## 885    25.00       25.00        25.00     25.000000
## 886    39.00       39.00        39.00     39.000000
## 887    27.00       27.00        27.00     27.000000
## 888    19.00       19.00        19.00     19.000000
## 889       NA        4.00        32.00     13.016662
## 890    26.00       26.00        26.00     26.000000
## 891    32.00       32.00        32.00     32.000000

It’s hard to compare the methods from row-by-row output alone, so we’ll draw a grid of histograms once again (copy and modify the code from the previous section):

h1 <- ggplot(mice_imputed, aes(x = original)) +
  geom_histogram(fill = "#ad1538", color = "#000000", position = "identity") +
  ggtitle("Original distribution") +
  theme_classic()
h2 <- ggplot(mice_imputed, aes(x = imputed_pmm)) +
  geom_histogram(fill = "#15ad4f", color = "#000000", position = "identity") +
  ggtitle("MICE-pmm-imputed distribution") +
  theme_classic()
h3 <- ggplot(mice_imputed, aes(x = imputed_cart)) +
  geom_histogram(fill = "#1543ad", color = "#000000", position = "identity") +
  ggtitle("MICE-cart-imputed distribution") +
  theme_classic()
h4 <- ggplot(mice_imputed, aes(x = imputed_lasso)) +
  geom_histogram(fill = "#ad8415", color = "#000000", position = "identity") +
  ggtitle("MICE-lasso-imputed distribution") +
  theme_classic()

plot_grid(h1, h2, h3, h4, nrow = 2, ncol = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 177 rows containing non-finite outside the scale range
## (`stat_bin()`).
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
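
The `stat_bin()` messages above appear because no bin width was given, and the warning about removed rows refers to the missing ages in the original column. The following is a minimal sketch of one way to avoid both, assuming a small made-up data frame (`toy`) in place of `mice_imputed`; the bin width of 5 years is an arbitrary choice for illustration, not a recommendation from the original analysis.

```r
library(ggplot2)

# Hypothetical stand-in for mice_imputed: a few ages with two missing values
toy <- data.frame(original = c(22, 38, NA, 26, 35, NA, 54, 2, 27, 14))

# Drop missing values up front so ggplot2 has no rows to remove
toy_observed <- subset(toy, !is.na(original))

# Set the bin width explicitly (5-year bins starting at 0) instead of
# relying on the default of 30 bins
p <- ggplot(toy_observed, aes(x = original)) +
  geom_histogram(binwidth = 5, boundary = 0,
                 fill = "#ad1538", color = "#000000") +
  ggtitle("Original distribution") +
  theme_classic()

p
```

Using the same `binwidth` and `boundary` in all four panels also makes the histograms easier to compare with each other.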

The imputed distributions overall look much closer to the original one, and the CART-imputed age distribution is probably the closest match. Note the last histogram, though: the lasso-imputed ages go below zero (for example, -27.37 in row 793 of the table above). Negative values make no sense for a variable such as age, so you will need to correct them manually if you opt for this imputation technique.
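
As a rough illustration of what such a manual correction could look like, the sketch below flags negative imputed ages and shows two simple fixes. The data frame `ages` is a made-up stand-in (only the three imputed values are copied from the table above), and both fixes are assumptions for demonstration; each one distorts the imputed distribution in its own way, so treat them as options to evaluate rather than a recommended procedure.

```r
# Hypothetical stand-in for the original and lasso-imputed age columns
ages <- data.frame(
  original      = c(22, NA, 38, NA, 26, NA),
  imputed_lasso = c(22, -27.366029, 38, -9.862554, 26, 31.803481)
)

# Flag the impossible values
negative <- ages$imputed_lasso < 0
sum(negative)

# Option 1: clamp negative ages at zero
ages$clamped <- pmax(ages$imputed_lasso, 0)

# Option 2: replace negative ages with the median of the observed ages
observed_median <- median(ages$original, na.rm = TRUE)
ages$median_fix <- ifelse(negative, observed_median, ages$imputed_lasso)

ages
```

Clamping piles the corrected values up at zero, while the median replacement pulls them towards the centre of the distribution, so it is worth re-plotting the histogram after either fix.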

Imputation with R missForest Package

The missForest imputation technique is based on the random forest algorithm. It’s a non-parametric imputation method, which means it doesn’t assume a particular functional form for the relationship between variables, but instead learns that relationship from the observed data points.

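In other words, it builds a random forest model for each variable with missing values, using the other variables as predictors, and then uses the model to predict the missing entries. You can learn more about it by reading the original missForest article by Stekhoven and Bühlmann, published in Bioinformatics (Oxford Academic).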
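
To make the idea concrete, here is a simplified sketch of the per-variable loop, written directly with the randomForest package on the built-in airquality data. It is not the missForest implementation: the real package also orders the variables by their amount of missingness and stops based on how much the imputations change between iterations, whereas this sketch assumes a mean-based starting guess and a fixed three passes.

```r
library(randomForest)

set.seed(42)
df <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]
miss <- is.na(df)                          # remember where the gaps are
vars_with_na <- names(df)[colSums(miss) > 0]

# Step 1: start from a crude guess (column means)
imp <- df
for (v in vars_with_na) {
  imp[miss[, v], v] <- mean(df[[v]], na.rm = TRUE)
}

# Step 2: repeatedly refit a forest per variable on the observed rows
# and overwrite the guesses in the rows that were originally missing
for (iter in 1:3) {
  for (v in vars_with_na) {
    obs <- !miss[, v]
    predictors <- names(imp) != v
    fit <- randomForest(x = imp[obs, predictors, drop = FALSE],
                        y = imp[obs, v],
                        ntree = 100)
    imp[!obs, v] <- predict(fit, imp[!obs, predictors, drop = FALSE])
  }
}

summary(imp)
```

Only the originally missing cells are ever overwritten, so the observed values pass through unchanged, just as in the missForest output.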

Let’s see how it works for imputation in R. We’ll apply it to the entire numerical dataset and extract only the Age column:

library(missForest)
## 
## Attaching package: 'missForest'
## The following object is masked from 'package:VIM':
## 
##     nrmse
missForest_imputed <- data.frame(
  original = titanic_numeric$Age,
  imputed_missForest = missForest(titanic_numeric)$ximp$Age
)
missForest_imputed
##     original imputed_missForest
## 1      22.00          22.000000
## 2      38.00          38.000000
## 3      26.00          26.000000
## 4      35.00          35.000000
## 5      35.00          35.000000
## 6         NA          28.935501
## 7      54.00          54.000000
## 8       2.00           2.000000
## 9      27.00          27.000000
## 10     14.00          14.000000
## 11      4.00           4.000000
## 12     58.00          58.000000
## 13     20.00          20.000000
## 14     39.00          39.000000
## 15     14.00          14.000000
## 16     55.00          55.000000
## 17      2.00           2.000000
## 18        NA          32.117919
## 19     31.00          31.000000
## 20        NA          27.099293
## 21     35.00          35.000000
## 22     34.00          34.000000
## 23     15.00          15.000000
## 24     28.00          28.000000
## 25      8.00           8.000000
## 26     38.00          38.000000
## 27        NA          28.935501
## 28     19.00          19.000000
## 29        NA          27.099293
## 30        NA          28.935501
## 31     40.00          40.000000
## 32        NA          37.533744
## 33        NA          27.099293
## 34     66.00          66.000000
## 35     28.00          28.000000
## 36     42.00          42.000000
## 37        NA          27.099293
## 38     21.00          21.000000
## 39     18.00          18.000000
## 40     14.00          14.000000
## 41     40.00          40.000000
## 42     27.00          27.000000
## 43        NA          28.935501
## 44      3.00           3.000000
## 45     19.00          19.000000
## 46        NA          28.935501
## 47        NA          27.447446
## 48        NA          27.099293
## 49        NA          26.968359
## 50     18.00          18.000000
## 51      7.00           7.000000
## 52     21.00          21.000000
## 53     49.00          49.000000
## 54     29.00          29.000000
## 55     65.00          65.000000
## 56        NA          36.413750
## 57     21.00          21.000000
## 58     28.50          28.500000
## 59      5.00           5.000000
## 60     11.00          11.000000
## 61     22.00          22.000000
## 62     38.00          38.000000
## 63     45.00          45.000000
## 64      4.00           4.000000
## 65        NA          45.034064
## 66        NA          14.045910
## 67     29.00          29.000000
## 68     19.00          19.000000
## 69     17.00          17.000000
## 70     26.00          26.000000
## 71     32.00          32.000000
## 72     16.00          16.000000
## 73     21.00          21.000000
## 74     26.00          26.000000
## 75     32.00          32.000000
## 76     25.00          25.000000
## 77        NA          28.935501
## 78        NA          28.935501
## 79      0.83           0.830000
## 80     30.00          30.000000
## 81     22.00          22.000000
## 82     29.00          29.000000
## 83        NA          27.099293
## 84     28.00          28.000000
## 85     17.00          17.000000
## 86     33.00          33.000000
## 87     16.00          16.000000
## 88        NA          28.935501
## 89     23.00          23.000000
## 90     24.00          24.000000
## 91     29.00          29.000000
## 92     20.00          20.000000
## 93     46.00          46.000000
## 94     26.00          26.000000
## 95     59.00          59.000000
## 96        NA          28.935501
## 97     71.00          71.000000
## 98     23.00          23.000000
## 99     34.00          34.000000
## 100    34.00          34.000000
## 101    28.00          28.000000
## 102       NA          28.935501
## 103    21.00          21.000000
## 104    33.00          33.000000
## 105    37.00          37.000000
## 106    28.00          28.000000
## 107    21.00          21.000000
## 108       NA          27.099293
## 109    38.00          38.000000
## 110       NA          23.695411
## 111    47.00          47.000000
## 112    14.50          14.500000
## 113    22.00          22.000000
## 114    20.00          20.000000
## 115    17.00          17.000000
## 116    21.00          21.000000
## 117    70.50          70.500000
## 118    29.00          29.000000
## 119    24.00          24.000000
## 120     2.00           2.000000
## 121    21.00          21.000000
## 122       NA          28.935501
## 123    32.50          32.500000
## 124    32.50          32.500000
## 125    54.00          54.000000
## 126    12.00          12.000000
## 127       NA          28.935501
## 128    24.00          24.000000
## 129       NA          14.045910
## 130    45.00          45.000000
## 131    33.00          33.000000
## 132    20.00          20.000000
## 133    47.00          47.000000
## 134    29.00          29.000000
## 135    25.00          25.000000
## 136    23.00          23.000000
## 137    19.00          19.000000
## 138    37.00          37.000000
## 139    16.00          16.000000
## 140    24.00          24.000000
## 141       NA          30.097261
## 142    22.00          22.000000
## 143    24.00          24.000000
## 144    19.00          19.000000
## 145    18.00          18.000000
## 146    19.00          19.000000
## 147    27.00          27.000000
## 148     9.00           9.000000
## 149    36.50          36.500000
## 150    42.00          42.000000
## 151    51.00          51.000000
## 152    22.00          22.000000
## 153    55.50          55.500000
## 154    40.50          40.500000
## 155       NA          28.935501
## 156    51.00          51.000000
## 157    16.00          16.000000
## 158    30.00          30.000000
## 159       NA          28.935501
## 160       NA           9.825194
## 161    44.00          44.000000
## 162    40.00          40.000000
## 163    26.00          26.000000
## 164    17.00          17.000000
## 165     1.00           1.000000
## 166     9.00           9.000000
## 167       NA          35.255782
## 168    45.00          45.000000
## 169       NA          45.034064
## 170    28.00          28.000000
## 171    61.00          61.000000
## 172     4.00           4.000000
## 173     1.00           1.000000
## 174    21.00          21.000000
## 175    56.00          56.000000
## 176    18.00          18.000000
## 177       NA           5.255613
## 178    50.00          50.000000
## 179    30.00          30.000000
## 180    36.00          36.000000
## 181       NA           9.825194
## 182       NA          33.467588
## 183     9.00           9.000000
## 184     1.00           1.000000
## 185     4.00           4.000000
## 186       NA          45.034064
## 187       NA          23.695411
## 188    45.00          45.000000
## 189    40.00          40.000000
## 190    36.00          36.000000
## 191    32.00          32.000000
## 192    19.00          19.000000
## 193    19.00          19.000000
## 194     3.00           3.000000
## 195    44.00          44.000000
## 196    58.00          58.000000
## 197       NA          28.935501
## 198    42.00          42.000000
## 199       NA          27.099293
## 200    24.00          24.000000
## 201    28.00          28.000000
## 202       NA           9.825194
## 203    34.00          34.000000
## 204    45.50          45.500000
## 205    18.00          18.000000
## 206     2.00           2.000000
## 207    32.00          32.000000
## 208    26.00          26.000000
## 209    16.00          16.000000
## 210    40.00          40.000000
## 211    24.00          24.000000
## 212    35.00          35.000000
## 213    22.00          22.000000
## 214    30.00          30.000000
## 215       NA          27.447446
## 216    31.00          31.000000
## 217    27.00          27.000000
## 218    42.00          42.000000
## 219    32.00          32.000000
## 220    30.00          30.000000
## 221    16.00          16.000000
## 222    27.00          27.000000
## 223    51.00          51.000000
## 224       NA          28.935501
## 225    38.00          38.000000
## 226    22.00          22.000000
## 227    19.00          19.000000
## 228    20.50          20.500000
## 229    18.00          18.000000
## 230       NA           5.255613
## 231    35.00          35.000000
## 232    29.00          29.000000
## 233    59.00          59.000000
## 234     5.00           5.000000
## 235    24.00          24.000000
## 236       NA          28.935501
## 237    44.00          44.000000
## 238     8.00           8.000000
## 239    19.00          19.000000
## 240    33.00          33.000000
## 241       NA          27.447446
## 242       NA          23.695411
## 243    29.00          29.000000
## 244    22.00          22.000000
## 245    30.00          30.000000
## 246    44.00          44.000000
## 247    25.00          25.000000
## 248    24.00          24.000000
## 249    37.00          37.000000
## 250    54.00          54.000000
## 251       NA          28.935501
## 252    29.00          29.000000
## 253    62.00          62.000000
## 254    30.00          30.000000
## 255    41.00          41.000000
## 256    29.00          29.000000
## 257       NA          36.413750
## 258    30.00          30.000000
## 259    35.00          35.000000
## 260    50.00          50.000000
## 261       NA          28.935501
## 262     3.00           3.000000
## 263    52.00          52.000000
## 264    40.00          40.000000
## 265       NA          28.935501
## 266    36.00          36.000000
## 267    16.00          16.000000
## 268    25.00          25.000000
## 269    58.00          58.000000
## 270    35.00          35.000000
## 271       NA          45.034064
## 272    25.00          25.000000
## 273    41.00          41.000000
## 274    37.00          37.000000
## 275       NA          27.099293
## 276    63.00          63.000000
## 277    45.00          45.000000
## 278       NA          33.467588
## 279     7.00           7.000000
## 280    35.00          35.000000
## 281    65.00          65.000000
## 282    28.00          28.000000
## 283    16.00          16.000000
## 284    19.00          19.000000
## 285       NA          45.034064
## 286    33.00          33.000000
## 287    30.00          30.000000
## 288    22.00          22.000000
## 289    42.00          42.000000
## 290    22.00          22.000000
## 291    26.00          26.000000
## 292    19.00          19.000000
## 293    36.00          36.000000
## 294    24.00          24.000000
## 295    24.00          24.000000
## 296       NA          45.034064
## 297    23.50          23.500000
## 298     2.00           2.000000
## 299       NA          36.413750
## 300    50.00          50.000000
## 301       NA          27.099293
## 302       NA          24.198155
## 303    19.00          19.000000
## 304       NA          32.117919
## 305       NA          28.935501
## 306     0.92           0.920000
## 307       NA          36.413750
## 308    17.00          17.000000
## 309    30.00          30.000000
## 310    30.00          30.000000
## 311    24.00          24.000000
## 312    18.00          18.000000
## 313    26.00          26.000000
## 314    28.00          28.000000
## 315    43.00          43.000000
## 316    26.00          26.000000
## 317    24.00          24.000000
## 318    54.00          54.000000
## 319    31.00          31.000000
## 320    40.00          40.000000
## 321    22.00          22.000000
## 322    27.00          27.000000
## 323    30.00          30.000000
## 324    22.00          22.000000
## 325       NA           9.825194
## 326    36.00          36.000000
## 327    61.00          61.000000
## 328    36.00          36.000000
## 329    31.00          31.000000
## 330    16.00          16.000000
## 331       NA          24.198155
## 332    45.50          45.500000
## 333    38.00          38.000000
## 334    16.00          16.000000
## 335       NA          37.533744
## 336       NA          28.935501
## 337    29.00          29.000000
## 338    41.00          41.000000
## 339    45.00          45.000000
## 340    45.00          45.000000
## 341     2.00           2.000000
## 342    24.00          24.000000
## 343    28.00          28.000000
## 344    25.00          25.000000
## 345    36.00          36.000000
## 346    24.00          24.000000
## 347    40.00          40.000000
## 348       NA          23.695411
## 349     3.00           3.000000
## 350    42.00          42.000000
## 351    23.00          23.000000
## 352       NA          45.034064
## 353    15.00          15.000000
## 354    25.00          25.000000
## 355       NA          28.935501
## 356    28.00          28.000000
## 357    22.00          22.000000
## 358    38.00          38.000000
## 359       NA          27.099293
## 360       NA          27.099293
## 361    40.00          40.000000
## 362    29.00          29.000000
## 363    45.00          45.000000
## 364    35.00          35.000000
## 365       NA          27.447446
## 366    30.00          30.000000
## 367    60.00          60.000000
## 368       NA          27.099293
## 369       NA          27.099293
## 370    24.00          24.000000
## 371    25.00          25.000000
## 372    18.00          18.000000
## 373    19.00          19.000000
## 374    22.00          22.000000
## 375     3.00           3.000000
## 376       NA          37.533744
## 377    22.00          22.000000
## 378    27.00          27.000000
## 379    20.00          20.000000
## 380    19.00          19.000000
## 381    42.00          42.000000
## 382     1.00           1.000000
## 383    32.00          32.000000
## 384    35.00          35.000000
## 385       NA          28.935501
## 386    18.00          18.000000
## 387     1.00           1.000000
## 388    36.00          36.000000
## 389       NA          28.935501
## 390    17.00          17.000000
## 391    36.00          36.000000
## 392    21.00          21.000000
## 393    28.00          28.000000
## 394    23.00          23.000000
## 395    24.00          24.000000
## 396    22.00          22.000000
## 397    31.00          31.000000
## 398    46.00          46.000000
## 399    23.00          23.000000
## 400    28.00          28.000000
## 401    39.00          39.000000
## 402    26.00          26.000000
## 403    21.00          21.000000
## 404    28.00          28.000000
## 405    20.00          20.000000
## 406    34.00          34.000000
## 407    51.00          51.000000
## 408     3.00           3.000000
## 409    21.00          21.000000
## 410       NA           5.255613
## 411       NA          28.935501
## 412       NA          28.935501
## 413    33.00          33.000000
## 414       NA          33.467588
## 415    44.00          44.000000
## 416       NA          28.935501
## 417    34.00          34.000000
## 418    18.00          18.000000
## 419    30.00          30.000000
## 420    10.00          10.000000
## 421       NA          28.935501
## 422    21.00          21.000000
## 423    29.00          29.000000
## 424    28.00          28.000000
## 425    18.00          18.000000
## 426       NA          28.935501
## 427    28.00          28.000000
## 428    19.00          19.000000
## 429       NA          28.935501
## 430    32.00          32.000000
## 431    28.00          28.000000
## 432       NA          23.695411
## 433    42.00          42.000000
## 434    17.00          17.000000
## 435    50.00          50.000000
## 436    14.00          14.000000
## 437    21.00          21.000000
## 438    24.00          24.000000
## 439    64.00          64.000000
## 440    31.00          31.000000
## 441    45.00          45.000000
## 442    20.00          20.000000
## 443    25.00          25.000000
## 444    28.00          28.000000
## 445       NA          27.099293
## 446     4.00           4.000000
## 447    13.00          13.000000
## 448    34.00          34.000000
## 449     5.00           5.000000
## 450    52.00          52.000000
## 451    36.00          36.000000
## 452       NA          27.447446
## 453    30.00          30.000000
## 454    49.00          49.000000
## 455       NA          28.935501
## 456    29.00          29.000000
## 457    65.00          65.000000
## 458       NA          37.533744
## 459    50.00          50.000000
## 460       NA          28.935501
## 461    48.00          48.000000
## 462    34.00          34.000000
## 463    47.00          47.000000
## 464    48.00          48.000000
## 465       NA          28.935501
## 466    38.00          38.000000
## 467       NA          33.467588
## 468    56.00          56.000000
## 469       NA          28.935501
## 470     0.75           0.750000
## 471       NA          28.935501
## 472    38.00          38.000000
## 473    33.00          33.000000
## 474    23.00          23.000000
## 475    22.00          22.000000
## 476       NA          45.034064
## 477    34.00          34.000000
## 478    29.00          29.000000
## 479    22.00          22.000000
## 480     2.00           2.000000
## 481     9.00           9.000000
## 482       NA          33.467588
## 483    50.00          50.000000
## 484    63.00          63.000000
## 485    25.00          25.000000
## 486       NA           5.255613
## 487    35.00          35.000000
## 488    58.00          58.000000
## 489    30.00          30.000000
## 490     9.00           9.000000
## 491       NA          27.447446
## 492    21.00          21.000000
## 493    55.00          55.000000
## 494    71.00          71.000000
## 495    21.00          21.000000
## 496       NA          28.935501
## 497    54.00          54.000000
## 498       NA          28.935501
## 499    25.00          25.000000
## 500    24.00          24.000000
## 501    17.00          17.000000
## 502    21.00          21.000000
## 503       NA          28.935501
## 504    37.00          37.000000
## 505    16.00          16.000000
## 506    18.00          18.000000
## 507    33.00          33.000000
## 508       NA          36.413750
## 509    28.00          28.000000
## 510    26.00          26.000000
## 511    29.00          29.000000
## 512       NA          28.935501
## 513    36.00          36.000000
## 514    54.00          54.000000
## 515    24.00          24.000000
## 516    47.00          47.000000
## 517    34.00          34.000000
## 518       NA          28.935501
## 519    36.00          36.000000
## 520    32.00          32.000000
## 521    30.00          30.000000
## 522    22.00          22.000000
## 523       NA          28.935501
## 524    44.00          44.000000
## 525       NA          28.935501
## 526    40.50          40.500000
## 527    50.00          50.000000
## 528       NA          45.034064
## 529    39.00          39.000000
## 530    23.00          23.000000
## 531     2.00           2.000000
## 532       NA          28.935501
## 533    17.00          17.000000
## 534       NA          15.536604
## 535    30.00          30.000000
## 536     7.00           7.000000
## 537    45.00          45.000000
## 538    30.00          30.000000
## 539       NA          28.935501
## 540    22.00          22.000000
## 541    36.00          36.000000
## 542     9.00           9.000000
## 543    11.00          11.000000
## 544    32.00          32.000000
## 545    50.00          50.000000
## 546    64.00          64.000000
## 547    19.00          19.000000
## 548       NA          32.117919
## 549    33.00          33.000000
## 550     8.00           8.000000
## 551    17.00          17.000000
## 552    27.00          27.000000
## 553       NA          28.935501
## 554    22.00          22.000000
## 555    22.00          22.000000
## 556    62.00          62.000000
## 557    48.00          48.000000
## 558       NA          45.034064
## 559    39.00          39.000000
## 560    36.00          36.000000
## 561       NA          28.935501
## 562    40.00          40.000000
## 563    28.00          28.000000
## 564       NA          28.935501
## 565       NA          28.935501
## 566    24.00          24.000000
## 567    19.00          19.000000
## 568    29.00          29.000000
## 569       NA          28.935501
## 570    32.00          32.000000
## 571    62.00          62.000000
## 572    53.00          53.000000
## 573    36.00          36.000000
## 574       NA          27.099293
## 575    16.00          16.000000
## 576    19.00          19.000000
## 577    34.00          34.000000
## 578    39.00          39.000000
## 579       NA          27.447446
## 580    32.00          32.000000
## 581    25.00          25.000000
## 582    39.00          39.000000
## 583    54.00          54.000000
## 584    36.00          36.000000
## 585       NA          28.935501
## 586    18.00          18.000000
## 587    47.00          47.000000
## 588    60.00          60.000000
## 589    22.00          22.000000
## 590       NA          28.935501
## 591    35.00          35.000000
## 592    52.00          52.000000
## 593    47.00          47.000000
## 594       NA          30.097261
## 595    37.00          37.000000
## 596    36.00          36.000000
## 597       NA          32.117919
## 598    49.00          49.000000
## 599       NA          28.935501
## 600    49.00          49.000000
## 601    24.00          24.000000
## 602       NA          28.935501
## 603       NA          45.034064
## 604    44.00          44.000000
## 605    35.00          35.000000
## 606    36.00          36.000000
## 607    30.00          30.000000
## 608    27.00          27.000000
## 609    22.00          22.000000
## 610    40.00          40.000000
## 611    39.00          39.000000
## 612       NA          28.935501
## 613       NA          23.695411
## 614       NA          28.935501
## 615    35.00          35.000000
## 616    24.00          24.000000
## 617    34.00          34.000000
## 618    26.00          26.000000
## 619     4.00           4.000000
## 620    26.00          26.000000
## 621    27.00          27.000000
## 622    42.00          42.000000
## 623    20.00          20.000000
## 624    21.00          21.000000
## 625    21.00          21.000000
## 626    61.00          61.000000
## 627    57.00          57.000000
## 628    21.00          21.000000
## 629    26.00          26.000000
## 630       NA          28.935501
## 631    80.00          80.000000
## 632    51.00          51.000000
## 633    32.00          32.000000
## 634       NA          45.034064
## 635     9.00           9.000000
## 636    28.00          28.000000
## 637    32.00          32.000000
## 638    31.00          31.000000
## 639    41.00          41.000000
## 640       NA          27.447446
## 641    20.00          20.000000
## 642    24.00          24.000000
## 643     2.00           2.000000
## 644       NA          27.099293
## 645     0.75           0.750000
## 646    48.00          48.000000
## 647    19.00          19.000000
## 648    56.00          56.000000
## 649       NA          28.935501
## 650    23.00          23.000000
## 651       NA          28.935501
## 652    18.00          18.000000
## 653    21.00          21.000000
## 654       NA          27.099293
## 655    18.00          18.000000
## 656    24.00          24.000000
## 657       NA          28.935501
## 658    32.00          32.000000
## 659    23.00          23.000000
## 660    58.00          58.000000
## 661    50.00          50.000000
## 662    40.00          40.000000
## 663    47.00          47.000000
## 664    36.00          36.000000
## 665    20.00          20.000000
## 666    32.00          32.000000
## 667    25.00          25.000000
## 668       NA          28.935501
## 669    43.00          43.000000
## 670       NA          37.533744
## 671    40.00          40.000000
## 672    31.00          31.000000
## 673    70.00          70.000000
## 674    31.00          31.000000
## 675       NA          33.467588
## 676    18.00          18.000000
## 677    24.50          24.500000
## 678    18.00          18.000000
## 679    43.00          43.000000
## 680    36.00          36.000000
## 681       NA          28.935501
## 682    27.00          27.000000
## 683    20.00          20.000000
## 684    14.00          14.000000
## 685    60.00          60.000000
## 686    25.00          25.000000
## 687    14.00          14.000000
## 688    19.00          19.000000
## 689    18.00          18.000000
## 690    15.00          15.000000
## 691    31.00          31.000000
## 692     4.00           4.000000
## 693       NA          27.099293
## 694    25.00          25.000000
## 695    60.00          60.000000
## 696    52.00          52.000000
## 697    44.00          44.000000
## 698       NA          27.099293
## 699    49.00          49.000000
## 700    42.00          42.000000
## 701    18.00          18.000000
## 702    35.00          35.000000
## 703    18.00          18.000000
## 704    25.00          25.000000
## 705    26.00          26.000000
## 706    39.00          39.000000
## 707    45.00          45.000000
## 708    42.00          42.000000
## 709    22.00          22.000000
## 710       NA          14.045910
## 711    24.00          24.000000
## 712       NA          45.034064
## 713    48.00          48.000000
## 714    29.00          29.000000
## 715    52.00          52.000000
## 716    19.00          19.000000
## 717    38.00          38.000000
## 718    27.00          27.000000
## 719       NA          28.935501
## 720    33.00          33.000000
## 721     6.00           6.000000
## 722    17.00          17.000000
## 723    34.00          34.000000
## 724    50.00          50.000000
## 725    27.00          27.000000
## 726    20.00          20.000000
## 727    30.00          30.000000
## 728       NA          27.099293
## 729    25.00          25.000000
## 730    25.00          25.000000
## 731    29.00          29.000000
## 732    11.00          11.000000
## 733       NA          33.467588
## 734    23.00          23.000000
## 735    23.00          23.000000
## 736    28.50          28.500000
## 737    48.00          48.000000
## 738    35.00          35.000000
## 739       NA          28.935501
## 740       NA          28.935501
## 741       NA          36.413750
## 742    36.00          36.000000
## 743    21.00          21.000000
## 744    24.00          24.000000
## 745    31.00          31.000000
## 746    70.00          70.000000
## 747    16.00          16.000000
## 748    30.00          30.000000
## 749    19.00          19.000000
## 750    31.00          31.000000
## 751     4.00           4.000000
## 752     6.00           6.000000
## 753    33.00          33.000000
## 754    23.00          23.000000
## 755    48.00          48.000000
## 756     0.67           0.670000
## 757    28.00          28.000000
## 758    18.00          18.000000
## 759    34.00          34.000000
## 760    33.00          33.000000
## 761       NA          28.935501
## 762    41.00          41.000000
## 763    20.00          20.000000
## 764    36.00          36.000000
## 765    16.00          16.000000
## 766    51.00          51.000000
## 767       NA          45.034064
## 768    30.50          30.500000
## 769       NA          27.447446
## 770    32.00          32.000000
## 771    24.00          24.000000
## 772    48.00          48.000000
## 773    57.00          57.000000
## 774       NA          28.935501
## 775    54.00          54.000000
## 776    18.00          18.000000
## 777       NA          28.935501
## 778     5.00           5.000000
## 779       NA          28.935501
## 780    43.00          43.000000
## 781    13.00          13.000000
## 782    17.00          17.000000
## 783    29.00          29.000000
## 784       NA          25.821523
## 785    25.00          25.000000
## 786    25.00          25.000000
## 787    18.00          18.000000
## 788     8.00           8.000000
## 789     1.00           1.000000
## 790    46.00          46.000000
## 791       NA          28.935501
## 792    16.00          16.000000
## 793       NA           9.825194
## 794       NA          45.034064
## 795    25.00          25.000000
## 796    39.00          39.000000
## 797    49.00          49.000000
## 798    31.00          31.000000
## 799    30.00          30.000000
## 800    30.00          30.000000
## 801    34.00          34.000000
## 802    31.00          31.000000
## 803    11.00          11.000000
## 804     0.42           0.420000
## 805    27.00          27.000000
## 806    31.00          31.000000
## 807    39.00          39.000000
## 808    18.00          18.000000
## 809    39.00          39.000000
## 810    33.00          33.000000
## 811    26.00          26.000000
## 812    39.00          39.000000
## 813    35.00          35.000000
## 814     6.00           6.000000
## 815    30.50          30.500000
## 816       NA          45.034064
## 817    23.00          23.000000
## 818    31.00          31.000000
## 819    43.00          43.000000
## 820    10.00          10.000000
## 821    52.00          52.000000
## 822    27.00          27.000000
## 823    38.00          38.000000
## 824    27.00          27.000000
## 825     2.00           2.000000
## 826       NA          28.935501
## 827       NA          28.935501
## 828     1.00           1.000000
## 829       NA          27.099293
## 830    62.00          62.000000
## 831    15.00          15.000000
## 832     0.83           0.830000
## 833       NA          28.935501
## 834    23.00          23.000000
## 835    18.00          18.000000
## 836    39.00          39.000000
## 837    21.00          21.000000
## 838       NA          28.935501
## 839    32.00          32.000000
## 840       NA          36.413750
## 841    20.00          20.000000
## 842    16.00          16.000000
## 843    30.00          30.000000
## 844    34.50          34.500000
## 845    17.00          17.000000
## 846    42.00          42.000000
## 847       NA           9.825194
## 848    35.00          35.000000
## 849    28.00          28.000000
## 850       NA          37.533744
## 851     4.00           4.000000
## 852    74.00          74.000000
## 853     9.00           9.000000
## 854    16.00          16.000000
## 855    44.00          44.000000
## 856    18.00          18.000000
## 857    45.00          45.000000
## 858    51.00          51.000000
## 859    24.00          24.000000
## 860       NA          28.935501
## 861    41.00          41.000000
## 862    21.00          21.000000
## 863    48.00          48.000000
## 864       NA           9.825194
## 865    24.00          24.000000
## 866    42.00          42.000000
## 867    27.00          27.000000
## 868    31.00          31.000000
## 869       NA          28.935501
## 870     4.00           4.000000
## 871    26.00          26.000000
## 872    47.00          47.000000
## 873    33.00          33.000000
## 874    47.00          47.000000
## 875    28.00          28.000000
## 876    15.00          15.000000
## 877    20.00          20.000000
## 878    19.00          19.000000
## 879       NA          28.935501
## 880    56.00          56.000000
## 881    25.00          25.000000
## 882    33.00          33.000000
## 883    22.00          22.000000
## 884    28.00          28.000000
## 885    25.00          25.000000
## 886    39.00          39.000000
## 887    27.00          27.000000
## 888    19.00          19.000000
## 889       NA          25.821523
## 890    26.00          26.000000
## 891    32.00          32.000000

There’s no option for different imputation techniques with Miss Forest, as it always uses the random forest algorithm.

Finally, let’s visualize the distributions:

h1 <- ggplot(missForest_imputed, aes(x = original)) +
  geom_histogram(fill = "#ad1538", color = "#000000", position = "identity") +
  ggtitle("Original distribution") +
  theme_classic()
h2 <- ggplot(missForest_imputed, aes(x = imputed_missForest)) +
  geom_histogram(fill = "#15ad4f", color = "#000000", position = "identity") +
  ggtitle("Random-forest-imputed distribution") +
  theme_classic()

plot_grid(h1, h2, nrow = 1, ncol = 2)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 177 rows containing non-finite outside the scale range
## (`stat_bin()`).
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

It looks like Miss Forest gravitated towards a near-constant imputation: a large share of the imputed values sits around 29 (28.935501 alone recurs throughout the output above). The resulting distribution is quite different from the original one, which suggests Miss Forest isn’t the best imputation technique we’ve seen today.
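
To put a number on how concentrated the imputed values are, one option is to tabulate only the cells that were originally missing and to compare the spread before and after imputation. The sketch below is a minimal illustration in base R. `demo_imputed` is a hypothetical stand-in with made-up values that only borrows the column names of `missForest_imputed`, so swap in the real data frame to check the actual results.

```r
# Hypothetical stand-in data (made-up values, NOT the tutorial's results).
# It only mirrors the column names of missForest_imputed.
demo_imputed <- data.frame(
  original = c(22, NA, 38, NA, 35, NA, 54, NA),
  imputed_missForest = c(22, 28.9, 38, 28.9, 35, 45.0, 54, 28.9)
)

# Keep only the cells that were missing in the original column,
# i.e. the values that were actually imputed.
was_missing <- is.na(demo_imputed$original)
imputed_only <- demo_imputed$imputed_missForest[was_missing]

# Frequency of each imputed value, most common first.
value_counts <- sort(table(round(imputed_only, 2)), decreasing = TRUE)
print(value_counts)

# Share of imputed cells taken up by the single most common value.
top_share <- as.numeric(value_counts[1]) / length(imputed_only)
print(top_share)

# Spread of the observed values versus the completed column.
# A noticeably smaller value after imputation hints at variance shrinkage.
sd_observed <- sd(demo_imputed$original, na.rm = TRUE)
sd_completed <- sd(demo_imputed$imputed_missForest)
print(c(observed = sd_observed, completed = sd_completed))
```

A high `top_share` together with a drop in standard deviation would support the reading that the imputation collapsed towards a few repeated values. Neither number proves the imputation is poor on its own, so treat them as a quick diagnostic alongside the histograms.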