Github repo | portfolio | Blog
The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consists of 214 glass samples labeled as one of several class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. The data can be accessed via:
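A minimal sketch of the loading step, assuming the `mlbench` package (which ships a copy of this data set under the name `Glass`) is installed:

```r
# Assumes mlbench is installed, e.g. via install.packages("mlbench")
library(mlbench)

data(Glass)   # loads a data frame named Glass into the workspace
str(Glass)    # prints the structure: one row per glass sample
```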
#> 'data.frame': 214 obs. of 10 variables:
#> $ RI : num 1.52 1.52 1.52 1.52 1.52 ...
#> $ Na : num 13.6 13.9 13.5 13.2 13.3 ...
#> $ Mg : num 4.49 3.6 3.55 3.69 3.62 3.61 3.6 3.61 3.58 3.6 ...
#> $ Al : num 1.1 1.36 1.54 1.29 1.24 1.62 1.14 1.05 1.37 1.36 ...
#> $ Si : num 71.8 72.7 73 72.6 73.1 ...
#> $ K : num 0.06 0.48 0.39 0.57 0.55 0.64 0.58 0.57 0.56 0.57 ...
#> $ Ca : num 8.75 7.83 7.78 8.22 8.07 8.07 8.17 8.24 8.3 8.4 ...
#> $ Ba : num 0 0 0 0 0 0 0 0 0 0 ...
#> $ Fe : num 0 0 0 0 0 0.26 0 0 0 0.11 ...
#> $ Type: Factor w/ 6 levels "1","2","3","5",..: 1 1 1 1 1 1 1 1 1 1 ...
a. Using visualizations, explore the predictor variables to understand their distributions as well as the relationships between predictors
I will start with a univariate analysis of each predictor to study its distribution.
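The univariate view can be sketched with base R histograms, one panel per predictor. This assumes the `Glass` data frame from `mlbench`; the 3 x 3 layout and the number of breaks are arbitrary choices, not necessarily the ones used for the original plots.

```r
library(mlbench)
data(Glass)

# Keep the nine numeric predictors, drop the class label
predictors <- Glass[, names(Glass) != "Type"]

op <- par(mfrow = c(3, 3))   # 3 x 3 grid, one panel per predictor
for (nm in names(predictors)) {
  hist(predictors[[nm]], main = nm, xlab = nm, breaks = 20)
}
par(op)                      # restore the previous plotting layout
```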
RI, Si, Na, Al, and Ca are approximately normally distributed. The remaining variables are either severely skewed with a long right tail or bimodal, as with Mg. We can consider normalizing or standardizing the data, or applying a transformation, so that the predictors behave more reliably when building the model.
The next type of visualization is multivariate, which reveals the relationships between the predictors. We will use a correlation heatmap based on the Pearson correlation method.
Most of the predictors are negatively correlated with each other.
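One way to check this claim numerically is to compute the Pearson correlation matrix and look at the share of predictor pairs with a negative coefficient. The sketch below assumes the `Glass` data from `mlbench` and uses base R's `heatmap()` for the plot; the original figure may have been drawn with a different plotting package.

```r
library(mlbench)
data(Glass)

predictors <- Glass[, names(Glass) != "Type"]

# Pearson correlation between every pair of predictors
cor_mat <- cor(predictors, method = "pearson")
round(cor_mat, 2)

# Share of distinct predictor pairs whose correlation is negative
pairs_r <- cor_mat[upper.tri(cor_mat)]
mean(pairs_r < 0)

# Plain heatmap of the matrix, without reordering or rescaling
heatmap(cor_mat, Rowv = NA, Colv = NA, symm = TRUE, scale = "none")
```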
b. Do there appear to be any outliers in the data? Are any predictors skewed?
For skewness: looking back at the univariate histograms, we can see that the majority of the variables are skewed with a long right tail. For outliers: these can be identified from the boxplots.
Yes, there are outliers in most of the predictors. As the scatter plot shows, the outliers form a cluster in the 15-20 range for Na and Ca.
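Skewness and outliers can also be summarized numerically. The sketch below assumes the `Glass` data from `mlbench`. The `skewness()` and `n_outliers()` helpers are hypothetical names written for this example: the first is the moment-based sample skewness, the second counts the points that `boxplot()` would draw beyond its whiskers (more than 1.5 times the IQR from the hinges).

```r
library(mlbench)
data(Glass)

predictors <- Glass[, names(Glass) != "Type"]

# Moment-based sample skewness (hypothetical helper, not from a package)
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Number of points beyond the boxplot whiskers (1.5 * IQR rule)
n_outliers <- function(x) length(boxplot.stats(x)$out)

summary_tab <- data.frame(
  skewness = sapply(predictors, skewness),
  outliers = sapply(predictors, n_outliers)
)

# Most strongly skewed predictors first
summary_tab[order(-abs(summary_tab$skewness)), ]

# Boxplots on a common (standardized) scale so all predictors fit one panel
boxplot(scale(predictors), las = 2, main = "Standardized Glass predictors")
```

A positive skewness value corresponds to a long right tail, so the table gives a quick cross-check of what the histograms suggest.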
c. Are there any relevant transformations of one or more predictors that might improve the classification model?
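Yes. I would consider a Box-Cox or log transformation of the skewed predictors. Because some predictors (for example Ba and Fe) contain zeros, these transformations need a small offset first, or a Yeo-Johnson transformation, which accepts zeros, can be used instead.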
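As a rough illustration of how a transformation could be evaluated, the sketch below applies `log1p()` (that is, log(1 + x), which is defined at zero) to every predictor and compares the skewness before and after. It assumes the `Glass` data from `mlbench`; the `skewness()` helper is a hypothetical function defined here, and `log1p` is only one possible choice, not necessarily the best transformation for each predictor.

```r
library(mlbench)
data(Glass)

predictors <- Glass[, names(Glass) != "Type"]

# Moment-based sample skewness (hypothetical helper, not from a package)
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Flag predictors that contain zeros, where log(x) would be undefined
has_zero <- sapply(predictors, function(x) any(x == 0))

# log1p(x) = log(1 + x) works for all non-negative values
logged <- as.data.frame(lapply(predictors, log1p))

data.frame(
  has_zero    = has_zero,
  skew_before = sapply(predictors, skewness),
  skew_after  = sapply(logged, skewness)
)
```

If the `caret` package is available, its `preProcess()` function offers `"BoxCox"` and `"YeoJohnson"` methods that estimate a transformation per predictor, which may be a better fit than a single fixed transform.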
The soybean data can also be found at the UC Irvine Machine Learning Repository. Data were collected to predict disease in 683 soybeans. The 35 predictors are mostly categorical and include information on the environmental conditions (e.g., temperature, precipitation) and plant conditions (e.g., leaf spots, mold growth). The outcome labels consist of 19 distinct classes. The data can be loaded via:
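A minimal sketch of the loading step, again assuming the `mlbench` package, which provides this data set as `Soybean`:

```r
# Assumes mlbench is installed, e.g. via install.packages("mlbench")
library(mlbench)

data(Soybean)   # loads a data frame named Soybean into the workspace
str(Soybean)    # prints the structure: the Class outcome plus 35 predictors
```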
#> 'data.frame': 683 obs. of 36 variables:
#> $ Class : Factor w/ 19 levels "2-4-d-injury",..: 11 11 11 11 11 11 11 11 11 11 ...
#> $ date : Factor w/ 7 levels "0","1","2","3",..: 7 5 4 4 7 6 6 5 7 5 ...
#> $ plant.stand : Ord.factor w/ 2 levels "0"<"1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ precip : Ord.factor w/ 3 levels "0"<"1"<"2": 3 3 3 3 3 3 3 3 3 3 ...
#> $ temp : Ord.factor w/ 3 levels "0"<"1"<"2": 2 2 2 2 2 2 2 2 2 2 ...
#> $ hail : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 1 1 ...
#> $ crop.hist : Factor w/ 4 levels "0","1","2","3": 2 3 2 2 3 4 3 2 4 3 ...
#> $ area.dam : Factor w/ 4 levels "0","1","2","3": 2 1 1 1 1 1 1 1 1 1 ...
#> $ sever : Factor w/ 3 levels "0","1","2": 2 3 3 3 2 2 2 2 2 3 ...
#> $ seed.tmt : Factor w/ 3 levels "0","1","2": 1 2 2 1 1 1 2 1 2 1 ...
#> $ germ : Ord.factor w/ 3 levels "0"<"1"<"2": 1 2 3 2 3 2 1 3 2 3 ...
#> $ plant.growth : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
#> $ leaves : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
#> $ leaf.halo : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
#> $ leaf.marg : Factor w/ 3 levels "0","1","2": 3 3 3 3 3 3 3 3 3 3 ...
#> $ leaf.size : Ord.factor w/ 3 levels "0"<"1"<"2": 3 3 3 3 3 3 3 3 3 3 ...
#> $ leaf.shread : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ leaf.malf : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ leaf.mild : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
#> $ stem : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
#> $ lodging : Factor w/ 2 levels "0","1": 2 1 1 1 1 1 2 1 1 1 ...
#> $ stem.cankers : Factor w/ 4 levels "0","1","2","3": 4 4 4 4 4 4 4 4 4 4 ...
#> $ canker.lesion : Factor w/ 4 levels "0","1","2","3": 2 2 1 1 2 1 2 2 2 2 ...
#> $ fruiting.bodies: Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
#> $ ext.decay : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
#> $ mycelium : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ int.discolor : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
#> $ sclerotia : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ fruit.pods : Factor w/ 4 levels "0","1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
#> $ fruit.spots : Factor w/ 4 levels "0","1","2","4": 4 4 4 4 4 4 4 4 4 4 ...
#> $ seed : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ mold.growth : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ seed.discolor : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ seed.size : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ shriveling : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
#> $ roots : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
a. Investigate the frequency distributions for the categorical predictors. Are any of the distributions degenerate in the ways discussed earlier in this chapter?
#> date plant.stand precip temp hail crop.hist area.dam sever seed.tmt germ
#> 1 6 0 2 1 0 1 1 1 0 0
#> 2 4 0 2 1 0 2 0 2 1 1
#> 3 3 0 2 1 0 1 0 2 1 2
#> 4 3 0 2 1 0 1 0 2 0 1
#> 5 6 0 2 1 0 2 0 1 0 2
#> 6 5 0 2 1 0 3 0 1 0 1
#> plant.growth leaves leaf.halo leaf.marg leaf.size leaf.shread leaf.malf
#> 1 1 1 0 2 2 0 0
#> 2 1 1 0 2 2 0 0
#> 3 1 1 0 2 2 0 0
#> 4 1 1 0 2 2 0 0
#> 5 1 1 0 2 2 0 0
#> 6 1 1 0 2 2 0 0
#> leaf.mild stem lodging stem.cankers canker.lesion fruiting.bodies ext.decay
#> 1 0 1 1 3 1 1 1
#> 2 0 1 0 3 1 1 1
#> 3 0 1 0 3 0 1 1
#> 4 0 1 0 3 0 1 1
#> 5 0 1 0 3 1 1 1
#> 6 0 1 0 3 0 1 1
#> mycelium int.discolor sclerotia fruit.pods fruit.spots seed mold.growth
#> 1 0 0 0 0 4 0 0
#> 2 0 0 0 0 4 0 0
#> 3 0 0 0 0 4 0 0
#> 4 0 0 0 0 4 0 0
#> 5 0 0 0 0 4 0 0
#> 6 0 0 0 0 4 0 0
#> seed.discolor seed.size shriveling roots
#> 1 0 0 0 0
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
#> 6 0 0 0 0
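The first rows alone do not show whether a distribution is degenerate, so a frequency-based check is useful here. The sketch below assumes the `Soybean` data from `mlbench`. The `degeneracy()` helper is a hypothetical function written for this example, and the thresholds (a frequency ratio above roughly 20 and fewer than 10% unique values) are rule-of-thumb assumptions rather than fixed cutoffs.

```r
library(mlbench)
data(Soybean)

predictors <- Soybean[, names(Soybean) != "Class"]

# Frequency table for a single predictor, with missing values shown
table(predictors$mycelium, useNA = "ifany")

# Hypothetical helper: two simple degeneracy metrics for one predictor
degeneracy <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)   # table() drops NAs by default
  c(
    # most common level count divided by second most common level count
    freq_ratio = if (length(counts) > 1) counts[[1]] / counts[[2]] else Inf,
    # observed levels as a percentage of the number of rows
    pct_unique = 100 * sum(counts > 0) / length(x)
  )
}

metrics <- as.data.frame(t(sapply(predictors, degeneracy)))

# Predictors dominated by a single level
metrics[metrics$freq_ratio > 20 & metrics$pct_unique < 10, ]
```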
b. Roughly 18% of the data are missing. Are there particular predictors that are more likely to be missing? Is the pattern of missing data related to the classes?
I will start by counting the missing values for each predictor in the Soybean data.
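A minimal sketch of this count, assuming the `Soybean` data from `mlbench`:

```r
library(mlbench)
data(Soybean)

# Number of missing values per predictor (the Class column is excluded)
na_counts <- colSums(is.na(Soybean[, names(Soybean) != "Class"]))
sort(na_counts, decreasing = TRUE)

# Share of rows that have at least one missing value
mean(!complete.cases(Soybean))
```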
#> hail sever seed.tmt lodging germ
#> 121 121 121 121 112
#> leaf.mild fruiting.bodies fruit.spots seed.discolor shriveling
#> 108 106 106 106 106
#> leaf.shread seed mold.growth seed.size leaf.halo
#> 100 92 92 92 84
#> leaf.marg leaf.size leaf.malf fruit.pods precip
#> 84 84 84 84 38
#> stem.cankers canker.lesion ext.decay mycelium int.discolor
#> 38 38 38 38 38
#> sclerotia plant.stand roots temp crop.hist
#> 38 36 31 30 16
#> plant.growth stem date area.dam leaves
#> 16 16 1 1 0
Predictor | 2-4-d-injury | alternarialeaf-spot | anthracnose | bacterial-blight | bacterial-pustule | brown-spot | brown-stem-rot | charcoal-rot | cyst-nematode | diaporthe-pod-&-stem-blight | diaporthe-stem-canker | downy-mildew | frog-eye-leaf-spot | herbicide-injury | phyllosticta-leaf-spot | phytophthora-rot | powdery-mildew | purple-seed-stain | rhizoctonia-root-rot |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hail | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
sever | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
seed.tmt | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
lodging | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
germ | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 6 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
leaf.mild | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 8 | 0 | 55 | 0 | 0 | 0 |
fruiting.bodies | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
fruit.spots | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
seed.discolor | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
shriveling | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
leaf.shread | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 0 | 0 | 55 | 0 | 0 | 0 |
seed | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
mold.growth | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
seed.size | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 68 | 0 | 0 | 0 |
leaf.halo | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 0 | 0 | 55 | 0 | 0 | 0 |
leaf.marg | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 0 | 0 | 55 | 0 | 0 | 0 |
leaf.size | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 0 | 0 | 55 | 0 | 0 | 0 |
leaf.malf | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 15 | 0 | 0 | 0 | 0 | 0 | 55 | 0 | 0 | 0 |
fruit.pods | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 68 | 0 | 0 | 0 |
precip | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
stem.cankers | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
canker.lesion | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
ext.decay | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
mycelium | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
int.discolor | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
sclerotia | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 |
plant.stand | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
roots | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
temp | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
crop.hist | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
plant.growth | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
stem | 16 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
date | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
area.dam | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
leaves | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Each cell is the number of missing values for that predictor within that class.
From this table, it seems that several predictors are missing in the same rows and therefore share the same distribution of missing values across classes. All of the missing values fall in just five classes (2-4-d-injury, cyst-nematode, diaporthe-pod-&-stem-blight, herbicide-injury, and phytophthora-rot), and they are concentrated in phytophthora-rot. For example, of the 121 missing values for the predictor hail, 68 (56%) belong to phytophthora-rot. This indicates “informative missingness”, which can induce significant bias in the model.
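A table of this kind can be built by splitting the predictors by class and counting the missing values in each piece. The sketch below assumes the `Soybean` data from `mlbench`; `na_by_class` is a name chosen for this example.

```r
library(mlbench)
data(Soybean)

predictors <- Soybean[, names(Soybean) != "Class"]

# Rows = predictors, columns = classes, cells = number of missing values
na_by_class <- sapply(
  split(predictors, Soybean$Class),
  function(d) colSums(is.na(d))
)

# Show only the classes that have at least one missing value
na_by_class[, colSums(na_by_class) > 0]

# Share of the missing hail values that belong to phytophthora-rot
na_by_class["hail", "phytophthora-rot"] / sum(na_by_class["hail", ])
```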
c. Develop a strategy for handling missing data, either by eliminating predictors or imputation.
#> [1] "Eliminated 68 rows."
#> [1] "615 rows remaining."
#> [1] "53 rows still contain missing values."
#> [1] "Filling 1 missing values for feature: date ."
#> [1] "The most frequent factor of this feature is: 5 , which is 24.27 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 36 missing values for feature: plant.stand ."
#> [1] "The most frequent factor of this feature is: 0 , which is 61.14 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: precip ."
#> [1] "The most frequent factor of this feature is: 2 , which is 72.27 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 30 missing values for feature: temp ."
#> [1] "The most frequent factor of this feature is: 1 , which is 57.09 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 53 missing values for feature: hail ."
#> [1] "The most frequent factor of this feature is: 0 , which is 77.4 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 16 missing values for feature: crop.hist ."
#> [1] "The most frequent factor of this feature is: 3 , which is 32.39 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 1 missing values for feature: area.dam ."
#> [1] "The most frequent factor of this feature is: 3 , which is 30.46 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 53 missing values for feature: sever ."
#> [1] "The most frequent factor of this feature is: 1 , which is 57.3 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 53 missing values for feature: seed.tmt ."
#> [1] "The most frequent factor of this feature is: 0 , which is 54.27 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 44 missing values for feature: germ ."
#> [1] "The most frequent factor of this feature is: 1 , which is 37.3 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 16 missing values for feature: plant.growth ."
#> [1] "The most frequent factor of this feature is: 0 , which is 73.62 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 0 missing values for feature: leaves ."
#> [1] "The most frequent factor of this feature is: 1 , which is 87.48 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 29 missing values for feature: leaf.halo ."
#> [1] "The most frequent factor of this feature is: 2 , which is 58.36 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 29 missing values for feature: leaf.marg ."
#> [1] "The most frequent factor of this feature is: 0 , which is 60.92 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 29 missing values for feature: leaf.size ."
#> [1] "The most frequent factor of this feature is: 1 , which is 55.8 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 45 missing values for feature: leaf.shread ."
#> [1] "The most frequent factor of this feature is: 0 , which is 83.16 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 29 missing values for feature: leaf.malf ."
#> [1] "The most frequent factor of this feature is: 0 , which is 92.32 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 53 missing values for feature: leaf.mild ."
#> [1] "The most frequent factor of this feature is: 0 , which is 92.88 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 16 missing values for feature: stem ."
#> [1] "The most frequent factor of this feature is: 1 , which is 50.58 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 53 missing values for feature: lodging ."
#> [1] "The most frequent factor of this feature is: 0 , which is 92.53 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: stem.cankers ."
#> [1] "The most frequent factor of this feature is: 0 , which is 64.64 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: canker.lesion ."
#> [1] "The most frequent factor of this feature is: 0 , which is 55.46 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: fruiting.bodies ."
#> [1] "The most frequent factor of this feature is: 0 , which is 81.98 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: ext.decay ."
#> [1] "The most frequent factor of this feature is: 0 , which is 76.6 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: mycelium ."
#> [1] "The most frequent factor of this feature is: 0 , which is 98.96 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: int.discolor ."
#> [1] "The most frequent factor of this feature is: 0 , which is 88.91 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: sclerotia ."
#> [1] "The most frequent factor of this feature is: 0 , which is 96.53 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 16 missing values for feature: fruit.pods ."
#> [1] "The most frequent factor of this feature is: 0 , which is 67.95 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: fruit.spots ."
#> [1] "The most frequent factor of this feature is: 0 , which is 59.79 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 24 missing values for feature: seed ."
#> [1] "The most frequent factor of this feature is: 0 , which is 80.54 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 24 missing values for feature: mold.growth ."
#> [1] "The most frequent factor of this feature is: 0 , which is 88.66 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: seed.discolor ."
#> [1] "The most frequent factor of this feature is: 0 , which is 88.91 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 24 missing values for feature: seed.size ."
#> [1] "The most frequent factor of this feature is: 0 , which is 90.02 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 38 missing values for feature: shriveling ."
#> [1] "The most frequent factor of this feature is: 0 , which is 93.41 % of the class."
#> [1] "------------------------------------------------"
#> [1] "Filling 31 missing values for feature: roots ."
#> [1] "The most frequent factor of this feature is: 0 , which is 94.35 % of the class."
#> [1] "------------------------------------------------"
#> [1] "There are now 615 rows. 0 rows have missing values."
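The log above is consistent with a two-step strategy: first remove a block of heavily incomplete rows, then fill the remaining gaps in each predictor with its most frequent level. The sketch below is one way to do that, assuming the `Soybean` data from `mlbench`. The row-removal rule is an assumption: the write-up does not state how the 68 rows were chosen, and dropping the phytophthora-rot rows with a missing `hail` value is simply one rule that matches that count in the table from part b. `impute_mode()` is a hypothetical helper.

```r
library(mlbench)
data(Soybean)

# Step 1 (assumed rule): drop phytophthora-rot rows where hail is missing
drop <- Soybean$Class == "phytophthora-rot" & is.na(Soybean$hail)
soy  <- Soybean[!drop, ]

sum(drop)                    # rows eliminated
nrow(soy)                    # rows remaining
sum(!complete.cases(soy))    # rows that still contain missing values

# Step 2: replace each remaining NA with the most frequent level
impute_mode <- function(x) {
  mode_level <- names(which.max(table(x)))   # table() ignores NAs
  x[is.na(x)] <- mode_level
  x
}

predictor_names <- setdiff(names(soy), "Class")
soy[predictor_names] <- lapply(soy[predictor_names], impute_mode)

anyNA(soy)                   # FALSE once every gap has been filled
```

Mode imputation is simple, but because the missingness here is tied to the class, a model-based imputation or an explicit "missing" level may be worth comparing against it.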