##3.1

The UC Irvine Machine Learning Repository contains a dataset related to glass identification. The data consists of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe.

library(mlbench)
## Warning: package 'mlbench' was built under R version 4.3.3
library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.3.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(GGally)
## Warning: package 'GGally' was built under R version 4.3.3
## Registered S3 method overwritten by 'GGally':
##   method from   
##   +.gg   ggplot2
data(Glass)
str(Glass)
## 'data.frame':    214 obs. of  10 variables:
##  $ RI  : num  1.52 1.52 1.52 1.52 1.52 ...
##  $ Na  : num  13.6 13.9 13.5 13.2 13.3 ...
##  $ Mg  : num  4.49 3.6 3.55 3.69 3.62 3.61 3.6 3.61 3.58 3.6 ...
##  $ Al  : num  1.1 1.36 1.54 1.29 1.24 1.62 1.14 1.05 1.37 1.36 ...
##  $ Si  : num  71.8 72.7 73 72.6 73.1 ...
##  $ K   : num  0.06 0.48 0.39 0.57 0.55 0.64 0.58 0.57 0.56 0.57 ...
##  $ Ca  : num  8.75 7.83 7.78 8.22 8.07 8.07 8.17 8.24 8.3 8.4 ...
##  $ Ba  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Fe  : num  0 0 0 0 0 0.26 0 0 0 0.11 ...
##  $ Type: Factor w/ 6 levels "1","2","3","5",..: 1 1 1 1 1 1 1 1 1 1 ...
?Glass
(a) Using visualizations, explore the predictor variables to understand their distributions as well as the relationships between predictors.
# Load necessary libraries
library(mlbench)
library(ggplot2)

data(Glass)
predictors <- Glass[, 1:9] # Columns 1 to 9 are the predictors: the refractive index (RI) and the eight element percentages

# Plot histograms 
ggplot(reshape2::melt(predictors), aes(value)) + 
  geom_histogram(bins = 30, color = "black") +
  facet_wrap(~variable, scales = "free", ncol = 3) +
  ggtitle("Histograms of Glass Predictors")
## No id variables; using all as measure variables

ggplot(reshape2::melt(predictors), aes(x = variable, y = value)) + 
  geom_boxplot(fill = "lightgreen") +
  facet_wrap(~variable, scales = "free", ncol = 3) +
  theme_minimal() +
  ggtitle("Boxplots of Glass Predictors")
## No id variables; using all as measure variables
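The question also asks about relationships between predictors, which histograms and boxplots do not show. One possible way to look at this is a correlation matrix. The sketch below uses a small simulated data frame (`toy`, with made-up columns `p1` to `p3`) rather than the glass data, and the 0.75 cutoff is an arbitrary assumption chosen only for illustration.

```r
set.seed(42)
n <- 100
base <- rnorm(n)

# Simulated stand-in for a set of numeric predictors (not the Glass data)
toy <- data.frame(
  p1 = base,
  p2 = base + rnorm(n, sd = 0.1),  # constructed to track p1 closely
  p3 = rnorm(n)                    # drawn independently of p1 and p2
)

# Pairwise Pearson correlations between all columns
cor_matrix <- cor(toy)

# List each pair once (upper triangle) whose absolute correlation exceeds 0.75
high <- which(abs(cor_matrix) > 0.75 & upper.tri(cor_matrix), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_matrix)[high[, "row"]],
  var2 = colnames(cor_matrix)[high[, "col"]],
  r    = cor_matrix[high]
)
print(high_pairs)
```

With the real data, the same calls could presumably be applied to `Glass[, 1:9]` in place of `toy`.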

(b) Do there appear to be any outliers in the data? Are any predictors skewed?

From the histograms and the box plots, we can see that several predictors are skewed, such as K, Ba, and Fe. The box plots also show outliers in many of the predictors.
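The points that a boxplot draws beyond its whiskers can also be counted directly, which makes "many predictors with outliers" easier to compare across columns. This is only a sketch: `count_boxplot_outliers` is a hypothetical helper name, and `toy` is made-up data standing in for the predictors. `boxplot.stats()` is the base R function that applies the 1.5 x IQR rule used for the whiskers.

```r
# Hypothetical helper: number of points flagged by the boxplot rule in each column
count_boxplot_outliers <- function(df) {
  vapply(df, function(x) length(boxplot.stats(x)$out), integer(1))
}

# Made-up data: column a has one extreme value, column b has none
toy <- data.frame(
  a = c(1:9, 100),
  b = 1:10
)

outlier_counts <- count_boxplot_outliers(toy)
print(outlier_counts)
```

A point flagged this way is only unusual relative to the bulk of its column; in a strongly skewed predictor the rule can flag a large share of legitimate values.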

(c) Are there any relevant transformations of one or more predictors that might improve the classification model?

I believe the biggest downside of the dataset is that the chemical analysis produces a large number of small values and a few larger values, which shows up as right-skewed distributions. To fix this, we can use a log transformation to reduce the skewness and make the distributions more even. Because Ba and Fe contain exact zeros (visible in the str() output above), the log has to be applied with an offset, for example log(x + 1).
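To illustrate how a log transformation affects a right-skewed variable that contains zeros, here is a minimal sketch on simulated data. The vector `skewed` and the helper `sample_skewness` are assumptions made up for this example; they are not taken from the glass data.

```r
# Sample skewness: third central moment divided by the 1.5 power of the variance
sample_skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

set.seed(1)
# Simulated stand-in for a right-skewed predictor with many exact zeros
skewed <- c(rep(0, 40), rlnorm(160))

# log1p(x) computes log(1 + x), so zeros map to 0 instead of -Inf
transformed <- log1p(skewed)

skew_before <- sample_skewness(skewed)
skew_after  <- sample_skewness(transformed)
print(c(before = skew_before, after = skew_after))
```

On the real data, the same idea would be something like `log1p(Glass$Ba)`. Whether it actually helps a classifier would still have to be checked by comparing model performance.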

##3.2

The soybean data is available at the UC Irvine Machine Learning Repository. The dataset contains information collected to predict disease in 683 soybean samples. There are 35 predictors, most of which are categorical, and they include details about environmental conditions (e.g., temperature, precipitation) and plant conditions (e.g., leaf spots, mold growth). The outcome variable consists of 19 distinct classes representing different diseases.

(a) Investigate the frequency distributions for the categorical predictors. Are any of the distributions degenerate in the ways discussed earlier in this chapter?

To investigate the frequency distribution of each variable in the dataset, we convert the columns to factors and draw a bar chart for each one.

data("Soybean")
Soybean <- as.data.frame(Soybean)

# Ensure relevant columns are converted to factors
categorical_columns <- c("Class", "date", "plant.stand", "precip", "temp", "hail",
                         "crop.hist", "area.dam", "sever", "seed.tmt", "germ",
                         "plant.growth", "leaves", "leaf.halo", "leaf.marg", 
                         "leaf.size", "leaf.shread", "leaf.malf", "leaf.mild",
                         "stem", "lodging", "stem.cankers", "canker.lesion",
                         "fruiting.bodies", "ext.decay", "mycelium", 
                         "int.discolor", "sclerotia", "fruit.pods", 
                         "fruit.spots", "seed", "mold.growth", 
                         "seed.discolor", "seed.size", "shriveling", "roots")

# Convert relevant columns to factors
Soybean[categorical_columns] <- lapply(Soybean[categorical_columns], as.factor)

# Check the structure of the dataset
str(Soybean)
## 'data.frame':    683 obs. of  36 variables:
##  $ Class          : Factor w/ 19 levels "2-4-d-injury",..: 11 11 11 11 11 11 11 11 11 11 ...
##  $ date           : Factor w/ 7 levels "0","1","2","3",..: 7 5 4 4 7 6 6 5 7 5 ...
##  $ plant.stand    : Ord.factor w/ 2 levels "0"<"1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ precip         : Ord.factor w/ 3 levels "0"<"1"<"2": 3 3 3 3 3 3 3 3 3 3 ...
##  $ temp           : Ord.factor w/ 3 levels "0"<"1"<"2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ hail           : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 2 1 1 ...
##  $ crop.hist      : Factor w/ 4 levels "0","1","2","3": 2 3 2 2 3 4 3 2 4 3 ...
##  $ area.dam       : Factor w/ 4 levels "0","1","2","3": 2 1 1 1 1 1 1 1 1 1 ...
##  $ sever          : Factor w/ 3 levels "0","1","2": 2 3 3 3 2 2 2 2 2 3 ...
##  $ seed.tmt       : Factor w/ 3 levels "0","1","2": 1 2 2 1 1 1 2 1 2 1 ...
##  $ germ           : Ord.factor w/ 3 levels "0"<"1"<"2": 1 2 3 2 3 2 1 3 2 3 ...
##  $ plant.growth   : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
##  $ leaves         : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
##  $ leaf.halo      : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
##  $ leaf.marg      : Factor w/ 3 levels "0","1","2": 3 3 3 3 3 3 3 3 3 3 ...
##  $ leaf.size      : Ord.factor w/ 3 levels "0"<"1"<"2": 3 3 3 3 3 3 3 3 3 3 ...
##  $ leaf.shread    : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ leaf.malf      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ leaf.mild      : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
##  $ stem           : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
##  $ lodging        : Factor w/ 2 levels "0","1": 2 1 1 1 1 1 2 1 1 1 ...
##  $ stem.cankers   : Factor w/ 4 levels "0","1","2","3": 4 4 4 4 4 4 4 4 4 4 ...
##  $ canker.lesion  : Factor w/ 4 levels "0","1","2","3": 2 2 1 1 2 1 2 2 2 2 ...
##  $ fruiting.bodies: Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
##  $ ext.decay      : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
##  $ mycelium       : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ int.discolor   : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
##  $ sclerotia      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ fruit.pods     : Factor w/ 4 levels "0","1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
##  $ fruit.spots    : Factor w/ 4 levels "0","1","2","4": 4 4 4 4 4 4 4 4 4 4 ...
##  $ seed           : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ mold.growth    : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ seed.discolor  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ seed.size      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ shriveling     : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ roots          : Factor w/ 3 levels "0","1","2": 1 1 1 1 1 1 1 1 1 1 ...
# Create a bar chart for each categorical predictor
for (col in categorical_columns) {
  p <- ggplot(Soybean, aes(x = .data[[col]])) +  # .data[[col]] looks up the column named by col
    geom_bar(fill = "lightblue", color = "black") +
    theme_minimal() +
    labs(title = paste("Frequency Distribution of", col),
         x = col,
         y = "Frequency") +
    theme(axis.text.x = element_text(angle = 45, hjust = 1))
  
  # Print the plot
  print(p)
  
  # Optionally, save the plot to a file
  ggsave(filename = paste0("Frequency_Distribution_", col, ".png"), plot = p, width = 8, height = 4) # Save plots
}
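Besides reading the bar charts, degeneracy can be screened numerically. A common rule of thumb flags a predictor when one value dominates (a high ratio of the most frequent to the second most frequent value) and the share of unique values is low. The sketch below assumes cutoffs of 20 for the ratio and 10% for unique values; `degeneracy_stats` is a hypothetical helper and `toy` is made-up data, not the soybean set.

```r
# Hypothetical helper: frequency ratio and percent of unique values for one column
degeneracy_stats <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)  # table() drops NA by default
  counts <- counts[counts > 0]                 # ignore factor levels that never occur
  freq_ratio <- if (length(counts) > 1) counts[[1]] / counts[[2]] else Inf
  pct_unique <- 100 * length(counts) / length(x)
  c(freq_ratio = freq_ratio, pct_unique = pct_unique)
}

# Made-up data: one heavily unbalanced factor and one balanced factor
toy <- data.frame(
  mostly_one = factor(c(rep("0", 97), rep("1", 3))),
  balanced   = factor(rep(c("0", "1"), 50))
)

# One row per predictor, one column per statistic
stats <- t(sapply(toy, degeneracy_stats))

# Assumed cutoffs: ratio above 20 and fewer than 10% unique values
flagged <- stats[, "freq_ratio"] > 20 & stats[, "pct_unique"] < 10
print(stats)
print(flagged)
```

Applied to the soybean columns, this would give a ranked list to check against the bar charts rather than a final decision about which predictors to drop.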

(b) Roughly 18% of the data are missing. Are there particular predictors that are more likely to be missing? Is the pattern of missing data related to the classes?

We can identify which variables have the most missing data using the code below.
missing_data_summary <- colSums(is.na(Soybean))
missing_percentage <- (missing_data_summary / nrow(Soybean)) * 100
missing_df <- data.frame(Predictor = names(missing_percentage), 
                         Missing_Percentage = missing_percentage)
missing_df_sorted <- missing_df[order(-missing_df$Missing_Percentage), ]
print(missing_df_sorted)
##                       Predictor Missing_Percentage
## hail                       hail         17.7159590
## sever                     sever         17.7159590
## seed.tmt               seed.tmt         17.7159590
## lodging                 lodging         17.7159590
## germ                       germ         16.3982430
## leaf.mild             leaf.mild         15.8125915
## fruiting.bodies fruiting.bodies         15.5197657
## fruit.spots         fruit.spots         15.5197657
## seed.discolor     seed.discolor         15.5197657
## shriveling           shriveling         15.5197657
## leaf.shread         leaf.shread         14.6412884
## seed                       seed         13.4699854
## mold.growth         mold.growth         13.4699854
## seed.size             seed.size         13.4699854
## leaf.halo             leaf.halo         12.2986823
## leaf.marg             leaf.marg         12.2986823
## leaf.size             leaf.size         12.2986823
## leaf.malf             leaf.malf         12.2986823
## fruit.pods           fruit.pods         12.2986823
## precip                   precip          5.5636896
## stem.cankers       stem.cankers          5.5636896
## canker.lesion     canker.lesion          5.5636896
## ext.decay             ext.decay          5.5636896
## mycelium               mycelium          5.5636896
## int.discolor       int.discolor          5.5636896
## sclerotia             sclerotia          5.5636896
## plant.stand         plant.stand          5.2708638
## roots                     roots          4.5387994
## temp                       temp          4.3923865
## crop.hist             crop.hist          2.3426061
## plant.growth       plant.growth          2.3426061
## stem                       stem          2.3426061
## date                       date          0.1464129
## area.dam               area.dam          0.1464129
## Class                     Class          0.0000000
## leaves                   leaves          0.0000000
# 1. Create a data frame indicating missingness (Class is left unchanged so it can be used for grouping)
missing_data <- Soybean %>%
  mutate(across(-Class, ~ is.na(.x)))

# 2. Summarize the counts of missing values by class
missing_counts <- missing_data %>%
  group_by(Class) %>%
  summarise(across(everything(), sum)) %>%
  pivot_longer(-Class, names_to = "Predictor", values_to = "Missing_Count")
# 3. Create a bar plot for missing counts across classes
ggplot(missing_counts, aes(x = Predictor, y = Missing_Count, fill = Class)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Counts of Missing Values Across Classes",
       x = "Predictor",
       y = "Missing Count") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
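A compact complement to the plot is the share of incomplete rows within each class, which speaks directly to whether missingness is related to the class. The sketch below runs on a made-up data frame (`toy`, with hypothetical columns `x1` and `x2`) and uses only base R.

```r
# Toy stand-in for the soybean data: one class column and two predictors
toy <- data.frame(
  Class = c("a", "a", "a", "b", "b", "b"),
  x1    = c(1, 2, 3, NA, NA, 6),
  x2    = c("u", "v", "u", NA, "v", NA)
)

# TRUE for every row that has at least one missing predictor
predictor_cols <- toy[, setdiff(names(toy), "Class")]
has_missing <- !complete.cases(predictor_cols)

# Share of incomplete rows within each class
missing_rate_by_class <- tapply(has_missing, toy$Class, mean)
print(missing_rate_by_class)
```

In this toy example every incomplete row belongs to class `b`. On the real data, rates that differ sharply between classes would suggest that the missingness is informative and not random.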

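It looks like some predictors have more missing data than others: hail, sever, seed.tmt, and lodging are each missing in about 17.7% of the samples, while Class and leaves have no missing values.

(c) Develop a strategy for handling missing data, either by eliminating predictors or imputation.

There are several approaches to dealing with missing data, such as KNN and mean imputation. In this case, imputing each missing value from the observed values of the same variable should be sufficient, since plants of the same origin normally have similar predictor values. No predictor is missing in more than 50% of the samples, so none is eliminated. The mice package is used with predictive mean matching (pmm), and its summary lets us inspect which imputation method was applied to each predictor that had missing values.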
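Since the soybean predictors are categorical, the closest analogue of the mean imputation mentioned above is filling each missing entry with the most frequent level of that predictor. The following is a minimal baseline sketch, not the method used in this report: `impute_mode` is a hypothetical helper, and `toy` is made-up data that only borrows two column names from the soybean set.

```r
# Hypothetical helper: replace NA in a factor with its most frequent level
impute_mode <- function(x) {
  mode_level <- names(which.max(table(x)))  # table() ignores NA by default
  x[is.na(x)] <- mode_level
  x
}

# Made-up factors with a few missing entries
toy <- data.frame(
  hail  = factor(c("0", "0", "1", NA, "0", NA)),
  sever = factor(c("1", NA, "1", "2", "1", "1"))
)

# Apply the helper column by column and confirm that no NA remains
imputed <- as.data.frame(lapply(toy, impute_mode))
print(colSums(is.na(imputed)))
```

A single mode fill ignores the relationships between predictors, which is one reason a model-based approach such as predictive mean matching may be preferable.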

library(mice)
## Warning: package 'mice' was built under R version 4.3.3
## Warning in check_dep_version(): ABI version mismatch: 
## lme4 was built with Matrix ABI version 1
## Current Matrix ABI version is 0
## Please re-install lme4 from source or restore original 'Matrix' package
## 
## Attaching package: 'mice'
## The following object is masked from 'package:stats':
## 
##     filter
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
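# Percentage of missing values in each column
missing_percent <- sapply(Soybean, function(x) sum(is.na(x)) / length(x) * 100)
# Drop predictors missing in 50% or more of the samples (no column reaches this here)
threshold <- 50
filtered_data <- Soybean[, which(missing_percent < threshold)]
# Impute the remaining missing values with predictive mean matching
imputed_data <- mice(filtered_data, m = 5, method = 'pmm', maxit = 5, seed = 1234)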
## 
##  iter imp variable
##   1   1  date  plant.stand*  precip*  temp*  hail*  crop.hist  area.dam  sever*  seed.tmt*  germ*  plant.growth  leaf.halo  leaf.marg  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   1   2  date  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam  sever*  seed.tmt*  germ*  plant.growth  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   1   3  date  plant.stand  precip*  temp*  hail*  crop.hist  area.dam  sever*  seed.tmt*  germ*  plant.growth  leaf.halo  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   1   4  date  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   1   5  date  plant.stand*  precip*  temp*  hail*  crop.hist  area.dam  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   2   1  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   2   2  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   2   3  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   2   4  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   2   5  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   3   1  date*  plant.stand*  precip*  temp*  hail*  crop.hist  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   3   2  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   3   3  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   3   4  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   3   5  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   4   1  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   4   2  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   4   3  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   4   4  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   4   5  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   5   1  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   5   2  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   5   3  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   5   4  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
##   5   5  date*  plant.stand*  precip*  temp*  hail*  crop.hist*  area.dam*  sever*  seed.tmt*  germ*  plant.growth*  leaf.halo*  leaf.marg*  leaf.size*  leaf.shread*  leaf.malf*  leaf.mild*  stem*  lodging*  stem.cankers*  canker.lesion*  fruiting.bodies*  ext.decay*  mycelium*  int.discolor*  sclerotia*  fruit.pods*  fruit.spots*  seed*  mold.growth*  seed.discolor*  seed.size*  shriveling*  roots*
## Warning: Number of logged events: 1667
summary(imputed_data)
## Class: mids
## Number of multiple imputations:  5 
## Imputation methods:
##           Class            date     plant.stand          precip            temp 
##              ""           "pmm"           "pmm"           "pmm"           "pmm" 
##            hail       crop.hist        area.dam           sever        seed.tmt 
##           "pmm"           "pmm"           "pmm"           "pmm"           "pmm" 
##            germ    plant.growth          leaves       leaf.halo       leaf.marg 
##           "pmm"           "pmm"              ""           "pmm"           "pmm" 
##       leaf.size     leaf.shread       leaf.malf       leaf.mild            stem 
##           "pmm"           "pmm"           "pmm"           "pmm"           "pmm" 
##         lodging    stem.cankers   canker.lesion fruiting.bodies       ext.decay 
##           "pmm"           "pmm"           "pmm"           "pmm"           "pmm" 
##        mycelium    int.discolor       sclerotia      fruit.pods     fruit.spots 
##           "pmm"           "pmm"           "pmm"           "pmm"           "pmm" 
##            seed     mold.growth   seed.discolor       seed.size      shriveling 
##           "pmm"           "pmm"           "pmm"           "pmm"           "pmm" 
##           roots 
##           "pmm" 
## PredictorMatrix:
##             Class date plant.stand precip temp hail crop.hist area.dam sever
## Class           0    1           1      1    1    1         1        1     1
## date            1    0           1      1    1    1         1        1     1
## plant.stand     1    1           0      1    1    1         1        1     1
## precip          1    1           1      0    1    1         1        1     1
## temp            1    1           1      1    0    1         1        1     1
## hail            1    1           1      1    1    0         1        1     1
##             seed.tmt germ plant.growth leaves leaf.halo leaf.marg leaf.size
## Class              1    1            1      1         1         1         1
## date               1    1            1      1         1         1         1
## plant.stand        1    1            1      1         1         1         1
## precip             1    1            1      1         1         1         1
## temp               1    1            1      1         1         1         1
## hail               1    1            1      1         1         1         1
##             leaf.shread leaf.malf leaf.mild stem lodging stem.cankers
## Class                 1         1         1    1       1            1
## date                  1         1         1    1       1            1
## plant.stand           1         1         1    1       1            1
## precip                1         1         1    1       1            1
## temp                  1         1         1    1       1            1
## hail                  1         1         1    1       1            1
##             canker.lesion fruiting.bodies ext.decay mycelium int.discolor
## Class                   1               1         1        1            1
## date                    1               1         1        1            1
## plant.stand             1               1         1        1            1
## precip                  1               1         1        1            1
## temp                    1               1         1        1            1
## hail                    1               1         1        1            1
##             sclerotia fruit.pods fruit.spots seed mold.growth seed.discolor
## Class               1          1           1    1           1             1
## date                1          1           1    1           1             1
## plant.stand         1          1           1    1           1             1
## precip              1          1           1    1           1             1
## temp                1          1           1    1           1             1
## hail                1          1           1    1           1             1
##             seed.size shriveling roots
## Class               1          1     1
## date                1          1     1
## plant.stand         1          1     1
## precip              1          1     1
## temp                1          1     1
## hail                1          1     1
## Number of logged events:  1667 
##   it im         dep meth
## 1  1  1 plant.stand  pmm
## 2  1  1 plant.stand  pmm
## 3  1  1      precip  pmm
## 4  1  1      precip  pmm
## 5  1  1        temp  pmm
## 6  1  1        temp  pmm
##                                                                                                                                                                                                                                                        out
## 1                                                                                                                                                              Classbrown-spot, Classcharcoal-rot, Classcyst-nematode, Classpurple-seed-stain, fruit.pods3
## 2 mice detected that your data are (nearly) multi-collinear.\nIt applied a ridge penalty to continue calculations, but the results can be unstable.\nDoes your dataset contain duplicates, linear transformation, or factors with unique respondent names?
## 3                                                                                                        Classbrown-spot, Classcharcoal-rot, Classcyst-nematode, Classdiaporthe-stem-canker, Classherbicide-injury, Classrhizoctonia-root-rot, fruit.pods1
## 4 mice detected that your data are (nearly) multi-collinear.\nIt applied a ridge penalty to continue calculations, but the results can be unstable.\nDoes your dataset contain duplicates, linear transformation, or factors with unique respondent names?
## 5                                                                                                                                                                                Classbrown-spot, Classcyst-nematode, leaf.mild2, fruit.pods1, fruit.pods3
## 6 mice detected that your data are (nearly) multi-collinear.\nIt applied a ridge penalty to continue calculations, but the results can be unstable.\nDoes your dataset contain duplicates, linear transformation, or factors with unique respondent names?