---
title: "Homework 5"
author: "Lei Zhao"
output:
  html_document:
    toc: true
    toc_float: true
    code_folding: show
    code_download: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE,
                      warning = FALSE)
```

## Homework 5

For this assignment, we will be working with the file `"pokemon.csv"`, found in `/data`. The file is from Kaggle: <https://www.kaggle.com/abcsds/pokemon>.

The [Pokémon](https://www.pokemon.com/us/) franchise encompasses video games, TV shows, movies, books, and a card game. This data set was drawn from the video game series and contains statistics about 721 Pokémon, or "pocket monsters." In Pokémon games, the user plays as a trainer who collects, trades, and battles Pokémon to (a) collect all the Pokémon and (b) become the champion Pokémon trainer.

Each Pokémon has a [primary type](https://bulbapedia.bulbagarden.net/wiki/Type) (some even have secondary types). Based on their type, a Pokémon is strong against some types, and vulnerable to others. (Think rock, paper, scissors.) A Fire-type Pokémon, for example, is vulnerable to Water-type Pokémon, but strong against Grass-type.

![Fig 1. Vulpix, a Fire-type fox Pokémon from Generation 1 (also my favorite Pokémon!) ](images/vulpix.png){width="196"}

The goal of this assignment is to build a statistical learning model that can predict the **primary type** of a Pokémon based on its generation, legendary status, and six battle statistics. *This is an example of a **classification problem**, but these models can also be used for **regression problems***.

Read in the file and familiarize yourself with the variables using `pokemon_codebook.txt`.


### Exercise 1

Install and load the `janitor` package. Use its `clean_names()` function on the Pokémon data, and save the results to work with for the rest of the assignment. What happened to the data? Why do you think `clean_names()` is useful?
```{r}
library(tidyverse)
library(tidymodels)
library(janitor)      # clean_names()
library(forcats)      # lumping rare factor levels
library(glmnet)       # engine for the elastic net
library(vip)          # variable importance plots
library(corrplot)     # for a correlation plot
library(naniar)       # to assess missing data patterns
library(patchwork)    # for putting plots together
library(ggthemes)
library(rpart.plot)
library(ISLR)
library(ISLR2)
library(modeldata)
library(verification)
tidymodels_prefer()

# read in the raw data and standardize its column names
pokemon <- read.csv("/Users/zhaolei/Downloads/homework-5/data/Pokemon.csv")
pokemon <- clean_names(pokemon)
head(pokemon)

```
The `clean_names()` function standardizes the data set's column names into snake_case: names are lowercased, and spaces, periods, and other special characters are replaced with underscores (for example, the `Sp. Atk` column becomes `sp_atk`). This keeps variable names consistent and easy to type, so they can be referenced in code without backticks or guesswork.
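
As a small illustration, `make_clean_names()` is the helper that `clean_names()` applies to each column name. Assuming the raw Kaggle headers look something like the ones below, the conversion is:

```{r}
# illustration only: how janitor's naming rules transform headers like this file's
janitor::make_clean_names(c("Type 1", "Sp. Atk", "Sp. Def", "Legendary"))
```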

### Exercise 2

Using the entire data set, create a bar chart of the outcome variable, `type_1`.

How many classes of the outcome are there? Are there any Pokémon types with very few Pokémon? If so, which ones?

For this assignment, we'll handle the rarer classes by grouping them, or "lumping them," together into an 'other' category. [Using the `forcats` package](https://forcats.tidyverse.org/), determine how to do this, and **lump all the other levels together except for the top 6 most frequent** (which are Bug, Fire, Grass, Normal, Water, and Psychic).

Convert `type_1` and `legendary` to factors.

```{r}
pokemon %>%
  ggplot(aes(x = type_1)) +
  geom_bar()

```
There are 18 classes of the outcome. Several types, such as Flying, Fairy, and Ice, contain very few Pokémon, so we lump the rare classes together into an "Other" category.
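
To see exactly how small the rare classes are, a quick tally before lumping could look like this (a minimal sketch; the counts are computed at knit time):

```{r}
# tally the outcome classes, rarest first
pokemon %>%
  count(type_1) %>%
  arrange(n)
```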

```{r}
# keep the six most frequent types and lump everything else into "Other"
pokemon$type_1 <- fct_lump_n(pokemon$type_1, n = 6, other_level = "Other")

pokemon %>%
  ggplot(aes(x = type_1)) +
  geom_bar()

# convert the outcome and legendary status to factors
pokemon$type_1 <- as.factor(pokemon$type_1)
pokemon$legendary <- as.factor(pokemon$legendary)
```
### Exercise 3

Perform an initial split of the data. Stratify by the outcome variable. You can choose a proportion to use. Verify that your training and test sets have the desired number of observations.

Next, use *v*-fold cross-validation on the training set. Use 5 folds. Stratify the folds by `type_1` as well. *Hint: Look for a `strata` argument.*

Why do you think doing stratified sampling for cross-validation is useful?

```{r}
set.seed(0926)

pokemon_split <- initial_split(pokemon, prop = 0.75, strata = "type_1")

pokemon_train <- training(pokemon_split)
pokemon_test <- testing(pokemon_split)
pokemon_folds <-vfold_cv(pokemon_train, v = 5, strata = "type_1")

```
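
The exercise also asks to verify that the split produced the expected number of observations. A quick check of the dimensions and class balance might look like this (sketch):

```{r}
# with prop = 0.75, the training set should hold roughly 75% of the rows
dim(pokemon_train)
dim(pokemon_test)

# stratification should keep the type_1 proportions similar in both sets
prop.table(table(pokemon_train$type_1))
prop.table(table(pokemon_test$type_1))
```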
Stratified sampling ensures that the training set, the testing set, and each cross-validation fold contain roughly the same proportion of each `type_1` class as the full data set. Because some classes are rare, a non-stratified split could leave a fold with very few (or even zero) Pokémon of a given type, which would make the resampled performance estimates noisy and unrepresentative.

### Exercise 4

Create a correlation matrix of the training set, using the `corrplot` package. *Note: You can choose how to handle the categorical variables for this plot; justify your decision(s).*

What relationships, if any, do you notice?

```{r}
pokemon_train %>%
  select(where(is.numeric)) %>%
  cor() %>%
  corrplot()
```
I restricted the plot to the numeric variables, since Pearson correlations are not defined for categorical variables such as `legendary` or the type columns. The variable `total` is strongly and positively correlated with `hp`, `attack`, `defense`, `sp_atk`, `sp_def`, and `speed`, which makes sense because `total` is simply the sum of those six battle statistics.


### Exercise 5

Set up a recipe to predict `type_1` with `legendary`, `generation`, `sp_atk`, `attack`, `speed`, `defense`, `hp`, and `sp_def`.

-   Dummy-code `legendary` and `generation`;

-   Center and scale all predictors.

```{r}
pokemon_recipe<-recipe(type_1~legendary + generation + sp_atk + attack + speed 
                       +defense + hp + sp_def,data=pokemon_train)%>%
  step_dummy(all_nominal(), -all_outcomes())%>%
  step_scale(all_predictors())%>%
  step_center(all_predictors())


```
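
Since the bullets above ask for both `legendary` and `generation` to be dummy-coded, it can be worth baking the recipe to confirm what the preprocessing actually produces; `generation` is read in as an integer, so `step_dummy(all_nominal(), ...)` will leave it as a single numeric column unless it is converted to a factor first. A minimal check (sketch):

```{r}
# sketch: prep the recipe on the training data and inspect the processed predictors
pokemon_recipe %>%
  prep() %>%
  bake(new_data = NULL) %>%
  glimpse()
```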

### Exercise 6

We'll be fitting and tuning an elastic net, tuning `penalty` and `mixture` (use `multinom_reg()` with the `glmnet` engine).

Set up this model and workflow. Create a regular grid for `penalty` and `mixture` with 10 levels each; `mixture` should range from 0 to 1. For this assignment, let `penalty` range from 0.01 to 3 (this is on the `identity_trans()` scale; note that you'll need to specify these values in base 10 otherwise).

```{r}
nom_model <- multinom_reg(penalty = tune(),
                         mixture = tune()) %>%
  set_engine("glmnet") %>%
  set_mode("classification")
  
# set up the workflow
nom_wkflow <- workflow() %>%
  add_model(nom_model) %>%
  add_recipe(pokemon_recipe)

# create grid
nom_grid <- grid_regular(penalty(range = c(0.01, 3), trans = identity_trans()),
                         mixture(range = c(0, 1)),
                         levels = 10)
```
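
A quick sanity check on the grid (sketch): a regular grid with 10 levels of each parameter should contain 10 x 10 = 100 candidate models, with penalty spaced evenly between 0.01 and 3 because of `identity_trans()`.

```{r}
nrow(nom_grid)           # expect 100 candidate combinations
range(nom_grid$penalty)  # expect 0.01 to 3 on the identity scale
```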

### Exercise 7

Now set up a random forest model and workflow. Use the `ranger` engine and set `importance = "impurity"`; we'll be tuning `mtry`, `trees`, and `min_n`. Using the documentation for `rand_forest()`, explain in your own words what each of these hyperparameters represent.

Create a regular grid with 8 levels each. You can choose plausible ranges for each hyperparameter. Note that `mtry` should not be smaller than 1 or larger than 8. **Explain why neither of those values would make sense.**

What type of model does `mtry = 8` represent?

```{r}
# set up model
randomf_model <- rand_forest(mtry = tune(), trees = tune(), min_n = tune()) %>%
  set_engine("ranger", importance = "impurity") %>%
  set_mode("classification")

# set up workflow
randomf_wkflow <- workflow() %>%
  add_model(randomf_model) %>%
  add_recipe(pokemon_recipe)

randomf_grid <- grid_regular(mtry(range = c(2, 7)),
                             trees(range = c(100, 1000)),
                             min_n(range = c(2, 10)),
                             levels = 8)


```
`mtry` is the number of predictors randomly sampled as candidates at each split when a tree is grown; `trees` is the number of trees in the ensemble; and `min_n` is the minimum number of observations a node must contain for it to be split further. `mtry` cannot be less than 1 because each split has to consider at least one predictor, and it cannot exceed 8 because the recipe supplies only 8 predictors. When `mtry = 8`, every split considers all of the predictors, so the random forest reduces to bagged decision trees (bagging); the only randomness left comes from the bootstrap samples used to grow each tree.
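
For illustration only (not part of the tuned workflow), here is a sketch of what the `mtry = 8` special case would look like as a model specification; the other hyperparameter values are arbitrary:

```{r, eval=FALSE}
# illustration: fixing mtry at the number of predictors means no random subsetting
# of predictors happens at the splits, so the forest reduces to bagged decision trees
bagged_spec <- rand_forest(mtry = 8, trees = 1000, min_n = 2) %>%
  set_engine("ranger") %>%
  set_mode("classification")
```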

### Exercise 8

Fit all models to your folded data using `tune_grid()`.

**Note: Tuning your random forest model will take a few minutes to run, anywhere from 5 minutes to 15 minutes and up. Consider running your models outside of the .Rmd, storing the results, and loading them in your .Rmd to minimize time to knit. We'll go over how to do this in lecture.**

Use `autoplot()` on the results. What do you notice? Do larger or smaller values of `penalty` and `mixture` produce better ROC AUC? What about values of `min_n`, `trees`, and `mtry`?

What elastic net model and what random forest model perform the best on your folded data? (What specific values of the hyperparameters resulted in the optimal ROC AUC?)

```{r}
nom_res <- tune_grid(
  nom_wkflow,
  resamples = pokemon_folds,
  grid = nom_grid,
  control = control_grid(save_pred = TRUE)
)

load("/Users/zhaolei/Downloads/homework-5/tune_random_forest.rda")
```
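
The random forest results are loaded from `tune_random_forest.rda` rather than re-tuned at knit time. For reference, here is a sketch of the code that could have produced that file, run once outside the .Rmd (the saved object is assumed to be named `rf_res`, matching the calls below):

```{r, eval=FALSE}
# run once outside the .Rmd and save, so knitting does not repeat the slow tuning
rf_res <- tune_grid(
  randomf_wkflow,
  resamples = pokemon_folds,
  grid = randomf_grid,
  control = control_grid(save_pred = TRUE)
)
save(rf_res, file = "tune_random_forest.rda")
```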


```{r}
autoplot(nom_res)
```
For this data set and elastic net model, smaller values of penalty clearly produce better ROC AUC, which suggests the model benefits from relatively little regularization. The best-performing combination on my folds used a mid-range mixture (about 0.67) rather than a value near the ridge end, so neither pure ridge nor pure lasso was optimal here.
```{r}
collect_metrics(nom_res) %>%
  filter(.metric == "roc_auc") %>%
  select(penalty, mixture, mean, std_err)
```
```{r}
select_best(nom_res)
```
The best roc_auc occurs at penalty = 0.01 with a mixture of about 0.667.


```{r}
autoplot(rf_res)
```

```{r}
select_best(rf_res)
```
For the random forest model, although the results vary quite a bit across the grid, the highest roc_auc occurs with mtry = 3 (three predictors sampled per split), trees = 742, and min_n = 6.

### Exercise 9

Select your optimal [**random forest model**]{.underline} in terms of `roc_auc`. Then fit that model to your training set and evaluate its performance on the testing set.

Using the **training** set:

-   Create a variable importance plot, using `vip()`. *Note that you'll still need to have set `importance = "impurity"` when fitting the model to your entire training set in order for this to work.*

    -   What variables were most useful? Which were least useful? Are these results what you expected, or not?

Using the testing set:

-   Create plots of the different ROC curves, one per level of the outcome variable;

-   Make a heat map of the confusion matrix.

```{r}
final_wkflow <- finalize_workflow(randomf_wkflow,
                                  select_best(rf_res))
pokemon_fit <- fit(final_wkflow, pokemon_train) # fit the training set

vip(pokemon_fit, importance = "impurity")

```

```{r}
predict(pokemon_fit, new_data = pokemon_test, type = "prob")
```

```{r}
augment(pokemon_fit, new_data = pokemon_test) %>%
  roc_curve(type_1, .pred_Bug,.pred_Fire, .pred_Grass, .pred_Normal, .pred_Psychic, .pred_Water, .pred_Other) %>%
  autoplot()
```

```{r}
augment(pokemon_fit, new_data= pokemon_test) %>%
  conf_mat(truth = type_1, estimate = .pred_class) %>%
  autoplot(type = "heatmap")
```
The model is best at predicting the Other type. This is likely because, after lumping, Other is the most prevalent class in the data set.

Accuracy: about 47.8%\
The model correctly predicted the primary type for a bit under half of the test Pokémon.
Sensitivity (macro-averaged recall): about 26.9%\
The model struggles to identify true positives, often missing the correct type.
Specificity: about 88.4%\
The model is good at avoiding false positives, correctly ruling out the types a Pokémon does not belong to.

### Exercise 10

How did your best random forest model do on the testing set?

Which Pokemon types is the model best at predicting, and which is it worst at? (Do you have any ideas why this might be?)

```{r}
multi_metric <- metric_set(accuracy, sensitivity, specificity)

augment(pokemon_fit, new_data = pokemon_test) %>%
  multi_metric(truth = type_1, estimate = .pred_class)
```
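
Since `roc_auc` was the metric used to select the model, it may also be worth computing it on the testing set for a direct comparison with the cross-validation estimate (a minimal sketch, using the same probability-column ordering as the ROC curves above):

```{r}
augment(pokemon_fit, new_data = pokemon_test) %>%
  roc_auc(type_1, .pred_Bug, .pred_Fire, .pred_Grass, .pred_Normal,
          .pred_Psychic, .pred_Water, .pred_Other)
```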


## For 231 Students

### Exercise 11

In the 2020-2021 season, Stephen Curry, an NBA basketball player, made 337 out of 801 three point shot attempts (42.1%). Use bootstrap resampling on a sequence of 337 1's (makes) and 464 0's (misses). For each bootstrap sample, compute and save the sample mean (e.g. bootstrap FG% for the player). Use 1000 bootstrap samples to plot a histogram of those values. Compute the 99% bootstrap confidence interval for Stephen Curry's "true" end-of-season FG% using the quantile function in R. Print the endpoints of this interval.
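
This exercise was not attempted above; a minimal sketch of one approach is given below. The object names (`shots`, `boot_means`) and the seed are my own choices.

```{r, eval=FALSE}
set.seed(1)
shots <- c(rep(1, 337), rep(0, 464))  # makes and misses from the 2020-2021 season

# 1000 bootstrap resamples of the 801 attempts, saving each resample's FG%
boot_means <- replicate(1000, mean(sample(shots, replace = TRUE)))

# histogram of the bootstrap distribution of FG%
ggplot(data.frame(fg_pct = boot_means), aes(x = fg_pct)) +
  geom_histogram(bins = 30)

# 99% bootstrap confidence interval for the "true" end-of-season FG%
quantile(boot_means, probs = c(0.005, 0.995))
```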

### Exercise 12

Using the `abalone.txt` data from previous assignments, fit and tune a **random forest** model to predict `age`. Use stratified cross-validation and select ranges for `mtry`, `min_n`, and `trees`. Present your results. What was your final chosen model's **RMSE** on your testing set?
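
Also not attempted above; a compact sketch, assuming `abalone.txt` has already been read into a data frame named `abalone` in which `age` has been created (rings + 1.5) and `rings` dropped, as in the earlier assignments. It reuses the same tidymodels pattern as Exercises 7-9, but with a regression mode and RMSE:

```{r, eval=FALSE}
# split and fold the data, stratifying on the numeric outcome (binned automatically)
abalone_split <- initial_split(abalone, prop = 0.75, strata = age)
abalone_train <- training(abalone_split)
abalone_test  <- testing(abalone_split)
abalone_folds <- vfold_cv(abalone_train, v = 5, strata = age)

abalone_recipe <- recipe(age ~ ., data = abalone_train) %>%
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_predictors())

abalone_rf <- rand_forest(mtry = tune(), trees = tune(), min_n = tune()) %>%
  set_engine("ranger") %>%
  set_mode("regression")

abalone_wf <- workflow() %>%
  add_model(abalone_rf) %>%
  add_recipe(abalone_recipe)

abalone_grid <- grid_regular(mtry(range = c(1, 6)),
                             trees(range = c(200, 1000)),
                             min_n(range = c(2, 20)),
                             levels = 5)

abalone_res <- tune_grid(abalone_wf, resamples = abalone_folds, grid = abalone_grid)

# refit the best model (by RMSE) on the full training set, then evaluate on the test set
final_abalone <- finalize_workflow(abalone_wf, select_best(abalone_res, metric = "rmse")) %>%
  fit(abalone_train)

augment(final_abalone, new_data = abalone_test) %>%
  rmse(truth = age, estimate = .pred)
```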
