
In this lab you will respond to a set of prompts in two parts.
For the data product, you will interpret a different type of model: a model in regression mode.
So far, we have specified and interpreted a classification model: one predicting a dichotomous outcome (i.e., whether students pass a course). In many cases, however, we are interested in predicting a continuous outcome (e.g., students’ number of points in a course or their score on a final exam).
While many parts of the machine learning process are the same for a regression machine learning model, one key part that is relevant to this lab is different: how the model is interpreted. The confusion matrix we created to assess the predictive strength of our classification model does not apply to regression machine learning models, because their predictions are numbers rather than categories. Different metrics are used instead. For this lab, you will specify and interpret a regression machine learning model.
The requirements are as follows:
Change your outcome to students’ final exam performance (note: check the data dictionary for a pointer!).
Using the same data (and testing and training data sets), recipe, and workflow as you used in the case study, change the mode of your model from classification to regression and change the engine from a glm to an lm model.
Interpret your regression machine learning model in terms of three regression machine learning model metrics: MAE, MSE, and RMSE. Read about these metrics here. Similar to how we interpreted the classification machine learning metrics, focus on the substantive meaning of these statistics.
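The change of mode and engine described in the requirements can be sketched as follows. This is only an illustration under stated assumptions: it uses the built-in mtcars data as a stand-in for the course data, and the object names (lm_spec, car_split, car_recipe, car_wf) and the formula are hypothetical, not the ones from the case study. Your own split, recipe, and outcome variable should replace them.

```r
library(tidymodels)

# Regression counterpart of a logistic_reg() + "glm" classification spec.
lm_spec <- linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# Stand-in data only: mtcars ships with R and is NOT the course data set.
set.seed(20230901)
car_split <- initial_split(mtcars, prop = 0.8)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Hypothetical recipe: predict a continuous outcome (mpg) from two predictors.
car_recipe <- recipe(mpg ~ wt + hp, data = car_train)

# Same workflow structure as for classification; only the model spec differs.
car_wf <- workflow() %>%
  add_recipe(car_recipe) %>%
  add_model(lm_spec)

car_fit <- fit(car_wf, data = car_train)

# Predictions arrive in a .pred column, on the outcome's original scale.
car_preds <- predict(car_fit, new_data = car_test) %>%
  bind_cols(car_test %>% select(mpg))
```

The point of the sketch is that the recipe and workflow steps keep the same shape; only the model specification and the outcome change when moving from classification to regression.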
Please use the code chunk below for your code:
```r
library(tidyverse)
library(tidymodels)
library(janitor)
```
Please add your interpretations here:
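MAE: The mean absolute error tells us, on average, how far the model’s predictions are from students’ actual final exam scores, regardless of whether each prediction was too high or too low. A lower MAE means the predictions are, on average, closer to the truth.
MSE: The mean squared error also measures prediction error, but because each error is squared before averaging, large misses count more than small ones. It is expressed in squared units of the outcome, so it is harder to read directly. A lower MSE still means smaller prediction errors on average.
RMSE: The root mean squared error is the square root of MSE, which returns the error to the same units as the outcome. It can be read as a typical prediction error that, like MSE, penalizes large misses more heavily than MAE does. A lower RMSE means smaller prediction errors in the data’s original units.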
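To make the three definitions concrete, here is a small worked example in base R. The scores are made up purely for illustration and are not taken from the course data or from any fitted model.

```r
# Invented final exam scores (in points), for illustration only.
actual    <- c(70, 85, 90, 60)
predicted <- c(75, 80, 92, 66)

errors <- actual - predicted   # -5, 5, -2, -6

mae  <- mean(abs(errors))      # average absolute miss, in points
mse  <- mean(errors^2)         # average squared miss, in points squared
rmse <- sqrt(mse)              # back in points; large misses weigh more than in MAE

round(c(MAE = mae, MSE = mse, RMSE = rmse), 2)
# MAE = 4.5, MSE = 22.5, RMSE is about 4.74
```

Here the predictions miss by 4.5 points on average (MAE), while RMSE is slightly larger because the 6-point miss is weighted more heavily. With tidymodels predictions, the yardstick functions mae() and rmse() report the same quantities, and MSE can be recovered by squaring RMSE.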
Step 1: We need to collect a large set of reviews and label each one as positive, negative, or neutral.
Step 2: We need to teach the computer by showing it many examples of reviews together with their labels (positive, negative, or neutral).
Step 3: The computer learns from these examples and figures out what words and phrases usually mean something good or bad in a review.
Step 4: Now, the computer can look at new reviews and decide if they are positive, negative, or neutral all on its own.
This helps businesses understand what people think about their products and make changes if needed.
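The four steps above can be sketched in a few lines of base R. The steps do not name a particular algorithm, so this sketch assumes a simple naive Bayes style word-count classifier. The reviews, the labels, and the helper names (tokenize, classify) are all invented for illustration.

```r
# Steps 1-2: a tiny, invented set of labeled reviews.
reviews <- data.frame(
  text  = c("great product love it",
            "terrible quality very bad",
            "love the great design",
            "bad experience terrible support",
            "it arrived on time",
            "the box was brown"),
  label = c("positive", "negative", "positive",
            "negative", "neutral", "neutral"),
  stringsAsFactors = FALSE
)

# Split a review into lower-case words.
tokenize <- function(x) strsplit(tolower(x), "[^a-z]+")[[1]]

# Step 3: count how often each word appears under each label.
vocab  <- unique(unlist(lapply(reviews$text, tokenize)))
labels <- unique(reviews$label)
counts <- sapply(labels, function(lab) {
  words <- unlist(lapply(reviews$text[reviews$label == lab], tokenize))
  as.numeric(table(factor(words, levels = vocab)))
})
rownames(counts) <- vocab

# Laplace-smoothed log-probability of each word given each label.
log_prob  <- log(sweep(counts + 1, 2, colSums(counts + 1), "/"))
log_prior <- log(table(reviews$label)[labels] / nrow(reviews))

# Step 4: label a new review with the highest-scoring class.
classify <- function(text) {
  words  <- intersect(tokenize(text), vocab)
  scores <- as.numeric(log_prior) +
    colSums(log_prob[words, , drop = FALSE])
  labels[which.max(scores)]
}

classify("great design")      # "positive"
classify("terrible support")  # "negative"
```

A real sentiment model would need far more labeled reviews and a held-out test set, but the structure is the same: labeled examples go in, word weights are learned, and new reviews are scored against them.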
Complete the following steps to knit and publish your work:
First, change the name after author: in the YAML header at the very top of this document to your name. The YAML header controls the style and feel of the knitted document but doesn’t actually display in the final output.
Next, click the knit button in the toolbar above to “knit” your R Markdown document to an HTML file that will be saved in your R Project folder. You should see a formatted webpage appear in your Viewer tab in the lower right pane or in a new browser window. Let us know if you run into any issues with knitting.
Finally, publish your webpage on RPubs by clicking the “Publish” button located in the Viewer Pane after you knit your document. See screenshot below.
Have fun!
