Even after feature engineering, machine learning approaches can often (but not always) be improved by choosing a more sophisticated model type. Note how we used a regression model in the first two case studies; here, we explore a considerably more sophisticated model, a random forest.
Choosing a more sophisticated model adds some complexity to the modeling process. Notably, more complex models have tuning parameters (also called hyperparameters) - parts of the model that are not estimated directly from the data and instead must be set before fitting. In addition to using feature engineering much as we did in the last case study, Bertolini et al. (2021) use tuning parameters to improve the performance of their predictive model.
Bertolini, R., Finch, S. J., & Nehm, R. H. (2021). Enhancing data pipelines for forecasting student performance: integrating feature selection with cross-validation. International Journal of Educational Technology in Higher Education, 18(1), 1-23. https://github.com/laser-institute/essential-readings/blob/main/machine-learning/ml-lab-3/bertolini-et-al-2021-ijethe.pdf
Our driving question is: How much of a difference does a more complex model make? Looking back to our predictive model from LL1, we can see that our accuracy was around 87%: 0.872, more specifically. Can we improve on that?
While answering this question, we focus not only on estimating a complex model, but also on tuning it. The data is, again, from the #NGSSchat community on Twitter, which lets us compare the performance of this tuned, more complex model to the initial model we used in the first case study.
First, let’s load the packages we’ll use—the familiar {tidyverse} and several others focused on modeling. Like in earlier learning labs, click the green arrow to run the code chunk.
Please add to the chunk below code to load three packages we’ve used in both LL1 and LL2 - tidyverse, tidymodels, and here.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.5 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(tidymodels)
## ── Attaching packages ────────────────────────────────────── tidymodels 0.2.0 ──
## ✔ broom 0.7.12 ✔ rsample 1.0.0
## ✔ dials 1.0.0 ✔ tune 1.0.0
## ✔ infer 1.0.2 ✔ workflows 1.0.0
## ✔ modeldata 1.0.0 ✔ workflowsets 0.2.1
## ✔ parsnip 1.0.0 ✔ yardstick 1.0.0
## ✔ recipes 1.0.1
## ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
## ✖ scales::discard() masks purrr::discard()
## ✖ dplyr::filter() masks stats::filter()
## ✖ recipes::fixed() masks stringr::fixed()
## ✖ dplyr::lag() masks stats::lag()
## ✖ yardstick::spec() masks readr::spec()
## ✖ recipes::step() masks stats::step()
## • Dig deeper into tidy modeling with R at https://www.tmwr.org
library(vip) # a new package we're adding for variable importance measures
##
## Attaching package: 'vip'
## The following object is masked from 'package:utils':
##
## vi
library(ranger) # this is needed for the random forest algorithm
library(here)
## here() starts at /Users/meinazhu/Documents/GitHub/machine-learning
Next, we’ll load the processed data.
Note: We created a means of visualizing the threads to make coding them easier; that’s here and it provides a means of seeing what the raw data is like: https://jmichaelrosenberg.shinyapps.io/ngsschat-shiny/.
We’ve added three additional variables for this analysis; thus, the variables we can consider using as features are:
n: the number of tweets in the thread (independent variable)
mean_favorite_count: the mean number of favorites for the tweets in the thread (independent variable)
sum_favorite_count: the sum of the number of favorites for the tweets in the thread (independent variable)
mean_retweet_count: the mean number of retweets for the tweets in the thread (independent variable)
sum_retweet_count: the sum of the number of retweets for the tweets in the thread (independent variable)
sum_display_text_width: the sum of the number of characters for the tweets in the thread (independent variable)
mean_display_text_width: the mean of the number of characters for the tweets in the thread (independent variable)
code: the qualitative code (TS = transactional; TF = transformational) (dependent variable)
One big type of feature is not included in this analysis: information about the text of the tweets themselves. This is likely to be quite predictive: the words that users included in their tweets are probably associated with (and predictive of) substantive or transactional conversations. Given our focus on ML in this topic area, we do not include features relating to the text data, but we think this could be a great direction for future work (and research) in this area.
d <- read_csv("data/ngsschat-processed-data.csv")
## Rows: 3793 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): code
## dbl (4): mean_favorite_count, mean_retweet_count, sum_display_text_width, n
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
d <- d %>%
mutate(code = as.factor(code)) # the outcome must be a factor for classification models
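Though not a step in the original labs, it can be worth a quick look at how balanced the outcome is before modeling; a short optional check (output not shown here):
# optional: how many threads received each code, and the proportions
# name = "n_threads" avoids clashing with the existing n column in the data
d %>%
  count(code, name = "n_threads") %>%
  mutate(prop = n_threads / sum(n_threads))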
We treat this step relatively briefly, as we carried out very similar steps in LL1 and LL2; return to those case studies (especially LL1) for more on data splitting. Note that we carry out the k-folds cross-validation process introduced in LL2. Consider - like there - setting a different value for v (k) if you think it appropriate.
You do have one step that is your turn! Please add the code for setting up the k-folds cross-validation, exactly as you did in LL2.
set.seed(20220712)
train_test_split <- initial_split(d, prop = .80)
data_train <- training(train_test_split)
#<replace this line with your kfcv code!>
kfcv <- vfold_cv(data_train, v = 8)
In Step 1, we noted how we added three variables as potential features. Here, we carry out two feature engineering steps we have carried out before - standardizing the numeric variables (to have a mean equal to 0 and a standard deviation equal to 1) and dropping any features with near-zero variance. Consider adding other feature engineering steps - perhaps the step you carried out to complete the badge requirements for LL2.
Add a feature engineering step below. Consider those described here.
my_rec <- recipe(code ~ ., data = data_train) %>%
step_normalize(all_numeric_predictors()) %>%
step_nzv(all_predictors()) %>%
step_dummy(all_nominal_predictors()) %>% # dummy code all factor variables
step_impute_knn(all_predictors()) # impute missing data for all predictor variables
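The recipe is prepped automatically when the workflow is fit, but if you want to confirm what these steps do to the data, one optional check is to prep and bake it yourself (output not shown here):
# optional: prep the recipe on the training data and inspect the engineered features
my_rec %>%
  prep() %>%
  bake(new_data = NULL) %>%  # new_data = NULL returns the processed training data
  glimpse()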
There are several steps here that differ from the past learning labs. We use:
rand_forest() to set the model type to a random forest
set_engine("ranger", importance = "impurity") to set the engine to the one provided for random forests through the {ranger} package; the importance = "impurity" argument lets us later interpret a variable importance metric (impurity) specific to random forest models
set_mode("classification") as we are again predicting categories (transactional and substantive conversations taking place through #NGSSchat)
# specify model
my_mod <-
rand_forest(mtry = tune(),
min_n = tune(),
trees = tune()) %>%
set_engine("ranger", importance = "impurity") %>%
set_mode("classification")
# specify workflow
my_wf <-
workflow() %>%
add_model(my_mod) %>%
add_recipe(my_rec)
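If you are curious how this parsnip specification maps onto the underlying {ranger} package, translate() shows the call that will eventually be made; the tune() placeholders remain until we choose values for them (output not shown here):
# optional: see how parsnip will call ranger::ranger() under the hood
my_mod %>%
  translate()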
Here, things become different once again. We’ll use a grid method to specify the tuning parameters: the number of predictor variables that are randomly sampled at each split in a tree (mtry), the minimum number of data points in a node required to execute a further split (min_n), and the number of trees in the forest (trees). size refers to how many distinct combinations of the tuning parameters will be returned. 10 is a relatively small number - we can imagine a much larger number of combinations of these hyperparameters - but it should give us a sense of which values lead to the best performance.
These next calls to finalize() are used to get a sense of what values mtry, min_n, and trees can take, based on the dimensions and ranges of the variables in the data.
finalize(mtry(), data_train)
## # Randomly Selected Predictors (quantitative)
## Range: [1, 5]
finalize(min_n(), data_train)
## Minimal Node Size (quantitative)
## Range: [2, 40]
finalize(trees(), data_train)
## # Trees (quantitative)
## Range: [1, 2000]
We then use these values in the grid_max_entropy() function below; in the lab template, replace the xx values with the maximum values reported above for mtry and min_n. You can see that tree_grid is simply a set of combinations of values of the hyperparameters.
tree_grid <- grid_max_entropy(mtry(range = c(1, 5)),
                              min_n(range = c(2, 40)),
                              trees(range = c(1, 600)),
                              size = 10)
tree_grid
## # A tibble: 10 × 3
## mtry min_n trees
## <int> <int> <int>
## 1 3 30 358
## 2 4 32 545
## 3 1 2 567
## 4 5 3 141
## 5 1 34 586
## 6 1 24 86
## 7 5 6 486
## 8 2 7 193
## 9 5 34 239
## 10 4 26 59
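As an aside, a maximum-entropy grid is only one way to choose candidate combinations. A regular grid that crosses a few levels of each tuning parameter is another common option; a minimal sketch using the same ranges (not used in this case study):
# an alternative grid (not used here): cross 3 levels of each tuning parameter,
# giving 3 x 3 x 3 = 27 candidate combinations
regular_grid <- grid_regular(mtry(range = c(1, 5)),
                             min_n(range = c(2, 40)),
                             trees(range = c(1, 600)),
                             levels = 3)
regular_grid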
Now, we’re ready to fit the model. Note - this can take some time, the longest of any function we’ve run so far, as we’re estimating one model for every combination of fold and candidate set of hyperparameters: with v = 8 folds and size = 10 candidate combinations, that is 80 models.
# fit model with tune_grid()
fitted_model <- my_wf %>%
tune_grid(
resamples = kfcv,
grid = tree_grid,
metrics = metric_set(roc_auc, accuracy, kap, sensitivity, specificity, precision)
)
## ! Fold1: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold1: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold1: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold2: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold2: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold2: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold3: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold3: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold3: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold4: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold4: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold4: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold5: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold5: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold5: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold6: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold6: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold6: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold7: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold7: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold7: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold8: preprocessor 1/1, model 4/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold8: preprocessor 1/1, model 7/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
## ! Fold8: preprocessor 1/1, model 9/10: 5 columns were requested but there were 4 predictors in the data. 4 will...
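The warnings above are not a problem: for the candidate models where mtry was 5, only 4 predictors were available, so 4 were used. Before narrowing in on a single best set of tuning parameters, you can also optionally look across all of the candidates; collect_metrics() returns the fold-averaged metrics for every combination, and autoplot() gives a quick visual comparison (output not shown here):
# optional: inspect all of the tuning results before choosing among them
fitted_model %>%
  collect_metrics() # one row per metric per candidate set of tuning parameters

fitted_model %>%
  autoplot() # plot performance across the values of the tuning parameters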
Here come some additional steps. This next step is key both technically and conceptually - we’re examining the best sets of tuning parameters, ranked by their predictive accuracy.
# examine the best sets of tuning parameters, ranked by accuracy
show_best(fitted_model, n = 10, metric = "accuracy")
## # A tibble: 10 × 9
## mtry trees min_n .metric .estimator mean n std_err .config
## <int> <int> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 4 59 26 accuracy binary 0.884 8 0.00402 Preprocessor1_Mode…
## 2 3 358 30 accuracy binary 0.881 8 0.00396 Preprocessor1_Mode…
## 3 2 193 7 accuracy binary 0.881 8 0.00565 Preprocessor1_Mode…
## 4 4 545 32 accuracy binary 0.881 8 0.00344 Preprocessor1_Mode…
## 5 1 567 2 accuracy binary 0.880 8 0.00563 Preprocessor1_Mode…
## 6 5 239 34 accuracy binary 0.879 8 0.00346 Preprocessor1_Mode…
## 7 1 86 24 accuracy binary 0.878 8 0.00525 Preprocessor1_Mode…
## 8 1 586 34 accuracy binary 0.876 8 0.00548 Preprocessor1_Mode…
## 9 5 486 6 accuracy binary 0.865 8 0.00537 Preprocessor1_Mode…
## 10 5 141 3 accuracy binary 0.862 8 0.00556 Preprocessor1_Mode…
This function simply indicates that you want to use the best of the sets of tuning parameters examined through the code in the above chunk - literally the first row.
# select best set of tuning parameters
best_tree <- fitted_model %>%
select_best(metric = "accuracy")
Next, we’ll finalize the workflow with the best set of tuning parameters and then use last_fit() to fit the model on the training data and evaluate it on the testing data.
final_wf <- my_wf %>%
finalize_workflow(best_tree)
final_fit <- final_wf %>%
last_fit(train_test_split, metrics = metric_set(roc_auc, accuracy, kap, sensitivity, specificity, precision))
We can see that final_fit is for a single fit: a random forest with the best-performing tuning parameters, trained with the entire training set of data to predict the values in our (otherwise not used, or “spent”) testing set of data.
final_fit
## # Resampling results
## # Manual resampling
## # A tibble: 1 × 6
## splits id .metrics .notes .predictions .workflow
## <list> <chr> <list> <list> <list> <list>
## 1 <split [3034/759]> train/test split <tibble> <tibble> <tibble> <workflow>
Last, we can interpret the accuracy of our tuned model.
# fit stats
final_fit %>%
collect_metrics()
## # A tibble: 6 × 4
## .metric .estimator .estimate .config
## <chr> <chr> <dbl> <chr>
## 1 accuracy binary 0.877 Preprocessor1_Model1
## 2 kap binary 0.752 Preprocessor1_Model1
## 3 sensitivity binary 0.885 Preprocessor1_Model1
## 4 specificity binary 0.871 Preprocessor1_Model1
## 5 precision binary 0.842 Preprocessor1_Model1
## 6 roc_auc binary 0.932 Preprocessor1_Model1
Interpreting these - apart from accuracy - may present some challenges. First, we can focus on the accuracy: around 88% (0.877), slightly higher than the roughly 87% (0.872) accuracy of our initial model in LL1. Accuracy and the other metrics are defined below.
Accuracy: For the known codes, what percentage of the predictions are correct?
Cohen’s Kappa: The same as accuracy, but accounting for the base rate of (chance) agreement
Sensitivity (AKA recall): Among the true “positives”, what percentage are classified as “positive”?
Specificity: Among the true “negatives”, what percentage are classified as “negative”?
Precision: Among the cases classified as “positive”, what percentage are truly “positive”?
ROC AUC: Across different classification thresholds, what is the trade-off between sensitivity and specificity?
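To see where the sensitivity, specificity, and precision values come from, one option is to pull the test-set predictions out of final_fit and cross-tabulate them with the known codes in a confusion matrix (output not shown here):
# optional: the confusion matrix underlying the test-set metrics above
final_fit %>%
  collect_predictions() %>%
  conf_mat(truth = code, estimate = .pred_class)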
You’ll have the chance to interpret these further in the badge for this learning lab.
One last note - we may be interested in seeing which variables were most important. We can examine variable importance with the following.
final_fit %>%
pluck(".workflow", 1) %>%
extract_fit_parsnip() %>%
vip(num_features = 10)
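If you prefer the importance values as a table rather than a plot, the same {vip} package provides vi(), which returns a tibble you can sort or join with other results:
# the same importance information as a tibble rather than a plot
final_fit %>%
  pluck(".workflow", 1) %>%
  extract_fit_parsnip() %>%
  vi()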
Congratulations - you’ve completed this case study! Consider moving on to the badge activity next.