Assignment 5

# Load necessary libraries
  library(clarify)  # simulation-based inference: sim(), sim_ame(), sim_setx()
  library(readr)    # data import
  library(ggplot2)  # plotting

Model Fitting

## # A tibble: 6 × 18
##   `Player name` position Games `At-bat`  Runs  Hits `Double (2B)`
##   <chr>         <chr>    <dbl>    <dbl> <dbl> <dbl>         <dbl>
## 1 B Bonds       LF        2986     9847  2227  2935           601
## 2 H Aaron       RF        3298    12364  2174  3771           624
## 3 B Ruth        RF        2504     8399  2174  2873           506
## 4 A Pujols      1B        3080    11421  1914  3384           686
## 5 A Rodriguez   SS        2784    10566  2021  3115           548
## 6 W Mays        CF        2992    10881  2062  3283           523
## # ℹ 11 more variables: `third baseman` <dbl>, `home run` <dbl>,
## #   `run batted in` <dbl>, `a walk` <dbl>, Strikeouts <chr>,
## #   `stolen base` <dbl>, `Caught stealing` <chr>, AVG <dbl>,
## #   `On-base Percentage` <dbl>, `Slugging Percentage` <dbl>,
## #   `On-base Plus Slugging` <dbl>
##  [1] "Player name"           "position"              "Games"                
##  [4] "At-bat"                "Runs"                  "Hits"                 
##  [7] "Double (2B)"           "third baseman"         "home run"             
## [10] "run batted in"         "a walk"                "Strikeouts"           
## [13] "stolen base"           "Caught stealing"       "AVG"                  
## [16] "On-base Percentage"    "Slugging Percentage"   "On-base Plus Slugging"
## 
## Call:
## glm(formula = outcome ~ `At-bat` + Hits + Runs + `Double (2B)` + 
##     `home run`, family = binomial, data = data)
## 
## Coefficients:
##                 Estimate Std. Error z value Pr(>|z|)
## (Intercept)   -7.705e+02  2.274e+04  -0.034    0.973
## `At-bat`       9.701e-03  2.651e+00   0.004    0.997
## Hits           3.611e-03  1.710e+01   0.000    1.000
## Runs           1.586e-02  6.143e+00   0.003    0.998
## `Double (2B)` -4.538e-02  3.807e+01  -0.001    0.999
## `home run`     1.341e+00  4.126e+01   0.033    0.974
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 3.0723e+02  on 2499  degrees of freedom
## Residual deviance: 3.9438e-06  on 2494  degrees of freedom
##   (8 observations deleted due to missingness)
## AIC: 12
## 
## Number of Fisher Scoring iterations: 25
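
The summary above shows a residual deviance of about 3.9e-06, standard errors that are orders of magnitude larger than the estimates, and 25 Fisher scoring iterations (the default cap in `glm()`). That pattern is usually associated with complete or quasi-complete separation, where the predictors classify the outcome almost perfectly. The sketch below reproduces the symptoms on a small toy data set; the variables `x`, `y`, and `sep_check` are hypothetical and are not part of the assignment data.

```r
# Toy data (hypothetical, not the baseball file): y is fully determined
# by the sign of x, which is the textbook case of complete separation.
x <- c(seq(-3, -0.5, length.out = 50), seq(0.5, 3, length.out = 50))
y <- as.integer(x > 0)

# glm() typically warns in this situation; collect the messages
# instead of letting them print.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

p_hat <- fitted(fit)

# Symptoms to look for, mirroring the summary output above.
sep_check <- list(
  warnings          = msgs,
  iterations        = fit$iter,                    # at or near the cap
  residual_deviance = deviance(fit),               # essentially zero
  max_se            = max(sqrt(diag(vcov(fit)))),  # very large standard errors
  share_extreme     = mean(p_hat < 1e-6 | p_hat > 1 - 1e-6)
)
print(sep_check)
```

If the same checks flag the assignment model, the coefficient table and everything simulated from it should be read with caution, since the maximum likelihood estimates are not finite under separation.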

Generate Simulation Object

## [1] "clarify_sim"
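
As a rough illustration of what a `clarify_sim` object holds: `sim()` draws many coefficient vectors from a sampling distribution centred on the estimates (multivariate normal by default for a model like this one). The base R sketch below mimics that idea under stated assumptions. The toy data and the names `draws`, `beta_hat`, and `V_hat` are hypothetical, and the code is not the package's internal implementation.

```r
# Hypothetical toy data with overlapping classes, so the fit is stable.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

# Draw n_sim coefficient vectors from N(beta_hat, V_hat) using the
# Cholesky factor of the estimated covariance matrix.
n_sim    <- 1000
beta_hat <- coef(fit)
V_hat    <- vcov(fit)
R        <- chol(V_hat)  # upper triangular, t(R) %*% R equals V_hat
z        <- matrix(rnorm(n_sim * length(beta_hat)), nrow = n_sim)
draws    <- sweep(z %*% R, 2, beta_hat, `+`)
colnames(draws) <- names(beta_hat)

# Each row is one plausible coefficient vector.
print(head(draws, 3))
print(colMeans(draws))  # should sit close to coef(fit)
```

Any quantity of interest is then computed once per row, and the spread of those values gives the simulation-based uncertainty.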

Average Marginal Effects

## A `clarify_est` object (from `sim_ame()`)
##  - Average marginal effect of `home run`
##  - 1000 simulated values
##  - 1 quantity estimated:                               
##  E[dY/d(home run)] 1.057681e-09
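
For a logistic model, the marginal effect of a continuous predictor for one observation is the coefficient times p(1 - p), where p is the fitted probability; the average marginal effect is the mean of that product over the sample. The sketch below computes it analytically and by finite differences on hypothetical toy data, then shows why the value collapses toward zero when every fitted probability is numerically 0 or 1. The probabilities in `p_sep` are invented for illustration; only the coefficient 1.341 is taken from the summary above.

```r
# Hypothetical toy data, not the assignment data.
set.seed(7)
n <- 400
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.3 + 0.8 * x))
fit <- glm(y ~ x, family = binomial)

# Analytic AME: mean of beta * p * (1 - p).
p_hat        <- fitted(fit)
ame_analytic <- mean(coef(fit)["x"] * p_hat * (1 - p_hat))

# Numerical AME: central finite difference of the predicted probability.
h    <- 1e-5
p_up <- predict(fit, newdata = data.frame(x = x + h), type = "response")
p_dn <- predict(fit, newdata = data.frame(x = x - h), type = "response")
ame_numeric <- mean((p_up - p_dn) / (2 * h))

print(c(analytic = ame_analytic, numeric = ame_numeric))

# Under separation, p * (1 - p) is close to zero for every observation,
# so even a sizeable coefficient yields a negligible AME.
p_sep         <- c(rep(1e-12, 5), rep(1 - 1e-12, 5))  # invented values
ame_separated <- mean(1.341 * p_sep * (1 - p_sep))
print(ame_separated)
```

Read this way, an estimate on the order of 1e-09 says more about the fitted probabilities being pinned at 0 or 1 than about home runs having no effect.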

Predictions and First Differences at Set Values

## A `clarify_est` object (from `sim_setx()`)
##  - Predicted outcomes at specified values
##    + Predictors set: At-bat, Hits
##    + All others set at typical values (see `help("sim_setx")` for definition)
##  - 1000 simulated values
##  - 4 quantities estimated:                                         
##  At-bat = 5000, Hits = 1500  2.220446e-16
##  At-bat = 10000, Hits = 1500 2.220446e-16
##  At-bat = 5000, Hits = 3000  2.220446e-16
##  At-bat = 10000, Hits = 3000 2.220446e-16
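
All four predictions equal 2.220446e-16, which is the double-precision machine epsilon rather than a meaningful probability. A likely explanation, offered here as an assumption, is that the linear predictor is extremely negative at these settings and the logit inverse link in `binomial()` clamps its output at that floor. The sketch below checks the floor and rebuilds the 2 x 2 grid of set values; the column names `at_bat` and `hits` are hypothetical stand-ins for the original variables.

```r
# Machine epsilon for doubles prints as 2.220446e-16.
eps <- .Machine$double.eps
print(eps)

# The inverse logit link used by binomial() is bounded away from zero
# for very negative linear predictors, unlike plogis().
p_floor <- binomial()$linkinv(-50)
p_exact <- plogis(-50)
print(c(linkinv = p_floor, plogis = p_exact))

# Grid of set values in the same order as the printed output:
# the first variable varies fastest.
grid <- expand.grid(at_bat = c(5000, 10000), hits = c(1500, 3000))
print(grid)
```

Because every cell of the grid hits the same floor, first differences between these settings would be zero and carry no information about At-bat or Hits.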

Discussion of Results

Discuss the results obtained with the clarify package, including the average marginal effects and the predictions at set values. Compare them with the results from the previous assignment and highlight any new insights that clarify provides.

Include only the results that support your discussion, and suppress irrelevant output.

Conclusion

Summarize the key findings and their implications. Explain how the clarify package has improved your understanding of the model's results, and note any potential applications of these insights.