Assignment 5

Donna Parker

2025-03-17

This analysis uses a logistic regression model to explore whether the weight of a car (wt) affects the probability that it has a manual transmission (am). The data set contains information on 32 cars, including their weight, miles per gallon (mpg), and transmission type (automatic or manual). The goal is to estimate the average marginal effect of weight on the probability of having a manual transmission and to interpret the results. We also calculate predictions and first differences to compare the probability of a manual transmission across scenarios. Finally, we visualize the results with a histogram of the simulated average marginal effects (AMEs) for weight.

rm(list = ls())
gc()
##          used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 562212 30.1    1263128 67.5   686382 36.7
## Vcells 997577  7.7    8388608 64.0  1876044 14.4
# Load necessary packages

library(clarify)
library(ggplot2)
library(rmdformats)
data(mtcars)
head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

This data set consists of 32 observations and 11 variables. The variable names appear as the column headers in the output above.

# Run the logistic regression model

model1 <- glm(am ~ wt + mpg, family = binomial, data = mtcars)

# Simulate the model
sim_object <- sim(model1)

# Calculate average marginal effects for weight (wt)

ame_results <- sim_ame(sim_object, var = "wt")

# View the summary of the results
summary(ame_results)
##             Estimate  2.5 % 97.5 %
## E[dY/d(wt)]   -0.512 -0.723 -0.143

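This result indicates that, averaged over the cars in the sample, a one-unit increase in wt (1,000 lbs) is associated with a decrease of about 0.512 in the probability of having a manual transmission, i.e. roughly 51.2 percentage points. Because the AME is the slope of the probability curve, it is best read as a rate of change for small changes in weight rather than a literal 51.2% drop. The relationship between weight and transmission type is negative.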
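As a rough cross-check on what the AME measures, the sketch below recomputes it by hand in base R. It assumes the standard logistic-regression result that the derivative of the predicted probability with respect to wt is beta_wt * p * (1 - p). The names `ame_analytic`, `ame_numeric`, and `eps` are illustrative, and the internal computation in clarify may differ in detail.

```r
# Refit the same model as above (mtcars ships with base R)
model <- glm(am ~ wt + mpg, family = binomial, data = mtcars)

# Fitted probability of a manual transmission for each car
p_hat <- predict(model, type = "response")

# Analytic marginal effect per car: beta_wt * p * (1 - p), then average over cars
ame_analytic <- mean(coef(model)[["wt"]] * p_hat * (1 - p_hat))

# Numerical check: nudge wt by a small amount (assumed step size)
# and average the resulting change in predicted probability
eps <- 1e-5
shifted <- transform(mtcars, wt = wt + eps)
p_shifted <- predict(model, newdata = shifted, type = "response")
ame_numeric <- mean((p_shifted - p_hat) / eps)

c(analytic = ame_analytic, numeric = ame_numeric)
```

The two values should agree closely, which is a quick way to see that the AME is an average of per-car slopes rather than a single slope at the mean.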

Confidence Interval = (2.5% and 97.5%)

The confidence interval ranges from -0.723 to -0.143. Because the interval is built from random simulation draws and no seed is set, its exact bounds change slightly from run to run. It gives the range of average marginal effects that are plausible given the data at the 95% confidence level. The entire interval is negative, which supports the conclusion that increased weight is associated with a lower probability of manual transmission.
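To illustrate where a simulation-based interval like this comes from, here is a minimal base R sketch. It assumes coefficient vectors are drawn from a multivariate normal distribution centred on the fitted coefficients, and that the interval is taken from the 2.5th and 97.5th percentiles of the simulated AMEs. The seed, the number of draws (`n_sims`), and the helper `ame_for()` are assumptions made for this example, not values taken from the analysis above.

```r
set.seed(123)  # assumed seed, used only so the sketch is reproducible

model <- glm(am ~ wt + mpg, family = binomial, data = mtcars)
beta_hat <- coef(model)
V <- vcov(model)
X <- model.matrix(model)
wt_idx <- which(names(beta_hat) == "wt")

n_sims <- 1000  # assumed number of simulation draws

# Draw coefficient vectors from N(beta_hat, V) using the Cholesky factor of V
L <- t(chol(V))  # lower triangular, so L %*% t(L) equals V
z <- matrix(rnorm(length(beta_hat) * n_sims), nrow = length(beta_hat))
draws <- beta_hat + L %*% z  # one column per simulated coefficient vector

# AME of wt implied by one coefficient vector b
ame_for <- function(b) {
  p <- plogis(drop(X %*% b))
  mean(b[wt_idx] * p * (1 - p))
}

sim_ames <- apply(draws, 2, ame_for)

# Percentile interval from the simulated AMEs
ci <- quantile(sim_ames, probs = c(0.025, 0.975))
ci
```

Changing or removing the seed shifts the bounds slightly, which is why intervals from separate runs need not match to the third decimal place.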

Key Takeaways:

Weight has a statistically significant effect on the likelihood of manual transmission because the confidence interval does not include 0.

The effect is negative, indicating that heavier cars are less likely to have manual transmissions.

Next, we will calculate predictions and first differences to compare the probabilities of manual transmission for specific scenarios. We will use the sim_setx() function to generate predictions and first differences for two different scenarios: one with a car weighing 3,000 lbs and getting 25 mpg, and another with a car weighing 4,000 lbs and getting 20 mpg. We will then interpret the results and discuss the implications of these findings.

# Predictions and First Differences
# Use the sim_setx() function to calculate predictions and first differences at specific values of your predictors.

# Generate predictions and first differences
predictions <- sim_setx(
  sim_object,
  x = list(wt = 3, mpg = 25),  # Set the baseline values
  x1 = list(wt = 4, mpg = 20) # Set the alternative values
)

# View the summary of the predictions
summary(predictions)
##                   Estimate     2.5 %    97.5 %
## wt = 3, mpg = 25  1.88e-01  2.10e-02  7.34e-01
## wt = 4, mpg = 20  1.90e-03  2.07e-05  2.10e-01
## FD               -1.86e-01 -6.32e-01 -1.88e-02
# This will compare the predicted probabilities of having a manual transmission (am = 1) when wt and mpg are set to the given values. The first difference is the change in the probability between the two scenarios.

Predictions for Specific Scenarios: (wt = 3, mpg = 25 and wt = 4, mpg = 20)

These are the predicted probabilities of having a manual transmission (am = 1) for the two specified scenarios in my sim_setx() function:

wt = 3, mpg = 25 (First Scenario):

Estimate: 1.88e-01 or 0.188. This means the model predicts an 18.8% chance of having a manual transmission for a car weighing 3,000 lbs with 25 mpg.

Confidence Interval (2.5% - 97.5%): Ranges from 2.10e-02 (2.10%) to 7.34e-01 (73.4%). This is a wide interval, so there is considerable uncertainty around the estimate.

wt = 4, mpg = 20 (Second Scenario):

Estimate: 1.90e-03 or 0.0019. This means the model predicts only a 0.19% chance of having a manual transmission for a car weighing 4,000 lbs with 20 mpg.

Confidence Interval (2.5% - 97.5%): Ranges from 2.07e-05 (0.0021%) to 2.10e-01 (21.0%). The interval is wide relative to the estimate, but the probability remains quite low.

First Difference (FD):

Estimate for FD (-1.86e-01 or -0.186): This represents the difference in probabilities between the two scenarios (wt = 3, mpg = 25 and wt = 4, mpg = 20):

The probability of having a manual transmission decreases by 18.6 percentage points when moving from the first scenario to the second.

Confidence Interval for FD (2.5% - 97.5%): The range is -6.32e-01 (-63.2 percentage points) to -1.88e-02 (-1.88 percentage points), meaning there is strong evidence that the probability decreases (as the entire interval is negative).
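The point estimates for the two scenarios and their first difference can be reproduced without simulation. The sketch below uses `predict()` from base R on the same model; the names `scenarios`, `p_scen`, and `first_diff` are illustrative. It gives point estimates only and does not reproduce the simulation-based intervals.

```r
model <- glm(am ~ wt + mpg, family = binomial, data = mtcars)

# The two scenarios compared above: (wt = 3, mpg = 25) and (wt = 4, mpg = 20)
scenarios <- data.frame(wt = c(3, 4), mpg = c(25, 20))

# Predicted probability of a manual transmission (am = 1) in each scenario
p_scen <- unname(predict(model, newdata = scenarios, type = "response"))

# First difference: second scenario minus first scenario
first_diff <- p_scen[2] - p_scen[1]

data.frame(scenarios, probability = p_scen)
first_diff
```

These values should line up with the Estimate column of the summary above; the interval columns require the simulated coefficient draws.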

Key Takeaways:

Lighter and more fuel-efficient cars (e.g., wt = 3, mpg = 25) are more likely to have a manual transmission.

Heavier and less fuel-efficient cars (e.g., wt = 4, mpg = 20) are much less likely to have a manual transmission.

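The first difference (FD) shows that moving from the lighter, more fuel-efficient scenario to the heavier, less fuel-efficient one lowers the predicted probability of a manual transmission, and its confidence interval excludes zero. Because both wt and mpg change between the two scenarios, the FD reflects their combined change rather than the separate effect of either variable.

This result supports the idea that manual transmissions are more commonly found in smaller, more fuel-efficient cars.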

# Plotting the simulation object

# Convert the simulated AMEs to a data frame (one row per simulation draw)
ame_df <- data.frame(AME = as.numeric(ame_results))

# Create a histogram of the simulated AMEs
ggplot(ame_df, aes(x = AME)) +
  geom_histogram(binwidth = 0.05, fill = "blue", color = "black") +
  geom_vline(xintercept = attributes(ame_results)$original, color = "red", linetype = "dashed") +
  labs(
    title = "Distribution of Simulated AMEs for Weight (wt)",
    x = "Average Marginal Effect (AME)",
    y = "Frequency"
  ) +
  theme_minimal()

This histogram shows the distribution of the simulated average marginal effects (AMEs) for weight (wt) on the likelihood of having a manual transmission (am = 1).

The histogram is roughly bell-shaped, which suggests that the simulated AMEs are normally distributed, with most values concentrated around the center (mean) estimate.

Center of the Distribution:

The majority of AMEs cluster near the red dashed line, which represents the original point estimate (around -0.512 in this case). This means the original AME aligns well with the central tendency of the simulations.

Spread/Width of the Histogram:

The spread of the histogram shows the variability in the simulations. If the values are tightly packed near the center, it suggests low uncertainty. A wider spread indicates more variability in the effect of weight on the likelihood of having a manual transmission.

Negative AMEs:

The fact that all the values (or most) are negative indicates that as weight increases, the probability of having a manual transmission decreases. In other words, heavier cars are less likely to have manual transmissions.

Confidence Intervals:

The 2.5% and 97.5% bounds from the earlier summary correspond to the 2.5th and 97.5th percentiles of the simulated values, so they sit inside the tails of this histogram rather than at its outer edges. The region between them represents the range of plausible values for the AME.

Key Insights:

The histogram is consistent with the original AME estimate (-0.512), as the simulated values cluster around it.

The predominance of negative values strengthens the conclusion that car weight (wt) negatively affects the likelihood of a manual transmission.

The red dashed line visually highlights the original estimate in the context of the entire distribution, helping to assess how typical or unusual it is compared to the simulated outcomes.