Problem

A rookie is brought to a baseball club on the assumption that he will have a .300 batting average. (Batting average is the ratio of the number of hits to the number of times at bat.) In the first year, he comes to bat 300 times and his batting average is .267. Assume that his at bats can be considered Bernoulli trials with probability .3 for success. Could such a low average be considered just bad luck or should he be sent back to the minor leagues? Comment on the assumption of Bernoulli trials in this situation.

set.seed(1125)
# Set the parameters
n_at_bats <- 300  # Number of times at bat
n_seasons <- 10000  # Number of seasons to simulate

# Simulate batting average over multiple seasons
batting_averages <- replicate(n_seasons, mean(rbinom(n_at_bats, 1, 0.3)))

# Calculate the mean and standard deviation of the batting averages
mean_batting_average <- mean(batting_averages)
sd_batting_average <- sd(batting_averages)

# Use CLT to approximate the distribution of batting average
# Generate random samples from a normal distribution
simulated_batting_averages <- rnorm(n_seasons, mean = mean_batting_average, sd = sd_batting_average)

# Calculate the proportion of simulated values below 0.268 (a cutoff just above the observed .267)
low_average_prop <- mean(simulated_batting_averages < 0.268)

# Output the result
print(low_average_prop)
## [1] 0.1125

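Statistical Significance: About 11.25% of the values drawn from the normal approximation fall below 0.268. In other words, under the assumed model a batting average of .267 or worse would occur by chance alone in roughly one season out of nine. That is well above the conventional 5% threshold, so the low average is consistent with bad luck and does not, on its own, justify sending him back to the minor leagues.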
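As a cross-check on the simulated figure, the same tail probability can be computed directly, without random draws. The sketch below assumes that the reported .267 corresponds to 80 hits in 300 at bats (80/300 = 0.2667); the names `observed_hits` and `p_hit` are introduced here for illustration only.

```r
# Direct tail probabilities for 80 or fewer hits in 300 at bats
n_at_bats <- 300
p_hit <- 0.3
observed_hits <- 80  # assumption: .267 is 80/300 rounded to three decimals

# Exact binomial probability of observing 80 or fewer hits
exact_tail <- pbinom(observed_hits, size = n_at_bats, prob = p_hit)

# Normal (CLT) approximation to the hit count, with a continuity correction
mu <- n_at_bats * p_hit                          # expected number of hits
sigma <- sqrt(n_at_bats * p_hit * (1 - p_hit))   # standard deviation of hits
normal_tail <- pnorm(observed_hits + 0.5, mean = mu, sd = sigma)

print(c(exact = exact_tail, normal = normal_tail))
```

Both values should land close to the simulated proportion, which suggests the conclusion does not depend on the particular random seed.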

Bernoulli Trials Assumption: Modeling each at bat as a Bernoulli trial with success probability 0.3 is a simplification. It requires that the probability of a hit be the same in every at bat and that at bats be independent of one another. In practice the probability of a hit can change with pitcher skill, the hitter's own form, weather conditions, and more, and hot or cold streaks would make successive at bats dependent. The number of hits in a season may therefore not follow a binomial distribution exactly, and the true spread of season batting averages may differ from what the model predicts.
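To get a feel for how much a violation of these assumptions could matter, one can compare the independent model with a deliberately simple "streaky" alternative. The sketch below is purely hypothetical: it splits the season into 10 stretches of 30 at bats, and each stretch is either cold (hit probability 0.2) or hot (0.4) with equal chance, so the long-run average is still 0.3. The block structure and the two probabilities are illustrative assumptions, not estimates from real data.

```r
set.seed(1125)
n_seasons <- 10000
n_blocks <- 10     # hypothetical: 10 stretches per season
block_size <- 30   # 30 at bats per stretch, 300 in total
n_at_bats <- n_blocks * block_size

# Independent at bats with a constant hit probability of 0.3
iid_avg <- rbinom(n_seasons, size = n_at_bats, prob = 0.3) / n_at_bats

# Streaky at bats: each stretch is cold (p = 0.2) or hot (p = 0.4)
simulate_streaky_season <- function() {
  block_p <- sample(c(0.2, 0.4), n_blocks, replace = TRUE)
  sum(rbinom(n_blocks, size = block_size, prob = block_p)) / n_at_bats
}
streaky_avg <- replicate(n_seasons, simulate_streaky_season())

# Compare the spread and the chance of a season below 0.268
print(c(sd_iid = sd(iid_avg), sd_streaky = sd(streaky_avg)))
print(c(tail_iid = mean(iid_avg < 0.268), tail_streaky = mean(streaky_avg < 0.268)))
```

Under this toy model the season averages are more spread out than under independent trials, so a .267 season becomes even less surprising for a true .300 hitter.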