Using the Monte Carlo method with \(n = 5000\) points, the following simulation estimates \(\pi\), generating the \(x\) and \(y\) values from the uniform distribution on \([0, 1]\):
n <- 5000 # number of simulations
Q <- 0    # counter for points inside the circle
for (i in 1:n) {
  # generate random numbers
  x <- runif(1, 0, 1)
  y <- runif(1, 0, 1)
  # if inside circle, increment counter
  if (x^2 + y^2 <= 1) {
    Q <- Q + 1
  }
}
# calculate ratio of areas
ratio <- Q / n
# estimate pi
pi_est <- ratio * 4
The estimated value produced by the simulation is \(\pi \approx 3.1616\).
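For comparison, the same estimate can be computed without an explicit loop (a minimal vectorized sketch; the seed value is illustrative, and the result will differ slightly from the loop above due to the random draws):

# vectorized version of the same estimate
set.seed(1) # illustrative seed for reproducibility
x <- runif(n)
y <- runif(n)
pi_est_vec <- 4 * mean(x^2 + y^2 <= 1)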
A function is written to generate the numbers using the middle-square method:
mid_sq <- function(seed, n) {
  # start sequence with seed
  gen <- seed
  # get number of digits in the seed
  len <- nchar(as.character(seed))
  # determine which digits to grab from the squared numbers
  dig_start <- (2 * len) / 4 + 1
  dig_end <- 3 * (2 * len) / 4
  for (i in 1:n) {
    # square the last number generated
    new_gen <- gen[length(gen)]^2
    # convert to a fixed-notation string for subsetting
    # (sprintf avoids scientific notation for large squares)
    new_gen <- sprintf("%.0f", new_gen)
    # pad with leading zeros to the proper length
    lead_zero <- len * 2 - nchar(new_gen)
    new_gen <- paste0(paste(rep('0', lead_zero), collapse = ''), new_gen)
    # get the middle digits
    new_gen <- substr(new_gen, dig_start, dig_end)
    # append the new number to the sequence
    gen <- c(gen, as.numeric(new_gen))
  }
  gen
}
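As a quick check, the three sequences tabulated below can be reproduced by calling the function with each seed for 20 iterations (the `seq_*` names are illustrative):

# reproduce sequences a, b, and c from the table below
seq_a <- mid_sq(1009, 20)
seq_b <- mid_sq(653217, 20)
seq_c <- mid_sq(3043, 20)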
| \(i\) | a | b | c |
|---|---|---|---|
| 0 | 1009 | 653217 | 3043 |
| 1 | 180 | 692449 | 2598 |
| 2 | 324 | 485617 | 7496 |
| 3 | 1049 | 823870 | 1900 |
| 4 | 1004 | 761776 | 6100 |
| 5 | 80 | 302674 | 2100 |
| 6 | 64 | 611550 | 4100 |
| 7 | 40 | 993402 | 8100 |
| 8 | 16 | 847533 | 6100 |
| 9 | 2 | 312186 | 2100 |
| 10 | 0 | 460098 | 4100 |
| 11 | | 690169 | 8100 |
| 12 | | 333248 | 6100 |
| 13 | | 54229 | 2100 |
| 14 | | 940784 | 4100 |
| 15 | | 74534 | 8100 |
| 16 | | 555317 | |
| 17 | | 376970 | |
| 18 | | 106380 | |
| 19 | | 316704 | |
| 20 | | 301423 | |
Sequence a rapidly degenerates to zero. Sequence b also degenerates, but far less rapidly than sequence a. Sequence c cycles beginning with the fourth number generated.
The total of the two dice is simulated by running a Monte Carlo simulation that draws one roll from each weighted die:
# load probabilities
die_1 <- c(0.1, 0.1, 0.2, 0.3, 0.2, 0.1)
die_2 <- c(0.3, 0.1, 0.2, 0.1, 0.05, 0.25)
# convert to cumulative probabilities
die_1 <- cumsum(die_1)
die_2 <- cumsum(die_2)
# set up container for sums
tot <- numeric(300)
for (i in 1:300) {
  # generate two random numbers
  d1 <- runif(1)
  d2 <- runif(1)
  # map each value to a face via the cumulative distribution
  d1 <- min(which(d1 < die_1))
  d2 <- min(which(d2 < die_2))
  # record the sum
  tot[i] <- d1 + d2
}
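The counts in the table below can be produced by tabulating the simulated sums (a small sketch; variable names are illustrative):

# tabulate the simulated sums and their proportions
counts <- table(tot)
props <- counts / length(tot)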
The count of observations by sum is below:
| Sum | Count | Proportion |
|---|---|---|
| 2 | 13 | 0.0433 |
| 3 | 9 | 0.0300 |
| 4 | 30 | 0.1000 |
| 5 | 43 | 0.1433 |
| 6 | 48 | 0.1600 |
| 7 | 51 | 0.1700 |
| 8 | 43 | 0.1433 |
| 9 | 23 | 0.0767 |
| 10 | 20 | 0.0667 |
| 11 | 10 | 0.0333 |
| 12 | 10 | 0.0333 |
To generate the probabilities for the Monte Carlo simulation, the provided odds must be converted to probabilities: \[n:1 \ \ \textrm{odds} \implies \frac{1}{n+1} \ \ \textrm{probability}\] For example, 7:1 odds correspond to a probability of \(\frac{1}{7+1} = 0.125\). The individual probabilities do not sum to exactly 1, so each probability is divided by the sum of the probabilities to give the adjusted probability.
# posted odds (n in n:1) for each horse
odds <- c(7, 5, 9, 12, 4, 35, 15, 4)
# convert odds to raw probabilities
prob <- 1 / (odds + 1)
# normalize so the probabilities sum to 1
prob.adj <- prob / sum(prob)
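A quick check confirms the normalization:

# equals 1 (up to floating-point rounding)
sum(prob.adj)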
| Horse | Probability | Adjusted Probability |
|---|---|---|
| Euler’s Folly | 0.125 | 0.1304 |
| Leapin’ Leibniz | 0.1667 | 0.1738 |
| Newton Lobell | 0.1 | 0.1043 |
| Count Cauchy | 0.07692 | 0.08022 |
| Pumped up Poisson | 0.2 | 0.2086 |
| Loping L’Hopital | 0.02778 | 0.02897 |
| Steamin’ Stokes | 0.0625 | 0.06518 |
| Dancin’ Dantzig | 0.2 | 0.2086 |
These adjusted probabilities are used to create the simulation:
# set up counter for wins
wins <- rep(0, length(prob.adj))
# get cumulative distribution
cum_prob <- cumsum(prob.adj)
# run simulation
for (i in 1:1000) {
  # generate random number
  w <- runif(1)
  # determine winning horse
  w <- min(which(w < cum_prob))
  # increment horse's win count
  wins[w] <- wins[w] + 1
}
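As a sanity check (a small sketch; `expected` is an illustrative name), the simulated win counts can be compared to the counts expected under the adjusted probabilities:

# expected wins out of 1000 races under the adjusted probabilities
expected <- round(1000 * prob.adj)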
| Horse | Wins |
|---|---|
| Euler’s Folly | 147 |
| Leapin’ Leibniz | 172 |
| Newton Lobell | 102 |
| Count Cauchy | 91 |
| Pumped up Poisson | 199 |
| Loping L’Hopital | 20 |
| Steamin’ Stokes | 59 |
| Dancin’ Dantzig | 210 |
The horse with the most wins was Dancin’ Dantzig – this is unsurprising, as it was tied for the best odds (4:1). Loping L’Hopital had the fewest wins – this too is unsurprising, since it had the worst odds (35:1).
Assuming that the lag can only take the discrete values given (i.e. a minimum of 2 days, maximum of 7, and no half days), the simulation can be set up as follows:
# observed counts of lag times (2 through 7 days) out of 100
lag_count <- c(10, 25, 30, 20, 13, 2)
# calculate cumulative probabilities
lag_prob <- cumsum(lag_count / 100)
# set up counter for lags
lag_days <- rep(0, length(lag_count))
# run simulation
for (i in 1:1000) {
  # generate random number
  l <- runif(1)
  # determine lag duration based on the number
  l <- min(which(l < lag_prob))
  # increment lag count
  lag_days[l] <- lag_days[l] + 1
}
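Since the original counts were out of 100 observations, the expected occurrences in 1000 runs are simply ten times those counts, which gives a baseline for comparison with the table below (a quick sketch; `expected_lags` is an illustrative name):

# expected occurrences out of 1000 runs
expected_lags <- lag_count * 10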
| Lag time (days) | Occurrences |
|---|---|
| 2 | 113 |
| 3 | 247 |
| 4 | 302 |
| 5 | 182 |
| 6 | 136 |
| 7 | 20 |
Based on the observed occurrences, the simulation results align with the given distribution: for example, a lag of 4 days occurred 302 times in 1000 runs, against an expected proportion of 0.30.
Table 5.18 contains two subtables – one for arrivals, and one for service times. Using the midpoints of each interval:
arrival <- data.frame(
  times = c(19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5, 99.5, 109.5, 119.5, 129.5, 140),
  probs = c(0.009, 0.029, 0.035, 0.051, 0.09, 0.161, 0.2, 0.172, 0.125, 0.071, 0.037, 0.017, 0.003)
)
service <- data.frame(
  times = c(47, 52, 57, 62, 67, 72, 77, 82, 87.5),
  probs = c(0.017, 0.045, 0.095, 0.086, 0.13, 0.185, 0.208, 0.143, 0.091)
)
# convert to cumulative probabilities
arrival$probs <- cumsum(arrival$probs)
service$probs <- cumsum(service$probs)
To establish a smooth polynomial function, plots of times vs. probabilities are created for each set of observations:
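These plots can be generated with base graphics (a minimal sketch; the axis labels are illustrative):

# plot time vs. cumulative probability for each dataset
plot(arrival$probs, arrival$times, xlab = "Cumulative probability", ylab = "Time between arrivals")
plot(service$probs, service$times, xlab = "Cumulative probability", ylab = "Service time")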
The curve for the arrival times shown above appears to resemble a cubic (i.e. the shape of a plot of \(x\) vs. \(y^3\)). As such, a third-degree polynomial is fit to the data:
# fit a third-degree polynomial of time on cumulative probability
mod_arr <- lm(times ~ poly(probs, 3, raw = TRUE), data = arrival)
# store fitted values for plotting against the data
arrival$preds <- predict(mod_arr, arrival)
The best-fit third-degree polynomial is \[b = 18.4377 + 302.2950 x - 561.4642 x^2 + 371.2266 x^3\] This appears reasonable when plotted against the original data:
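A sketch of that comparison plot (base graphics; labels illustrative):

# overlay the cubic fit on the observed arrival data
plot(arrival$probs, arrival$times, xlab = "Cumulative probability", ylab = "Time between arrivals")
lines(arrival$probs, arrival$preds)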
To fit the service times with a one-term model, a number of transforms are examined. A transformation using the square root of the cumulative probability appears to provide a near-linear relationship:
A linear equation is fit for this one-term model:
# one-term model: service time as a linear function of sqrt(probability)
mod_ser_sq <- lm(times ~ I(sqrt(probs)), data = service)
The equation for this line is given by \[u = 40.4257 + 43.9362 \sqrt{x}\]
Plotting this equation against the original data, it appears reasonable:
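A sketch of that plot (base graphics; labels illustrative):

# overlay the one-term square-root fit on the observed service data
plot(service$probs, service$times, xlab = "Cumulative probability", ylab = "Service time")
curve(40.4257 + 43.9362 * sqrt(x), add = TRUE)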