ISDA609: Mathmatical Modeling Techniques for Data Analytics \ Assignment:04

Page 191, Question 3

Using Monte Carlo simulation, write an algorthm to calculate an approximation to \(\pi\) by considering the number of random points selected inside the quarter circle:\[ Q: x^2 + y^2 = 1, x \geq 0, y \geq 0 \] where the quarter circle is taken to be inside the square \[ S:0\leq x \leq 1 and 0 \leq y \leq 1\] Use the equation \(\pi / 4\) = area Q/area S.

Answer Page 191, Question 3

We begin by finding an approximation of the area under a nonnegative curve. Specifically, suppose y = f(x) is some given continuous function satisfying 0<=f(x)<= M over closed interval a<=x<=b. Here the number M is simply some constant that bounds of the function . In our case M= 1. f(x) = \(x^2 + y^2\)

counter=0;  
mcx=runif(10000, 0, 1)      
mcy=runif(10000, 0, 1) 
s <- mcx^2 + mcy^2 
for (i in 1: length(s)){ 
        if (s[i]<=1) counter=counter+1
    }

Therefore the approximation to \(\pi\) is

pi= 4*counter/length(s)
pi

## [1] 3.1592

Let’s show the results graphically as well…

qctest<- mcx^2 + mcy^2 <= 1
df= data.frame(mcx,mcy,qctest)
for (i in 1:10000) { if (df$qctest[i]==TRUE)  df$inside[i]<- 1 else  df$inside[i]<- 0} 

library(ggplot2)

## Warning: package 'ggplot2' was built under R version 3.2.3

p <- ggplot(data=df) + geom_point(aes(x=mcx, y=mcy,color=inside))+
                      scale_colour_gradient(high="green", low="gray")
p

Page 194, Question 1

Use the middle-square method to generate
1. 10 random numbers using \(x_0\) = 1009
2. 20 random numbers using \(x_0\) =653217
3. 15 random numbers using \(x_0\) = 3043
4. Comment about the results of each sequence. Was there cycling? Did each sequence degenerate rapidly?

Answer Page 194, Question 1

Creating a function to generate Random numbers using Middle-square Method

library(stringr)

randomgen<- function(x, l) {
  
  y= x^2
    lenx<- floor (log10 (abs (x))) + 1
  leny<- floor (log10 (abs (y))) + 1
  #l=4
  d = l*2 - leny
  
  if (x==0) c=0 else {
  if (d > 0 ) {
    #d = lenx*2 - leny
    if (d == l) {  stry<- as.character(y)
                  r<- str_sub(stry, 0, l/2)
                 c<- as.numeric(r, NA.rm = TRUE)
                 }else
                 
                   if (d > l) {  if (floor(log10 (abs (y))) + 1 == 1) {c=0} else {
                     stry<- as.character(y)
                     r<- str_sub(stry, 0, (l/2)-1)
                     c<- as.numeric(r, NA.rm = TRUE)
                                 }
                   }else 
                     {
    i = lenx / 2
    stry<- as.character(y)
    r<- str_sub(stry, i, leny - i)
    c<- as.numeric(r, NA.rm = TRUE)
                 }
                 }
    
  if (d == 0 )  {  
    i = lenx / 2
    stry<- as.character(y)
    r<- str_sub(stry, i+1, leny - i)
    c<- as.numeric(r, NA.rm = TRUE)
  }
  }
  return(c)   
}    


  
gen <- function(n, seed, l)
{
  randoms <- c()
  xi <- seed
  for(i in 1:n)
  {
    xi <- randomgen(xi,l )
    randoms[i] <- xi
  }
  return (c(seed,randoms))
}

a. 10 random numbers using \(x_0\) = 1009

gen(10,1009, 4)

##  [1] 1009  180  324 1049 1004   80   64   40   16    2    0

Using middle square method to generate random numbers with seed 1009, do

lead to degenerating results.

Now let’s make sure we clearly see the degenerating phenomenon by generating 20 random numbers.

gen(20,1009, 4)

##  [1] 1009  180  324 1049 1004   80   64   40   16    2    0    0    0    0
## [15]    0    0    0    0    0    0    0

b. 20 random numbers using \(x_0\) =653217

gen(20,653217, 6)

##  [1] 653217 692449 485617 823870 761776 302674 611550 993402 847533 312186
## [11] 460098 690169 333248  54229 940784  74534 555317 376970 106380 316704
## [21] 301423

Using middle square method to generate 20 random numbers with seed 653217 , does not

exhibit degenerating or cycling. However, as we increase the generated number to over

258, it started degenerating and cycling as per below

x<- gen(400,653217, 6)
# finding the numbers between the 258th and 300th positions:

x[258:400]

##   [1] 130  16  25  62  38  14  19  36  12  14  19  36  12  14  19  36  12
##  [18]  14  19  36  12  14  19  36  12  14  19  36  12  14  19  36  12  14
##  [35]  19  36  12  14  19  36  12  14  19  36  12  14  19  36  12  14  19
##  [52]  36  12  14  19  36  12  14  19  36  12  14  19  36  12  14  19  36
##  [69]  12  14  19  36  12  14  19  36  12  14  19  36  12  14  19  36  12
##  [86]  14  19  36  12  14  19  36  12  14  19  36  12  14  19  36  12  14
## [103]  19  36  12  14  19  36  12  14  19  36  12  14  19  36  12  14  19
## [120]  36  12  14  19  36  12  14  19  36  12  14  19  36  12  14  19  36
## [137]  12  14  19  36  12  14  19

#x[258:300]

hist(x, horizon=TRUE)

## Warning in plot.window(xlim, ylim, "", ...): "horizon" is not a graphical
## parameter

## Warning in title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...):
## "horizon" is not a graphical parameter

## Warning in axis(1, ...): "horizon" is not a graphical parameter

## Warning in axis(2, ...): "horizon" is not a graphical parameter

#gen(258,653217, 6)

c. 15 random numbers using \(x_0\) = 3043

gen(15,3043, 4)

##  [1] 3043 2598 7496 1900 6100 2100 4100 8100 6100 2100 4100 8100 6100 2100
## [15] 4100 8100

Using middle square method to generate 15 random numbers with seed 3043, does lead

to cycling right after the fourth iteration. The sequence (6100, 2100, 4100, 8100)

keeps repeating right after the fourth iteration.

Hence, the series begins oscillations at the fourth iteration.

to see the cycling clearly, lets try to generate 30 random numbers

gen(30,3043, 4)

##  [1] 3043 2598 7496 1900 6100 2100 4100 8100 6100 2100 4100 8100 6100 2100
## [15] 4100 8100 6100 2100 4100 8100 6100 2100 4100 8100 6100 2100 4100 8100
## [29] 6100 2100 4100

Page 201, Question 4

Horse Race- Construct and perform a Monte Carlo simulation of a horse race. you can be creative and use odds from newspaper, or simulate the Mathematical Derby with the entries and odd shown in the following table:

# Horses participating
Horses <- c("Euler_Folly", 
            "Leapin_Leibniz", 
            "Newton_Lobell", 
            "Count_Cauchy", 
            "Pumped_Poisson", 
            "Loping_Hopital", 
            "Steamin_Stokes", 
            "Dancin_Dantzig")


s<- 1/((1/7)+(1/5)+(1/9)+(1/12)+(1/4)+(1/35)+(1/15)+(1/4))

# Normalize weights
wts <- c(s/7, s/5, s/9, s/12, s/4, s/35, s/15, s/4) 
# Chances against winning
df_odds <- data.frame(Horses, wts)
df_odds <- df_odds[order(df_odds$wts, decreasing = TRUE), ]

# Simulate race results
sim_results <- table(sample(Horses, 1000, replace=TRUE, prob=wts))

# Order the the simulation results
sim_results <- sim_results[order(sim_results)]

df <- data.frame(sim_results[order(sim_results, decreasing = TRUE)])
names(df) <- "Frequency"
df

##                Frequency
## Pumped_Poisson       233
## Dancin_Dantzig       205
## Leapin_Leibniz       188
## Euler_Folly          125
## Newton_Lobell         94
## Count_Cauchy          77
## Steamin_Stokes        53
## Loping_Hopital        25

row.names(df)[nrow(df)]

## [1] "Loping_Hopital"

From the above table:

The horse that won the most races is the first element in the sorted data frame

df.row.names(df)[1]. In the above simulation it is case Dancin’ Dantzig

The horse that won the fewest races is the last element in the sorted data frame.

row.names(df)[nrow(df)]. In the above simulation, it is `loping L’Hopital

The results are not surprising. They reflect the probabilities of winning with

a small of error especially the first and second position.

below is a graphical representation

Page 211 # 3

In many situations, the time T between deliveries and the order quantity Q is not fixed. Instead, an order is placed for a specific amount of gasoline. Depending on how many orders are placed in a given time interval, the time to fill an order varies.You have no reason to believe that the performance of the delivery operation will change. Therefore, you have examined records for the past 100 deliveries and found the following lag times, or extra days, required to fill your order:

Construct a Monte Carlo simulation for the lag time submodel. If you have a handheld calculator or computer available, test your submodel by running 1000 trials and comparing the number of occurrences of the various lag times with the historical data.

# actual data recorded for last 100 days
lagtime <- 2:7
occurrences <- c(10, 25, 30, 20, 13, 2)

# To simplify the model, We will assume that a lag time on One day never happened. 
# hence the below for simplicity

lagtime <- c(1, lagtime)
occurrences <- c(0, occurrences)

# calculating the probabilities
probs <- occurrences / sum(occurrences)
# Calculate cumulative probabilities
cum.probs <- cumsum(probs)
##

##
x = cum.probs
y = lagtime
par(mfrow = c(2,1))
plot(x, y, main = "approx(.) and approxfun(.)")
points(approx(x, y), col = 2, pch = "*")
points(approx(x, y, method = "linear"), col = 4, pch = "*")
# using our a=10 and b = 1 
abline(1, 10, col="red")
############

# We can now run  Monte Carlo simulation by generating uniformally distributed 
# values between 0 and 1 and using  linear model to simulate lag times.

trials <- 1000
model1 <- approxfun(cum.probs, lagtime, method="linear")
sim <- model1(runif(trials))

## Comparing the historical data with the data from our model:
  
model1probs <- as.numeric(table(round(sim))/trials)
df <- data.frame(lagtime, "initial"=probs, "model"=round(model1probs, 2))
df

##   lagtime initial model
## 1       1    0.00  0.05
## 2       2    0.10  0.18
## 3       3    0.25  0.29
## 4       4    0.30  0.24
## 5       5    0.20  0.17
## 6       6    0.13  0.06
## 7       7    0.02  0.01

ISDA609: Mathmatical Modeling Techniques for Data Analytics \[ \ \] Assignment:04

Mohamed Elmoudni

February 27, 2016