About

In this lab we will focus on sensitivity analysis and Monte Carlo simulations.

Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs. We will use the lpSolveAPI R-package as we did in the previous lab.

Monte Carlo Simulations utilize repeated random sampling from a given universe or population distribution to derive certain results. This type of simulation is known as a probabilistic simulation, as opposed to a deterministic simulation.

An example of a Monte Carlo simulation is the one applied to approximate the value of pi. The simulation is based on generating random points within a unit square and see how many points fall within the circle enclosed by the unit square (marked in red). The value of pi is estimated by the number of points inside the circle over the total number of points generated inside the square. The higher the number of sampled points the closer the result is to the actual result. After selecting 30,000 random points, the estimate for pi is much closer to the actual value within the four decimal points of precision. The interested and curious reader is encouraged to explore the mathematical foundation behind the logic.

In this lab, we will learn how to generate random samples with various simulations and how to run a sensitivity analysis on the marketing use case covered so far.

Setup

Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read carefully the below and follow the instructions to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.

Note

For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai. For clarity, tasks/questions to be completed/answered are highlighted in red color (visible in preview) and numbered according to their particular placement in the task section. Quite often you will need to add your own code chunk.

Execute all code chunks, preview, publish, and submit link on Sakai.


Task 1: Sensitivity Analysis

In order to conduct the sensitivity analysis, we will need to download again the lpSolveAPI package unless you have it already installed in your R environment

# Require will load the package only if not installed 
# Dependencies = TRUE makes sure that dependencies are install
if(!require("lpSolveAPI",quietly = TRUE))
  install.packages("lpSolveAPI",dependencies = TRUE, repos = "https://cloud.r-project.org")

We will revisit and solve again the marketing case discussed in class (also part of previous lab).

# We start with `0` constraint and `2` decision variables. The object name `lpmark` is discretionary.
lpmark = make.lp(0, 2)
# Define type of optimization as maximum and dump the screen output into a `dummy` variable
dummy = lp.control(lpmark, sense="max") 
# Set the objective function coefficients 
set.objfn(lpmark, c(275.691, 48.341))

Add all constraints to the model.

add.constraint(lpmark, c(1, 1), "<=", 350000)
add.constraint(lpmark, c(1, 0), ">=", 15000)
add.constraint(lpmark, c(0, 1), ">=", 75000)
add.constraint(lpmark, c(2, -1), "=", 0)
add.constraint(lpmark, c(1, 0), ">=", 0)
add.constraint(lpmark, c(0, 1), ">=", 0)

Now, view the problem setting in tabular/matrix form. This is a good checkpoint to confirm that our contraints have been properly set.

lpmark
Model name: 
               C1       C2            
Maximize  275.691   48.341            
R1              1        1  <=  350000
R2              1        0  >=   15000
R3              0        1  >=   75000
R4              2       -1   =       0
R5              1        0  >=       0
R6              0        1  >=       0
Kind          Std      Std            
Type         Real     Real            
Upper         Inf      Inf            
Lower           0        0            
# solve
solve(lpmark) 
[1] 0

Next we get the optimum results.

# display the objective function optimum value
get.objective(lpmark)
[1] 43443517
# display the decision variables optimum values
get.variables(lpmark)
[1] 116666.7 233333.3

For the sensitivity part we will add two new code sections to obtain the sensitivity results. Frist, we will obtain the sensitivity values due to changes in the coefficients of the objective function.

# display sensitivity to coefficients of objective function. 
get.sensitivity.obj(lpmark)
$`objfrom`
[1]  -96.6820 -137.8455

$objtill
[1] 1e+30 1e+30

The first part of the output labeled objfrom displays the lower limit/boundary and the output labeled objtill displays the upper limit/boundary of the objective function coefficients sensitivity.

##### 1A) Explain in clear words what the sensitivity boundary values (lower/upper limits) represent in context of the marketing model. Refer to class notes.

The sensitivity boundary values represent how much we can how changes in the coefficients of the objective function will impact the optimum Z value, but not necessarily the optimum solution that are radio and tv. In the cases, we can tell that the lower limit is labeled ‘objfrom’ which the value -96.6820 and -137.8455, and the upper limit is labeled ‘objtill’ which the value is infinity.Also,if the coefficients change out that range, it woule affect the ridao, tv and sales. It says that the coefficients of two decision variable which are optimum solution radio and tv woule not chang by -96.6820 and -137.8455, but it is going to change the optimum value for sales.

Next we will obtain the sensitivity results due to changes in resources impacting the constraints.

# display sensitivity to right hand side constraints. 
# There will be a total of m+n values where m is the number of contraints and n is the number of decision variables
get.sensitivity.rhs(lpmark) 
$`duals`
[1] 124.12433   0.00000   0.00000  75.78333   0.00000   0.00000   0.00000   0.00000

$dualsfrom
[1]  1.125e+05 -1.000e+30 -1.000e+30 -3.050e+05 -1.000e+30 -1.000e+30 -1.000e+30 -1.000e+30

$dualstill
[1] 1.00e+30 1.00e+30 1.00e+30 4.75e+05 1.00e+30 1.00e+30 1.00e+30 1.00e+30

For this exercise we are only interested in the first six values (corresponding to the six constraints) of the output labeled duals.

##### 1B) Explain in clear words what each of the zero and non-zero sensitivity values represent. In your explanation identify the binding/non-binding constraints, the surplus/slack, and the marginal values.

The zero values represent that the constraint is non-binding, which means have surplus/slack and no marginal value, and it is not going to change the optimum value for sales. However, the non-zero means that the constraint is binding with no surplus or slack, and represent the marginal values change in the optimum value by adjusting a binding constraint. In this case, we can tell that there are four non-binding constraints and two binding constraints. For the binding constraint, if we increase in $ 1, the optimum value of sales will increase 124.12433 and 75.78333 as well in the first/fourth constraint.

To acquire a better understanding of the sensitivity results, and to confirm integrity of the calculations, independent tests can be conducted. To demonstrate, we will repeat the linear programming (LP) optimization problem by slightly tweaking one binding constraint and verifying the impact.

##### 1C) Define a new model object lpmark1 and add the constraints exactly as entered at beginning of this task. All being equal, change the budget constraint by a marginal $1 value and solve the LP problem. Note the new optimum value for sales. Calculate the differential change in optimum sales from the earlier computed optimum sales, and compare your calculation to the value obtained from the sensitivity calculation.

# Define a new model object called lpmark1
lpmark1 = make.lp(0, 2)
# Repeat rest of commands with the one constraint change for budget. Solve and display the objective function optimum value
# Define type of optimization as maximum and dump the screen output into a `dummy` variable
dummy = lp.control(lpmark1, sense="max") 
# Set the objective function coefficients 
set.objfn(lpmark1, c(275.691, 48.341))
add.constraint(lpmark1, c(1, 1), "<=", 350001)
add.constraint(lpmark1, c(1, 0), ">=", 15000)
add.constraint(lpmark1, c(0, 1), ">=", 75000)
add.constraint(lpmark1, c(2, -1), "=", 0)
add.constraint(lpmark1, c(1, 0), ">=", 0)
add.constraint(lpmark1, c(0, 1), ">=", 0)
lpmark1
Model name: 
               C1       C2            
Maximize  275.691   48.341            
R1              1        1  <=  350001
R2              1        0  >=   15000
R3              0        1  >=   75000
R4              2       -1   =       0
R5              1        0  >=       0
R6              0        1  >=       0
Kind          Std      Std            
Type         Real     Real            
Upper         Inf      Inf            
Lower           0        0            
# solve
solve(lpmark1) 
[1] 0
# display the objective function optimum value
get.objective(lpmark1)
[1] 43443641
# display the decision variables optimum values
get.variables(lpmark1)
[1] 116667 233334
# display sensitivity to coefficients of objective function. 
get.sensitivity.obj(lpmark1)
$`objfrom`
[1]  -96.6820 -137.8455

$objtill
[1] 1e+30 1e+30
# display sensitivity to right hand side constraints. 
# There will be a total of m+n values where m is the number of contraints and n is the number of decision variables
get.sensitivity.rhs(lpmark1) 
$`duals`
[1] 124.12433   0.00000   0.00000  75.78333   0.00000   0.00000   0.00000   0.00000

$dualsfrom
[1]  1.12500e+05 -1.00000e+30 -1.00000e+30 -3.05001e+05 -1.00000e+30 -1.00000e+30 -1.00000e+30 -1.00000e+30

$dualstill
[1] 1.00000e+30 1.00000e+30 1.00000e+30 4.75002e+05 1.00000e+30 1.00000e+30 1.00000e+30 1.00000e+30

Lpmark sales equal to 43,443,517 and Lpmark1 sales equal to 43,443,641. The $1 change in the first constraint, it would result in a change of 124 dollars in the optimum value of sales. It is clear that the budget constraint is binding, because when I changed the budget constraint, it changed the optimum value for sales.

##### 1D) Based on the previous exercise explain in clear words, and without running another solver again , how would you check the integrity of the other marginal value identified in 1B).

I would check the integrity of the other marginal value from 1B) by increasing the 4th constraint to $1. I would then compare the optimum values for sales and observe the changes that appeared. Finally, I would check to see if the optimum sales value changed by 75.7833.


Task 2: Monte Carlo Simulation

For this task we will be running a Monte Carlo simulation to calculate the probability that the daily return from S&P will be > 5%. We will assume that the historical S&P daily return follows a normal distribution with an average daily return of 0.03 (%) and a standard deviation of 0.97 (%). Consider this as our population.

To begin we will generate 100 random samples from the normal distribution. For the generated samples we will calculate the mean, standard deviation, and probability of occurrence where the simulation result is greater than 5%.

To generate random samples from a normal distribution we will use the rnorm() function in R. In the example below we set the number of runs (or samples) to 100.

# number of simulations/samples
runs = 100
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)
# Mean calculated from the random distribution of samples
average = mean(sims)
average
[1] 0.01210434
# STD calculated from the random distribution of samples
std = sd(sims) 
std
[1] 0.9556717
# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
[1] 0.44

##### 2A) Repeat the above calculations for the two cases where the number of simulations/samples is equal to 1000 and 10000. For each case record the mean, standard deviation, and probability.

# number of simulations/samples
runs = 1000
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)
# Mean calculated from the random distribution of samples
average = mean(sims)
average
[1] 0.03734675
# STD calculated from the random distribution of samples
std = sd(sims) 
std
[1] 0.983215
# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
[1] 0.492
# number of simulations/samples
runs = 10000
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)
# Mean calculated from the random distribution of samples
average = mean(sims)
average
[1] 0.04024144
# STD calculated from the random distribution of samples
std = sd(sims) 
std
[1] 0.9639549
# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
[1] 0.4994

##### 2B) List in seperate lines the values for mean, standard deviation, and probability for all three cases: 100, 1000, and 10000 simulations. Describe how the values change/behave as the number of simulations is increased. What is your best bet on the probability of occurrence greater than 5% and why? How is this behavior similar to the introductory image use case demonstration to calculate pi?

Sample: 100 Mean: 0.01210434 Standard Deviation:0.9556717 Probability: 0.44

Sample: 1000 Mean: 0.03734675 Standard Deviation: 0.983215 Probability: 0.492

Sample: 10000 Mean: 0.04024144 Standard Deviation: 0.9639549 Probability: 0.4994

As the number of simulations increased, the average got closer and closer to 0.03 standard deviation got closer to 0.97. In terms of the probability differences, it stayed around 0.50 which means 50% probability. I would guess that the probability is around 49% because that was about the number for the most accurate simulation. This is similar to the image that was presented in the introductory paragraph because they both make the point that with more samples, the more accurate your results will be.

The last exercise is optional for those interested in further enhancing their subject matter learning, and refining their problem skills in R. Your work will be assessed but you will not be graded for this exercise. You can follow the instructions presented in the video Excel equivalent example at [https://www.youtube.com/watch?v=wKdmEXCvo9s]

##### 2C-Optional!) Repeat the exercise for the S&P daily return where all is equal except we are now interested in the weekly cumulative return and the probability that the weekly cummulative return is greater than 5%. Set the number of simulations to 10000.

# number of simulations/samples
runs = 10000
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)
# Mean calculated from the random distribution of samples
average = mean(sims)
average
[1] 0.02373781
# STD calculated from the random distribution of samples
std = sd(sims) 
std
[1] 0.9663548
# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
[1] 0.48
---
title: "BSAD343 Fall 2018 Lab Worksheet 08"
author: "Yi Pan"
date: "11-06-2018"
output:
  html_notebook: default
  html_document: default
  pdf_document: default
subtitle: Sensitivity Analysis & Simulations (bsad-lab08)
---


### About

In this lab we will focus on sensitivity analysis and Monte Carlo simulations. 

Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs. We will use the `lpSolveAPI` R-package as we did in the previous lab. 

Monte Carlo Simulations utilize repeated random sampling from a given universe or population distribution to derive certain results. This type of simulation is known as a probabilistic simulation, as opposed to a deterministic simulation.

An example of a Monte Carlo simulation is the one applied to approximate the value of `pi`. The simulation is based on generating random points within a unit square and see how many points fall within the circle enclosed by the unit square (marked in red). The value of `pi` is estimated by the number of points inside the circle over the total number of points generated inside the square.  The higher the number of sampled points the closer the result is to the actual result.  After selecting 30,000 random points, the estimate for `pi` is much closer to the actual value within the four decimal points of precision. The interested and curious reader is encouraged to explore the mathematical foundation behind the logic.

![](imgs/montecarlo.gif)

In this lab, we will learn how to generate random samples with various simulations and how to run a sensitivity analysis on the marketing use case covered so far.

### Setup

Remember to always set your working directory to the source file location. Go to 'Session', scroll down to 'Set Working Directory', and click 'To Source File Location'. Read carefully the below and follow the instructions to complete the tasks and answer any questions.  Submit your work to RPubs as detailed in previous notes. 

### Note

For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai.  For clarity, tasks/questions to be completed/answered are highlighted in red color (visible in preview) and numbered according to their particular placement in the task section.  Quite often you will need to add your own code chunk.

Execute all code chunks, preview, publish, and submit link on Sakai.

--------------

### Task 1: Sensitivity Analysis

In order to conduct the sensitivity analysis, we will need to download again the `lpSolveAPI` package unless you have it already installed in your R environment

```{r}
# Require will load the package only if not installed 
# Dependencies = TRUE makes sure that dependencies are install
if(!require("lpSolveAPI",quietly = TRUE))
  install.packages("lpSolveAPI",dependencies = TRUE, repos = "https://cloud.r-project.org")
```

We will revisit and solve again the marketing case discussed in class (also part of previous lab). 
```{r}
# We start with `0` constraint and `2` decision variables. The object name `lpmark` is discretionary.
lpmark = make.lp(0, 2)

# Define type of optimization as maximum and dump the screen output into a `dummy` variable
dummy = lp.control(lpmark, sense="max") 
# Set the objective function coefficients 
set.objfn(lpmark, c(275.691, 48.341))
```

Add all constraints to the model.

```{r}
add.constraint(lpmark, c(1, 1), "<=", 350000)
add.constraint(lpmark, c(1, 0), ">=", 15000)
add.constraint(lpmark, c(0, 1), ">=", 75000)
add.constraint(lpmark, c(2, -1), "=", 0)
add.constraint(lpmark, c(1, 0), ">=", 0)
add.constraint(lpmark, c(0, 1), ">=", 0)
```

Now, view the problem setting in tabular/matrix form. This is a good checkpoint to confirm that our contraints have been properly set.

```{r}
lpmark
# solve
solve(lpmark) 
```

Next we get the optimum results.
```{r}
# display the objective function optimum value
get.objective(lpmark)

# display the decision variables optimum values
get.variables(lpmark)
```

For the sensitivity part we will add two new code sections to obtain the sensitivity results. Frist, we will obtain the sensitivity values due to changes in the coefficients of the objective function.

```{r}
# display sensitivity to coefficients of objective function. 
get.sensitivity.obj(lpmark)
```

The first part of the output labeled `objfrom` displays the lower limit/boundary and the output labeled `objtill` displays the upper limit/boundary of the objective function coefficients sensitivity.

<span style="color:red">
##### 1A)  Explain in clear words what the sensitivity boundary values (lower/upper limits) represent in context of the marketing model. Refer to class notes.
</span>

The sensitivity boundary values represent how much we can  how changes in the coefficients of the objective function will impact the optimum Z value, but not necessarily the optimum solution that are radio and tv. In the cases, we can tell that the lower limit is labeled 'objfrom' which the value -96.6820 and -137.8455, and the upper limit is labeled 'objtill' which the value is infinity.Also,if the coefficients change out that range, it woule affect the ridao, tv and sales. It says that the coefficients of two decision variable which are optimum solution radio and tv woule not chang by -96.6820 and -137.8455, but it is going to change the optimum value for sales.  


Next we will obtain the sensitivity results due to changes in resources impacting the constraints. 

```{r}
# display sensitivity to right hand side constraints. 
# There will be a total of m+n values where m is the number of contraints and n is the number of decision variables
get.sensitivity.rhs(lpmark) 
```

For this exercise we are only interested in the first six values (corresponding to the six constraints) of the output labeled `duals`. 

<span style="color:red">
##### 1B)  Explain in clear words what each of the zero and non-zero sensitivity values represent. In your explanation identify the binding/non-binding constraints, the surplus/slack, and the marginal values.
</span>

The zero values represent that the constraint is non-binding, which means have surplus/slack and no marginal value, and it is not going to change the optimum value for sales. However, the non-zero means that the constraint is binding with no surplus or slack, and represent the marginal values change in the optimum value by adjusting a binding constraint. In this case, we can tell that there are four non-binding constraints and two binding constraints. For the binding constraint, if we increase in $ 1, the optimum value of sales will increase 124.12433 and 75.78333 as well in the first/fourth constraint.


To acquire a better understanding of the sensitivity results, and to confirm integrity of the calculations, independent tests can be conducted.  To demonstrate, we will repeat the linear programming (LP) optimization problem by slightly tweaking one binding constraint and verifying the impact.

<span style="color:red">
##### 1C)  Define a new model object `lpmark1` and add the constraints exactly as entered at beginning of this task. All being equal, change the budget constraint by a marginal $1 value and solve the LP problem.  Note the new optimum value for sales. Calculate the differential change in optimum sales from the earlier computed optimum sales,  and compare your calculation to the value obtained from the sensitivity calculation.  
</span>

```{r}
# Define a new model object called lpmark1
lpmark1 = make.lp(0, 2)
# Repeat rest of commands with the one constraint change for budget. Solve and display the objective function optimum value

# Define type of optimization as maximum and dump the screen output into a `dummy` variable
dummy = lp.control(lpmark1, sense="max") 
# Set the objective function coefficients 
set.objfn(lpmark1, c(275.691, 48.341))

```

```{r}
add.constraint(lpmark1, c(1, 1), "<=", 350001)
add.constraint(lpmark1, c(1, 0), ">=", 15000)
add.constraint(lpmark1, c(0, 1), ">=", 75000)
add.constraint(lpmark1, c(2, -1), "=", 0)
add.constraint(lpmark1, c(1, 0), ">=", 0)
add.constraint(lpmark1, c(0, 1), ">=", 0)
```

```{r}
lpmark1
# solve
solve(lpmark1) 
```

```{r}
# display the objective function optimum value
get.objective(lpmark1)

# display the decision variables optimum values
get.variables(lpmark1)
```

```{r}
# display sensitivity to coefficients of objective function. 
get.sensitivity.obj(lpmark1)
```

```{r}
# display sensitivity to right hand side constraints. 
# There will be a total of m+n values where m is the number of contraints and n is the number of decision variables
get.sensitivity.rhs(lpmark1) 
```

Lpmark sales equal to 43,443,517 and Lpmark1 sales equal to 43,443,641. The $1 change in the first constraint, it would result in a change of 124 dollars in the optimum value of sales. It is clear that the budget constraint is binding, because when I changed the budget constraint, it changed the optimum value for sales.

<span style="color:red">
##### 1D)  Based on the previous exercise explain in clear words, and without running another solver again , how would you check the integrity of the other marginal value identified in 1B).
</span>

I would check the integrity of the other marginal value from 1B) by increasing the 4th constraint to $1. I would then compare the optimum values for sales and observe the changes that appeared. Finally, I would check to see if the optimum sales value changed by 75.7833.

----------

### Task 2: Monte Carlo Simulation

For this task we will be running a Monte Carlo simulation to calculate the probability that the daily return from S&P will be > 5%. 
We will assume that the historical S&P daily return follows a normal distribution with an average daily return of 0.03 (%) and a standard deviation of 0.97 (%). Consider this as our population.

To begin we will generate 100 random samples from the normal distribution. For the generated samples we will calculate the mean, standard deviation, and probability of occurrence where the simulation result is greater than 5%. 

To generate random samples from a normal distribution we will use the `rnorm()` function in R. In the example below we set the number of runs (or samples) to 100.

```{r}
# number of simulations/samples
runs = 100
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)
```

```{r}
# Mean calculated from the random distribution of samples
average = mean(sims)
average
```

```{r}
# STD calculated from the random distribution of samples
std = sd(sims) 
std
```

```{r}
# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
```

<span style="color:red">
##### 2A)  Repeat the above calculations for the two cases where the number of simulations/samples is equal to 1000 and 10000.  For each case record the mean, standard deviation, and probability.
</span>

```{r}
# number of simulations/samples
runs = 1000
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)

# Mean calculated from the random distribution of samples
average = mean(sims)
average

# STD calculated from the random distribution of samples
std = sd(sims) 
std


# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob

```

```{r}
# number of simulations/samples
runs = 10000
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)

# Mean calculated from the random distribution of samples
average = mean(sims)
average

# STD calculated from the random distribution of samples
std = sd(sims) 
std


# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
```

<span style="color:red">
##### 2B)  List in seperate lines the values for mean, standard deviation, and probability for all three cases: 100, 1000, and 10000 simulations. Describe how the values change/behave as the number of simulations is increased. What is your best bet on the probability of occurrence greater than 5% and why? How is this behavior similar to the introductory image use case demonstration to calculate `pi`? 
</span>

Sample: 100 Mean:  0.01210434 Standard Deviation:0.9556717 Probability: 0.44

Sample: 1000 Mean: 0.03734675 Standard Deviation: 0.983215 Probability: 0.492

Sample: 10000 Mean: 0.04024144  Standard Deviation: 0.9639549  Probability: 0.4994

As the number of simulations increased, the average got closer and closer to 0.03 standard deviation got closer to 0.97. In terms of the probability differences, it stayed around 0.50 which means 50% probability. I would guess that the probability is around 49% because that was about the number for the most accurate simulation. This is similar to the image that was presented in the introductory paragraph because they both make the point that with more samples, the more accurate your results will be.

The last exercise is optional for those interested in further enhancing their subject matter learning, and refining their problem skills in R. Your work will be assessed but you will not be graded for this exercise. You can follow the instructions presented in the video Excel equivalent example at [https://www.youtube.com/watch?v=wKdmEXCvo9s]  

<span style="color:red">
##### 2C-Optional!)  Repeat the exercise for the S&P daily return where all is equal except we are now interested in the weekly cumulative return and the probability that the weekly cummulative return is greater than 5%. Set the number of simulations to 10000.
</span>

```{r}
# number of simulations/samples
runs = 10000
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)

# Mean calculated from the random distribution of samples
average = mean(sims)
average

# STD calculated from the random distribution of samples
std = sd(sims) 
std


# probability of occurrence on any given day based on samples will be equal to count (or sum) where sample result is greater than 5% divided by total number of samples. 
prob = sum(sims >=0.05)/runs
prob
```




