About

In this lab we will focus on sensitivity analysis and Monte Carlo simulations.

Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs. We will use the lpSolveAPI R-package as we did in the previous lab.

Monte Carlo simulations use repeated random sampling from a given universe or population to estimate quantities of interest. This type of simulation is known as a probabilistic simulation, as opposed to a deterministic simulation.

An example of a Monte Carlo simulation is the one used to approximate the value of pi. The simulation generates random points within a unit square and counts how many fall inside the circle inscribed in the square (marked in red in the figure below). The more points that are sampled, the closer the estimate gets to the actual value. After sampling 30,000 random points, the estimate for pi comes much closer to the actual value, to within four decimal places of precision.

Figure: Monte Carlo simulation for approximating pi (random points in a unit square, with the inscribed circle marked in red).
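
As a quick illustration of this idea, a minimal R sketch of the pi estimation might look like the following (the seed and the choice of 30,000 points are simply taken from the example above; this is an illustration, not part of the lab tasks):

# Minimal sketch of the pi approximation described above (illustrative only)
set.seed(1)                                   # arbitrary seed so the run is reproducible
n = 30000                                     # number of random points, as in the example above
x = runif(n)                                  # random x coordinates in the unit square
y = runif(n)                                  # random y coordinates in the unit square
inside = (x - 0.5)^2 + (y - 0.5)^2 <= 0.25    # TRUE when a point falls inside the inscribed circle
4 * mean(inside)                              # fraction of points inside, times 4, approximates pi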

In this lab, we will learn how to generate random samples with various simulations and how to run a sensitivity analysis on the marketing use case covered so far.

Setup

Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read the instructions below carefully and follow them to complete the tasks and answer the questions. Submit your work to RPubs as detailed in previous notes.

Note

For your assignment you may be using different data sets than what is included here. Always read carefully the instructions on Sakai. Tasks/questions to be completed/answered are highlighted in larger bolded fonts and numbered according to their particular placement in the task section.


Task 1: Sensitivity Analysis

To conduct the sensitivity analysis, we will need to install the lpSolveAPI package again, unless it is already installed in your R environment.

# require() loads the package if it is already installed; if not, install.packages() runs
# dependencies = TRUE makes sure that package dependencies are installed as well
if(!require("lpSolveAPI",quietly = TRUE))
  install.packages("lpSolveAPI",dependencies = TRUE, repos = "https://cloud.r-project.org")

We will revisit and solve again the marketing case discussed in class (also part of previous lab).

# We start with `0` constraints and `2` decision variables. The object name `lpmark` is arbitrary.
lpmark = make.lp(0, 2)

# Define type of optimization as maximum and dump the screen output into a `dummy` variable
dummy = lp.control(lpmark, sense="max") 
# Set the objective function coefficients 
set.objfn(lpmark, c(275.691, 48.341))

Add all constraints to the model.

add.constraint(lpmark, c(1, 1), "<=", 350000)
add.constraint(lpmark, c(1, 0), ">=", 15000)
add.constraint(lpmark, c(0, 1), ">=", 75000)
add.constraint(lpmark, c(2, -1), "=", 0)
add.constraint(lpmark, c(1, 0), ">=", 0)
add.constraint(lpmark, c(0, 1), ">=", 0)

Now, view the problem setting in tabular/matrix form. This is a good checkpoint to confirm that our constraints have been properly set.

lpmark
## Model name: 
##                C1       C2            
## Maximize  275.691   48.341            
## R1              1        1  <=  350000
## R2              1        0  >=   15000
## R3              0        1  >=   75000
## R4              2       -1   =       0
## R5              1        0  >=       0
## R6              0        1  >=       0
## Kind          Std      Std            
## Type         Real     Real            
## Upper         Inf      Inf            
## Lower           0        0
# solve the model; a return value of 0 indicates that an optimal solution was found
solve(lpmark) 
## [1] 0

Next we get the optimum results.

# display the objective function optimum value
get.objective(lpmark)
## [1] 43443517
# display the decision variables optimum values
get.variables(lpmark)
## [1] 116666.7 233333.3

For the sensitivity part we will add two new code sections to obtain the sensitivity results.

# display sensitivity to coefficients of objective function. 
get.sensitivity.obj(lpmark)
## $objfrom
## [1]  -96.6820 -137.8455
## 
## $objtill
## [1] 1e+30 1e+30
1A) For this exercise we are only interested in the first part of the output, labeled objfrom. Explain in a concise manner what the sensitivity results represent in reference to the marketing model.

The objfrom values give the lower limit down to which each objective function coefficient can be decreased (the upper limit, objtill, is effectively infinity at 1e+30) without changing the optimal solution.

If the coefficient for Radio stays anywhere between -96.6820 and 1e+30 (infinity), the optimal spending mix does not change; the objective value (sales) still moves with the coefficient, but the same decision variable values remain optimal.

Likewise, if the coefficient for TV stays anywhere between -137.8455 and 1e+30 (infinity), the optimal spending mix does not change.
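
As an optional sanity check of this reading (not required by the task), one could rebuild the model with the first objective coefficient set to another value inside its reported range, for example 200, and confirm that the optimal spending mix is unchanged. The object name lpcheck below is arbitrary.

# Optional check: move the first objective coefficient within its allowed range and re-solve
lpcheck = make.lp(0, 2)
dummy = lp.control(lpcheck, sense="max")
set.objfn(lpcheck, c(200, 48.341))            # 200 is still above the lower limit of -96.682
add.constraint(lpcheck, c(1, 1), "<=", 350000)
add.constraint(lpcheck, c(1, 0), ">=", 15000)
add.constraint(lpcheck, c(0, 1), ">=", 75000)
add.constraint(lpcheck, c(2, -1), "=", 0)
add.constraint(lpcheck, c(1, 0), ">=", 0)
add.constraint(lpcheck, c(0, 1), ">=", 0)
solve(lpcheck)
get.variables(lpcheck)                        # the optimal mix should still be 116666.7 and 233333.3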

# display sensitivity to the right-hand sides of the constraints.
# There will be a total of m+n values, where m is the number of constraints and n is the number of decision variables
get.sensitivity.rhs(lpmark) 
## $duals
## [1] 124.12433   0.00000   0.00000  75.78333   0.00000   0.00000   0.00000
## [8]   0.00000
## 
## $dualsfrom
## [1]  1.125e+05 -1.000e+30 -1.000e+30 -3.050e+05 -1.000e+30 -1.000e+30
## [7] -1.000e+30 -1.000e+30
## 
## $dualstill
## [1] 1.00e+30 1.00e+30 1.00e+30 4.75e+05 1.00e+30 1.00e+30 1.00e+30 1.00e+30
1B) For this exercise we are only interested in the first part of the output, labeled duals. Explain in a concise manner what the two non-zero sensitivity results represent. Distinguish the binding/non-binding constraints, the surplus/slack, and marginal values.

The duals are the marginal (shadow) values of the constraints. A non-zero dual means the corresponding constraint is binding at the optimum (it holds with equality, with no slack or surplus), and its value tells us how much the optimum changes per unit increase in that constraint's right-hand side. A zero dual means the constraint is non-binding: it holds with slack or surplus, and a small change in its right-hand side leaves the optimum unchanged.

Value 1: 124.12. This is the marginal value of the budget constraint R1. If the $350,000 budget were increased by $1, the optimum value (sales) would increase by about $124.12. The budget constraint is binding: the entire budget is spent, so it carries no slack.

Value 2: 75.78. This is the marginal value of the fourth constraint, 2XRadio - XTV = 0 (the 2, -1 row R4 in the model display above). If its right-hand side were increased from 0 to 1, the optimum value would change by about $75.78. As an equality constraint, R4 is binding by definition. By contrast, the minimum-spend constraints R2 and R3 have zero duals: they are non-binding and hold with surplus at the optimum.

To acquire a better understanding of the sensitivity results, and to confirm the integrity of the calculations, independent tests can be conducted.

1C) Run the linear programming solver again from the beginning by defining a new model object lpmark1. All else being equal, change the budget constraint by only $1 and solve. Note the optimum value for sales as given by the objective function. Calculate the differential change in sales. Share your observations.
# Define a new model object called lpmark1,
lpmark1 = make.lp(0, 2)
# Repeat the rest of the commands with the one change to the budget constraint
# (add $1 to $350,000). Then solve and display the objective function optimum value.
# Note that every command below refers to lpmark1 instead of lpmark.
dummy = lp.control(lpmark1, sense="max")
set.objfn(lpmark1, c(275.691, 48.341))
# The objective coefficients stay the same; only the budget right-hand side changes below.
add.constraint(lpmark1, c(1, 1), "<=", 350001)
add.constraint(lpmark1, c(1, 0), ">=", 15000)
add.constraint(lpmark1, c(0, 1), ">=", 75000)
add.constraint(lpmark1, c(2, -1), "=", 0)
add.constraint(lpmark1, c(1, 0), ">=", 0)
add.constraint(lpmark1, c(0, 1), ">=", 0)
lpmark1
## Model name: 
##                C1       C2            
## Maximize  275.691   48.341            
## R1              1        1  <=  350001
## R2              1        0  >=   15000
## R3              0        1  >=   75000
## R4              2       -1   =       0
## R5              1        0  >=       0
## R6              0        1  >=       0
## Kind          Std      Std            
## Type         Real     Real            
## Upper         Inf      Inf            
## Lower           0        0
solve(lpmark1)
## [1] 0
get.objective(lpmark1)
## [1] 43443641
get.variables(lpmark1)
## [1] 116667 233334
get.sensitivity.rhs(lpmark1)
## $duals
## [1] 124.12433   0.00000   0.00000  75.78333   0.00000   0.00000   0.00000
## [8]   0.00000
## 
## $dualsfrom
## [1]  1.12500e+05 -1.00000e+30 -1.00000e+30 -3.05001e+05 -1.00000e+30
## [6] -1.00000e+30 -1.00000e+30 -1.00000e+30
## 
## $dualstill
## [1] 1.00000e+30 1.00000e+30 1.00000e+30 4.75002e+05 1.00000e+30 1.00000e+30
## [7] 1.00000e+30 1.00000e+30
get.sensitivity.obj(lpmark1)
## $objfrom
## [1]  -96.6820 -137.8455
## 
## $objtill
## [1] 1e+30 1e+30

Examining this output, one concludes that increasing the budget constraint by $1 has a very small effect on the decision variables: each changes by well under one dollar (roughly 0.3 and 0.7), a negligible fraction of a percent of their values. The only other change is the optimum value of $43,443,641, up from $43,443,517. This is a $124 increase, which matches the marginal value of about $124.12 identified in 1B.

1D) Based on the previous exercise, explain in clear words, and without actually running the solver again, how you would check the integrity of the other marginal value identified in 1B).

To check the integrity of the other marginal value from 1B (75.78, attached to the fourth constraint), one would keep everything else the same and change only the right-hand side of R4 from 0 to 1, exactly as the budget was changed in 1C. Re-solving would then show whether the optimum value for sales increases by roughly $75.78; if it does, the marginal value is confirmed.
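
Purely for illustration (the question itself only asks for an explanation, not another run), such a check could look like the sketch below: everything stays the same except the right-hand side of R4, which moves from 0 to 1. The object name lpmark2 is arbitrary.

# Illustrative sketch of the check described above
lpmark2 = make.lp(0, 2)
dummy = lp.control(lpmark2, sense="max")
set.objfn(lpmark2, c(275.691, 48.341))
add.constraint(lpmark2, c(1, 1), "<=", 350000)
add.constraint(lpmark2, c(1, 0), ">=", 15000)
add.constraint(lpmark2, c(0, 1), ">=", 75000)
add.constraint(lpmark2, c(2, -1), "=", 1)       # only this right-hand side changes, from 0 to 1
add.constraint(lpmark2, c(1, 0), ">=", 0)
add.constraint(lpmark2, c(0, 1), ">=", 0)
solve(lpmark2)
get.objective(lpmark2) - get.objective(lpmark)  # expected to be close to the marginal value of 75.78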

Task 2: Monte Carlo Simulation

For this task we will be running a Monte Carlo simulation to calculate the probability that the daily return from S&P will be > 5%. We will assume that the historical S&P daily return follows a normal distribution with an average daily return of 0.03 (%) and a standard deviation of 0.97 (%).

To begin we will generate 100 random samples from the normal distribution. For the generated samples we will calculate the mean, standard deviation, and probability of occurrence where the simulation result is greater than 5%.

To generate random samples from a normal distribution we will use the rnorm() function in R. In the example below we set the number of runs (or samples) to 100.

# number of simulations/samples
runs = 100
# random number generator per defined normal distribution with given mean and standard deviation
sims =  rnorm(runs,mean=0.03,sd=0.97)
# Mean calculated from the random distribution of samples
average = mean(sims)
average
## [1] 0.04983691
# STD calculated from the random distribution of samples
std = sd(sims) 
std
## [1] 0.8648796
# The probability of occurrence on any given day is estimated as the count (or sum) of samples
# where the result is greater than 5%, divided by the total number of samples.
prob = sum(sims >=0.05)/runs
prob
## [1] 0.51
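Because rnorm() draws fresh random numbers on every run, the values above will differ slightly each time the code is executed. If reproducible numbers are wanted (optional, and not part of the original instructions), a seed can be fixed before sampling; this sketch overwrites sims with a new reproducible draw purely as a demonstration.

# Optional: fixing the seed makes repeated runs reproducible (123 is an arbitrary choice)
set.seed(123)
sims = rnorm(runs, mean=0.03, sd=0.97)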
2A) Repeat the above calculations for the two cases where the number of simulations/samples is equal to 1000 and 10000. For each case record the mean, standard deviation, and probability.
# Repeat calculations here
runsA = 1000
# random number generator per defined normal distribution with given mean and standard deviation
simsA =  rnorm(runsA,mean=0.03,sd=0.97)
averageA=mean(simsA)
averageA
## [1] 0.03097492
stdA=sd(simsA)
stdA
## [1] 0.9839586
probA=sum(simsA>=.05)/runsA
probA
## [1] 0.489
runsB = 10000
# random number generator per defined normal distribution with given mean and standard deviation
simsB =  rnorm(runsB,mean=0.03,sd=0.97)
averageB=mean(simsB)
averageB
## [1] 0.02502488
stdB=sd(simsB)
stdB
## [1] 0.9869048
probB=sum(simsB>=.05)/runsB
probB
## [1] 0.4905
2B) List in a tabular form the values for mean, standard deviation, and probability for all three cases: 100, 1000, and 10000 simulations. Describe how the values change/behave as the number of simulations is increased. What is your best bet on the probability of occurrence greater than 5% and why? How is this similar to the image use case to calculate pi that was presented in the introductory paragraph?
as.table(c(average, std, prob))
##          A          B          C 
## 0.04983691 0.86487965 0.51000000
as.table(c(averageA, stdA, probA))
##          A          B          C 
## 0.03097492 0.98395860 0.48900000
as.table(c(averageB, stdB, probB))
##          A          B          C 
## 0.02502488 0.98690484 0.49050000
mean(c(prob, probA, probB))
## [1] 0.4965
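
Since the three cases reuse values already computed above, an optional and more compact way to present them is a single summary data frame (shown here only as an alternative layout):

# Optional alternative: gather all three cases into one summary table
summary_table = data.frame(runs = c(runs, runsA, runsB),
                           mean = c(average, averageA, averageB),
                           sd   = c(std, stdA, stdB),
                           prob = c(prob, probA, probB))
summary_table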

The best bet on the probability of an occurrence greater than 5% is moderate to low. The three estimated probabilities, averaged together, come to roughly 49.7%, so there is slightly less than a 50% chance of seeing a daily return greater than 5%. As the number of simulations increases, the probability estimates change less and less; by the law of large numbers, the more simulations we run, the closer we get to the true value. This is similar to the pi example in the introductory paragraph, where increasing the number of sampled points to 30,000 brought the estimate much closer to the true value, to within four decimal places of precision. Likewise, if we were to run 100,000 simulations instead of 1,000 or 10,000, we would come even closer to the true value.

The last exercise, 2C), is optional for those interested in further enhancing their subject matter learning and refining their skills in R. Your work will be assessed but you will not be graded for this exercise. You can follow the instructions presented in the equivalent Excel example video at https://www.youtube.com/watch?v=wKdmEXCvo9s.

2C) Repeat the exercise for the S&P daily return where all is equal except we are now interested in the weekly cumulative return and the probability that the weekly cumulative return is greater than 5%. Set the number of simulations to 10000.
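
One possible sketch for this optional exercise, assuming a week of 5 trading days, approximating the weekly cumulative return as the sum of the five daily returns (compounding the daily returns would be a refinement), and keeping the same numeric convention for the 5% threshold as in the daily example above:

# Sketch for 2C: weekly cumulative return over 10000 simulated weeks (illustrative only)
runsC = 10000
# each row holds one simulated week of 5 daily returns drawn from the same normal distribution
daily  = matrix(rnorm(runsC * 5, mean=0.03, sd=0.97), nrow=runsC, ncol=5)
weekly = rowSums(daily)            # weekly cumulative return approximated as the sum of daily returns
mean(weekly)                       # mean of the simulated weekly returns
sd(weekly)                         # standard deviation of the simulated weekly returns
sum(weekly >= 0.05)/runsC          # estimated probability of a weekly cumulative return greater than 5%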