Answer the following questions by yourself in as much detail as possible. I take academic integrity very seriously and I know that you all do as well as it is what retains, or in its absence destroys, the value of our joint mission to excel. Partial credit will be awarded in the event that the reasoning is sufficiently clear to uncover the parts of a question correctly answered; be systematic in solving the problems. This will always take you far. Supply answers in relevant metrics with complete answers; computer output alone is insufficient. Faulty work could cost the company millions. Moreover, we really do wish to explain what we are discovering along the way to build toward a justification for our recommendation in the final phase.
You have been given three data sets for use in tackling the problem. We have two data files that are identical: an Excel workbook and an .RData file.
Financials: In one spreadsheet, we have revenues and expenses data for the last 36 weeks for two product lines in frozen foods (measured in 1000s of dollars per week). Pay no mind to the times that may appear alongside the dates; this is an Excel vestige.
- Tidy: Revenues and Expenses are recorded by Product by Week and WeekNo.
- Kiev.Expense records Expenses for the Chicken Kiev product in 1000s of US dollars.
- Kiev.Revenue records Revenues for the Chicken Kiev product in 1000s of US dollars.
- Marsala.Expense records Expenses for the Chicken Marsala product in 1000s of US dollars.
- Marsala.Revenue records Revenues for the Chicken Marsala product in 1000s of US dollars.
Satisfaction: In one spreadsheet, the levels of product satisfaction are recorded for random samples of customers for each product.
+ Tidy: Response records Like or NotLike for the given Product in that row.
+ Kiev.Like records Like or NotLike for the sample asked about Chicken Kiev.
+ Marsala.Like records Like or NotLike for the sample asked about Chicken Marsala.
InputCost: In one spreadsheet, the costs of inputs for the two products are described in Raw and Tidy form.
+ Tidy: Cost is the realized cost of inputs (per unit) in US dollars for Product.
+ Kiev.Cost is the realized cost of inputs (per unit) in US dollars for Chicken Kiev.
+ Marsala.Cost is the realized cost of inputs (per unit) in US dollars for Chicken Marsala.
A major grocer’s food division has a problem and your job is to recommend a course of action. UnNamed Grocer (UNG) is currently outsourcing the production of Chicken Kiev and Chicken Marsala to the same vendor. The vendor has suffered permanent setbacks due to COVID-19 related supply chain decimation and can no longer fully supply the grocer’s required quantities for both products. The vendor can continue to supply either one of the two products but not both, and a new vendor will be needed if both products are continued. Customer satisfaction, the financial elements, and cost structures are all important elements in the decision because margins in the business are quite tight.
Key company procedures and definitions precede questions in some sections. This happens somewhat frequently so there are well established rules to use to evaluate the data in many cases. The company demands use of a 95% confidence interval/level for all decision making unless otherwise noted. Never forget that R’s help ? is your friend.
## [1] "Financials.Raw" "Financials.Tidy" "InputCost.Raw"
## [4] "InputCost.Tidy" "Satisfaction.Raw" "Satisfaction.Tidy"
If we see the names of the spreadsheets above, they are active at this point in the Markdown. As described above, there are three spreadsheets that have been imported into six data.frames in R.
result <- prob_norm(mean = 0, stdev = 1, lb = -1, ub = 1)
summary(result)
## Probability calculator
## Distribution: Normal
## Mean : 0
## St. dev : 1
## Lower bound : -1
## Upper bound : 1
##
## P(X < -1) = 0.159
## P(X > -1) = 0.841
## P(X < 1) = 0.841
## P(X > 1) = 0.159
## P(-1 < X < 1) = 0.683
## 1 - P(-1 < X < 1) = 0.317
plot(result)
The probability would be about .683
result <- prob_norm(mean = 0, stdev = 1, lb = -1.5, ub = 0.8)
summary(result)
## Probability calculator
## Distribution: Normal
## Mean : 0
## St. dev : 1
## Lower bound : -1.5
## Upper bound : 0.8
##
## P(X < -1.5) = 0.067
## P(X > -1.5) = 0.933
## P(X < 0.8) = 0.788
## P(X > 0.8) = 0.212
## P(-1.5 < X < 0.8) = 0.721
## 1 - P(-1.5 < X < 0.8) = 0.279
plot(result)
The probability would be about .721
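As a cross-check on the two calculator results above, the same interval probabilities can be obtained in base R from the normal CDF, since P(a < X < b) = F(b) - F(a). This is a minimal sketch; the variable names are illustrative only.

```r
# Interval probabilities for a standard normal from the CDF
p_within_1 <- pnorm(1) - pnorm(-1)      # P(-1 < X < 1)
p_mixed    <- pnorm(0.8) - pnorm(-1.5)  # P(-1.5 < X < 0.8)

round(c(p_within_1, p_mixed), 3)
```

Both values should agree with the prob_norm summaries to three decimals.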
result <- prob_norm(mean = 25, stdev = 5, plb = 0.2, pub = 0.8)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 25
## St. dev : 5
## Lower bound : 0.2
## Upper bound : 0.8
##
## P(X < 20.792) = 0.2
## P(X > 20.792) = 0.8
## P(X < 29.208) = 0.8
## P(X > 29.208) = 0.2
## P(20.792 < X < 29.208) = 0.6
## 1 - P(20.792 < X < 29.208) = 0.4
plot(result, type = "probs")
The lower bound would be 20.792 and the upper bound would be 29.208. These are the 20th and 80th percentiles, so the interval between them contains the middle 60% of the distribution.
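The percentile bounds can be reproduced with `qnorm`, and the share of the distribution between them checked with `pnorm`. A small sketch, using the mean of 25 and standard deviation of 5 from the output above:

```r
# 20th and 80th percentiles of a Normal(25, 5)
bounds <- qnorm(c(0.2, 0.8), mean = 25, sd = 5)

# Probability mass between the two bounds
coverage <- diff(pnorm(bounds, mean = 25, sd = 5))

round(bounds, 3)
coverage
```

The coverage is the difference between the two tail probabilities, 0.8 - 0.2.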
1-ppois(6, 10)
## [1] 0.8698586
The probability of a misprint would be .870, or 87% when rounded
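The expression `1 - ppois(6, 10)` is the upper tail P(X > 6), that is, P(X >= 7), for a Poisson count with rate 10. The sketch below shows three equivalent ways to get it; the variable names are illustrative.

```r
lambda <- 10

p_complement <- 1 - ppois(6, lambda)                  # 1 - P(X <= 6)
p_upper      <- ppois(6, lambda, lower.tail = FALSE)  # P(X > 6) directly
p_summed     <- 1 - sum(dpois(0:6, lambda))           # from the probability mass function

c(p_complement, p_upper, p_summed)
```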
The log-normal probability distribution is a continuous probability distribution of a random variable whose logarithm has a normal distribution. If \(x\) is log-normal, then \(y = \ln(x)\) is normal (or \(\exp(y) = x\)). It is also known as Galton’s distribution after Sir Francis Galton, a legend in the history of statistics. It is used to characterise how long people tend to linger on internet articles, the length of reddit posts, the duration of games of chess, and the size of living tissues. Other applications include hospitalizations during pandemics. In R, it is (d,p,q,r)lnorm(?, meanlog=0, sdlog=1) by default.
plnorm(1, meanlog = 0, sdlog = 1, lower.tail = TRUE, log.p = FALSE)
## [1] 0.5
The probability that the values are less than 1 would be .5
plnorm(.95, meanlog = 0, sdlog = 1, lower.tail = TRUE, log.p = FALSE)
## [1] 0.4795459
The probability of the values being under .95 would be ~.480
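Because the logarithm of a log-normal variable is normal, the two `plnorm` results above can be checked against `pnorm` evaluated at the log of the cutoff. A minimal sketch:

```r
x <- c(1, 0.95)

# Log-normal CDF with the default meanlog = 0, sdlog = 1
p_lnorm <- plnorm(x, meanlog = 0, sdlog = 1)

# Same probabilities from the normal CDF applied to log(x)
p_norm <- pnorm(log(x), mean = 0, sd = 1)

rbind(p_lnorm, p_norm)
```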
hist(rlnorm(1000, meanlog = 0, sdlog = 2))
summary(InputCost.Raw)
## ChickenKiev.Cost ChickenMarsala.Cost
## Min. :1.915 Min. :1.928
## 1st Qu.:2.373 1st Qu.:2.253
## Median :2.491 Median :2.514
## Mean :2.494 Mean :2.491
## 3rd Qu.:2.624 3rd Qu.:2.721
## Max. :3.012 Max. :3.001
## NA's :15
visualize(
InputCost.Raw,
xvar = c("ChickenKiev.Cost", "ChickenMarsala.Cost"),
combx = TRUE,
type = "dist",
custom = FALSE
)
Suppose that each product has input costs that are normally distributed with the mean and standard deviation given in the summaries above (you may round to two digits to the right of the decimal place). What is the probability of costs above $2.25
result <- prob_norm(mean = 2.494, stdev = 0.24, lb = 2.25, ub = 2.49)
summary(result)
## Probability calculator
## Distribution: Normal
## Mean : 2.494
## St. dev : 0.24
## Lower bound : 2.25
## Upper bound : 2.49
##
## P(X < 2.25) = 0.155
## P(X > 2.25) = 0.845
## P(X < 2.49) = 0.493
## P(X > 2.49) = 0.507
## P(2.25 < X < 2.49) = 0.339
## 1 - P(2.25 < X < 2.49) = 0.661
plot(result)
For Chicken Kiev, the probability of a cost above 2.25 would be about 84.5%
+ for Chicken Marsala?
result <- prob_norm(
mean = 2.491,
stdev = 0.275,
lb = 2.25,
ub = 2.491
)
summary(result)
## Probability calculator
## Distribution: Normal
## Mean : 2.491
## St. dev : 0.275
## Lower bound : 2.25
## Upper bound : 2.491
##
## P(X < 2.25) = 0.19
## P(X > 2.25) = 0.81
## P(X < 2.491) = 0.5
## P(X > 2.491) = 0.5
## P(2.25 < X < 2.491) = 0.31
## 1 - P(2.25 < X < 2.491) = 0.69
plot(result)
For Chicken Marsala, the probability of a cost above 2.25 would be about 81%
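The two upper-tail probabilities can also be computed directly with `pnorm`. This sketch reuses the means and standard deviations entered into prob_norm above (0.24 and 0.275 are the values used in this document; they are not shown in the `summary()` output).

```r
# P(cost > 2.25) under each normal model
p_kiev    <- pnorm(2.25, mean = 2.494, sd = 0.24,  lower.tail = FALSE)
p_marsala <- pnorm(2.25, mean = 2.491, sd = 0.275, lower.tail = FALSE)

round(c(kiev = p_kiev, marsala = p_marsala), 3)
```

With the data loaded, the standard deviations themselves could be recomputed with `sd(..., na.rm = TRUE)`, since the summary reports missing values in one of the cost columns.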
result <- prob_norm(mean = 2.494, stdev = .24, plb = 0.05, pub = 0.95)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 2.494
## St. dev : 0.24
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < 2.099) = 0.05
## P(X > 2.099) = 0.95
## P(X < 2.889) = 0.95
## P(X > 2.889) = 0.05
## P(2.099 < X < 2.889) = 0.9
## 1 - P(2.099 < X < 2.889) = 0.1
plot(result, type = "probs")
result <- prob_norm(
mean = 2.494,
stdev = 0.24,
lb = 2.494,
ub = 2.65
)
summary(result)
## Probability calculator
## Distribution: Normal
## Mean : 2.494
## St. dev : 0.24
## Lower bound : 2.494
## Upper bound : 2.65
##
## P(X < 2.494) = 0.5
## P(X > 2.494) = 0.5
## P(X < 2.65) = 0.742
## P(X > 2.65) = 0.258
## P(2.494 < X < 2.65) = 0.242
## 1 - P(2.494 < X < 2.65) = 0.758
plot(result)
The probability would be 50%
result <- prob_norm(mean = 2.491, stdev = .275, plb = 0.05, pub = 0.95)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 2.491
## St. dev : 0.275
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < 2.039) = 0.05
## P(X > 2.039) = 0.95
## P(X < 2.943) = 0.95
## P(X > 2.943) = 0.05
## P(2.039 < X < 2.943) = 0.9
## 1 - P(2.039 < X < 2.943) = 0.1
plot(result, type = "probs")
result <- prob_norm(
mean = 2.491,
stdev = 0.275,
lb = 2.45,
ub = 2.491
)
summary(result)
## Probability calculator
## Distribution: Normal
## Mean : 2.491
## St. dev : 0.275
## Lower bound : 2.45
## Upper bound : 2.491
##
## P(X < 2.45) = 0.441
## P(X > 2.45) = 0.559
## P(X < 2.491) = 0.5
## P(X > 2.491) = 0.5
## P(2.45 < X < 2.491) = 0.059
## 1 - P(2.45 < X < 2.491) = 0.941
plot(result)
The probability that it would be under 2.45 would be 44.1%
Rule:
* The true probability of Like must be greater than or equal to 0.5 to continue a product.
table(Satisfaction.Raw$ChickenKiev.Like)
##
## Like NotLike
## 75 52
table(Satisfaction.Raw$ChickenMarsala.Like)
##
## Like NotLike
## 64 38
+ What is the probability of Like for Chicken Kiev?
75/127
## [1] 0.5905512
59.06% would be the sample probability of Like for Chicken Kiev
+ What is the probability of Like for Chicken Marsala?
64/102
## [1] 0.627451
62.75% would be the sample probability of Like for Chicken Marsala
+ Which product is more popular, taking no account of uncertainty?
It seems like Chicken Marsala is more popular, based on the proportion of Like responses in each sample
prop.test(table(Satisfaction.Raw$ChickenMarsala.Like))
##
## 1-sample proportions test with continuity correction
##
## data: table(Satisfaction.Raw$ChickenMarsala.Like), null probability 0.5
## X-squared = 6.1275, df = 1, p-value = 0.01331
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
## 0.5256440 0.7195325
## sample estimates:
## p
## 0.627451
Chicken Marsala should be continued: the p-value (0.0133) is below 0.05 and the 95% confidence interval (0.526 to 0.720) lies entirely above 0.5
prop.test(table(Satisfaction.Raw$ChickenKiev.Like))
##
## 1-sample proportions test with continuity correction
##
## data: table(Satisfaction.Raw$ChickenKiev.Like), null probability 0.5
## X-squared = 3.811, df = 1, p-value = 0.05092
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
## 0.4996549 0.6758894
## sample estimates:
## p
## 0.5905512
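The sample proportion is 0.59, but the p-value (0.0509) is slightly above 0.05 and the 95% confidence interval (0.4997 to 0.676) just includes 0.5, so a true probability of 0.5 cannot be ruled out at the 95% level. There is no evidence that the true probability is below 0.5, so Chicken Kiev can continue under the rule, though the case is weaker than for Chicken Marsala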
Chicken Marsala was preferred in the samples, as it had a higher sample proportion of Like (.627) than Chicken Kiev (.591), although the two 95% confidence intervals overlap
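The two tests above can be reproduced from the counts alone, which makes it easier to line up the estimate, interval, and p-value for each product against the 0.5 rule. A sketch using the Like counts and sample sizes from the tables above; `summarise_test` is a hypothetical helper written for this illustration.

```r
# One-sample proportion tests against p = 0.5 (continuity correction is the default)
kiev    <- prop.test(x = 75, n = 127, p = 0.5)
marsala <- prop.test(x = 64, n = 102, p = 0.5)

# Hypothetical helper: pull out the pieces needed for the decision rule
summarise_test <- function(fit) {
  c(estimate = unname(fit$estimate),
    lower    = fit$conf.int[1],
    upper    = fit$conf.int[2],
    p_value  = fit$p.value)
}

round(rbind(kiev = summarise_test(kiev), marsala = summarise_test(marsala)), 4)
```

Reading across each row shows whether the interval sits above 0.5 or only touches it.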
Product line financials are equally or more important. Average net income is the key decision metric. It should obviously be maximized. You can use R or Microsoft Excel for the calculation of net income.
Rules:
* Products with average revenues less than 100,000 must be discontinued.
* Outstanding food products average over $10,000 in net income per week and must be continued.
* Products should be continued if their average net incomes are positive.
* Products should be prioritized if their average net incomes are different.
result <- explore(
Financials.Raw,
vars = c(
"ChickenKiev.Expense",
"ChickenKiev.Revenue",
"ChickenMarsala.Revenue",
"ChickenMarsala.Expense"
),
fun = c(
"n_obs", "sd", "mean", "min", "max", "median", "p10",
"p25", "p90"
),
nr = Inf
)
## Warning: attributes are not identical across measure variables;
## they will be dropped
# summary()
dtab(result) %>% render()
The standard deviations for all the variables seem fairly large and similar to each other, except for Chicken Marsala revenue, which at 6.5 has less noise than the rest.
The averages are also close, except that Chicken Kiev seems to have higher averages for both revenue and expense than Chicken Marsala.
visualize(
Financials.Raw,
xvar = c("ChickenKiev.Revenue", "ChickenMarsala.Revenue"),
type = "dist",
custom = FALSE
)
visualize(
Financials.Raw,
xvar = c("ChickenKiev.Expense", "ChickenMarsala.Expense"),
type = "dist",
custom = FALSE
)
result <- prob_norm(
mean = 111.774,
stdev = 9.999,
plb = 0.05,
pub = 0.95
)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 111.774
## St. dev : 9.999
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < 95.327) = 0.05
## P(X > 95.327) = 0.95
## P(X < 128.221) = 0.95
## P(X > 128.221) = 0.05
## P(95.327 < X < 128.221) = 0.9
## 1 - P(95.327 < X < 128.221) = 0.1
plot(result, type = "probs")
result <- prob_norm(mean = 98.785, stdev = 10.735, plb = 0.10, pub = 0.90)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 98.785
## St. dev : 10.735
## Lower bound : 0.1
## Upper bound : 0.9
##
## P(X < 85.028) = 0.1
## P(X > 85.028) = 0.9
## P(X < 112.542) = 0.9
## P(X > 112.542) = 0.1
## P(85.028 < X < 112.542) = 0.8
## 1 - P(85.028 < X < 112.542) = 0.2
plot(result, type = "probs")
#ChickenMarsala
result <- prob_norm(mean = 96.990, stdev = 8.438, plb = 0.10, pub = 0.90)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 96.99
## St. dev : 8.438
## Lower bound : 0.1
## Upper bound : 0.9
##
## P(X < 86.176) = 0.1
## P(X > 86.176) = 0.9
## P(X < 107.804) = 0.9
## P(X > 107.804) = 0.1
## P(86.176 < X < 107.804) = 0.8
## 1 - P(86.176 < X < 107.804) = 0.2
plot(result, type = "probs")
ChickenKiev
summary(Financials.Raw$ChickenKiev.Revenue - Financials.Raw$ChickenKiev.Expense)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.734 8.191 14.046 12.989 16.812 28.692
ChickenMarsala
summary(Financials.Raw$ChickenMarsala.Revenue - Financials.Raw$ChickenMarsala.Expense)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.705 5.336 8.479 7.967 10.898 13.251
plot(Financials.Tidy$WeekNo, Financials.Tidy$Revenues - Financials.Tidy$Expenses, xlab = "Week", ylab = "NetIncome")
No product needs to be discontinued. Referring to the summary table above (Q.1 of Basic Financial Variables), both products have average revenues above the 100,000 threshold (100 in the units of the data, which are 1000s of dollars)
+ Provide 95% confidence intervals for average revenues for each product.
result <- prob_norm(mean = 108.365, stdev = 9.052, plb = 0.05, pub = 0.95)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 108.365
## St. dev : 9.052
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < 93.476) = 0.05
## P(X > 93.476) = 0.95
## P(X < 123.254) = 0.95
## P(X > 123.254) = 0.05
## P(93.476 < X < 123.254) = 0.9
## 1 - P(93.476 < X < 123.254) = 0.1
plot(result, type = "probs")
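Average = 108.365625, standard deviation = 9.052668651 (I computed these in Excel from the combined revenues column). Note that the range above, 93.476 to 123.254, runs from the 5th to the 95th percentile of a normal distribution with these values, so it covers 90% of weekly revenues for the two products combined; it is not a 95% confidence interval for each product's average revenue.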
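A confidence interval for an average is built from the standard error, sd / sqrt(n), rather than from the standard deviation of the weekly values. The sketch below shows the t-based calculation for one product. It assumes n = 36 weekly observations (from the data description) and that the mean of 111.774 and standard deviation of 9.999 used earlier are the Chicken Kiev revenue figures; `ci_mean` is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: t-based confidence interval for a mean from summary statistics
ci_mean <- function(mean, sd, n, level = 0.95) {
  se     <- sd / sqrt(n)                          # standard error of the mean
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)   # two-sided critical value
  c(lower = mean - t_crit * se, upper = mean + t_crit * se)
}

# Assumed inputs: Chicken Kiev revenue in 1000s of dollars, 36 weeks
kiev_revenue_ci <- ci_mean(mean = 111.774, sd = 9.999, n = 36)
round(kiev_revenue_ci, 2)
```

With the data loaded, `t.test()` applied to a revenue column of Financials.Raw reports the same kind of interval in its `conf.int` element.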
Chicken Kiev would be considered outstanding, as its average net income is over $10,000 per week (mean of 12.989 in 1000s), while Chicken Marsala is under $10,000 (mean of 7.967)
a. What is the 95% confidence interval for net income for **Chicken.Kiev**?
result <- prob_norm(mean = 12.989, stdev = 6.236, plb = 0.05, pub = 0.95)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 12.989
## St. dev : 6.236
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < 2.732) = 0.05
## P(X > 2.732) = 0.95
## P(X < 23.246) = 0.95
## P(X > 23.246) = 0.05
## P(2.732 < X < 23.246) = 0.9
## 1 - P(2.732 < X < 23.246) = 0.1
plot(result, type = "probs")
pnorm(.95, mean = 12.989, sd = 6.236, lower.tail = TRUE, log.p = FALSE)
## [1] 0.02676847
I calculated the standard deviation in Excel, from a column I added for the net income of each product separately
b. Should **Chicken.Kiev** be continued (by net income)?
Yes, Chicken Kiev should be continued based on net income as it remains positive
c. What is the 95% confidence interval for net income for **Chicken.Marsala**?
result <- prob_norm(mean = 7.967, stdev = 3.586, plb = 0.05, pub = 0.95)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 7.967
## St. dev : 3.586
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < 2.069) = 0.05
## P(X > 2.069) = 0.95
## P(X < 13.865) = 0.95
## P(X > 13.865) = 0.05
## P(2.069 < X < 13.865) = 0.9
## 1 - P(2.069 < X < 13.865) = 0.1
plot(result, type = "probs")
pnorm(.95, mean = 7.967, sd = 3.586, lower.tail = TRUE, log.p = FALSE)
## [1] 0.02518688
I calculated the SD in Excel as well, as with Chicken Kiev
d. Should **Chicken.Marsala** be continued (by net income)?
Based on net income, it should also be continued, as its average does not fall into the negative
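The prob_norm ranges above run from the 5th to the 95th percentile of weekly net income, so they describe 90% of individual weeks rather than the uncertainty in the average. A 95% confidence interval for average net income could be sketched as follows, assuming n = 36 weeks per product and the means and standard deviations quoted above (in 1000s of dollars).

```r
n <- 36  # assumed number of weekly observations per product

net <- data.frame(
  product = c("ChickenKiev", "ChickenMarsala"),
  mean    = c(12.989, 7.967),
  sd      = c(6.236, 3.586)
)

# Two-sided 95% t interval for each average
t_crit    <- qt(0.975, df = n - 1)
net$lower <- net$mean - t_crit * net$sd / sqrt(n)
net$upper <- net$mean + t_crit * net$sd / sqrt(n)

net
```

Comparing each interval with 0 and with 10 connects it to the continuation and outstanding-product rules.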
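There is a difference in net income, at least in the averages and standard deviations. Chicken Kiev's mean (12.989) and standard deviation (6.236) are roughly 1.6 to 1.7 times those of Chicken Marsala (7.967 and 3.586). The two pnorm values above, about 0.027 and 0.025, are the probabilities of a weekly net income below 0.95 (thousand dollars) under each normal model, so they differ by only about 0.002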
+ Which one is more profitable?
It seems like Chicken Kiev is more profitable, based on the net income calculations, which show it earned more even after its costs
+ Provide the [two-sided] 95% confidence interval for the average difference in net income.
result <- prob_norm(
mean = 5.022,
stdev = 6.485,
plb = 0.05,
pub = 0.95
)
summary(result, type = "probs")
## Probability calculator
## Distribution: Normal
## Mean : 5.022
## St. dev : 6.485
## Lower bound : 0.05
## Upper bound : 0.95
##
## P(X < -5.645) = 0.05
## P(X > -5.645) = 0.95
## P(X < 15.689) = 0.95
## P(X > 15.689) = 0.05
## P(-5.645 < X < 15.689) = 0.9
## 1 - P(-5.645 < X < 15.689) = 0.1
plot(result, type = "probs")
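The range printed above is the 5th to 95th percentile of the weekly differences. For the average difference, a two-sided 95% interval uses the standard error instead. The sketch below assumes the mean of 5.022 and standard deviation of 6.485 are computed from weekly paired differences (Chicken Kiev minus Chicken Marsala) over n = 36 weeks.

```r
n         <- 36     # assumed number of paired weeks
mean_diff <- 5.022  # average weekly difference in net income (1000s of dollars)
sd_diff   <- 6.485  # standard deviation of the weekly differences

se      <- sd_diff / sqrt(n)
t_crit  <- qt(0.975, df = n - 1)
ci_diff <- mean_diff + c(-1, 1) * t_crit * se

# Paired t statistic and two-sided p-value for a zero average difference
t_stat  <- mean_diff / se
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

round(ci_diff, 3)
p_value
```

Whether this interval excludes zero is what decides if the prioritization rule for differing net incomes applies.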
Some recipes are more prone to harboring disease than others. The company is willing to accept the occasional risk but wishes to discontinue even moderately dangerous products.
Historically, Chicken Marsala has had recalls arrive at a rate of 4 per year. What is the probability of seven or fewer recalls in a given year for Chicken Marsala?
ppois(7, 4)
## [1] 0.9488664
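For this question the rate is 4 recalls per year and the event is seven or fewer recalls, P(X <= 7). In `ppois(q, lambda)` the first argument is the count and the second is the rate, and the lower tail is returned by default. A minimal sketch, with illustrative variable names:

```r
rate <- 4  # historical recalls per year for Chicken Marsala

p_seven_or_fewer <- ppois(7, lambda = rate)        # P(X <= 7)
p_check          <- sum(dpois(0:7, lambda = rate)) # same value from the mass function

round(c(p_seven_or_fewer, p_check), 4)
```

The complement, `1 - p_seven_or_fewer`, is the probability of eight or more recalls in a year.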
Which do you recommend and why? Summarise all of the relevant findings as justification.
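I am not quite sure how to answer this one, as recall rates were provided only for Chicken Marsala and not for Chicken Kiev. With what I have, I would keep Chicken Kiev and discontinue Chicken Marsala. Both products pass the satisfaction rule on their sample proportions (Kiev .591, Marsala .627) and both have positive average net income, but Chicken Kiev averages higher weekly net income (12.989 vs 7.967, in 1000s of dollars) and is the only one above the $10,000 outstanding threshold, while Chicken Marsala carries a recall history of 4 per year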
For up to 5 bonus points, summarize everything that is important in a single graphic.