DATA604_Home_Work_8

Dilip Ganesan

6.5.4

Derive Inverse CDF Formula for generating random variates from a continous uniform distribution.

We know the CDF of continous uniform distribution from the DES-Banks as below.

$$ F(x) = \begin{cases} 0 & x < a
\ \frac{x - a}{b - a} & a \leq x \leq b \
1 & x > b \end{cases} $$

Setting the middle section equal to R and solving for R yields

F(X)=(X − a)/(b − a)=R
Solving X interms of R

X = R(b − a)+a

6.5.5

Derive Inverse CDF Formula for generating random variates from a Weibull distribution.

We know the CDF of Weibull distribution from the DES-Banks as below.

x ≥ 0, is
F(x)=1 − e^{( − x/λ)^k}

Setting the F(R)=R and solving for R yields

X = λ[−ln(1−R)]^1/k

6.5.8

# Loading Wholesale Prices for Oats, Peas, Beans and Barley
whole_oats = 1.05
whole_peas = 3.17
whole_beans = 1.99
whole_barle = 0.95

# Loading Retail Prices for Oats, Peas, Beans and Barley
retail_oats = 1.29
retail_peas = 3.76
retail_beans = 2.23
retail_barley = 1.65

# Demand Range.
demand_oats = data.frame(
              weight = c(0, 0.5, 1, 1.5, 2.0, 3.0, 4.0, 5.0, 7.5, 10.0),
              prob = c(0.05, 0.07, 0.09, 0.11, 0.15, 0.25, 0.10, 0.09, 0.06, 0.03)
              )
demand_peas = data.frame(
              weight = c(0, 0.5, 1.0, 1.5, 2.0, 3.0), 
              prob = c(0.1, 0.2, 0.2, 0.3, 0.1, 0.1)
              )
demand_beans = data.frame(
               weight = c(0, 1.0, 3.0, 4.5), 
               prob = c(0.2, 0.4, 0.3, 0.1)
              )
demand_barley = data.frame(
                weight = c(0, 0.5, 1.0, 3.5), 
                prob = c(0.2, 0.4, 0.3, 0.1)
              )

# Sample of each demand for 90 days 
oats_90_sam = sample(demand_oats$weight, prob=demand_oats$prob, 90,replace=TRUE)
peas_90_sam = sample(demand_peas$weight, prob=demand_peas$prob, 90, replace = TRUE)
beans_90_sam = sample(demand_beans$weight, prob=demand_beans$prob, 90, replace=TRUE)
bar_90_sam = sample(demand_barley$weight, prob=demand_barley$prob,90,replace = TRUE)

# Calculation of Cost Price, Selling Price and Profit
cost_price_day = oats_90_sam * whole_oats +
                 peas_90_sam * whole_peas +
                 beans_90_sam * whole_beans +
                 bar_90_sam * whole_barle


selling_price_day = oats_90_sam * retail_oats +
                  peas_90_sam * retail_peas +
                  beans_90_sam * retail_beans +
                  bar_90_sam * retail_barley

profit_per_day = selling_price_day - cost_price_day 

# Final Result in Data Frame.
finaldf = data.frame("Oats" = oats_90_sam,"Peas" = peas_90_sam,
                     "Beans"= beans_90_sam, "Barley"=bar_90_sam, 
                     "Cost Price" = cost_price_day, "Selling Price" 
                     = selling_price_day, "Profit" = profit_per_day)
knitr::kable(finaldf)

Oats	Peas	Beans	Barley	Cost.Price	Selling.Price	Profit
2.0	2.0	1.0	1.0	11.380	13.980	2.600
1.0	2.0	4.5	0.5	16.820	19.670	2.850
2.0	3.0	3.0	1.0	18.530	22.200	3.670
1.0	1.5	3.0	1.0	12.725	15.270	2.545
4.0	2.0	3.0	1.0	17.460	21.020	3.560
0.0	0.0	0.0	1.0	0.950	1.650	0.700
2.0	2.0	3.0	1.0	15.360	18.440	3.080
5.0	1.5	1.0	0.0	11.995	14.320	2.325
3.0	1.0	0.0	1.0	7.270	9.280	2.010
5.0	0.5	0.0	0.0	6.835	8.330	1.495
3.0	2.0	1.0	0.5	11.955	14.445	2.490
3.0	1.0	1.0	0.5	8.785	10.685	1.900
1.5	1.0	1.0	3.5	10.060	13.700	3.640
3.0	1.5	0.0	0.5	8.380	10.335	1.955
7.5	0.5	1.0	0.5	11.925	14.610	2.685
3.0	1.5	3.0	0.5	14.350	17.025	2.675
0.0	1.5	4.5	1.0	14.660	17.325	2.665
3.0	1.5	3.0	3.5	17.200	21.975	4.775
2.0	1.5	1.0	0.0	8.845	10.450	1.605
7.5	1.5	0.0	0.5	13.105	16.140	3.035
1.5	1.5	0.0	1.0	7.280	9.225	1.945
0.0	1.5	0.0	3.5	8.080	11.415	3.335
3.0	1.5	1.0	1.0	10.845	13.390	2.545
0.0	2.0	0.0	0.0	6.340	7.520	1.180
4.0	0.0	1.0	0.5	6.665	8.215	1.550
1.5	1.5	4.5	0.0	15.285	17.610	2.325
4.0	1.5	3.0	1.0	15.875	19.140	3.265
1.0	1.5	3.0	1.0	12.725	15.270	2.545
4.0	1.0	3.0	0.0	13.340	15.610	2.270
3.0	1.0	3.0	1.0	13.240	15.970	2.730
3.0	1.5	3.0	0.5	14.350	17.025	2.675
1.0	1.0	1.0	1.0	7.160	8.930	1.770
4.0	1.0	1.0	0.5	9.835	11.975	2.140
0.0	0.0	3.0	0.5	6.445	7.515	1.070
0.5	3.0	3.0	0.5	16.480	19.440	2.960
0.0	3.0	1.0	1.0	12.450	15.160	2.710
5.0	3.0	0.0	1.0	15.710	19.380	3.670
10.0	1.0	0.0	0.0	13.670	16.660	2.990
10.0	0.0	1.0	0.5	12.965	15.955	2.990
3.0	0.5	1.0	0.5	7.200	8.805	1.605
3.0	1.0	1.0	0.0	8.310	9.860	1.550
2.0	2.0	1.0	0.0	10.430	12.330	1.900
0.0	1.0	1.0	1.0	6.110	7.640	1.530
7.5	0.5	1.0	0.5	11.925	14.610	2.685
5.0	1.5	3.0	0.5	16.450	19.605	3.155
4.0	0.0	0.0	0.5	4.675	5.985	1.310
1.5	1.0	3.0	0.5	11.190	13.210	2.020
2.0	1.0	1.0	1.0	8.210	10.220	2.010
1.0	1.5	1.0	1.0	8.745	10.810	2.065
0.0	1.0	3.0	0.5	9.615	11.275	1.660
1.0	0.0	1.0	0.5	3.515	4.345	0.830
4.0	3.0	1.0	1.0	16.650	20.320	3.670
2.0	3.0	3.0	0.5	18.055	21.375	3.320
4.0	0.5	4.5	0.0	14.740	17.075	2.335
4.0	3.0	1.0	0.5	16.175	19.495	3.320
3.0	3.0	4.5	1.0	22.565	26.835	4.270
2.0	3.0	4.5	0.5	21.040	24.720	3.680
2.0	0.5	1.0	0.5	6.150	7.515	1.365
2.0	1.5	3.0	0.0	12.825	14.910	2.085
3.0	1.5	3.0	0.5	14.350	17.025	2.675
3.0	1.0	3.0	1.0	13.240	15.970	2.730
0.5	1.0	1.0	1.0	6.635	8.285	1.650
3.0	0.5	1.0	0.0	6.725	7.980	1.255
1.0	0.0	0.0	0.5	1.525	2.115	0.590
5.0	2.0	3.0	0.5	18.035	21.485	3.450
1.0	0.5	4.5	3.5	14.915	18.980	4.065
1.5	3.0	0.0	1.0	12.035	14.865	2.830
3.0	0.0	1.0	0.5	5.615	6.925	1.310
1.5	3.0	3.0	1.0	18.005	21.555	3.550
7.5	0.5	3.0	0.5	15.905	19.070	3.165
3.0	0.5	1.0	0.5	7.200	8.805	1.605
1.0	3.0	1.0	0.5	13.025	15.625	2.600
10.0	2.0	4.5	1.0	26.745	32.105	5.360
1.0	1.5	3.0	1.0	12.725	15.270	2.545
3.0	1.0	1.0	0.5	8.785	10.685	1.900
1.0	1.5	1.0	0.5	8.270	9.985	1.715
0.0	1.5	0.0	0.0	4.755	5.640	0.885
4.0	1.5	1.0	0.0	10.945	13.030	2.085
4.0	0.5	1.0	0.5	8.250	10.095	1.845
7.5	1.0	1.0	1.0	13.985	17.315	3.330
5.0	2.0	3.0	0.5	18.035	21.485	3.450
2.0	1.0	4.5	0.5	14.700	17.200	2.500
2.0	0.5	3.0	0.5	10.130	11.975	1.845
5.0	0.5	1.0	1.0	9.775	12.210	2.435
3.0	1.5	1.0	0.0	9.895	11.740	1.845
2.0	1.0	4.5	1.0	15.175	18.025	2.850
2.0	0.5	1.0	0.0	5.675	6.690	1.015
4.0	0.0	0.0	1.0	5.150	6.810	1.660
1.0	0.5	3.0	0.0	8.605	9.860	1.255
4.0	1.0	1.0	0.5	9.835	11.975	2.140

# Summation Data Frame
summationdf = data.frame("Total Cost Price" = sum(finaldf$Cost.Price),
                   "Total Selling Price" = sum(finaldf$Selling.Price),
                   "Total Profit" = sum(finaldf$Profit))

knitr::kable(summationdf)

Total.Cost.Price	Total.Selling.Price	Total.Profit
1040.55	1257.975	217.425

The simulation generated Revenue as 1257.975 and the cost price as 1040.55 with a Profit of 217.425

Util Function for Goodness of Fit.

q1 = read_excel('data.xlsx', sheet = '06_01', col_names = FALSE)$X0
q2 = read_excel('data.xlsx', sheet = '06_02', col_names = FALSE)$X0
q3 = read_excel('data.xlsx', sheet = '06_03', col_names = FALSE)$X0

cont = c('norm', 'exp', 'lnorm', 'gamma', 'weibull')
disc = c('geom', 'nbinom')


goodnessfit <- function(x, dists) {
  # set up container for fits
  fits = vector(mode = 'list', length = length(dists))
  names(fits) = dists
  # fit each distribution
  for (i in 1:length(dists)) {
    d = dists[i]
    fits[[d]] = fitdist(x, d)
  }
  # create plots comparing fits
  denscomp(fits, legendtext = dists)
  qqcomp(fits, legendtext = dists)
  cdfcomp(fits, legendtext = dists)
  ppcomp(fits, legendtext = dists)
  
  gofstat(fits)
 
}

6.5.1

goodnessfit(q1, cont)

## Goodness-of-fit statistics
##                              1-mle-norm 2-mle-exp 3-mle-lnorm 4-mle-gamma
## Kolmogorov-Smirnov statistic  0.1550512 0.1640203  0.05044292  0.07461236
## Cramer-von Mises statistic    0.3280986 0.2021652  0.01825121  0.05174037
## Anderson-Darling statistic    1.9697634 1.4234709  0.14142579  0.35230811
##                              5-mle-weibull
## Kolmogorov-Smirnov statistic    0.07934941
## Cramer-von Mises statistic      0.05847634
## Anderson-Darling statistic      0.44537170
## 
## Goodness-of-fit criteria
##                                1-mle-norm 2-mle-exp 3-mle-lnorm
## Akaike's Information Criterion   314.1765  294.1736    284.9146
## Bayesian Information Criterion   317.6518  295.9113    288.3900
##                                4-mle-gamma 5-mle-weibull
## Akaike's Information Criterion    288.2076      290.3611
## Bayesian Information Criterion    291.6830      293.8364

# From the comparison lognormal distribution looks the best fit.
dist='lnorm'
for(i in length(dist)){
  x <- fitdist(q1, dist[i])
  gofstat(x)
  plot(x)
}

6.5.2

goodnessfit(q2, cont)

## Goodness-of-fit statistics
##                              1-mle-norm 2-mle-exp 3-mle-lnorm 4-mle-gamma
## Kolmogorov-Smirnov statistic  0.1589070 0.3144366  0.06764782  0.09941791
## Cramer-von Mises statistic    0.3826912 1.1473789  0.03973639  0.11444195
## Anderson-Darling statistic    2.2740826 5.9711731  0.26739444  0.71667740
##                              5-mle-weibull
## Kolmogorov-Smirnov statistic     0.1140618
## Cramer-von Mises statistic       0.2008969
## Anderson-Darling statistic       1.3230284
## 
## Goodness-of-fit criteria
##                                1-mle-norm 2-mle-exp 3-mle-lnorm
## Akaike's Information Criterion   310.3453  320.4776    282.6829
## Bayesian Information Criterion   314.0456  322.3278    286.3832
##                                4-mle-gamma 5-mle-weibull
## Akaike's Information Criterion    288.3562      296.2684
## Bayesian Information Criterion    292.0565      299.9687

# From the comparison even for this question lognormal distribution looks the best fit.
dist='lnorm'
for(i in length(dist)){
  x <- fitdist(q2, dist[i])
  gofstat(x)
  plot(x)
}

6.5.3

goodnessfit(q3, disc)

## Chi-squared statistic:  31.0542 1.919848 
## Degree of freedom of the Chi-squared distribution:  3 2 
## Chi-squared p-value:  8.280024e-07 0.3829221 
##    the p-value may be wrong with some theoretical counts < 5  
## Chi-squared table:
##      obscounts theo 1-mle-geom theo 2-mle-nbinom
## <= 1         7       19.876752          9.265752
## <= 2        13        6.351383         10.230512
## <= 3        11        4.745696         10.078506
## <= 4         8        3.545942          7.446571
## > 4          6       10.480227          7.978659
## 
## Goodness-of-fit criteria
##                                1-mle-geom 2-mle-nbinom
## Akaike's Information Criterion   203.2825     166.2799
## Bayesian Information Criterion   205.0891     169.8932

# From the comparison nbinorm distribution looks the best fit.
dist='nbinom'
for(i in length(dist)){
  x <- fitdist(q3, dist[i])
  gofstat(x)
  plot(x)
}

References

https://stackoverflow.com/questions/29121040/how-to-properly-use-gofstat-in-r