DATA604_Home_Work_8

Dilip Ganesan

6.5.4

Derive Inverse CDF Formula for generating random variates from a continous uniform distribution.

We know the CDF of continous uniform distribution from the DES-Banks as below.

$$ F(x) = \begin{cases} 0 & x < a
\ \frac{x - a}{b - a} & a \leq x \leq b \
1 & x > b \end{cases} $$

Setting the middle section equal to R and solving for R yields

F(X)=(X − a)/(b − a)=R
Solving X interms of R

X = R(b − a)+a

6.5.5

Derive Inverse CDF Formula for generating random variates from a Weibull distribution.

We know the CDF of Weibull distribution from the DES-Banks as below.

x ≥ 0, is
F(x)=1 − e( − x/λ)k

Setting the F(R)=R and solving for R yields

X = λ[−ln(1−R)]1/k

6.5.8

# Loading Wholesale Prices for Oats, Peas, Beans and Barley
whole_oats = 1.05
whole_peas = 3.17
whole_beans = 1.99
whole_barle = 0.95

# Loading Retail Prices for Oats, Peas, Beans and Barley
retail_oats = 1.29
retail_peas = 3.76
retail_beans = 2.23
retail_barley = 1.65

# Demand Range.
demand_oats = data.frame(
              weight = c(0, 0.5, 1, 1.5, 2.0, 3.0, 4.0, 5.0, 7.5, 10.0),
              prob = c(0.05, 0.07, 0.09, 0.11, 0.15, 0.25, 0.10, 0.09, 0.06, 0.03)
              )
demand_peas = data.frame(
              weight = c(0, 0.5, 1.0, 1.5, 2.0, 3.0), 
              prob = c(0.1, 0.2, 0.2, 0.3, 0.1, 0.1)
              )
demand_beans = data.frame(
               weight = c(0, 1.0, 3.0, 4.5), 
               prob = c(0.2, 0.4, 0.3, 0.1)
              )
demand_barley = data.frame(
                weight = c(0, 0.5, 1.0, 3.5), 
                prob = c(0.2, 0.4, 0.3, 0.1)
              )

# Sample of each demand for 90 days 
oats_90_sam = sample(demand_oats$weight, prob=demand_oats$prob, 90,replace=TRUE)
peas_90_sam = sample(demand_peas$weight, prob=demand_peas$prob, 90, replace = TRUE)
beans_90_sam = sample(demand_beans$weight, prob=demand_beans$prob, 90, replace=TRUE)
bar_90_sam = sample(demand_barley$weight, prob=demand_barley$prob,90,replace = TRUE)

# Calculation of Cost Price, Selling Price and Profit
cost_price_day = oats_90_sam * whole_oats +
                 peas_90_sam * whole_peas +
                 beans_90_sam * whole_beans +
                 bar_90_sam * whole_barle


selling_price_day = oats_90_sam * retail_oats +
                  peas_90_sam * retail_peas +
                  beans_90_sam * retail_beans +
                  bar_90_sam * retail_barley

profit_per_day = selling_price_day - cost_price_day 

# Final Result in Data Frame.
finaldf = data.frame("Oats" = oats_90_sam,"Peas" = peas_90_sam,
                     "Beans"= beans_90_sam, "Barley"=bar_90_sam, 
                     "Cost Price" = cost_price_day, "Selling Price" 
                     = selling_price_day, "Profit" = profit_per_day)
knitr::kable(finaldf)
Oats Peas Beans Barley Cost.Price Selling.Price Profit
2.0 2.0 1.0 1.0 11.380 13.980 2.600
1.0 2.0 4.5 0.5 16.820 19.670 2.850
2.0 3.0 3.0 1.0 18.530 22.200 3.670
1.0 1.5 3.0 1.0 12.725 15.270 2.545
4.0 2.0 3.0 1.0 17.460 21.020 3.560
0.0 0.0 0.0 1.0 0.950 1.650 0.700
2.0 2.0 3.0 1.0 15.360 18.440 3.080
5.0 1.5 1.0 0.0 11.995 14.320 2.325
3.0 1.0 0.0 1.0 7.270 9.280 2.010
5.0 0.5 0.0 0.0 6.835 8.330 1.495
3.0 2.0 1.0 0.5 11.955 14.445 2.490
3.0 1.0 1.0 0.5 8.785 10.685 1.900
1.5 1.0 1.0 3.5 10.060 13.700 3.640
3.0 1.5 0.0 0.5 8.380 10.335 1.955
7.5 0.5 1.0 0.5 11.925 14.610 2.685
3.0 1.5 3.0 0.5 14.350 17.025 2.675
0.0 1.5 4.5 1.0 14.660 17.325 2.665
3.0 1.5 3.0 3.5 17.200 21.975 4.775
2.0 1.5 1.0 0.0 8.845 10.450 1.605
7.5 1.5 0.0 0.5 13.105 16.140 3.035
1.5 1.5 0.0 1.0 7.280 9.225 1.945
0.0 1.5 0.0 3.5 8.080 11.415 3.335
3.0 1.5 1.0 1.0 10.845 13.390 2.545
0.0 2.0 0.0 0.0 6.340 7.520 1.180
4.0 0.0 1.0 0.5 6.665 8.215 1.550
1.5 1.5 4.5 0.0 15.285 17.610 2.325
4.0 1.5 3.0 1.0 15.875 19.140 3.265
1.0 1.5 3.0 1.0 12.725 15.270 2.545
4.0 1.0 3.0 0.0 13.340 15.610 2.270
3.0 1.0 3.0 1.0 13.240 15.970 2.730
3.0 1.5 3.0 0.5 14.350 17.025 2.675
1.0 1.0 1.0 1.0 7.160 8.930 1.770
4.0 1.0 1.0 0.5 9.835 11.975 2.140
0.0 0.0 3.0 0.5 6.445 7.515 1.070
0.5 3.0 3.0 0.5 16.480 19.440 2.960
0.0 3.0 1.0 1.0 12.450 15.160 2.710
5.0 3.0 0.0 1.0 15.710 19.380 3.670
10.0 1.0 0.0 0.0 13.670 16.660 2.990
10.0 0.0 1.0 0.5 12.965 15.955 2.990
3.0 0.5 1.0 0.5 7.200 8.805 1.605
3.0 1.0 1.0 0.0 8.310 9.860 1.550
2.0 2.0 1.0 0.0 10.430 12.330 1.900
0.0 1.0 1.0 1.0 6.110 7.640 1.530
7.5 0.5 1.0 0.5 11.925 14.610 2.685
5.0 1.5 3.0 0.5 16.450 19.605 3.155
4.0 0.0 0.0 0.5 4.675 5.985 1.310
1.5 1.0 3.0 0.5 11.190 13.210 2.020
2.0 1.0 1.0 1.0 8.210 10.220 2.010
1.0 1.5 1.0 1.0 8.745 10.810 2.065
0.0 1.0 3.0 0.5 9.615 11.275 1.660
1.0 0.0 1.0 0.5 3.515 4.345 0.830
4.0 3.0 1.0 1.0 16.650 20.320 3.670
2.0 3.0 3.0 0.5 18.055 21.375 3.320
4.0 0.5 4.5 0.0 14.740 17.075 2.335
4.0 3.0 1.0 0.5 16.175 19.495 3.320
3.0 3.0 4.5 1.0 22.565 26.835 4.270
2.0 3.0 4.5 0.5 21.040 24.720 3.680
2.0 0.5 1.0 0.5 6.150 7.515 1.365
2.0 1.5 3.0 0.0 12.825 14.910 2.085
3.0 1.5 3.0 0.5 14.350 17.025 2.675
3.0 1.0 3.0 1.0 13.240 15.970 2.730
0.5 1.0 1.0 1.0 6.635 8.285 1.650
3.0 0.5 1.0 0.0 6.725 7.980 1.255
1.0 0.0 0.0 0.5 1.525 2.115 0.590
5.0 2.0 3.0 0.5 18.035 21.485 3.450
1.0 0.5 4.5 3.5 14.915 18.980 4.065
1.5 3.0 0.0 1.0 12.035 14.865 2.830
3.0 0.0 1.0 0.5 5.615 6.925 1.310
1.5 3.0 3.0 1.0 18.005 21.555 3.550
7.5 0.5 3.0 0.5 15.905 19.070 3.165
3.0 0.5 1.0 0.5 7.200 8.805 1.605
1.0 3.0 1.0 0.5 13.025 15.625 2.600
10.0 2.0 4.5 1.0 26.745 32.105 5.360
1.0 1.5 3.0 1.0 12.725 15.270 2.545
3.0 1.0 1.0 0.5 8.785 10.685 1.900
1.0 1.5 1.0 0.5 8.270 9.985 1.715
0.0 1.5 0.0 0.0 4.755 5.640 0.885
4.0 1.5 1.0 0.0 10.945 13.030 2.085
4.0 0.5 1.0 0.5 8.250 10.095 1.845
7.5 1.0 1.0 1.0 13.985 17.315 3.330
5.0 2.0 3.0 0.5 18.035 21.485 3.450
2.0 1.0 4.5 0.5 14.700 17.200 2.500
2.0 0.5 3.0 0.5 10.130 11.975 1.845
5.0 0.5 1.0 1.0 9.775 12.210 2.435
3.0 1.5 1.0 0.0 9.895 11.740 1.845
2.0 1.0 4.5 1.0 15.175 18.025 2.850
2.0 0.5 1.0 0.0 5.675 6.690 1.015
4.0 0.0 0.0 1.0 5.150 6.810 1.660
1.0 0.5 3.0 0.0 8.605 9.860 1.255
4.0 1.0 1.0 0.5 9.835 11.975 2.140
# Summation Data Frame
summationdf = data.frame("Total Cost Price" = sum(finaldf$Cost.Price),
                   "Total Selling Price" = sum(finaldf$Selling.Price),
                   "Total Profit" = sum(finaldf$Profit))

knitr::kable(summationdf)
Total.Cost.Price Total.Selling.Price Total.Profit
1040.55 1257.975 217.425

The simulation generated Revenue as 1257.975 and the cost price as 1040.55 with a Profit of 217.425

Util Function for Goodness of Fit.

q1 = read_excel('data.xlsx', sheet = '06_01', col_names = FALSE)$X0
q2 = read_excel('data.xlsx', sheet = '06_02', col_names = FALSE)$X0
q3 = read_excel('data.xlsx', sheet = '06_03', col_names = FALSE)$X0

cont = c('norm', 'exp', 'lnorm', 'gamma', 'weibull')
disc = c('geom', 'nbinom')


goodnessfit <- function(x, dists) {
  # set up container for fits
  fits = vector(mode = 'list', length = length(dists))
  names(fits) = dists
  # fit each distribution
  for (i in 1:length(dists)) {
    d = dists[i]
    fits[[d]] = fitdist(x, d)
  }
  # create plots comparing fits
  denscomp(fits, legendtext = dists)
  qqcomp(fits, legendtext = dists)
  cdfcomp(fits, legendtext = dists)
  ppcomp(fits, legendtext = dists)
  
  gofstat(fits)
 
}

6.5.1

goodnessfit(q1, cont)

## Goodness-of-fit statistics
##                              1-mle-norm 2-mle-exp 3-mle-lnorm 4-mle-gamma
## Kolmogorov-Smirnov statistic  0.1550512 0.1640203  0.05044292  0.07461236
## Cramer-von Mises statistic    0.3280986 0.2021652  0.01825121  0.05174037
## Anderson-Darling statistic    1.9697634 1.4234709  0.14142579  0.35230811
##                              5-mle-weibull
## Kolmogorov-Smirnov statistic    0.07934941
## Cramer-von Mises statistic      0.05847634
## Anderson-Darling statistic      0.44537170
## 
## Goodness-of-fit criteria
##                                1-mle-norm 2-mle-exp 3-mle-lnorm
## Akaike's Information Criterion   314.1765  294.1736    284.9146
## Bayesian Information Criterion   317.6518  295.9113    288.3900
##                                4-mle-gamma 5-mle-weibull
## Akaike's Information Criterion    288.2076      290.3611
## Bayesian Information Criterion    291.6830      293.8364
# From the comparison lognormal distribution looks the best fit.
dist='lnorm'
for(i in length(dist)){
  x <- fitdist(q1, dist[i])
  gofstat(x)
  plot(x)
}

6.5.2

goodnessfit(q2, cont)

## Goodness-of-fit statistics
##                              1-mle-norm 2-mle-exp 3-mle-lnorm 4-mle-gamma
## Kolmogorov-Smirnov statistic  0.1589070 0.3144366  0.06764782  0.09941791
## Cramer-von Mises statistic    0.3826912 1.1473789  0.03973639  0.11444195
## Anderson-Darling statistic    2.2740826 5.9711731  0.26739444  0.71667740
##                              5-mle-weibull
## Kolmogorov-Smirnov statistic     0.1140618
## Cramer-von Mises statistic       0.2008969
## Anderson-Darling statistic       1.3230284
## 
## Goodness-of-fit criteria
##                                1-mle-norm 2-mle-exp 3-mle-lnorm
## Akaike's Information Criterion   310.3453  320.4776    282.6829
## Bayesian Information Criterion   314.0456  322.3278    286.3832
##                                4-mle-gamma 5-mle-weibull
## Akaike's Information Criterion    288.3562      296.2684
## Bayesian Information Criterion    292.0565      299.9687
# From the comparison even for this question lognormal distribution looks the best fit.
dist='lnorm'
for(i in length(dist)){
  x <- fitdist(q2, dist[i])
  gofstat(x)
  plot(x)
}

6.5.3

goodnessfit(q3, disc)

## Chi-squared statistic:  31.0542 1.919848 
## Degree of freedom of the Chi-squared distribution:  3 2 
## Chi-squared p-value:  8.280024e-07 0.3829221 
##    the p-value may be wrong with some theoretical counts < 5  
## Chi-squared table:
##      obscounts theo 1-mle-geom theo 2-mle-nbinom
## <= 1         7       19.876752          9.265752
## <= 2        13        6.351383         10.230512
## <= 3        11        4.745696         10.078506
## <= 4         8        3.545942          7.446571
## > 4          6       10.480227          7.978659
## 
## Goodness-of-fit criteria
##                                1-mle-geom 2-mle-nbinom
## Akaike's Information Criterion   203.2825     166.2799
## Bayesian Information Criterion   205.0891     169.8932
# From the comparison nbinorm distribution looks the best fit.
dist='nbinom'
for(i in length(dist)){
  x <- fitdist(q3, dist[i])
  gofstat(x)
  plot(x)
}

References

https://stackoverflow.com/questions/29121040/how-to-properly-use-gofstat-in-r