Chapter 12 - Monsters & Mixtures

This chapter introduced several new types of regression, all of which are generalizations of generalized linear models (GLMs). Ordered logistic models are useful for categorical outcomes with a strict ordering. They are built by attaching a cumulative link function to a categorical outcome distribution. Zero-inflated models mix together two different outcome distributions, allowing us to model outcomes with an excess of zeros. Models for overdispersion, such as beta-binomial and gamma-Poisson, draw the expected value of each observation from a distribution that changes shape as a function of a linear model.
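As a minimal illustration of the zero-inflation idea, the mixture likelihood can be written directly in R (the values of p and lambda below are arbitrary, chosen only for the sketch):

p <- 0.2          # probability of a structural zero (arbitrary, for illustration)
lambda <- 1.5     # Poisson rate (arbitrary, for illustration)

# Probability of observing a zero under the mixture: either a structural zero,
# or a Poisson process that happened to produce zero.
prob_zero <- p + (1 - p) * exp(-lambda)

# Probability of a non-zero count y under the mixture:
prob_y <- function(y) (1 - p) * dpois(y, lambda)

prob_zero        # exceeds dpois(0, lambda): an excess of zeros
prob_y(1:3)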

Place each answer inside the code chunk (grey box). Each code chunk should contain a text response or code that completes or answers the question or activity. Be sure to include plots if the question requests them.

Finally, upon completion, name your final output .html file as: YourName_ANLY505-Year-Semester.html and publish the assignment to your R Pubs account and submit the link to Canvas. Each question is worth 5 points.

Questions

12-1. At a certain university, employees are annually rated from 1 to 4 on their productivity, with 1 being least productive and 4 most productive. In a certain department at this certain university in a certain year, the numbers of employees receiving each rating were (from 1 to 4): 12, 36, 7, 41. Compute the log cumulative odds of each rating.

## Defining the data:

# Numbers of employees receiving each rating (from 1 to 4): 12, 36, 7, and 41.

Data_f <- c(12, 36, 7, 41)


## Compute the discrete proportion of each response value:

Prop_Data_f <- Data_f/sum(Data_f)

Prop_Data_f
## [1] 0.12500000 0.37500000 0.07291667 0.42708333
## Compute the cumulative proportions using the cumsum() function.


CumSum_Prop_Data_f <- cumsum(Prop_Data_f)


CumSum_Prop_Data_f
## [1] 0.1250000 0.5000000 0.5729167 1.0000000
## Compute the log cumulative odds of each rating.


Log_Cumltv_Odds <- log(CumSum_Prop_Data_f/(1-CumSum_Prop_Data_f))


Log_Cumltv_Odds
## [1] -1.9459101  0.0000000  0.2937611        Inf
## The same computation using a logit() helper function, rounded to two digits.

logit <- function(p) log(p / (1 - p))
round(lco <- logit(CumSum_Prop_Data_f), 2)
## [1] -1.95  0.00  0.29   Inf
## Result: 
# The log cumulative odds for the four ratings are -1.95, 0.00, 0.29, and Inf. The last value is infinite because the highest category always has cumulative proportion 1.
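## As a quick check (a sketch using base R's plogis(), the logistic CDF and hence the inverse logit), applying the inverse link to the log cumulative odds should recover the cumulative proportions; the Inf entry maps back to 1.

round(plogis(Log_Cumltv_Odds), 4)
# Expected: 0.1250 0.5000 0.5729 1.0000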

12-2. Make a version of Figure 12.5 for the employee ratings data given just above.

## Recreate the chapter's Figure 12.5 (cumulative proportion vs. response) for the employee ratings data from 12-1.

## Rating counts from 12-1:

Data_f <- c(12, 36, 7, 41)


## Discrete proportion of each response value:

Prop_Data_f <- Data_f/sum(Data_f)

Prop_Data_f
## [1] 0.12500000 0.37500000 0.07291667 0.42708333
## Cumulative proportions via the cumsum() function:

CumSum_Prop_Data_f <- cumsum(Prop_Data_f)


CumSum_Prop_Data_f
## [1] 0.1250000 0.5000000 0.5729167 1.0000000
# Plot the cumulative proportions:


plot(1:4, CumSum_Prop_Data_f, type = "b", xlab = "Response",
     ylab = "Cumulative proportion", ylim = c(0, 1))
# Vertical segments: cumulative proportion of each response
segments(1:4, 0, 1:4, CumSum_Prop_Data_f)
# Blue offset segments: discrete probability of each response
for (i in 1:4) {
  segments(i + 0.05, c(0, CumSum_Prop_Data_f)[i], i + 0.05,
           CumSum_Prop_Data_f[i], col = "blue")
}

## In this plot, the vertical axis shows the cumulative proportion and the horizontal axis shows the possible response values, ratings 1 through 4. The blue offset segments mark the discrete probability of each rating.
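## Equivalently (a small check), the heights of the blue offset segments can be recovered from the cumulative proportions with diff():

Discrete_Prob <- diff(c(0, CumSum_Prop_Data_f))
round(Discrete_Prob, 4)
# Matches Prop_Data_f: 0.1250 0.3750 0.0729 0.4271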

12-3. In 2014, a paper was published that was entitled “Female hurricanes are deadlier than male hurricanes.” As the title suggests, the paper claimed that hurricanes with female names have caused greater loss of life, and the explanation given is that people unconsciously rate female hurricanes as less dangerous and so are less likely to evacuate. Statisticians severely criticized the paper after publication. Here, you’ll explore the complete data used in the paper and consider the hypothesis that hurricanes with female names are deadlier.

Acquaint yourself with the columns by inspecting the help ?Hurricanes. In this problem, you’ll focus on predicting deaths using femininity of each hurricane’s name. Fit and interpret the simplest possible model, a Poisson model of deaths using femininity as a predictor. You can use quap or ulam. Compare the model to an intercept-only Poisson model of deaths. How strong is the association between femininity of name and deaths? Which storms does the model fit (retrodict) well? Which storms does it fit poorly?

## Load the Hurricanes data set; the goal is to predict deaths from the femininity of each hurricane's name.

data(Hurricanes)


Data_1 <- Hurricanes


str(Data_1)
## 'data.frame':    92 obs. of  8 variables:
##  $ name        : Factor w/ 83 levels "Able","Agnes",..: 38 77 1 9 47 20 40 60 27 33 ...
##  $ year        : int  1950 1950 1952 1953 1953 1954 1954 1954 1955 1955 ...
##  $ deaths      : int  2 4 3 1 0 60 20 20 0 200 ...
##  $ category    : int  3 3 1 1 1 3 3 4 3 1 ...
##  $ min_pressure: int  960 955 985 987 985 960 954 938 962 987 ...
##  $ damage_norm : int  1590 5350 150 58 15 19321 3230 24260 2030 14730 ...
##  $ female      : int  1 0 0 1 1 1 1 1 1 1 ...
##  $ femininity  : num  6.78 1.39 3.83 9.83 8.33 ...
## Fit Model_1, a Poisson model predicting deaths from femininity, using ulam().

Model_1 <- ulam(
  alist(
    deaths ~ dpois( lambda ),
    log(lambda) <- a + bf*femininity,
    a ~ dnorm(0,10),
    bf ~ dnorm(0,5)
  ) ,
  data=Data_1, chains=4 , cores=4, log_lik=TRUE)
## Running MCMC with 4 parallel chains, with 1 thread(s) per chain...
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.2 seconds.
## Total execution time: 0.4 seconds.
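## A brief aside on the priors (a sketch, with an arbitrary seed): the Normal(0, 10) prior on the intercept is extremely diffuse once passed through the log link, so the data dominate the posterior.

set.seed(12)                      # arbitrary seed, for reproducibility only
a_prior <- rnorm(1e4, 0, 10)      # draws from the intercept prior
lambda_prior <- exp(a_prior)      # implied mean death counts
quantile(lambda_prior, c(0.05, 0.5, 0.95))  # spans many orders of magnitude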
## Fit an intercept-only Poisson model of deaths for comparison, again using ulam().

Intercept_Model_2 <- ulam(
  alist(
    deaths ~ dpois( lambda ),
    log(lambda) <- a ,
    a ~ dnorm(0,10)
  ) ,
  data=Data_1, chains=4 , cores=4, log_lik=TRUE)
## Running MCMC with 4 parallel chains, with 1 thread(s) per chain...
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.2 seconds.
## Compare Model_1 and Intercept_Model_2 using the WAIC criterion.

Comp_Waic <-
  compare(Model_1, Intercept_Model_2, func = WAIC) %>% 
  as_tibble(rownames = "Model") %>% 
  knitr::kable(digits = 2)

Comp_Waic
Model                   WAIC        SE   dWAIC      dSE    pWAIC   weight
-----------------   --------  --------  ------  -------  -------  -------
Model_1              4425.26   1003.68    0.00       NA   133.94     0.87
Intercept_Model_2    4429.05   1065.73    3.79   135.47    75.32     0.13
## Compare the same two models using the PSIS criterion.

Comp_PSIS <-
  compare(Model_1, Intercept_Model_2, func = PSIS) %>% 
  as_tibble(rownames = "Model") %>% 
  knitr::kable(digits = 2)

Comp_PSIS
Model                   PSIS        SE   dPSIS      dSE    pPSIS   weight
-----------------   --------  --------  ------  -------  -------  -------
Model_1              4379.29    983.88    0.00       NA   110.96        1
Intercept_Model_2    4437.34   1074.48   58.05   151.24    79.47        0
## Model comparison inference: 
# Model_1 (deaths predicted by femininity) has slightly lower WAIC and PSIS scores than the intercept-only model (Intercept_Model_2) and carries most of the Akaike weight. However, the difference between the two models is small relative to the standard errors, which are very large for both. Model_1 therefore fits slightly better, but neither model fits these data well.
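## One way to see which storms the Poisson model retrodicts poorly is the pointwise Pareto k diagnostic from PSIS (a sketch; large k flags highly influential, poorly fit observations):

psis_pw <- PSIS(Model_1, pointwise = TRUE)
# Storms with the largest Pareto k values are fit worst:
as.character(Data_1$name)[order(psis_pw$k, decreasing = TRUE)][1:5]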


## Plot the association between femininity of name and deaths.

## Compute the posterior predictive (sampling) distribution:

Mdl <- sim(Model_1)

Mdl.mean <- colMeans(Mdl)

Mdl.PI <- apply(Mdl, 2, PI)

## Plot the observed deaths against femininity, with posterior mean predictions:

plot(y = Data_1$deaths, x = Data_1$femininity, col = rangi2,
     ylab = "Hurricane Deaths", xlab = "Femininity of Hurricane Names", pch = 16)
# Open points: posterior mean prediction for each storm
points(y = Mdl.mean, x = Data_1$femininity, pch = 1)
# Vertical segments: 89% posterior predictive interval for each storm
segments(x0 = Data_1$femininity, x1 = Data_1$femininity,
         y0 = Mdl.PI[1, ], y1 = Mdl.PI[2, ])

## Superimpose the mean trend and 89% sampling intervals:

lines(x = sort(Data_1$femininity), y = Mdl.mean[order(Data_1$femininity)])

lines(x = sort(Data_1$femininity), y = Mdl.PI[1, order(Data_1$femininity)], lty = 2)

lines(x = sort(Data_1$femininity), y = Mdl.PI[2, order(Data_1$femininity)], lty = 2)

## Plot inference: the strength of the association between femininity of name and deaths:

# The plot suggests at most a modest association between femininity and deaths. Most deaths occur at the high end of the femininity scale (roughly 8 to 10), while storms at the low end (roughly 0 to 3) appear to have somewhat fewer deaths. However, the deadliest storms (above about 100 deaths) appear as outliers, mostly among storms with highly feminine names. This points to strong over-dispersion: the model retrodicts the many low-death storms reasonably well but fails to retrodict the high-death storms.
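## A sketch to make "which storms fit poorly" concrete: list the storms whose observed deaths fall outside the 89% posterior predictive interval (Mdl.PI) computed above.

outside <- Data_1$deaths < Mdl.PI[1, ] | Data_1$deaths > Mdl.PI[2, ]
as.character(Data_1$name)[outside]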

12-4. Counts are nearly always over-dispersed relative to Poisson. So fit a gamma-Poisson (aka negative-binomial) model to predict deaths using femininity. Show that the over-dispersed model no longer shows as precise a positive association between femininity and deaths, with an 89% interval that overlaps zero. Can you explain why the association diminished in strength?

## Load the Hurricanes data set for the gamma-Poisson model of deaths on femininity.

data(Hurricanes)

Data<-Hurricanes



## Standardize femininity:


Data$Fem_std = (Data$femininity - mean(Data$femininity)) / sd(Data$femininity)


Dat = list(D = Data$deaths, F = Data$Fem_std)





## Fit the gamma-Poisson (aka negative-binomial) model predicting deaths (D) from femininity (F).

Model_Gmma_P_2 <- ulam(
  alist(
    D ~ dgampois(lambda, scale),
    log(lambda) <- a + bF * F,
    a ~ dnorm(1, 1),
    bF ~ dnorm(0, 1),
    scale ~ dexp(1) # strictly positive hence why exponential prior
  ),
  data = Dat, chains = 4, cores = 4, log_lik = TRUE
)
## Running MCMC with 4 parallel chains, with 1 thread(s) per chain...
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.2 seconds.
## Total execution time: 0.4 seconds.
## Posterior summary for the gamma-Poisson model (Model_Gmma_P_2):

precis(Model_Gmma_P_2)
##            mean         sd        5.5%     94.5%    n_eff     Rhat4
## a     2.9712260 0.15053028  2.74149775 3.2197452 1880.910 0.9995362
## bF    0.2114124 0.15643863 -0.04269352 0.4568642 1559.623 1.0021640
## scale 0.4527142 0.06334508  0.35550085 0.5576498 1473.167 1.0027844
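## The 89% interval for bF now overlaps zero; the posterior probability that the association is negative makes the same point numerically (a quick sketch):

post_gp <- extract.samples(Model_Gmma_P_2)
mean(post_gp$bF < 0)   # share of posterior mass below zero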
## Plot the gamma-Poisson model's (Model_Gmma_P_2) predictions of deaths (D) against femininity (F).


plot(Dat$F, Dat$D,
  pch = 16, lwd = 2,
  col = rangi2, xlab = "Femininity of Names(std)", ylab = "Hurricane deaths"
)

# Superimpose the posterior mean trend and its 89% interval from Model_Gmma_P_2:

Data_prdct = list(F = seq(from = -2, to = 2, length.out = 1e2))
lambda = link(Model_Gmma_P_2, data = Data_prdct)
lambda.mu = apply(lambda, 2, mean)
lambda.PI = apply(lambda, 2, PI)
lines(Data_prdct$F, lambda.mu)
shade(lambda.PI, Data_prdct$F)


# Compute the posterior predictive (sampling) distribution:

deaths_sim = sim(Model_Gmma_P_2, data = Data_prdct)
deaths_sim.PI = apply(deaths_sim, 2, PI)


# Superimpose the 89% sampling interval (dashed lines):

lines(Data_prdct$F, deaths_sim.PI[1, ], lty = 2)
lines(Data_prdct$F, deaths_sim.PI[2, ], lty = 2)

## Plot posterior densities of the hurricane death rate at several femininity values:

Postr <- extract.samples(Model_Gmma_P_2)
par(mfrow = c(1, 3))

for (fem in -1:1) {
  for (i in 1:1e2) {
    curve(dgamma2(
      x,
      exp(Postr$a[i] + Postr$bF[i] * fem), # linear model passed through the inverse link
      Postr$scale[i]                       # gamma scale parameter
    ),
    from = 0, to = 70, xlab = "Mean Hurricane Deaths", ylab = "Density",
    ylim = c(0, 0.19), col = col.alpha("black", 0.2),
    add = (i > 1) # start a new plot on the first draw, then overlay
    )
  }
  mtext(concat("femininity = ", fem))
}

## Inference: why the association between deaths and femininity of hurricane names diminished in strength:

# Under the gamma-Poisson model (Model_Gmma_P_2), the over-dispersed model no longer shows a precise positive association between femininity and deaths: the 89% interval for bF now overlaps zero, in contrast with Model_1. The likely reason is that the gamma-Poisson allows each hurricane to have its own death rate, drawn from a common gamma distribution whose mean is a function of femininity, rather than forcing a single rate on all hurricanes with the same femininity. That extra variation absorbs much of what the Poisson model attributed to femininity, so the extreme storms no longer pull the slope upward.
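## A rough sketch of why the interval widened. As I understand rethinking's dgampois parameterization, the gamma-Poisson variance is mu + mu^2/scale, versus just mu for a pure Poisson, so each storm's count is far less constrained by the linear model:

post_gp <- extract.samples(Model_Gmma_P_2)
mu_hat    <- exp(mean(post_gp$a))   # expected deaths at average femininity
scale_hat <- mean(post_gp$scale)
c(poisson_var = mu_hat,                               # Poisson: variance = mean
  gamma_poisson_var = mu_hat + mu_hat^2 / scale_hat)  # assumed gamma-Poisson variance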

12-5. In the data, there are two measures of a hurricane’s potential to cause death: damage_norm and min_pressure. Consult ?Hurricanes for their meanings. It makes some sense to imagine that femininity of a name matters more when the hurricane is itself deadly. This implies an interaction between femininity and either or both of damage_norm and min_pressure. Fit a series of models evaluating these interactions. Interpret and compare the models. In interpreting the estimates, it may help to generate counterfactual predictions contrasting hurricanes with masculine and feminine names. Are the effect sizes plausible?

## Load the Hurricanes data set.

data(Hurricanes)

Data <- Hurricanes


## Standardize femininity, min_pressure, and damage_norm:


Data$fem_std = (Data$femininity - mean(Data$femininity)) / sd(Data$femininity)


Dat = list(D = Data$deaths, F = Data$fem_std)


Dat$P = standardize(Data$min_pressure)

Dat$N = standardize(Data$damage_norm)



## Fit a gamma-Poisson model with an interaction between femininity (F) and min_pressure (P) for predicting deaths.


Model_P_1 <- ulam(
  alist(
    D ~ dgampois(lambda, scale),
    log(lambda) <- a + bF * F + bP * P + bFP * F * P,
    a ~ dnorm(1, 1),
    c(bF, bP, bFP) ~ dnorm(0, 1),
    scale ~ dexp(1)
  ),
  data = Dat, cores = 4, chains = 4, log_lik = TRUE
)
## Running MCMC with 4 parallel chains, with 1 thread(s) per chain...
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.3 seconds.
## Total execution time: 0.5 seconds.
## Summarize the posterior of the pressure interaction model with precis():

precis(Model_P_1)
##             mean        sd        5.5%      94.5%    n_eff     Rhat4
## a      2.7581290 0.1459503  2.52882130  3.0000828 2224.044 0.9986628
## bFP    0.3004974 0.1451408  0.07643746  0.5446021 2073.040 0.9991555
## bP    -0.6724652 0.1348172 -0.88945032 -0.4545539 1849.414 0.9998611
## bF     0.2969126 0.1457235  0.05744096  0.5161517 2261.422 0.9993146
## scale  0.5521478 0.0810264  0.43298214  0.6868408 2850.073 0.9984705
## Plot counterfactual predictions from Model_P_1, contrasting masculine (F = -1) and feminine (F = 1) names across min_pressure:


P_seq <- seq(from = -3, to = 2, length.out = 1e2)

Pred_dat <- data.frame(F = -1, P = P_seq)

lambda_m <- link(Model_P_1, data = Pred_dat)

lambda_m.mu <- apply(lambda_m, 2, mean)

lambda_m.PI <- apply(lambda_m, 2, PI)

Pred_dat <- data.frame(F = 1, P = P_seq)

lambda_f <- link(Model_P_1, data = Pred_dat)

lambda_f.mu <- apply(lambda_f, 2, mean)

lambda_f.PI <- apply(lambda_f, 2, PI)

# Plot deaths on a square-root scale for a clearer visualization:

plot(Dat$P, sqrt(Dat$D),
  pch = 1, lwd = 2, col = ifelse(Dat$F > 0, "blue", "orange"),
  xlab = "Minimum pressure (stnd)", ylab = "Deaths (Sqrt)"
)
lines(P_seq, sqrt(lambda_m.mu), lty = 2)

shade(sqrt(lambda_m.PI), P_seq)

lines(P_seq, sqrt(lambda_f.mu), lty = 1, col = "blue")

shade(sqrt(lambda_f.PI), P_seq, col = col.alpha("blue", 0.2))

## Fit a gamma-Poisson model with an interaction between femininity (F) and damage_norm (N) for predicting deaths.


Model_N_1 = ulam(
  alist(
    D ~ dgampois(lambda, scale),
    log(lambda) <- a + bf * F + bn * N + bfn * F * N,
    a ~ dnorm(1, 1),
    c(bf, bn, bfn) ~ dnorm(0, 1),
    scale ~ dexp(1)
  ),
  data = Dat, chains = 4, cores = 4, log_lik = TRUE
)
## Running MCMC with 4 parallel chains, with 1 thread(s) per chain...
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.4 seconds.
## Total execution time: 0.5 seconds.
## Summarize the posterior of the damage interaction model with precis():

precis(Model_N_1)
##             mean        sd        5.5%     94.5%    n_eff     Rhat4
## a     2.56780743 0.1263407  2.36285670 2.7737027 2837.329 0.9988527
## bfn   0.31392844 0.1993745 -0.00528385 0.6133865 2370.809 0.9997482
## bn    1.24677513 0.2136046  0.91553689 1.5868960 2115.580 1.0014993
## bf    0.09112078 0.1286935 -0.11496481 0.3025833 2105.661 1.0002539
## scale 0.68381397 0.1017399  0.52777699 0.8542334 2184.786 0.9988263
## Model comparison:

## Compare the two interaction models using WAIC.

Comp_Waic <-
  compare(Model_P_1, Model_N_1, func = WAIC) %>% 
  as_tibble(rownames = "Model") %>% 
  knitr::kable(digits = 2)

Comp_Waic
Model         WAIC      SE   dWAIC    dSE   pWAIC   weight
----------  ------  ------  ------  -----  ------  -------
Model_N_1   670.19   33.81    0.00     NA    6.59        1
Model_P_1   694.32   38.40   24.12     19    8.40        0
## Compare the two interaction models using PSIS.

Comp_PSIS <- 
  compare(Model_P_1, Model_N_1, func = PSIS) %>% 
  as_tibble(rownames = "Model") %>% 
  knitr::kable(digits = 2)

Comp_PSIS
Model         PSIS      SE   dPSIS    dSE   pPSIS   weight
----------  ------  ------  ------  -----  ------  -------
Model_N_1   670.50   34.06    0.00     NA    6.74        1
Model_P_1   694.58   38.70   24.08  19.11    8.53        0
## Model comparison findings: 
# Comparing Model_P_1 (femininity × min_pressure interaction) with Model_N_1 (femininity × damage_norm interaction) by WAIC and PSIS gives a difference of about 24 points in favor of Model_N_1, with a standard error of that difference of about 19, so the difference is not decisive. Still, Model_N_1 has a slightly smaller standard error, carries essentially all of the Akaike weight, and scores slightly lower on both WAIC and PSIS, so the damage_norm interaction model fits somewhat better than the pressure model (Model_P_1).
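## Counterfactual contrast (a sketch): expected deaths for a highly damaging storm (damage_norm two standard deviations above the mean, N = 2) under a feminine name (F = 1) versus a masculine name (F = -1), using Model_N_1.

lambda_fem  <- link(Model_N_1, data = data.frame(F = 1,  N = 2))
lambda_masc <- link(Model_N_1, data = data.frame(F = -1, N = 2))
contrast <- lambda_fem - lambda_masc   # difference in expected deaths
mean(contrast)
PI(contrast)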



## Plot counterfactual predictions from Model_N_1, contrasting masculine and feminine names across damage_norm:


N_seq <- seq(from = -1, to = 5.5, length.out = 1e2) 

Pred_dat <- data.frame(F = -1, N = N_seq)

lambda_m <- link(Model_N_1, data = Pred_dat)

lambda_m.mu <- apply(lambda_m, 2, mean)

lambda_m.PI <- apply(lambda_m, 2, PI)


Pred_dat <- data.frame(F = 1, N = N_seq)
lambda_f <- link(Model_N_1, data = Pred_dat)

lambda_f.mu <- apply(lambda_f, 2, mean)
lambda_f.PI <- apply(lambda_f, 2, PI)


# Plot deaths on a square-root scale for a clearer visualization:

plot(Dat$N, sqrt(Dat$D),
  pch = 1, lwd = 2, col = ifelse(Dat$F > 0, "purple", "red"),
  xlab = "Normalized damage norm (std)", ylab = "Deaths (Sqrt)"
)
lines(N_seq, sqrt(lambda_m.mu), lty = 2)
shade(sqrt(lambda_m.PI), N_seq)

lines(N_seq, sqrt(lambda_f.mu), lty = 1, col = "yellow")
shade(sqrt(lambda_f.PI), N_seq, col = col.alpha("yellow", 0.2))

## Effective sample size comparison:

precis(Model_N_1)
##             mean        sd        5.5%     94.5%    n_eff     Rhat4
## a     2.56780743 0.1263407  2.36285670 2.7737027 2837.329 0.9988527
## bfn   0.31392844 0.1993745 -0.00528385 0.6133865 2370.809 0.9997482
## bn    1.24677513 0.2136046  0.91553689 1.5868960 2115.580 1.0014993
## bf    0.09112078 0.1286935 -0.11496481 0.3025833 2105.661 1.0002539
## scale 0.68381397 0.1017399  0.52777699 0.8542334 2184.786 0.9988263
precis(Model_P_1)
##             mean        sd        5.5%      94.5%    n_eff     Rhat4
## a      2.7581290 0.1459503  2.52882130  3.0000828 2224.044 0.9986628
## bFP    0.3004974 0.1451408  0.07643746  0.5446021 2073.040 0.9991555
## bP    -0.6724652 0.1348172 -0.88945032 -0.4545539 1849.414 0.9998611
## bF     0.2969126 0.1457235  0.05744096  0.5161517 2261.422 0.9993146
## scale  0.5521478 0.0810264  0.43298214  0.6868408 2850.073 0.9984705
## Effective sample size (n_eff) comparison: both models mixed well, with n_eff above roughly 1800 and Rhat near 1 for every parameter; neither the damage_norm model (Model_N_1) nor the pressure model (Model_P_1) is clearly ahead on sampling efficiency.



## Plot findings:

# Model_N_1 (femininity × damage_norm): the association appears strongest in the upper-left region of the plot, at the low end of the standardized damage scale, yet the plot does not clearly separate the predictions for feminine versus masculine names there. The apparent interaction between femininity and damage_norm diminishes as we move right, toward higher standardized damage, where observations are sparse.


# Model_P_1" Plot Findings for evaluating the interactions between "femininity"  and "minimum pressure":

# The plot for evaluating interactions between "femininity" of hurricane names with minimum pressure yielded a much strong association and interaction between the minimum pressure scale of negative 1 and positive 1.5, further it is also apparent that the this association or interaction between "femininity" and normalized "damage norm" diminishes comparitively, as we progress towards the more negative or left side of the minimum pressure scale, also some extreme deviated observations, and possibly  outliers were also observed in the plot.