Analysis (Fact)

Perform a goodness-of-fit test.

The test statistic is χ² = 14.413 (likelihood ratio, 7 degrees of freedom).

The test has a p-value of 0.0443.

The test provides evidence that the distribution is not binomial: the p-value (0.0443) is less than 0.05, so we reject the null hypothesis that the number of successes per group follows a binomial distribution.
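As a rough check on the decision above, the p-value can be recovered from the reported statistic and its degrees of freedom. This is only a sketch: the variable names (`x2`, `df`, `alpha`) are illustrative, and the numbers are copied from the `summary(T5_fit)` output in this report.

```r
# Values copied from the goodness-of-fit summary in this report.
x2 <- 14.41346  # likelihood-ratio statistic
df <- 7         # 9 count categories - 1 - 1 estimated parameter (prob)

# Upper-tail probability of a chi-squared distribution with df degrees of freedom.
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

# Decision at the 5% significance level.
alpha <- 0.05
reject_null <- p_value < alpha

print(p_value)
print(reject_null)
```

Because the p-value falls only slightly below 0.05, the conclusion would change at a stricter level such as 1%.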

## Tables and Figures
## Find the maximum dice size: 10

## Find the group size: 8

## Considering a roll of 1/2/3 to be a success, find the number of successes per grouping.

## Number of successes in 8 rolls of a 10-sided die.

##            0    1    2    3    4    5    6    7    8    
## Success 1681 5718 8495 7531 4017 1301  251   39    1    
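The counting step described above can be illustrated on simulated data. The sketch below does not use the report's data file: it assumes a fair 10-sided die, 1000 hypothetical groups, and made-up names (`rolls`, `successes`, `success_table`), purely to show how a table of this shape arises.

```r
set.seed(1)  # make the simulated rolls reproducible

n_groups   <- 1000  # hypothetical number of groups (the report has 29034)
group_size <- 8     # rolls per group
die_sides  <- 10    # maximum face value

# One row per group, one column per roll.
rolls <- matrix(sample(1:die_sides, n_groups * group_size, replace = TRUE),
                nrow = n_groups, ncol = group_size)

# A roll of 1, 2, or 3 counts as a success; count successes per group.
successes <- rowSums(rolls <= 3)

# Tabulate, keeping every possible count 0..8 even if it never occurs.
success_table <- table(factor(successes, levels = 0:group_size))
print(success_table)
```

Fixing the factor levels to 0 through 8 keeps empty categories in the table, which matters when the table is later passed to a goodness-of-fit routine.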
library(readxl)
library(vcd)
## Loading required package: grid
library(data.table)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.1      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.5.0 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::between()   masks data.table::between()
## ✖ dplyr::filter()    masks stats::filter()
## ✖ dplyr::first()     masks data.table::first()
## ✖ dplyr::lag()       masks stats::lag()
## ✖ dplyr::last()      masks data.table::last()
## ✖ purrr::transpose() masks data.table::transpose()
library(readr)
data <- read_csv("/Users/mex/2023 Session/data.3.csv")
## Rows: 29034 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (8): roll 1, roll 2, roll 3, roll 4, roll 5, roll 6, roll 7, roll 8
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
spec(data)
## cols(
##   `roll 1` = col_double(),
##   `roll 2` = col_double(),
##   `roll 3` = col_double(),
##   `roll 4` = col_double(),
##   `roll 5` = col_double(),
##   `roll 6` = col_double(),
##   `roll 7` = col_double(),
##   `roll 8` = col_double()
## )
# Count the successes (rolls of 1, 2, or 3) in each group of 8 rolls.
Success <- matrix(NA, nrow = nrow(data), ncol = 1)
for (row in seq_len(nrow(data))) {
  # data[row, ] < 4 is TRUE for each successful roll in this group
  Success[row, 1] <- sum(data[row, ] < 4)
}

T0 <- table(factor(Success[,1], levels = 0:8))
print(T0)
## 
##    0    1    2    3    4    5    6    7    8 
## 1681 5718 8495 7531 4017 1301  251   39    1
T1=as.matrix(table(factor(Success, levels = 0:8)))

colnames(T1) = c("Success")
T2=t(as.table(T1))
T2
##            0    1    2    3    4    5    6    7    8
## Success 1681 5718 8495 7531 4017 1301  251   39    1
T5_fit= goodfit(T0, type="binomial", par = list(size=8))
T5_fit
## 
## Observed and fitted values for binomial distribution
## with parameters estimated by `ML' 
## 
##  count observed     fitted pearson residual
##      0     1681 1675.44957        0.1356003
##      1     5718 5741.97282       -0.3163651
##      2     8495 8609.32218       -1.2321002
##      3     7531 7376.30287        1.8012040
##      4     4017 3949.92216        1.0672958
##      5     1301 1353.68716       -1.4320085
##      6      251  289.95333       -2.2876022
##      7       39   35.48950        0.5892770
##      8        1    1.90042       -0.6531612
T5_fit$par
## $prob
## [1] 0.2999113
## 
## $size
## [1] 8
summary(T5_fit)  # Likelihood-ratio X^2 = 14.41 on 7 df, p-value = 0.0443: at the 5% level we reject the null hypothesis that the counts follow a binomial distribution with size 8
## 
##   Goodness-of-fit test for binomial distribution
## 
##                       X^2 df  P(> X^2)
## Likelihood Ratio 14.41346  7 0.0442977
# Hanging rootogram of observed vs. fitted counts

plot(T5_fit, scale = "raw", type = "hanging", shade = TRUE, main = "Number of successes in 8 rolls of a 10-sided die")
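The fitted values and the likelihood-ratio statistic reported above can be reproduced by hand in base R, which may help make clear what the test measures. This is a sketch under the assumption that the probability is estimated as the overall success proportion (which agrees with the reported `$prob`); the names `observed`, `p_hat`, `fitted_counts`, and `g2` are illustrative.

```r
# Observed counts of 0..8 successes, copied from the table in this report.
observed <- c(1681, 5718, 8495, 7531, 4017, 1301, 251, 39, 1)
counts   <- 0:8
size     <- 8
n        <- sum(observed)  # number of groups

# Estimate of the success probability: total successes / total rolls.
p_hat <- sum(counts * observed) / (size * n)

# Expected counts under Binomial(size, p_hat).
fitted_counts <- n * dbinom(counts, size = size, prob = p_hat)

# Likelihood-ratio statistic: 2 * sum(O * log(O / E)).
g2 <- 2 * sum(observed * log(observed / fitted_counts))

# Degrees of freedom: categories - 1 - number of estimated parameters.
df <- length(observed) - 1 - 1

print(c(p_hat = p_hat, g2 = g2, df = df))
```

The estimated probability is very close to 3/10, the success probability for a fair 10-sided die, so the rejection reflects the shape of the distribution rather than a large shift in the mean.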