Compute One-Proportion Z-Test in R

res <- prop.test(x = 95, n = 160, p = 0.5, 
                 correct = FALSE)
# Printing the results
res 
## 
##  1-sample proportions test without continuity correction
## 
## data:  95 out of 160, null probability 0.5
## X-squared = 5.625, df = 1, p-value = 0.01771
## alternative hypothesis: true p is not equal to 0.5
## 95 percent confidence interval:
##  0.5163169 0.6667870
## sample estimates:
##       p 
## 0.59375

Interpretation of the result

# The p-value of the test is 0.01771, which is less than the significance level alpha = 0.05. We can conclude that the proportion of males with cancer is significantly different from 0.5 (p-value = 0.01771).
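
The X-squared value reported by prop.test() without continuity correction is the square of the usual one-proportion z statistic. The following is a minimal sketch that recomputes the test by hand from the counts used above; the variable names (x, n, p0, p_hat, z, p_val) are illustrative and not part of the original example.

```r
# Observed successes, sample size, and null proportion from the example above
x  <- 95
n  <- 160
p0 <- 0.5

p_hat <- x / n                                   # sample proportion
z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # z statistic under H0
p_val <- 2 * pnorm(-abs(z))                      # two-sided p-value

z^2    # same value as X-squared from prop.test(..., correct = FALSE)
p_val  # same value as the p-value from prop.test(..., correct = FALSE)
```

Squaring z gives a chi-square statistic with one degree of freedom, which is why prop.test() prints X-squared rather than z.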

Access to the values returned by prop.test()

The R code for getting these values is as follows:

# printing the p-value
res$p.value
## [1] 0.01770607
# printing the estimated proportion
res$estimate
##       p 
## 0.59375
# printing the confidence interval
res$conf.int
## [1] 0.5163169 0.6667870
## attr(,"conf.level")
## [1] 0.95
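
For a single proportion without continuity correction, the interval in res$conf.int is the Wilson score interval. The sketch below recomputes it from the same counts; the names conf_level, centre, half_width and wilson_ci are illustrative assumptions, not objects from the original code.

```r
# Counts from the one-proportion example above
x <- 95
n <- 160
conf_level <- 0.95

p_hat <- x / n
z     <- qnorm(1 - (1 - conf_level) / 2)   # normal quantile, about 1.96

# Wilson score interval: centre and half-width
centre     <- (p_hat + z^2 / (2 * n)) / (1 + z^2 / n)
half_width <- z * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)

wilson_ci <- c(lower = centre - half_width, upper = centre + half_width)
round(wilson_ci, 7)
```

Note that the interval is not centred on the sample proportion 0.59375 but is pulled slightly towards 0.5.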

Compute Two-Proportions Z-Test in R

res1 <- prop.test(x = c(490, 400), n = c(500, 500))
# Printing the results
res1 
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(490, 400) out of c(500, 500)
## X-squared = 80.909, df = 1, p-value < 2.2e-16
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.1408536 0.2191464
## sample estimates:
## prop 1 prop 2 
##   0.98   0.80

Interpretation of the result

# The p-value of the test is 2.363e-19, which is less than the significance level alpha = 0.05. We can conclude that the proportion of smokers is significantly different in the two groups (p-value = 2.363e-19).
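
The two-sample test can also be recomputed by hand as a pooled z-test. The sketch below uses the same counts and shows the statistic both without and with the Yates continuity correction, which prop.test() applies by default; the names p_pool, se, z_plain and z_yates are illustrative.

```r
# Counts from the example above: successes and group sizes
x <- c(490, 400)
n <- c(500, 500)

p_hat  <- x / n                     # sample proportions: 0.98 and 0.80
p_pool <- sum(x) / sum(n)           # pooled proportion under H0: p1 = p2
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
delta  <- abs(p_hat[1] - p_hat[2])  # absolute difference in proportions

z_plain <- delta / se                       # no continuity correction
z_yates <- (delta - 0.5 * sum(1 / n)) / se  # Yates continuity correction

z_plain^2            # compare with prop.test(x, n, correct = FALSE)$statistic
z_yates^2            # compare with prop.test(x, n)$statistic (the default)
2 * pnorm(-z_yates)  # two-sided p-value for the corrected statistic
```

With samples this large the correction changes the statistic only slightly, and the conclusion is the same either way.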

Access to the values returned by prop.test() function

# printing the p-value
res1$p.value
## [1] 2.363439e-19
# printing the estimated proportions
res1$estimate
## prop 1 prop 2 
##   0.98   0.80
# printing the confidence interval
res1$conf.int
## [1] 0.1408536 0.2191464
## attr(,"conf.level")
## [1] 0.95

Chi-square Goodness of Fit Test in R

Question 1: Are these colors equally common? If the colors were equally distributed, the expected proportion would be 1/3 for each color.

Answer to Q1: Are the colors equally common?

tulip <- c(81, 50, 27)
res2 <- chisq.test(tulip, p = c(1/3, 1/3, 1/3))
res2
## 
##  Chi-squared test for given probabilities
## 
## data:  tulip
## X-squared = 27.886, df = 2, p-value = 8.803e-07
# The p-value of the test is 8.803e-07, which is less than the significance level alpha = 0.05. We can conclude that the colors are not equally distributed (p-value = 8.803e-07).
# Access to the expected values
res2$expected
## [1] 52.66667 52.66667 52.66667
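
The goodness-of-fit statistic is the sum of (observed - expected)^2 / expected over the categories. The following sketch recomputes it from the tulip counts; the color labels follow the red, yellow, white order used in this section, and the names observed, expected, x2 and p_val are illustrative.

```r
# Observed tulip counts (order: red, yellow, white)
observed <- c(red = 81, yellow = 50, white = 27)

# Expected counts under equal proportions (1/3 each)
expected <- sum(observed) * rep(1 / 3, length(observed))

x2    <- sum((observed - expected)^2 / expected)   # chi-square statistic
df    <- length(observed) - 1                      # degrees of freedom
p_val <- pchisq(x2, df = df, lower.tail = FALSE)   # upper-tail p-value

x2
p_val
```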

Question 2: Suppose that, in the region where you collected the data, the ratio of red, yellow and white tulips is 3:2:1 (3+2+1 = 6). This means that the expected proportions are:

3/6 (= 1/2) for red
2/6 (= 1/3) for yellow
1/6 for white

Answer to Q2 comparing observed to expected proportions

tulip <- c(81, 50, 27)
res3 <- chisq.test(tulip, p = c(1/2, 1/3, 1/6))
res3
## 
##  Chi-squared test for given probabilities
## 
## data:  tulip
## X-squared = 0.20253, df = 2, p-value = 0.9037
# The p-value of the test is 0.9037, which is greater than the significance level alpha = 0.05. We can conclude that the observed proportions are not significantly different from the expected proportions.
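
Instead of converting the 3:2:1 ratio to proportions by hand, the ratio can be normalized in code, or passed directly to chisq.test() with rescale.p = TRUE. This is a small sketch of both options; the names ratio, p_expected, res_a and res_b are illustrative.

```r
tulip <- c(81, 50, 27)
ratio <- c(3, 2, 1)   # red : yellow : white

# Option 1: normalize the ratio so that it sums to 1
p_expected <- ratio / sum(ratio)
res_a <- chisq.test(tulip, p = p_expected)

# Option 2: let chisq.test() rescale the ratio itself
res_b <- chisq.test(tulip, p = ratio, rescale.p = TRUE)

res_a$p.value
res_b$p.value
```

Both calls test the same hypothesis; the second one avoids rounding errors when the proportions are typed in as decimals.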

Access to the values returned by chisq.test() function

# printing the p-value
res3$p.value
## [1] 0.9036928
# chisq.test() returns no estimate component, so this is NULL
res3$estimate
## NULL

Chi-Square Test of Independence in R

Data format: Contingency tables

# Import the data
file_path <- "http://www.sthda.com/sthda/RDoc/data/housetasks.txt"
housetasks <- read.delim(file_path, row.names = 1)
# head(housetasks)

Graphical display of contingency tables

Compute chi-square test in R

chisq <- chisq.test(housetasks)
chisq
## 
##  Pearson's Chi-squared test
## 
## data:  housetasks
## X-squared = 1944.5, df = 36, p-value < 2.2e-16
# Observed counts
chisq$observed
##            Wife Alternating Husband Jointly
## Laundry     156          14       2       4
## Main_meal   124          20       5       4
## Dinner       77          11       7      13
## Breakfeast   82          36      15       7
## Tidying      53          11       1      57
## Dishes       32          24       4      53
## Shopping     33          23       9      55
## Official     12          46      23      15
## Driving      10          51      75       3
## Finances     13          13      21      66
## Insurance     8           1      53      77
## Repairs       0           3     160       2
## Holidays      0           1       6     153
# Expected counts
round(chisq$expected,2)
##             Wife Alternating Husband Jointly
## Laundry    60.55       25.63   38.45   51.37
## Main_meal  52.64       22.28   33.42   44.65
## Dinner     37.16       15.73   23.59   31.52
## Breakfeast 48.17       20.39   30.58   40.86
## Tidying    41.97       17.77   26.65   35.61
## Dishes     38.88       16.46   24.69   32.98
## Shopping   41.28       17.48   26.22   35.02
## Official   33.03       13.98   20.97   28.02
## Driving    47.82       20.24   30.37   40.57
## Finances   38.88       16.46   24.69   32.98
## Insurance  47.82       20.24   30.37   40.57
## Repairs    56.77       24.03   36.05   48.16
## Holidays   55.05       23.30   34.95   46.70
# Pearson residuals
round(chisq$residuals, 3)
##              Wife Alternating Husband Jointly
## Laundry    12.266      -2.298  -5.878  -6.609
## Main_meal   9.836      -0.484  -4.917  -6.084
## Dinner      6.537      -1.192  -3.416  -3.299
## Breakfeast  4.875       3.457  -2.818  -5.297
## Tidying     1.702      -1.606  -4.969   3.585
## Dishes     -1.103       1.859  -4.163   3.486
## Shopping   -1.289       1.321  -3.362   3.376
## Official   -3.659       8.563   0.443  -2.459
## Driving    -5.469       6.836   8.100  -5.898
## Finances   -4.150      -0.852  -0.742   5.750
## Insurance  -5.758      -4.277   4.107   5.720
## Repairs    -7.534      -4.290  20.646  -6.651
## Holidays   -7.419      -4.620  -4.897  15.556
library(corrplot)
## corrplot 0.92 loaded
corrplot(chisq$residuals, is.cor = FALSE)
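
The expected counts and Pearson residuals shown above can be recomputed directly from the table margins: each expected count is row total * column total / grand total, and each residual is (observed - expected) / sqrt(expected). The sketch below rebuilds the table from the observed counts printed above so that it runs without the download; the names expected, pearson and x2 are illustrative.

```r
# Observed counts, copied from the table printed above
housetasks <- matrix(
  c(156, 14,   2,   4,
    124, 20,   5,   4,
     77, 11,   7,  13,
     82, 36,  15,   7,
     53, 11,   1,  57,
     32, 24,   4,  53,
     33, 23,   9,  55,
     12, 46,  23,  15,
     10, 51,  75,   3,
     13, 13,  21,  66,
      8,  1,  53,  77,
      0,  3, 160,   2,
      0,  1,   6, 153),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")
  )
)

# Expected counts under independence of rows and columns
expected <- outer(rowSums(housetasks), colSums(housetasks)) / sum(housetasks)

# Pearson residuals and the resulting chi-square statistic
pearson <- (housetasks - expected) / sqrt(expected)
x2      <- sum(pearson^2)

round(expected[1:3, ], 2)
round(pearson[1:3, ], 3)
x2
```

Large positive residuals mark cells with more observations than independence would predict, and large negative residuals mark cells with fewer.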

# Contribution in percentage (%)
contrib <- 100*chisq$residuals^2/chisq$statistic
round(contrib, 3)
##             Wife Alternating Husband Jointly
## Laundry    7.738       0.272   1.777   2.246
## Main_meal  4.976       0.012   1.243   1.903
## Dinner     2.197       0.073   0.600   0.560
## Breakfeast 1.222       0.615   0.408   1.443
## Tidying    0.149       0.133   1.270   0.661
## Dishes     0.063       0.178   0.891   0.625
## Shopping   0.085       0.090   0.581   0.586
## Official   0.688       3.771   0.010   0.311
## Driving    1.538       2.403   3.374   1.789
## Finances   0.886       0.037   0.028   1.700
## Insurance  1.705       0.941   0.868   1.683
## Repairs    2.919       0.947  21.921   2.275
## Holidays   2.831       1.098   1.233  12.445
# Visualize the contribution
corrplot(contrib, is.cor = FALSE)
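
Because the squared residuals add up to the chi-square statistic, the contributions sum to 100%, and sorting them shows which cells drive the association. The following sketch ranks the cells; it rebuilds the table from the observed counts printed earlier so that it is self-contained, and the names fit, contrib_df and top_cells are illustrative.

```r
# Observed counts, copied from the table printed earlier
housetasks <- matrix(
  c(156, 14,   2,   4,
    124, 20,   5,   4,
     77, 11,   7,  13,
     82, 36,  15,   7,
     53, 11,   1,  57,
     32, 24,   4,  53,
     33, 23,   9,  55,
     12, 46,  23,  15,
     10, 51,  75,   3,
     13, 13,  21,  66,
      8,  1,  53,  77,
      0,  3, 160,   2,
      0,  1,   6, 153),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("Laundry", "Main_meal", "Dinner", "Breakfeast", "Tidying", "Dishes",
      "Shopping", "Official", "Driving", "Finances", "Insurance", "Repairs",
      "Holidays"),
    c("Wife", "Alternating", "Husband", "Jointly")
  )
)

fit     <- chisq.test(housetasks)
contrib <- 100 * fit$residuals^2 / fit$statistic   # contribution in %

sum(contrib)   # the contributions add up to 100

# Long format (columns Var1, Var2, Freq), sorted by decreasing contribution
contrib_df <- as.data.frame(as.table(contrib))
top_cells  <- contrib_df[order(-contrib_df$Freq), ]
head(top_cells, 3)
```

In this table the cells for Repairs/Husband and Holidays/Jointly contribute the most, which agrees with the largest entries in the contribution table above.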

Access to the values returned by chisq.test() function

# printing the p-value
chisq$p.value
## [1] 0
# chisq.test() returns no estimate component, so this is NULL
chisq$estimate
## NULL
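
The p-value prints as 0 because the true value is smaller than the smallest number R can represent as a double, not because it is exactly zero. One possible way to see its order of magnitude is to ask pchisq() for the logarithm of the upper-tail probability. This is a sketch that uses the rounded statistic printed above (1944.5), so the result is approximate; the names x2, log_p and log10_p are illustrative.

```r
x2 <- 1944.5   # chi-square statistic as printed above (rounded)
df <- 36       # degrees of freedom as printed above

# The ordinary upper-tail probability underflows to 0
pchisq(x2, df = df, lower.tail = FALSE)

# The natural log of the same probability stays finite
log_p   <- pchisq(x2, df = df, lower.tail = FALSE, log.p = TRUE)
log10_p <- log_p / log(10)   # convert to a power of 10

log10_p
```

When reporting such a result, it is usually enough to write p-value < 2.2e-16, as in the printed test output.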