cor.test(rock$area,rock$perm)
##
## Pearson's product-moment correlation
##
## data: rock$area and rock$perm
## t = -2.9305, df = 46, p-value = 0.005254
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.6118206 -0.1267915
## sample estimates:
## cor
## -0.396637
Analysis:
The 95% confidence interval for rho ranged from -.61 to -.13. This is a fairly narrow range that doesn’t intersect 0, so if this is confidence interval isn’t an outlier, this sample has a possibility to be negatively correlated. This is further evident given how low the p-value, .0052, is compared with the alpha of .05. Given this information, we would reject the null hypothesis that the true correlation = 0. We can see that rock area and rock permeability have a moderate inverse correlation of -.40.
#install.packages(“BayesFactor”)
library(BayesFactor)
## Loading required package: coda
## Loading required package: Matrix
## ************
## Welcome to BayesFactor 0.9.12-4.2. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).
##
## Type BFManual() to open the manual.
## ************
bfCorTest <- function (x,y) # Get r from BayesFactor
{
zx <- scale(x) # Standardize X
zy <- scale(y) # Standardize Y
zData <- data.frame(x=zx,rhoNot0=zy) # Put in a data frame
bfOut <- generalTestBF(x ~ rhoNot0, data=zData) # linear coefficient
mcmcOut <- posterior(bfOut,iterations=10000) # posterior samples
print(summary(mcmcOut[,'rhoNot0'])) # Show the HDI for r
return(bfOut) # Return Bayes factor object
}
bfCorTest(rock$area, rock$perm)
##
## Iterations = 1:10000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## -0.348443 0.137369 0.001374 0.001532
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## -0.62030 -0.44033 -0.34858 -0.25577 -0.07943
## Bayes factor analysis
## --------------
## [1] rhoNot0 : 8.072781 ±0%
##
## Against denominator:
## Intercept only
## ---
## Bayes factor type: BFlinearModel, JZS
Analysis:
The output of the Bayesian method indicates a mean correlation in the posterior distribution of rho of -.35. This is slightly larger than the -.40 produced by the correlation test. The High Density Interval for the Bayesian Method is slightly wider, ranging from -.60 to -.09, but we this range gives us more certainty of the true correlation. The Bayes Factor, rhoNot0, also supports the alternative hypothesis 8:1.
ucbDF <- UCBAdmissions[,,1]
ucbDF
## Gender
## Admit Male Female
## Admitted 512 89
## Rejected 313 19
chisq.test(ucbDF,correct = FALSE)
##
## Pearson's Chi-squared test
##
## data: ucbDF
## X-squared = 17.248, df = 1, p-value = 3.28e-05
Analysis:
This chi-squared test evaluates a 2X2 table regarding the admittance of students by gender at UC Berkeley. The reported value is 17.25 with 1 degree of freedom. The degree of freedom gives insight into how many objects may vary to perform this test. In this instance, since it was a 2 X 2 table, only 1 variable could be manipulated. The extremely low p-value of 3.28e-05 indicates that two factors aren’t independent and that a male applicants are more likely to be rejected than woman from UC Berkley
ctUCOUT <- contingencyTableBF(ucbDF, sampleType = 'poisson')
ctUCOUT
## Bayes factor analysis
## --------------
## [1] Non-indep. (a=1) : 1111.64 ±0%
##
## Against denominator:
## Null, independence, a = 1
## ---
## Bayes factor type: BFcontingencyTable, poisson
Analysis:
The Bayes Factor further proves that the two factors aren’t independent from one another. The strong non-indep indicates a strong 1111:1 support of the alternative hypothesis.
cnPostOut <- contingencyTableBF(ucbDF, sampleType = 'poisson', posterior = TRUE, iterations = 10000)
summary(cnPostOut)
##
## Iterations = 1:10000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## lambda[1,1] 510.62 22.674 0.22674 0.22262
## lambda[2,1] 312.22 17.637 0.17637 0.17637
## lambda[1,2] 89.54 9.529 0.09529 0.09529
## lambda[2,2] 19.96 4.465 0.04465 0.04465
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## lambda[1,1] 468.40 494.91 510.05 525.87 556.39
## lambda[2,1] 278.48 300.16 311.89 323.91 347.98
## lambda[1,2] 71.95 83.01 89.13 95.63 109.52
## lambda[2,2] 12.14 16.77 19.62 22.77 29.71
maleProp <- cnPostOut[,'lambda[1,1]']/cnPostOut[,'lambda[2,1]']
femaleProp <- cnPostOut[,'lambda[2,1]']/cnPostOut[,'lambda[2,2]']
diffPop <- maleProp - femaleProp
hist(diffPop)
abline(v=quantile(diffPop,c(0.025)), col='black')
abline(v=quantile(diffPop,c(0.975)), col='black')
mean(diffPop)
## [1] -14.83673
Analysis: The mean difference in the porportion of the columns is -14.9.Each of the Lambda values represents a factor in the original data set(Male accepted, male rejected, female accepted, female rejected).While the mean values returns a value similar to the original data, the HDI indicates ranges of each variable. The male accepted ranges from 468 - 555 at 95%. The male rejected could be as low as 279 and as high as 348 at 95%. The female accepted can range from 72 - 109 and the female rejected can be as low as 12 or as high as 30 @ 95% HDI.