HW_3_14July2023

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(infer)
library(EnvStats)

## 
## Attaching package: 'EnvStats'
## 
## The following objects are masked from 'package:stats':
## 
##     predict, predict.lm
## 
## The following object is masked from 'package:base':
## 
##     print.default

msdlabs <- read.csv("msd_labs.csv")

#msdlabs$race <- ifelse (msdlabs$AFAM == 1,1,NA )

##msdlabs$race <- ifelse (msdlabs$WHITE == 1,1)

msdlabs_race <- unite(msdlabs, race, AFAM, WHITE,OTHRACE)

### There were no variables in the file that had more than 2 levels. I attempted to combine
### the race codes into 1 variable so I could use it in the ANOVA. I first tried the ifelse
### command; it didn't work. I then tried the unite command. It worked, but gave a result I
### could not use. Somehow that code was deleted. In my R code, there is a small blue box
### in the chunk above this one. The unite code should be in that chunk. I took the result of the
### unite function, imported it into NCSS, edited the race column in NCSS, exported it to .csv,
### then read it into R and ran the ANOVA. As I explained in the write-up at the end of the code,
### something didn't work in R, but when I ran the recoded file in NCSS, it worked as expected.
### Go figure.
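
The recoding step described above can also be done without leaving R. The following is a minimal sketch, not the method actually used for this homework: the `toy` data frame and its values are made up for illustration, and the labels "Black", "White", and "Other" are assumed names for the AFAM, WHITE, and OTHRACE indicators. Note that each `ifelse()` call needs both a "yes" and a "no" value; the commented-out attempt with `WHITE` supplies only one.

```r
# Toy data standing in for msd_labs.csv (values are invented for illustration).
toy <- data.frame(
  AFAM    = c(1, 0, 0, 0),
  WHITE   = c(0, 1, 0, 0),
  OTHRACE = c(0, 0, 1, 0)
)

# Nested ifelse(): each call takes a test, a "yes" value, and a "no" value.
# Rows where none of the three indicators equals 1 are left as NA.
toy$race <- ifelse(toy$AFAM == 1, "Black",
            ifelse(toy$WHITE == 1, "White",
            ifelse(toy$OTHRACE == 1, "Other", NA)))

# Store the result as a factor so aov() treats it as a grouping variable.
toy$race <- factor(toy$race, levels = c("Black", "White", "Other"))

table(toy$race, useNA = "ifany")
```

Rows left as `NA` here would need their own label (for example, a fourth group) before running a 4-group ANOVA.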

write.csv(msdlabs_race, file = "msdlabs_race.csv")

msdlabs_recoded <- read.csv("msdlabs_recoded.csv")

###install.packages("lawstat")
###install.packages("sjstats")
library(lawstat)
library(sjstats)

## 
## Attaching package: 'sjstats'

## The following object is masked from 'package:EnvStats':
## 
##     cv

## The following object is masked from 'package:infer':
## 
##     p_value

race_anova <- aov(HOUSEHOLD ~ I6, data = msdlabs_recoded)
summary(race_anova)

##               Df  Sum Sq Mean Sq F value   Pr(>F)    
## I6             1   21144   21144   11.67 0.000652 ***
## Residuals   1554 2815846    1812                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

levene.test(msdlabs_recoded$HOUSEHOLD, msdlabs_recoded$I6)

## 
##  Modified robust Brown-Forsythe Levene-type test based on the absolute
##  deviations from the median
## 
## data:  msdlabs_recoded$HOUSEHOLD
## Test Statistic = 3.3522, p-value = 0.01833

effectsize::cohens_f(race_anova)

## For one-way between subjects designs, partial eta squared is equivalent
##   to eta squared. Returning eta squared.

## # Effect Size for ANOVA
## 
## Parameter | Cohen's f |      95% CI
## -----------------------------------
## I6        |      0.09 | [0.04, Inf]
## 
## - One-sided CIs: upper bound fixed at [Inf].
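
As a check on the effect size, Cohen's f can be recomputed by hand from the sums of squares in the ANOVA table above. This sketch uses only the two Sum Sq values printed by `summary(race_anova)`; the variable names are chosen here for illustration.

```r
# Sums of squares copied from the ANOVA table above.
ss_effect   <- 21144
ss_residual <- 2815846

# Eta squared: share of total variation attributed to the I6 term.
eta_sq <- ss_effect / (ss_effect + ss_residual)

# Cohen's f = sqrt(eta^2 / (1 - eta^2)).
cohens_f <- sqrt(eta_sq / (1 - eta_sq))

round(cohens_f, 2)  # 0.09, matching the effectsize output
```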

### TukeyHSD(race_anova, conf.level=.95)

### A one-way ANOVA was run to determine whether there were statistically significant differences
### in mean household income among 4 groups of students: Black, White, other races, and AIAN and
### API together. The ANOVA was statistically significant (F = 11.67, df = 1, 1554, p = .000652).
### This suggests that at least one of the 4 groups had an income level that differed significantly
### from the others. Cohen's f (0.09) shows a relatively weak effect size.
### A problem: I do not understand why there was only 1 degree of freedom when
### there were 4 groups. There should have been 3 degrees of freedom. I ran the same test in NCSS
### and the results were as expected: 3 df, and the Tukey HSD showed statistically significant
### differences among the groups. The Tukey HSD did not run in R because R said "Error in
### TukeyHSD.aov(race_anova, conf.level = 0.95) : no factors in the fitted model." I do not
### understand what went wrong.
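
One likely explanation, assuming `I6` was read from the .csv as a numeric code: both the single degree of freedom and the "no factors in the fitted model" error are what `aov()` and `TukeyHSD()` produce when the grouping variable is numeric rather than a factor. The sketch below shows the difference on simulated data. The `sim` data frame is invented for illustration and is not the msd_labs data.

```r
# Simulated data: four groups coded 1-4 (illustrative only).
set.seed(1)
sim <- data.frame(
  I6        = rep(1:4, each = 25),
  HOUSEHOLD = rnorm(100, mean = 50, sd = 10)
)

# Numeric code: aov() fits a single slope, so the term has 1 df.
fit_numeric <- aov(HOUSEHOLD ~ I6, data = sim)
summary(fit_numeric)[[1]]$Df  # 1 and 98

# Factor: aov() fits one mean per group, so the term has 4 - 1 = 3 df.
sim$I6 <- factor(sim$I6)
fit_factor <- aov(HOUSEHOLD ~ I6, data = sim)
summary(fit_factor)[[1]]$Df   # 3 and 96

# TukeyHSD() now has a factor to compare and runs without error.
TukeyHSD(fit_factor, conf.level = 0.95)
```

If the same cause applies here, running `msdlabs_recoded$I6 <- factor(msdlabs_recoded$I6)` before the `aov()` call should give 3 df and let `TukeyHSD()` run.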

Jerome

2023-07-14