The data file Weeklylab9data.xlsx contains worker performance scores (a percentage based on several factors such as output, quality of output, following safety procedures, teamwork, etc.) in a production facility. Performance was measured under two conditions for each worker (before and after lunch), and each worker had one of two kinds of production task (weld or bolt). Analyze the data using a repeated measures ANOVA and then a mixed effects regression model to see if there is any difference in performance before or after lunch, or between the two kinds of task.
library(readxl)
WeeklyLab9=read_excel("C:/Users/jcolu/OneDrive/Documents/Harrisburg/Summer 2018/ANLY 510/WeeklyLab9.xlsx")
WeeklyLab9
## # A tibble: 100 x 4
## Participant BeforeAfter Task Performance
## <dbl> <dbl> <dbl> <dbl>
## 1 1.00 0 0 0.600
## 2 2.00 0 0 0.500
## 3 3.00 0 0 0.700
## 4 4.00 0 0 0.600
## 5 5.00 0 0 0.700
## 6 6.00 0 0 0.700
## 7 7.00 0 0 0.300
## 8 8.00 0 0 0.500
## 9 9.00 0 0 0.300
## 10 10.0 0 0 0.400
## # ... with 90 more rows
Check whether any variables need to be converted to factors.
str(WeeklyLab9)
## Classes 'tbl_df', 'tbl' and 'data.frame': 100 obs. of 4 variables:
## $ Participant: num 1 2 3 4 5 6 7 8 9 10 ...
## $ BeforeAfter: num 0 0 0 0 0 0 0 0 0 0 ...
## $ Task : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Performance: num 0.6 0.5 0.7 0.6 0.7 0.7 0.3 0.5 0.3 0.4 ...
Participant, BeforeAfter, and Task are all stored as numeric, so they need to be converted to factors.
WeeklyLab9$BeforeAfter=factor(WeeklyLab9$BeforeAfter)
WeeklyLab9$Participant=factor(WeeklyLab9$Participant)
WeeklyLab9$Task=factor(WeeklyLab9$Task)
str(WeeklyLab9)
## Classes 'tbl_df', 'tbl' and 'data.frame': 100 obs. of 4 variables:
## $ Participant: Factor w/ 50 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ BeforeAfter: Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ Task : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ Performance: num 0.6 0.5 0.7 0.6 0.7 0.7 0.3 0.5 0.3 0.4 ...
Analyze the dependent variable - Performance.
summary(WeeklyLab9$Performance)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3000 0.4000 0.5100 0.5186 0.6225 0.7300
Use a density plot to see how Performance is distributed and whether it is skewed.
plot(density(WeeklyLab9$Performance))
The density plot looks slightly asymmetric. Use the D’Agostino test to check whether the dependent variable needs to be transformed.
library(moments)
agostino.test(WeeklyLab9$Performance)
##
## D'Agostino skewness test
##
## data: WeeklyLab9$Performance
## skew = -0.065896, z = -0.286090, p-value = 0.7748
## alternative hypothesis: data have a skewness
The D’Agostino test does not detect significant skewness (skew = -0.066, p = 0.7748), so the dependent variable does not need to be transformed.
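To make the skew statistic easier to read, here is a minimal sketch of how a moment-based skewness is computed, assuming the usual definition (third central moment divided by the second central moment to the power 3/2). It uses only the ten Performance values visible in the tibble preview, so its result will not match the value reported for the full data set.

```r
# First ten Performance values shown in the tibble preview (not the full data set).
x <- c(0.6, 0.5, 0.7, 0.6, 0.7, 0.7, 0.3, 0.5, 0.3, 0.4)

# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 3/2.
m2 <- mean((x - mean(x))^2)
m3 <- mean((x - mean(x))^3)
skew <- m3 / m2^(3 / 2)
skew
```

A value near zero indicates a roughly symmetric distribution, a negative value a longer left tail, and a positive value a longer right tail.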
Fit the ANOVA model.
Model=aov(WeeklyLab9$Performance~WeeklyLab9$BeforeAfter*WeeklyLab9$Task)
Model
## Call:
## aov(formula = WeeklyLab9$Performance ~ WeeklyLab9$BeforeAfter *
## WeeklyLab9$Task)
##
## Terms:
## WeeklyLab9$BeforeAfter WeeklyLab9$Task
## Sum of Squares 0.000196 0.005184
## Deg. of Freedom 1 1
## WeeklyLab9$BeforeAfter:WeeklyLab9$Task Residuals
## Sum of Squares 0.002304 2.043520
## Deg. of Freedom 1 96
##
## Residual standard error: 0.1458995
## Estimated effects may be unbalanced
summary(Model)
## Df Sum Sq Mean Sq F value Pr(>F)
## WeeklyLab9$BeforeAfter 1 0.0002 0.000196 0.009 0.924
## WeeklyLab9$Task 1 0.0052 0.005184 0.244 0.623
## WeeklyLab9$BeforeAfter:WeeklyLab9$Task 1 0.0023 0.002304 0.108 0.743
## Residuals 96 2.0435 0.021287
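From the ANOVA model, none of the effects is statistically significant: BeforeAfter (p = 0.924), Task (p = 0.623), and their interaction (p = 0.743). Note that this model has no Error term for Participant, so it treats the two measurements on each worker as independent.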
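A repeated measures specification would normally tell aov which factor varies within workers. The sketch below shows one way to do that with an Error term. It runs on simulated stand-in data (the `sim` data frame, `worker_offset`, and `RM_Model` are illustrative names, not part of the lab), and it assumes BeforeAfter is the within-worker factor and Task the between-worker factor, as the task description suggests.

```r
# Simulated stand-in data with the same columns as WeeklyLab9 (NOT the lab's data).
set.seed(1)
n <- 50
sim <- data.frame(
  Participant = factor(rep(1:n, times = 2)),
  BeforeAfter = factor(rep(c(0, 1), each = n)),
  Task        = factor(rep(rep(c(0, 1), each = n / 2), times = 2))
)

# Each worker gets a personal offset shared by both of their measurements.
worker_offset <- rnorm(n, sd = 0.05)
sim$Performance <- 0.5 + worker_offset[as.integer(sim$Participant)] +
  rnorm(2 * n, sd = 0.1)

# Error(Participant / BeforeAfter) puts Task in the between-worker stratum
# and BeforeAfter (and its interaction with Task) in the within-worker stratum.
RM_Model <- aov(Performance ~ BeforeAfter * Task + Error(Participant / BeforeAfter),
                data = sim)
summary(RM_Model)
```

With the real data, the same formula could be applied by passing `data = WeeklyLab9` once the three grouping variables have been converted to factors.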
Plot Residuals
# proj() returns one column per model term; keep only the residuals column
Model_Res=proj(Model)[, "Residuals"]
qqnorm(Model_Res)
qqline(Model_Res)
The Q-Q plot of the residuals shows that the ANOVA model is not a good fit.
Mixed Effects Regression Model
library(lmerTest)
## Warning: package 'lmerTest' was built under R version 3.4.4
## Loading required package: lme4
## Warning: package 'lme4' was built under R version 3.4.4
## Loading required package: Matrix
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
LinearModel=lmer(WeeklyLab9$Performance~WeeklyLab9$BeforeAfter*WeeklyLab9$Task+(1|WeeklyLab9$Participant))
summary(LinearModel)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula:
## WeeklyLab9$Performance ~ WeeklyLab9$BeforeAfter * WeeklyLab9$Task +
## (1 | WeeklyLab9$Participant)
##
## REML criterion at convergence: -84.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.6366 -0.7499 -0.0622 0.7278 1.4405
##
## Random effects:
## Groups Name Variance Std.Dev.
## WeeklyLab9$Participant (Intercept) 0.001728 0.04157
## Residual 0.019558 0.13985
## Number of obs: 100, groups: WeeklyLab9$Participant, 50
##
## Fixed effects:
## Estimate Std. Error df
## (Intercept) 0.53200 0.02918 95.37128
## WeeklyLab9$BeforeAfter1 -0.01240 0.03956 48.00000
## WeeklyLab9$Task1 -0.02400 0.04127 95.37128
## WeeklyLab9$BeforeAfter1:WeeklyLab9$Task1 0.01920 0.05594 48.00000
## t value Pr(>|t|)
## (Intercept) 18.232 <2e-16 ***
## WeeklyLab9$BeforeAfter1 -0.313 0.755
## WeeklyLab9$Task1 -0.582 0.562
## WeeklyLab9$BeforeAfter1:WeeklyLab9$Task1 0.343 0.733
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) WkL9$BA1 WL9$T1
## WklyLb9$BA1 -0.678
## WklyLb9$Ts1 -0.707 0.479
## WL9$BA1:WL9 0.479 -0.707 -0.678
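The Random effects table can be summarized as an intraclass correlation, the share of total variance attributable to differences between workers. A minimal sketch, using the two variance components copied from the output above (the variable names are illustrative):

```r
# Variance components copied from the "Random effects" table above.
var_participant <- 0.001728
var_residual    <- 0.019558

# Intraclass correlation: between-worker variance over total variance.
icc <- var_participant / (var_participant + var_residual)
round(icc, 3)
```

This comes to roughly 0.08, meaning only about 8% of the variance in Performance is due to stable differences between workers.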
None of the fixed effects is statistically significant (all p > 0.5). Plot the residuals.
qqnorm(resid(LinearModel))
qqline(resid(LinearModel))
The Q-Q plots indicate that neither the ANOVA model nor the mixed effects model fits the data well.