The data file Weeklylab8data.xlsx contains participants' attractiveness ratings of three versions of a commercial product. Each participant was assigned to a focus group (4-6 participants per group), and everyone in a focus group rated the same version of the product on a scale from 0 to 100 (larger numbers indicate more support for the product). Focus groups were constructed in several different cities across the NE. Standard demographic questions were included so that their potential effects could be partialled out. Analyze the data, being sure to take into account the nesting structure, and summarize the results.
library(readxl)
WeeklyLab8=read_excel("C:/Users/jcolu/OneDrive/Documents/Harrisburg/Summer 2018/ANLY 510/WeeklyLab8.xlsx")
WeeklyLab8
## # A tibble: 154 x 8
## Participant FocusGroup City Age Gender Income Version Rating
## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1.00 1.00 PITT 20.0 0 56594 1.00 20.7
## 2 2.00 1.00 PITT 57.0 0 114612 1.00 31.8
## 3 3.00 1.00 PITT 63.0 0 63011 1.00 34.1
## 4 4.00 1.00 PITT 42.0 0 23596 1.00 31.6
## 5 5.00 1.00 PITT 51.0 1.00 85726 1.00 14.7
## 6 6.00 2.00 PITT 48.0 1.00 103276 2.00 43.7
## 7 7.00 2.00 PITT 58.0 0 77469 2.00 59.8
## 8 8.00 2.00 PITT 63.0 0 24615 2.00 54.7
## 9 9.00 2.00 PITT 19.0 1.00 86913 2.00 38.4
## 10 10.0 3.00 PITT 29.0 0 63328 3.00 60.9
## # ... with 144 more rows
Initial Data Analysis - using str function
str(WeeklyLab8)
## Classes 'tbl_df', 'tbl' and 'data.frame': 154 obs. of 8 variables:
## $ Participant: num 1 2 3 4 5 6 7 8 9 10 ...
## $ FocusGroup : num 1 1 1 1 1 2 2 2 2 3 ...
## $ City : chr "PITT" "PITT" "PITT" "PITT" ...
## $ Age : num 20 57 63 42 51 48 58 63 19 29 ...
## $ Gender : num 0 0 0 0 1 1 0 0 1 0 ...
## $ Income : num 56594 114612 63011 23596 85726 ...
## $ Version : num 1 1 1 1 1 2 2 2 2 3 ...
## $ Rating : num 20.7 31.8 34.1 31.6 14.7 ...
The grouping variables need to be converted to factors
WeeklyLab8$FocusGroup <- factor(WeeklyLab8$FocusGroup)
WeeklyLab8$Gender <- factor(WeeklyLab8$Gender)
WeeklyLab8$Version <- factor(WeeklyLab8$Version)
WeeklyLab8$Participant <- factor(WeeklyLab8$Participant)
Review the factored data set
str(WeeklyLab8)
## Classes 'tbl_df', 'tbl' and 'data.frame': 154 obs. of 8 variables:
## $ Participant: Factor w/ 154 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ FocusGroup : Factor w/ 32 levels "1","2","3","4",..: 1 1 1 1 1 2 2 2 2 3 ...
## $ City : chr "PITT" "PITT" "PITT" "PITT" ...
## $ Age : num 20 57 63 42 51 48 58 63 19 29 ...
## $ Gender : Factor w/ 2 levels "0","1": 1 1 1 1 2 2 1 1 2 1 ...
## $ Income : num 56594 114612 63011 23596 85726 ...
## $ Version : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 3 ...
## $ Rating : num 20.7 31.8 34.1 31.6 14.7 ...
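Before modelling, it may be worth confirming that the factors describe the design stated in the task: three versions, one version per focus group, 4-6 participants per group, and ratings between 0 and 100. The sketch below is only an illustration of such a check. The `demo` data frame is a small hypothetical stand-in (it is not the lab data), and the name `check` is introduced here for convenience.

```r
# Hypothetical stand-in for WeeklyLab8: 6 focus groups, 3 versions,
# 4-6 members per group. These values are invented for illustration.
set.seed(1)
sizes <- c(5, 4, 6, 5, 4, 6)
demo <- data.frame(
  FocusGroup = factor(rep(seq_along(sizes), times = sizes)),
  Version    = factor(rep(c(1, 2, 3, 1, 2, 3), times = sizes)),
  Rating     = round(runif(sum(sizes), min = 5, max = 95), 1)
)

# Cross-tabulate groups against versions: in a nested design each row
# (focus group) has participants in exactly one column (version).
group_by_version   <- table(demo$FocusGroup, demo$Version)
versions_per_group <- rowSums(group_by_version > 0)

# Number of participants in each focus group.
group_sizes <- rowSums(group_by_version)

check <- list(
  n_versions       = nlevels(demo$Version),
  nested           = all(versions_per_group == 1),
  sizes_in_range   = all(group_sizes >= 4 & group_sizes <= 6),
  ratings_in_range = all(demo$Rating >= 0 & demo$Rating <= 100)
)
print(check)
```

Run against the real data (replacing `demo` with `WeeklyLab8`), a check like this would flag a Version factor with the wrong number of levels or a focus group that saw more than one version.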
plot(density(WeeklyLab8$Rating))
The density plot suggests that Rating is right-skewed, so the skewness is tested formally.
library(moments)
agostino.test(WeeklyLab8$Rating)
##
## D'Agostino skewness test
##
## data: WeeklyLab8$Rating
## skew = 0.41168, z = 2.10400, p-value = 0.03538
## alternative hypothesis: data have a skewness
Since the skewness is significant (p = 0.035), it needs to be addressed. A log transformation is tried first.
agostino.test(log(WeeklyLab8$Rating))
##
## D'Agostino skewness test
##
## data: log(WeeklyLab8$Rating)
## skew = -1.3295, z = -5.5791, p-value = 2.417e-08
## alternative hypothesis: data have a skewness
The plain log overcorrects (skew = -1.33), so a constant is added to Rating before taking the log, and the constant is increased until the skew approaches zero.
agostino.test(log(WeeklyLab8$Rating+20))
##
## D'Agostino skewness test
##
## data: log(WeeklyLab8$Rating + 20)
## skew = -0.38009, z = -1.95170, p-value = 0.05097
## alternative hypothesis: data have a skewness
agostino.test(log(WeeklyLab8$Rating+40))
##
## D'Agostino skewness test
##
## data: log(WeeklyLab8$Rating + 40)
## skew = -0.15701, z = -0.82595, p-value = 0.4088
## alternative hypothesis: data have a skewness
agostino.test(log(WeeklyLab8$Rating+60))
##
## D'Agostino skewness test
##
## data: log(WeeklyLab8$Rating + 60)
## skew = -0.038325, z = -0.202620, p-value = 0.8394
## alternative hypothesis: data have a skewness
agostino.test(log(WeeklyLab8$Rating+80))
##
## D'Agostino skewness test
##
## data: log(WeeklyLab8$Rating + 80)
## skew = 0.03755, z = 0.19852, p-value = 0.8426
## alternative hypothesis: data have a skewness
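With a shift of 80 the skew changes sign (0.038), so the shift is now too large. A value between 60 and 80 is tried next.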
agostino.test(log(WeeklyLab8$Rating+70))
##
## D'Agostino skewness test
##
## data: log(WeeklyLab8$Rating + 70)
## skew = 0.0032933, z = 0.0174170, p-value = 0.9861
## alternative hypothesis: data have a skewness
plot(density(log(WeeklyLab8$Rating+70)))
The density plot shows that log(Rating + 70) is much closer to symmetric, consistent with the test result (skew = 0.003, p = 0.986).
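The trial-and-error search over shifts above could also be written as a loop. The sketch below is a minimal illustration, not the method used in this lab. It assumes that the `skew` value printed by `agostino.test()` is the moment-based sample skewness, and it uses simulated right-skewed values in place of `WeeklyLab8$Rating`. The names `sample_skew`, `rating`, `shifts`, and `best_shift` are hypothetical.

```r
# Moment-based sample skewness (assumed to match the `skew` value
# reported by moments::agostino.test()).
sample_skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Simulated right-skewed ratings standing in for WeeklyLab8$Rating.
set.seed(42)
rating <- pmin(100, rgamma(154, shape = 4, scale = 10))

# Skewness of log(rating + shift) over a grid of candidate shifts.
shifts <- seq(0, 100, by = 5)
skews  <- sapply(shifts, function(s) sample_skew(log(rating + s)))

# Keep the shift whose skewness is closest to zero.
best_shift <- shifts[which.min(abs(skews))]
print(data.frame(shift = shifts, skew = round(skews, 4)))
print(best_shift)
```

With the real data, `agostino.test(log(WeeklyLab8$Rating + best_shift))` could then be run once on the selected shift to confirm that the skewness is no longer significant.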