We wish to compare the speed of a new processor to an older processor for which the mean speed is known.
o For this scenario we would use a One-sample T-test, because we are comparing the mean of a sample of speeds from the new processor against the known mean speed of the old processor.
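A minimal sketch of a one-sample t-test in R. The speeds and the known mean (`old_mean`) below are simulated, hypothetical values used only to show the call.

```r
# Hypothetical data: 30 simulated benchmark speeds for the new processor.
set.seed(1)
new_speed <- rnorm(30, mean = 3.2, sd = 0.3)

# Assumed known mean speed of the old processor (illustrative value).
old_mean <- 3.0

# H0: the mean speed of the new processor equals old_mean.
one_sample <- t.test(new_speed, mu = old_mean)
print(one_sample)
```

The degrees of freedom are n - 1, here 29, because only one sample is involved.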
We would like to test whether age groups (child, teen, young adult, adult, senior) differ in their enjoyment (a continuous variable) of a commercial.
o For this case, it's appropriate to use One-way ANOVA, since it compares the mean of a continuous dependent variable (enjoyment) across the levels of one categorical predictor (age group, with five levels).
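A one-way ANOVA for a five-level grouping factor might be sketched as follows. The enjoyment scores are simulated, and the group size of 10 is an assumption made for illustration.

```r
# Hypothetical data: 10 simulated enjoyment scores per age group.
set.seed(2)
groups <- c("child", "teen", "young adult", "adult", "senior")
age_group <- factor(rep(groups, each = 10), levels = groups)
enjoyment <- rnorm(50, mean = 50, sd = 10)

# One categorical predictor, one continuous response.
age_fit <- aov(enjoyment ~ age_group)
age_table <- summary(age_fit)[[1]]
print(age_table)
```

With five groups and 50 observations, the table has 4 degrees of freedom for the factor and 45 for the residuals.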
We are interested in whether men and women differ in their rates of driving foreign or domestic vehicles.
o In this instance, we should use a Chi-Square Test of Independence, because we want to know whether two categorical variables (gender and vehicle origin) are associated.
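A test of independence on a 2 x 2 table of counts could look like the sketch below. The counts are hypothetical and chosen only to make the example run.

```r
# Hypothetical counts of drivers by gender and vehicle origin.
vehicle_counts <- matrix(
  c(40, 60,
    55, 45),
  nrow = 2, byrow = TRUE,
  dimnames = list(Gender = c("Men", "Women"),
                  Vehicle = c("Foreign", "Domestic"))
)

# H0: gender and vehicle origin are independent.
# For 2 x 2 tables chisq.test() applies Yates' continuity correction by default.
indep_test <- chisq.test(vehicle_counts)
print(indep_test)
```

The degrees of freedom are (rows - 1) x (columns - 1), which is 1 for a 2 x 2 table.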
We have run an experiment looking at the effects of sugar and salt content (both taken at three levels) on reported appetitiveness (i.e., enjoyment of taste: a continuous variable).
o In this case a Two-way (3 x 3 factorial) ANOVA is appropriate: it assesses the main effects of the two categorical independent variables (Sugar & Salt Content) and their interaction on the continuous dependent variable (appetitiveness).
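For two categorical factors at three levels each and a continuous response, a factorial ANOVA with an interaction term can be sketched as below. The ratings are simulated and the five replicates per cell are an assumption.

```r
# Hypothetical 3 x 3 design with 5 simulated ratings per cell.
set.seed(3)
lv <- c("low", "medium", "high")
design <- expand.grid(
  sugar = factor(lv, levels = lv),
  salt = factor(lv, levels = lv),
  replicate = 1:5
)
design$taste <- rnorm(nrow(design), mean = 50, sd = 10)

# sugar * salt expands to both main effects plus their interaction.
taste_fit <- aov(taste ~ sugar * salt, data = design)
taste_table <- summary(taste_fit)[[1]]
print(taste_table)
```

Each main effect uses 2 degrees of freedom, the interaction uses 4, and the remaining 36 go to the residuals.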
Focus groups were randomly assigned to one of two commercials and their likelihood of buying the product advertised was taken on a scale from 0 to 100. We wish to know if the commercials differed in their effectiveness.
o We could use an independent-samples T-test (equivalently, a One-way ANOVA with two groups) to examine the relationship between a categorical predictor with two levels (the commercial shown) and a continuous dependent variable (likelihood of buying the product on a scale of 0 to 100).
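With only two groups, a one-way ANOVA and an independent-samples t-test with pooled variance lead to the same conclusion, because the F statistic equals the squared t statistic. A minimal sketch with simulated, hypothetical scores:

```r
# Hypothetical data: 15 simulated purchase-likelihood scores per commercial.
set.seed(5)
commercial <- factor(rep(c("A", "B"), each = 15))
likelihood <- round(runif(30, min = 0, max = 100))

# Pooled-variance t-test (var.equal = TRUE matches the ANOVA assumption).
t_res <- t.test(likelihood ~ commercial, var.equal = TRUE)

# One-way ANOVA on the same two groups.
f_val <- summary(aov(likelihood ~ commercial))[[1]]$`F value`[1]

# The two statistics are related by F = t^2.
print(all.equal(unname(t_res$statistic)^2, f_val))
```

Note that `t.test()` defaults to the Welch test (`var.equal = FALSE`), which does not assume equal variances and will generally differ slightly from the ANOVA result.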
We have been observing the outcomes of dice thrown on the craps table at our local casino and we have a feeling the dice are loaded (set to land on certain numbers more frequently). We want to test if they are fair dice (equally likely to land on all sides).
o The Chi-Square Goodness of Fit test compares the observed frequencies of the outcomes (at the craps table) with the frequencies we would expect from fair dice (every side equally likely).
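A goodness-of-fit test against a fair die can be sketched as follows. The counts are hypothetical, and the example assumes the faces of a single die were recorded rather than the sum of two dice.

```r
# Hypothetical counts for faces 1 to 6 over 120 observed rolls.
observed <- c(18, 22, 16, 25, 19, 20)

# H0: every face has probability 1/6, so each expected count is 120 / 6 = 20.
fair_test <- chisq.test(observed, p = rep(1 / 6, 6))
print(fair_test)
```

For these counts the statistic is the sum of (observed - 20)^2 / 20, which is 50 / 20 = 2.5 on 5 degrees of freedom.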
We have a 20-factor experiment each with three levels (a 3^20 design). We can only perform one replicate, but we want to know what effects, including interactions, seem to have an impact.
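o A Factorial Screening should be used because, with only one replicate, there are no degrees of freedom left over to estimate error, so we can only screen for the effects that appear to be largest.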
Patients' ratings of pain (on a scale of 1 to 100) were taken before and after a drug treatment was given and we want to know if the drug reduced pain significantly.
o A Paired T-Test should be used because the same patients are measured before and after the treatment, so the two sets of ratings are dependent.
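A paired t-test on before/after ratings might be sketched as below. The ratings are simulated, and the one-sided alternative is an assumption that matches the question "did the drug reduce pain?".

```r
# Hypothetical data: simulated pain ratings for 20 patients.
set.seed(4)
before <- round(runif(20, min = 40, max = 90))
after <- before - round(rnorm(20, mean = 5, sd = 8))

# paired = TRUE tests the mean of (before - after);
# alternative = "greater" asks whether pain was higher before treatment.
paired_res <- t.test(before, after, paired = TRUE, alternative = "greater")
print(paired_res)

# Equivalent one-sample test on the within-patient differences.
diff_res <- t.test(before - after, mu = 0, alternative = "greater")
```

The degrees of freedom are the number of patients minus one, since the test operates on one difference per patient.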
Dish_type= c("Meat","Vegetarian","Vegan","Meat","Vegetarian","Vegan")
Sales= c(155, 120, 5, 200, 300, 100)
Conference = c("Business","Business", "Business","Entertainment", "Entertainment", "Entertainment")
Food_Cart=data.frame(Dish_type,Sales,Conference, stringsAsFactors = TRUE)
str(Food_Cart)
## 'data.frame': 6 obs. of 3 variables:
## $ Dish_type : Factor w/ 3 levels "Meat","Vegan",..: 1 3 2 1 3 2
## $ Sales : num 155 120 5 200 300 100
## $ Conference: Factor w/ 2 levels "Business","Entertainment": 1 1 1 2 2 2
boxplot(Food_Cart$Sales~Food_Cart$Conference,main="Sales Per Conference", ylab="Sales")
boxplot(Food_Cart$Sales~Food_Cart$Dish_type, main ="Sales Per Dish", ylab="Sales")
Box plot Conclusions: Based on conference type alone, Entertainment events generate more sales than Business events. Based on dish type alone, Vegetarian dishes generate the most sales.
pie(Food_Cart$Sales[1:3]/sum(Food_Cart$Sales[1:3]), labels= c("Meat","Vegetarian","Vegan"), main = "Dish Sales in Business Conference" )
pie(Food_Cart$Sales[4:6]/sum(Food_Cart$Sales[4:6]), labels= c("Meat","Vegetarian","Vegan"), main = "Dish Sales in Entertainment Conference" )
Pie Chart Conclusions: Proportionally, no single dish generates the most sales at both events: Meat sells more at the Business conference and Vegetarian sells more at the Entertainment conference.
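The shares behind the pie charts can also be computed directly. This sketch rebuilds the same Food_Cart data and uses `xtabs()` and `prop.table()`; the object names `sales_table` and `shares` are introduced here for illustration.

```r
# Same data as above, rebuilt so the block runs on its own.
Food_Cart <- data.frame(
  Dish_type = c("Meat", "Vegetarian", "Vegan", "Meat", "Vegetarian", "Vegan"),
  Sales = c(155, 120, 5, 200, 300, 100),
  Conference = rep(c("Business", "Entertainment"), each = 3)
)

# Cross-tabulate sales, then convert each row (conference) to proportions.
sales_table <- xtabs(Sales ~ Conference + Dish_type, data = Food_Cart)
shares <- prop.table(sales_table, margin = 1)
print(round(shares, 2))
```

Each row sums to 1, so the dish with the largest share can be read off per conference.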
library(readxl)
Bulbs= read_excel("C:/Users/jcolu/OneDrive/Documents/Harrisburg/Summer 2018/ANLY 510/Exam1Q2.xlsx")
str(Bulbs)
## Classes 'tbl_df', 'tbl' and 'data.frame': 900 obs. of 4 variables:
## $ BlueLevel : chr "None" "Low" "Moderate" "None" ...
## $ Lum : chr "Low" "Low" "Low" "Medium" ...
## $ Impression: num 20 33 -9 36 55 -2 35 97 13 15 ...
## $ FocusGroup: num 1 1 1 1 1 1 1 1 1 2 ...
Factorize Data Set
Bulbs$BlueLevel=factor(Bulbs$BlueLevel,levels = c("None","Low","Moderate"))
Bulbs$Lum=factor(Bulbs$Lum,levels=c("Low", "Medium", "High"))
Bulbs$FocusGroup=as.character(Bulbs$FocusGroup)
str(Bulbs)
## Classes 'tbl_df', 'tbl' and 'data.frame': 900 obs. of 4 variables:
## $ BlueLevel : Factor w/ 3 levels "None","Low","Moderate": 1 2 3 1 2 3 1 2 3 1 ...
## $ Lum : Factor w/ 3 levels "Low","Medium",..: 1 1 1 2 2 2 3 3 3 1 ...
## $ Impression: num 20 33 -9 36 55 -2 35 97 13 15 ...
## $ FocusGroup: chr "1" "1" "1" "1" ...
Model=aov(Bulbs$Impression~Bulbs$FocusGroup*Bulbs$BlueLevel+Bulbs$Lum)
summary(Model)
## Df Sum Sq Mean Sq F value Pr(>F)
## Bulbs$FocusGroup 99 278 3 0.029 1
## Bulbs$BlueLevel 2 580572 290286 3019.267 <2e-16 ***
## Bulbs$Lum 2 155072 77536 806.453 <2e-16 ***
## Bulbs$FocusGroup:Bulbs$BlueLevel 198 1906 10 0.100 1
## Residuals 598 57494 96
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(lsmeans)
## Warning: package 'lsmeans' was built under R version 3.4.4
## The 'lsmeans' package is being deprecated.
## Users are encouraged to switch to 'emmeans'.
## See help('transition') for more information, including how
## to convert 'lsmeans' objects and scripts to work with 'emmeans'.
lsmip(object = Model, formula = Bulbs$BlueLevel~Bulbs$Lum, main= "Blue Bulbs & Luminosity", xlab="Luminosity Level", ylab= "Predicted Impression")
Conclusions: Based on the interaction plot of our model, we can conclude that Low levels of blue light generally yield higher impression ratings across the levels of luminosity.