Exam 1 - ANLY 545

Indicate the simplest and most appropriate test for each of the following situations (2.5% of exam). Please also include your reasoning for choosing the test for partial/full credit:

We wish to compare the speed of a new processor to an older processor for which the mean speed is known.

o For this scenario we will use a one-sample t-test because we are comparing the sample mean speed of the new processor against a single known value (the mean speed of the old processor).

We would like to test whether age groups (child, teen, young adult, adult, senior) differ in their enjoyment (a continuous variable) of a commercial.

o For this case, a one-way ANOVA is appropriate since it compares the mean of a continuous dependent variable (enjoyment) across more than two levels of a single categorical variable (age group).

We are interested in whether men and women differ in their rates of driving foreign or domestic vehicles.

o In this instance, we should use a Chi-Square Test of Independence because we want to test whether two categorical variables (gender and vehicle origin) are associated.

We have run an experiment looking at the effects of sugar and salt content (both taken at three levels) on reported appetitiveness (i.e., enjoyment of taste: a continuous variable).

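o In this case, a two-way (3 x 3 factorial) ANOVA is appropriate because there are two categorical independent variables (sugar and salt content, each at three levels) and one continuous dependent variable (appetitiveness). It tests both main effects and their interaction. A paired t-test would not fit, since it only compares two related sets of measurements.

Focus groups were randomly assigned to one of two commercials and their likelihood of buying the product advertised was taken on a scale from 0 to 100. We wish to know if the commercials differed in their effectiveness.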

o The simplest choice is an independent two-sample t-test, because there is one categorical predictor with two levels (the commercial shown), a continuous dependent variable (likelihood of buying the product on the scale of 0 to 100), and each focus group saw only one commercial. A one-way ANOVA with two levels would also work, but it is not the simplest test here.

We have been observing the outcomes of dice thrown on the craps table at our local casino and we have a feeling the dice are loaded (set to land on certain numbers more frequently). We want to test if they are fair dice (equally likely to land on all sides).

o The Chi-Square Goodness of Fit test compares the observed frequency of each outcome (recorded at the craps table) with the frequency expected from fair dice (an equal probability for every side).

We have a 20-factor experiment each with three levels (a 3^20 design). We can only perform one replicate, but we want to know what effects, including interactions, seem to have an impact.

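o A Factorial Screening should be used because there is only one replicate of the experiment, so the full model leaves no residual degrees of freedom for estimating error. Screening identifies the effects that appear large enough to matter.

Patients' ratings of pain (on a scale of 1 to 100) were taken before and after a drug treatment was given and we want to know if the drug reduced pain significantly.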

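o A paired t-test should be used because each patient is measured twice (before and after the drug treatment), so the two sets of pain ratings are dependent and the test is run on the within-patient differences.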
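As an illustration of how three of the scenarios above could be run in base R, here is a minimal sketch. All of the data are hypothetical (simulated or made up for the example) and do not come from the exam; the object names `known_mean`, `new_speed`, `rolls`, `before`, and `after` are assumptions introduced only for this sketch.

```r
# Hypothetical data throughout; none of these numbers come from the exam.
set.seed(545)

# Scenario 1: new processor vs. a known mean speed (one-sample t-test).
known_mean <- 3.0                                # assumed known mean of the old processor
new_speed  <- rnorm(25, mean = 3.2, sd = 0.4)    # simulated speeds of the new processor
one_sample <- t.test(new_speed, mu = known_mean)

# Scenario 6: are the dice fair? (chi-square goodness of fit).
rolls    <- c(18, 22, 19, 25, 16, 20)            # assumed counts for faces 1 to 6
fairness <- chisq.test(rolls, p = rep(1 / 6, 6)) # equal probability for each face

# Scenario 8: pain before vs. after treatment (paired t-test).
before <- round(runif(30, min = 40, max = 90))   # simulated pre-treatment ratings
after  <- before - round(rnorm(30, mean = 8, sd = 5))
# One-sided alternative: the mean of (before - after) is greater than zero.
paired <- t.test(before, after, paired = TRUE, alternative = "greater")

print(one_sample)
print(fairness)
print(paired)
```

Each call returns an `htest` object, so the test statistic, degrees of freedom, and p-value can be read from `$statistic`, `$parameter`, and `$p.value`.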

Your food cart sells a meat, a vegetarian, and a vegan dish. Your cart is located near a very busy convention center acting nearly every day as a venue for conferences related to business and entertainment. You have an intuition that there are differences in what each group of attendees prefers for lunch and you would like to see if your intuitions are correct so you can better prepare for each type of event (e.g., prep for more meat dishes during business conferences). You collect data on the numbers of each meal type being bought and what kind of event occurred that day over the past 30 days. The data is below. Perform the most suitable/appropriate analysis and summarize the results (2.5% of exam).

Dish_type= c("Meat","Vegetarian","Vegan","Meat","Vegetarian","Vegan")
Sales= c(155, 120, 5, 200, 300, 100)
Conference = c("Business","Business", "Business","Entertainment", "Entertainment", "Entertainment")
Food_Cart=data.frame(Dish_type,Sales,Conference)
str(Food_Cart)

## 'data.frame':    6 obs. of  3 variables:
##  $ Dish_type : Factor w/ 3 levels "Meat","Vegan",..: 1 3 2 1 3 2
##  $ Sales     : num  155 120 5 200 300 100
##  $ Conference: Factor w/ 2 levels "Business","Entertainment": 1 1 1 2 2 2

boxplot(Food_Cart$Sales~Food_Cart$Conference,main="Sales Per Conference", ylab="Sales")

boxplot(Food_Cart$Sales~Food_Cart$Dish_type, main ="Sales Per Dish", ylab="Sales")

Box plot conclusions: Based on conference type alone, Entertainment events generate more sales than Business events. Based on dish type alone, vegetarian dishes generate the most sales.

pie(Food_Cart$Sales[1:3]/sum(Food_Cart$Sales[1:3]), labels= c("Meat","Vegetarian","Vegan"), main = "Dish Sales in Business Conference" )

pie(Food_Cart$Sales[4:6]/sum(Food_Cart$Sales[4:6]), labels= c("Meat","Vegetarian","Vegan"), main = "Dish Sales in Entertainment Conference" )

Pie chart conclusions: Proportionally, no single dish generates the most sales at both event types, since Meat sells more at Business conferences and Vegetarian sells more at Entertainment conferences.
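The plots describe the pattern but do not test it. One way to check formally whether dish preference depends on conference type is a Chi-Square Test of Independence on the dish-by-conference counts. The sketch below rebuilds the same data and assumes each sale can be treated as an independent observation; the names `sales_table` and `independence` are introduced here for illustration.

```r
# Rebuild the food cart data so the example is self-contained.
Food_Cart <- data.frame(
  Dish_type  = c("Meat", "Vegetarian", "Vegan", "Meat", "Vegetarian", "Vegan"),
  Sales      = c(155, 120, 5, 200, 300, 100),
  Conference = c("Business", "Business", "Business",
                 "Entertainment", "Entertainment", "Entertainment")
)

# Cross-tabulate the sales counts: rows = conference type, columns = dish type.
sales_table <- xtabs(Sales ~ Conference + Dish_type, data = Food_Cart)

# Chi-square test of independence between conference type and dish type.
independence <- chisq.test(sales_table)

print(sales_table)
print(independence$expected)  # counts expected if dish choice did not depend on event
print(independence)
```

Comparing `sales_table` with `independence$expected` shows which dishes sell above or below what independence would predict for each event type.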

Your firm is involved in creating energy (efficient) saving lightbulbs using LEDs. One of the major hurdles to getting customers to buy the new lightbulbs is that many complain that the bulbs put out light that is very unfriendly to the eyes. Two common factors are generally associated with light seen as being unfriendly/uncomfortable: One is the level of luminance (i.e., how bright it is) and the other is the amount of “blue” light present (e.g., more blue light puts more strain on the eyes). Your firm has run an experiment with 3 levels of luminance (“low”, “medium”, and “high”) and 3 levels of blue (“no blue”, “low blue”, and “moderate blue”). 100 focus groups took part in the experiment and each focus group rated its impressions of the light for each of the 9 possible combinations in random order. Ratings went from -100 (Hated the Light) to +100 (Loved the Light). The data set is titled Exam1Q2.xlsx. Perform the appropriate analysis and summarize the results (5% of exam).

library(readxl)
Bulbs= read_excel("C:/Users/jcolu/OneDrive/Documents/Harrisburg/Summer 2018/ANLY 510/Exam1Q2.xlsx")
str(Bulbs)

## Classes 'tbl_df', 'tbl' and 'data.frame':    900 obs. of  4 variables:
##  $ BlueLevel : chr  "None" "Low" "Moderate" "None" ...
##  $ Lum       : chr  "Low" "Low" "Low" "Medium" ...
##  $ Impression: num  20 33 -9 36 55 -2 35 97 13 15 ...
##  $ FocusGroup: num  1 1 1 1 1 1 1 1 1 2 ...

Factorize Data Set

Bulbs$BlueLevel=factor(Bulbs$BlueLevel,levels = c("None","Low","Moderate"))
Bulbs$Lum=factor(Bulbs$Lum,levels=c("Low", "Medium", "High"))
Bulbs$FocusGroup=as.character(Bulbs$FocusGroup)
str(Bulbs)

## Classes 'tbl_df', 'tbl' and 'data.frame':    900 obs. of  4 variables:
##  $ BlueLevel : Factor w/ 3 levels "None","Low","Moderate": 1 2 3 1 2 3 1 2 3 1 ...
##  $ Lum       : Factor w/ 3 levels "Low","Medium",..: 1 1 1 2 2 2 3 3 3 1 ...
##  $ Impression: num  20 33 -9 36 55 -2 35 97 13 15 ...
##  $ FocusGroup: chr  "1" "1" "1" "1" ...

Model=aov(Bulbs$Impression~Bulbs$FocusGroup*Bulbs$BlueLevel+Bulbs$Lum)
summary(Model)

##                                   Df Sum Sq Mean Sq  F value Pr(>F)    
## Bulbs$FocusGroup                  99    278       3    0.029      1    
## Bulbs$BlueLevel                    2 580572  290286 3019.267 <2e-16 ***
## Bulbs$Lum                          2 155072   77536  806.453 <2e-16 ***
## Bulbs$FocusGroup:Bulbs$BlueLevel 198   1906      10    0.100      1    
## Residuals                        598  57494      96                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

library(lsmeans)

## Warning: package 'lsmeans' was built under R version 3.4.4

## The 'lsmeans' package is being deprecated.
## Users are encouraged to switch to 'emmeans'.
## See help('transition') for more information, including how
## to convert 'lsmeans' objects and scripts to work with 'emmeans'.

lsmip(object = Model, formula = Bulbs$BlueLevel~Bulbs$Lum, main= "Blue Level & Luminance", xlab="Luminance Level", ylab= "Predicted Impression")

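Conclusions: The ANOVA table shows significant main effects of blue level and luminance on impression ratings (both p < 2e-16), while focus group and its interaction with blue level are not significant. Based on the visualization of our model, we can conclude that the Low level of blue light generally yields higher impression ratings.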
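Because every focus group rated all nine combinations, another plausible specification is a within-subjects (repeated-measures) factorial ANOVA that includes the BlueLevel by Lum interaction. The sketch below is not the model fitted above. It uses simulated data with the same column names, since Exam1Q2.xlsx is not reproduced here, and the effect sizes in `blue_effect` and `lum_effect` are invented purely for illustration.

```r
# Simulated stand-in for Exam1Q2.xlsx: 100 focus groups x 3 blue levels x 3 luminance levels.
set.seed(2018)
sim <- expand.grid(
  BlueLevel  = factor(c("None", "Low", "Moderate"),
                      levels = c("None", "Low", "Moderate")),
  Lum        = factor(c("Low", "Medium", "High"),
                      levels = c("Low", "Medium", "High")),
  FocusGroup = factor(1:100)
)

# Invented effect sizes (assumptions, not estimates from the real data).
blue_effect <- c(None = 20, Low = 40, Moderate = -5)
lum_effect  <- c(Low = 0, Medium = 10, High = 20)
sim$Impression <- unname(blue_effect[as.character(sim$BlueLevel)] +
                         lum_effect[as.character(sim$Lum)]) +
  rnorm(nrow(sim), sd = 10)

# Repeated-measures ANOVA: both factors vary within each focus group,
# so the Error() term nests them inside FocusGroup.
rm_model <- aov(Impression ~ BlueLevel * Lum +
                  Error(FocusGroup / (BlueLevel * Lum)),
                data = sim)
print(summary(rm_model))

# Cell means (rows = blue level, columns = luminance) to read alongside the tests.
cell_means <- with(sim, tapply(Impression, list(BlueLevel, Lum), mean))
print(round(cell_means, 1))
```

With the real data, replacing `sim` by `Bulbs` (after converting `FocusGroup` to a factor) would apply the same specification. The BlueLevel:Lum row of the summary then indicates whether the effect of blue light depends on luminance.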

Juan Colunga

June 28, 2018