# Chicken Weights by Feed Type ####

# An experiment was conducted to measure and compare the effectiveness
# of various feed supplements on the growth rate of chickens. 

data("chickwts")
summary(chickwts)
##      weight             feed   
##  Min.   :108.0   casein   :12  
##  1st Qu.:204.5   horsebean:10  
##  Median :258.0   linseed  :12  
##  Mean   :261.3   meatmeal :11  
##  3rd Qu.:323.5   soybean  :14  
##  Max.   :423.0   sunflower:12

Plot your data first (as always). Be sure to include an informative figure caption.

data("chickwts")
plot(weight ~ feed, data = chickwts, xlab = "Feed type", ylab = "Weight (g)") # formula: y (response) ~ x (grouping factor)

Figure 1. Box-and-whisker plot of chick weight (g) for each of the six feed types.
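A short numeric summary can accompany the box plot. The following is a minimal sketch, not part of the original analysis; the names `group_mean` and `group_n` are illustrative.

```r
data("chickwts")

# Mean weight within each feed group.
group_mean <- tapply(chickwts$weight, chickwts$feed, mean)

# Number of chicks in each feed group.
group_n <- table(chickwts$feed)

# Print the group means from lowest to highest, rounded to one decimal.
print(round(sort(group_mean), 1))
print(group_n)
```

Sorting the means makes it easy to see which feeds sit at the extremes before running any formal test.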

Are your data normal and have homogeneity of variance? Provide three pieces of graphical evidence that argues your case. Be sure to include informative figure captions. Explain your answer.

data("chickwts")
hist(chickwts$weight)

qqnorm(chickwts$weight)
qqline(chickwts$weight) # run these to see normality of data set

The weights appear to be approximately normally distributed. The histogram is roughly bell-shaped, and most of the data points fall on or very near the reference line in the normal Q-Q plot.
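The prompt also asks about homogeneity of variance, which the histogram and Q-Q plot do not address. One way to check it is sketched below, assuming that per-group standard deviations and Bartlett's test count as acceptable evidence here. Neither is part of the original analysis, and the names `group_sd`, `sd_ratio`, and `bt` are illustrative.

```r
data("chickwts")

# Standard deviation of weight within each feed group;
# roughly similar values suggest homogeneous variances.
group_sd <- tapply(chickwts$weight, chickwts$feed, sd)
print(round(group_sd, 1))

# Ratio of the largest to the smallest group SD
# (a common rule of thumb treats ratios below about 2 as acceptable).
sd_ratio <- max(group_sd) / min(group_sd)
print(sd_ratio)

# Bartlett's test: the null hypothesis is that all group variances are equal.
bt <- bartlett.test(weight ~ feed, data = chickwts)
print(bt)
```

A large p-value from Bartlett's test would be consistent with equal variances. The test is sensitive to non-normality, so it is best read alongside the plots rather than in place of them.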

Did you transform your data? If so, state which transformation you used. Provide three pieces of evidence that your data more closely approximates a normal distribution and homogeneity of variance. If not, state why you did not transform the data.

I did not transform the data because the histogram and Q-Q plot indicated that the weights are already approximately normal.

Create an ANOVA object with the appropriate data (raw or transformed). Present your results in the R Markdown file. Did you accept or reject the null hypothesis? Are the results statistically significant? Provide and interpret two evidence graphs that the residuals meet the assumption of the ANOVA.

chickwts.avo <- aov(weight ~ feed, data = chickwts)
plot(chickwts.avo)

summary(chickwts.avo) # use summary() to print the ANOVA table
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## feed         5 231129   46226   15.37 5.94e-10 ***
## Residuals   65 195556    3009                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

I reject the null hypothesis because the p-value of 5.94e-10 is far below the 0.05 significance level. Mean chick weight differs significantly among feed types (F = 15.37 on 5 and 65 degrees of freedom).
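As a check on the ANOVA table above, the F statistic and p-value can be recomputed from the sums of squares and degrees of freedom. This is a minimal sketch using the rounded values printed in the table, so the result matches only up to rounding; the variable names are illustrative.

```r
# Values copied from the ANOVA table above.
ss_feed  <- 231129; df_feed  <- 5
ss_resid <- 195556; df_resid <- 65

ms_feed  <- ss_feed / df_feed      # mean square for feed
ms_resid <- ss_resid / df_resid    # residual mean square
f_value  <- ms_feed / ms_resid     # F statistic

# The upper-tail probability of the F distribution gives the p-value.
p_value <- pf(f_value, df_feed, df_resid, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```

The F statistic is the ratio of between-group to within-group mean squares, so a large value means the feed groups differ by more than the within-group scatter would explain.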

In the normal Q-Q plot, the residuals fall close to the reference line. In the residuals-versus-fitted plot, the residuals are scattered around zero with no large outliers and no trend in their spread. The ANOVA assumptions therefore appear to be met.
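The two residual plots discussed above can be drawn on their own, side by side, instead of stepping through all the diagnostic plots. This is a minimal sketch; `fit` is an illustrative name for the same model as `chickwts.avo`, and the Shapiro-Wilk test is an optional extra that is not part of the original analysis.

```r
data("chickwts")
fit <- aov(weight ~ feed, data = chickwts)  # same model as chickwts.avo

# Show only the first two diagnostic plots side by side:
# 1 = residuals vs fitted, 2 = normal Q-Q of residuals.
old_par <- par(mfrow = c(1, 2))
plot(fit, which = 1:2)
par(old_par)  # restore the previous plotting layout

# Optional numeric check: Shapiro-Wilk test on the model residuals.
sw <- shapiro.test(residuals(fit))
print(sw)
```

Checking the residuals rather than the raw weights is the more direct test of the ANOVA assumptions, because differences between group means have already been removed from the residuals.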

Did you perform a multiple comparison test? Why or why not? Explain. Present your code if appropriate.

Yes, I did. The ANOVA tells us that at least one feed group differs significantly from the others, but not which groups differ from which. To find out, I ran Tukey's HSD test.

TukeyHSD(chickwts.avo) # use this to check which feed groups differ significantly from which
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = weight ~ feed, data = chickwts)
## 
## $feed
##                            diff         lwr       upr     p adj
## horsebean-casein    -163.383333 -232.346876 -94.41979 0.0000000
## linseed-casein      -104.833333 -170.587491 -39.07918 0.0002100
## meatmeal-casein      -46.674242 -113.906207  20.55772 0.3324584
## soybean-casein       -77.154762 -140.517054 -13.79247 0.0083653
## sunflower-casein       5.333333  -60.420825  71.08749 0.9998902
## linseed-horsebean     58.550000  -10.413543 127.51354 0.1413329
## meatmeal-horsebean   116.709091   46.335105 187.08308 0.0001062
## soybean-horsebean     86.228571   19.541684 152.91546 0.0042167
## sunflower-horsebean  168.716667   99.753124 237.68021 0.0000000
## meatmeal-linseed      58.159091   -9.072873 125.39106 0.1276965
## soybean-linseed       27.678571  -35.683721  91.04086 0.7932853
## sunflower-linseed    110.166667   44.412509 175.92082 0.0000884
## soybean-meatmeal     -30.480519  -95.375109  34.41407 0.7391356
## sunflower-meatmeal    52.007576  -15.224388 119.23954 0.2206962
## sunflower-soybean     82.488095   19.125803 145.85039 0.0038845

Summarize your results in a paragraph similar to the example in the “Reporting Your Results” section.

Chick weight varied significantly among feed types (one-way ANOVA, F = 15.37 on 5 and 65 degrees of freedom, p = 5.94e-10). A Tukey HSD multiple comparison test was used to identify which pairs of feeds differed in mean weight. Of the 15 pairwise comparisons, 8 were statistically significant at the 0.05 level. The 7 comparisons with no significant difference were meatmeal and casein, sunflower and casein, linseed and horsebean, meatmeal and linseed, soybean and linseed, soybean and meatmeal, and sunflower and meatmeal.
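The significant and non-significant pairs can be pulled out of the Tukey result programmatically instead of being read off the table by eye. This is a minimal sketch; `fit`, `tk`, `alpha`, `sig_pairs`, and `nonsig_pairs` are illustrative names, and the 0.05 threshold is the one used in the analysis above.

```r
data("chickwts")
fit <- aov(weight ~ feed, data = chickwts)  # same model as chickwts.avo

# TukeyHSD() returns a list; the $feed element is a matrix with
# columns diff, lwr, upr and "p adj", one row per pair of feeds.
tk <- TukeyHSD(fit)$feed

alpha <- 0.05
sig_pairs    <- rownames(tk)[tk[, "p adj"] <  alpha]
nonsig_pairs <- rownames(tk)[tk[, "p adj"] >= alpha]

print(sig_pairs)
print(nonsig_pairs)

# Confidence intervals that cross zero correspond to non-significant pairs.
plot(TukeyHSD(fit), las = 1)
```

The interval plot shows the same result graphically: each pair whose 95% family-wise interval excludes zero is a significant difference.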

Please turn in your homework via Sakai by saving and submitting an R Markdown PDF or HTML file from R Pubs!