Background:
Farming is a business, and like any other business, a farm seeks efficiencies in its production process. A dairy farm’s product is milk. To ensure cows produce milk, farmers invest considerable time and resources in their breeding protocols. Any cycle in which a cow is eligible to become pregnant but no pregnancy is achieved creates a loss for the farm. Even so, the costs of the breeding protocol must be balanced against the gains in pregnancy rate.
In this case, a farm is trying to decide whether or not a product called CiDR can be reused from one breeding cycle to the next or if they should use a new product every cycle. The product is an insert that delivers progesterone and increases the odds of pregnancy.
Please analyze the data provided and make a recommendation regarding the effectiveness of new or reused CiDR in terms of cow pregnancies.
Note that the farm has requested analysis of its timed artificial insemination (TAI) cows only.
The data contain the following fields:
Heifer_Breeding <- read.csv("Heifer Breeding.csv")
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
Filter for only the TAI cows that had a successful pregnancy.
tai_data <- Heifer_Breeding %>%
  filter(TAI == 1, Preg == 1)
# Bar chart of successful TAI pregnancies, split by the New indicator
ggplot(tai_data, aes(x = New)) + geom_bar()
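In the chart you can see that the number of successful pregnancies is not the same for new and reused CiDR. Given that information, we test whether the difference is statistically significant using a chi-square test of frequencies. Because the data were filtered to pregnant cows only, the table has a single Preg column, so the test compares the two counts against equal expected counts. H0: Among TAI cows that became pregnant, the number of pregnancies is the same for new and reused CiDR.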
chisq.test(table(tai_data$New, tai_data$Preg))
##
## Chi-squared test for given probabilities
##
## data: table(tai_data$New, tai_data$Preg)
## X-squared = 0.019608, df = 1, p-value = 0.8886
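Because the p-value (0.8886) is greater than our significance level (0.05), we fail to reject the null hypothesis. The data therefore give no evidence that the number of successful pregnancies differs between new and reused CiDR. Failing to reject the null does not prove the two are unrelated; it only means this sample does not show a detectable difference.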
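The test above compares pregnancy counts among cows that are already pregnant. To compare pregnancy rates instead, the non-pregnant TAI cows would also need to stay in the table. The sketch below shows both versions side by side on a small made-up data frame. It assumes that TAI, New and Preg are 0/1 indicators and that New == 1 means a new CiDR; neither assumption is confirmed by the data description. The toy values are invented for illustration and are not the farm's records, so no conclusion about the farm should be drawn from them.

```r
# Illustrative data only: invented values, NOT the farm's records.
# Assumed coding: TAI, New and Preg are 0/1 indicators; New == 1 is assumed
# to mean a new CiDR.
toy <- data.frame(
  TAI  = c(rep(1, 80), rep(0, 4)),
  New  = c(rep(1, 40), rep(0, 40), 1, 1, 0, 0),
  Preg = c(rep(1, 24), rep(0, 16), rep(1, 20), rep(0, 20), 1, 0, 1, 0)
)

# Keep every TAI cow, pregnant or not, so that rates can be compared.
tai_all <- toy[toy$TAI == 1, ]

# Version 1: filter to pregnant cows first, as in the analysis above.
# The table then has a single Preg column, so chisq.test() treats it as a
# vector of counts and runs a goodness-of-fit test against equal counts.
preg_only <- tai_all[tai_all$Preg == 1, ]
gof <- chisq.test(table(preg_only$New, preg_only$Preg))
print(gof$method)

# Version 2: a 2x2 table of CiDR type by pregnancy outcome for all TAI cows.
# This is a test of independence between New and Preg.
two_by_two <- table(New = tai_all$New, Preg = tai_all$Preg)
rate_test <- chisq.test(two_by_two)
print(rate_test$method)

# Pregnancy rate within each CiDR group (row proportions, Preg == 1 column).
rates <- prop.table(two_by_two, margin = 1)[, "1"]
print(rates)
```

Applied to the real data, the second version would use Heifer_Breeding filtered on TAI == 1 only. It would address the farm's question more directly, because it accounts for how many cows received each type of CiDR rather than only how many became pregnant.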