#(a) Since more patients on pioglitazone had cardiovascular problems (5,386 vs. 2,593) we can conclude that the rate of cardiovascular problems for those on a pioglitazone treatment is higher.
#rate rosiglitazone
2593/67593
## [1] 0.03836196
#rate pioglitazone
5386/159978
## [1] 0.03366713
#False. Cardiovascular problems actually occur at a lower rate on pioglitazone than on rosiglitazone. Although there are more instances of cardiovascular problems on pioglitazone, far more patients take pioglitazone, so the rate is only about .034 (5,386 of 159,978). For rosiglitazone the rate is about .038 (2,593 of 67,593), which is higher.
#(b)The data suggest that diabetic patients who are taking rosiglitazone are more likely to have cardiovascular problems since the rate of incidence was 3.8% for patients on this treatment, while it was only 3.4% for patients on pioglitazone.
#True. The data suggest that the rate of incidence is higher for those on rosiglitazone than for those on pioglitazone. However, the difference between 3.8% and 3.4% is small. These data come from a sample of diabetic patients on these medications. We hope the sample is representative, but it is only a sample used to estimate the true relationship, and we do not yet know whether the difference in rates occurred by random chance in the sample taken.
#(c) The fact that the rate of incidence is higher for the rosiglitazone group proves that rosiglitazone causes serious cardiovascular problems.
#False. Regardless of whether the rate is higher, causal conclusions about rosiglitazone and cardiovascular problems cannot be made based on this data alone. The data do indicate that there might be a relationship between rosiglitazone and cardiovascular problems, but there is no control group (diabetics not taking any medication) to compare the rate of cardiovascular problems to. In addition, unless patients were randomly assigned to the two drugs, differences between the groups other than the drug itself could explain the gap.
#(d) Based on the information provided so far, we cannot tell if the difference between the rates of incidences is due to a relationship between the two variables or due to chance.
#True
#(a) What proportion of patients in the treatment group and what proportion of patients in the control group died?
#Control died:
30/34
## [1] 0.8823529
#Treatment died:
45/69
## [1] 0.6521739
#(b) One approach for investigating whether or not the treatment is effective is to use a randomization technique.
#b: i. What are the claims being tested?
#H0: Null hypothesis: There is no difference in death rates between treatment and control groups.
#Ha: Alternative hypothesis: There is a difference in death rates between treatment and control groups.
#b: ii. Fill in the blank paragraph. Blanks are indicated by use of quotation marks.
#b: ii. Write alive on “28” cards, write dead on “75” cards, then shuffle and split into two groups: one group of size “69” representing treatment, and another group of size “34” representing control. We calculate the difference between the proportions of dead cards in the treatment and control groups and record this value. We repeat this many times to build a distribution centered at “0.” Lastly, we calculate the fraction of simulations where the simulated differences in proportions are “as or more extreme than our sample difference.”
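#A minimal sketch of the card procedure in b: ii, assuming 10,000 shuffles and an arbitrary seed. The names cards, n_treatment, sim_diffs, obs_diff and p_value are illustrative and are not part of the original exercise.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

cards <- c(rep("alive", 28), rep("dead", 75))  # 103 cards in total
n_treatment <- 69                              # the remaining 34 cards form the control group
n_sim <- 10000                                 # assumed number of shuffles

sim_diffs <- replicate(n_sim, {
  shuffled <- sample(cards)                    # shuffle the deck
  treatment <- shuffled[1:n_treatment]
  control <- shuffled[(n_treatment + 1):length(shuffled)]
  # proportion dead in treatment minus proportion dead in control
  mean(treatment == "dead") - mean(control == "dead")
})

# observed difference from part (a): treatment minus control
obs_diff <- 45/69 - 30/34

# fraction of shuffles at least as extreme as the observed difference (two-sided);
# the small tolerance guards against floating-point ties
p_value <- mean(abs(sim_diffs) >= abs(obs_diff) - 1e-12)
p_value
```

#The two-sided comparison matches the hypotheses in b: i. A one-sided version would count only simulated differences at or below obs_diff.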
#b: iii. What do the simulation results shown below suggest about the effectiveness of the transplant program?
#proportion treatment died - proportion control died
.65-.88
## [1] -0.23
#3(a) What type of study is this?
#This is a randomized controlled experiment. Participants are randomly assigned to either the control (placebo) group or the antibiotic (experimental) group. It is a true experiment because all variables are held equivalent (for example, the same taste and packaging of the pills) except for the treatment itself. Therefore, causality can be determined. If there is a significant difference in outcome for the treatment group compared to the control group, the difference can be attributed to the treatment.
#3b. Does this study make use of blinding?
#This study does make use of blinding in the sense that the patients were unaware of which treatment they were receiving. They received either the placebo or the actual treatment, and they could not tell which because both types of pills intentionally had the same taste and packaging. However, the description does not say whether the researchers knew which group each patient was in, so we cannot say the study was double-blind.
#3c. Compute the difference in the proportions of patients who self-reported an improvement in symptoms in the two groups: p(antibiotic)-p(placebo)
#antibiotic
yes_antibiotic <- 66/85
yes_antibiotic
## [1] 0.7764706
#placebo
yes_placebo <- 65/81
yes_placebo
## [1] 0.8024691
#difference in proportions
yes_antibiotic - yes_placebo
## [1] -0.02599855
#3d. At first glance, does the antibiotic or placebo appear to be more effective for the treatment? Explain using statistics.
#Based on this difference of about -.03, it appears that the placebo was more effective. We subtracted the proportion of yes reports in the placebo group (.80) from the proportion of yes reports in the antibiotic group (.78). Since the difference is negative, the proportion of yes reports in the placebo group is larger. However, we do not know if this reflects a true difference in effectiveness or if the difference is due to random chance in our sample.
#3e. There are two competing claims that this study is used to compare: the null hypothesis that the antibiotic has no impact and the alternative hypothesis that it has an impact. Write out these claims in easy-to-understand language.
#Null hypothesis: There is no difference in self-reported improvement in symptoms between the antibiotic and the placebo.
#Alternative hypothesis: There is a difference in self-reported improvement in symptoms between the antibiotic and the placebo.
#3f. Below is a histogram of simulation results computed under the null hypothesis. In each simulation, the summary value reported was the number of patients who received antibiotics and self-reported an improvement in symptoms. Write a conclusion for the hypothesis test in plain language.
#This histogram shows the simulated number of antibiotic patients who self-reported an improvement under the null hypothesis that there is no difference in effectiveness between the antibiotic and the placebo. Based on this histogram, 66 appears to be a value we would typically expect to observe. Therefore, we fail to reject the null because our sample data are typical given the null. The data do not provide sufficient evidence of a difference in symptom improvement between the placebo and antibiotic groups.
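#A histogram like the one described in 3f could be generated along these lines. This is a sketch under stated assumptions, not the code behind the original figure: it assumes the null is simulated by reshuffling the 131 improved and 35 not-improved patients into groups of 85 and 81, which makes the antibiotic-group count hypergeometric. The names sim_counts, expected and p_value are illustrative.

```r
set.seed(1)   # arbitrary seed, only so the sketch is reproducible
n_sim <- 10000

# 66 + 65 = 131 patients improved, 19 + 16 = 35 did not; 85 received the antibiotic.
# rhyper draws the number of "improved" patients landing in the antibiotic group.
sim_counts <- rhyper(n_sim, m = 131, n = 35, k = 85)

obs <- 66                     # observed improved patients on the antibiotic
expected <- 85 * 131 / 166    # expected count if the treatment has no effect

# two-sided p-value: simulated counts at least as far from the expected count as 66
p_value <- mean(abs(sim_counts - expected) >= abs(obs - expected) - 1e-12)
p_value
```

#hist(sim_counts) followed by abline(v = obs) would show where 66 falls relative to the simulated counts.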
#4a. What are the hypotheses?
#Null (H0): There is no difference in yawns between yawning group (treatment) and not yawning (control) group.
#Alternative (HA): There is a difference in yawns between yawning group (treatment) and not yawning (control) group.
#4b. Calculate the observed difference between the yawning rates under the two scenarios.
#yawn rate treatment
yawn_treatment <- 10/34
#yawn rate control
yawn_control <- 4/16
#yawn rate difference
yawn_treatment - yawn_control
## [1] 0.04411765
#The difference in rates of yawns between the treatment (yawn group) and control (no yawn group) is .04.
#4c. Estimate the p-value using the figure above and determine the conclusion of the hypothesis test.
#Using the figure, it appears that a difference of .04 is a difference we could expect to observe under the null because the frequency in values close to .04 is high. Therefore, we fail to reject the null and conclude that the data is not sufficient evidence to suggest that yawning in the presence of others causes a difference in yawning.
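#As a cross-check on reading the figure, the randomization distribution can be enumerated exactly instead of simulated. This swaps in an exact hypergeometric calculation for the simulation shown in the exercise, and it assumes the 14 yawners are reshuffled among 34 treatment and 16 control participants. The names x, probs, diffs and p_value are illustrative.

```r
# possible numbers of yawners that could land in the treatment group (14 yawners overall)
x <- 0:14

# probability of each count when 34 of the 50 participants are drawn for treatment
probs <- dhyper(x, m = 14, n = 36, k = 34)

# difference in yawning rates (treatment minus control) implied by each count
diffs <- x / 34 - (14 - x) / 16

obs_diff <- 10/34 - 4/16

# two-sided p-value: total probability of differences at least as extreme as observed;
# the small tolerance guards against floating-point ties
p_value <- sum(probs[abs(diffs) >= abs(obs_diff) - 1e-12])
p_value
```

#Because the observed count of 10 sits right next to the expected count of 34 * 14 / 50 = 9.52, nearly every possible outcome is at least as extreme, which is consistent with failing to reject the null.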
#5a. Write the hypotheses for testing if the proportion of high school students who followed the news about Egypt is different than the proportion of American adults who did.
#Null (H0): There is no difference between the proportion of students who followed the Egyptian Revolution and the proportion of adults who followed the Egyptian Revolution.
#Alternative (HA): There is a difference between the proportion of students who followed the Egyptian Revolution and the proportion of adults who followed the Egyptian Revolution.
#5b. Calculate the proportion of high schoolers in this sample who followed the news about Egypt.
followed_students <- 17/30
followed_students
## [1] 0.5666667
#5c. Describe how to perform a simulation and, once you had results, how to estimate the p-value.
#At first glance, we can see there is a difference in the proportion of adults who followed the Egyptian revolution (.69) compared to students who followed the Egyptian revolution (.57). This is a difference of .12. However, we do not know if this reflects a true difference in rates of people who followed the Egyptian revolution. This difference could have occurred due to chance based on our sample of students. We can run a simulation to determine if this is a difference we could typically expect to occur given that there is no difference in rates between adults and students. Our null hypothesis is that there is no difference between adults and students, but since we do not have the number of adults sampled, we can use an equivalent statement for our null: the proportion of students who followed the news is .69 (because this would give a difference in proportions of 0). Based on this null, we can run a simulation in which we have a population of students where 69% say “yes” to following the Egyptian revolution. We can then randomly pull 30 students from this population and observe the proportion of that 30 that say “yes.” We can repeat this simulation over and over (10,000 times) to build a distribution of proportions we would expect to observe given the null. Then, we can compare our actual sample proportion of .57 to the simulated distribution to see if our proportion is typical given the null. The p-value is estimated as the fraction of simulated proportions that are at least as far from .69 as our observed .57.
#5d. Using the histogram, estimate the p-value using the plot and determine the conclusion of the hypothesis test.
#Based on the histogram, it appears that a proportion of .57 of students who followed the Egyptian revolution does not seem extremely typical, but also does not appear to be a value we would rarely expect to observe. The frequency in which we would expect to see a value of .57 is somewhat high. Therefore, we would fail to reject the null and conclude that our data is not sufficient evidence to say that adults and students followed the Egyptian revolution at different rates.
#5e. Write code to simulate this distribution and find the p-value for your simulation.
#simulated binomial
nsim <- 10000
simdis <- rbinom(nsim, 30, .69)
hist(simdis, xlab = "Number of Students who follow Egyptian Revolution")
abline(v=17, col= "maroon", lwd=2, lty=2)
#calculate the exact two-sided p-value from the binomial distribution (lower tail doubled)
pbinom(17, 30, .69)*2
## [1] 0.2105051
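#The p-value is about .21: if the true proportion of students who followed the news were .69, a sample count at least as extreme as 17 out of 30 would occur about 21% of the time. This is not an unusual result under the null, so we fail to reject the null.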
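#The pbinom calculation above gives the exact binomial p-value. The same quantity can be estimated directly from simulated draws, which is closer to what 5e asks for. This is a sketch assuming the same null proportion of .69 and the same doubling of the lower tail. The names sim_counts, p_lower and p_sim are illustrative, and the seed is arbitrary.

```r
set.seed(2011)  # arbitrary seed, only so the sketch is reproducible
n_sim <- 10000

# each draw is the number of "yes" answers among 30 students when the true rate is .69
sim_counts <- rbinom(n_sim, size = 30, prob = 0.69)

# fraction of simulated samples with 17 or fewer "yes" answers (lower tail)
p_lower <- mean(sim_counts <= 17)

# double the lower tail, mirroring the pbinom(...) * 2 calculation
p_sim <- 2 * p_lower
p_sim
```

#With 10,000 draws the simulated value should land close to the exact .21, differing only by simulation noise.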