Data from a case-control study of 200 esophageal cancer cases and 775 community-based controls are shown below. Detailed dietary data were obtained by interview. This study addresses the relation between alcohol consumption (dichotomized an 80 grams per day) and esophageal cancer.
p1 <- read.xls("/Users/jenniferpolson/Documents/School/2018-W/BE M228/Final/Data Entry.xlsx", 1); p1
row.names(p1) <- p1$X
p1 <- subset(p1, select = -X )
OR1 <- (p1[1,1]*p1[2,2])/(p1[2,1]*p1[1,2])
The odds ratio is 5.6400847.
UpperOR1 <- exp(log(OR1)+(1.96*sqrt((1/p1[1,1]))+(1/p1[2,2])+(1/p1[2,1])+(1/p1[1,2]))); UpperOR1
## [1] 7.030318
LowerOR1 <- exp(log(OR1)-(1.96*sqrt((1/p1[1,1]))+(1/p1[2,2])+(1/p1[2,1])+(1/p1[1,2]))); LowerOR1
## [1] 4.524768
#Since the odds ratio is not equal to one and the 95% percent confidence interval does not overlap with 1, we can be confident in saying that we can disregard the null hypothesis
The 95% confidence interval is 7.030318,4.5247676. Since the confidence interval doesn’t contain 1 (which would cause us to accept the null hypothesis that OR = 1), we can conclude that the null hypothesis can be rejected.
In an experiment to compare two diets for fattening beef steers, nine pairs of animals were chosen from the herd; members of each pair were matched as closely as possible with respect to hereditary factors. The members of each pair were randomly allocated, one to each diet. The following table below shows the weight gains (lb) of the animals over a 140-day test period on diet 1 and on diet 2.
p2 <- read.xls("/Users/jenniferpolson/Documents/School/2018-W/BE M228/Final/Data Entry.xlsx", 2); p2
summary(p2$Difference)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -50.00 -34.00 8.00 22.89 70.00 100.00
test2b <- t.test(p2$Diet.1, p2$Diet.2, paired = T); test2b
##
## Paired t-test
##
## data: p2$Diet.1 and p2$Diet.2
## t = 1.1585, df = 8, p-value = 0.2801
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -22.67122 68.44900
## sample estimates:
## mean of the differences
## 22.88889
Based on a p-value of 0.2800675, we are unable to reject the null hypothesis that there is a difference between the two diets.
test2c <- wilcox.test(p2$Diet.1, p2$Diet.2, paired = T); test2c
##
## Wilcoxon signed rank test
##
## data: p2$Diet.1 and p2$Diet.2
## V = 32, p-value = 0.3008
## alternative hypothesis: true location shift is not equal to 0
Based on a p-value of 0.3007812, we are unable to reject the null hypothesis that there is a difference between the two diets.
The results of both the parametric and non-parametric tests are statistically insignificant (0.2800675 and 0.3007812, respectively), however, the parametric test assumes a normal distribution for the data, which is not the case Therefore, the Wilcoxon rank-sum test may be more indicative since the data do not follow a normal distribution.
A psychologist conducted a study to examine the nature of the relation, if any, between an employee’s emotional stability (X) and the employee’s ability to perform in a task group (Y). Emotional stability (X) and the employee’s ability to perfume in a task group (Y). Emotional stability was measured by a written test for which the higher the score, the greater is the emotional stability. Ability to perform in a task group (Y=1 if able, Y=0 if unable) was evaluated by the supervisor. The results for 27 employees were:
p3 <- read.xls("/Users/jenniferpolson/Documents/School/2018-W/BE M228/Final/Data Entry.xlsx", 3); p3
#install.packages('ROCR')
#library(ROCR)
set.seed(520)
n <- nrow(p3); n
## [1] 27
p <- 0.6
nrows = seq(1, n, 1)
trainingrows <- sample(nrows, floor(n * p))
training <- p3[trainingrows, ]
testing <- p3[-trainingrows, ]
model3a = glm(Task.Ability ~ Stability, data = training, family = "binomial"); summary(model3a)
##
## Call:
## glm(formula = Task.Ability ~ Stability, family = "binomial",
## data = training)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.7874 -0.7089 0.4338 0.7195 1.5228
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -10.84353 5.71420 -1.898 0.0577 .
## Stability 0.02092 0.01058 1.977 0.0480 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 21.170 on 15 degrees of freedom
## Residual deviance: 15.779 on 14 degrees of freedom
## AIC: 19.779
##
## Number of Fisher Scoring iterations: 4
The model used is binomial, given that there is a discrete outcome i.e., if the task ability is accomplished or not. The β0, or intercept of the simple logistic regression model is -10.8435312, while β1, which corresponds to the stability coefficient, is 0.0209157.
em3 <- exp(model3a$coefficients[2])
For every unit increase in stability, the odds that the task ability is 1 is multiplied by exp(β1), which is 1.0211359. This is an estimated odds ratio for the stability test measurement.
case3c = data.frame(Task.Ability = 1, Stability = 550)
prob3c = predict(model3a, case3c, type="response")
The probability of task ability given the aforementioned test score is 0.6592798.
score3d = (.70 - model3a$coefficients[1])/model3a$coefficients[2]; score3d
## (Intercept)
## 551.9083
If 70 percent of the employees are able to perform a task, it is estimated that they will have a score of 551.90832.
library(pROC)
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
library(ROCR)
## Loading required package: gplots
##
## Attaching package: 'gplots'
## The following object is masked from 'package:stats':
##
## lowess
testing$pred <- predict(model3a, newdata = testing, type = "response")
pred_result <- prediction(testing$pred, testing$Task.Ability)
pred_performance <- performance(pred_result,"tpr","fpr")
plot(pred_performance, main = "Model Performance", xlab = "1-Specificity", ylab = "Sensitivity")
roc3 <- pROC::roc(testing$Task.Ability, testing$pred)
auc3d <- auc(roc3); auc3d
## Area under the curve: 0.7857
opts = coords(roc3, x = "best", best.method = "closest.topleft");
The ideal sensitivity and specificty are 0.7142857 and 0.75 respectively, based on the criteria about being closest to the top left of the graph; that is, optimizing for both measures of performance; of note, this is at a probability threshold of 0.6860224. The model performance is 0.7857143, which is not optimal, it is not prudent to make a conclusion about the stability score and predictive capacity for task ability, especially if this is about employees and their livelihood.
An investigator wants to examine data from a study that investigated the effects of a treatment on patients with chronic obstructive pulmonary disease. Patients involved in the study were recruited from three different medical centers - Johns Hopkins University Medical Center, Rancho Los Amigos Medical Center, and the St. Louis University Medical Center. Before combining the subjects for analysis into one group, one should first examine the baseline characteristics to ensure that the patients from the different centers are comparable. Extent of disease is one variable that is most important and is listed below by center:
p4 <- read.xls("/Users/jenniferpolson/Documents/School/2018-W/BE M228/Final/Data Entry.xlsx", 4); p4
Test the null hypothesis at the 5% level, that at the time when the study was conducted, whether the means of extent of disease at all three centers were identical.
#install.packages("MASS")
library(MASS)
group1 <- na.omit(p4$Hopkins)
group2 <- na.omit(p4$Rancho)
group3 <- na.omit(p4$St..Louis..Extent.of.Disease.)
response <- c(group1, group2, group3)
group <- c(rep("A", length(group1)), rep("B", length(group2)), rep("C", length(group3)))
eg1 <- data.frame(response, group)
model4 = lm(eg1$response ~ eg1$group)
summary(model4)
##
## Call:
## lm(formula = eg1$response ~ eg1$group)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.000 -6.400 1.600 5.462 25.000
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.400 3.168 12.435 1.36e-14 ***
## eg1$groupB 8.138 4.650 1.750 0.0886 .
## eg1$groupC -7.400 4.871 -1.519 0.1375
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.27 on 36 degrees of freedom
## Multiple R-squared: 0.2108, Adjusted R-squared: 0.167
## F-statistic: 4.809 on 2 and 36 DF, p-value: 0.0141
aov(model4)
## Call:
## aov(formula = model4)
##
## Terms:
## eg1$group Residuals
## Sum of Squares 1448.144 5420.831
## Deg. of Freedom 2 36
##
## Residual standard error: 12.27105
## Estimated effects may be unbalanced
Null hypothesis: The mean disease extent is the same among all three centers. Alternative hypothesis: The means of disease extend at all three centers are not identical.
p_js = t.test(p4$Hopkins, p4$St..Louis..Extent.of.Disease.)
p_jr = t.test(p4$Hopkins, p4$Rancho)
p_rs = t.test(p4$Rancho, p4$St..Louis..Extent.of.Disease.)
p_raw = c(p_js$p.value, p_jr$p.value, p_rs$p.value)
names = c("Hopkins-STL", "Hopkins-Rancho", "Rancho-STL")
p_adj <- data.frame(names, p_raw)
p_adj$p_bonf = p.adjust(p_adj$p_raw, method = "bonferroni")
p_adj$p_holm = p.adjust(p_adj$p_raw, method = "holm")
p_adj$p_hoch = p.adjust(p_adj$p_raw, method = "hochberg")
p_adj
According to the table above, the means for the samples at Johns Hopkins and St. Louis are the same, no matter what form of correction. Bonferroni is typically regarded s the strictest form of correction; under this adjustment, none of the means are statistically different, and the null hypothesis cannot be rejected. The Holm method, while not as strict, indicates the same thing. However, performing Benjamini-Hochberg correction shows that the Rancho Los Amigos Medical Center is different from the other two centers (p = 0.043)
The medical imaging division and pulmonary medicine are involved in a study emphysematous lungs. A prediction equation needs to be developed predicting Y grams of emphysema from various imaging texture measures x1( m) x2(mm) x3( m2) x4(mm).
p5 <- read.xls("/Users/jenniferpolson/Documents/School/2018-W/BE M228/Final/Data Entry.xlsx", 5)
model5a <- lm(Y..g. ~ x1 + x2 + x3 + x4, data = p5)
s5 <- summary(model5a)
AIC(model5a)
## [1] 77.28271
s5
##
## Call:
## lm(formula = Y..g. ~ x1 + x2 + x3 + x4, data = p5)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1388 -1.3761 -0.7588 2.1733 4.0201
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -30.1369 37.5282 -0.803 0.44264
## x1 2.0699 0.4562 4.537 0.00141 **
## x2 2.5816 0.7397 3.490 0.00683 **
## x3 0.6360 0.4603 1.382 0.20039
## x4 1.1060 0.7648 1.446 0.18208
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.107 on 9 degrees of freedom
## Multiple R-squared: 0.9757, Adjusted R-squared: 0.9648
## F-statistic: 90.18 on 4 and 9 DF, p-value: 2.954e-07
Running the linear model of this indicates that only the p-values for x1 and x2 are significant to the model (rs5$coefficients[4,2]
and 1.381729), respectively.
model5b <- lm(Y..g. ~ x1 + x2, data = p5)
s5b <- summary(model5b)
AIC(model5b)
## [1] 76.32074
The multiple R-quared is 0.9756564, adjusted: 0.9648371 for the model that includes all factors. For the model that only inlcudes X1 and X2, the values are 0.9697568, adjusted: 0.9642581. This means that the model including all of the variables is a better model for the proposed study.
case5c = data.frame(x1=5.2, x2=21.3, x3=19.7, x4=12.2)
prob5c <- predict(model5a, case5c, type="response")
The grams of emphesema for the predicted for those parameters is 61.6369059.
In the beginning of lecture, we have discussed this figure below to illustrate the process of statistical study and design. Using the idea of figure and two articles of New England Journal Article with the tile of “Adjuvant Sunitinib in High-Risk Renal-Cell Carcinoma after Nephrectomy” (http://www.nejm.org/doi/pdf/10.1056/NEJMoa1611406), answer the questions below
The null hypothesis to be tested is that sunitinib does not significantly affect the incidence of disease recurrence, a second cancer, or death among patients with locoregional renal-cell carcinoma. The alternative hypothesis is that there is an effect of sunitinib on the incidence of events listed above, in that the incidence is lower. This is a one-sided hypothesis.
The patients with high risk of recurrence have a higher survival in a disease-free state if they take sunitinib. However, there is toxicity as a result,
The study conductors recruited 615 patients with a high-risk form of renal cell carcinoma; these patients either received a placebo or the drug (sunitinib) at the following schedule: 4-weeks-on, 2-weeks-off for one year or disease recurrence. They assesed for disease-free survival, on a double blind, phase III trial.
The methods they used were determining the statistical power required for the target hazard ratio/effect size. They performed a two-sided log rank test, despite the fact that their hypothesis was one-sided. They implemented repreated measure analysis for a few parameters using covariates; multiple testing needs to be taken care when looking at covariates; the mean of the two groups was used to compare overall long-term outcomes.
The Kaplan-Meier curve for survival indicates a hazard ratio of 0.76, with a 95% CI of 0.59–0.98. This means that, at a given time, around 3/4 of the treatment subjects will experience an event of disease recurrence, a second cancer, or death as compared to the group receiving the placebo.
The decision here is that there is a slight diminishment of risk of the adverse incidents among patients who take sunitinib, but there are also toxic effects. For patients witht he specific type of renal carcinoma, this can be recommended to reduce recurrence or mortality.
Bonus Questions: (a) Describe the type1 and type2 errors. (b) Describe the p-value. (c) Describe the study power. The authors
P(A) x P(B) = 0.3*0.7 = 0.21
P(A = 0)x[P(E|B=0) + P(E|B=1)] = 0.30 x [0.10 + 0.90] = 0.30
Two pathologists are counting the number of cancer cells in a sample. Imagine that they are simultaneously examining the same tissue sample using their own microscope to make measurements M1 and M2 of the number of cancer cells N in the sample. Each microscope can be out of focus (events F1 and F2). In such a case, the pathologist will undercount by three or more cells or, if N is less than three, fail to detect any cancer cells at all.
Which of them correctly, but not necessarily efficiently, represent the above information?
The fourth network represents this information; the number of cells N isn’t dependent on any of the nodes stated in the problem, and thus the fourth option is the only one with that belief. Moreover, M1 and M2 are dependent on both the number of cells N as well as the focus events of each microscope.
Which network topology is preferred and why?
For Bayesian networks, directed, acyclic graphs provide the best topographical structure. This is because direct and indirect conditional dependencies can be quantified and discerned from the graphical structure. This allows for conditional probability tables to be constructed for a system.