Project Description

The Royal National Hospital for Rheumatic Diseases in Bath carried out a study to determine if additional stretching exercises improved range of motion in the hips of patients with Ankylosing Spondylitis (AS). AS is a chronic form of inflammatory arthritis that limits spine and muscle motion.

Thirty-nine patients with ‘typical’ AS were randomly allocated to either a control group receiving standard treatment or a group receiving the additional stretching exercises. The study was designed so patients were twice as likely to be assigned to the group receiving additional stretching. Upon admission and again after three weeks, the patients were assessed by several measurements on each hip for flexion, extension, abduction, and rotation extent on a scale of 0-180 degrees to determine improvement. Only flexion and lateral rotation are of concern here.
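
The 2:1 allocation described above can be sketched in R. This is only an illustration of unequal-probability random assignment; the seed and object names are hypothetical, and it is not the hospital's actual randomization procedure.

```r
# Sketch: assign 39 patients so that the stretching group is twice as
# likely as the control group (illustrative only).
set.seed(2017)  # hypothetical seed, used only to make the sketch repeatable
n_patients <- 39

allocation <- sample(
  c("control", "stretching"),
  size = n_patients,
  replace = TRUE,
  prob = c(1 / 3, 2 / 3)  # stretching is twice as likely as control
)

# Group sizes vary from run to run because assignment is random.
table(allocation)
```

For comparison, the data summary in the Exploratory Data Analysis section reports 54 treatment hips and 24 control hips.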

The raw data provided by the client was analyzed to determine whether the additional stretching exercises were in fact more effective than the standard treatment alone in improving hip range of motion.

Research Questions

Question 1

Has the stretched group improved significantly more than the control group?

Question 2

Can a model be produced to predict the improvement for a patient?

Statistical Questions

Question 1

Is the improvement for the Treatment Group statistically significant?

Question 2

Does the fact that the data is listed by hip and not by patient introduce any error that can be accounted for?

Question 3

Is there a model that incorporates both Rotation and Flexion so that the results are more intuitive to understand?

Variables

Variable	Description	Type	Units	Type of Variable	Range	Mean	Median
CPreFlex	Control Pre-Flexion	Control	Degrees	Ordinal	0-180	110	112
CPostFlex	Control Post-Flexion	Control	Degrees	Ordinal	0-180	113.8	115
CDiffFlex	Control Difference-In-Flexion	Control	Change in Degrees	Ordinal	-180-180	3.792	3
CPreRot	Control Pre-Rotation	Control	Degrees	Ordinal	0-180	25	26
CPostRot	Control Post-Rotation	Control	Degrees	Ordinal	0-180	25.96	27
CDiffRot	Control Difference-In-Rotation	Control	Change in Degrees	Ordinal	-180-180	0.9583	1.5
TPreFlex	Treatment Pre-Flexion	Explanatory	Degrees	Ordinal	0-180	116.5	120
TPostFlex	Treatment Post-Flexion	Explanatory	Degrees	Ordinal	0-180	124	126
TDiffFlex	Treatment Difference-In-Flexion	Response	Change in Degrees	Ordinal	-180-180	7.481	6
TPreRot	Treatment Pre-Rotation	Explanatory	Degrees	Ordinal	0-180	24.78	25
TPostRot	Treatment Post-Rotation	Explanatory	Degrees	Ordinal	0-180	31.37	32
TDiffRot	Treatment Difference-In-Rotation	Explanatory	Change in Degrees	Ordinal	-180-180	6.593	5

Exploratory Data Analysis

Read and Summarize Data

##     CPreFlex       CPostFlex        CPreRot         CPostRot    
##  Min.   : 81.0   Min.   : 96.0   Min.   : 4.00   Min.   : 2.00  
##  1st Qu.:105.0   1st Qu.:110.0   1st Qu.:21.75   1st Qu.:24.00  
##  Median :112.0   Median :115.0   Median :26.00   Median :27.00  
##  Mean   :110.0   Mean   :113.8   Mean   :25.00   Mean   :25.96  
##  3rd Qu.:114.2   3rd Qu.:120.0   3rd Qu.:29.00   3rd Qu.:30.25  
##  Max.   :126.0   Max.   :126.0   Max.   :36.00   Max.   :41.00  
##  NA's   :30      NA's   :30      NA's   :30      NA's   :30     
##     TPreFlex       TPostFlex      TPreRot         TPostRot    
##  Min.   : 77.0   Min.   : 88   Min.   : 2.00   Min.   :10.00  
##  1st Qu.:111.2   1st Qu.:120   1st Qu.:20.00   1st Qu.:26.00  
##  Median :120.0   Median :126   Median :25.00   Median :32.00  
##  Mean   :116.5   Mean   :124   Mean   :24.78   Mean   :31.37  
##  3rd Qu.:125.0   3rd Qu.:129   3rd Qu.:31.50   3rd Qu.:37.75  
##  Max.   :135.0   Max.   :139   Max.   :48.00   Max.   :50.00  
##                                                               
##    CDiffFlex         CDiffRot         TDiffFlex          TDiffRot     
##  Min.   :-5.000   Min.   :-9.0000   Min.   :-11.000   Min.   :-8.000  
##  1st Qu.: 0.750   1st Qu.:-1.2500   1st Qu.:  2.000   1st Qu.: 2.000  
##  Median : 3.000   Median : 1.5000   Median :  6.000   Median : 5.000  
##  Mean   : 3.792   Mean   : 0.9583   Mean   :  7.481   Mean   : 6.593  
##  3rd Qu.: 6.250   3rd Qu.: 4.0000   3rd Qu.:  9.750   3rd Qu.:10.750  
##  Max.   :30.000   Max.   :16.0000   Max.   : 49.000   Max.   :22.000  
##  NA's   :30       NA's   :30

The two plots below are a visual check that the Treatment Group did improve. The blue line marks the threshold at or below which there was zero or negative improvement. The important point is that the majority of treatment hips (both in flexion and rotation) lie above that line, suggesting that there was most likely improvement and that further analysis is justified.

Plot Improvement for Treatment Flexion

Plot Improvement for Treatment Rotation

Plot of Hips (Treatment) Density

Plot of Hips (Control) Density

Noteworthy Point of Figures Above

There seems to be some difference between the Control and Treatment pre-scores
The Treatment and Control groups’ pre-scores should be more similar to each other if patients were truly allocated at random
See Appendix for more detail

Statistical Analysis

Addressing Statistical Question 1

## 
##  Welch Two Sample t-test
## 
## data:  data$TDiffFlex and data$CDiffFlex
## t = 1.901, df = 62.271, p-value = 0.03097
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.4489492       Inf
## sample estimates:
## mean of x mean of y 
##  7.481481  3.791667

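Here we have a one-sided Welch two-sample t-test, which tests whether the Treatment Group’s mean improvement in flexion is statistically larger than the Control Group’s
The p-value of 0.031 is below the 0.05 significance level, so we reject the null hypothesis of no difference: on average the Treatment Group improved more than the Control Group (7.48 vs. 3.79 degrees)
This supports the claim that the stretching therapy worked, so we can move forward in producing a model to estimate the improvement for a patient after the treatment; note that the test treats each hip as an independent observation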
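
To make the test above concrete, the following sketch computes the Welch statistic, its degrees of freedom, and the one-sided p-value by hand and compares them with `t.test()`. It uses only the first ten hips per group shown in the Raw Data excerpt of the Appendix, so it is an illustration of the calculation and does not reproduce the reported t = 1.901. The helper name `welch_t` is hypothetical.

```r
# First ten hips per group, copied from the Raw Data excerpt in the Appendix.
t_diff <- c(1, 7, 0, 0, 13, 5, 1, 3, 2, 11)    # TDiffFlex
c_diff <- c(0, -2, 1, 1, 3, -5, 5, -3, 3, -5)  # CDiffFlex

# Hypothetical helper: one-sided Welch test of mean(x) > mean(y).
welch_t <- function(x, y) {
  vx <- var(x) / length(x)  # squared standard error of mean(x)
  vy <- var(y) / length(y)  # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p_value <- pt(t_stat, df, lower.tail = FALSE)
  c(t = t_stat, df = df, p.value = p_value)
}

by_hand <- welch_t(t_diff, c_diff)
builtin <- t.test(t_diff, c_diff, alternative = "greater")

by_hand
builtin
```

Both approaches should agree, which shows that the reported output is the standard unequal-variance comparison of two group means.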

Addressing Statistical Question 2

Check Collinearity between Even and Odd Hips

## [1] 0.6592946

We suspected that, because the data is listed by hip and not by patient, the odd and even hips of the same patient would essentially describe each other, violating the independence that the model assumes and thus leading to a bad model
To check this we computed the intraclass correlation coefficient (ICC), a value on a scale up to 1 where we treat 0.01 to 0.50 as low, 0.51 to 0.80 as moderate, and 0.81 to 1.00 as high similarity
Our value came out to be 0.6593, which is moderate; it indicates that the even and odd hips report similar values because every pair (1,2), (3,4), etc. belongs to the same patient
Therefore, when building our model, we will consider this and adjust the model accordingly
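
As a rough illustration of what an intraclass correlation measures, the sketch below computes a one-way ICC from the between-patient and within-patient mean squares, and also shows one simple way of adjusting for the pairing (averaging the two hips of each patient). It uses only the first ten treatment hips from the Raw Data excerpt and assumes that rows (1,2), (3,4), etc. are the same patient, so the result is not expected to match the reported 0.6593. Object names are hypothetical.

```r
# First ten TDiffFlex values from the Raw Data excerpt; consecutive rows
# are assumed to be the two hips of one patient.
t_diff <- c(1, 7, 0, 0, 13, 5, 1, 3, 2, 11)
patient <- factor(rep(1:5, each = 2))

# One-way ANOVA: variation between patients vs. within patients.
mean_sq <- anova(lm(t_diff ~ patient))[["Mean Sq"]]
ms_between <- mean_sq[1]
ms_within <- mean_sq[2]

k <- 2  # hips per patient
icc <- (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
icc

# One simple adjustment: one averaged value per patient, so that each
# patient contributes a single independent observation.
patient_mean <- tapply(t_diff, patient, mean)
patient_mean
```

An ICC near 1 means that most of the variation is between patients and the two hips of a patient carry largely the same information; a value near 0 means the hips behave almost like independent observations.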

Addressing Statistical Question 3

Model Building and Testing

##  TPreRot TPreFlex 
## 1.058889 1.058889

Here we have the VIF (Variance Inflation Factor), which checks whether the explanatory variables are so strongly correlated with each other that their coefficient estimates become unstable
Values less than 10 are generally considered acceptable for a model
Here both TPreRot and TPreFlex have a VIF of about 1.06, well below 2, so multicollinearity is not a concern and both can be used in the model
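
With exactly two predictors, each VIF equals 1 / (1 - r^2), where r is the correlation between the two predictors, which is why both reported values are identical. The short sketch below inverts that formula to recover the implied correlation from the reported VIF; the variable names are hypothetical.

```r
# Reported VIF for both TPreRot and TPreFlex.
vif_reported <- 1.058889

# Invert VIF = 1 / (1 - r^2) to get the squared correlation between predictors.
r_squared <- 1 - 1 / vif_reported

# Magnitude of the correlation (its sign cannot be recovered from the VIF).
abs_r <- sqrt(r_squared)

c(r_squared = r_squared, abs_r = abs_r)
```

A weak correlation between the two pre-treatment measurements is consistent with the conclusion that multicollinearity is not a problem here.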

##                 Analysis of Variance          Response: TDiffFlex 
## 
##  Factor     d.f. Partial SS MS         F     P     
##  TPreRot     1    569.235    569.23496 12.21 0.001 
##  TPreFlex    1   2672.351   2672.35104 57.34 <.0001
##  REGRESSION  2   2816.502   1408.25096 30.22 <.0001
##  ERROR      51   2376.980     46.60744

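Here we have an ANOVA table, which splits the variation in TDiffFlex between the model terms and the residual error
The important values here are under the “P” column, where values less than 0.05 indicate a statistically significant term
Unlike the VIF values above, which measure overlap between the predictors, the P-values show whether each variable explains a significant share of the response; both TPreRot (P = 0.001) and TPreFlex (P < 0.0001) do
The point of this test was to perform an extra check on our model before continuing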
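
The F and P columns of the table can be rebuilt from its sums of squares: each mean square is the sum of squares divided by its degrees of freedom, and each F value is that mean square divided by the error mean square. The sketch below redoes this arithmetic from the printed values only; the data frame and column names are hypothetical.

```r
# Values copied from the printed ANOVA table.
anova_tab <- data.frame(
  term = c("TPreRot", "TPreFlex", "REGRESSION"),
  df   = c(1, 1, 2),
  ss   = c(569.235, 2672.351, 2816.502)
)
error_ss <- 2376.980
error_df <- 51

mse <- error_ss / error_df                    # error mean square
anova_tab$ms <- anova_tab$ss / anova_tab$df   # mean square per term
anova_tab$f_stat <- anova_tab$ms / mse        # F ratio per term
anova_tab$p_value <- pf(anova_tab$f_stat, anova_tab$df, error_df,
                        lower.tail = FALSE)   # upper-tail P-value

anova_tab
```

Rounded to two decimals, the recomputed F values should agree with the F column printed in the table.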

Model Summary

## [1] 0.5423148

Here we have the R-squared value, which is the proportion of the variation in TDiffFlex that the model explains, expressed as a decimal
We treated values between 0.50 and 1.00 as good enough to continue with; here the model explains about 54% of the variation
However, because this model uses both the even and odd hips, which are not independent of each other, we believe that we can achieve a better model by using only the even or odd hips
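
The reported R-squared can be cross-checked against the ANOVA table in the previous subsection, since R-squared is the regression sum of squares divided by the total sum of squares. The sketch below also computes the adjusted R-squared, which penalizes the number of predictors; the adjusted value is not reported in the original output and is shown only as an illustration. Variable names are hypothetical.

```r
# Sums of squares copied from the ANOVA table for the full model.
ss_regression <- 2816.502
ss_error <- 2376.980

n <- 54  # treatment hips (error d.f. 51 = n - p - 1)
p <- 2   # predictors: TPreRot and TPreFlex

r2 <- ss_regression / (ss_regression + ss_error)
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

c(r2 = r2, adj_r2 = adj_r2)
```

The first value should match the reported 0.5423 up to rounding of the printed sums of squares.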

Odd Hips Model Option

## [1] 0.4912659

This is the R-Squared Value for the Model using only the Odd Hips, and as you can see it is lower than the model using both, so we won’t use this one

Even Hips Model Option

## [1] 0.6401323

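Here we have the R-squared value for the model using only the even hips; it is higher than that of the model using both (0.64 vs. 0.54), therefore we will use this model
It seems that the similarity between the even and odd hips we noted earlier is playing a factor in our model
By refitting the model on only the even hips, each patient contributes a single observation and the fit improves, although R-squared values computed on different subsets of the data are not strictly comparable
This may reflect a systematic difference between a patient’s two hips; note, however, that the raw data excerpt in the Appendix shows that the even hip is not larger than its odd counterpart in every pair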

Regression Plot of Even Hips Model

Here we have a regression plot, which overlays the fitted line (red line) on the data (black dots)
The shaded region around the red line shows the uncertainty of the fitted line
This is helpful because it shows the region where we are 95% confident the average predicted value lies; an individual patient’s value can fall outside it, because a prediction interval for a single patient is wider
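
The difference between a confidence band for the fitted line and a prediction interval for a single patient can be sketched with `predict()`. The example below fits the same model form to the first ten treatment hips from the Raw Data excerpt, so the coefficients are illustrative and are not those of the reported model; the new patient's values are hypothetical.

```r
# First ten treatment hips from the Raw Data excerpt in the Appendix.
hips <- data.frame(
  TPreFlex  = c(125, 120, 135, 135, 100, 110, 122, 122, 124, 124),
  TPreRot   = c(25, 35, 28, 24, 26, 24, 22, 24, 29, 28),
  TDiffFlex = c(1, 7, 0, 0, 13, 5, 1, 3, 2, 11)
)

fit <- lm(TDiffFlex ~ TPreRot + TPreFlex, data = hips)

# Hypothetical new patient measurements (assumption, for illustration).
new_hip <- data.frame(TPreRot = 25, TPreFlex = 120)

# 95% interval for the mean improvement at these pre-measurements.
conf_band <- predict(fit, newdata = new_hip,
                     interval = "confidence", level = 0.95)

# 95% interval for the improvement of one individual hip.
pred_band <- predict(fit, newdata = new_hip,
                     interval = "prediction", level = 0.95)

rbind(confidence = conf_band[1, ], prediction = pred_band[1, ])
```

Both intervals share the same fitted value, but the prediction interval is wider because it also includes the scatter of individual hips around the line.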

Recommendations

Question 1

Has the stretched group improved significantly more than the control group?

There is statistical evidence to suggest that adding the daily stretching exercises improves range of hip motion more than the standard treatment alone. On this evidence, we recommend adding the stretching exercises to the standard treatment.

Question 2

Can a model be produced to predict the improvement for a patient?

We were able to find a model that incorporates the Pre-Rotation and Pre-Flexion measurements in order to predict the improvement in flexion. This allows an estimate, based on a patient's Pre-Measurements, of how much that patient would be expected to improve from the stretching treatment.

Considerations

While the results show that receiving the stretching treatment daily improved hip motion in patients with ankylosing spondylitis more than the standard treatment, it should be noted that the sample sizes of the two groups differ substantially. By design, the stretching group contained roughly twice as many patients as the group given standard treatment.
Additionally, the control and treatment groups’ Pre-Scores are significantly different from each other when they should be roughly the same if patients were truly randomly allocated. If tested again, a larger sample size should be used to smooth out the effects of biased sampling. Preferably, a sample size of at least 30-50 separate patients should be taken.
The reliability of the hip measurements is also called into question. Values given to the tenth of a degree of hip rotation are unrealistic when taken by hand on a human body. To resolve this, either a more scientific method of measuring should be used, or the numbers should be rounded to a precision that a human can measure.
It should also be noted that Ankylosing Spondylitis is a spinal condition and affects the hips only as a side effect. A more accurate method of measuring the treatment’s effectiveness would be to use a measurement of spinal flexion and rotation.

Appendix

R Code

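# Packages used below (the ICC package is loaded where it is first needed)
library(xlsx)       # read.xlsx()
library(tableHTML)  # tableHTML(), add_css_*()
library(magrittr)   # %>% pipe
library(ggplot2)    # ggplot()
library(rms)        # ols(), vif()
TABLE1 <- read.xlsx("E:/Dropbox/case study 2/varstable.xlsx", sheetName = "Sheet1")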
tableHTML(TABLE1, widths = c(100,600,100,300,200,200,200,200), theme = "rshiny-blue",rownames = FALSE) %>%
  add_css_header(css = list(c('font-size', 'border'), c('20px', '2px solid blue')),
                 headers = c(1, 2, 3, 4, 5, 6, 7, 8)) %>%
  add_css_row(css = list(c('background-color'), c('pink')), rows = 1:6) %>%
  add_css_row(css = list(c('background-color'), c('lightblue')), rows = 7:12) %>%
  add_css_column(css = list(c('border'), c('1px solid grey')), columns = 1:8)
data <- read.csv("E:/Dropbox/case study 2/data.csv")
summary(data)
plot(data$TDiffFlex ~ c(1:54))
title(main = "Improvement for Treatment Group Flexion", sub = "Blue Line Shows No Improvement Threshold")
abline(h=0, col="BLUE", lty=2)
plot(data$TDiffRot ~ c(1:54))
title(main = "Improvement for Treatment Group Rotation", sub = "Blue Line Shows No Improvement Threshold")
abline(h=0, col="BLUE", lty=2)
# Map even and odd hips to separate data sets
even_indexes<-seq(2,54,2)
odd_indexes<-seq(1,54,2)
data_odd <- data[odd_indexes,]
data_even <- data[even_indexes,]

TDiffFlexOdd <- data_odd$TDiffFlex
TDiffFlexEven <- data_even$TDiffFlex

CDiffFlexOdd <- data_odd$CDiffFlex
CDiffFlexEven <- data_even$CDiffFlex
ggplot(data = data, aes(x = 1, y = TPreFlex)) + geom_boxplot() + geom_count(color = "blue") + ggtitle("Boxplot with Overlaid Density Plot of Treatment Group Flexion") + guides(size = guide_legend("Density"))
ggplot(data = data, aes(x = 1, y = CPreFlex)) + geom_boxplot() + geom_count(color = "blue") + ggtitle("Boxplot with Overlaid Density Plot of Control Group Flexion") + guides(size = guide_legend("Density"))
t.test(x=data$TDiffFlex,y=data$CDiffFlex,alternative = 'greater')
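# Use the ICC package to measure how similar the two hips of a patient are.
# ICCest() expects x = grouping variable (patient) and y = measurement,
# so the hips are kept in long format and each consecutive pair of rows
# (1,2), (3,4), ... is given the same patient id.
library(ICC)
hips_long <- data.frame(patient = factor(rep(seq_len(nrow(data) / 2), each = 2)),
                        TDiffFlex = data$TDiffFlex)
ICCest(x = patient, y = TDiffFlex, data = hips_long, alpha = 0.05, CI.type = "THD")$ICC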
testmodel1 <- ols(TDiffFlex ~ TPreRot + TPreFlex, data = data)
vif(testmodel1)
anova(testmodel1)
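# Same model as testmodel1, refitted with lm() to extract R-squared
testmodel2 <- lm(TDiffFlex ~ TPreRot + TPreFlex, data = data)

summary(testmodel2)$r.squared
# Odd hips only: every variable is taken from data_odd
odd_hip_model <- lm(TDiffFlex ~ TPreRot + TPreFlex, data = data_odd)

summary(odd_hip_model)$r.squared
# Even hips only: every variable is taken from data_even
even_hip_model <- lm(TDiffFlex ~ TPreRot + TPreFlex, data = data_even)

summary(even_hip_model)$r.squared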
# Plot the observed improvement against the model's fitted values so we can
# see how well the model tracks the data; the function can be reused for
# future models. The red line is the least-squares line through the points
# and the shaded region is its 95% confidence band.
ggplotRegression <- function(fit) {
  plot_data <- data.frame(fitted = fitted(fit), observed = fit$model[[1]])
  ggplot(plot_data, aes(x = fitted, y = observed)) +
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = "Regression Plot", x = "Fitted improvement (degrees)", y = "Observed improvement (degrees)")
}
# Apply the function to our model
ggplotRegression(even_hip_model)
ggplot(data = data_even, aes(x=1, y=TDiffFlexEven)) + geom_boxplot() + geom_count(color="blue") + ggtitle("Boxplot with overlayed Density Plot") + guides(size = guide_legend("Density"))
ggplot(data = data_even, aes(x=1, y=data_even$CDiffFlex)) + geom_boxplot() + geom_count(color="blue") + ggtitle("Boxplot with overlayed Density Plot") + guides(size = guide_legend("Density"))
# Compare a random subsample of 24 treatment pre-flexion scores with the 24 control scores
set.seed(1)  # without a seed the random subsample changes on every run
pre_flex_diff <- data.frame(diff = sample(data$TPreFlex, 24) - data$CPreFlex[!is.na(data$CPreFlex)])
ggplot(data = pre_flex_diff, aes(x = 1, y = diff)) + geom_boxplot() + geom_count(color = "blue") + ggtitle("Boxplot with Overlaid Density Plot") + guides(size = guide_legend("Density")) + geom_hline(yintercept = 0, col = "red", lty = 2) + labs(subtitle = "Note that the Average Difference between the Treatment and Control is NOT equal to 0")
t.test(x=data$TPreFlex,y=data$CPreFlex,alternative = 'greater')
# p-value = 0.007 < 0.05, so we reject the null hypothesis: the Treatment Group's mean Pre-Flex score is significantly greater than the Control Group's
head(data, 10)

Extra Figures

Plot of Even Hip (Treatment) Density

ggplot(data = data_even, aes(x=1, y=TDiffFlexEven)) + geom_boxplot() + geom_count(color="blue") + ggtitle("Boxplot with overlayed Density Plot") + guides(size = guide_legend("Density"))

Plot of Even Hips (Control) Density

ggplot(data = data_even, aes(x=1, y=data_even$CDiffFlex)) + geom_boxplot() + geom_count(color="blue") + ggtitle("Boxplot with overlayed Density Plot") + guides(size = guide_legend("Density"))

Plot of Difference between Control and Treatment Pre-Flex

set.seed(1)  # without a seed the random subsample changes on every run
pre_flex_diff <- data.frame(diff = sample(data$TPreFlex, 24) - data$CPreFlex[!is.na(data$CPreFlex)])
ggplot(data = pre_flex_diff, aes(x = 1, y = diff)) + geom_boxplot() + geom_count(color = "blue") + ggtitle("Boxplot with Overlaid Density Plot") + guides(size = guide_legend("Density")) + geom_hline(yintercept = 0, col = "red", lty = 2) + labs(subtitle = "Note that the Average Difference between the Treatment and Control is NOT equal to 0")

Extra Detail Regarding Difference Between Control and Treatment Groups

## 
##  Welch Two Sample t-test
## 
## data:  data$TPreFlex and data$CPreFlex
## t = 2.5362, df = 55.374, p-value = 0.007027
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  2.226952      Inf
## sample estimates:
## mean of x mean of y 
##  116.5000  109.9583

Raw Data

##    CPreFlex CPostFlex CPreRot CPostRot TPreFlex TPostFlex TPreRot TPostRot
## 1       100       100      23       17      125       126      25       36
## 2       105       103      18       12      120       127      35       37
## 3       114       115      21       24      135       135      28       40
## 4       115       116      28       27      135       135      24       34
## 5       123       126      25       29      100       113      26       30
## 6       126       121      26       27      110       115      24       26
## 7       105       110      35       33      122       123      22       42
## 8       105       102      33       24      122       125      24       37
## 9       120       123      25       30      124       126      29       29
## 10      123       118      22       27      124       135      28       31
##    CDiffFlex CDiffRot TDiffFlex TDiffRot
## 1          0       -6         1       11
## 2         -2       -6         7        2
## 3          1        3         0       12
## 4          1       -1         0       10
## 5          3        4        13        4
## 6         -5        1         5        2
## 7          5       -2         1       20
## 8         -3       -9         3       13
## 9          3        5         2        0
## 10        -5        5        11        3

Case Study 2

Chase Rosendale, Kanchan Sayers, Lei Wang

September 18, 2017
