What is stress echocardiography?
Stress echocardiography, also known as an echocardiography stress test or stress echo, is a procedure that determines how well your heart and blood vessels are working.
Your doctor may ask you to take a stress echo test if you have chest pain that they think is due to coronary artery disease or a myocardial infarction (a heart attack). If you're in cardiac rehabilitation, the test also shows how much exercise you can safely tolerate. It also helps your doctor determine how well treatments such as bypass grafting, angioplasty, and anti-anginal or antiarrhythmic medications are working.
During a stress echo, you'll exercise on a treadmill or stationary bike while your doctor monitors your blood pressure and heart rhythm. When your heart rate reaches its peak, your doctor will take ultrasound images of your heart to determine whether your heart muscle is getting enough blood and oxygen while you exercise.
I have chosen stress echo data originally published by UCLA, Department of Physiology.
General Explanation of the Study
This data is from a study that was trying to determine if a drug called “dobutamine” could be used effectively in a test for measuring a patient’s risk of having a heart attack, or “cardiac event.” For younger patients, a typical test of this risk is called “Stress Echocardiography.” It involves raising the patient’s heart rate by exercise–often by having the patient run on a treadmill–and then taking various measurements, such as heart rate and blood pressure, as well as more complicated measurements of the heart. The problem with this test is that it often cannot be used on older patients whose bodies can’t take the stress of hard exercise. The key to assessing risk, however, is putting stress on the heart before taking the relevant measurements. While exercise can’t be used to create this stress for older patients, the drug dobutamine can. This study, then, was partly an attempt to see if the stress echocardiography test was still effective in predicting cardiac events when the stress on the heart was produced by dobutamine instead of exercise. More specifically, though, the study sought to pinpoint which measurements taken during the stress echocardiography test were most helpful in predicting whether or not a patient suffered a cardiac event over the next year.
# load data
library(tidyverse)
library(psych)
library(knitr)
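# stringsAsFactors = TRUE keeps gender and ecg as factors (the default before R 4.0),
# which the summary() calls below rely on to print level counts
stressEcho.data <- read.csv('https://raw.githubusercontent.com/niteen11/MSDS/master/DATA606/ProjectProposal/dataset/stressEcho.csv', stringsAsFactors = TRUE)
#Research Question 1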
rq1.heartRate <- select(stressEcho.data,maxhr,newMI,newPTCA,newCABG,death)
#Research Question 2
rq2.stressEcho <- select(stressEcho.data,X,posSE,newMI,newPTCA,newCABG,death)
rq2.stressEcho <- rename(rq2.stressEcho,id=X)
#per data file: posSE = stress echocardiogram was positive (1 = yes)
stress.positive <- filter(rq2.stressEcho,posSE==1)
stress.negative <- filter(rq2.stressEcho,posSE==0)
#Research Question 3
rq3.restWMA<- select(stressEcho.data,X,restwma,newMI,newPTCA,newCABG,death)
rq3.restWMA <- rename(rq3.restWMA,id=X)
#per data file: restwma = cardiologist sees wall motion anomaly on echocardiogram (1 = yes)
restWMA.positive <- filter(rq3.restWMA,restwma==1)
restWMA.negative <- filter(rq3.restWMA,restwma==0)
#Research Question 4
rq4.age.ecg<- select(stressEcho.data,age,ecg,newMI,newPTCA,newCABG,death)
# there are 3 categories of ecg diagnosis: equivocal, MI and normal
ecg.equivocal <- filter(rq4.age.ecg,ecg=='equivocal')
ecg.MI <- filter(rq4.age.ecg,ecg=='MI')
ecg.normal <- filter(rq4.age.ecg,ecg=='normal')
You should phrase your research question in a way that matches up with the scope of inference your dataset allows for.
What are the cases, and how many are there?
dim(stressEcho.data)
## [1] 558 32
summary(stressEcho.data$gender)
## female male
## 338 220
The accompanying data file contains the complete data for the final study population, which included 220 men and 338 women. The data collected on each subject is explained below.
1. The actual data file is a comma delimited text file. The first row contains the abbreviations for the information recorded on each subject. The remaining rows each represent a single patient’s corresponding information.
2. For the purposes of the study, the “cardiac events” that the “Dobutamine Stress Echocardiography” was attempting to predict were broken down into four categories:
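CARDIAC EVENTS: newMI (new myocardial infarction), newPTCA (new percutaneous transluminal coronary angioplasty), newCABG (new coronary artery bypass graft surgery), and death.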
Note that several of the original variables have been renamed and recoded for the S datasets. Event variables that originally were coded 0=yes 1=no have been recoded 0=no 1=yes. equivecg and posecg are combined into a new variable ecg. The original hxofcig variable had values 0, .5, and 1, assumed to represent heavy, moderate, and none. %mphr(b) has been renamed pctMphr.
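The recoding described in the note can be sketched in base R. This is only an illustration on hypothetical toy vectors, not the study data. The names `death_original`, `equivecg`, and `posecg` are stand-ins, and the mapping of the two ECG flags onto the levels `equivocal`, `MI`, and `normal` is an assumption made for the example.

```r
# Hypothetical toy values; not taken from the study data.
# Original coding per the note: 0 = yes, 1 = no.
death_original <- c(0, 1, 1, 0, 1)

# Flip the 0/1 indicator so that 0 = no, 1 = yes.
death_recoded <- 1 - death_original
print(death_recoded)

# Assumed layout: two 0/1 flags (1 = yes) that are never both 1.
equivecg <- c(1, 0, 0, 0, 1)
posecg   <- c(0, 1, 0, 0, 0)

# Collapse the two flags into one three-level factor (assumed mapping).
ecg <- factor(
  ifelse(posecg == 1, "MI", ifelse(equivecg == 1, "equivocal", "normal")),
  levels = c("equivocal", "MI", "normal")
)
print(table(ecg))
```

Subtracting from 1 is enough to flip a binary indicator, and a single factor is easier to filter and plot than two separate flag columns.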
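What type of study is this (observational/experiment)? This was an observational study: all patients underwent dobutamine stress echocardiography and were then followed for cardiac events, with no random assignment to treatment groups.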
Vanderbilt link : http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/stressEcho.html
UCLA link : http://www.stat.ucla.edu/projects/datasets/cardiac-explanation.html
NCBI (National Center for Biotechnology Information) link: https://www.ncbi.nlm.nih.gov/pubmed/10080472
Prognostic value of dobutamine stress echocardiography in predicting cardiac events in patients with known or suspected coronary artery disease.
Keywords for Dataset: Medical, Biology, Physiology
Data Provided By: Alan Garfinkel, PhD, UCLA, Department of Physiology
The complete citation for the journal in which the results of the study were published is as follows:
Garfinkel, Alan, et al. “Prognostic Value of Dobutamine Stress Echocardiography in Predicting Cardiac Events in Patients With Known or Suspected Coronary Artery Disease.” Journal of the American College of Cardiology 33.3 (1999): 708-16.
What is the response variable, and what type is it (numerical/categorical)?
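The response variables are the four cardiac event indicators: newMI, newPTCA, newCABG, and death. Each is categorical (binary).
Note: “0” means that the patient DID NOT suffer the corresponding cardiac event, and “1” means that the patient DID.
What is the explanatory variable, and what type is it (numerical/categorical)? The explanatory variables are maxhr (numerical), posSE (categorical), restwma (categorical), age (numerical), and ecg (categorical).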
Provide summary statistics relevant to your research question. For example, if you’re comparing means across groups provide means, SDs, sample sizes of each group. This step requires the use of R, hence a code chunk is provided below. Insert more code chunks as needed.
# RQ#1
summary(rq1.heartRate$maxhr)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 58.0 104.2 120.0 119.4 133.0 200.0
qqnorm(rq1.heartRate$maxhr)
qqline(rq1.heartRate$maxhr)
#summary(stressEcho.data)
describe(rq2.stressEcho)
## vars n mean sd median trimmed mad min max range skew
## id 1 558 279.50 161.22 279.5 279.50 206.82 1 558 557 0.00
## posSE 2 558 0.24 0.43 0.0 0.18 0.00 0 1 1 1.19
## newMI 3 558 0.05 0.22 0.0 0.00 0.00 0 1 1 4.11
## newPTCA 4 558 0.05 0.21 0.0 0.00 0.00 0 1 1 4.20
## newCABG 5 558 0.06 0.24 0.0 0.00 0.00 0 1 1 3.73
## death 6 558 0.04 0.20 0.0 0.00 0.00 0 1 1 4.49
## kurtosis se
## id -1.21 6.83
## posSE -0.58 0.02
## newMI 14.92 0.01
## newPTCA 15.65 0.01
## newCABG 11.92 0.01
## death 18.22 0.01
describe(stress.positive)
## vars n mean sd median trimmed mad min max range skew
## id 1 136 314.22 137.70 294.5 318.20 128.24 57 555 498 -0.07
## posSE 2 136 1.00 0.00 1.0 1.00 0.00 1 1 0 NaN
## newMI 3 136 0.10 0.31 0.0 0.01 0.00 0 1 1 2.58
## newPTCA 4 136 0.10 0.30 0.0 0.00 0.00 0 1 1 2.72
## newCABG 5 136 0.15 0.36 0.0 0.07 0.00 0 1 1 1.89
## death 6 136 0.08 0.27 0.0 0.00 0.00 0 1 1 3.04
## kurtosis se
## id -0.85 11.81
## posSE NaN 0.00
## newMI 4.71 0.03
## newPTCA 5.44 0.03
## newCABG 1.59 0.03
## death 7.30 0.02
describe(stress.negative)
## vars n mean sd median trimmed mad min max range skew
## id 1 422 268.31 166.72 277.5 266.53 222.39 1 558 557 0.07
## posSE 2 422 0.00 0.00 0.0 0.00 0.00 0 0 0 NaN
## newMI 3 422 0.03 0.18 0.0 0.00 0.00 0 1 1 5.19
## newPTCA 4 422 0.03 0.18 0.0 0.00 0.00 0 1 1 5.19
## newCABG 5 422 0.03 0.17 0.0 0.00 0.00 0 1 1 5.65
## death 6 422 0.03 0.17 0.0 0.00 0.00 0 1 1 5.41
## kurtosis se
## id -1.30 8.12
## posSE NaN 0.00
## newMI 25.04 0.01
## newPTCA 25.04 0.01
## newCABG 30.04 0.01
## death 27.35 0.01
rq2.stressEcho.tidy <- gather(rq2.stressEcho, CardiacEvent, EventStatus, newMI:death) %>%
  arrange(id)
rq2.stressEcho.tidy.df <- rq2.stressEcho.tidy %>%
  group_by(posSE, CardiacEvent, EventStatus) %>%
  summarise(count = n())
kable(rq2.stressEcho.tidy.df)
posSE | CardiacEvent | EventStatus | count |
---|---|---|---|
0 | death | 0 | 409 |
0 | death | 1 | 13 |
0 | newCABG | 0 | 410 |
0 | newCABG | 1 | 12 |
0 | newMI | 0 | 408 |
0 | newMI | 1 | 14 |
0 | newPTCA | 0 | 408 |
0 | newPTCA | 1 | 14 |
1 | death | 0 | 125 |
1 | death | 1 | 11 |
1 | newCABG | 0 | 115 |
1 | newCABG | 1 | 21 |
1 | newMI | 0 | 122 |
1 | newMI | 1 | 14 |
1 | newPTCA | 0 | 123 |
1 | newPTCA | 1 | 13 |
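To compare the two groups on a common scale, the counts in the table above can be turned into event rates and a relative risk. The following is a minimal base R sketch that retypes the counts by hand. The data frame `events` and its column names are introduced here for illustration only and are not part of the original analysis.

```r
# Counts copied from the table above (rows with EventStatus == 1).
events <- data.frame(
  CardiacEvent = c("death", "newCABG", "newMI", "newPTCA"),
  neg_events   = c(13, 12, 14, 14),  # patients with posSE == 0
  pos_events   = c(11, 21, 14, 13)   # patients with posSE == 1
)

# Group sizes from the same table: 409 + 13 = 422 and 125 + 11 = 136.
n_neg <- 422
n_pos <- 136

# Proportion of each group that suffered the event.
events$neg_rate <- events$neg_events / n_neg
events$pos_rate <- events$pos_events / n_pos

# Relative risk: rate with a positive test divided by rate with a negative test.
events$relative_risk <- events$pos_rate / events$neg_rate

print(events)
```

The rates agree with the group means reported by `describe()` above, for example about 0.15 for newCABG among patients with a positive stress echo.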
ggplot(data = filter(rq2.stressEcho.tidy, EventStatus == 1), aes(posSE)) +
  geom_bar(aes(fill = CardiacEvent)) +
  # after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = -0.2) +
  scale_x_continuous(breaks = c(0, 1)) +
  xlab('Positive Stress Echocardiography (0 = No, 1 = Yes)') +
  facet_wrap(~CardiacEvent) +
  ggtitle('Stress Echocardiography Test: Suffered Cardiac Event') +
  theme_bw()
ggplot(data = filter(rq2.stressEcho.tidy, EventStatus == 0), aes(posSE)) +
  geom_bar(aes(fill = CardiacEvent)) +
  scale_x_continuous(breaks = c(0, 1)) +
  xlab('Positive Stress Echocardiography (0 = No, 1 = Yes)') +
  # after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = -0.2) +
  facet_wrap(~CardiacEvent) +
  ggtitle('Stress Echocardiography Test: Did Not Suffer Cardiac Event') +
  theme_bw()
describe(rq3.restWMA)
## vars n mean sd median trimmed mad min max range skew
## id 1 558 279.50 161.22 279.5 279.50 206.82 1 558 557 0.00
## restwma 2 558 0.46 0.50 0.0 0.45 0.00 0 1 1 0.16
## newMI 3 558 0.05 0.22 0.0 0.00 0.00 0 1 1 4.11
## newPTCA 4 558 0.05 0.21 0.0 0.00 0.00 0 1 1 4.20
## newCABG 5 558 0.06 0.24 0.0 0.00 0.00 0 1 1 3.73
## death 6 558 0.04 0.20 0.0 0.00 0.00 0 1 1 4.49
## kurtosis se
## id -1.21 6.83
## restwma -1.98 0.02
## newMI 14.92 0.01
## newPTCA 15.65 0.01
## newCABG 11.92 0.01
## death 18.22 0.01
rq3.restWMA.tidy <- gather(rq3.restWMA, CardiacEvent, EventStatus, newMI:death) %>%
  arrange(id)
rq3.restWMA.tidy.df <- rq3.restWMA.tidy %>%
  group_by(restwma, CardiacEvent, EventStatus) %>%
  summarise(count = n())
kable(rq3.restWMA.tidy.df)
restwma | CardiacEvent | EventStatus | count |
---|---|---|---|
0 | death | 0 | 283 |
0 | death | 1 | 18 |
0 | newCABG | 0 | 271 |
0 | newCABG | 1 | 30 |
0 | newMI | 0 | 277 |
0 | newMI | 1 | 24 |
0 | newPTCA | 0 | 279 |
0 | newPTCA | 1 | 22 |
1 | death | 0 | 251 |
1 | death | 1 | 6 |
1 | newCABG | 0 | 254 |
1 | newCABG | 1 | 3 |
1 | newMI | 0 | 253 |
1 | newMI | 1 | 4 |
1 | newPTCA | 0 | 252 |
1 | newPTCA | 1 | 5 |
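One way to check whether a difference in this table is larger than chance would explain is Fisher's exact test on a 2x2 table. The sketch below uses only the newCABG rows, with the counts retyped by hand from the table above. The choice of test and the object names `cabg` and `result` are assumptions for illustration, not part of the original analysis.

```r
# newCABG counts copied from the table above.
# Rows: restwma (0, 1); columns: newCABG (0 = no event, 1 = event).
cabg <- matrix(
  c(271, 30,
    254,  3),
  nrow = 2, byrow = TRUE,
  dimnames = list(restwma = c("0", "1"), newCABG = c("0", "1"))
)
print(cabg)

# Fisher's exact test of independence between restwma and newCABG.
result <- fisher.test(cabg)
print(result$p.value)   # two-sided p-value
print(result$estimate)  # conditional maximum likelihood estimate of the odds ratio
print(result$conf.int)  # 95% confidence interval for the odds ratio
```

Fisher's test is used here rather than a chi-squared test because one cell holds only 3 events. The same call could be repeated for the other three events, keeping in mind that running four tests raises the chance of a false positive.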
ggplot(data = rq3.restWMA.tidy, aes(restwma)) +
  geom_bar(aes(fill = CardiacEvent)) +
  scale_x_continuous(breaks = c(0, 1)) +
  xlab('Wall motion anomaly on echocardiogram (0 = No, 1 = Yes)') +
  # after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0)
  geom_text(stat = 'count', aes(label = after_stat(count)), vjust = 0.2) +
  facet_wrap(~ CardiacEvent + EventStatus, ncol = 2)
describe(rq4.age.ecg)
## vars n mean sd median trimmed mad min max range skew
## age 1 558 67.34 12.05 69 67.99 10.38 26 93 67 -0.58
## ecg* 2 558 2.24 0.90 3 2.30 0.00 1 3 2 -0.49
## newMI 3 558 0.05 0.22 0 0.00 0.00 0 1 1 4.11
## newPTCA 4 558 0.05 0.21 0 0.00 0.00 0 1 1 4.20
## newCABG 5 558 0.06 0.24 0 0.00 0.00 0 1 1 3.73
## death 6 558 0.04 0.20 0 0.00 0.00 0 1 1 4.49
## kurtosis se
## age 0.41 0.51
## ecg* -1.59 0.04
## newMI 14.92 0.01
## newPTCA 15.65 0.01
## newCABG 11.92 0.01
## death 18.22 0.01
summary(rq4.age.ecg$age)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 26.00 60.00 69.00 67.34 75.00 93.00
summary(rq4.age.ecg$ecg)
## equivocal MI normal
## 176 71 311
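The ECG counts above are easier to compare as shares of the 558 patients. This is a small base R sketch with the counts retyped by hand. The vector names `ecg_counts` and `ecg_share` are introduced here for illustration.

```r
# Counts copied from the summary() output above.
ecg_counts <- c(equivocal = 176, MI = 71, normal = 311)

# Convert counts to proportions of the whole sample.
ecg_share <- prop.table(ecg_counts)

# Show the shares as percentages rounded to one decimal place.
print(round(100 * ecg_share, 1))
```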
# Refer to the column by name inside aes(); using rq4.age.ecg$age bypasses ggplot's data handling
ggplot(rq4.age.ecg, aes(age)) +
  geom_bar(aes(fill = ecg)) +
  xlab('Age') +
  theme_bw()