Math 247 Final Project Report

Introduction

My research question was: Do residents across the islands with high perceived stress levels have higher blood pressure? My population parameter is the true difference in the proportion of residents with high blood pressure between those with high perceived stress and those without high perceived stress. My initial conjecture is that the proportion with high blood pressure is about 5 percentage points greater among high-stress individuals than among low-stress individuals, since previous literature found a positive link, though not a particularly strong one.

The relationship between chronic stress and high blood pressure has been explored extensively in the medical literature. Sparrenberger et al. (2009) conducted a systematic review of observational studies investigating a potential link between chronic stress and hypertension. They concluded that while many studies did find positive associations between the two variables, the overall evidence was inconclusive due to differing study designs and definitions of stress. Hudzinski, Frohlich, and Holloway (1988) investigated the possible relationship from a clinical perspective by looking at the role of the sympathetic nervous system in linking high stress to high blood pressure. Their findings suggest that long-term, chronic activation of physiological stress pathways could contribute to hypertension.

I believe the true population difference is likely greater than my predicted 5 percentage points, because the villagers aren’t actual people and the program doesn’t encapsulate all of the potential associations between variables present in the human population.

Data Collection Methods

The observational units in this study were villagers from across all islands. To obtain villagers in my contacts, I posted an advertisement in the local newspaper of each town with the question, “Are you stressed?” so that my sample would include people who answered both yes and no. I obtained 272 villagers in my contacts after returning to the villages a day later and asking for consent. Then, I clicked on contacts (which was not random), took their blood pressure, noted their systolic measurement, and asked them, “On a scale from 1 to 10, how stressed do you feel right now?”. I chose to look at two categorical variables, so I converted both measurements into binary categories: I classified blood pressure as high if the systolic pressure was over 140 millimeters of mercury, and I classified an individual as having high stress if their perceived stress level was 5 or over. I collected data over about 10 sessions at different times of the day.
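The two classification rules described above can be sketched in R as follows. This is only an illustration: the data frame `villagers`, its columns `systolic` and `stress`, and the four example values are made up for the sketch and are not the study's data. Only the cutoffs (systolic over 140, stress of 5 or over) and the variable names `HBP` and `HPS` come from the report.

```r
# Made-up example measurements (NOT the study's data)
villagers <- data.frame(
  systolic = c(118, 141, 140, 155),  # systolic pressure in mmHg
  stress   = c(3, 5, 7, 4)           # self-reported stress, scale 1 to 10
)

# High blood pressure: systolic strictly over 140 mmHg
villagers$HBP <- ifelse(villagers$systolic > 140, "Yes", "No")

# High perceived stress: score of 5 or over
villagers$HPS <- ifelse(villagers$stress >= 5, "Yes", "No")

villagers
```

Note that a reading of exactly 140 falls in the "No" group under the "over 140" rule, while a stress score of exactly 5 falls in the "Yes" group.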

To acknowledge what went wrong, I couldn’t figure out how to sample randomly once I had the villagers in my contacts. Also, it was hard to craft a specific question about chronic stress that the villagers could understand and respond to, so I had to ask a question about their current stress levels, which doesn’t address stress levels over a long period of time.
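One possible way to randomize the selection, sketched here as a suggestion rather than as what was done in this study, is to number the contacts and let R choose which ones to measure. The sketch assumes the 272 contacts can be numbered 1 to 272 in the order they are listed, and uses 177 as the sample size because that is the total in the two-way table below. The names `n.contacts`, `n.sample`, and `chosen` and the seed are hypothetical.

```r
# Hypothetical sketch: draw a simple random sample of contact positions
set.seed(247)        # arbitrary seed so the same draw can be repeated
n.contacts <- 272    # villagers in the contact list
n.sample   <- 177    # villagers to measure

# sample() without replacement gives every contact the same chance of selection
chosen <- sort(sample(seq_len(n.contacts), size = n.sample))
head(chosen)         # first few contact positions to visit
```

The villagers at the positions in `chosen` would then be the ones measured, instead of whichever contacts happened to be clicked.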

Descriptive Statistics

library(readr)
library(mosaic)  # provides tally() and bargraph() used below
Stress_HBP_Data <- 
  read_csv("~/OneDrive - Whitman College/Statistics Project Data - MP3 - Sheet1.csv")
head(Stress_HBP_Data, n=2)

tally(HBP ~ HPS, data = Stress_HBP_Data, format = "count", margin = TRUE)

##        HPS
## HBP      No Yes
##   No     66  30
##   Yes    68  13
##   Total 134  43

tally(HBP ~ HPS, data = Stress_HBP_Data, format = "prop")

##      HPS
## HBP          No       Yes
##   No  0.4925373 0.6976744
##   Yes 0.5074627 0.3023256

I created a two-way table with the independent variable (stress level) in the columns and the dependent variable (blood pressure level) in the rows. From this table, we can see that among respondents with low to medium perceived stress, the split between high and low blood pressure was fairly even (50.75% versus 49.25%). In contrast, 69.77% of respondents with high perceived stress had low blood pressure. We can also see that the low-stress group (134) is over three times the size of the high-stress group (43).
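The conditional percentages quoted above can be reproduced from the four cell counts alone. The following base R sketch rebuilds the table by hand; the object names `counts` and `col.props` are introduced here for illustration and do not appear elsewhere in the report.

```r
# Cell counts from the two-way table (rows: HBP, columns: HPS)
counts <- matrix(c(66, 68, 30, 13), nrow = 2,
                 dimnames = list(HBP = c("No", "Yes"),
                                 HPS = c("No", "Yes")))

# Proportions within each stress group (each column sums to 1)
col.props <- prop.table(counts, margin = 2)
round(100 * col.props, 2)   # percentages within each stress group

colSums(counts)             # group sizes: 134 low stress, 43 high stress
```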

mosaicplot(HPS ~ HBP, data = Stress_HBP_Data)

bargraph(~HPS, groups = HBP, data = Stress_HBP_Data, auto.key = TRUE)

We can see that there is a low percentage of individuals with high stress and high blood pressure compared to the other categories. There appears to be an association between the two variables.

Analysis of Results

My population parameter is the true difference in the proportion of residents with high blood pressure between those with high perceived stress and those without high perceived stress. My null hypothesis is that there is no difference in the proportion of high blood pressure between villagers with high stress and those without high stress. \[H_0: \pi_{highstress} - \pi_{lowstress} = 0\] My alternative hypothesis is that the proportion of high-stress individuals with high blood pressure is different than the proportion of individuals without high stress who have high blood pressure.

\[H_A: \pi_{highstress} - \pi_{lowstress} \neq 0\]

A type I error in this context would happen if I concluded that stress level is associated with a difference in the proportion with high blood pressure when, in reality, there is no difference in the population. A type II error would happen if I concluded that there is no difference in high blood pressure between stress levels when, in reality, a true difference does exist. My measurements cannot reasonably be considered a representative sample from the population of interest, as my sampling method was a convenience sample, not a random sample.

The statistic I used for the two-sample z-test was the difference in sample proportions of high blood pressure (high stress minus low stress): 0.3023 - 0.5075 = -0.205.

#sample sizes
n.lowstress<- 134 
n.highstress<- 43  

# counts
highbp.lowstress<- 68
highbp.highstress<- 13

#sample proportions 
p.hat.lowstress <- highbp.lowstress/n.lowstress 
p.hat.highstress <- highbp.highstress/n.highstress 

# difference between the sample proportions (high stress minus low stress)
p.hat.diff<-p.hat.highstress-p.hat.lowstress
cat("difference in sample proportions of people with high blood pressure between high stress and low stress groups is",p.hat.diff)

## difference in sample proportions of people with high blood pressure between high stress and low stress groups is -0.2051371

The data met the validity conditions, as there are at least 10 observations in each of the four cells of the two-way table. I found a standardized statistic of -2.35 and a two-sided p-value of 0.0188, which is less than 0.05. Together these suggest strong evidence against the null hypothesis. The p-value of 0.0188 is the probability of observing a difference in the proportion of residents with high blood pressure between the high-stress and low-stress groups at least as extreme as -0.205 (in either direction), assuming that there is no true difference in the population.
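The meaning of the p-value can also be illustrated by simulation. The sketch below is a shuffle (permutation) test, which is a different technique from the theory-based z-test used in this report: it reassigns the 81 "Yes" and 96 "No" blood pressure results at random to the two stress groups and counts how often the shuffled difference is at least as extreme as the observed one. The seed, the 5000 repetitions, and all object names are choices made for this sketch. Its result should be in the same neighborhood as the theory-based p-value but will not match it exactly.

```r
set.seed(247)  # arbitrary seed for reproducibility

# Rebuild the 177 responses from the table counts
bp     <- rep(c("Yes", "No"), times = c(81, 96))      # 81 high BP, 96 not
stress <- rep(c("High", "Low"), times = c(43, 134))   # group sizes

obs.diff <- 13/43 - 68/134   # observed difference (high minus low stress)

# Shuffle blood pressure labels, breaking any link with stress group
sim.diff <- replicate(5000, {
  shuffled <- sample(bp)
  mean(shuffled[stress == "High"] == "Yes") -
    mean(shuffled[stress == "Low"] == "Yes")
})

# Two-sided: share of shuffles at least as extreme as the observed difference
# (small tolerance guards against floating-point ties)
sim.p.value <- mean(abs(sim.diff) >= abs(obs.diff) - 1e-12)
sim.p.value
```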

#pooled proportion (total successes/combined sample size)
p.hat.pooled<- (highbp.highstress+highbp.lowstress)/(n.lowstress + n.highstress)

# Standard error of the theoretical sampling distribution based on a pooled proportion
SE.diff.null<-sqrt((p.hat.pooled*(1-p.hat.pooled))/n.lowstress 
                   + (p.hat.pooled*(1-p.hat.pooled))/n.highstress)

# Standardized statistic, z
z <- p.hat.diff/SE.diff.null 
cat("standardized statistic z is",round(z,2))

## standardized statistic z is -2.35
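In symbols, the computation above is the pooled two-proportion z statistic. The notation \(\hat{p}\) for sample proportions and \(n\) for group sizes is shorthand introduced here to mirror the R object names, and the numbers are rounded from the output above:

\[z = \frac{\hat{p}_{highstress} - \hat{p}_{lowstress}}{\sqrt{\hat{p}_{pooled}(1-\hat{p}_{pooled})\left(\frac{1}{n_{lowstress}} + \frac{1}{n_{highstress}}\right)}}, \qquad \hat{p}_{pooled} = \frac{13 + 68}{43 + 134} \approx 0.458\]

\[z \approx \frac{-0.205}{\sqrt{0.458 \times 0.542 \times \left(\frac{1}{134} + \frac{1}{43}\right)}} \approx \frac{-0.205}{0.0873} \approx -2.35\]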

#Theory-based test p-value
left.tailed.p.value<-pnorm(z, 0, 1, lower.tail = TRUE) 

two.sided.p.value<-2*left.tailed.p.value
cat("two-sided p-value is",two.sided.p.value)

## two-sided p-value is 0.01880851
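As a cross-check on the hand calculation, base R's `prop.test()` with the continuity correction turned off runs the equivalent chi-squared test, whose statistic is the square of the pooled z statistic. This is offered as a sanity check under that equivalence, not as the method used in this report; the object name `check` is introduced here.

```r
# Counts of high blood pressure and group sizes (high stress first, then low)
check <- prop.test(x = c(13, 68), n = c(43, 134), correct = FALSE)

sqrt(unname(check$statistic))  # absolute value of z, about 2.35
check$p.value                  # two-sided p-value, about 0.0188
```

Because the chi-squared statistic is squared, it loses the sign of z; the negative sign comes from the direction of the observed difference.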

Therefore, I can reject my null hypothesis and conclude that the proportion of residents with high blood pressure is different between the two groups. The 95% confidence interval is (-0.37, -0.04).

# Standard error of the sampling distribution based on individual proportions
SE.diff.CI<-sqrt((p.hat.lowstress*(1-p.hat.lowstress))/n.lowstress 
                   + (p.hat.highstress*(1-p.hat.highstress))/n.highstress)

# margin of error for 95% CI
MoE <- 1.96 * SE.diff.CI
#MoE

LB<-p.hat.diff - MoE # lower limit of 95% CI
UB<-p.hat.diff + MoE # upper limit of 95% CI
round(cbind(LB,UB),2)

##         LB    UB
## [1,] -0.37 -0.04

We are 95% confident that the true difference in proportions of residents with high blood pressure between the high-stress and non-high-stress groups lies between -0.37 and -0.04. Since zero is not included in the confidence interval, the data provide strong evidence of a real difference in proportions. Because the entire interval is negative, the confidence interval supports the same conclusion as the p-value: high-stress residents had a lower rate of high blood pressure.
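The interval arithmetic can be written out as a worked check. The \(\hat{p}\) and \(n\) symbols are shorthand introduced here for the sample proportions and group sizes, and the values are rounded from the R output:

\[(\hat{p}_{highstress} - \hat{p}_{lowstress}) \pm 1.96\sqrt{\frac{\hat{p}_{lowstress}(1-\hat{p}_{lowstress})}{n_{lowstress}} + \frac{\hat{p}_{highstress}(1-\hat{p}_{highstress})}{n_{highstress}}}\]

\[\approx -0.205 \pm 1.96 \times \sqrt{\frac{0.5075 \times 0.4925}{134} + \frac{0.3023 \times 0.6977}{43}} \approx -0.205 \pm 1.96 \times 0.0823 \approx -0.205 \pm 0.161\]

which gives roughly (-0.37, -0.04) after rounding. Unlike the test statistic, the interval uses each group's own sample proportion in the standard error rather than the pooled proportion, because it does not assume the null hypothesis is true.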

Conclusion

The data did not support my original expectation that high perceived stress would be associated with higher rates of high blood pressure, as high blood pressure was less common among high-stress residents (30.2%) than among low-stress residents (50.7%). The p-value of 0.0188 and a confidence interval that does not include zero indicate a statistically significant difference between the two stress groups, with high-stress villagers in my sample less likely to have high blood pressure. I did not expect these results, especially given the previous literature suggesting a positive association between chronic stress and hypertension. Other variables could be influencing this outcome, like age, exercise behaviors, and current health issues, as could the fact that I didn’t ask a question that properly addresses chronic stress. If I were to redo this study, I would consider controlling for potential confounding variables such as age, diet, or health habits in order to better isolate the relationship between stress and blood pressure. I would also look at continuous blood pressure values, most likely just systolic, treating it as a quantitative variable to analyze the data with more nuance. A future study could focus on whether the relationship between stress and blood pressure varies by age group or gender. It is not reasonable to generalize my results to the full Island population, as I did not use random sampling.

Bibliography

F. Sparrenberger, F. T. Cichelero, A. M. Ascoli, F. P. Fonseca, G. Weiss, O. Berwanger, S. C. Fuchs, L. B. Moreira, & F. D. Fuchs, Does psychosocial stress cause hypertension? A systematic review of observational studies, Journal of Human Hypertension, Volume 23(1), 2009, Pages 12–19, (https://doi.org/10.1038/jhh.2008.74)

L. G. Hudzinski, E. D. Frohlich, & R. D. Holloway, Hypertension and stress, Clinical Cardiology, Volume 11(9), 1988, Pages 622–626, (https://doi.org/10.1002/clc.4960110906)