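library(readr)
library(mosaic)  # assumed source of stat(), pval(), confint(); attaches lattice for bwplot()
# Load the measurements; the analysis below uses the columns smoker and uptake
Smoker_Data <- read_csv("~/MINI PROJECT DATA - Sheet1-3.csv")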


Introduction

The human body’s ability to use oxygen efficiently during physical activity is a critical determinant of overall health. Smoking is a known risk factor for various health complications, including respiratory diseases and cardiovascular disorders. This study examines its impact on one important physiological parameter, the maximum oxygen uptake.

The research question is: what is the difference in maximum oxygen uptake between smokers and non-smokers? The parameter of interest is the difference in mean maximum oxygen uptake between the two groups. Maximum oxygen uptake is measured in mL/kg/min, that is, milliliters of oxygen per kilogram of body weight per minute.

Many studies have examined the effect of smoking on the lungs and on lung capacity. The two that inspired this study were “Effects of smoking and smoking cessation on longitudinal decline in pulmonary function” and “Effects of Smoking on Chest Expansion, Lung Function, and Respiratory Muscle Strength of Youths”. Both investigated the effect of smoking on the lungs by measuring variables such as lung capacity, chest diameter, and respiratory muscle strength. Before conducting the study, I expected non-smokers to have a higher maximum oxygen uptake than smokers.

Data Collection Methods

The observational units in this study were Islanders, specifically adult males between 20 and 53 years old. In total, 100 people were observed: 50 non-smokers and 50 smokers. Each city on the Islands has its own hall, where the birth records of people born in that city can be viewed. I went through the city’s birth records in the order in which people are listed, used the year of birth to check that each person fell within the study’s age range, and then checked each person’s health history to decide whether he was eligible. Only Islanders with no history of diseases that might affect oxygen uptake were selected, so that, as far as possible, smoking was the only factor expected to influence their maximum oxygen uptake. Before testing, subjects did not do any physical activity and were kept isolated from outside factors that could distort the measurement. Because subjects were taken in the order in which they appear in the birth records, no randomness was involved in the selection.

The subjects who satisfied all study conditions were placed into one of two groups, smokers or non-smokers, and each completed a task called “Oxygen Uptake Submaximum”. The result of this task is measured in milliliters of oxygen per kilogram of body weight per minute. After the measurement, each subject’s result was recorded under the respective group in the results table. Names were not recorded, as the study is anonymous and requires no information beyond smoking status and oxygen uptake.

In addition, 13 people declined to participate. Since 100 people took part, 113 people were approached in total, which gives a refusal rate of 13/113, or about 11.5%.

Descriptive Statistics

This study had two variables. The first is categorical: whether a person is a smoker or a non-smoker. The second is quantitative: maximum oxygen uptake, measured in mL/kg/min.

\(n_{\text{smokers}} = n_{\text{non-smokers}} = 50\)

\(\bar{x}_{\text{smokers}} = 62.182\) and \(\bar{x}_{\text{non-smokers}} = 58.178\)

\(SD_{\text{smokers}} = 14.17425\) and \(SD_{\text{non-smokers}} = 13.44304\)
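
The group sizes, means, and standard deviations above can be obtained with a few lines of base R. The sketch below is only an illustration: `group_summary` is a hypothetical helper name, the group labels "no" and "yes" are assumptions about how the `smoker` column is coded, and the toy values are invented for the demonstration and are not the study's measurements.

```r
# Hypothetical helper: per-group n, mean, and SD for a data frame that has
# a grouping column `smoker` and a numeric column `uptake`.
group_summary <- function(df) {
  groups <- split(df$uptake, df$smoker)
  data.frame(
    group = names(groups),
    n     = vapply(groups, length, integer(1)),
    mean  = vapply(groups, mean, numeric(1)),
    sd    = vapply(groups, sd, numeric(1)),
    row.names = NULL
  )
}

# Toy values for illustration only; these are NOT the study's data.
toy <- data.frame(
  smoker = c("no", "no", "no", "yes", "yes", "yes"),
  uptake = c(50, 60, 70, 55, 65, 75)
)
print(group_summary(toy))
```

Applied to the real data, the same call would be `group_summary(Smoker_Data)`, provided the column names match those used in the formulas of this report.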

The side-by-side boxplots are shown below:

bwplot(smoker ~ uptake, 
       horizontal = TRUE, 
       main="Side-by-side boxplots",
       data = Smoker_Data)

The boxplots show that the overall range of values is similar in both groups. The medians, however, differ noticeably, and because neither median sits near the center of its box, the data appear somewhat skewed in both groups.

Analysis of Results

The parameter of interest is the difference in mean maximum oxygen uptake between the two groups, smokers and non-smokers.

The null hypothesis states that the population means are equal, \(H_0 : \mu_{\text{non-smokers}} - \mu_{\text{smokers}} = 0\). The alternative hypothesis states that the population mean of non-smokers is higher than that of smokers, \(H_A : \mu_{\text{non-smokers}} - \mu_{\text{smokers}} > 0\).

In this setting, a type I error would mean rejecting the null hypothesis when it is true, that is, concluding that smokers and non-smokers differ in mean maximum oxygen uptake when they actually do not. A type II error would mean failing to reject the null hypothesis when it is false, that is, failing to detect a difference in mean maximum oxygen uptake that actually exists.

As mentioned earlier, the data collection method was not random, so we cannot consider this sample representative.

The appropriate standardized statistic is the t-statistic, and its value is -1.449309. The validity conditions are satisfied in this study, as there are at least 20 subjects in each group and the data are not strongly skewed.

stat(t.test(uptake ~ smoker, data = Smoker_Data))
##         t 
## -1.449309
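
As a check, the reported t-statistic can be reproduced by hand from the descriptive statistics. The calculation below assumes the unpooled (Welch) standard error, which is the default in R's `t.test`, and assumes that the difference is taken as non-smokers minus smokers, which is consistent with the negative sign of the reported value.

\[
t = \frac{\bar{x}_{\text{non-smokers}} - \bar{x}_{\text{smokers}}}{\sqrt{\dfrac{SD_{\text{non-smokers}}^2}{n_{\text{non-smokers}}} + \dfrac{SD_{\text{smokers}}^2}{n_{\text{smokers}}}}}
= \frac{58.178 - 62.182}{\sqrt{\dfrac{13.44304^2}{50} + \dfrac{14.17425^2}{50}}}
\approx \frac{-4.004}{2.7627} \approx -1.449
\]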

The two-sided p-value of 0.1504528 is the probability of obtaining a t-statistic at least as extreme as the one actually observed, under the assumption that the null hypothesis is true.

two.sided.p.value <- pval(t.test(uptake ~ smoker, data = Smoker_Data))
cat(two.sided.p.value)
## 0.1504528

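The observed p-value of 0.1504528 is large, so we do not have evidence against the null hypothesis. Because the observed difference lies in the opposite direction from the one stated in the alternative hypothesis, a one-sided p-value would be even larger.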

Overall, the data provide no evidence against the null hypothesis. The observed difference even runs opposite to the predicted direction, since the sample mean of smokers is higher than the sample mean of non-smokers, so the data offer no support for the alternative hypothesis. Failing to reject the null hypothesis does not prove that it is true either.

confint(t.test(uptake ~ smoker, data = Smoker_Data))

We are 95% confident that the difference in population means (non-smokers minus smokers) lies between -9.486672 and 1.478672 mL/kg/min. In other words, the mean for non-smokers could plausibly be anywhere from 9.486672 lower to 1.478672 higher than the mean for smokers. Because 0 is included in the confidence interval, the null value of no difference is plausible. The confidence interval therefore leads to the same conclusion: we do not have evidence against the null hypothesis.
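
The test statistic, p-value, and confidence interval can also be recomputed from the summary statistics alone, without the raw data. The following is a minimal sketch that assumes the Welch (unequal variance) procedure and the sign convention non-smokers minus smokers; all variable names are illustrative and do not come from the original analysis. It also computes the one-sided p-value that corresponds to the stated alternative hypothesis.

```r
# Summary statistics as reported in the Descriptive Statistics section.
n_s  <- 50; mean_s  <- 62.182; sd_s  <- 14.17425   # smokers
n_ns <- 50; mean_ns <- 58.178; sd_ns <- 13.44304   # non-smokers

# Welch two-sample t statistic for (non-smokers - smokers).
# The direction of the difference is an assumption chosen to match the report.
v_s    <- sd_s^2 / n_s
v_ns   <- sd_ns^2 / n_ns
se     <- sqrt(v_s + v_ns)
t_stat <- (mean_ns - mean_s) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (v_s + v_ns)^2 / (v_s^2 / (n_s - 1) + v_ns^2 / (n_ns - 1))

# Two-sided p-value, and the one-sided p-value for
# H_A: mu_non-smokers > mu_smokers (upper tail).
p_two_sided <- 2 * pt(-abs(t_stat), df)
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

# 95% confidence interval for the difference in means.
ci <- (mean_ns - mean_s) + c(-1, 1) * qt(0.975, df) * se

print(c(t = t_stat, df = df,
        p_two_sided = p_two_sided, p_one_sided = p_one_sided))
print(ci)
```

Because the observed difference is negative while the alternative hypothesis predicts a positive one, the one-sided p-value equals one minus half of the two-sided p-value, which reinforces the conclusion that the data give no support for the alternative.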

Conclusion

The study produced unexpected results regarding the relationship between smoking and maximum oxygen uptake: in the sample, smokers had a higher mean than non-smokers, although the difference was not statistically significant. These results should be interpreted with caution, as I am not sure how accurately the Islanders are modeled on real humans, so the results need not be realistic.

Because subjects were selected by going through birth records in order, the study involved no randomness, and its results cannot be generalized to a larger population.

Moving forward, I would replicate the study with a larger sample in order to obtain more precise estimates. Air quality is also a major contributor to overall lung health. Selecting people from the same city helps ensure that subjects share the same background with respect to the quality of the air they inhale. Even a person who smokes inhales far more air than smoke, so air quality is a significant contributor to that person’s lung health.

Additionally, a follow-up question suggested by this study is how maximum oxygen uptake compares between male and female Islanders, since physiological differences between the sexes might produce clearer results.

Bibliography

C. M. Burchfiel, E. B. Marcus, J. D. Curb, C. J. Maclean, W. M. Vollmer, L. R. Johnson, K. O. Fong, B. L. Rodriguez, K. H. Masaki, A. S. Buist. Effects of smoking and smoking cessation on longitudinal decline in pulmonary function. DOI: 10.1164/ajrccm.151.6.7767520

A. Tantisuwat, P. Thaveeratitham. Effects of Smoking on Chest Expansion, Lung Function, and Respiratory Muscle Strength of Youths. PMCID: PMC3944281, PMID: 24648624, DOI: 10.1589/jpts.26.167