library(ggplot2)
library(dplyr)
library(knitr)
load("brfss2013.RData")
df = brfss2013

The Behavioral Risk Factor Surveillance System (BRFSS) is “a collaborative project between all of the states in the United States (US) and participating US territories and the Centers for Disease Control and Prevention (CDC).”
According to the overview text, “The BRFSS objective is to collect uniform, state-specific data on preventive health practices and risk behaviors that are linked to chronic diseases, injuries, and preventable infectious diseases that affect the adult population.”
The data are collected through landline and cellular telephone surveys. In conducting the BRFSS landline telephone survey, interviewers collect data from a randomly selected adult in a household. In conducting the cellular telephone version of the BRFSS questionnaire, interviewers collect data from an adult who participates by using a cellular telephone and resides in a private residence or college housing.
The data are transmitted to the CDC for editing, processing, weighting, and analysis. An edited and weighted data file is provided to each participating health department for each year of data collection, and summary reports of state-specific data are prepared by the CDC.
Research question 1: How does fruit consumption relate to self-reported health status?
In this question we try to understand how fruit consumption varies with the health status declared by respondents. Our hypothesis is that people with higher weekly consumption of fruits, juices, and vegetables report excellent or good health status.
Research question 2: How does juice consumption relate to self-reported health status?
In this question we try to understand how consumption of natural juices varies with the health status declared by respondents. Our hypothesis is that people with higher weekly consumption of fruits, juices, and vegetables report excellent or good health status.
Research question 3: How does vegetable consumption relate to self-reported health status?
In this question we try to understand how vegetable consumption varies with the health status declared by respondents. Our hypothesis is that people with higher weekly consumption of fruits, juices, and vegetables report excellent or good health status.
Research question 1/2/3:
Below we analyze four variables: genhlth, fruitju1, fvgreen, and vegetab1. For each variable we check its levels and the proportion of responses in each one. After that, we select the values that encode weekly frequency and, finally, compare health status with the foods analyzed.
Percentage = prop.table(table(df$genhlth))
Frequencies = summary(df$genhlth)
Frequencies = Frequencies[1:5]
genhlth_df = data.frame(Frequencies, Percentage)
genhlth_df = genhlth_df[,-2]
kable(genhlth_df, caption = "Health Status Table")

| | Frequencies | Freq |
|---|---|---|
| Excellent | 85482 | 0.1745279 |
| Very good | 159076 | 0.3247841 |
| Good | 150555 | 0.3073868 |
| Fair | 66726 | 0.1362339 |
| Poor | 27951 | 0.0570673 |
Note that most respondents (about 81%) report Excellent, Very good, or Good health status.
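The frequency table above can also be built in one step, so that no column has to be dropped afterwards. The following is only a minimal sketch on a small made-up factor; the `health` vector and the `status_table` name are illustrative assumptions, not part of the BRFSS data.

```r
# Toy stand-in for df$genhlth (hypothetical values, not BRFSS data)
health = factor(
  c("Excellent", "Good", "Good", "Fair", NA, "Very good", "Good"),
  levels = c("Excellent", "Very good", "Good", "Fair", "Poor")
)

counts = table(health)  # table() drops NA values by default

# One row per health status, with counts and proportions side by side
status_table = data.frame(
  Frequencies = as.vector(counts),
  Proportion  = as.vector(prop.table(counts)),
  row.names   = names(counts)
)
print(status_table)
```

Because both columns come from the same `table()` call, the counts and proportions are guaranteed to exclude missing responses in the same way.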
We selected the values between 201 and 299 in the fruitju1 variable. These values encode how many times per week the respondent drank fruit juice; subtracting 200 gives the weekly count.
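To make the coding concrete, here is a small sketch of the conversion from code to weekly count. The helper name `weekly_times` and the example codes are hypothetical; the only rule assumed is the one stated above, namely that codes 201 to 299 mean times per week. Any other code is returned as NA rather than interpreted.

```r
# Hypothetical helper: convert weekly-frequency codes to times per week.
# Assumes codes 201-299 mean "times per week"; everything else becomes NA.
weekly_times = function(code) {
  ifelse(!is.na(code) & code >= 201 & code <= 299, code - 200, NA)
}

codes = c(201, 207, 299, 101, 555, NA)  # made-up example codes
print(weekly_times(codes))
```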
### Juice Excellent
prop.fruits.ex = df %>%
  filter(fruitju1 <= 299 & fruitju1 >= 201) %>%
  filter(genhlth == "Excellent")
### Juice Very Good
prop.fruits.vergood = df %>%
  filter(fruitju1 <= 299 & fruitju1 >= 201) %>%
  filter(genhlth == "Very good")
### Juice Good
prop.fruits.good = df %>%
  filter(fruitju1 <= 299 & fruitju1 >= 201) %>%
  filter(genhlth == "Good")
### Juice Fair
prop.fruits.fair = df %>%
  filter(fruitju1 <= 299 & fruitju1 >= 201) %>%
  filter(genhlth == "Fair")
### Juice Poor
prop.fruits.poor = df %>%
  filter(fruitju1 <= 299 & fruitju1 >= 201) %>%
  filter(genhlth == "Poor")

We found that the different health status groups have similarly shaped distributions. Below we check the mean and median of each sample.
juice.ex = prop.fruits.ex$fruitju1
juice.ver = prop.fruits.vergood$fruitju1
juice.goo = prop.fruits.good$fruitju1
juice.fai = prop.fruits.fair$fruitju1
juice.poo = prop.fruits.poor$fruitju1
# Convert weekly codes (201-299) to times per week, then summarise
summaryfun = function(var) {
  a = (var) - 200
  summary(a)
}
jex = summaryfun(juice.ex)
jve = summaryfun(juice.ver)
jgo = summaryfun(juice.goo)
jfa = summaryfun(juice.fai)
jpo = summaryfun(juice.poo)
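status.names = c("Excellent", "Very Good", "Good", "Fair", "Poor")
# Plot the weekly counts themselves (code minus 200), not the six-number summaries
boxplot(juice.ex - 200, juice.ver - 200, juice.goo - 200, juice.fai - 200, juice.poo - 200, outline = FALSE, names = status.names, main = "Boxplot of Weekly Fruit Juice Consumption by Health Status", ylab = "Times per week")

Using the boxplot, we noted that typical weekly juice consumption is about the same across all health status classes. We therefore believe that consumption is not correlated with health status, and because BRFSS is an observational survey, these data could not establish causality in any case.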
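The five per-status filters and summaries used above can also be written as one grouped pipeline, which keeps the weekly-code filter in a single place. This is a sketch under stated assumptions rather than the analysis itself: the `toy` data frame is made up, and the names `times_per_week`, `mean_times`, `median_times`, and `juice_summary` are illustrative.

```r
library(dplyr)

# Made-up stand-in for the two BRFSS columns used above (not real data)
toy = data.frame(
  genhlth  = factor(c("Excellent", "Excellent", "Good", "Good", "Poor", "Poor", "Good"),
                    levels = c("Excellent", "Very good", "Good", "Fair", "Poor")),
  fruitju1 = c(201, 203, 202, 207, 201, 555, 101)
)

juice_summary = toy %>%
  filter(between(fruitju1, 201, 299), !is.na(genhlth)) %>%  # weekly codes only
  mutate(times_per_week = fruitju1 - 200) %>%               # code -> weekly count
  group_by(genhlth) %>%
  summarise(n = n(),
            mean_times = mean(times_per_week),
            median_times = median(times_per_week))

print(juice_summary)
```

On the real data, `toy` would be replaced by `df`. The same filtered and mutated data frame could also feed a formula-based plot such as `boxplot(times_per_week ~ genhlth, data = ...)`, which draws one box per health status without building five separate vectors.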