install.packages("NHANES")
install.packages("dplyr")
library(NHANES)

Loading the data

  1. Calling install.packages() inside the document might interfere with knitting, depending on the settings, and it reinstalls the packages every time the document is knit.
  2. This is survey data collected by the US National Center for Health Statistics (NCHS), which has conducted a series of health and nutrition surveys since the early 1960s. Since 1999, approximately 5,000 individuals of all ages have been interviewed in their homes every year and have completed the health examination component of the survey. The health examination is conducted in a mobile examination centre (MEC).
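
The reinstall issue from point 1 can be avoided by installing only the packages that are missing. This is a minimal sketch, not the only way to do it: missing_pkgs() is a hypothetical helper name (not part of any package), and restricting installation to interactive sessions is an assumption about the desired behaviour.

```r
# Hypothetical helper: returns the entries of `pkgs` that are not installed yet.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

to_install <- missing_pkgs(c("NHANES", "dplyr"))

# Assumption: install only when someone is at the console, never in a
# non-interactive run of the document.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

The interactive() guard is a design choice rather than a requirement: it keeps install.packages() from running in non-interactive sessions, at the cost of library() failing there if a package really is missing.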

Viewing the data

  1. Interactive functions such as View() might return an error when the document is knit, which is why it is important to set the chunk option eval=FALSE for them.
View(NHANES)

Summarizing the data

  1. Weight has 78 NAs
  2. Height has 353 NAs
summary(NHANES)
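
The NA counts can also be read off directly instead of scanning the summary() output. A small sketch on a made-up data frame (`toy` and its values are illustrative assumptions, not NHANES data):

```r
# Made-up stand-in for the Weight and Height columns.
toy <- data.frame(
  Weight = c(70, NA, 82.5, 64),
  Height = c(NA, 165, NA, 158)
)

# is.na() gives a logical matrix; colSums() counts the TRUE values per column.
na_counts <- colSums(is.na(toy))
na_counts
```

Applied to the real data, the same idea would be colSums(is.na(NHANES[, c("Weight", "Height")])).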

Writing a function + Debugging with the Browser Command

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
nhanesmini <- select(NHANES, Weight, Height)
# df[5,1]<-NA
# df[3,2]<-NA
# i<-1:10
#the work of the function

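# Difference between the mean of column 1 and the mean of column 2,
# computed on rows i after dropping every row that contains an NA.
diffmeans <- function(df, i) {
  # browser()  # uncomment to pause here and inspect df and i
  subs <- dplyr::slice(df, i) |>
    na.omit()
  output <- mean(dplyr::pull(subs, 1)) - mean(dplyr::pull(subs, 2))
  output
}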
diffmeans(df=nhanesmini, i=1:10)
## [1] -87.37

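After replacing two values with NA (df[5,1] and df[3,2]), the difference in means changes, because na.omit() drops every row that contains an NA.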
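
The effect of missing values on the result can be reproduced on a small made-up data frame. This sketch uses a base R stand-in for the function (diffmeans_base() is a hypothetical name, and the numbers in `toy` are invented for illustration):

```r
# Base R stand-in for diffmeans(); assumes columns 1 and 2 are numeric.
diffmeans_base <- function(df, i) {
  subs <- na.omit(df[i, , drop = FALSE])  # drops any row with an NA in either column
  mean(subs[[1]]) - mean(subs[[2]])
}

toy <- data.frame(
  Weight = c(60, 70, 80, 100),
  Height = c(150, 170, 180, 200)
)
before <- diffmeans_base(toy, 1:4)  # 77.5 - 175 = -97.5

toy[4, 1] <- NA  # a missing Weight removes the whole fourth row
toy[3, 2] <- NA  # a missing Height removes the whole third row
after <- diffmeans_base(toy, 1:4)   # rows 1 and 2 remain: 65 - 160 = -95
```

Because na.omit() works row by row, a single NA also discards the non-missing value sitting next to it, which is why both means shift.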

#diffmeans(df=nhanesmini,i=1:10)
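  1. Inside browser(), the Environment pane lists df and subs under Data and i under Values. There are far fewer variables than in my global environment, perhaps because the debugger focuses on the variables used by the function in which the error occurs.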
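
One way to keep the browser() call in the function without commenting it in and out is a debug switch. This is only a sketch: diffmeans_debug() and its `debug` argument are hypothetical names, and the body uses base R in place of the dplyr calls.

```r
# Hypothetical variant: pauses in the debugger only when asked to,
# and only in an interactive session.
diffmeans_debug <- function(df, i, debug = FALSE) {
  if (debug && interactive()) {
    browser()  # at the Browse> prompt, ls() lists the local objects (df, i, debug)
  }
  subs <- na.omit(df[i, , drop = FALSE])
  mean(subs[[1]]) - mean(subs[[2]])
}

toy <- data.frame(a = c(1, 2, 3), b = c(2, 4, 6))
result <- diffmeans_debug(toy, 1:3)  # 2 - 4 = -2
```

Calling diffmeans_debug(toy, 1:3, debug = TRUE) from the console would stop inside the function, where only its local objects are visible.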

Writing another function

Create a data frame of male and female weights from two vectors of unequal length, padding the shorter vector with NA.

nhanesmini2 <- select(NHANES, Weight, Gender)

nhanesweightmales <- subset(nhanesmini2, Gender == "male", select = Weight)

nhanesweightfemales <- subset(nhanesmini2, Gender == "female", select = Weight)

Weightmale <- pull(nhanesweightmales, Weight)
Weightfemale <- pull(nhanesweightfemales, Weight)
# Set the male vector to the female length; if it is shorter, the extra slots are filled with NA
length(Weightmale) <- length(Weightfemale)
df2 <- data.frame(Weightmale = Weightmale, Weightfemale = Weightfemale)
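
The padding step relies on the fact that assigning a larger length to a vector fills the new positions with NA. A small sketch with invented numbers (`males`, `females` and `padded` are illustrative names):

```r
males   <- c(80, 92, 75)
females <- c(61, 70, 66, 58, 73)

# Pad both vectors to the longer length, so the order of the two does not matter.
n <- max(length(males), length(females))
length(males)   <- n  # becomes 80, 92, 75, NA, NA
length(females) <- n  # already length 5, unchanged

padded <- data.frame(Weightmale = males, Weightfemale = females)
```

Using max() for both vectors avoids a pitfall of assigning one vector's length to the other: if the first vector were the longer one, it would be truncated rather than padded.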

Rows that will be used in the calculation: all rows

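i2 <- 1:5020
# Same calculation as diffmeans(); na.omit() drops every row where either weight is NA
diffmeans2 <- function(df2, i2) {
  subs2 <- dplyr::slice(df2, i2) |>
    na.omit()
  output <- mean(pull(subs2, 1)) - mean(pull(subs2, 2))
  output
}
diffmeans2(df2, i2)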
## [1] 9.62244
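
Because the two columns hold unrelated people, dropping whole rows with na.omit() also discards valid weights that happen to sit next to an NA. A possible alternative, shown here on invented numbers and not used for the result above, is to take each mean separately with na.rm = TRUE:

```r
padded <- data.frame(
  Weightmale   = c(80, 92, 75, NA, NA),
  Weightfemale = c(61, 70, 66, 58, 73)
)

# Row-wise removal: only the first three rows survive for both columns.
complete <- na.omit(padded)
rowwise <- mean(complete$Weightmale) - mean(complete$Weightfemale)

# Column-wise removal: each mean uses every non-missing value in its own column.
colwise <- mean(padded$Weightmale, na.rm = TRUE) -
  mean(padded$Weightfemale, na.rm = TRUE)
```

The two approaches give different answers here because the row-wise version ignores the last two female weights.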