Last Run: 2021-09-14

Task(s):

Recode HHQ variable response entries.
- TASK: Finish recoding recent illness (resolved factor), hospitalization, and surgery status.
Aggregate/merge processed medication datasets.
- Review manually extracted vs. USP classified beta blocking variables.
Aggregate/merge w. basic demographics.
Write out datasets.
Import Other RedCap Questionnaires.
- CIRS Data (for aggregated corplot w. medications)
- Hru Data (for aggregated corplot w. medications, etc.)
- MacArthur SES data, ses_ (edu prioritized, to compare with HHQ edu vars)
- VO2 Test Data
- model VO2 of former v. current smokers, alternative v. cigs only, etc.
- compare w. usp and manually generated beta blocker vars
- check expected relations (Rest HR diff, beta.b, PeakVO2/kg, cardiovascular medications, etc.)
Look for basic correlations pertaining to smoking, alcohol usage, and general demographics using health history questionnaire.
- e.g., How does smoking history relate to the other measures of health history and demographics? What about alcohol intake?

Keywords: United States Pharmacopeia–National Formulary (USP–NF),

Health History Questionnaire (HHQ)

Import Data

/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/HHQ

# Packages used below: %>% / if_else, str_detect, %$%, ctable, ggplot.
library(dplyr)
library(stringr)
library(magrittr)
library(summarytools)
library(ggplot2)

# `pattern` is a regular expression, not a shell glob, so anchor it with ^ rather than ending it in *.
HHQ_dir<-"/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/HHQ/"
setwd(HHQ_dir)
FILE<-list.files(HHQ_dir, pattern="^IGNITEDatabase-IGNITEBaselineHealth")

data<-paste(HHQ_dir, FILE, sep="")
HHQraw.df<-read.csv(data,
                    stringsAsFactors = FALSE,
                    na.strings = c("-99991","-99992","-99993","-99994", "-99995",
               "-99996","-99997","-99998","-99999", "-9999910",
               "-999992","----", "---",  "n/a", "Unknown", ""))

HHQraw.df$Notes<-""
Randomized_demos<-HHQraw.df[,(46:52)]

Creating Summary Variables...

HHQ_Mean.Score - Mean HHQ pain score (pain factors 1:13).
HHQ_Sum.Score - Summation of HHQ pain factors 1:13
HHQ_Health_Status.Factor - Ordered categorical factor (summation of recent event factors)

HHQraw.df<-HHQraw.df %>% mutate(HHQ_Mean.Score=rowMeans(HHQraw.df[,7:20], na.rm = TRUE))
HHQraw.df<-HHQraw.df %>% mutate(HHQ_Sum.Score=rowSums(HHQraw.df[,7:20], na.rm = TRUE))
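
One caveat with `na.rm = TRUE`: a participant with no answered items gets `NaN` from `rowMeans()` but `0` from `rowSums()`, and that `0` cannot be told apart from a true zero. A minimal base-R sketch on toy data (the `item1` to `item3` columns are hypothetical, not HHQ fields) that sets such rows back to `NA`:

```r
# Toy data: the third row has no answered items.
toy <- data.frame(item1 = c(1, 2, NA),
                  item2 = c(3, NA, NA),
                  item3 = c(5, 4, NA))

n_answered <- rowSums(!is.na(toy))          # items answered per row
mean_score <- rowMeans(toy, na.rm = TRUE)   # NaN when nothing was answered
sum_score  <- rowSums(toy, na.rm = TRUE)    # 0 when nothing was answered

# Treat all-missing rows as missing rather than NaN / 0.
mean_score[n_answered == 0] <- NA
sum_score[n_answered == 0]  <- NA
```

Whether all-missing rows should be `NA` or dropped is an analysis decision; the sketch only makes them visible.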

HHQ Caffeine Consumption -

history_cafe_history - 25). How many 8oz. cups of regular coffee do you have daily?
history_tea_history - 26). How many 8oz. cups of tea do you have daily?
history_soft_history - 27). How many 8oz. caffeinated soft drinks do you have daily?

Entered as Missing Data...

20045

Recode entries reported as...

1. Missing/Performed test incorrectly -

Response includes "decaf"...

Melted.Caffeine.df$Notes<-if_else(str_detect(Melted.Caffeine.df$value,"decaf"), "*decaf*, recoded as 0",as.character(Melted.Caffeine.df$Notes))

Melted.Caffeine.df$value<-if_else(str_detect(Melted.Caffeine.df$value,"decaf"), "0",as.character(Melted.Caffeine.df$value))

"1 in winter"

Melted.Caffeine.df$Notes<-if_else(str_detect(Melted.Caffeine.df$value,"1 in winter"), "==1 in winter, recoded as NA",as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(str_detect(Melted.Caffeine.df$value,"1 in winter"), "NA",as.character(Melted.Caffeine.df$value))

"occasionally"

Melted.Caffeine.df$Notes<-if_else(str_detect(Melted.Caffeine.df$value,"occasionally"), "occasionally, recoded as NA",as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(str_detect(Melted.Caffeine.df$value,"occasionally"), "NA",as.character(Melted.Caffeine.df$value))

"---"

Melted.Caffeine.df$Notes<-if_else(str_detect(Melted.Caffeine.df$value,"---"), "---, recoded as NA",as.character(Melted.Caffeine.df$Notes))

Melted.Caffeine.df$value<-if_else(str_detect(Melted.Caffeine.df$value,"---"), "NA",as.character(Melted.Caffeine.df$value))
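
Note that `str_detect()` returns `NA` for a missing `value`, so `if_else()` also writes `NA` into `Notes` for those rows. A small base-R sketch of the same flag-then-recode pattern with an explicit missing-value guard (the `value` and `notes` vectors here are toy stand-ins, not the melted data frame):

```r
value <- c("2", "1 decaf", NA, "occasionally")
notes <- rep("", length(value))

# Flag first, then recode, so the note records what the entry used to be.
is_decaf <- !is.na(value) & grepl("decaf", value, fixed = TRUE)
notes[is_decaf] <- "*decaf*, recoded as 0"
value[is_decaf] <- "0"

is_vague <- !is.na(value) & grepl("occasionally", value, fixed = TRUE)
notes[is_vague] <- "occasionally, recoded as NA"
value[is_vague] <- NA_character_
```

Rows that were already missing keep an empty note instead of picking up `NA`.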

2. Translated as dates (not included in codebook...)

Entries translated as dates

e.g.)
* "6-Feb"=="2-6"=>"4"
* "5-Apr"=="4-5"=>"4.5"
* "6-Apr"=="4-6"=>"5"
* "6-May"=="5-6"=>"5.5"
* "2-Jan"=="1-2"=>"1.5"
* "3-Feb"=="2-3"=>"2.5"
* "9-Aug"=="8-9"=>"8.5"
* "7-Jun"=="6-7"=>"6.5"
* "12-Oct"=="10-12"=>"11"
* "4-Feb"=="2-4"=>"3"
* "5-Mar"=="3-5"=>"4"
* "10-Jul"=="7-10"=>"8.5"

# "5-Feb" is the range 2-5; its midpoint is 3.5
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("5-Feb") , "3.5",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("5-Apr") , "4.5",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("6-Feb") , "4",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("6-Apr") , "5",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("2-Jan") , "1.5",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("3-Feb") , "2.5",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("6-May") , "5.5",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("7-Jun") , "6.5",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("9-Aug") , "8.5",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("4-Mar") , "3.5",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("12-Oct") , "11",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("4-Feb") , "3",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("5-Mar") , "4",as.character(Melted.Caffeine.df$value))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==("10-Jul") , "8.5",as.character(Melted.Caffeine.df$value))
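
The date-translated entries all follow one rule: the number before the dash and the month number are the two ends of the original range. A minimal sketch of a general helper, assuming every such entry has the form "D-Mon" with an English three-letter month (`undo_excel_date` is a hypothetical name, not part of the pipeline):

```r
# Hypothetical helper: turn "6-Feb" back into the range 2-6 and return its midpoint.
undo_excel_date <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)[[1]]
  day   <- suppressWarnings(as.numeric(parts[1]))  # number before the dash
  month <- match(parts[2], month.abb)              # "Feb" -> 2
  if (length(parts) != 2 || is.na(day) || is.na(month)) return(NA_real_)
  (day + month) / 2                                # midpoint of the range
}

vapply(c("6-Feb", "5-Apr", "9-Aug", "12-Oct", "10-Jul"),
       undo_excel_date, numeric(1))
```

Entries that are not date-like return `NA`, so the helper can be run over a whole column and applied only where it produces a number.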

3. Subjective estimates/ranges...

Typed Text - 1/week, 2/week, etc.

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="1/week", "1/7, recoded as 0.14285",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="1/week", "0.14285",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="2/week", "2/7, recoded as 0.2857",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="2/week", "0.2857143",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="4/week", "4/7, recoded as 0.5714",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="4/week", "0.5714",as.character(Melted.Caffeine.df$value))

Typed Text - 0-1/wk, 1-2/wk, etc.

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="0-1/wk", "0.5/7, recoded as 0.07142857",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="0-1/wk", "0.07142857",
                                  as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="1-2/wk", "1.5/7, recoded as 0.2142857",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="1-2/wk", "0.2142857",
                                  as.character(Melted.Caffeine.df$value))


Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="1-2/week", "1.5/7, recoded as 0.2142857",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="1-2/week", "0.2142857",
                                  as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="2-3/weeek", "2.5/7, recoded as 0.3571",
                                  as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="2-3/weeek", "0.3571",as.character(Melted.Caffeine.df$value))
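
The per-week entries above are converted to daily cups by taking the midpoint of any range and dividing by 7. A minimal sketch of that arithmetic as a single helper, assuming entries look like "N/week" or "A-B/wk" (`per_week_to_daily` is a hypothetical name; misspelled units such as "weeek" would still need their own rule):

```r
# Hypothetical helper: "2/week" -> 2/7, "1-2/wk" -> 1.5/7, anything else -> NA.
per_week_to_daily <- function(x) {
  if (is.na(x)) return(NA_real_)
  pattern <- "^ *([0-9.]+)(-([0-9.]+))? */ *(wk|week) *$"
  m <- regmatches(x, regexec(pattern, x))[[1]]
  if (length(m) == 0) return(NA_real_)     # not a per-week entry
  low  <- as.numeric(m[2])
  high <- if (nzchar(m[4])) as.numeric(m[4]) else low
  mean(c(low, high)) / 7                   # midpoint of the range, per day
}

round(per_week_to_daily("1-2/wk"), 7)
```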

Typed Text - 0-1, 0-2, etc.

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="0-2", "range, recoded as avg",as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="0-2", "1",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="0-1", "range, recoded as avg",as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="0-1", "0.5",as.character(Melted.Caffeine.df$value))

Typed Text - "24 oz 1 or 2x daily"

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="24 oz  1 or 2x daily"  , "recoded as 1.5*3" ,as.character(Melted.Caffeine.df$Notes))

Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="24 oz  1 or 2x daily", "4.50" ,as.character(Melted.Caffeine.df$value))

Typed Text - "1, about 3x/week"

Three_per_wk<-3/7
Three_per_wk<-as.character(Three_per_wk)
Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="1, about 3x/week"  , "subjective range, 3/7"   ,as.character(Melted.Caffeine.df$Notes))

Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="1, about 3x/week", Three_per_wk ,as.character(Melted.Caffeine.df$value))

Typed Text - "2 -16 oz (so 4 8oz.)"

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="2  -16 oz (so 4  8oz.)", "typed text",as.character(Melted.Caffeine.df$Notes))        
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="2  -16 oz (so 4  8oz.)", "4",as.character(Melted.Caffeine.df$value))

4. Mathematical expressions...

'>1' => 2

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value==">1", "mathmatical expression",as.character(Melted.Caffeine.df$Notes))

Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value==">1", "2",as.character(Melted.Caffeine.df$value))

'<1' => 1

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="1 if any", "typed mathmatical expression",as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="1 if any", "1",as.character(Melted.Caffeine.df$value))

Melted.Caffeine.df$Notes<-if_else(Melted.Caffeine.df$value=="<1", "mathmatical expression",as.character(Melted.Caffeine.df$Notes))
Melted.Caffeine.df$value<-if_else(Melted.Caffeine.df$value=="<1", "1",as.character(Melted.Caffeine.df$value))

Creating Variables...

daily_caffeine - Summated caffeine variables.

Outliers- 10898, 30

Units: Daily 8oz cups.

HHQ Alcohol Consumption

history_beer_history - 28). How many cans of beer (12 oz.) do you have in a normal week?
history_wine_history - 29). How many glasses of wine (5 oz.) do you have in a normal week?
history_liquor_history - 30). How many servings of liquor (1.5 oz. shot) do you have in a normal week?

Entered as Missing Data...

No Missing Alcohol Demographics Data...

Recode entries reported as...

1. Subjective estimates/ranges...

Any ranges are averaged prior to calculating weekly intake

Typed text

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="4/week", "subjective estimate",as.character(Melted.ALC.df$Notes)) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="4/week", "4",as.character(Melted.ALC.df$value)) 

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="none", "typed text",as.character(Melted.ALC.df$Notes)) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="none", "0",as.character(Melted.ALC.df$value))

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="n/a", "typed text",as.character(Melted.ALC.df$Notes)) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="n/a", "0",as.character(Melted.ALC.df$value))

Typed fraction

 e.g. "1/3 <1 glass per week"

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="1/3  <1 glass per week" , "typed text, recoded as 0.33333" ,as.character(Melted.ALC.df$Notes))
third_per_week<-1/3 
third_per_week<-as.character(third_per_week) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="1/3  <1 glass per week" , "0.33333" ,as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="0.5 or less", "0.5" ,as.character(Melted.ALC.df$value))

Subjective estimate- e.g. "1/month", "2/month", etc.

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="1/month" , "subjective range, 1/4" ,as.character(Melted.ALC.df$Notes))
One_per_mo<-1/4 
One_per_mo<-as.character(One_per_mo) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="1/month", One_per_mo ,as.character(Melted.ALC.df$value))

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="2/month" , "subjective range, 2/4" ,as.character(Melted.ALC.df$Notes))
Two_per_mo<-2/4 
Two_per_mo<-as.character(Two_per_mo) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="2/month", Two_per_mo ,as.character(Melted.ALC.df$value))

Subjective estimate- e.g. "1/every other week"

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="1/every other week" , "subjective range, 1/2" ,as.character(Melted.ALC.df$Notes))
half_per_wk<-1/2 
half_per_wk<-as.character(half_per_wk)
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="1/every other week", half_per_wk ,as.character(Melted.ALC.df$value))

Subjective range- e.g. "1-2/month"

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="1-2/month" , "subjective range (1.5/4), recoded as 0.375" ,as.character(Melted.ALC.df$Notes))
One.5_per_mo<-1.5/4 
One.5_per_mo<-as.character(One.5_per_mo) 
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="1-2/month", One.5_per_mo ,as.character(Melted.ALC.df$value))

Subjective range- Narrow spread between range- e.g. "0-1"

Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="0-1", "0.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="1-2", "1.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="2-3", "2.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="3-4", "3.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="4-5", "4.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="5-6", "5.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="6-7", "6.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="7-8", "7.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="8-9", "8.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="9-10", "9.5",as.character(Melted.ALC.df$value))
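
Since every range is recoded to its midpoint, the one-line-per-range pattern can also be written as a single vectorised helper, which avoids copy-paste slips in the hard-coded values. A base-R sketch, assuming ranges are typed as "A-B" with nothing else in the cell (`range_midpoint` is a hypothetical name):

```r
# Hypothetical helper: "5-6" -> 5.5; anything that is not an "A-B" range -> NA.
range_midpoint <- function(x) {
  ok  <- !is.na(x) & grepl("^[0-9.]+-[0-9.]+$", x)
  out <- rep(NA_real_, length(x))
  bounds  <- strsplit(x[ok], "-", fixed = TRUE)
  out[ok] <- vapply(bounds, function(b) mean(as.numeric(b)), numeric(1))
  out
}

range_midpoint(c("5-6", "7-8", "9-10", "2", NA))
```

Plain numbers come back as `NA` here, so the result would be merged with the original column rather than replacing it outright.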

# Response includes string of text...
Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="1-2/week", "range",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="1-2/week", "1.5",as.character(Melted.ALC.df$value))

Subjective range- Moderate spread between integers- e.g. "0-2"

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="0-2", "range",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="2-4", "range",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="3-5", "range",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="4-6", "range",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="10-12", "range",as.character(Melted.ALC.df$Notes))

Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="0-2", "1",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="2-4", "3",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="3-5", "4",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="4-6", "5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="10-12", "11",as.character(Melted.ALC.df$value))

Subjective range- Large spread between integers- e.g. "2-6"

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="7-10", "range",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="2-6", "range",as.character(Melted.ALC.df$Notes))

Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="7-10", "8.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="2-6", "4",as.character(Melted.ALC.df$value))

2. Translated as dates (not included in codebook...)

Entries translated as dates

 e.g.)
  * "6-May"=="5-6"=>"5.5"
  * "2-Jan"=="1-2"=>"1.5"
  * "3-Feb"=="2-3"=>"2.5"
  * "9-Aug"=="8-9"=>"8.5"
  * "7-Jun"=="6-7"=>"6.5"
  * "12-Oct"=="10-12"=>"11"
  * "4-Feb"=="2-4"=>"3"
  * "5-Mar"=="3-5"=>"4"
  * "10-Jul"=="7-10"=>"8.5"

Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("2-Jan") , "1.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("3-Feb") , "2.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("6-May") , "5.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("7-Jun") , "6.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("9-Aug") , "8.5",as.character(Melted.ALC.df$value))

Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("4-Mar") , "3.5",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("12-Oct") , "11",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("4-Feb") , "3",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("5-Mar") , "4",as.character(Melted.ALC.df$value))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==("10-Jul") , "8.5",as.character(Melted.ALC.df$value))

3. Mathematical expressions...

'>1' => 2

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value==">1", "mathmatical expression",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value==">1", "2",as.character(Melted.ALC.df$value))

'<1' => 1

Melted.ALC.df$Notes<-if_else(Melted.ALC.df$value=="<1", "mathmatical expression",as.character(Melted.ALC.df$Notes))
Melted.ALC.df$value<-if_else(Melted.ALC.df$value=="<1", "1",as.character(Melted.ALC.df$value))

Creating Variables...

Weekly_alc.score - Continuous estimate.
Heavy_Drinker - (1/0) 14 drinks per week for men and seven per week for women.

Continuous Variable 'Weekly_alc.score'

Binary Variable 'Heavy_Drinker'

The national average was 17 per week. The Centers for Disease Control defines heavy drinking as 14 drinks per week for men and seven per week for women. A standard drink is defined as 12 ounces of beer, 5 ounces of wine, or 1.5 ounces of liquor.

ALC.df$Heavy_Drinker<-if_else(ALC.df$screen_gender==2 & ALC.df$Weekly_alc.score>=7, "Heavy", " ")
ALC.df$Heavy_Drinker<-if_else(ALC.df$screen_gender==1 & ALC.df$Weekly_alc.score>=14, "Heavy",as.character(ALC.df$Heavy_Drinker) )
ALC.df$Heavy_Drinker<-if_else(ALC.df$Heavy_Drinker=="Heavy", 1, 0 )
df<-ALC.df %>% select(record_id,Site, screen_gender,Weekly_alc.score, Heavy_Drinker)
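
The three `if_else()` passes amount to one comparison against a sex-specific threshold. A minimal base-R sketch on toy data, assuming `screen_gender` is coded 1 = male and 2 = female as in the recodes used elsewhere in this notebook (the `thresholds` vector is a hypothetical helper):

```r
toy <- data.frame(screen_gender    = c(1, 1, 2, 2, NA),
                  Weekly_alc.score = c(14, 13.5, 7, 6, 20))

# Drinks per week at or above which a participant is flagged.
thresholds <- c("1" = 14, "2" = 7)
cutoff <- unname(thresholds[as.character(toy$screen_gender)])

# 1 = heavy, 0 = not heavy, NA when gender or score is missing.
toy$Heavy_Drinker <- as.integer(toy$Weekly_alc.score >= cutoff)
```

Looking the threshold up by name keeps the cut-offs in one place if the definition changes.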

library(nVennR)

df$Male<-if_else(df$screen_gender  =="1"  ,1,0)
df$Female<-if_else(df$screen_gender  =="2"  ,1,0)
male<- subset(df, Male == "1")$record_id
female <- subset(df, Female == "1")$record_id
heavy_drinker <- subset(df, Heavy_Drinker == "1")$record_id

df %$%  ctable(screen_gender,Heavy_Drinker,chisq = TRUE, OR = TRUE, RR=TRUE, headings = FALSE) %>% print(method = "render")

               Heavy_Drinker
screen_gender  0             1            Total
1              124 (87.3%)   18 (12.7%)   142 (100.0%)
2              313 (87.4%)   45 (12.6%)   358 (100.0%)
Total          437 (87.4%)   63 (12.6%)   500 (100.0%)
Χ² = 0.0000, df = 1, p = 1.0000
O.R. (95% C.I.) = 0.99 (0.55 - 1.78)
R.R. (95% C.I.) = 1.00 (0.93 - 1.08)

Generated by summarytools 0.9.9 (R version 3.6.1)
2021-09-14
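
As a cross-check on the gender-by-Heavy_Drinker table, the odds ratio, risk ratio, and chi-square test can be recomputed from the four cell counts in base R. This is only a sketch of the arithmetic, not the code `ctable()` runs internally:

```r
# Cell counts copied from the screen_gender x Heavy_Drinker table.
tab <- matrix(c(124, 18,
                313, 45),
              nrow = 2, byrow = TRUE,
              dimnames = list(screen_gender = c("1", "2"),
                              Heavy_Drinker = c("0", "1")))

odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
risk_ratio <- (tab[1, 1] / sum(tab[1, ])) / (tab[2, 1] / sum(tab[2, ]))
test       <- chisq.test(tab)  # 2x2 tables get Yates' continuity correction by default

round(c(OR = odds_ratio, RR = risk_ratio, p = test$p.value), 2)
```

Both ratios compare the first column (non-heavy drinkers) across rows, which is why they sit at about 1 when the two groups have nearly identical rates.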

ven<-plotVenn(list(Male=male, Female=female,  Heavy_Drinker=heavy_drinker), nCycles = 2000,labelRegions=T, showPlot = T)
str <- charToRaw(ven$svg)
rsvg::rsvg_png(str, file = '~/HeavyDrink_Gender_venn.png')

df$Site<-as.factor(df$Site)
df$Site<-as.character(df$Site)
df$UPitt<-if_else(df$Site  =="PITT"  ,1,0)
df$Kansas<-if_else(df$Site  =="KU"  ,1,0)
df$Northeastern<-if_else(df$Site  =="NEU"  ,1,0)
upitt<- subset(df, UPitt == "1")$record_id
ku<- subset(df, Kansas == "1")$record_id
neu<- subset(df, Northeastern == "1")$record_id
heavy_drinker <- subset(df, Heavy_Drinker == "1")$record_id
df$Site<-as.factor(df$Site)
df$heavy_drinker<-as.factor(df$Heavy_Drinker)
df %$%  ctable(Site,Heavy_Drinker,chisq = TRUE, OR = TRUE, RR=TRUE, headings = FALSE) %>% print(method = "render")

       Heavy_Drinker
Site   0             1            Total
KU     153 (85.5%)   26 (14.5%)   179 (100.0%)
NEU    119 (86.2%)   19 (13.8%)   138 (100.0%)
PITT   165 (90.2%)   18 (9.8%)    183 (100.0%)
Total  437 (87.4%)   63 (12.6%)   500 (100.0%)
Χ² = 2.0429, df = 2, p = .3601

ven<-plotVenn(list( Kansas=ku,UPitt=upitt,Heavy_Drinker=heavy_drinker,Northeastern=neu), nCycles = 20000,labelRegions=T, showPlot = T)
str <- charToRaw(ven$svg)
rsvg::rsvg_png(str, file = '~/HeavyDrink_site_venn.png')

Consumption groups 'Weekly_alc.group'

 0, 1-3, 3+

ALC.df$Weekly_alc.group<-if_else(ALC.df$Weekly_alc.score==0, "none", as.character(ALC.df$Weekly_alc.score))

ALC.df$Weekly_alc.group<-if_else(ALC.df$Weekly_alc.score>0 &ALC.df$Weekly_alc.score<=3 , "1-3", as.character(ALC.df$Weekly_alc.group))

ALC.df$Weekly_alc.group<-if_else(ALC.df$Weekly_alc.score>3  , "3+", as.character(ALC.df$Weekly_alc.group))

ALC.df$Weekly_alc.group<-factor(ALC.df$Weekly_alc.group, levels=c("none", "1-3", "3+" ))
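
The same three groups can be produced in one step with `cut()`. A sketch on toy scores, assuming `Weekly_alc.score` is never negative (the `score` and `group` vectors are illustrative only):

```r
score <- c(0, 0.5, 3, 3.5, 12, NA)

# Intervals are closed on the right: (-Inf, 0], (0, 3], (3, Inf)
group <- cut(score,
             breaks = c(-Inf, 0, 3, Inf),
             labels = c("none", "1-3", "3+"))
table(group, useNA = "ifany")
```

As in the `if_else()` version, any score above 0 and up to 3 (including values below 1) lands in "1-3".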

Caffeine.df$record_id<-as.character(Caffeine.df$record_id)
Alc_Caff<-left_join(ALC.df, Caffeine.df)

Outliers-

Greater than 2SD Site Avg: 46
Greater than 3SD Site Avg: 21
- PITT 6
- KU 8
- NEU 7
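
The code behind these counts is not shown here. One way to flag values more than 2 SD above their site mean is sketched below, assuming the rule was applied to `Weekly_alc.score` within each `Site` (both the variable choice and the toy data are assumptions):

```r
toy <- data.frame(
  Site             = rep(c("KU", "NEU"), each = 8),
  Weekly_alc.score = c(0, 1, 2, 1, 0, 1, 2, 30,
                       2, 3, 2, 3, 2, 3, 2, 3)
)

# Per-site mean and SD, repeated for every row of that site.
site_mean <- ave(toy$Weekly_alc.score, toy$Site,
                 FUN = function(v) mean(v, na.rm = TRUE))
site_sd   <- ave(toy$Weekly_alc.score, toy$Site,
                 FUN = function(v) sd(v, na.rm = TRUE))

toy$above_2sd <- toy$Weekly_alc.score > site_mean + 2 * site_sd
toy[toy$above_2sd, ]
```

Swapping the multiplier from 2 to 3 gives the stricter cut-off.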

HHQ Smoking Demographics

history_cig_history - 31). Do you currently smoke Cigarettes?
smoke_day - 31A) How many packs do you smoke per day?
yrs_smoke - 31B) For how many years have you smoked?
prior_smoke - 32) Did you previously smoke cigarettes, but quit?
packs_prior - 32A) How many packs did you previously smoke per day (on average)?
prior_smoke_yrs - 32B) For how many years did you smoke?
smoke_other - 33) Do you use any other forms of tobacco (Cigars, vaporizers, etc)?
other_type - 33A) Quantify how much of each kind you use, on average (ex. 1 cigar / month etc.)

Entered as Missing Data...

c(10151, 10499)

Missing secondary smoking demographics...

10136 missing prior.smoke.yrs
prior packs smoked complete
smoke_day complete
years smoked complete

Recode entries...

1. Missing/Performed Test Incorrectly -

Currently uses non-inhaled tobacco/nicotine products...

#Smokes.df %>% filter(smoke_other!=0) %>%select(smoke_other, other_type)
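# %in% is FALSE (not NA) when other_type is missing, so smoke_other is left unchanged for those rows
Smokes.df$smoke_other<-if_else(Smokes.df$other_type %in% "nicorette", as.integer(0) , as.integer(Smokes.df$smoke_other))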
#Smokes.df %>% filter(smoke_other!=0) %>%select(smoke_other, other_type)
Smokes.df$packs_prior<-if_else(Smokes.df$packs_prior=="10", "0.6",Smokes.df$packs_prior)

prior_smoke.factor (eval=FALSE,for now...)

Smokes.df$prior_smoke<-as.character(Smokes.df$prior_smoke)
# Current smokers who left the "previously smoked, but quit" item blank are set to "0"
Smokes.df<-Smokes.df %>%  
  mutate(prior_smoke = if_else(history_cig_history==1 & is.na(prior_smoke), 
                           "0", prior_smoke))
Smokes.df$prior_smoke<-as.numeric(Smokes.df$prior_smoke)

2. Subjective estimates/ranges...

Typed text - variable

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("1 1/2") , "fraction",as.character(melted.Smokes.df$Notes))
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("1 1/2") , "1.5",as.character(melted.Smokes.df$value))

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("1-1.5") , "range",as.character(melted.Smokes.df$Notes))
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("1-1.5") , "1.25",as.character(melted.Smokes.df$value))

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("1 pack per day") , "typed text", as.character(melted.Smokes.df$Notes))
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("1 pack per day") , "1",as.character(melted.Smokes.df$value))

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("1 cigarette") , "typed text/subjective estimate",as.character(melted.Smokes.df$Notes))
cig_day<-1/20
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("1 cigarette") , as.character(cig_day),as.character(melted.Smokes.df$value))

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("about 3 cigarettes/day") , "typed text/subjective range",as.character(melted.Smokes.df$Notes))
cig3_day<-3/20
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("about 3 cigarettes/day") , as.character(cig3_day),as.character(melted.Smokes.df$value))

3. Translated as dates (not included in codebook...)

Translated as dates...

melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("2-Jan") , "1.5",as.character(melted.Smokes.df$value))

4. Mathematical expressions...

Typed text - Mathematical expression

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("social smoker, less than 1 pack per week") , "typed text/mathmatical expression",as.character(melted.Smokes.df$Notes))
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("social smoker, less than 1 pack per week") , "1",as.character(melted.Smokes.df$value))

'<1' => 1

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==("<1") , "mathmatical expression",as.character(melted.Smokes.df$Notes))
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==("<1") , "1",as.character(melted.Smokes.df$value))

'>1' => 2

melted.Smokes.df$Notes<-if_else(melted.Smokes.df$value==(">1") , "mathmatical expression",as.character(melted.Smokes.df$Notes))
melted.Smokes.df$value<-if_else(melted.Smokes.df$value==(">1") , "2",as.character(melted.Smokes.df$value))

Creating Variables...

Smoking.Status - (Current/Former/Never) Current includes primary inhaled alternative users.
Smoking.Status_unc - (Current/Former/Never) Primary inhaled alternative users are classified as "Never".
Current.Smoke.factor - Currently smokes inhaled tobacco/nicotine.
Primary.AltSmoke.factor - No history of cigarette smoking, currently uses alternative inhaled tobacco/nicotine (vape, cigars, etc.).
Secondary.AltSmoke.factor - Former or Current cigarette smoker, now uses alternative inhaled tobacco/nicotine.
pack_day - packs per day (former or current).
yrs_smoke - Years smoked (former or current).

'Uncorrected Smoking.Status' - Not including alternative tobacco/nicotine users

Smokes.df$Smoking.Status<-if_else(Smokes.df$history_cig_history=="1" &
                                    (Smokes.df$prior_smoke=="0"| is.na(Smokes.df$prior_smoke)), 
                                "Current","")
Smokes.df$Smoking.Status<-if_else(Smokes.df$history_cig_history=="0" & 
                                    Smokes.df$prior_smoke=="1", 
                                  "Former", as.character(Smokes.df$Smoking.Status))

Smokes.df$Smoking.Status<-if_else(Smokes.df$history_cig_history=="0" & 
                                    Smokes.df$prior_smoke=="0", 
                                  "Never",as.character(Smokes.df$Smoking.Status))

Smokes.df$Smoking.Status_unc<-Smokes.df$Smoking.Status

       Smoking.Status
Site   Current     Former        Never         <NA>       Total
KU     5 (2.8%)    61 (34.1%)    113 (63.1%)   0 (0.0%)   179 (100.0%)
NEU    4 (2.9%)    63 (45.7%)    71 (51.4%)    0 (0.0%)   138 (100.0%)
PITT   3 (1.6%)    92 (50.3%)    87 (47.5%)    1 (0.5%)   183 (100.0%)
Total  12 (2.4%)   216 (43.2%)   271 (54.2%)   1 (0.2%)   500 (100.0%)
Χ² = 10.7784, df = 4, p = .0292

               Smoking.Status
screen_gender  Current     Former        Never         Total
Female         11 (3.1%)   140 (39.2%)   206 (57.7%)   357 (100.0%)
Male           1 (0.7%)    76 (53.5%)    65 (45.8%)    142 (100.0%)
Total          12 (2.4%)   216 (43.3%)   271 (54.3%)   499 (100.0%)
Χ² = 9.8515, df = 2, p = .0073
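
The Current/Former/Never rules behind the uncorrected `Smoking.Status` can be sketched in base R with `%in%`, which treats a missing answer as "no match" instead of propagating `NA`. The toy data are illustrative, and unlike the `if_else()` chain, rows that match no rule are left as `NA` rather than an empty string:

```r
toy <- data.frame(history_cig_history = c(1, 1, 0, 0, NA),
                  prior_smoke         = c(0, NA, 1, 0, 0))

status <- rep(NA_character_, nrow(toy))

# Current: smokes now, and either never quit or left the quit item blank.
is_current <- toy$history_cig_history %in% 1 &
  (toy$prior_smoke %in% 0 | is.na(toy$prior_smoke))
status[is_current] <- "Current"

# Former: does not smoke now but reports having quit.
status[toy$history_cig_history %in% 0 & toy$prior_smoke %in% 1] <- "Former"

# Never: answered no to both items.
status[toy$history_cig_history %in% 0 & toy$prior_smoke %in% 0] <- "Never"

status
```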

Uncorrected Binary - 'Current.Smoke.factor'

Smokes_1<-Smokes.df %>% group_by(Site) %>% add_count(Site)

Smokes_1$Current.Smoke.factor<-if_else(Smokes_1$history_cig_history=="1", "1","0")
Smokes_1$Current.Smoke.factor<-as.factor(Smokes_1$Current.Smoke.factor)

Smokes_1$Site_N<-Smokes_1$n
Smokes_1<-Smokes_1 %>% select(-n)
Smokes_1<-Smokes_1 %>% group_by(Site) %>% add_count(Current.Smoke.factor)
Smokes_1$sample_perc<-Smokes_1$n/nrow(Smokes_1)*100
Smokes_1$site_perc<-Smokes_1$n/Smokes_1$Site_N*100

Smokes_1 %$%  ctable(Site,Current.Smoke.factor,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

       Current.Smoke.factor
Site   0             1           Total
KU     174 (97.2%)   5 (2.8%)    179 (100.0%)
NEU    134 (97.1%)   4 (2.9%)    138 (100.0%)
PITT   180 (98.4%)   3 (1.6%)    183 (100.0%)
Total  488 (97.6%)   12 (2.4%)   500 (100.0%)
Χ² = 0.7167, df = 2, p = .6988

ggplot(Smokes_1, aes(x = as.factor(Site), y=n, fill=Current.Smoke.factor)) + 
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label=paste(n, " / ",round(sample_perc ,2), "%"," \n", round(site_perc, 1), "%",  sep="")), position = position_dodge(.9),colour="black", size=2.5)+  ylab("")+xlab("")+theme(legend.position = "bottom")
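
The two percentages printed on the bars (share of the whole sample and share within the site) can also be read straight off a contingency table. A base-R sketch on toy data, where `Current` stands in for `Current.Smoke.factor`:

```r
toy <- data.frame(Site    = c("KU", "KU", "KU", "NEU", "NEU", "PITT"),
                  Current = c(1, 0, 0, 0, 0, 1))

counts      <- table(Site = toy$Site, Current = toy$Current)
sample_perc <- prop.table(counts) * 100              # share of the whole sample
site_perc   <- prop.table(counts, margin = 1) * 100  # share within each site

round(site_perc, 1)
```

Each row of `site_perc` sums to 100, while `sample_perc` sums to 100 over the whole table.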

gender.df<-HHQraw.df %>% select(record_id, screen_gender)
gender.df$record_id<-as.character(gender.df$record_id)
plot.df<-left_join(gender.df,Smokes.df)
plot.df<-plot.df[complete.cases(plot.df$screen_gender),]
plot.df<-plot.df[complete.cases(plot.df$Smoking.Status),]

plot.df<-plot.df %>% group_by(screen_gender) %>% add_count(screen_gender)
plot.df$gender_n<-plot.df$n
plot.df<-plot.df %>% select(-n)

plot.df$Current.Smoke.factor<-if_else(plot.df$Smoking.Status=="Current", "Tabbacco User", "Non User")
plot.df$Current.Smoke.factor<-as.factor(plot.df$Current.Smoke.factor)
plot.df<-plot.df %>% group_by(screen_gender) %>% add_count(Current.Smoke.factor)
plot.df$sample_perc<-plot.df$n/nrow(plot.df)*100
plot.df$site_perc<-plot.df$n/plot.df$gender_n*100
plot.df$screen_gender<-if_else(plot.df$screen_gender==1, "Male","Female")
#plot.df$screen_gender<-as.factor(plot.df$screen_gender)
plot.df %$%  ctable(screen_gender,Current.Smoke.factor,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

               Current.Smoke.factor
screen_gender  Non User      Tabbacco User   Total
Female         346 (96.9%)   11 (3.1%)       357 (100.0%)
Male           141 (99.3%)   1 (0.7%)        142 (100.0%)
Total          487 (97.6%)   12 (2.4%)       499 (100.0%)
Χ² = 1.5378, df = 1, p = .2149
O.R. (95% C.I.) = 0.22 (0.029 - 1.74)
R.R. (95% C.I.) = 0.98 (0.95 - 1.00)

ggplot(plot.df, aes(x =as.factor(screen_gender) ,y=n, fill=Current.Smoke.factor)) + 
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label=paste(n, " / ",round(sample_perc ,2), "%"," \n", round(site_perc, 1), "%",  sep="")), position = position_dodge(.9),colour="black", size=2.2)+ 
  ylab("")+xlab("")+theme(legend.position = "bottom")

former<-Smokes.df%>%filter( prior_smoke ==1)

Inhaled Smoking Alternatives

Smokes.df$Primary_Alternative_User<-if_else(Smokes.df$prior_smoke=="0" & 
                                          Smokes.df$history_cig_history=="0" &
                                          Smokes.df$smoke_other=="1", 1,0)
# Missing answers are treated as non-users
Smokes.df$Primary_Alternative_User<-if_else(is.na(Smokes.df$Primary_Alternative_User), "0",as.character(Smokes.df$Primary_Alternative_User))

Smokes.df$Secondary_Alternative_User<-if_else((Smokes.df$smoke_other=="1" & 
                                      Smokes.df$Primary_Alternative_User!="1"), 1, 0)
Smokes.df$Secondary_Alternative_User<-if_else(is.na(Smokes.df$Secondary_Alternative_User), "0",as.character(Smokes.df$Secondary_Alternative_User))

Smokes.df$alt_smoke.factor<-if_else(Smokes.df$Primary_Alternative_User=="1"|
                                      Smokes.df$Secondary_Alternative_User=="1"|
                                     Smokes.df$smoke_other==1 , "1", "0")

Smokes.df$alt_smoke.factor<-if_else(is.na(Smokes.df$alt_smoke.factor), "0",Smokes.df$alt_smoke.factor)

Smokes.df$alt_smoke.factor<-as.factor(Smokes.df$alt_smoke.factor)

alt.df<-Smokes.df %>% filter(smoke_other==1)
alt.df<-left_join(alt.df,gender.df)
alt.df$screen_gender<-if_else(alt.df$screen_gender==1, "Male","Female")
alt.df$screen_gender<-as.factor(alt.df$screen_gender)

#alt.df %>% select(Smoking.Status_unc,alt_smoke.factor, Site, screen_gender , prior_smoke_yrs,packs_prior,other_type) %>% DT::datatable(rownames=FALSE,options = list(pageLength = 13))
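
The flag derivations above compare several columns and then patch the resulting NA values to "0" in a second step. One way to fold both steps into a single expression is `%in%`, which returns FALSE rather than NA when a value is missing. The sketch below uses made-up rows; the `toy` data frame is hypothetical, and only the column names follow the code above.

```r
# Toy rows: hypothetical responses, including missing values.
toy <- data.frame(
  prior_smoke         = c("0", "1", NA, "0"),
  history_cig_history = c("0", "0", "0", NA),
  smoke_other         = c("1", "1", "1", "0"),
  stringsAsFactors = FALSE
)

# Primary: uses an inhaled alternative with no prior or current cigarette use.
# x %in% "0" is FALSE when x is NA, so no separate NA patch is needed.
toy$Primary_Alternative_User <- as.character(as.integer(
  toy$prior_smoke %in% "0" &
    toy$history_cig_history %in% "0" &
    toy$smoke_other %in% "1"
))

# Secondary: uses an inhaled alternative but does not meet the primary definition.
toy$Secondary_Alternative_User <- as.character(as.integer(
  toy$smoke_other %in% "1" & toy$Primary_Alternative_User != "1"
))

toy
```

This treats an unanswered item the same way the two-step version does (as a non-user), so it is only appropriate if that assumption is intended.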

'Corrected Smoking.Status' - All inhaled tobacco/nicotine products

Smokes.df$smoke_other<-as.character(Smokes.df$smoke_other)
Smokes.df$Smoking.Status<-if_else(Smokes.df$smoke_other=="1" & !is.na(Smokes.df$smoke_other), "Current",Smokes.df$Smoking.Status_unc)
Smokes.df$Current.Smoke.factor<-if_else(Smokes.df$Smoking.Status=="Current", 1,0)

	Smoking.Status
Site	Current				Former				Never				<NA>				Total
KU	9	(	5.0%	)	59	(	33.0%	)	111	(	62.0%	)	0	(	0.0%	)	179	(	100.0%	)
NEU	5	(	3.6%	)	62	(	44.9%	)	71	(	51.4%	)	0	(	0.0%	)	138	(	100.0%	)
PITT	10	(	5.5%	)	86	(	47.0%	)	86	(	47.0%	)	1	(	0.5%	)	183	(	100.0%	)
Total	24	(	4.8%	)	207	(	41.4%	)	268	(	53.6%	)	1	(	0.2%	)	500	(	100.0%	)
Χ² = 9.4292 df = 4 p = .0512

	Smoking.Status
screen_gender	Current				Former				Never				Total
Female	13	(	3.6%	)	138	(	38.7%	)	206	(	57.7%	)	357	(	100.0%	)
Male	11	(	7.7%	)	69	(	48.6%	)	62	(	43.7%	)	142	(	100.0%	)
Total	24	(	4.8%	)	207	(	41.5%	)	268	(	53.7%	)	499	(	100.0%	)
Χ² = 9.7065 df = 2 p = .0078

'Corrected Binary' - 'Current.Smoke.factor'

	Current.Smoke.factor
Site	0				1				<NA>				Total
KU	170	(	95.0%	)	9	(	5.0%	)	0	(	0.0%	)	179	(	100.0%	)
NEU	133	(	96.4%	)	5	(	3.6%	)	0	(	0.0%	)	138	(	100.0%	)
PITT	172	(	94.0%	)	10	(	5.5%	)	1	(	0.5%	)	183	(	100.0%	)
Total	475	(	95.0%	)	24	(	4.8%	)	1	(	0.2%	)	500	(	100.0%	)
Χ² = 0.6294 df = 2 p = .7300

gender.df<-HHQraw.df %>% select(record_id, screen_gender)
gender.df$record_id<-as.character(gender.df$record_id)
plot.df<-left_join(gender.df,Smokes.df)
plot.df<-plot.df[complete.cases(plot.df$screen_gender),]
plot.df<-plot.df[complete.cases(plot.df$Smoking.Status),]

plot.df<-plot.df %>% group_by(screen_gender) %>% add_count(screen_gender)
plot.df$gender_n<-plot.df$n
plot.df<-plot.df %>% select(-n)
plot.df$Current.Smoke.factor<-as.factor(plot.df$Current.Smoke.factor)
plot.df<-plot.df %>% group_by(screen_gender) %>% add_count(Current.Smoke.factor)
plot.df$sample_perc<-plot.df$n/nrow(plot.df)*100
plot.df$site_perc<-plot.df$n/plot.df$gender_n*100
plot.df$screen_gender<-if_else(plot.df$screen_gender==1, "Male","Female")
#plot.df$screen_gender<-as.factor(plot.df$screen_gender)
plot.df %$%  ctable(screen_gender,Current.Smoke.factor,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

	Current.Smoke.factor
screen_gender	0				1				Total
Female	344	(	96.4%	)	13	(	3.6%	)	357	(	100.0%	)
Male	131	(	92.3%	)	11	(	7.7%	)	142	(	100.0%	)
Total	475	(	95.2%	)	24	(	4.8%	)	499	(	100.0%	)
Χ² = 2.8964 df = 1 p = .0888 O.R. (95% C.I.) = 2.22 (0.97 - 5.08) R.R. (95% C.I.) = 1.04 (0.99 - 1.10)
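
The odds ratio, risk ratio, and chi-square statistic reported above can be reproduced by hand from the four cell counts. The sketch below assumes that the first column ("0") is treated as the reference outcome and that the chi-square statistic uses Yates' continuity correction; both assumptions are inferred from the reported values rather than taken from the summarytools documentation.

```r
# Counts copied from the gender-by-Current.Smoke.factor table above.
tab <- matrix(c(344, 13,
                131, 11),
              nrow = 2, byrow = TRUE,
              dimnames = list(screen_gender = c("Female", "Male"),
                              Current.Smoke.factor = c("0", "1")))

# Odds ratio: cross-product of the 2x2 cells.
odds_ratio <- (tab["Female", "0"] * tab["Male", "1"]) /
  (tab["Female", "1"] * tab["Male", "0"])

# Risk ratio for the first column ("0"): Female proportion over Male proportion.
risk_ratio <- (tab["Female", "0"] / sum(tab["Female", ])) /
  (tab["Male", "0"] / sum(tab["Male", ]))

# chisq.test() applies Yates' continuity correction to 2x2 tables by default.
chi_sq <- unname(chisq.test(tab)$statistic)

round(c(OR = odds_ratio, RR = risk_ratio, chisq = chi_sq), 4)
```

Read this way, an odds ratio above 1 means the odds of being a non-smoker are higher for women than for men, which matches the lower proportion of current smokers among women in the table.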

ggplot(plot.df, aes(x =as.factor(screen_gender) ,y=n, fill=Current.Smoke.factor)) + 
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label=paste(n, " / ",round(sample_perc ,2), "%"," \n", round(site_perc, 1), "%",  sep="")), position = position_dodge(.9),colour="black", size=2.2)+ 
  ylab("")+xlab("")+labs(title="Current Smokers (including inhaled alternatives)")+theme(legend.position = "bottom")

Smokes.df$prior_smoke<-if_else(Smokes.df$history_cig_history==1 & is.na(Smokes.df$prior_smoke),"0",as.character(Smokes.df$prior_smoke ))

Smokes.df$smoke_other<-if_else(is.na(Smokes.df$smoke_other) & Smokes.df$other_type=="", "0", as.character(Smokes.df$smoke_other))

Smokes.df$smoke_day<-as.numeric(Smokes.df$smoke_day)
Smokes.df$packs_prior<-as.numeric(Smokes.df$packs_prior)
Smokes.df$prior_smoke_yrs<-as.numeric(Smokes.df$prior_smoke_yrs)
Smokes.df$yrs_smoke<-as.numeric(Smokes.df$yrs_smoke)

Former Smoker Demographics

 
Average packs per day: 1.0981111
Median prior packs per day: 1

Average number of years smoked: 16.814338
Median number of years smoked: 15

Current Smoker Demographics

Average packs per day: 0.00803
Median prior packs per day: 0

Average number of years smoked: 0.918
Median number of years smoked: 0
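
The averages and medians above are reported without the code that produced them. The following is a sketch of how such summaries might be computed, assuming the `packs_prior` and `prior_smoke_yrs` columns hold the former-smoker values; the `toy_former` rows and the `summarise_smoking()` helper are hypothetical.

```r
# Toy rows: hypothetical former-smoker responses with one unanswered item.
toy_former <- data.frame(
  packs_prior     = c(0.5, 1, 2, NA),
  prior_smoke_yrs = c(10, 15, 30, 5)
)

# na.rm = TRUE drops unanswered items instead of returning NA.
summarise_smoking <- function(df) {
  c(mean_packs   = mean(df$packs_prior, na.rm = TRUE),
    median_packs = median(df$packs_prior, na.rm = TRUE),
    mean_years   = mean(df$prior_smoke_yrs, na.rm = TRUE),
    median_years = median(df$prior_smoke_yrs, na.rm = TRUE))
}

res <- summarise_smoking(toy_former)
res
```

Whether a missing value should be dropped or treated as zero changes the mean considerably in a small subgroup, so the choice is worth stating alongside the reported figures.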

current<-Smokes.df%>%filter(history_cig_history==1)
gender.df<-HHQraw.df %>% select(record_id, screen_gender)
gender.df$record_id<-as.character(gender.df$record_id)
current<-left_join(current,gender.df, by="record_id")
current$gender<-if_else(current$screen_gender==1, "Male", "Female")
current$gender<-as.factor(current$gender)

current_1<-current %>% group_by(Site) %>% add_count(Site)

current_1<-current_1 %>%
  group_by(Site, add=TRUE) %>% 
  add_count(smoke_other)
current_1$sample_perc<-current_1$nn/nrow(current_1)*100
current_1$bar_perc<-current_1$nn/current_1$n*100

ggplot(current_1, aes(x = as.factor(Site),y=nn, fill=as.factor(smoke_other))) + 
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label=paste(nn, " / ",round(sample_perc ,2), "%"," \n", round(bar_perc, 1), "%",  sep="")), position = position_dodge(.9),colour="black", size=2)+ 
  ylab("")+xlab("")+labs(title="Current Cigarette User")

HHQ Education

'educ' - Years of education completed and degrees earned?

edu<-HHQraw.df %>% select(record_id,educ )
edu$Notes<-""
HHQraw.df %>% filter(is.na(educ)) %>% select(record_id)

## [1] record_id
## <0 rows> (or 0-length row.names)

HHQ Mother Education

'educ_mother' - 35. Years of education completed and degrees earned by mother?

m_edu<-HHQraw.df %>% select(record_id,educ_mother )
m_edu$educ_mother_was<-HHQraw.df$educ_mother
m_edu$Notes<-""

Entered as Missing Values...

c(10013, 10030, 10123, 10297, 10599, 10606, 10922, 20315, 20394, 20396, 20399, 30550, 30879, 30892, 30953, 31109)

Recode entries...

1. Missing Values/Performed Test Incorrectly -

"Unknown"

m_edu$Notes<-if_else(str_detect(m_edu$educ_mother,"Unknown"), "==Unknown, recoded as NA", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_detect(m_edu$educ_mother,"Unknown"), "NA", as.character(m_edu$educ_mother))

"Adopted"

Recoding adoptees as NA assumes that the influence of parental education is purely genetic.
TODO: Check the participant form for any further information to include.

m_edu$Notes<-if_else(str_detect(m_edu$educ_mother,"adopted"), "Participant adopted, recoded as NA", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_detect(m_edu$educ_mother,"adopted"), "NA", as.character(m_edu$educ_mother))

"Grammar school"

m_edu$Notes<-if_else(str_detect(m_edu$educ_mother,"Grammar school"), "==Grammar School, recoded as NA", as.character(m_edu$Notes))

m_edu$educ_mother<-if_else(str_detect(m_edu$educ_mother,"Grammar school"), "NA", as.character(m_edu$educ_mother))
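
Each recode above is a pair of statements: one writes an audit note, the other overwrites the value. A small helper can keep the two in step. The sketch below is hypothetical (`recode_with_note()` is not part of the original script) and differs from the code above in one respect: it writes a true missing value rather than the string "NA".

```r
# Hypothetical helper: flag matching rows in Notes, then overwrite the value.
recode_with_note <- function(df, column, pattern, new_value, note) {
  # fixed = TRUE treats the pattern as literal text; NA entries never match.
  hit <- grepl(pattern, df[[column]], fixed = TRUE)
  df$Notes[hit] <- note
  df[[column]][hit] <- new_value
  df
}

# Toy rows: made-up responses, not study data.
toy <- data.frame(
  educ_mother = c("12", "Unknown", "Grammar school", NA),
  Notes = "",
  stringsAsFactors = FALSE
)

toy <- recode_with_note(toy, "educ_mother", "Unknown", NA,
                        "==Unknown, recoded as NA")
toy <- recode_with_note(toy, "educ_mother", "Grammar school", NA,
                        "==Grammar School, recoded as NA")
toy
```

Because `grepl()` returns FALSE for missing input, rows that are already NA keep their existing note instead of having it replaced by NA.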

2. Automated Recoding

Assumed Responses - str_starts()

# Collapse any response that begins with a year count (4 through 20) to that number.
for (yrs in as.character(4:20)) {
  m_edu$educ_mother<-if_else(str_starts(m_edu$educ_mother, yrs), yrs, as.character(m_edu$educ_mother))
}
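
The prefix matching above can also be expressed as a single vectorized step that parses the leading digits as a whole number. The sketch below is a hypothetical alternative (`recode_leading_years()` is not in the original script) and is slightly stricter than prefix matching: a response such as "45 yrs" is left unchanged instead of being collapsed to "4".

```r
# Hypothetical helper: keep only the leading year count when it lies in range.
recode_leading_years <- function(x, min_yrs = 4, max_yrs = 20) {
  x <- as.character(x)
  has_num <- grepl("^[0-9]+", x)              # FALSE for NA and for text-first entries
  lead <- rep(NA_integer_, length(x))
  lead[has_num] <- as.integer(sub("^([0-9]+).*$", "\\1", x[has_num]))
  in_range <- !is.na(lead) & lead >= min_yrs & lead <= max_yrs
  x[in_range] <- as.character(lead[in_range])
  x
}

# Made-up responses covering the main cases.
recoded <- recode_leading_years(
  c("12 years", "16, BA", "4th grade", "H.S., 12", "45 yrs", NA)
)
recoded
```

Entries that start with text, such as "H.S., 12", still need the manual recodes that follow.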

3. Manually Recode..

Zero Responses

m_edu$Notes<-if_else(str_starts(m_edu$educ_mother, "None "), "None*, recoded as 0", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_starts(m_edu$educ_mother, "None "), "0", as.character(m_edu$educ_mother))

m_edu$Notes<-if_else(str_starts(m_edu$educ_mother, "0 "), "0*, recoded as 0", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_starts(m_edu$educ_mother, "0 "), "0", as.character(m_edu$educ_mother))

Participant Approximations - str_starts(x, pattern="~")

m_edu$Notes<-if_else(str_detect(m_edu$educ_mother, "10 yrs"), "10 yrs, recoded as 10", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_detect(m_edu$educ_mother, "10 yrs"),"10",as.character(m_edu$educ_mother))

Typed Text Responses

m_edu$Notes<-if_else(str_detect(m_edu$educ_mother, "Masters"), "*Masters*, recoded as 14", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_detect(m_edu$educ_mother, "Masters"),"14",as.character(m_edu$educ_mother))

Remaining Manual Revisions - str_starts(x, pattern="~")

# fixed() matches the periods literally rather than as regex wildcards.
m_edu$Notes<-if_else(str_detect(m_edu$educ_mother, fixed("H.S., 12")), "==H.S., recoded as 12", as.character(m_edu$Notes))
m_edu$educ_mother<-if_else(str_detect(m_edu$educ_mother, fixed("H.S., 12")),"12",as.character(m_edu$educ_mother))

HHQ Recent Health Events...

'history_recent_ill_specify'- Have you had any recent illness?
'history_recent_hospital'- Have you recently been hospitalized?
'history_recent_surgery'- Have you recently had any surgical procedures?

Entered as Missing Data...

illness: character(0) hospitalized: character(0) surgery: 10515

Chronic Illnesses -

Recent Illness -

Participant specified within past 6 mo. or currently has...

Participant specified within past 3 mo. or currently has...

# Phrases indicating the illness is current or occurred within roughly the past 3 months.
recent_3mo_patterns<-c("currently", "Currently has", "at present", "sorenes", "just getting over it now",
                       "this week", "last week", "last month", "days ago", "week ago", "weeks ago",
                       "1 month ago", "a month ago", "one month ago", "2 months ago", "3 months ago",
                       "3 mo ago", "mo. ago")
ill.df$history_recent_ill_TemporalScope_3<-
  if_else(str_detect(ill.df$history_recent_ill_specify, regex(paste(recent_3mo_patterns, collapse="|"))),"1", as.character(ill.df$history_recent_ill_specify))


# Code TemporalScope using the Cognitive Session 1 (COG1) date and the information the participant provided.


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-03-26" & str_detect(ill.df$history_recent_ill_specify, "end of December"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-04-01" & str_detect(ill.df$history_recent_ill_specify, "03/18"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-09-10" & str_detect(ill.df$history_recent_ill_specify, "August 25th"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))




ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-01-28" & str_detect(ill.df$history_recent_ill_specify, "Dec '18"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2018-10-23" & str_detect(ill.df$history_recent_ill_specify, "July 2018"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-01-10" & str_detect(ill.df$history_recent_ill_specify, "Jan 2019"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2020-01-15" & str_detect(ill.df$history_recent_ill_specify, "Flu in Dec"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-03-26" & str_detect(ill.df$history_recent_ill_specify, "December"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-02-05" & str_detect(ill.df$history_recent_ill_specify, "12/20-12/26"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-01-14" & str_detect(ill.df$history_recent_ill_specify, "November '18"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-06-26" & str_detect(ill.df$history_recent_ill_specify, "in April"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-05-20" & str_detect(ill.df$history_recent_ill_specify, "April 2019"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-06-18" & str_detect(ill.df$history_recent_ill_specify, "June 2019"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-05-30" & str_detect(ill.df$history_recent_ill_specify, "Pneumonia 2/19"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-07-31" & str_detect(ill.df$history_recent_ill_specify, "June 2019"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2020-01-15" & ill.df$history_recent_ill_specify=="Flu in Dec - all good now.", "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-04-30" & str_detect(ill.df$history_recent_ill_specify,"March 2019") , "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-04-01" & ill.df$history_recent_ill_specify=="Head cold on 03/18 for 8 days - diarrhea, cold, fatigue - good now.","1" ,as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2017-09-12" & ill.df$history_recent_ill_specify=="Cold over July 4th weekend, congestion", "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-01-28" & ill.df$history_recent_ill_specify=="Head cold- no longer had (end of Dec '18)", "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-05-28" & str_detect(ill.df$history_recent_ill_specify,"April 5th"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-08-28" & str_detect(ill.df$history_recent_ill_specify,"July 2019"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-10-28" & ill.df$history_recent_ill_specify=="Surgery for liver cancer - Aug '19.", "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$Notes<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-10-28" & ill.df$history_recent_ill_specify=="Surgery for liver cancer - Aug '19.", "Chronic",as.character(ill.df$Notes))

ill.df$Notes<-if_else(str_detect(ill.df$history_recent_ill_specify,regex("liver cancer|lasted off and on|chronic|Chronic")), "Chronic",as.character(ill.df$Notes))

ill.df$Notes<-if_else(str_detect(ill.df$history_recent_surgery_s,regex("chemo treatment")), "Chronic",as.character(ill.df$Notes))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-03-26" & str_detect(ill.df$history_recent_ill_specify,"12/2018-2/2019"), "1",  as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$history_recent_ill_specify=="ive had a bad cold (flu) that has lasted off and on for 1 month, maybe an asthmatic reaction to it.", "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(str_detect(ill.df$history_recent_ill_specify,"gall stones and inflamed bile duct") & ill.df$Date.of..Cognitive.Session.1=="2018-04-10", "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-03-26" & str_detect(ill.df$history_recent_ill_specify,"12/2018"), "1", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2018-05-23" & str_detect(ill.df$history_recent_ill_specify, "November 2017"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-05-31" & str_detect(ill.df$history_recent_ill_specify, "12/18"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-05-31" & ill.df$history_recent_ill_specify=="12/18 bad cold- good now", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2017-12-13" & ill.df$history_recent_ill_specify=="Pneumonia - April", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2018-05-23" & ill.df$history_recent_ill_specify=="Flu- November 2017", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-03-01" & str_detect(ill.df$history_recent_ill_specify, "october 2018"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-04-30" & str_detect(ill.df$history_recent_ill_specify, "Common cold, December 2018"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2018-02-19" & ill.df$history_recent_ill_specify=="Skin cancer removed in July 2017", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-02-13" & ill.df$history_recent_ill_specify=="head cold- Sept. 2018", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-04-01" & ill.df$history_recent_ill_specify=="head cold- Sept. 2018", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-10-08" & ill.df$history_recent_ill_specify=="pneumonia April/May- all good now", "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-02-25" & str_detect(ill.df$history_recent_ill_specify, "October '18"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-08-28" & str_detect(ill.df$history_recent_ill_specify, "Jan 2019"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-10-04" & str_detect(ill.df$history_recent_ill_specify, "2/19"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-06-26" & str_detect(ill.df$history_recent_ill_specify, "April 2017"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-09-27" & str_detect(ill.df$history_recent_ill_specify, "March/April"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-09-23" & str_detect(ill.df$history_recent_ill_specify, "April"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-08-26" & str_detect(ill.df$history_recent_ill_specify, "Feb '19"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(str_detect(ill.df$history_recent_ill_specify, "4 months ago"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(str_detect(ill.df$history_recent_ill_specify, "6 months"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2020-01-03" & str_detect(ill.df$history_recent_ill_specify, "September"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2018-04-02" & str_detect(ill.df$history_recent_ill_specify, "Sept 2017"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))


### UNSURE WHETHER THESE FALL WITHIN OR OUTSIDE THE 3-MONTH WINDOW

# Roughly 3+ mo. ago
ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-12-09" & str_detect(ill.df$history_recent_ill_specify, "Common cold roughly 3"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2018-01-31" & str_detect(ill.df$history_recent_ill_specify, "early fall"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))


ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$Date.of..Cognitive.Session.1=="2019-06-19" & str_detect(ill.df$history_recent_ill_specify, "Feb/March"), "Not Recent", as.character(ill.df$history_recent_ill_TemporalScope_3))



ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$history_recent_ill_TemporalScope_3=="Not Recent", "0", as.character(ill.df$history_recent_ill_TemporalScope_3))

ill.df$history_recent_ill_TemporalScope_3<-if_else(ill.df$history_recent_ill_TemporalScope_3!="1" & ill.df$history_recent_ill_TemporalScope_3!="0" , "", as.character(ill.df$history_recent_ill_TemporalScope_3))
ill.df$history_recent_ill_TemporalScope_3<-as.numeric(ill.df$history_recent_ill_TemporalScope_3)
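
The hand-coded decisions above all share one shape: a session date, a text fragment, and a resulting code. Storing them as rows in a rule table and applying them in a loop is one way to shorten the script and make duplicate rules easier to spot. The sketch below is a hypothetical illustration; the `rules` and `toy` data frames, the `apply_rules()` helper, and the example responses are made up.

```r
# Hypothetical rule table: one row per hand-coded decision.
rules <- data.frame(
  session_date = c("2019-03-26", "2019-05-31"),
  pattern      = c("December", "12/18"),
  code         = c(1, 0),   # 1 = within 3 months, 0 = not recent
  stringsAsFactors = FALSE
)

# Toy responses: made-up text, not study data.
toy <- data.frame(
  session_date = c("2019-03-26", "2019-05-31", "2019-03-26"),
  specify      = c("Flu at end of December", "12/18 bad cold", "Sinus issues"),
  stringsAsFactors = FALSE
)

apply_rules <- function(df, rules) {
  scope <- rep(NA_real_, nrow(df))
  for (i in seq_len(nrow(rules))) {
    hit <- df$session_date == rules$session_date[i] &
      grepl(rules$pattern[i], df$specify, fixed = TRUE)
    scope[which(hit)] <- rules$code[i]   # later rules override earlier ones
  }
  scope
}

toy$TemporalScope_3 <- apply_rules(toy, rules)
toy
```

Rows that match no rule stay NA, which mirrors the final step above where anything other than "1" or "0" is blanked before the numeric conversion.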

Participant specified illness resolved...

# Phrases (or full responses) indicating that the illness has resolved.
# NOTE: the bare pattern "had" matches most past-tense responses.
resolved_patterns<-c("good now", "healthy now", "clear now", "ok now", "went away", "Went away", "great now",
  "resolved", "OK now", "recovered", "fine now", "cleared up", "Good to exercise", "good to go",
  "All better now", "all clear now", "good to exercise", "no problems", "under control", "Clear now",
  "good after that", "resloved", "Recovered", "feels better now", "back in", "clean bill of health",
  "no longer had", "Good to go now", "Passed kidney stones", "24 hr stomach bug roughly 2 months ago",
  "moderate case of hives", "roughly remembers", "had", "Cold, Dec-Jan 2019", "9/10-9/14",
  "Common cold ~3 months ago", "Common cold ~1 month ago", "Common cold roughly 1 month ago",
  "Had the flu last month", "gall bladder attack", "Appendix removed Sept 2017", "acute",
  "Cold- about 3 mo ago", "Cold over July 4th weekend, congestion", "8 days", "removed", "Had a bout",
  "hospitalized a few hrs. Shingles/diarrhea - given prescription.")
ill.df$history_recent_ill_resolved<-
  if_else(str_detect(ill.df$history_recent_ill_specify, regex(paste(resolved_patterns, collapse="|"))),"1", as.character(ill.df$history_recent_ill_specify))

# Phrases indicating that the illness is ongoing; these override the matches above.
unresolved_patterns<-c("not been resolved", "just getting over it now", "at present", "this week", "Currently has",
  "currently", "chronic cough that comes and goes", "ive had a bad cold", "still having")
ill.df$history_recent_ill_resolved<-
  if_else(str_detect(ill.df$history_recent_ill_specify, regex(paste(unresolved_patterns, collapse="|"))),"0", as.character(ill.df$history_recent_ill_resolved))

ill.df$history_recent_ill_resolved<-
  if_else(str_detect(ill.df$history_recent_ill_specify,"Common cold in November '18") & ill.df$Date.of..Cognitive.Session.1=="2019-01-14" , "1", as.character(ill.df$history_recent_ill_resolved))

Recently hospitalized -

Type of hospital visit...

ill.df$history_recent_hospital_Type<-if_else(str_detect(ill.df$history_recent_hospital_s, regex("surgery|Surgery|replacement|Replacement|Removed|removed|replaced|Replaced|Repair|repair")), "Surgery", "")

ill.df$history_recent_hospital_Type<-if_else(str_detect(ill.df$history_recent_hospital_s, regex("ER-|emergency room|ER trip|Accident|attack|Dehydration|for a few hours")), "ER",ill.df$history_recent_hospital_Type )

ill.df$history_recent_hospital_Type<-if_else(str_detect(ill.df$history_recent_hospital_s, regex("overnight|Overnight|Observation|observation|nights|Acute gastric ulcer|cancer|Infection due to knee surgery")), "Inpatient", ill.df$history_recent_hospital_Type)

ill.df$history_recent_hospital_Type<-if_else(str_detect(ill.df$history_recent_hospital_s, regex("Outpatient|outpatient|1 day surgery|Tooth extraction")), "Outpatient Surgery", ill.df$history_recent_hospital_Type)

ill.df$history_recent_hospital_Type<-if_else(ill.df$history_recent_hospital_Type=="Surgery", "Inpatient Surgery", ill.df$history_recent_hospital_Type)

ill.df$history_recent_hospital_Type<-as.factor(ill.df$history_recent_hospital_Type)
TMP<-ill.df
TMP$Site<-if_else(str_starts(TMP$record_id, "1"), "PITT",as.character(TMP$record_id))
TMP$Site<-if_else(str_starts(TMP$record_id, "2"), "KU",as.character(TMP$Site))
TMP$Site<-if_else(str_starts(TMP$record_id, "3"), "NEU",as.character(TMP$Site))
TMP$Site<-as.factor(TMP$Site)
TMP<- TMP %>% group_by(Site) %>% add_count(Site)
TMP<-TMP %>% group_by(Site) %>%add_count(history_recent_hospital_Type) %>%
  mutate(
    site_perc=(nn/n)*100,
    sample_perc=(nn/nrow(TMP))*100)
TMP %$%  ctable(Site,history_recent_hospital_Type,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

	history_recent_hospital_Type
Site	ER				Inpatient				Inpatient Surgery				Outpatient Surgery				<NA>				Total
KU	1	(	0.6%	)	3	(	1.7%	)	1	(	0.6%	)	0	(	0.0%	)	174	(	97.2%	)	179	(	100.0%	)
NEU	3	(	2.2%	)	3	(	2.2%	)	2	(	1.4%	)	0	(	0.0%	)	130	(	94.2%	)	138	(	100.0%	)
PITT	6	(	3.3%	)	3	(	1.6%	)	1	(	0.5%	)	3	(	1.6%	)	170	(	92.9%	)	183	(	100.0%	)
Total	10	(	2.0%	)	9	(	1.8%	)	4	(	0.8%	)	3	(	0.6%	)	474	(	94.8%	)	500	(	100.0%	)
Χ² = 6.1450 df = 6 p = .4071
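
The visit type tabulated above is assigned by a sequence of pattern matches in which each later match overrides the earlier ones. The sketch below shows that precedence with an ordered rule list. It is a hypothetical illustration: the patterns are a shortened subset, matching is case-insensitive (unlike the original), and `classify_visit()` is not part of the script.

```r
# Ordered rules: each later rule overrides earlier matches.
# The patterns here are a shortened, illustrative subset.
type_rules <- list(
  "Inpatient Surgery"  = "surgery|replace|remove|repair",
  "ER"                 = "emergency room|ER trip|accident",
  "Inpatient"          = "overnight|observation",
  "Outpatient Surgery" = "outpatient|tooth extraction"
)

classify_visit <- function(text) {
  type <- rep(NA_character_, length(text))
  for (label in names(type_rules)) {
    # grepl() returns FALSE for NA input, so missing responses stay NA.
    hit <- grepl(type_rules[[label]], text, ignore.case = TRUE)
    type[hit] <- label
  }
  type
}

# Made-up responses covering each outcome.
visit_types <- classify_visit(c("Knee replacement", "ER trip for chest pain",
                                "Outpatient surgery on hand", NA))
visit_types
```

Keeping the rules in one ordered list makes the precedence visible: "Outpatient surgery on hand" first matches the surgery rule and is then reassigned by the outpatient rule.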

ggplot(TMP, aes(x = Site ,y=nn, fill=history_recent_hospital_Type)) + 
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label=paste(nn, " / ",round(sample_perc ,2), "%"," \n", round(site_perc, 2), "%",  sep="")), position = position_dodge(.9),colour="black", size=2)+ 
  ylab("")+labs()+theme(legend.position = "bottom")
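
The Site factor used in the table and plot above is derived from the first digit of record_id (1 = PITT, 2 = KU, 3 = NEU). A named lookup vector is one compact way to express that mapping. In this sketch `site_from_id()` is a hypothetical helper, and an unrecognized prefix becomes NA, whereas the chained version above would leave the raw record_id in place.

```r
# Site codes keyed by the first character of record_id (mapping from the code above).
site_by_prefix <- c("1" = "PITT", "2" = "KU", "3" = "NEU")

site_from_id <- function(record_id) {
  prefix <- substr(as.character(record_id), 1, 1)
  # unname() drops the lookup names; unknown prefixes become NA.
  factor(unname(site_by_prefix[prefix]), levels = c("KU", "NEU", "PITT"))
}

# The last ID is made up to show the unknown-prefix case.
sites <- site_from_id(c(10013, 20315, 30550, 40001))
sites
```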

TMP %$%  ctable(Site,history_recent_ill_resolved==0,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

	history_recent_ill_resolved == 0
Site	FALSE				TRUE				<NA>				Total
KU	52	(	29.1%	)	4	(	2.2%	)	123	(	68.7%	)	179	(	100.0%	)
NEU	23	(	16.7%	)	1	(	0.7%	)	114	(	82.6%	)	138	(	100.0%	)
PITT	35	(	19.1%	)	6	(	3.3%	)	142	(	77.6%	)	183	(	100.0%	)
Total	110	(	22.0%	)	11	(	2.2%	)	379	(	75.8%	)	500	(	100.0%	)
Χ² = 2.4857 df = 2 p = .2886

gender<-HHQraw.df %>% select(record_id,screen_gender)
TMP<-left_join(TMP,gender)
TMP %$%  ctable(screen_gender,history_recent_ill_resolved==0,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

	history_recent_ill_resolved == 0
screen_gender	FALSE				TRUE				<NA>				Total
1	27	(	19.0%	)	5	(	3.5%	)	110	(	77.5%	)	142	(	100.0%	)
2	83	(	23.2%	)	6	(	1.7%	)	269	(	75.1%	)	358	(	100.0%	)
Total	110	(	22.0%	)	11	(	2.2%	)	379	(	75.8%	)	500	(	100.0%	)
Χ² = 1.3011 df = 1 p = .2540 O.R. (95% C.I.) = 0.39 (0.11 - 1.38) R.R. (95% C.I.) = 0.90 (0.77 - 1.06)

TMP %$%  ctable(Site,history_recent_ill_TemporalScope_6,chisq = TRUE, OR = TRUE, RR=TRUE,headings = FALSE) %>% print(method = "render")

	history_recent_ill_TemporalScope_6
Site	0				1				A cold				Acute gastric ulcer				Allergic reaction to dust mites and animal dander. Took antibiotics for 4 days and was good after that.				Antibiotics 5 days for an infected pimple- it went away				bronchitis				Bronchitis and Sinus Infection				Bronchitis, UTI				Cold				cold symptoms- resloved				cold, sore throat, cough				cold/virus				common cold				Common cold				Dental problems - extractions , bone grafts, crowns, and one implant.				Diarrhea 8 days				Doctor thinks she had a possible tick bite and posion ivy				Eye and ear infection				gall stones and inflamed bile duct. had gall bladder removed.				Gallbladder , digestive problems				Had a bout of suspected pneuomia - in " observation " for 4 days in BMC.				Had a chronic cough that comes and goes. Doesn' t feel like it will affect exercise.				had bronchitis from allergies.				Had poison ivy-went on steroids for it. All better now				I had hemorrhoids				Mild left ear infection				neck/ varicose veins				prefer not to answer, asked cog2, the illness has not been resolved and he says is not a safety issue				Red Tide Virus				Shingles				Simple cold				Sinus infection				Sinus infection- may also be allergy related.				Sinus infection/ virus; lymphedema				Sinus issues				Upper respiratory infection				UTI- infection or possible kidney stones. Went away				UTIs and sinus infection				Vertigo/ vomitting - hospitalized a few hrs. Shingles/ diarrhea - given prescription .				white blood count was elevated, lungs clear , throwing up. No clear diagnosis was given levofloxacin to take for 12 days. no problems after cycle				<NA>				Total
KU	4	(	2.2%	)	42	(	23.5%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	1	(	0.6%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.6%	)	123	(	68.7%	)	179	(	100.0%	)
NEU	2	(	1.4%	)	3	(	2.2%	)	1	(	0.7%	)	1	(	0.7%	)	0	(	0.0%	)	1	(	0.7%	)	1	(	0.7%	)	0	(	0.0%	)	0	(	0.0%	)	2	(	1.4%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.7%	)	0	(	0.0%	)	1	(	0.7%	)	0	(	0.0%	)	1	(	0.7%	)	0	(	0.0%	)	1	(	0.7%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.7%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.7%	)	1	(	0.7%	)	1	(	0.7%	)	1	(	0.7%	)	0	(	0.0%	)	2	(	1.4%	)	1	(	0.7%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.7%	)	0	(	0.0%	)	114	(	82.6%	)	138	(	100.0%	)
PITT	2	(	1.1%	)	24	(	13.1%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	2	(	1.1%	)	1	(	0.5%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	1	(	0.5%	)	0	(	0.0%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	142	(	77.6%	)	183	(	100.0%	)
Total	8	(	1.6%	)	69	(	13.8%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	4	(	0.8%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	2	(	0.4%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	2	(	0.4%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	1	(	0.2%	)	379	(	75.8%	)	500	(	100.0%	)
Χ² = 117.5287 df = 80 p = .0040


Recent Surgery -

Currently includes:
* Oral surgeries - Tooth extraction, Wisdom Tooth Removal, etc.
* Procedures - Endoscopy, Colonoscopy, Biopsy, Esophagus stretched, kidney stones removed, etc.
* Needs clarification (to look up): "Neuroma on foot", "Hernia"

Creating Variables...

history_recent_ill_resolved - 1/0 If notes specify the illness was resolved...
history_recent_ill_TemporalScope_3/6 - 1/0 Illness within the past 3/6 months (longer than 6mo. below)...

history_recent_hospital_Type - Interpreted from specifications...

history_recent_surgery_surgery - 1/0 partially parsed for procedures vs. surgery (0s below...)
history_recent_surgery_Temporal_Scope_yr - 1/0 If notes specify the surgery was within the past year
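
A minimal sketch of how a 1/0 flag such as history_recent_ill_resolved could be derived from free-text notes. The helper name `flag_resolved`, the keyword pattern, and the toy notes are assumptions for illustration only; they are not the recode actually used for these variables.

```r
# Hypothetical helper: 1 if a note mentions resolution, 0 if it does not,
# NA if there is no note at all.
flag_resolved <- function(notes) {
  # Assumed keyword list; includes the misspelling seen in the raw entries.
  pattern <- "resol|resloved|went away|all better|good after"
  ifelse(is.na(notes), NA_integer_,
         as.integer(grepl(pattern, notes, ignore.case = TRUE)))
}

notes <- c("cold symptoms- resloved", "Shingles", NA,
           "Had poison ivy-went on steroids for it. All better now")
resolved <- flag_resolved(notes)
```

Keyword matching like this only gives a first pass; entries such as "Doesn't feel like it will affect exercise" would still need manual review.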

HHQ language...

**First language not English**

### HHQ Summary Variables...

HHQ_Mean.Score - Mean HHQ pain score (pain factors 1:13).
HHQ_Sum.Score - Summation of HHQ pain factors 1:13
HHQ_Health_Status.Factor - Ordered categorical factor (summation of recent event factors)
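
A minimal sketch of the sum and mean scores described above, using base R row-wise helpers. The column names `pain_1` to `pain_3` and the toy values are hypothetical stand-ins for the 13 HHQ pain factors; treating missing items with `na.rm = TRUE` is also an assumption.

```r
# Toy data: three hypothetical pain items (the real HHQ uses factors 1:13).
hhq <- data.frame(pain_1 = c(0, 1, 1),
                  pain_2 = c(0, 1, NA),
                  pain_3 = c(1, 1, 0))

# Select the item columns once, before adding the summary columns.
pain_cols <- grep("^pain_", names(hhq), value = TRUE)

# Row-wise sum and mean, ignoring missing items.
hhq$HHQ_Sum.Score  <- rowSums(hhq[, pain_cols], na.rm = TRUE)
hhq$HHQ_Mean.Score <- rowMeans(hhq[, pain_cols], na.rm = TRUE)
```

With `na.rm = TRUE`, a participant who skipped an item gets a mean over the answered items only, so the sum and mean scores are no longer proportional for those rows.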

HHQ_Health_Status.Factor

ggplot(data, aes(Site, HHQ_Health_Status.Factor,color=as.factor(Site)))+geom_boxplot()+theme(legend.position = "none")

# Dummy-code site membership (1/0); Site is compared as a character string.
data$Site<-as.character(data$Site)
data$UPitt<-if_else(data$Site  =="PITT"  ,1,0)
data$Kansas<-if_else(data$Site  =="KU"  ,1,0)
data$Northeastern<-if_else(data$Site  =="NEU"  ,1,0)
# Record IDs per site, taken from `data`, where the dummy codes were just created.
upitt<- subset(data, UPitt == 1)$record_id
ku<- subset(data, Kansas == 1)$record_id
neu<- subset(data, Northeastern == 1)$record_id
# Record IDs per HHQ_Health_Status.Factor level.
HHQ_Health_Status.Factor_1 <- subset(data, HHQ_Health_Status.Factor == "1")$record_id
HHQ_Health_Status.Factor_2 <- subset(data, HHQ_Health_Status.Factor == "2")$record_id
HHQ_Health_Status.Factor_3 <- subset(data, HHQ_Health_Status.Factor == "3")$record_id
# Restore Site as a factor for later tables and plots.
data$Site<-as.factor(data$Site)

# plotVenn() comes from nVennR, so load it before the first call.
library(nVennR)

ven<-plotVenn(list( Kansas=ku,UPitt=upitt,HHQ_Health_Status.Factor_3=HHQ_Health_Status.Factor_3,Northeastern=neu), nCycles = 20000,labelRegions=T, showPlot = T)
str <- charToRaw(ven$svg)
rsvg::rsvg_png(str, file = '~/HHQ_Health_Status.Factor_3_site_venn.png')


ven<-plotVenn(list( Kansas=ku,UPitt=upitt,HHQ_Health_Status.Factor_1=HHQ_Health_Status.Factor_1,Northeastern=neu), nCycles = 20000,labelRegions=T, showPlot = T)
str <- charToRaw(ven$svg)
rsvg::rsvg_png(str, file = '~/HHQ_Health_Status.Factor_1_site_venn.png')


ven<-plotVenn(list( Kansas=ku,UPitt=upitt,HHQ_Health_Status.Factor_2=HHQ_Health_Status.Factor_2,Northeastern=neu), nCycles = 20000,labelRegions=T, showPlot = T)
str <- charToRaw(ven$svg)
rsvg::rsvg_png(str, file = '~/HHQ_Health_Status.Factor_2_site_venn.png')

data$Recent_Hospitalization<-if_else(data$history_recent_hospital  =="1"  ,1,0)
data$PITT<-if_else(data$Site  =="PITT"  ,1,0)
data$KU<-if_else(data$Site  =="KU"  ,1,0)
data$NEU<-if_else(data$Site  =="NEU"  ,1,0)


PITT<- subset(data, PITT == "1")$record_id
KU <- subset(data, KU == "1")$record_id
NEU <- subset(data, NEU == "1")$record_id

Recent_Hospitalization <- subset(data, history_recent_hospital == "1")$record_id
data %$%  ctable(Site,HHQ_Sum.Score,chisq = TRUE, OR = TRUE, RR=TRUE, headings = FALSE) %>% print(method = "render")

	HHQ_Sum.Score
Site	0				1				2				3				4				5				6				7				8				9				10				Total
KU	44	(	24.6%	)	65	(	36.3%	)	38	(	21.2%	)	16	(	8.9%	)	12	(	6.7%	)	3	(	1.7%	)	0	(	0.0%	)	1	(	0.6%	)	0	(	0.0%	)	0	(	0.0%	)	0	(	0.0%	)	179	(	100.0%	)
NEU	16	(	11.6%	)	26	(	18.8%	)	29	(	21.0%	)	25	(	18.1%	)	16	(	11.6%	)	3	(	2.2%	)	15	(	10.9%	)	4	(	2.9%	)	1	(	0.7%	)	2	(	1.4%	)	1	(	0.7%	)	138	(	100.0%	)
PITT	30	(	16.4%	)	38	(	20.8%	)	38	(	20.8%	)	41	(	22.4%	)	14	(	7.7%	)	13	(	7.1%	)	6	(	3.3%	)	2	(	1.1%	)	1	(	0.5%	)	0	(	0.0%	)	0	(	0.0%	)	183	(	100.0%	)
Total	90	(	18.0%	)	129	(	25.8%	)	105	(	21.0%	)	82	(	16.4%	)	42	(	8.4%	)	19	(	3.8%	)	21	(	4.2%	)	7	(	1.4%	)	2	(	0.4%	)	2	(	0.4%	)	1	(	0.2%	)	500	(	100.0%	)
Χ² = 75.6749 df = 20 p = .0000


ven<-plotVenn(list(PITT=PITT, KU=KU, NEU=NEU, Recent_Hospitalization=Recent_Hospitalization), nCycles = 2000,labelRegions=T, showPlot = T)
str <- charToRaw(ven$svg)
rsvg::rsvg_png(str, file = '~/Recent_Hospitalization_Site_venn.png')

USP Classified Medications:

Need to check on the labeling scheme that EPICC used for the following...

usp_data<-readxl::read_excel("/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/Meds/OUT/IGNITE_MEDS_V3_April26.xlsx",sheet = "FINAL", skip = 3)
usp_data$record_id<-as.character(usp_data$`Record ID`)

Creating Variables...

usp_data_classified_rxs - Number of classified prescription medications
usp_data_classified_otcs - Number of classified OTC medications
usp_data_anticholinergic_rx.factor - Reports taking prescription anticholinergic medications
usp_data_anticholinergic_otc.factor - Reports taking OTC anticholinergic medications
usp_data_beta.factor - On a medication which includes beta blocking ingredients
usp_data_beta_oral.factor - Using oral beta blocking medication
Number of each of the major USP categories..
- Analgesics, Antidepressants, Anxiolytics, Cardiovascular Agents, Blood Products and Modifiers, Anti-Obesity Agents, Blood Glucose Regulators, Anticonvulsants, Antimigraine Agents, Antidementia Agents, Antiemetics, Antimyasthenic Agents, Antineoplastics, Antiparkinson Agents, Antispasticity Agents, Antibacterials, Antivirals, Antiparasitics, Antifungals, Antigout Agents, Anesthetics, Anti-Addiction/ Substance Abuse Treatment Agents, Central Nervous System Agents, Electrolytes/ Minerals/ Metals/ Vitamins, Gastrointestinal Agents, Genitourinary Agents, Hormonal Agents, Stimulant/ Replacement/ Modifying (Adrenal), Hormonal Agents, Stimulant/ Replacement/ Modifying (Sex Hormones/ Modifiers), Hormonal Agents, Stimulant/ Replacement/ Modifying (Thyroid), Immunological Agents, Inflammatory Bowel Disease Agents, Metabolic Bone Disease Agents, Respiratory Tract/ Pulmonary Agents, Sexual Disorder Agents, Skeletal Muscle Relaxants, Sleep Disorder Agents, Dental and Oral Agents, Dermatological Agents, Ophthalmic Agents

Prescription Anticholinergics

MED.df$RX_ANTICHOLINERGIC <- if_else(MED.df$OTC_RX=="Prescription" & MED.df$ANTICHOLINERGIC==1, 1,0)

RX_ANTICHOLINERGIC.df <- MED.df %>% filter(RX_ANTICHOLINERGIC==1)

RX_antichol_2<-RX_ANTICHOLINERGIC.df %>% filter(duplicated(RX_ANTICHOLINERGIC.df$`Record ID`)) %>% select(`Record ID`)

usp_data$usp_data_anticholinergic_rx.factor<-if_else(usp_data$record_id  %in% RX_ANTICHOLINERGIC.df$`Record ID`, 1, 0)

Over The Counter (OTC) Anticholinergics

MED.df$OTC_ANTICHOLINERGIC<-if_else(MED.df$OTC_RX=="OTC" & MED.df$ANTICHOLINERGIC==1,1,0)

OTC_ANTICHOLINERGIC.df<-MED.df %>% filter(OTC_ANTICHOLINERGIC==1)

OTC_ANTICHOL_2<-OTC_ANTICHOLINERGIC.df %>% filter(duplicated(OTC_ANTICHOLINERGIC.df$`Record ID`)) %>% select(`Record ID`)

usp_data$usp_data_anticholinergic_otc.factor<-if_else(usp_data$record_id  %in% OTC_ANTICHOLINERGIC.df$`Record ID`, 1, 0)

All Beta Blocking Ingredients

## CHECK Cardio Cols
RAW_MED_ENTRIES<-readxl::read_excel("/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/Meds/OUT/IGNITE_MEDS_V3_April26.xlsx",sheet = "ALL_MEDS" )

RAWCARD_1<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("bisoprolol"))==TRUE,]
RAWCARD_2<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("acebutolol"))==TRUE,]
RAWCARD_3<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("atenolol"))==TRUE,]
RAWCARD_4<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("metoprolol"))==TRUE,]
RAWCARD_5<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("nadolol"))==TRUE,]
RAWCARD_6<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("nebivolol"))==TRUE,]
# Propranolol gets its own object so it no longer overwrites the nebivolol rows.
RAWCARD_7<-RAW_MED_ENTRIES[str_detect(RAW_MED_ENTRIES$USP_DRUG, c("propranolol"))==TRUE,]
RAWCARD<-rbind(RAWCARD_1, RAWCARD_2,RAWCARD_3,RAWCARD_4,RAWCARD_5,RAWCARD_6,RAWCARD_7)

usp_data$usp_data_beta.factor<-if_else(usp_data$`Record ID` %in% unique(RAWCARD$`Record ID`), "1", "0")
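
The per-drug lookups above can also be expressed as one pattern built from a vector of ingredient names, which avoids accidentally overwriting an intermediate object. This is a sketch under stated assumptions: the toy `meds` table stands in for the ALL_MEDS sheet, `record_id` replaces the `Record ID` column for brevity, and the case-insensitive match is a choice made here (the original `str_detect` calls are case-sensitive).

```r
# Beta blocking ingredients listed in this section.
beta_blockers <- c("bisoprolol", "acebutolol", "atenolol", "metoprolol",
                   "nadolol", "nebivolol", "propranolol")
beta_pattern <- paste(beta_blockers, collapse = "|")

# Toy stand-in for the ALL_MEDS sheet (values are illustrative only).
meds <- data.frame(
  record_id = c("1", "1", "2", "3"),
  USP_DRUG  = c("metoprolol", "ibuprofen", "nebivolol", NA),
  stringsAsFactors = FALSE
)

# Rows whose drug name contains any listed ingredient (NA never matches).
hits <- meds[grepl(beta_pattern, meds$USP_DRUG, ignore.case = TRUE), ]

# One "1"/"0" flag per participant, mirroring usp_data_beta.factor.
ids <- unique(meds$record_id)
beta_flag <- ifelse(ids %in% hits$record_id, "1", "0")
names(beta_flag) <- ids
```

Adding or removing an ingredient then only means editing `beta_blockers`.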

Oral Beta Blocking Medications

RAWCARD_1<-MED.df[str_detect(MED.df$USP_DRUG, c("bisoprolol")) & str_detect(MED.df$CODE.factor, c("oral")) ==TRUE,]
RAWCARD_2<-MED.df[str_detect(MED.df$USP_DRUG, c("acebutolol"))& str_detect(MED.df$CODE.factor, c("oral"))==TRUE,]
RAWCARD_3<-MED.df[str_detect(MED.df$USP_DRUG, c("atenolol"))& str_detect(MED.df$CODE.factor, c("oral"))==TRUE,]
RAWCARD_4<-MED.df[str_detect(MED.df$USP_DRUG, c("metoprolol"))& str_detect(MED.df$CODE.factor, c("oral"))==TRUE,]
RAWCARD_5<-MED.df[str_detect(MED.df$USP_DRUG, c("nadolol"))& str_detect(MED.df$CODE.factor, c("oral"))==TRUE,]
RAWCARD_6<-MED.df[str_detect(MED.df$USP_DRUG, c("nebivolol"))& str_detect(MED.df$CODE.factor, c("oral"))==TRUE,]
# Propranolol gets its own object so it no longer overwrites the nebivolol rows.
RAWCARD_7<-MED.df[str_detect(MED.df$USP_DRUG, c("propranolol"))& str_detect(MED.df$CODE.factor, c("oral"))==TRUE,]
RAWCARD<-rbind(RAWCARD_1, RAWCARD_2,RAWCARD_3,RAWCARD_4,RAWCARD_5,RAWCARD_6,RAWCARD_7)
usp_data$usp_data_beta_oral.factor<-if_else(usp_data$`Record ID` %in% unique(RAWCARD$`Record ID`), "1", "0")
#usp_data<-usp_data %>% select(`Record ID`, usp_data_beta.factor, usp_data_beta_oral.factor)
# Make the join key character on both sides before merging.
data$record_id<-as.character(data$record_id)
data<-left_join(data,usp_data, by="record_id")

Manually Count USP Classified `BETA BLOCKERS/RELATED` Agents

usp_beta.n - USP defined beta blocking agents
usp_beta.factor - USP defined beta blocking agents

CARDIO_MEDS<-readxl::read_excel("/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/Meds/OUT/IGNITE_MEDS_V3_April26.xlsx",sheet = "CARDIO_1", skip = 2)
CARDIO_MEDS$record_id<-as.character(CARDIO_MEDS$Record_ID)
CARDIO_MEDS$usp_beta.n<-CARDIO_MEDS$`BETA BLOCKERS/RELATED`
CARDIO_MEDS$usp_beta.factor<-if_else(CARDIO_MEDS$usp_beta.n>=1, 1,0)
### BASED ON EPICC RENAME THE REST OF CARDIO VARIABLES AS ABOVE, IF NEEDED.
data<-left_join(data, CARDIO_MEDS)
CARDIO_MEDS_0<-CARDIO_MEDS

Merge HVLT Data

FILE<-list.files("/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/HH_Provided/", pattern="HV")
PATH<-"/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/HH_Provided/"
file<-paste(PATH, FILE, sep="")
na_values<-c("99991","99992","99993","99994", "99995","99996","99997","99998","99999", "999910")
file<-read_csv(file,  na=na_values)
file<-remove_empty_cols(file)
file$record_id<-as.character(file$screen_id)
file<-file[,-c(2:3)]
data<-left_join(data,file, by="record_id")
Randomized_dat<-data

Write out processed/merged variables...

write_csv(Randomized_dat, "/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/HHQ/OUT_DATA/Working_Recode_Vars.txt")

Import/Process Full RedCap Database as.is:

# Escape the dot and anchor the pattern so only files ending in ".r" match.
FILE<-list.files("/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/", pattern="\\.r$")
PATH<-"/Volumes/IGNITE_Imaging/QC_Output/R_IGNITE/RedCap/PRE/Data/"
file<-paste(PATH, FILE, sep="")
source(file)
data$record_id<-as.character(data$record_id)
# Drop every record_id containing "--" (this also covers the "--1" and "--2" suffixes).
data<-data %>% filter(!str_detect(record_id, fixed("--")))
data<-data[complete.cases(data$vo2_site),]
data<-data %>% select(record_id,  ses_year_education_2,  ses_year_education_2.factor,                   
                      starts_with(c("vo2", "cirs", "pss_", "ses_", "rand")),
                      contains(c("race", "score","bmi","nih", "adj_",
                                "hru_","tscore" , "_t")))
data$record_id<-as.character(data$record_id)
data<-left_join(Randomized_dat,data,by=("record_id"))

1. Education Variables

HHQ
MacArthur SES Scale

## RECODE
data$ses_earnings<-na_if(data$ses_earnings,10)
data$ses_earnings<-na_if(data$ses_earnings,11)
data$ses_earnings.factor<-na_if(data$ses_earnings.factor, "No response")
data$ses_earnings.factor<-na_if(data$ses_earnings.factor, "Dont Know")


data$EDU_HS<-if_else(str_detect(data$ses_year_education_2.factor, "High School")==TRUE, 1,0)
data$EDU_College<-if_else(str_detect(data$ses_year_education_2.factor, "College")==TRUE, 1,0)
data$EDU_Grad<-if_else(str_detect(data$ses_year_education_2.factor, "Graduate School")==TRUE, 1,0)

data$EDU.factor<-if_else(str_detect(data$ses_year_education_2.factor, "High School")==TRUE, "High School",as.character(data$ses_year_education_2.factor))
data$EDU.factor<-if_else(str_detect(data$EDU.factor, "College")==TRUE, "College",as.character(data$EDU.factor))
data$EDU.factor<-if_else(str_detect(data$ses_year_education_2.factor, "Graduate School")==TRUE, "Graduate",as.character(data$EDU.factor))
data$EDU.factor<-as.factor(data$EDU.factor)

Discrepancies- Education Variable

DT::datatable(data %>% select(educ,ses_year_education_2, ses_year_education_2.factor), rownames = FALSE, options = list(pageLength = 5))

2. VO2 Test Data

Absolute VO2 = liters of oxygen per minute (L/min)
Relative VO2 = milliliters of oxygen per kilogram of body mass per minute (mL/kg/min)
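
The two units are related by body mass: relative VO2 is absolute VO2 converted to mL/min and divided by weight in kg. A small sketch of that conversion follows; the function name `absolute_to_relative_vo2` and the example values are hypothetical.

```r
# Convert absolute VO2 (L/min) to relative VO2 (mL/kg/min).
absolute_to_relative_vo2 <- function(vo2_l_min, weight_kg) {
  vo2_l_min * 1000 / weight_kg
}

# Example: 1.75 L/min for a 70 kg participant.
rel_vo2 <- absolute_to_relative_vo2(1.75, 70)
```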

class(data$ses_year_education_2)<-"integer"
class(data$vo2_age)<-"integer"
class(data$vo2sum_peak_ml)<-"integer"
class(data$vo2_sex)<-"integer"

####data$vo2sum_peak_ml<-data$vo2sum_ma
data$BMI<-(data$vo2_data_weight/data$vo2_data_height/data$vo2_data_height)*10000
data$BMI<-labelled::remove_var_label(data$BMI)
data$BMI<-as.numeric(data$BMI)
class(data$BMI)="numeric" 

data$BMI_group<-if_else(data$BMI<18.5, 1, as.numeric(data$BMI)) #under
data$BMI_group<-if_else( (data$BMI_group>1 & data$BMI_group<24.9)  ,2 , as.numeric(data$BMI_group)) #healthy
data$BMI_group<-if_else(data$BMI_group<29.9  & data$BMI_group>2 ,3, as.numeric(data$BMI_group)) #over
data$BMI_group<-if_else(data$BMI_group<34.9 & data$BMI_group>3 ,4, as.numeric(data$BMI_group)) #obese I
data$BMI_group<-if_else(data$BMI_group<39.9 & data$BMI_group>4 ,5, as.numeric(data$BMI_group)) #obese II
data$BMI_group<-if_else(data$BMI_group<49.9 & data$BMI_group>5 ,6, as.numeric(data$BMI_group)) #obese III
data$BMI_group<-if_else(data$BMI_group>50 ,7, as.numeric(data$BMI_group)) #OUTLIER?
data$BMI_group<-as.character(data$BMI_group)

data$BMI_GROUP<-if_else(data$BMI_group=="1", "Underweight", data$BMI_group)
data$BMI_GROUP<-if_else(data$BMI_group=="2", "Healthy", data$BMI_GROUP)
data$BMI_GROUP<-if_else(data$BMI_group=="3", "Overweight", data$BMI_GROUP)
data$BMI_GROUP<-if_else(data$BMI_group=="4", "Obese I", data$BMI_GROUP)
data$BMI_GROUP<-if_else(data$BMI_group=="5", "Obese II", data$BMI_GROUP)
data$BMI_GROUP<-if_else(data$BMI_group=="6", "Obese III", data$BMI_GROUP)
data$BMI_GROUP<-if_else(data$BMI_group=="7", "ERROR", data$BMI_GROUP)
#data %>% filter(data$BMI_GROUP=="ERROR")
#data<-data[!(data$BMI_GROUP=="ERROR"),] 
#data %>% filter(is.na(data$BMI_GROUP)==TRUE)
data$BMI_GROUP<-factor(data$BMI_GROUP, levels=c("Underweight", "Healthy", "Overweight","Obese I", "Obese II", "Obese III"))
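
As an alternative to the chained `if_else` recode above, the same kind of grouping can be sketched with base R `cut()`. The break points below are the commonly used 18.5/25/30/35/40 cut-offs, which is an assumption: they differ slightly from the 24.9/29.9/... thresholds in the chunk above, and this sketch does not flag values above 50 as errors.

```r
# Toy BMI values, one per category plus a missing value.
bmi <- c(17.2, 22.0, 27.5, 31.0, 36.4, 42.0, NA)

# right = FALSE makes each interval closed on the left: [18.5, 25), [25, 30), ...
bmi_group <- cut(
  bmi,
  breaks = c(-Inf, 18.5, 25, 30, 35, 40, Inf),
  labels = c("Underweight", "Healthy", "Overweight",
             "Obese I", "Obese II", "Obese III"),
  right = FALSE
)
```

Because the intervals are contiguous, no value can fall between two categories, and missing BMI stays `NA`.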


data$BMI_GROUP<-as.character(data$BMI_GROUP)
data$BMI_Underweight<-if_else(data$BMI_GROUP=="Underweight", 1,0)
data$BMI_Healthy<-if_else(data$BMI_GROUP== "Healthy", 1,0)
data$BMI_Obese<-if_else(data$BMI_GROUP== "Overweight", "1",as.character(data$BMI_GROUP))
data$BMI_Obese<-if_else(data$BMI_Obese=="Obese I", "1",as.character(data$BMI_Obese))
data$BMI_Obese<-if_else(data$BMI_Obese=="Obese II", "1",as.character(data$BMI_Obese))
data$BMI_Obese<-if_else(data$BMI_Obese=="Obese III", "1",as.character(data$BMI_Obese))
data$BMI_Obese<-if_else(data$BMI_Obese=="Healthy", "0",as.character(data$BMI_Obese))
data$BMI_Obese<-if_else(data$BMI_Obese=="Underweight", "0",as.character(data$BMI_Obese))

Discrepancies- beta blocker variables

# vo2_data_beta != VA Classifiers != custom gathered oral beta blocking ingredients.
data$usp_data_beta.factor<-as.factor(data$usp_data_beta.factor)
data %>%
  filter(vo2_data_beta != usp_beta.factor |
           vo2_data_beta != usp_data_beta_oral.factor |
           vo2_data_beta != usp_data_beta.factor
         ) %>%
  # Show every indicator used in the filter, including usp_beta.factor.
  select(record_id,vo2_data_beta,usp_beta.factor,usp_data_beta_oral.factor,usp_data_beta.factor ) %>%
  DT::datatable(rownames=FALSE, options = list(pageLength = 5))
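
A minimal base R sketch of the same discrepancy check, flagging any participant whose non-missing beta blocker indicators disagree. The toy values are illustrative only, and ignoring missing indicators rather than flagging them is an assumption made here.

```r
# Toy data with three beta blocker indicators stored in mixed types.
bb <- data.frame(
  record_id                 = c("1", "2", "3", "4"),
  vo2_data_beta             = c(1, 0, 1, NA),
  usp_beta.factor           = c(1, 0, 0, 0),
  usp_data_beta_oral.factor = c("1", "0", "1", "0"),
  stringsAsFactors = FALSE
)

# Coerce every indicator to integer so numeric and character columns compare cleanly.
flags <- sapply(bb[, -1], function(x) as.integer(as.character(x)))

# A row disagrees when its non-missing indicators take more than one value.
disagree <- apply(flags, 1, function(r) length(unique(r[!is.na(r)])) > 1)
discrepant_ids <- bb$record_id[disagree]
```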

3. CIRS Data

#TBA

4. Hru Data

#TBA

Mean Tables

Site

	KU (N=179)	NEU (N=138)	PITT (N=183)	Overall (N=500)
usp_data_classified_rxs
Mean (SD)	3.70 (3.05)	3.69 (3.01)	3.91 (3.03)	3.77 (3.03)
Median [Min, Max]	3.00 [0, 19.0]	3.00 [0, 15.0]	3.00 [0, 21.0]	3.00 [0, 21.0]
usp_data_classified_otcs
Mean (SD)	2.64 (2.15)	2.79 (2.00)	3.31 (2.31)	2.93 (2.19)
Median [Min, Max]	2.00 [0, 10.0]	3.00 [0, 10.0]	3.00 [0, 10.0]	2.00 [0, 10.0]
Weekly_alc.score
Mean (SD)	3.17 (5.17)	3.32 (4.69)	2.84 (4.76)	3.09 (4.89)
Median [Min, Max]	1.00 [0, 28.0]	1.00 [0, 21.0]	1.00 [0, 30.0]	1.00 [0, 30.0]
daily_caffeine
Mean (SD)	2.52 (1.93)	2.52 (1.74)	2.87 (2.83)	2.65 (2.27)
Median [Min, Max]	2.00 [0, 10.0]	2.00 [0, 10.0]	2.50 [0, 30.0]	2.00 [0, 30.0]
Missing	1 (0.6%)	3 (2.2%)	0 (0%)	4 (0.8%)
yrs_smoke
Mean (SD)	1.03 (6.50)	1.12 (6.91)	0.656 (5.20)	0.918 (6.17)
Median [Min, Max]	0 [0, 50.0]	0 [0, 54.0]	0 [0, 50.0]	0 [0, 54.0]
prior_smoke_yrs
Mean (SD)	5.49 (10.6)	5.95 (9.26)	9.99 (13.4)	7.26 (11.6)
Median [Min, Max]	0 [0, 59.0]	0 [0, 40.0]	0 [0, 50.0]	0 [0, 59.0]
hvlt_total_recall_tscore
Mean (SD)	54.1 (8.64)	53.1 (9.10)	53.8 (8.86)	53.7 (8.84)
Median [Min, Max]	54.0 [28.0, 73.0]	54.0 [31.0, 71.0]	54.0 [29.0, 72.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	55.9 (8.70)	52.3 (11.5)	53.6 (10.9)	54.1 (10.4)
Median [Min, Max]	58.0 [27.0, 68.0]	55.0 [24.0, 68.0]	55.0 [20.0, 68.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	2.91 (2.23)	3.28 (2.32)	3.72 (2.56)	3.31 (2.40)
Median [Min, Max]	3.00 [0, 12.0]	3.00 [0, 11.0]	3.00 [0, 12.0]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	22.6 (4.97)	22.3 (5.18)	20.4 (4.63)	21.7 (5.00)
Median [Min, Max]	22.0 [12.0, 39.0]	22.0 [11.0, 34.0]	20.0 [11.0, 34.0]	21.0 [11.0, 39.0]
Max RER
Mean (SD)	1.09 (0.0652)	1.10 (0.0817)	1.09 (0.0783)	1.09 (0.0748)
Median [Min, Max]	1.09 [0.930, 1.28]	1.10 [0.870, 1.36]	1.10 [0.880, 1.30]	1.09 [0.870, 1.36]

Sex differences-

	Male (N=142)	Female (N=358)	P-value
usp_data_classified_rxs
Mean (SD)	4.02 (2.79)	3.68 (3.11)	0.229
Median [Min, Max]	4.00 [0, 12.0]	3.00 [0, 21.0]
usp_data_classified_otcs
Mean (SD)	2.27 (1.86)	3.18 (2.26)	<0.001
Median [Min, Max]	2.00 [0, 9.00]	3.00 [0, 10.0]
Weekly_alc.score
Mean (SD)	4.71 (6.59)	2.44 (3.85)	<0.001
Median [Min, Max]	2.00 [0, 30.0]	1.00 [0, 21.0]
pack_day
Mean (SD)	1.20 (0.759)	1.01 (0.720)	0.067
Median [Min, Max]	1.00 [0.0330, 3.00]	1.00 [0.00500, 3.00]
Missing	67 (47.2%)	208 (58.1%)
smoke_yrs
Mean (SD)	17.7 (13.4)	18.2 (13.0)	0.822
Median [Min, Max]	15.0 [0.167, 59.0]	15.0 [1.00, 54.0]
Missing	66 (46.5%)	207 (57.8%)
daily_caffeine
Mean (SD)	3.08 (3.05)	2.48 (1.84)	0.03
Median [Min, Max]	3.00 [0, 30.0]	2.00 [0, 12.0]
Missing	0 (0%)	4 (1.1%)
hvlt_total_recall_tscore
Mean (SD)	50.5 (8.64)	55.0 (8.60)	<0.001
Median [Min, Max]	50.0 [28.0, 71.0]	55.0 [31.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	53.0 (10.5)	54.5 (10.4)	0.156
Median [Min, Max]	55.0 [24.0, 68.0]	58.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	3.33 (2.48)	3.30 (2.38)	0.895
Median [Min, Max]	3.00 [0, 11.0]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	24.2 (5.52)	20.7 (4.42)	<0.001
Median [Min, Max]	24.5 [11.0, 39.0]	20.0 [11.0, 34.0]
Max RER
Mean (SD)	1.10 (0.0754)	1.09 (0.0744)	0.087
Median [Min, Max]	1.09 [0.890, 1.33]	1.10 [0.870, 1.36]

Smoking Demos -

Cigarettes only

	0 (N=488)	1 (N=12)	P-value
usp_data_classified_rxs
Mean (SD)	3.76 (2.95)	4.25 (5.55)	0.767
Median [Min, Max]	3.00 [0, 19.0]	3.00 [0, 21.0]
Weekly_alc.score
Mean (SD)	3.08 (4.86)	3.54 (6.01)	0.796
Median [Min, Max]	1.00 [0, 30.0]	1.00 [0, 20.5]
pack_day
Mean (SD)	1.11 (0.736)	0.365 (0.212)	<0.001
Median [Min, Max]	1.00 [0.0140, 3.00]	0.330 [0.00500, 0.750]
Missing	274 (56.1%)	1 (8.3%)
smoke_yrs
Mean (SD)	16.9 (12.2)	38.3 (13.0)	<0.001
Median [Min, Max]	15.0 [0.167, 59.0]	40.5 [15.0, 54.0]
Missing	273 (55.9%)	0 (0%)
hvlt_total_recall_tscore
Mean (SD)	53.7 (8.88)	54.0 (7.22)	0.884
Median [Min, Max]	54.0 [28.0, 73.0]	51.5 [42.0, 70.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	54.2 (10.4)	48.7 (12.0)	0.141
Median [Min, Max]	55.0 [20.0, 68.0]	51.5 [27.0, 67.0]
CIRS Total Score
Mean (SD)	3.31 (2.42)	3.17 (1.80)	0.79
Median [Min, Max]	3.00 [0, 12.0]	2.50 [0, 6.00]
Peak VO2 (ml/kg/min):
Mean (SD)	21.8 (5.01)	18.0 (3.22)	0.002
Median [Min, Max]	22.0 [11.0, 39.0]	18.0 [12.0, 25.0]
Max RER
Mean (SD)	1.09 (0.0751)	1.07 (0.0599)	0.125
Median [Min, Max]	1.10 [0.870, 1.36]	1.07 [0.950, 1.17]

All Inhaled Alternatives

	0 (N=476)	1 (N=24)	P-value
usp_data_classified_rxs
Mean (SD)	3.74 (2.96)	4.46 (4.21)	0.417
Median [Min, Max]	3.00 [0, 19.0]	3.50 [0, 21.0]
Weekly_alc.score
Mean (SD)	2.94 (4.68)	6.04 (7.50)	0.056
Median [Min, Max]	1.00 [0, 30.0]	3.75 [0, 24.0]
pack_day
Mean (SD)	1.10 (0.747)	0.751 (0.545)	0.013
Median [Min, Max]	1.00 [0.0140, 3.00]	0.625 [0.00500, 2.00]
Missing	271 (56.9%)	4 (16.7%)
smoke_yrs
Mean (SD)	16.8 (12.3)	30.1 (14.9)	<0.001
Median [Min, Max]	15.0 [0.167, 59.0]	30.0 [3.00, 54.0]
Missing	270 (56.7%)	3 (12.5%)
hvlt_total_recall_tscore
Mean (SD)	53.8 (8.87)	52.0 (8.23)	0.325
Median [Min, Max]	54.0 [28.0, 73.0]	51.5 [34.0, 70.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	54.3 (10.3)	49.1 (11.2)	0.034
Median [Min, Max]	55.0 [20.0, 68.0]	49.0 [27.0, 67.0]
CIRS Total Score
Mean (SD)	3.30 (2.41)	3.46 (2.32)	0.748
Median [Min, Max]	3.00 [0, 12.0]	3.50 [0, 9.00]
Peak VO2 (ml/kg/min):
Mean (SD)	21.8 (4.98)	20.8 (5.48)	0.43
Median [Min, Max]	21.5 [11.0, 39.0]	20.0 [12.0, 32.0]
Max RER
Mean (SD)	1.09 (0.0754)	1.08 (0.0628)	0.415
Median [Min, Max]	1.10 [0.870, 1.36]	1.08 [0.950, 1.22]

Summated Smoking factors

	0 (N=269)	1 (N=221)	2 (N=10)	Overall (N=500)
usp_data_classified_rxs
Mean (SD)	3.56 (2.83)	4.01 (3.27)	4.40 (2.37)	3.77 (3.03)
Median [Min, Max]	3.00 [0, 13.0]	3.00 [0, 21.0]	4.50 [1.00, 8.00]	3.00 [0, 21.0]
Weekly_alc.score
Mean (SD)	2.42 (3.78)	3.66 (5.61)	8.35 (8.84)	3.09 (4.89)
Median [Min, Max]	1.00 [0, 21.0]	1.00 [0, 30.0]	5.50 [0, 24.0]	1.00 [0, 30.0]
pack_day
Mean (SD)	NA (NA)	1.07 (0.747)	1.13 (0.502)	1.07 (0.737)
Median [Min, Max]	NA [NA, NA]	1.00 [0.00500, 3.00]	1.00 [0.330, 2.00]	1.00 [0.00500, 3.00]
Missing	269 (100%)	6 (2.7%)	0 (0%)	275 (55.0%)
smoke_yrs
Mean (SD)	NA (NA)	17.8 (13.1)	22.3 (13.3)	18.0 (13.1)
Median [Min, Max]	NA [NA, NA]	15.0 [0.167, 59.0]	20.3 [3.00, 50.0]	15.0 [0.167, 59.0]
Missing	269 (100%)	4 (1.8%)	0 (0%)	273 (54.6%)
hvlt_total_recall_tscore
Mean (SD)	53.5 (9.21)	54.0 (8.47)	53.1 (6.87)	53.7 (8.84)
Median [Min, Max]	54.0 [29.0, 72.0]	54.0 [28.0, 73.0]	54.0 [42.0, 64.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	54.9 (9.98)	53.2 (11.0)	52.1 (8.24)	54.1 (10.4)
Median [Min, Max]	58.0 [27.0, 68.0]	55.0 [20.0, 68.0]	51.5 [41.0, 63.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	3.16 (2.29)	3.47 (2.51)	3.90 (2.73)	3.31 (2.40)
Median [Min, Max]	3.00 [0, 11.0]	3.00 [0, 12.0]	4.00 [1.00, 9.00]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	21.9 (4.88)	21.4 (5.08)	22.9 (6.45)	21.7 (5.00)
Median [Min, Max]	22.0 [11.0, 36.0]	21.0 [11.0, 39.0]	21.5 [14.0, 32.0]	21.0 [11.0, 39.0]
Max RER
Mean (SD)	1.10 (0.0755)	1.09 (0.0743)	1.11 (0.0652)	1.09 (0.0748)
Median [Min, Max]	1.10 [0.880, 1.30]	1.08 [0.870, 1.36]	1.12 [1.02, 1.22]	1.09 [0.870, 1.36]

Smoking Demos Stratified

	Never Smoked (N=269)	Former Cigarette User (N=207)	Primary Alternative User (N=3)	Secondary Alternative User (N=10)	Primary Cigarette User (N=11)	Overall (N=500)
usp_data_classified_rxs
Mean (SD)	3.56 (2.83)	3.98 (3.11)	6.00 (2.65)	4.40 (2.37)	4.09 (5.79)	3.77 (3.03)
Median [Min, Max]	3.00 [0, 13.0]	3.00 [0, 19.0]	5.00 [4.00, 9.00]	4.50 [1.00, 8.00]	3.00 [0, 21.0]	3.00 [0, 21.0]
Weekly_alc.score
Mean (SD)	2.42 (3.78)	3.62 (5.58)	6.33 (7.09)	8.35 (8.84)	3.86 (6.20)	3.09 (4.89)
Median [Min, Max]	1.00 [0, 21.0]	1.00 [0, 30.0]	5.00 [0, 14.0]	5.50 [0, 24.0]	1.00 [0, 20.5]	1.00 [0, 30.0]
pack_day
Mean (SD)	NA (NA)	1.10 (0.747)	NA (NA)	1.13 (0.502)	0.369 (0.223)	1.07 (0.737)
Median [Min, Max]	NA [NA, NA]	1.00 [0.0140, 3.00]	NA [NA, NA]	1.00 [0.330, 2.00]	0.415 [0.00500, 0.750]	1.00 [0.00500, 3.00]
Missing	269 (100%)	2 (1.0%)	3 (100%)	0 (0%)	1 (9.1%)	275 (55.0%)
smoke_yrs
Mean (SD)	NA (NA)	16.8 (12.3)	NA (NA)	22.3 (13.3)	37.2 (13.1)	18.0 (13.1)
Median [Min, Max]	NA [NA, NA]	15.0 [0.167, 59.0]	NA [NA, NA]	20.3 [3.00, 50.0]	40.0 [15.0, 54.0]	15.0 [0.167, 59.0]
Missing	269 (100%)	1 (0.5%)	3 (100%)	0 (0%)	0 (0%)	273 (54.6%)
hvlt_total_recall_tscore
Mean (SD)	53.5 (9.21)	54.2 (8.40)	42.7 (11.7)	53.1 (6.87)	53.6 (7.46)	53.7 (8.84)
Median [Min, Max]	54.0 [29.0, 72.0]	54.0 [28.0, 73.0]	38.0 [34.0, 56.0]	54.0 [42.0, 64.0]	51.0 [42.0, 70.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	54.9 (9.98)	53.6 (10.7)	44.0 (17.3)	52.1 (8.24)	47.7 (12.1)	54.1 (10.4)
Median [Min, Max]	58.0 [27.0, 68.0]	55.0 [20.0, 68.0]	35.0 [33.0, 64.0]	51.5 [41.0, 63.0]	49.0 [27.0, 67.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	3.16 (2.29)	3.49 (2.54)	3.67 (3.21)	3.90 (2.73)	3.00 (1.79)	3.31 (2.40)
Median [Min, Max]	3.00 [0, 11.0]	3.00 [0, 12.0]	5.00 [0, 6.00]	4.00 [1.00, 9.00]	2.00 [0, 6.00]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	21.9 (4.88)	21.6 (5.11)	24.3 (4.16)	22.9 (6.45)	18.0 (3.38)	21.7 (5.00)
Median [Min, Max]	22.0 [11.0, 36.0]	21.0 [11.0, 39.0]	23.0 [21.0, 29.0]	21.5 [14.0, 32.0]	18.0 [12.0, 25.0]	21.0 [11.0, 39.0]
Max RER
Mean (SD)	1.10 (0.0755)	1.09 (0.0752)	1.08 (0.0379)	1.11 (0.0652)	1.06 (0.0602)	1.09 (0.0748)
Median [Min, Max]	1.10 [0.880, 1.30]	1.09 [0.870, 1.36]	1.06 [1.05, 1.12]	1.12 [1.02, 1.22]	1.06 [0.950, 1.17]	1.09 [0.870, 1.36]

Corrected Smoking Status (all inhaled alternatives)...

	Current (N=24)	Former (N=207)	Never (N=268)	Overall (N=500)
usp_data_classified_rxs
Mean (SD)	4.46 (4.21)	3.98 (3.11)	3.55 (2.83)	3.77 (3.03)
Median [Min, Max]	3.50 [0, 21.0]	3.00 [0, 19.0]	3.00 [0, 13.0]	3.00 [0, 21.0]
Weekly_alc.score
Mean (SD)	6.04 (7.50)	3.62 (5.58)	2.43 (3.79)	3.09 (4.89)
Median [Min, Max]	3.75 [0, 24.0]	1.00 [0, 30.0]	1.00 [0, 21.0]	1.00 [0, 30.0]
pack_day
Mean (SD)	0.751 (0.545)	1.10 (0.747)	NA (NA)	1.07 (0.737)
Median [Min, Max]	0.625 [0.00500, 2.00]	1.00 [0.0140, 3.00]	NA [NA, NA]	1.00 [0.00500, 3.00]
Missing	4 (16.7%)	2 (1.0%)	268 (100%)	275 (55.0%)
smoke_yrs
Mean (SD)	30.1 (14.9)	16.8 (12.3)	NA (NA)	18.0 (13.1)
Median [Min, Max]	30.0 [3.00, 54.0]	15.0 [0.167, 59.0]	NA [NA, NA]	15.0 [0.167, 59.0]
Missing	3 (12.5%)	1 (0.5%)	268 (100%)	273 (54.6%)
hvlt_total_recall_tscore
Mean (SD)	52.0 (8.23)	54.2 (8.40)	53.4 (9.19)	53.7 (8.84)
Median [Min, Max]	51.5 [34.0, 70.0]	54.0 [28.0, 73.0]	54.0 [29.0, 72.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	49.1 (11.2)	53.6 (10.7)	54.8 (9.97)	54.1 (10.4)
Median [Min, Max]	49.0 [27.0, 67.0]	55.0 [20.0, 68.0]	58.0 [27.0, 68.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	3.46 (2.32)	3.49 (2.54)	3.15 (2.30)	3.31 (2.40)
Median [Min, Max]	3.50 [0, 9.00]	3.00 [0, 12.0]	3.00 [0, 11.0]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	20.8 (5.48)	21.6 (5.11)	21.9 (4.89)	21.7 (5.00)
Median [Min, Max]	20.0 [12.0, 32.0]	21.0 [11.0, 39.0]	22.0 [11.0, 36.0]	21.0 [11.0, 39.0]

Adjudication Outcomes

	Performing poorly on Memory tests (N=67)	Performing poorly on Non-Memory tests (N=77)	Performing within normal ranges on tests (N=356)	Overall (N=500)
usp_data_classified_rxs
Mean (SD)	4.49 (3.32)	3.91 (3.19)	3.61 (2.92)	3.77 (3.03)
Median [Min, Max]	4.00 [0, 15.0]	3.00 [0, 19.0]	3.00 [0, 21.0]	3.00 [0, 21.0]
usp_data_classified_otcs
Mean (SD)	2.43 (1.94)	3.06 (2.15)	2.99 (2.23)	2.93 (2.19)
Median [Min, Max]	2.00 [0, 8.00]	3.00 [0, 9.00]	3.00 [0, 10.0]	2.00 [0, 10.0]
Weekly_alc.score
Mean (SD)	3.87 (4.39)	2.56 (3.89)	3.06 (5.16)	3.09 (4.89)
Median [Min, Max]	2.00 [0, 17.5]	1.00 [0, 20.0]	1.00 [0, 30.0]	1.00 [0, 30.0]
daily_caffeine
Mean (SD)	2.76 (1.95)	2.49 (1.92)	2.66 (2.39)	2.65 (2.27)
Median [Min, Max]	2.50 [0, 8.50]	2.00 [0, 10.0]	2.00 [0, 30.0]	2.00 [0, 30.0]
Missing	0 (0%)	1 (1.3%)	3 (0.8%)	4 (0.8%)
pack_day
Mean (SD)	0.975 (0.554)	1.03 (0.579)	1.10 (0.797)	1.07 (0.737)
Median [Min, Max]	1.00 [0.140, 2.00]	1.00 [0.250, 3.00]	1.00 [0.00500, 3.00]	1.00 [0.00500, 3.00]
Missing	40 (59.7%)	39 (50.6%)	196 (55.1%)	275 (55.0%)
smoke_yrs
Mean (SD)	19.8 (15.2)	14.8 (10.3)	18.5 (13.3)	18.0 (13.1)
Median [Min, Max]	15.0 [0.167, 50.0]	12.0 [0.230, 40.0]	15.0 [1.00, 59.0]	15.0 [0.167, 59.0]
Missing	40 (59.7%)	38 (49.4%)	195 (54.8%)	273 (54.6%)
hvlt_total_recall_tscore
Mean (SD)	44.5 (8.10)	52.3 (7.40)	55.7 (8.07)	53.7 (8.84)
Median [Min, Max]	44.0 [28.0, 60.0]	52.0 [34.0, 69.0]	56.0 [35.0, 73.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	39.0 (10.3)	53.2 (7.78)	57.1 (8.21)	54.1 (10.4)
Median [Min, Max]	37.0 [20.0, 67.0]	53.0 [36.0, 68.0]	59.0 [33.0, 68.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	3.33 (2.09)	3.34 (2.56)	3.30 (2.43)	3.31 (2.40)
Median [Min, Max]	3.00 [0, 9.00]	4.00 [0, 12.0]	3.00 [0, 12.0]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	21.2 (5.31)	21.9 (4.55)	21.8 (5.05)	21.7 (5.00)
Median [Min, Max]	21.0 [11.0, 33.0]	21.0 [13.0, 35.0]	21.0 [11.0, 39.0]	21.0 [11.0, 39.0]
Max RER
Mean (SD)	1.08 (0.0806)	1.10 (0.0625)	1.10 (0.0762)	1.09 (0.0748)
Median [Min, Max]	1.08 [0.870, 1.28]	1.10 [0.930, 1.24]	1.10 [0.880, 1.36]	1.09 [0.870, 1.36]

ITT / Withdrew (?)

	No Treatment (N=6)	Treatment (N=494)	Overall (N=500)
usp_data_classified_rxs
Mean (SD)	2.17 (2.14)	3.79 (3.03)	3.77 (3.03)
Median [Min, Max]	1.50 [0, 6.00]	3.00 [0, 21.0]	3.00 [0, 21.0]
usp_data_classified_otcs
Mean (SD)	2.50 (1.38)	2.93 (2.20)	2.93 (2.19)
Median [Min, Max]	2.00 [1.00, 5.00]	2.00 [0, 10.0]	2.00 [0, 10.0]
Weekly_alc.score
Mean (SD)	6.17 (8.01)	3.05 (4.84)	3.09 (4.89)
Median [Min, Max]	3.50 [0, 20.0]	1.00 [0, 30.0]	1.00 [0, 30.0]
daily_caffeine
Mean (SD)	1.90 (2.25)	2.66 (2.27)	2.65 (2.27)
Median [Min, Max]	1.00 [0, 5.00]	2.00 [0, 30.0]	2.00 [0, 30.0]
Missing	1 (16.7%)	3 (0.6%)	4 (0.8%)
pack_day
Mean (SD)	2.00 (1.41)	1.06 (0.729)	1.07 (0.737)
Median [Min, Max]	2.00 [1.00, 3.00]	1.00 [0.00500, 3.00]	1.00 [0.00500, 3.00]
Missing	4 (66.7%)	271 (54.9%)	275 (55.0%)
smoke_yrs
Mean (SD)	12.3 (0.354)	18.1 (13.2)	18.0 (13.1)
Median [Min, Max]	12.3 [12.0, 12.5]	15.0 [0.167, 59.0]	15.0 [0.167, 59.0]
Missing	4 (66.7%)	269 (54.5%)	273 (54.6%)
hvlt_total_recall_tscore
Mean (SD)	53.7 (7.55)	53.7 (8.86)	53.7 (8.84)
Median [Min, Max]	55.5 [42.0, 63.0]	54.0 [28.0, 73.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	50.5 (8.62)	54.1 (10.4)	54.1 (10.4)
Median [Min, Max]	51.5 [39.0, 63.0]	55.0 [20.0, 68.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	1.33 (0.816)	3.33 (2.41)	3.31 (2.40)
Median [Min, Max]	1.50 [0, 2.00]	3.00 [0, 12.0]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	23.7 (6.83)	21.7 (4.98)	21.7 (5.00)
Median [Min, Max]	21.5 [17.0, 34.0]	21.0 [11.0, 39.0]	21.0 [11.0, 39.0]
BMI
Mean (SD)	28.4 (4.12)	30.1 (10.9)	30.0 (10.9)
Median [Min, Max]	27.1 [22.8, 33.7]	28.9 [18.0, 238]	28.9 [18.0, 238]
Missing	0 (0%)	1 (0.2%)	1 (0.2%)

COVID Timelines

	COVID (N=6)	Pre-Covid (N=494)	P-value
usp_data_classified_rxs
Mean (SD)	3.17 (2.64)	3.78 (3.03)	0.595
Median [Min, Max]	2.00 [1.00, 7.00]	3.00 [0, 21.0]
usp_data_classified_otcs
Mean (SD)	2.33 (1.37)	2.93 (2.19)	0.335
Median [Min, Max]	2.50 [0, 4.00]	2.00 [0, 10.0]
Weekly_alc.score
Mean (SD)	7.58 (7.07)	3.03 (4.84)	0.176
Median [Min, Max]	5.25 [1.00, 17.0]	1.00 [0, 30.0]
daily_caffeine
Mean (SD)	2.42 (1.56)	2.65 (2.27)	0.73
Median [Min, Max]	3.00 [0, 4.00]	2.00 [0, 30.0]
Missing	0 (0%)	4 (0.8%)
pack_day
Mean (SD)	1.13 (0.629)	1.07 (0.740)	0.876
Median [Min, Max]	1.00 [0.500, 2.00]	1.00 [0.00500, 3.00]
Missing	2 (33.3%)	273 (55.3%)
smoke_yrs
Mean (SD)	6.75 (5.56)	18.2 (13.1)	0.021
Median [Min, Max]	4.50 [3.00, 15.0]	15.0 [0.167, 59.0]
Missing	2 (33.3%)	271 (54.9%)
hvlt_total_recall_tscore
Mean (SD)	50.2 (9.83)	53.7 (8.83)	0.416
Median [Min, Max]	51.5 [37.0, 65.0]	54.0 [28.0, 73.0]
BVMT Delayed Recall (Norm) T Score
Mean (SD)	49.5 (14.0)	54.1 (10.4)	0.455
Median [Min, Max]	55.0 [32.0, 67.0]	55.0 [20.0, 68.0]
CIRS Total Score
Mean (SD)	3.83 (1.60)	3.30 (2.41)	0.457
Median [Min, Max]	4.00 [1.00, 6.00]	3.00 [0, 12.0]
Peak VO2 (ml/kg/min):
Mean (SD)	17.5 (3.94)	21.8 (5.00)	0.045
Median [Min, Max]	17.5 [11.0, 22.0]	21.0 [11.0, 39.0]
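
The table does not say which test produced the P-value column. As a rough check, and assuming a two-sided Welch t-test, a p-value can be recomputed from the rounded means, SDs, and non-missing Ns printed above. `welch_p` is a hypothetical helper introduced here for illustration, not part of the original analysis.

```r
# Hypothetical helper: two-sided Welch t-test p-value from summary statistics.
welch_p <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)
  # Welch-Satterthwaite degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  2 * pt(-abs(t_stat), df)
}

# smoke_yrs: N minus Missing gives 6 - 2 = 4 and 494 - 271 = 223
p_smoke <- welch_p(6.75, 5.56, 4, 18.2, 13.1, 223)

# Peak VO2: no Missing row is printed, so all 6 and 494 are assumed present
p_vo2 <- welch_p(17.5, 3.94, 6, 21.8, 5.00, 494)

round(c(smoke_yrs = p_smoke, peak_vo2 = p_vo2), 3)
```

With only 6 participants in the COVID group (4 with smoking data), any p-value here rests on very few observations and should be read with caution.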

Correlograms

Dummy Coded Groups (dispersion)

VO2/Physical Function/Fitness

Medications

Bar Charts (significant and trending chi-square results bolded)

Site Dispersion

Gender x Site

Race1 x Site

	vo2_site.factor
screen_race.factor	Pitt	Kansas	Northeastern	Total
African American / Black	28 (44.4%)	7 (11.1%)	28 (44.4%)	63 (100.0%)
Asian	1 (14.3%)	2 (28.6%)	4 (57.1%)	7 (100.0%)
Caucasian / White	152 (36.5%)	164 (39.4%)	100 (24.0%)	416 (100.0%)
American Indian or Alaska Native	0 (0.0%)	0 (0.0%)	0 (0.0%)	0 (0.0%)
Native Hawaiian or other Pacific Islander	0 (0.0%)	1 (100.0%)	0 (0.0%)	1 (100.0%)
Subject Refused to Answer	0 (0.0%)	0 (0.0%)	0 (0.0%)	0 (0.0%)
Bi-racial	0 (0.0%)	4 (66.7%)	2 (33.3%)	6 (100.0%)
Other	2 (28.6%)	1 (14.3%)	4 (57.1%)	7 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = NaN df = 14 p = NaN
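
The NaN statistic above is what `chisq.test()` returns when a table contains all-zero rows: their expected counts are 0, so the statistic includes 0/0 terms. The sketch below re-enters the printed counts and drops the empty categories before testing. It illustrates the mechanism only and is not the code that produced the table.

```r
# Counts copied from the Race1 x Site table (columns: Pitt, Kansas, Northeastern).
race_site <- matrix(
  c(28,   7,  28,
     1,   2,   4,
   152, 164, 100,
     0,   0,   0,
     0,   1,   0,
     0,   0,   0,
     0,   4,   2,
     2,   1,   4),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("African American / Black", "Asian", "Caucasian / White",
      "American Indian or Alaska Native",
      "Native Hawaiian or other Pacific Islander",
      "Subject Refused to Answer", "Bi-racial", "Other"),
    c("Pitt", "Kansas", "Northeastern")))

# All-zero rows give expected counts of 0, so the statistic becomes NaN (df = 14).
full <- suppressWarnings(chisq.test(race_site))

# Drop empty rows before testing; df falls to (6 - 1) * (3 - 1) = 10.
trimmed <- suppressWarnings(chisq.test(race_site[rowSums(race_site) > 0, ]))

full$statistic
trimmed
```

When working from a data frame, calling `droplevels()` on the factor before tabulating has the same effect. Several of the remaining cells are still sparse, so the chi-square approximation may be poor even after trimming.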

Generated by summarytools 0.9.9 (R version 3.6.1)
2021-09-14

Race2 x Site

Latina/Hispanic x Race

## 
## Other White 
##    84   416

	race.factor_white.factor
screen_race_la_his.factor	Other	White	Total
Yes	7 (38.9%)	11 (61.1%)	18 (100.0%)
No	77 (16.0%)	405 (84.0%)	482 (100.0%)
Total	84 (16.8%)	416 (83.2%)	500 (100.0%)
Χ² = 4.9817 df = 1 p = .0256 O.R. (95% C.I.) = 3.35 (1.26 - 8.90) R.R. (95% C.I.) = 2.43 (1.32 - 4.50)


Race3 x Site

	vo2_site.factor
race.factor	Pitt	Kansas	Northeastern	Total
Black	28 (44.4%)	7 (11.1%)	28 (44.4%)	63 (100.0%)
Other	3 (14.3%)	8 (38.1%)	10 (47.6%)	21 (100.0%)
White	152 (36.5%)	164 (39.4%)	100 (24.0%)	416 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 27.6371 df = 4 p < .0001


Latina/Hispanic x Site

	vo2_site.factor
screen_race_la_his.factor	Pitt	Kansas	Northeastern	Total
Yes	2 (11.1%)	8 (44.4%)	8 (44.4%)	18 (100.0%)
No	181 (37.6%)	171 (35.5%)	130 (27.0%)	482 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 5.6238 df = 2 p = .0601


Smoking.Status x Site

	vo2_site.factor
Smoking.Status	Pitt	Kansas	Northeastern	Total
Current	10 (41.7%)	9 (37.5%)	5 (20.8%)	24 (100.0%)
Former	86 (41.5%)	59 (28.5%)	62 (30.0%)	207 (100.0%)
Never	86 (32.1%)	111 (41.4%)	71 (26.5%)	268 (100.0%)
<NA>	1 (100.0%)	0 (0.0%)	0 (0.0%)	1 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 9.4292 df = 4 p = .0512


Heavy Drinkers x Site

	vo2_site.factor
Heavy_Drinker	Pitt	Kansas	Northeastern	Total
0	165 (37.8%)	153 (35.0%)	119 (27.2%)	437 (100.0%)
1	18 (28.6%)	26 (41.3%)	19 (30.2%)	63 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 2.0429 df = 2 p = .3601


Education x Site

	vo2_site.factor
educ	Pitt	Kansas	Northeastern	Total
10	2 (100.0%)	0 (0.0%)	0 (0.0%)	2 (100.0%)
11	0 (0.0%)	2 (100.0%)	0 (0.0%)	2 (100.0%)
12	14 (43.8%)	7 (21.9%)	11 (34.4%)	32 (100.0%)
13	10 (38.5%)	9 (34.6%)	7 (26.9%)	26 (100.0%)
14	17 (34.0%)	20 (40.0%)	13 (26.0%)	50 (100.0%)
15	4 (30.8%)	5 (38.5%)	4 (30.8%)	13 (100.0%)
16	62 (41.6%)	50 (33.6%)	37 (24.8%)	149 (100.0%)
17	4 (44.4%)	3 (33.3%)	2 (22.2%)	9 (100.0%)
18	56 (35.7%)	61 (38.9%)	40 (25.5%)	157 (100.0%)
19	6 (26.1%)	5 (21.7%)	12 (52.2%)	23 (100.0%)
20	8 (22.9%)	15 (42.9%)	12 (34.3%)	35 (100.0%)
21	0 (0.0%)	1 (100.0%)	0 (0.0%)	1 (100.0%)
23	0 (0.0%)	1 (100.0%)	0 (0.0%)	1 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 26.4333 df = 24 p = .3316


Handedness x Site

	vo2_site.factor
hand_nih_reg.factor	Pitt	Kansas	Northeastern	Total
Right	162 (36.4%)	168 (37.8%)	115 (25.8%)	445 (100.0%)
Left	21 (38.2%)	11 (20.0%)	23 (41.8%)	55 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 8.8779 df = 2 p = .0118


OTC Anticholinergic Status x Site

	vo2_site.factor
usp_data_anticholinergic_otc.factor	Pitt	Kansas	Northeastern	Total
0	174 (36.5%)	173 (36.3%)	130 (27.3%)	477 (100.0%)
1	9 (39.1%)	6 (26.1%)	8 (34.8%)	23 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 1.1282 df = 2 p = .5689


Adjudication Groups x Site

	vo2_site.factor
adj_cat	Pitt	Kansas	Northeastern	Total
Performing poorly on Memory tests	21 (31.3%)	18 (26.9%)	28 (41.8%)	67 (100.0%)
Performing poorly on Non-Memory tests	23 (29.9%)	35 (45.5%)	19 (24.7%)	77 (100.0%)
Performing within normal ranges on tests	139 (39.0%)	126 (35.4%)	91 (25.6%)	356 (100.0%)
Total	183 (36.6%)	179 (35.8%)	138 (27.6%)	500 (100.0%)
Χ² = 11.2185 df = 4 p = .0242


term	estimate	std.error	statistic	p.value
vo2_age	0.3	0.1	3.61	0.00
vo2_sex	4.6	0.8	6.07	0.00
educ	0.8	0.2	5.51	0.00
vo2_site.factorPitt	-0.7	7.2	-0.10	0.92
vo2_site.factorKansas	0.8	7.2	0.11	0.91
vo2_site.factorNortheastern	-0.8	7.2	-0.11	0.91
vo2_site.factorPitt:adj_catPerforming poorly on Non-Memory tests	8.2	2.3	3.61	0.00
vo2_site.factorKansas:adj_catPerforming poorly on Non-Memory tests	6.5	2.2	3.00	0.00
vo2_site.factorNortheastern:adj_catPerforming poorly on Non-Memory tests	4.8	2.2	2.16	0.03
vo2_site.factorPitt:adj_catPerforming within normal ranges on tests	11.1	1.8	6.32	0.00
vo2_site.factorKansas:adj_catPerforming within normal ranges on tests	9.3	1.9	4.92	0.00
vo2_site.factorNortheastern:adj_catPerforming within normal ranges on tests	11.3	1.6	6.99	0.00

Adjudication Outcomes

Education x Adjudication Groups

	educ
adj_cat	10	11	12	13	14	15	16	17	18	19	20	21	23	Total
Performing poorly on Memory tests	0 (0.0%)	0 (0.0%)	7 (10.4%)	4 (6.0%)	9 (13.4%)	1 (1.5%)	16 (23.9%)	2 (3.0%)	21 (31.3%)	2 (3.0%)	4 (6.0%)	0 (0.0%)	1 (1.5%)	67 (100.0%)
Performing poorly on Non-Memory tests	0 (0.0%)	1 (1.3%)	6 (7.8%)	1 (1.3%)	4 (5.2%)	1 (1.3%)	27 (35.1%)	1 (1.3%)	27 (35.1%)	4 (5.2%)	4 (5.2%)	1 (1.3%)	0 (0.0%)	77 (100.0%)
Performing within normal ranges on tests	2 (0.6%)	1 (0.3%)	19 (5.3%)	21 (5.9%)	37 (10.4%)	11 (3.1%)	106 (29.8%)	6 (1.7%)	109 (30.6%)	17 (4.8%)	27 (7.6%)	0 (0.0%)	0 (0.0%)	356 (100.0%)
Total	2 (0.4%)	2 (0.4%)	32 (6.4%)	26 (5.2%)	50 (10.0%)	13 (2.6%)	149 (29.8%)	9 (1.8%)	157 (31.4%)	23 (4.6%)	35 (7.0%)	1 (0.2%)	1 (0.2%)	500 (100.0%)
Χ² = 27.3799 df = 24 p = .2871


Smoking Status x Adjudication Groups

	Smoking.Status
adj_cat	Current	Former	Never	<NA>	Total
Performing poorly on Memory tests	5 (7.5%)	24 (35.8%)	38 (56.7%)	0 (0.0%)	67 (100.0%)
Performing poorly on Non-Memory tests	4 (5.2%)	36 (46.8%)	37 (48.1%)	0 (0.0%)	77 (100.0%)
Performing within normal ranges on tests	15 (4.2%)	147 (41.3%)	193 (54.2%)	1 (0.3%)	356 (100.0%)
Total	24 (4.8%)	207 (41.4%)	268 (53.6%)	1 (0.2%)	500 (100.0%)
Χ² = 2.8903 df = 4 p = .5764


Weekly Alcohol Groups x Adjudication Groups

	Heavy_Drinker
adj_cat	0	1	Total
Performing poorly on Memory tests	54 (80.6%)	13 (19.4%)	67 (100.0%)
Performing poorly on Non-Memory tests	69 (89.6%)	8 (10.4%)	77 (100.0%)
Performing within normal ranges on tests	314 (88.2%)	42 (11.8%)	356 (100.0%)
Total	437 (87.4%)	63 (12.6%)	500 (100.0%)
Χ² = 3.3654 df = 2 p = .1859


Gender x Adjudication Groups

	vo2_sex.factor
adj_cat	Male	Female	Total
Performing poorly on Memory tests	26 (38.8%)	41 (61.2%)	67 (100.0%)
Performing poorly on Non-Memory tests	22 (28.6%)	55 (71.4%)	77 (100.0%)
Performing within normal ranges on tests	94 (26.4%)	262 (73.6%)	356 (100.0%)
Total	142 (28.4%)	358 (71.6%)	500 (100.0%)
Χ² = 4.2661 df = 2 p = .1185


Demographics

Race x Hispanic/Latina

	screen_race_la_his.factor
race.factor_white.factor	Yes	No	Total
Other	7 (8.3%)	77 (91.7%)	84 (100.0%)
White	11 (2.6%)	405 (97.4%)	416 (100.0%)
Total	18 (3.6%)	482 (96.4%)	500 (100.0%)
Χ² = 4.9817 df = 1 p = .0256 O.R. (95% C.I.) = 3.35 (1.26 - 8.90) R.R. (95% C.I.) = 3.15 (1.26 - 7.89)
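
This table and the earlier Latina/Hispanic x Race table hold the same four counts with rows and columns swapped, which is why the chi-square and odds ratio match while the risk ratio differs (3.15 here, 2.43 there). The sketch below recomputes these quantities by hand from the printed counts. `risk_ratio` is a hypothetical helper, and the confidence intervals are not reproduced.

```r
# Counts from the table above: rows Other/White, columns Yes/No (Hispanic/Latina).
tab <- matrix(c( 7,  77,
                11, 405),
              nrow = 2, byrow = TRUE,
              dimnames = list(race = c("Other", "White"),
                              hispanic = c("Yes", "No")))

# The odds ratio is unchanged when the table is transposed.
odds_ratio <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

# Hypothetical helper: risk of the first column in row 1 relative to row 2.
risk_ratio <- function(m) (m[1, 1] / sum(m[1, ])) / (m[2, 1] / sum(m[2, ]))

rr_race_rows <- risk_ratio(tab)      # race as rows (this table)
rr_hisp_rows <- risk_ratio(t(tab))   # Hispanic/Latina as rows (earlier table)

# chisq.test() applies Yates' continuity correction to 2 x 2 tables by default.
chi <- suppressWarnings(chisq.test(tab))

round(c(OR = odds_ratio, RR_race_rows = rr_race_rows,
        RR_hisp_rows = rr_hisp_rows, chisq = unname(chi$statistic)), 2)
```

One expected count (Hispanic/Latina and Other) is below 5, so `fisher.test(tab)` may be a safer choice for the p-value.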


Education x Race, by Site

	EDU.factor
race.factor	College	Graduate	High School	Total
Black	27 (42.9%)	20 (31.7%)	16 (25.4%)	63 (100.0%)
Other	8 (38.1%)	12 (57.1%)	1 (4.8%)	21 (100.0%)
White	171 (41.1%)	218 (52.4%)	27 (6.5%)	416 (100.0%)
Total	206 (41.2%)	250 (50.0%)	44 (8.8%)	500 (100.0%)
Χ² = 27.6163 df = 4 p < .0001


Gender x Education

	educ
vo2_sex	10	11	12	13	14	15	16	17	18	19	20	21	23	Total
1	0 (0.0%)	0 (0.0%)	8 (5.6%)	3 (2.1%)	13 (9.2%)	5 (3.5%)	44 (31.0%)	1 (0.7%)	40 (28.2%)	11 (7.7%)	16 (11.3%)	0 (0.0%)	1 (0.7%)	142 (100.0%)
2	2 (0.6%)	2 (0.6%)	24 (6.7%)	23 (6.4%)	37 (10.3%)	8 (2.2%)	105 (29.3%)	8 (2.2%)	117 (32.7%)	12 (3.4%)	19 (5.3%)	1 (0.3%)	0 (0.0%)	358 (100.0%)
Total	2 (0.4%)	2 (0.4%)	32 (6.4%)	26 (5.2%)	50 (10.0%)	13 (2.6%)	149 (29.8%)	9 (1.8%)	157 (31.4%)	23 (4.6%)	35 (7.0%)	1 (0.2%)	1 (0.2%)	500 (100.0%)
Χ² = 20.6147 df = 12 p = .0563


Gender x Earning

	ses_earnings.factor
vo2_sex	less than $5,000	$5,000 through $11,999	$12,000 through $15,999	$16,000 through $24,999	$25,000 through $34,999	$35,000 through $49,999	$50,000 through $74,999	$75,000 through $99,999	$100,000 and greater	<NA>	Total
1	5 (3.5%)	7 (4.9%)	4 (2.8%)	9 (6.3%)	15 (10.6%)	22 (15.5%)	21 (14.8%)	13 (9.2%)	34 (23.9%)	12 (8.5%)	142 (100.0%)
2	27 (7.5%)	21 (5.9%)	23 (6.4%)	31 (8.7%)	46 (12.8%)	52 (14.5%)	49 (13.7%)	27 (7.5%)	26 (7.3%)	56 (15.6%)	358 (100.0%)
Total	32 (6.4%)	28 (5.6%)	27 (5.4%)	40 (8.0%)	61 (12.2%)	74 (14.8%)	70 (14.0%)	40 (8.0%)	60 (12.0%)	68 (13.6%)	500 (100.0%)
Χ² = 28.7551 df = 8 p = .0004


Gender x Anticholinergic

	usp_data_anticholinergic_otc.factor
vo2_sex.factor	0	1	Total
Male	139 (97.9%)	3 (2.1%)	142 (100.0%)
Female	338 (94.4%)	20 (5.6%)	358 (100.0%)
Total	477 (95.4%)	23 (4.6%)	500 (100.0%)
Χ² = 2.0604 df = 1 p = .1512 O.R. (95% C.I.) = 2.74 (0.80 - 9.37) R.R. (95% C.I.) = 1.04 (1.00 - 1.07)


Smoking.Status x Weekly Alcohol Consumption Groups

	Weekly_alc.group
Smoking.Status	none	1-3	3+	Total
Current	7 (29.2%)	4 (16.7%)	13 (54.2%)	24 (100.0%)
Former	80 (38.6%)	62 (30.0%)	65 (31.4%)	207 (100.0%)
Never	121 (45.1%)	83 (31.0%)	64 (23.9%)	268 (100.0%)
<NA>	0 (0.0%)	1 (100.0%)	0 (0.0%)	1 (100.0%)
Total	208 (41.6%)	150 (30.0%)	142 (28.4%)	500 (100.0%)
Χ² = 11.8198 df = 4 p = .0187


Stratified Box Plots:

For information on the outlier threshold used in the box plots, see: https://waterdata.usgs.gov/blog/boxplots/

## Job complexity edu variable
ggplot(data, aes(x = educ, y = hvlt_total_recall_tscore, color = Site)) + geom_point()

## Age trends in HVLT and BVMT recall, by smoking status
ggplot(data, aes(x = vo2_age, y = hvlt_total_recall_tscore, color = Smoking.Status)) + geom_smooth()

ggplot(data, aes(x = vo2_age, y = bvmt_total_recall_raw, color = Smoking.Status)) + geom_smooth()

## Treat missing current-smoker status as 0 and store the variable as numeric
data$Current.Smoke.factor <- if_else(is.na(data$Current.Smoke.factor), 0, as.double(data$Current.Smoke.factor))
ggplot(data, aes(x = educ, y = hvlt_total_recall_tscore, color = Site)) + geom_smooth() + theme(legend.position = "none")

data$bvmt_total_recall_raw

## LABEL: BVMT Total Recall Raw Score 
## VALUES:
## 23, 25, 12, 27, 21, 17, 18, 13, 17, 18, 22, 26, 23, 9, 20, 28, 20, 31, 24, 25, 30, 32, 30, 29, 29, 28, 31, 26, 34, 23, 18, 25, 23, 17, 26, 26, 34, 23, 19, 25, 19, 30, 22, 16, 13, 16, 16, 11, 22, 20... 50 items printed out of 500
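
How `Current.Smoke.factor` is stored is not shown in this section. If it is an R factor, `as.double()` returns the internal level codes (1, 2, ...) and not the 0/1 labels, so the recode above would shift the values. The sketch below uses a made-up vector (`smoke` is hypothetical) to show the difference and a safer conversion through the labels.

```r
# Hypothetical 0/1 factor with one missing value
smoke <- factor(c("0", "1", NA, "1"))

as.double(smoke)                 # internal level codes, not the labels
as.numeric(as.character(smoke))  # the labels converted to numbers

# Convert via the labels first, then set missing values to 0
smoke_num <- as.numeric(as.character(smoke))
smoke_num[is.na(smoke_num)] <- 0
smoke_num
```

If the variable is already numeric, the original `if_else()` line behaves as intended and this extra step is unnecessary.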

mod <- lm(bvmt_total_recall_raw ~
            vo2_gender_summary.factor +
            Weekly_alc.score +
            Current.Smoke.factor +
            poly(vo2_age, 2) +
            educ:race.factor_white +
            Site,
          data = data)
summary(mod)

## 
## Call:
## lm(formula = bvmt_total_recall_raw ~ vo2_gender_summary.factor + 
##     Weekly_alc.score + Current.Smoke.factor + poly(vo2_age, 2) + 
##     educ:race.factor_white + Site, data = data)
## 
## Residuals:
## LABEL: BVMT Total Recall Raw Score 
## VALUES:
## -13.4727, -4.0059, 0.3072, 4.0268, 15.0795
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                      19.12008    0.96815  19.749  < 2e-16 ***
## vo2_gender_summary.factorFemale   1.17303    0.61177   1.917  0.05576 .  
## Weekly_alc.score                 -0.01588    0.05680  -0.280  0.77997    
## Current.Smoke.factor             -1.74157    1.27231  -1.369  0.17168    
## poly(vo2_age, 2)1               -25.26185    6.05473  -4.172 3.57e-05 ***
## poly(vo2_age, 2)2                10.97179    5.98091   1.834  0.06719 .  
## SiteNEU                          -1.83197    0.69266  -2.645  0.00843 ** 
## SitePITT                         -1.02687    0.63128  -1.627  0.10445    
## educ:race.factor_white            0.20377    0.04252   4.792 2.19e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.954 on 491 degrees of freedom
## Multiple R-squared:  0.1197, Adjusted R-squared:  0.1053 
## F-statistic: 8.344 on 8 and 491 DF,  p-value: 1.218e-10

visreg::visreg(mod, "educ", by = "race.factor_white", overlay = TRUE)
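
The model above includes `educ:race.factor_white` without main effects for either variable, and the output shows a single coefficient for it, which suggests `race.factor_white` is a numeric 0/1 dummy. Under that assumption the term is simply the product of the two columns: education gets a slope only where the dummy is 1, and no separate race intercept is estimated. The sketch below illustrates this on simulated data; `educ`, `white`, and `y` are stand-ins, not the study variables.

```r
set.seed(1)
n <- 200
educ  <- sample(10:20, n, replace = TRUE)
white <- rbinom(n, 1, 0.8)                 # numeric 0/1 dummy (assumed coding)
y     <- 20 + 0.2 * educ * white + rnorm(n)

fit_colon <- lm(y ~ educ:white)            # interaction term only
fit_prod  <- lm(y ~ I(educ * white))       # explicit product column

# Both parameterizations give the same intercept and slope.
all.equal(unname(coef(fit_colon)), unname(coef(fit_prod)))

# Full alternative with both main effects, for comparison
fit_full <- lm(y ~ educ * white)
length(coef(fit_full))
```

Fitting `educ * race.factor_white` instead would let the education slope and the intercept differ freely between groups, at the cost of two more parameters.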



hvlt_total_recall_tscore - Model non-linear main effects

data <- data[complete.cases(data$educ), ]

mod1 <- lm(hvlt_total_recall_tscore ~
             vo2_sex.factor +
             poly(vo2_age, 2) +
             poly(educ, 2) +
             vo2_site.factor,
           data = data)

mod2 <- lm(hvlt_total_recall_tscore ~
             vo2_sex.factor +
             poly(vo2_age, 2) +
             poly(educ, 2) +
             vo2_site.factor +
             race.factor,
           data = data)

mod3 <- lm(hvlt_total_recall_tscore ~
             vo2_sex.factor +
             poly(vo2_age, 2) +
             educ +
             vo2_site.factor / race.factor,
           data = data)

mod4 <- lm(hvlt_total_recall_tscore ~
             vo2_sex.factor +
             poly(vo2_age, 2) +
             poly(educ, 2) / vo2_site.factor / race.factor,
           data = data)

mod5 <- lm(hvlt_total_recall_tscore ~
             vo2_sex.factor +
             poly(vo2_age, 2) +
             educ / vo2_site.factor / race.factor +
             Current.Smoke.factor,
           data = data)
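
The `/` in these formulas is R's nesting operator: `a / b` expands to `a + a:b`, so the nested factor enters only through interactions. The sketch below lists the expanded terms for the mod3 and mod4 right-hand sides. `expand_terms` is a hypothetical helper, and `terms()` only parses the formula, so no data are needed.

```r
# Hypothetical helper: list the terms a formula expands to.
expand_terms <- function(f) attr(terms(f), "term.labels")

expand_terms(~ a / b / c)

# Right-hand sides taken from mod3 and mod4 (covariates omitted for brevity)
expand_terms(~ educ + vo2_site.factor / race.factor)
expand_terms(~ poly(educ, 2) / vo2_site.factor / race.factor)
```

In the mod4 form, site and race appear only as modifiers of the education polynomial, with no site or race main effects, which matches the coefficient names in the regression table.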




	Dependent variable:

	hvlt_total_recall_tscore
	(1)	(2)	(3)	(4)

vo2_sex.factorFemale	5.16^*** (3.54, 6.78)	5.53^*** (3.92, 7.13)	5.59^*** (3.98, 7.20)	5.20^*** (3.56, 6.84)
poly(vo2_age, 2)1	27.18^*** (10.72, 43.63)	26.48^*** (10.25, 42.71)	26.18^*** (9.84, 42.51)	27.12^*** (10.57, 43.66)
poly(vo2_age, 2)2	-18.46^** (-34.61, -2.31)	-18.86^** (-34.79, -2.93)	-19.32^** (-35.37, -3.27)	-17.77^** (-34.26, -1.28)
poly(educ, 2)1	44.61^*** (28.25, 60.97)	36.91^*** (20.34, 53.47)		28.96^* (-3.44, 61.36)
poly(educ, 2)2	-19.17^** (-35.35, -2.98)	-15.20^* (-31.28, 0.89)		-11.30 (-47.28, 24.68)
educ			0.71^*** (0.38, 1.05)
vo2_site.factorKansas	0.06 (-1.65, 1.76)	-0.40 (-2.10, 1.29)	-0.29 (-2.09, 1.51)
vo2_site.factorNortheastern	-1.32 (-3.15, 0.52)	-0.98 (-2.81, 0.84)	-0.41 (-2.48, 1.67)
race.factorBlack		-4.70^*** (-6.99, -2.40)
race.factorOther		-1.17 (-4.75, 2.41)
vo2_site.factorPitt:race.factorBlack			-3.43^** (-6.74, -0.13)
vo2_site.factorKansas:race.factorBlack			-1.54 (-7.73, 4.65)
vo2_site.factorNortheastern:race.factorBlack			-7.59^*** (-11.08, -4.10)
vo2_site.factorPitt:race.factorOther			-1.51 (-10.88, 7.86)
vo2_site.factorKansas:race.factorOther			-3.34 (-9.10, 2.42)
vo2_site.factorNortheastern:race.factorOther			0.48 (-4.80, 5.77)
poly(educ, 2)1:vo2_site.factorKansas				-1.03 (-45.00, 42.94)
poly(educ, 2)2:vo2_site.factorKansas				-4.36 (-49.30, 40.58)
poly(educ, 2)1:vo2_site.factorNortheastern				15.34 (-35.71, 66.40)
poly(educ, 2)2:vo2_site.factorNortheastern				-14.29 (-69.30, 40.72)
poly(educ, 2)1:vo2_site.factorPitt:race.factorBlack				55.82 (-21.25, 132.88)
poly(educ, 2)2:vo2_site.factorPitt:race.factorBlack				15.67 (-52.67, 84.02)
poly(educ, 2)1:vo2_site.factorKansas:race.factorBlack				-23.51 (-180.23, 133.21)
poly(educ, 2)2:vo2_site.factorKansas:race.factorBlack				34.34 (-114.99, 183.66)
poly(educ, 2)1:vo2_site.factorNortheastern:race.factorBlack				125.78^*** (42.44, 209.13)
poly(educ, 2)2:vo2_site.factorNortheastern:race.factorBlack				47.74 (-54.45, 149.93)
poly(educ, 2)1:vo2_site.factorPitt:race.factorOther				-47.37 (-424.66, 329.92)
poly(educ, 2)2:vo2_site.factorPitt:race.factorOther				18.53 (-306.68, 343.73)
poly(educ, 2)1:vo2_site.factorKansas:race.factorOther				47.94 (-112.55, 208.42)
poly(educ, 2)2:vo2_site.factorKansas:race.factorOther				34.09 (-242.89, 311.06)
poly(educ, 2)1:vo2_site.factorNortheastern:race.factorOther				-85.27 (-248.47, 77.94)
poly(educ, 2)2:vo2_site.factorNortheastern:race.factorOther				111.43 (-74.17, 297.04)
Constant	50.34^*** (48.70, 51.99)	50.79^*** (49.16, 52.43)	38.88^*** (33.09, 44.68)	50.26^*** (48.87, 51.65)

Observations	500	500	500	500
R²	0.15	0.17	0.18	0.18
Adjusted R²	0.13	0.16	0.16	0.14
Residual Std. Error	8.22 (df = 492)	8.11 (df = 490)	8.12 (df = 487)	8.18 (df = 478)
F Statistic	12.02^*** (df = 7; 492)	11.41^*** (df = 9; 490)	8.73^*** (df = 12; 487)	4.95^*** (df = 21; 478)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01
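
In the table above, R² stays at 0.18 from model (3) to model (4) while adjusted R² drops from 0.16 to 0.14, the usual sign that added terms are not earning their degrees of freedom. The sketch below reproduces that pattern on simulated data, where the extra grouping factor is pure noise by construction. All names are hypothetical and nothing here uses the study data.

```r
set.seed(42)
n <- 120
x <- rnorm(n)
g <- factor(rep(letters[1:6], length.out = n))  # grouping unrelated to y
y <- 1 + 0.5 * x + rnorm(n)

small <- lm(y ~ x)
big   <- lm(y ~ x / g)   # adds five slope-by-group terms with no real signal

# Hypothetical helper collecting the fit statistics of interest
fit_stats <- function(m) {
  s <- summary(m)
  c(r2 = s$r.squared, adj_r2 = s$adj.r.squared, AIC = AIC(m))
}
rbind(small = fit_stats(small), big = fit_stats(big))

# F test for the extra terms in the larger, nested model
cmp <- anova(small, big)
cmp
```

R² can never decrease when terms are added to a nested model, so adjusted R², AIC, or the nested F test are the more informative comparisons.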

hvlt_total_recall_tscore - Test mod 4 - Overfit

Might an education gap explain the variation in HVLT scores at NEU (Northeastern)?

## 
## Call:
## lm(formula = hvlt_total_recall_tscore ~ vo2_sex.factor + poly(vo2_age, 
##     2) + poly(educ, 2)/vo2_site.factor/race.factor, data = data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -23.3677  -5.2901   0.2773   5.6479  18.0574 
## 
## Coefficients:
##                                                             Estimate Std. Error
## (Intercept)                                                  50.2611     0.7105
## vo2_sex.factorFemale                                          5.2009     0.8350
## poly(vo2_age, 2)1                                            27.1164     8.4420
## poly(vo2_age, 2)2                                           -17.7691     8.4114
## poly(educ, 2)1                                               28.9633    16.5313
## poly(educ, 2)2                                              -11.3021    18.3586
## poly(educ, 2)1:vo2_site.factorKansas                         -1.0334    22.4342
## poly(educ, 2)2:vo2_site.factorKansas                         -4.3577    22.9286
## poly(educ, 2)1:vo2_site.factorNortheastern                   15.3438    26.0484
## poly(educ, 2)2:vo2_site.factorNortheastern                  -14.2897    28.0685
## poly(educ, 2)1:vo2_site.factorPitt:race.factorBlack          55.8177    39.3184
## poly(educ, 2)2:vo2_site.factorPitt:race.factorBlack          15.6748    34.8717
## poly(educ, 2)1:vo2_site.factorKansas:race.factorBlack       -23.5102    79.9612
## poly(educ, 2)2:vo2_site.factorKansas:race.factorBlack        34.3359    76.1878
## poly(educ, 2)1:vo2_site.factorNortheastern:race.factorBlack 125.7844    42.5230
## poly(educ, 2)2:vo2_site.factorNortheastern:race.factorBlack  47.7353    52.1387
## poly(educ, 2)1:vo2_site.factorPitt:race.factorOther         -47.3709   192.4975
## poly(educ, 2)2:vo2_site.factorPitt:race.factorOther          18.5271   165.9243
## poly(educ, 2)1:vo2_site.factorKansas:race.factorOther        47.9351    81.8825
## poly(educ, 2)2:vo2_site.factorKansas:race.factorOther        34.0851   141.3142
## poly(educ, 2)1:vo2_site.factorNortheastern:race.factorOther -85.2682    83.2686
## poly(educ, 2)2:vo2_site.factorNortheastern:race.factorOther 111.4342    94.6980
##                                                             t value Pr(>|t|)
## (Intercept)                                                  70.743  < 2e-16
## vo2_sex.factorFemale                                          6.229 1.03e-09
## poly(vo2_age, 2)1                                             3.212  0.00141
## poly(vo2_age, 2)2                                            -2.112  0.03516
## poly(educ, 2)1                                                1.752  0.08041
## poly(educ, 2)2                                               -0.616  0.53843
## poly(educ, 2)1:vo2_site.factorKansas                         -0.046  0.96328
## poly(educ, 2)2:vo2_site.factorKansas                         -0.190  0.84935
## poly(educ, 2)1:vo2_site.factorNortheastern                    0.589  0.55610
## poly(educ, 2)2:vo2_site.factorNortheastern                   -0.509  0.61092
## poly(educ, 2)1:vo2_site.factorPitt:race.factorBlack           1.420  0.15637
## poly(educ, 2)2:vo2_site.factorPitt:race.factorBlack           0.449  0.65327
## poly(educ, 2)1:vo2_site.factorKansas:race.factorBlack        -0.294  0.76887
## poly(educ, 2)2:vo2_site.factorKansas:race.factorBlack         0.451  0.65243
## poly(educ, 2)1:vo2_site.factorNortheastern:race.factorBlack   2.958  0.00325
## poly(educ, 2)2:vo2_site.factorNortheastern:race.factorBlack   0.916  0.36037
## poly(educ, 2)1:vo2_site.factorPitt:race.factorOther          -0.246  0.80572
## poly(educ, 2)2:vo2_site.factorPitt:race.factorOther           0.112  0.91114
## poly(educ, 2)1:vo2_site.factorKansas:race.factorOther         0.585  0.55855
## poly(educ, 2)2:vo2_site.factorKansas:race.factorOther         0.241  0.80950
## poly(educ, 2)1:vo2_site.factorNortheastern:race.factorOther  -1.024  0.30635
## poly(educ, 2)2:vo2_site.factorNortheastern:race.factorOther   1.177  0.23989
##                                                                
## (Intercept)                                                 ***
## vo2_sex.factorFemale                                        ***
## poly(vo2_age, 2)1                                           ** 
## poly(vo2_age, 2)2                                           *  
## poly(educ, 2)1                                              .  
## poly(educ, 2)2                                                 
## poly(educ, 2)1:vo2_site.factorKansas                           
## poly(educ, 2)2:vo2_site.factorKansas                           
## poly(educ, 2)1:vo2_site.factorNortheastern                     
## poly(educ, 2)2:vo2_site.factorNortheastern                     
## poly(educ, 2)1:vo2_site.factorPitt:race.factorBlack            
## poly(educ, 2)2:vo2_site.factorPitt:race.factorBlack            
## poly(educ, 2)1:vo2_site.factorKansas:race.factorBlack          
## poly(educ, 2)2:vo2_site.factorKansas:race.factorBlack          
## poly(educ, 2)1:vo2_site.factorNortheastern:race.factorBlack ** 
## poly(educ, 2)2:vo2_site.factorNortheastern:race.factorBlack    
## poly(educ, 2)1:vo2_site.factorPitt:race.factorOther            
## poly(educ, 2)2:vo2_site.factorPitt:race.factorOther            
## poly(educ, 2)1:vo2_site.factorKansas:race.factorOther          
## poly(educ, 2)2:vo2_site.factorKansas:race.factorOther          
## poly(educ, 2)1:vo2_site.factorNortheastern:race.factorOther    
## poly(educ, 2)2:vo2_site.factorNortheastern:race.factorOther    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.184 on 478 degrees of freedom
## Multiple R-squared:  0.1786, Adjusted R-squared:  0.1425 
## F-statistic: 4.948 on 21 and 478 DF,  p-value: 1.549e-11


	Dependent variable:

	hvlt_total_recall_tscore

vo2_sex.factorFemale	5.0^*** (3.4, 6.6)
poly(educ, 2)1	43.5^*** (27.2, 59.8)
poly(educ, 2)2	-20.1^** (-36.3, -4.0)
poly(vo2_age, 2)1	25.7^*** (9.4, 42.0)
poly(vo2_age, 2)2	-19.0^** (-35.1, -2.9)
usp_data_anticholinergic_otc.factor	3.3^* (-0.2, 6.8)
Constant	50.0^*** (48.6, 51.3)

Observations	500
R²	0.1
Adjusted R²	0.1
Residual Std. Error	8.2 (df = 493)
F Statistic	14.2^*** (df = 6; 493)

Note:	^*p<0.1; ^**p<0.05; ^***p<0.01

LOGIT REGRESSION

https://stats.idre.ucla.edu/r/dae/logit-regression/

MULTINOMIAL LOGISTIC REGRESSION

https://stats.idre.ucla.edu/r/dae/multinomial-logistic-regression/

Adjudication ~ vo2_sex.factor/Weekly_alc.score+vo2_site.factor

## # weights:  21 (12 variable)
## initial  value 549.306144 
## iter  10 value 392.959092
## iter  20 value 388.145238
## iter  20 value 388.145238
## iter  20 value 388.145238
## final  value 388.145238 
## converged

## # weights:  18 (10 variable)
## initial  value 549.306144 
## iter  10 value 397.154656
## final  value 391.285192 
## converged

Adjudication ~ hvlt_total_recall_tscore+race.factor+vo2_site.factor

## # weights:  21 (12 variable)
## initial  value 549.306144 
## iter  10 value 346.156713
## iter  20 value 343.193241
## final  value 343.193220 
## converged

Adjudication ~ hvlt_total_recall_tscore/Smoking.Status

## # weights:  21 (12 variable)
## initial  value 548.207532 
## iter  10 value 359.830937
## final  value 342.616306 
## converged
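
The traces above have the form printed by `nnet::multinom()`, but the fitting code itself is not shown. The sketch below is a minimal example on simulated data of fitting such a model and deriving Wald z-tests and relative risk ratios, following the approach on the UCLA page linked above. The data frame `sim` and its columns are hypothetical stand-ins for the adjudication outcome and predictors.

```r
library(nnet)

set.seed(7)
n <- 300
sim <- data.frame(
  score = rnorm(n, mean = 50, sd = 10),
  site  = factor(rep(c("A", "B", "C"), length.out = n))
)

# Simulated three-category outcome: lower scores make "memory" more likely.
lin   <- (50 - sim$score) / 10
probs <- cbind(normal = 1, memory = exp(-1.5 + lin), nonmemory = exp(-1.5))
sim$adj <- factor(
  apply(probs, 1, function(p) sample(colnames(probs), 1, prob = p)),
  levels = c("normal", "memory", "nonmemory")   # first level is the reference
)

fit <- multinom(adj ~ score + site, data = sim, trace = FALSE)

# Wald z statistics and two-sided p-values for each coefficient
s <- summary(fit)
z <- s$coefficients / s$standard.errors
p <- 2 * (1 - pnorm(abs(z)))

exp(coef(fit))   # relative risk ratios versus the reference category
round(p, 3)
```

Two fitted models can be compared through their deviances, for example with `anova(fit1, fit2)`, provided one is nested in the other.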
