I reorganized the data into four .csv files.
There are 30 observations in total (sample IDs 6, 15, and 23 are omitted) and 10 variables. We focus on the chemical measurements (BPS, BPF, BPE, BPA, BPB, BPAF, 4NP). The observations are separated into three groups:
Subject mothers: maternal urine
BPA Analog maternal urine-1st-group-10-26-2015 (Sample ID: 1-11)
BPA Analog maternal urine-2nd-group-11-05-2015 (Sample ID: 12-24)
BPA Analog maternal urine-3rd-group-12-14-2015 (Sample ID: 25-35)
Apart from Sample ID, the remaining 9 variables have the following units:
Specific Gravity (mg/ml), volume (ml), BPS (ng/ml), BPF (ng/ml), BPE (ng/ml), BPA (ng/ml), BPB (ng/ml), BPAF (ng/ml), 4NP (ng/ml).
We compare each chemical measurement with its corresponding limit of quantification (LOQ, ng/ml).
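As a small illustration of the LOQ comparison, the sketch below uses made-up concentrations and LOQ values for three hypothetical samples and two chemicals. It assumes, as in the files read later, that the LOQ table has the same shape as the concentration table, so the two can be compared element by element.

```r
# Toy concentrations (ng/ml) for three hypothetical samples; values are made up.
conc <- data.frame(BPS = c(0.10, 0.02, 0.50),
                   BPA = c(0.30, 0.05, 0.01))
# Matching LOQ table: same shape, LOQ repeated on every row (assumed layout).
loq <- data.frame(BPS = rep(0.05, 3),
                  BPA = rep(0.10, 3))

detected <- conc > loq           # logical matrix, TRUE where value exceeds LOQ
n_detected <- rowSums(detected)  # number of chemicals detected per sample
any_detected <- n_detected >= 1  # at least one chemical detected

print(detected)
print(colSums(detected))         # detection count per chemical
print(sum(any_detected))         # samples with at least one detection
```

Comparing two equally sized data frames with `>` returns a logical matrix, which is why `rowSums` and `colSums` can be applied directly to the result.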
Read the data
# keep only the chemical concentration columns
mother <- read.csv(file="Hackensack_BPA_alternatives_Maternal_Urine.csv", header = TRUE)[,4:10]
# corresponding LOQ for each chemical
mother_LOQ <- read.csv(file="Hackensack_BPA_alternatives_Maternal_Urine.csv", header = TRUE)[,12:18]
# number of mothers with at least one chemical detected above its LOQ
sum(rowSums(mother>mother_LOQ)>=1)
## [1] 10
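The column positions used above (4:10 and 12:18) depend on the exact layout of the file. A possible alternative, sketched here on a tiny stand-in file, is to read the CSV once and select columns by name. The column names `SampleID`, `BPS`, `BPA`, `BPS_LOQ`, and `BPA_LOQ` are hypothetical and may not match the real headers.

```r
# Write a tiny stand-in CSV; column names and values here are hypothetical.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(SampleID = 1:3,
                     BPS = c(0.10, 0.02, 0.50),
                     BPA = c(0.30, 0.05, 0.01),
                     BPS_LOQ = 0.05,
                     BPA_LOQ = 0.10),
          path, row.names = FALSE)

raw <- read.csv(path, header = TRUE)  # read the file once
chem <- c("BPS", "BPA")               # chemicals of interest
conc <- raw[, chem]                   # measured concentrations
loq <- raw[, paste0(chem, "_LOQ")]    # matching LOQ columns, same order

# Guard against misaligned concentration and LOQ columns.
stopifnot(identical(paste0(names(conc), "_LOQ"), names(loq)))
print(names(conc))
print(names(loq))
```

Selecting by name keeps each concentration column paired with its own LOQ column even if columns are added to or reordered in the file.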
# logical matrix: TRUE where a chemical is detected above its LOQ
mdec <- mother>mother_LOQ
# TRUE for mothers with at least one chemical detected
mother_presence <- rowSums(mdec)>=1
# correlation between maternal urine presence and the individual chemical detections
library(GGally)
ggpairs(mdec)
# ggpairs(cbind(mdec,mother_presence))
ggpairs(mother)
# ggpairs(cbind(mother,mother_presence))
For maternal urine presence, the BPB measurement is the most related to presence: the correlation coefficient between BPA and maternal urine presence is the highest, at 0.707 (0.49 for the original data).
There are 32 observations in total (sample IDs 6, 15, and 23 are omitted) and 9 variables. We focus on the chemical measurements (BPS, BPF, BPE, BPA, BPB, BPAF, 4NP). The observations are separated into two groups:
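When both variables are TRUE/FALSE indicators, the Pearson correlation returned by `cor` is the phi coefficient of the corresponding 2x2 table. The sketch below checks this on two made-up indicator vectors. The vectors are illustrative only and are not taken from the data.

```r
# Two made-up detection indicators for five hypothetical samples.
x <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

# Cell counts of the 2x2 table.
n11 <- sum(x & y)     # both detected
n10 <- sum(x & !y)    # only x detected
n01 <- sum(!x & y)    # only y detected
n00 <- sum(!x & !y)   # neither detected

# Phi coefficient from the cell counts and the marginal totals.
phi <- (n11 * n00 - n10 * n01) /
  sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))

r <- cor(x, y)        # Pearson correlation on the logical vectors
print(c(phi = phi, pearson = r))
stopifnot(isTRUE(all.equal(phi, r)))
```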
Subject mothers: cord blood
BPA Analog cord blood-1st-group-12-18-2015 (Sample ID: 1-18)
BPA Analog cord blood-2nd-group-1-11-2016 (Sample ID: 19-35)
Apart from Sample ID, the remaining 8 variables have the following units:
volume (ml), BPS (ng/ml), BPF (ng/ml), BPE (ng/ml), BPA (ng/ml), BPB (ng/ml), BPAF (ng/ml), 4NP (ng/ml).
We compare each chemical measurement with its corresponding limit of quantification (LOQ, ng/ml).
Read the data
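# keep only the chemical concentration columns
baby <- read.csv(file="Hackensack_BPA_alternatives_cord_blood_serum.csv", header = TRUE)[,3:9]
# corresponding LOQ for each chemical
baby_LOQ <- read.csv(file="Hackensack_BPA_alternatives_cord_blood_serum.csv", header = TRUE)[,11:17]
# number of babies with at least one chemical detected above its LOQ
sum(rowSums(baby>baby_LOQ)>=1)
## [1] 10
# logical matrix: TRUE where a chemical is detected above its LOQ
bdec <- baby>baby_LOQ
# TRUE for babies with at least one chemical detected
baby_presence <- rowSums(bdec)>=1
# correlation between cord blood presence and the individual chemical detections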
library(scales)
library(GGally)
ggpairs(cbind(bdec,baby_presence))
ggpairs(cbind(baby,baby_presence))
# correlation between maternal urine presence and cord blood presence
cor(mother_presence,baby_presence)
## [1] 0.4
For cord blood presence, the BPA measurement is the most related to presence: the correlation coefficient between BPA and baby presence is the highest, at 0.78 (0.77 for the original data).
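The call `cor(mother_presence, baby_presence)` pairs the two vectors row by row, so it assumes that both files list the same samples in the same order. A minimal sketch of making that pairing explicit by joining on sample ID is given here. The column name `SampleID` and all values are hypothetical.

```r
# Hypothetical per-sample presence tables; IDs and values are made up.
mother_tbl <- data.frame(SampleID = c(1, 2, 3, 4, 5),
                         mother_presence = c(TRUE, FALSE, TRUE, FALSE, TRUE))
baby_tbl <- data.frame(SampleID = c(2, 3, 4, 5, 7),
                       baby_presence = c(FALSE, TRUE, FALSE, FALSE, FALSE))

# Inner join: keep only samples that appear in both tables.
paired <- merge(mother_tbl, baby_tbl, by = "SampleID")
stopifnot(nrow(paired) > 1)

# Correlation computed on matched mother/baby pairs only.
r_paired <- cor(paired$mother_presence, paired$baby_presence)
print(paired)
print(r_paired)
```

Samples present in only one table (IDs 1 and 7 in this toy example) are dropped by the join, so they cannot shift the pairing of the remaining rows.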