The purpose of this script is to combine the data sets from Van Meter et al. (2014, 2015, 2016, 2018), Glinski et al. (2018a, b, c, 2019), and Henson-Ramsey (2008) to create a collated database of amphibian dermal exposure data.
| Manuscript | Data Set (Original Source Link) | Data Set (Repo Link) | Additional Data Sets |
|---|---|---|---|
| Van Meter et al. 2014 | good_data.csv | vm2014_data.csv | |
| Van Meter et al. 2015 | good_data.csv | vm2014_data.csv | |
| Van Meter et al. 2016 | RDATA.csv | vm2016_data.csv | |
| Van Meter et al. 2018 | vm2017_merge.csv | vm2017_merge.csv | |
| Glinski et al. 2018a (dehydration) | dehydration3.csv | dag2016_data_dehydration.csv | |
| Glinski et al. 2018b (metabolites) | exposure_experiment.csv | dag2016_data_metabolites_4merge.csv | |
| Glinski et al. 2018c (biomarkers) | exposure_mixtures3.csv | dag2018_data_biomarkers.csv | biomarker.csv (dag_biomarker2.csv) |
| Glinski et al. 2020 (dermal routes) | Water_soil.csv | dag2019_dermal_routes.csv | Dermal_routes_weights.csv (weights) |
| Henson-Ramsey 2008 | HensonRamseyetal2008_data.pdf | hr2008_data.csv | |
This repository can be found at: https://github.com/puruckertom/amphib_dermal_collation
If you are on a Mac and knitr complains about XQuartz, install it from: https://www.xquartz.org/
Van Meter et al. 2014 performed exposures for 5 pesticide active ingredients (imidacloprid, pendimethalin, atrazine, fipronil, triadimefon) and 7 species (Southern leopard frog (Lithobates sphenocephala), Fowler’s toad (Anaxyrus fowleri), gray treefrog (Hyla versicolor), Northern cricket frog (Acris crepitans), Eastern narrowmouth toad (Gastrophryne carolinensis), barking treefrog (Hyla gratiosa) and green treefrog (Hyla cinerea)). Whole body tissue concentrations were measured after an 8-hour exposure period to contaminated soil. Pesticides were applied at the maximum legally allowable application rates scaled down to the area of a 10-gallon aquarium.
Van Meter et al. 2015 contrasted two pesticide exposure scenarios: direct exposure through aerial overspray and indirect exposure through soil. These scenarios tested the same 5 pesticide active ingredients and two of the species (barking treefrog (Hyla gratiosa) and green treefrog (Hyla cinerea)). Pesticides were applied at the maximum legally allowable application rates scaled down to the size of a 10-gallon aquarium, with the exception of pendimethalin, which was applied at 30% of the permitted application rate. This was due to pendimethalin’s insolubility in the limited solvent and water volumes used in this study.
For our purposes, the Van Meter et al. 2015 data set essentially adds the aerial overspray exposures to the Van Meter et al. 2014 data set.
Note: this file includes metabolites in the totals for the parents
Data Set Dimensions, Column Names, and Summary:
## [1] 474 23
## [1] "Species" "Sample" "Chemical" "Instrument"
## [5] "good" "Application" "app_rate_g_cm2" "TissueConc"
## [9] "SoilConc" "logKow" "BCF" "bodyweight"
## [13] "initialweight" "Solat20C_mgL" "Solat20C_gL" "molmass_gmol"
## [17] "Density_gcm3" "AppFactor" "SA_cm2" "VapPrs_mPa"
## [21] "Koc_gmL" "HalfLife_day" "HabFac"
## Species Sample Chemical Instrument
## Barking treefrog:120 FTA1 : 4 Total Triadimefon: 49 GCMS: 25
## Green treefrog :115 FTA2 : 4 Triadimefon : 49 LCMS:449
## Gray treefrog : 60 FTA3 : 4 Triadimenol : 49
## Fowlers toad : 55 FTA4 : 4 Atrazine : 39
## Leopard frog : 44 FTA5 : 4 Fipronil : 39
## Mole salamander : 40 HCA1 : 4 Pendimethalin : 39
## (Other) : 40 (Other):450 (Other) :210
## good Application app_rate_g_cm2 TissueConc
## Min. :1 Overspray:115 Min. :0e+00 Min. : 0.007484
## 1st Qu.:1 Soil :359 1st Qu.:0e+00 1st Qu.: 0.246753
## Median :1 Median :0e+00 Median : 0.575811
## Mean :1 Mean :1e-05 Mean : 1.908242
## 3rd Qu.:1 3rd Qu.:2e-05 3rd Qu.: 1.743142
## Max. :1 Max. :2e-05 Max. :23.441298
## NA's :151
## SoilConc logKow BCF bodyweight
## Min. : 0.00625 Min. :0.570 Min. : 0.0018 Min. :0.5004
## 1st Qu.: 0.20866 1st Qu.:2.500 1st Qu.: 0.0755 1st Qu.:1.3162
## Median : 3.49248 Median :3.110 Median : 0.2069 Median :1.8550
## Mean : 7.22468 Mean :3.142 Mean : 11.3804 Mean :1.8658
## 3rd Qu.:10.06719 3rd Qu.:4.000 3rd Qu.: 1.0828 3rd Qu.:2.3489
## Max. :81.71115 Max. :5.180 Max. :396.8461 Max. :3.9931
##
## initialweight Solat20C_mgL Solat20C_gL molmass_gmol
## Min. :0.5004 Min. : 0.30 Min. :0.00030 Min. :215.7
## 1st Qu.:1.6614 1st Qu.: 3.78 1st Qu.:0.00378 1st Qu.:215.7
## Median :2.1766 Median : 30.00 Median :0.03000 Median :291.7
## Mean :2.2307 Mean :123.20 Mean :0.12320 Mean :299.5
## 3rd Qu.:2.7601 3rd Qu.:260.00 3rd Qu.:0.26000 3rd Qu.:291.7
## Max. :5.5480 Max. :510.00 Max. :0.51000 Max. :437.1
##
## Density_gcm3 AppFactor SA_cm2 VapPrs_mPa
## Min. :1.170 Min. : 850 Min. : 0.7915 Min. :0.00020
## 1st Qu.:1.187 1st Qu.: 47011 1st Qu.: 1.5393 1st Qu.:0.00037
## Median :1.220 Median : 143055 Median : 1.7866 Median :0.02000
## Mean :1.288 Mean : 291904 Mean : 3.0232 Mean :0.34774
## 3rd Qu.:1.480 3rd Qu.: 348598 3rd Qu.: 2.0882 3rd Qu.:0.04000
## Max. :1.543 Max. :4490329 Max. :23.3326 Max. :4.00000
## NA's :151
## Koc_gmL HalfLife_day HabFac
## Min. : 122 Min. : 26.00 Aquatic : 59
## 1st Qu.: 122 1st Qu.: 26.00 Arboreal :300
## Median : 520 Median : 80.00 Terrestrial:115
## Mean : 20406 Mean : 70.85
## 3rd Qu.: 825 3rd Qu.: 84.00
## Max. :243000 Max. :125.00
##
Van Meter et al. 2016 considered bioconcentration of 5 current-use pesticides (imidacloprid, atrazine, triadimefon, fipronil, and pendimethalin) in American toads (Bufo americanus) across soil types. Toads were exposed to one of two soil types with significantly different organic matter content (14.1% = high organic matter, 3.1% = low organic matter). Whole body tissue concentrations were measured after an 8-hour exposure period to contaminated soil. Pesticides were applied at the maximum legally allowable application rates scaled down to the area of six 0.94 L Pyrex glass bowls each with a 15 cm diameter.
Note: this file includes metabolites in the totals for the parents
Data Set Dimensions, Column Names, and Summary:
## [1] 264 11
## [1] "Day" "Row" "Column" "Pesticide" "SoilType"
## [6] "BodyBurden" "Soil" "Weight" "Total" "Formulation"
## [11] "Parent"
## Day Row Column Pesticide SoilType
## Min. :0.000 Min. :1.000 I :37 ATZ : 24 OLS:132
## 1st Qu.:2.000 1st Qu.:2.000 B :35 ATZTOT : 24 PLE:132
## Median :2.000 Median :4.000 A :34 Pendi : 24
## Mean :2.326 Mean :4.023 C :30 TDN : 24
## 3rd Qu.:3.000 3rd Qu.:6.000 G :30 TNDTOT : 24
## Max. :3.000 Max. :7.000 D :29 ATZDEA : 12
## (Other):69 (Other):132
## BodyBurden Soil Weight Total
## Min. :-0.0378 Min. :-0.10518 Min. : 6.964 Min. :0.0000
## 1st Qu.: 0.0486 1st Qu.: 0.02086 1st Qu.:10.524 1st Qu.:0.0000
## Median : 0.1099 Median : 1.49572 Median :11.740 Median :0.0000
## Mean : 0.4955 Mean : 6.02720 Mean :12.044 Mean :0.3636
## 3rd Qu.: 0.3650 3rd Qu.: 8.64289 3rd Qu.:13.440 3rd Qu.:1.0000
## Max. : 6.8744 Max. :39.57404 Max. :23.340 Max. :1.0000
##
## Formulation Parent
## Min. :0.0000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:0.0000
## Median :0.0000 Median :1.0000
## Mean :0.4091 Mean :0.5909
## 3rd Qu.:1.0000 3rd Qu.:1.0000
## Max. :1.0000 Max. :1.0000
##
Glinski et al. 2018a studied how amphibian hydration status influences uptake of pesticides through dermal exposure. Amphibians (Southern leopard frogs (Lithobates sphenocephala) and Fowler’s toads (Anaxyrus fowleri)) were dehydrated for periods of 0, 2, 4, 6, 8, or 10 hours prior to exposure to pesticide-contaminated soils. Pesticides studied included atrazine, triadimefon, metolachlor, chlorothalonil, and imidacloprid. Soil and whole-body homogenates were measured after an 8-hour exposure period. Pesticides were applied at the maximum legally allowable application rates scaled down to the area of six 0.94 L Pyrex glass bowls each with a 15 cm diameter.
Note: this file does not combine daughters with parents
Note: this file has body burdens and soil concentrations as separate rows
Data Set Dimensions, Column Names, and Summary:
## [1] 1494 8
## [1] "time" "parent" "analyte" "matrix" "species" "conc" "ID"
## [8] "weight"
## time parent analyte matrix species
## Min. : 0 atrazine:396 atrazine:132 amphib:732 BA:630
## 1st Qu.: 2 chloro :168 dea :132 soil :762 LF:864
## Median : 5 chloro+d: 66 dia :132
## Mean : 5 imid : 72 mesa :132
## 3rd Qu.: 8 metol :396 metol :132
## Max. :10 tdn :396 moxa :132
## (Other) :702
## conc ID weight
## Min. : 0.00000 BAA0-1 : 6 Min. :0.6821
## 1st Qu.: 0.02215 BAA0-2 : 6 1st Qu.:1.6108
## Median : 0.08482 BAA0-3 : 6 Median :3.0890
## Mean : 6.17646 BAA0-4 : 6 Mean :3.0810
## 3rd Qu.: 2.60007 BAA0-5 : 6 3rd Qu.:4.3124
## Max. :238.15019 BAA10-1: 6 Max. :7.2481
## (Other):1458
Henson-Ramsey 2008 tested the biological impact of exposure to malathion for tiger salamanders (Ambystoma tigrinum). Tiger salamanders were exposed to contaminated soils with 50 ug/cm2 or 100 ug/cm2 malathion and through ingestion of an earthworm exposed to contaminated soils with 200 ug/cm2 malathion. For each exposure, the malathion application rate was sprayed onto the approximately 1200g of soil in the 1060cm2 polyethylene cages. Tissue concentrations were assessed for five treatment groups: unexposed, exposed to 50 ug/cm2 contaminated soil for 1 day, exposed to 50 ug/cm2 for 2 days, exposed to 50 ug/cm2 contaminated soil for 2 days and fed a contaminated worm on the first exposure day, and exposed to 100 ug/cm2 contaminated soil for 2 days and fed a contaminated worm on the first exposure day.
Data Set Dimensions, Column Names, and Summary:
## [1] 9 12
## [1] "chemical" "species" "tissue_conc_ugg"
## [4] "sample_id" "body_weight_g" "formulation"
## [7] "soil_type" "application" "app_rate_g_cm2"
## [10] "exp_duration" "soil_conc_ugg" "source"
## chemical species tissue_conc_ugg sample_id
## Malathion:9 Ambystoma_tigrinum:9 Min. :0.050 sal1 :1
## 1st Qu.:0.350 sal2 :1
## Median :1.420 sal3 :1
## Mean :1.186 sal4 :1
## 3rd Qu.:1.470 sal5 :1
## Max. :3.730 sal6 :1
## (Other):3
## body_weight_g formulation soil_type application app_rate_g_cm2
## Min. :20.89 Mode:logical Mode:logical soil:9 Min. :5e-05
## 1st Qu.:44.15 NA's:9 NA's:9 1st Qu.:5e-05
## Median :46.26 Median :5e-05
## Mean :43.73 Mean :5e-05
## 3rd Qu.:48.93 3rd Qu.:5e-05
## Max. :50.92 Max. :5e-05
##
## exp_duration soil_conc_ugg source
## Min. :24 Mode:logical hr2008:9
## 1st Qu.:24 NA's:9
## Median :48
## Mean :40
## 3rd Qu.:48
## Max. :48
##
Glinski et al. 2018b assessed the potential metabolic activation of pesticides (atrazine, triadimefon, fipronil) in amphibians. This data set (1) contains in vitro and in vivo metabolic rate constants derived from toad (Anaxyrus terrestris) livers during experiments measuring the depletion of pesticides and the formation of their metabolites. Pesticides were applied at the maximum legally allowable application rates scaled down to the area of a 10-gallon aquarium.
Data Set Dimensions, Column Names, and Summary:
## [1] 352 6
## [1] "time" "parent" "analyte" "matrix" "conc" "replicate"
## time parent analyte matrix
## Min. : 0.00 atrazine :132 ATZ :44 amphib:160
## 1st Qu.: 2.00 fipronil : 88 DEA :44 soil :192
## Median :12.00 triadimefon:132 DIA :44
## Mean :16.41 F. sulfone:44
## 3rd Qu.:24.00 FIP :44
## Max. :48.00 TDL A :44
## (Other) :88
## conc replicate
## Min. :-0.01244 Min. :1.00
## 1st Qu.: 0.01292 1st Qu.:1.75
## Median : 0.08373 Median :2.50
## Mean : 2.12963 Mean :2.50
## 3rd Qu.: 0.97824 3rd Qu.:3.25
## Max. :32.47385 Max. :4.00
##
The in vitro derived constants were assessed for their predictability by exposing Fowler’s toads (Anaxyrus fowleri) to contaminated soils at the maximum application rate for 2, 4, 12, 24, and 48 hours. This data set (merged) contains the data from the Fowler’s toad experiment along with the tissue concentrations from data set 1; this data set (merged) is used in subsequent steps.
Data Set Dimensions, Column Names, and Summary:
## [1] 60 12
## [1] "exp_duration" "chemical" "tissue_conc_ugg"
## [4] "sample_id" "soil_type" "app_rate_g_cm2"
## [7] "soil_conc_ugg" "body_weight_g" "formulation"
## [10] "species" "application" "source"
## exp_duration chemical tissue_conc_ugg sample_id
## Min. : 2 atrazine :20 Min. :0.08328 atrazine_12hr_1: 1
## 1st Qu.: 4 fipronil :20 1st Qu.:0.33733 atrazine_12hr_2: 1
## Median :12 triadimefon:20 Median :0.86010 atrazine_12hr_3: 1
## Mean :18 Mean :1.42634 atrazine_12hr_4: 1
## 3rd Qu.:24 3rd Qu.:1.88383 atrazine_24hr_1: 1
## Max. :48 Max. :7.62649 atrazine_24hr_2: 1
## (Other) :54
## soil_type app_rate_g_cm2 soil_conc_ugg body_weight_g
## Mode:logical Min. :1.100e-06 Mode:logical Min. :0.1879
## NA's:60 1st Qu.:1.100e-06 NA's:60 1st Qu.:0.5925
## Median :2.700e-06 Median :0.7144
## Mean :9.237e-06 Mean :0.7350
## 3rd Qu.:2.290e-05 3rd Qu.:0.8782
## Max. :2.290e-05 Max. :1.4909
##
## formulation species application source
## Min. :0 Anaxyrus_fowleri:60 soil:60 dag_metabolites:60
## 1st Qu.:0
## Median :0
## Mean :0
## 3rd Qu.:0
## Max. :0
##
Glinski et al. 2018c exposed Southern leopard frogs (Lithobates sphenocephala) to single, double, or triple pesticide mixtures of bifenthrin, metolachlor, and triadimefon at either the maximum or 1/10th of the maximum pesticide application rate, to reflect the typical co-application of pesticides during agricultural growing seasons. Tissue concentrations and metabolomic profiling of amphibian livers were studied after an 8-hour exposure period to pesticide-contaminated soil. Pesticide application rates were scaled down to the area of eight 0.94 L Pyrex glass bowls each with a 15 cm diameter.
Data Set Dimensions, Column Names, and Summary:
## [1] 192 9
## [1] "group" "met" "tdt" "bif" "frog.weight"
## [6] "sample_id" "pesticide" "rate" "conc"
## group met tdt bif
## bif :16 Min. :-1.0000 Min. :-1.0000 Min. :-1.0000
## bifmet :32 1st Qu.:-1.0000 1st Qu.:-1.0000 1st Qu.:-1.0000
## bifmettdt:48 Median : 1.0000 Median : 1.0000 Median : 1.0000
## biftdt :32 Mean : 0.3333 Mean : 0.3333 Mean : 0.3333
## met :16 3rd Qu.: 1.0000 3rd Qu.: 1.0000 3rd Qu.: 1.0000
## mettdt :32 Max. : 1.0000 Max. : 1.0000 Max. : 1.0000
## tdt :16
## frog.weight sample_id pesticide rate
## Min. :1.012 TMB 10 1: 3 bif:64 1/10th Max:96
## 1st Qu.:2.745 TMB 10 2: 3 met:64 Maximum :96
## Median :3.142 TMB 10 3: 3 tdt:64
## Mean :3.299 TMB 10 4: 3
## 3rd Qu.:3.789 TMB 10 5: 3
## Max. :6.739 TMB 10 6: 3
## (Other) :174
## conc
## Min. : 0.001061
## 1st Qu.: 0.069055
## Median : 0.212920
## Mean : 0.801643
## 3rd Qu.: 0.521471
## Max. :19.879783
##
Van Meter et al. 2018 evaluated risks to amphibians after exposure to a single pesticide and pesticide mixtures. The five pesticides studied were three herbicides (atrazine, metolachlor, and 2,4-D), one insecticide (malathion), and one fungicide (propiconazole). Juvenile green frogs (Lithobates clamitans) were exposed to contaminated soils for 8 hours and metabolic analysis of amphibian livers was conducted to measure the effects. Pesticides were applied at the maximum legally allowable application rates individually and in mixtures of two or three pesticides within an herbicide or mixed pesticide group, scaled down to the area of six 0.94 L Pyrex glass bowls each with a 15 cm diameter.
Two data sets were generated from this study, one containing data for exposure to herbicides (single and mixed) and the other containing data for exposure to mixed pesticide treatments (herbicides, insecticide, fungicide).
Data Set Dimensions, Column Names, and Summary:
## [1] 378 10
## [1] "Group" "ATZ" "D" "ME" "AppRate"
## [6] "Weight" "SA" "Media" "Pesticide" "Conc"
## Group ATZ D ME
## ATZ :54 Min. :-1.0000 Min. :-1.0000 Min. :-1.0000
## ATZD :54 1st Qu.:-1.0000 1st Qu.:-1.0000 1st Qu.:-1.0000
## ATZME :54 Median : 1.0000 Median : 1.0000 Median : 1.0000
## ATZMED:54 Mean : 0.1429 Mean : 0.1429 Mean : 0.1429
## D :54 3rd Qu.: 1.0000 3rd Qu.: 1.0000 3rd Qu.: 1.0000
## ME :54 Max. : 1.0000 Max. : 1.0000 Max. : 1.0000
## MED :54
## AppRate Weight SA Media
## Min. :14.30 Min. :0.9634 Min. :1.107 Amphib:126
## 1st Qu.:23.60 1st Qu.:1.6929 1st Qu.:1.534 BCF :126
## Median :37.90 Median :2.0637 Median :1.720 Soil :126
## Mean :39.31 Mean :2.0892 Mean :1.715
## 3rd Qu.:54.50 3rd Qu.:2.4927 3rd Qu.:1.919
## Max. :68.80 Max. :3.6843 Max. :2.406
##
## Pesticide Conc
## ATZBCF : 42 Min. : 0.00000
## ATZS : 42 1st Qu.: 0.00000
## ATZT : 42 Median : 0.06358
## DBCF : 42 Mean : 5.64721
## DS : 42 3rd Qu.: 1.46036
## DT : 42 Max. :76.03573
## (Other):126
Data Set Dimensions, Column Names, and Summary:
## [1] 216 9
## [1] "Group" "ATZ" "MA" "PROP" "Pesticide" "Media"
## [7] "Conc" "Weight" "SA"
## Group ATZ MA PROP
## ATZ :18 Min. :-1.0000 Min. :-1.0000 Min. :-1.0000
## ATZMA :36 1st Qu.:-1.0000 1st Qu.:-1.0000 1st Qu.:-1.0000
## ATZMAPZ:54 Median : 1.0000 Median : 1.0000 Median : 1.0000
## ATZPZ :36 Mean : 0.3333 Mean : 0.3333 Mean : 0.3333
## MA :18 3rd Qu.: 1.0000 3rd Qu.: 1.0000 3rd Qu.: 1.0000
## MAPZ :36 Max. : 1.0000 Max. : 1.0000 Max. : 1.0000
## PZ :18
## Pesticide Media Conc Weight
## ATZBCF :24 Amphib:72 Min. : 0.00024 Min. :1.188
## ATZS :24 BCF :72 1st Qu.: 0.32682 1st Qu.:1.786
## ATZT :24 Soil :72 Median : 1.61181 Median :2.014
## MABCF :24 Mean : 5.46049 Mean :2.203
## MAS :24 3rd Qu.: 9.99874 3rd Qu.:2.455
## MAT :24 Max. :71.52122 Max. :4.014
## (Other):72
## SA
## Min. :1.447
## 1st Qu.:1.833
## Median :1.965
## Mean :2.047
## 3rd Qu.:2.203
## Max. :2.929
##
The herbicide and mixed pesticide data sets were cleaned and joined into a merged data set (referred to as Van Meter et al. 2018 Multiple Pesticides Study in subsequent steps). The single and mixed-pesticide treatments that were retained in the merged data set include atrazine, propiconazole, 2,4-D, malathion, and metolachlor. Original columns from the herbicide and mixed pesticide data sets were altered for standardization. These standardized columns will be used in future data cleaning steps in order to merge all data sets.
Data Set Dimensions, Column Names, and Summary:
## [1] 137 12
## [1] "app_rate_g_cm2" "body_weight_g" "chemical"
## [4] "tissue_conc_ugg" "sample_id" "source"
## [7] "application" "exp_duration" "formulation"
## [10] "soil_conc_ugg" "soil_type" "species"
## app_rate_g_cm2 body_weight_g chemical tissue_conc_ugg
## Min. :2.600e-06 Min. :0.9634 ATZT :42 Min. : 0.00054
## 1st Qu.:1.430e-05 1st Qu.:1.7623 DT :23 1st Qu.: 0.27576
## Median :2.360e-05 Median :2.0136 MAT :24 Median : 1.41009
## Mean :2.004e-05 Mean :2.1086 MET :24 Mean : 7.36154
## 3rd Qu.:2.590e-05 3rd Qu.:2.3395 PROPT:24 3rd Qu.: 9.95084
## Max. :3.090e-05 Max. :4.0141 Max. :72.62672
##
## sample_id source application exp_duration formulation
## ATZ_ATZT : 6 rvm2017:137 soil:137 Min. :8 Min. :0
## ATZD_ATZT : 6 1st Qu.:8 1st Qu.:0
## ATZMA_ATZT : 6 Median :8 Median :0
## ATZMA_MAT : 6 Mean :8 Mean :0
## ATZMAPZ_ATZT: 6 3rd Qu.:8 3rd Qu.:0
## ATZMAPZ_MAT : 6 Max. :8 Max. :0
## (Other) :101
## soil_conc_ugg soil_type species
## Mode:logical Mode:logical Rana_clamitans:137
## NA's:137 NA's:137
##
##
##
##
##
TODO: describe the Glinski et al. 2020 (dermal routes) study.
Data Set Dimensions, Column Names, and Summary:
## [1] 192 5
## [1] "Sample.ID" "Analyte" "Media" "Matrix"
## [5] "Concentration"
## Sample.ID Analyte Media Matrix Concentration
## Bif LF S 1 F: 2 4-OH:32 Soil :96 Amphibian:96 Min. :0.00000
## Bif LF S 1 S: 2 BIF :32 Water:96 Soil :48 1st Qu.:0.01036
## Bif LF S 2 F: 2 CPF :32 Water :48 Median :0.15326
## Bif LF S 2 S: 2 CPO :32 Mean :0.33962
## Bif LF S 3 F: 2 TFS :32 3rd Qu.:0.44162
## Bif LF S 3 S: 2 TFSa:32 Max. :3.40759
## (Other) :180
The table below summarizes the pesticide application rates (ug/cm2) used in each relevant study, along with the variables used to compute the application rates.
| pesticide | app_rate_ug_cm2 | applied_mL | container | area_cm2 | total_area_cm2 | density_g_cm3 | pesticide_ug | pesticide_mL |
|---|---|---|---|---|---|---|---|---|
| Van Meter et al. 2014/2015 | ||||||||
| atrazine | 22.9 | 75 MeOH | 10-gal aquarium | 1225 | 1225 | 1.1900 | ?? | ?? |
| fipronil | 1.1 | 75 MeOH | 10-gal aquarium | 1225 | 1225 | 1.5515 | ?? | ?? |
| imidacloprid | 5.7 | 75 MeOH | 10-gal aquarium | 1225 | 1225 | 1.6000 | ?? | ?? |
| pendimethalin | 19.8 | 75 MeOH | 10-gal aquarium | 1225 | 1225 | 1.1700 | ?? | ?? |
| triadimefon | 2.7 | 75 MeOH | 10-gal aquarium | 1225 | 1225 | 1.2200 | ?? | ?? |
| Van Meter et al. 2016 | ||||||||
| atrazine | 22.9 | 75 MeOH | .94 L bowl | 225*6 | 1350 | 1.1900 | ?? | ?? |
| fipronil | 1.1 | 75 MeOH | .94 L bowl | 225*6 | 1350 | 1.5515 | ?? | ?? |
| imidacloprid | 5.7 | 75 MeOH | .94 L bowl | 225*6 | 1350 | 1.6000 | ?? | ?? |
| pendimethalin | 69.8 | 75 MeOH | .94 L bowl | 225*6 | 1350 | 1.1700 | ?? | ?? |
| triadimefon | 2.7 | 75 MeOH | .94 L bowl | 225*6 | 1350 | 1.2200 | ?? | ?? |
| Van Meter et al. 2018 | ||||||||
| atrazine | 23.6 | 50 MeOH | .94 L bowl | 225*6 | 1350 | 1.1900 | ?? | ?? |
| 2,4-D | 14.3 | 50 MeOH | .94 L bowl | 225*6 | 1350 | 1.5000 | ?? | ?? |
| metolachlor | 30.9 | 50 MeOH | .94 L bowl | 225*6 | 1350 | 1.1000 | ?? | ?? |
| malathion | 25.9 | 50 MeOH | .94 L bowl | 225*6 | 1350 | 1.2300 | ?? | ?? |
| propiconazole | 2.6 | 50 MeOH | .94 L bowl | 225*6 | 1350 | 1.3000 | ?? | ?? |
| Henson-Ramsey et al. 2008 | ||||||||
| malathion | 50 | NA | cage | 1060 | NA | 1.2300 | ?? | ?? |
| Glinski et al. 2018a | ||||||||
| atrazine | 23.95 | ?? | .94 L bowl | 225*6 | 1350 | 1.1900 | ?? | ?? |
| chlorothalonil | 44.3 | ?? | .94 L bowl | 225*6 | 1350 | 1.8000 | ?? | ?? |
| imidacloprid | 5.39 | ?? | .94 L bowl | 225*6 | 1350 | 1.6000 | ?? | ?? |
| metolachlor | 31.01 | ?? | .94 L bowl | 225*6 | 1350 | 1.1000 | ?? | ?? |
| triadimefon | 2.91 | ?? | .94 L bowl | 225*6 | 1350 | 1.2200 | ?? | ?? |
| Glinski et al. 2018b | ||||||||
| atrazine | 22.9 | ?? | 10-gal aquarium | 1225 | 1225 | 1.1900 | ?? | ?? |
| fipronil | 1.1 | ?? | 10-gal aquarium | 1225 | 1225 | 1.5515 | ?? | ?? |
| triadimefon | 2.7 | ?? | 10-gal aquarium | 1225 | 1225 | 1.2200 | ?? | ?? |
| Glinski et al. 2018c | ||||||||
| bifenthrin (max) | 3.45 | 75 MeOH | .94 L bowl | 225*8 | 1800 | 1.3000 | ?? | ?? |
| metolachlor (max) | 30.62 | 75 MeOH | .94 L bowl | 225*8 | 1800 | 1.1000 | ?? | ?? |
| triadimefon (max) | 2.87 | 75 MeOH | .94 L bowl | 225*8 | 1800 | 1.2200 | ?? | ?? |
| bifenthrin (1/10 max) | .345 | 75 MeOH | .94 L bowl | 225*8 | 1800 | 1.3000 | ?? | ?? |
| metolachlor (1/10 max) | 3.062 | 75 MeOH | .94 L bowl | 225*8 | 1800 | 1.1000 | ?? | ?? |
| triadimefon (1/10 max) | .287 | 75 MeOH | .94 L bowl | 225*8 | 1800 | 1.2200 | ?? | ?? |
| Glinski et al. 2020 | ||||||||
| bifenthrin | ?? | ?? | .94 L bowl | 225 | ?? | 1.3000 | ?? | ?? |
| chlorpyrifos | ?? | ?? | .94 L bowl | 225 | ?? | 1.4000 | ?? | ?? |
| trifloxystrobin | ?? | ?? | .94 L bowl | 225 | ?? | 1.3600 | ?? | ?? |
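The R code later in this document enters these rates in g/cm2 (for example `22.9e-6` for atrazine), whereas the table lists ug/cm2. A minimal sketch of that conversion, using the Van Meter et al. 2016 rows and the six-bowl area from the table; the object names here are illustrative and are not part of the original script:

```r
# Application rates in ug/cm2, copied from the Van Meter et al. 2016 rows above
app_rate_ug_cm2 <- c(atrazine = 22.9, fipronil = 1.1, imidacloprid = 5.7,
                     pendimethalin = 69.8, triadimefon = 2.7)

# 1 ug = 1e-6 g, so ug/cm2 -> g/cm2 is a multiplication by 1e-6
ug_to_g <- 1e-6
app_rate_g_cm2 <- app_rate_ug_cm2 * ug_to_g

# total treated area for six bowls, using the per-bowl area listed in the table
total_area_cm2 <- 225 * 6

print(app_rate_g_cm2)
print(total_area_cm2)
```

Keeping the conversion factor in one place may make it easier to check the hard-coded `app_rate_g_cm2` values against the table.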
Each data set was cleaned for merging. This consisted of dropping unneeded columns and standardizing column names of retained columns. Four columns were added to all data sets (soil type, formulation, exposure duration, and research study source).
Once each data set was cleaned, a local copy was saved and the data set was merged with the previously cleaned data sets.
The process of cleaning and merging each data set is briefly described below.
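The cleaning steps described here (rename retained columns, add constant-valued columns, order columns alphabetically) follow the same pattern for every data set. As a rough sketch only, that pattern could be expressed as a single helper; `standardize_for_merge`, its arguments, and the toy data frame are hypothetical and do not appear in the original script:

```r
# Hypothetical helper: rename columns via a named vector (old name -> new name),
# add constant-valued columns, and return the columns in alphabetical order.
standardize_for_merge <- function(df, rename_map, constants) {
  old_names <- names(df)
  hits <- old_names %in% names(rename_map)
  names(df)[hits] <- unname(rename_map[old_names[hits]])
  for (col in names(constants)) {
    df[[col]] <- rep(constants[[col]], nrow(df))
  }
  df[, order(names(df)), drop = FALSE]
}

# Toy input standing in for one of the raw data sets
toy <- data.frame(Species = c("Green treefrog", "Barking treefrog"),
                  TissueConc = c(0.58, 1.74))

cleaned <- standardize_for_merge(
  toy,
  rename_map = c(Species = "species", TissueConc = "tissue_conc_ugg"),
  constants = list(soil_type = "PLE", formulation = 0,
                   exp_duration = 8, source = "toy_example")
)
colnames(cleaned)
```

The per-data-set code below does the same work step by step, which keeps each decision visible at the cost of some repetition.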
Individual metabolites, and parent records whose values do not include metabolites, were dropped from the data set. The dropped chemicals are atrazine, deisopropyl atrazine, desethyl atrazine, fipronil, fipronil-sulfone, triadimefon, and triadimenol.
# drop metabolites and parents that do not include metabolites
vm2015_chem_drop <- c("Atrazine","Deisopropyl Atrazine","Desethyl Atrazine","Fipronil","Fipronil-Sulfone","Triadimefon","Triadimenol")
chem_vector_drop <- which(vm2015$Chemical %in% vm2015_chem_drop)
vm2015_subset1 <- vm2015[-chem_vector_drop,]
vm2015_subset2 <- droplevels(vm2015_subset1)
There were 278 observations with these chemicals. After dropping the 278 observations from the initial 474, the updated dimensions are:
## [1] 196 23
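The drop above uses a negative `which()` index, which works here because 278 rows match. One caveat worth knowing: if the drop list matches nothing, `which()` returns `integer(0)` and the negative index removes every row. The toy data below (hypothetical, not from the study) illustrates this, along with a logical-index alternative that behaves the same whether or not anything matches:

```r
# Toy data standing in for vm2015; values are made up for illustration
toy <- data.frame(Chemical = c("Imidacloprid", "Total Atrazine", "Pendimethalin"),
                  TissueConc = c(0.2, 1.1, 0.7))
drop_none <- c("Not In Data")   # hypothetical drop list with no matches

idx <- which(toy$Chemical %in% drop_none)          # integer(0)
n_negative_index <- nrow(toy[-idx, ])              # 0: every row is lost
n_logical_index <- nrow(toy[!(toy$Chemical %in% drop_none), ])  # 3: nothing dropped

print(c(n_negative_index, n_logical_index))
```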
There were 15 unneeded columns dropped and 4 added for standardization.
# drop unneeded columns for merging
all_cols <- colnames(vm2015_subset2)
drop_cols <- c("Instrument", "good", "logKow", "BCF", "initialweight",
"Solat20C_mgL", "Solat20C_gL", "molmass_gmol", "Density_gcm3","AppFactor", "SA_cm2", "VapPrs_mPa",
"Koc_gmL", "HalfLife_day", "HabFac")
vm2015_subset3 <- vm2015_subset2[,!(names(vm2015_subset2) %in% drop_cols)]
colnames(vm2015_subset3)
## [1] "Species" "Sample" "Chemical" "Application"
## [5] "app_rate_g_cm2" "TissueConc" "SoilConc" "bodyweight"
# add columns
soil_type <- c(rep("PLE",nrow(vm2015_subset3)))
formulation <- (rep(0,nrow(vm2015_subset3)))
exp_duration<- (rep(8,nrow(vm2015_subset3)))
source <- c(rep("rvm2015",nrow(vm2015_subset3)))
vm2015_subset4 <- cbind(vm2015_subset3, formulation, soil_type, exp_duration, source)
# standardize column names
colnames(vm2015_subset4)
## [1] "Species" "Sample" "Chemical" "Application"
## [5] "app_rate_g_cm2" "TissueConc" "SoilConc" "bodyweight"
## [9] "formulation" "soil_type" "exp_duration" "source"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="Sample")]<-"sample_id"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="Species")]<-"species"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="Chemical")]<-"chemical"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="Application")]<-"application"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="TissueConc")]<-"tissue_conc_ugg"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="SoilConc")]<-"soil_conc_ugg"
colnames(vm2015_subset4)[which(colnames(vm2015_subset4)=="bodyweight")]<-"body_weight_g"
colnames(vm2015_subset4)
## [1] "species" "sample_id" "chemical"
## [4] "application" "app_rate_g_cm2" "tissue_conc_ugg"
## [7] "soil_conc_ugg" "body_weight_g" "formulation"
## [10] "soil_type" "exp_duration" "source"
# reorder vm2015 alphabetically
vm2015_merge <- vm2015_subset4[,order(names(vm2015_subset4))]
# write a local copy
vm2015_merge_filename <- paste(amphibdir_data_out,"vm2015_merge.csv", sep="")
write.csv(vm2015_merge, file=vm2015_merge_filename)
The data set’s dimensions are:
## [1] 196 12
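When writing the local copy, note that `write.csv` includes row names by default, so the saved file gains an extra unnamed first column. The sketch below is one possible variant, not the script's own code: it uses `tempdir()` as a stand-in for `amphibdir_data_out`, builds the path with `file.path` (which does not depend on the directory string ending in a separator), and suppresses row names.

```r
# tempdir() stands in for amphibdir_data_out, which is defined elsewhere
out_dir <- tempdir()
out_file <- file.path(out_dir, "toy_merge.csv")

# Toy data with made-up values
toy <- data.frame(chemical = c("Imidacloprid", "Pendimethalin"),
                  tissue_conc_ugg = c(0.2, 0.7))

# row.names = FALSE keeps the CSV columns identical to the data frame columns
write.csv(toy, file = out_file, row.names = FALSE)

round_trip <- read.csv(out_file)
colnames(round_trip)
```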
From the initial 11 columns, 3 columns (Day, Row, and Column) were consolidated into a single sample_id column, the Total column was dropped, and 4 columns were added.
# add sample_id
vm2016$sample_id <- paste(vm2016$Day, vm2016$Row, vm2016$Column, sep="_")
vm2016_subset2 <- subset(vm2016, select=c(-Day,-Row, -Column, -Total))
# add additional columns
species <- c(rep("American toad",nrow(vm2016_subset2)))
application <- c(rep("Indirect",nrow(vm2016_subset2)))
exp_duration<- (rep(8,nrow(vm2016_subset2)))
source <- c(rep("rvm2016",nrow(vm2016_subset2)))
vm2016_subset3 <- cbind(vm2016_subset2, species, application, exp_duration, source)
Application rates were inserted for the sprayed pesticides. There were 108 observations of decay products that were not sprayed; these observations were dropped so that only the parents remain in the cleaned data set. A further 60 observations of atrazine, fipronil, or triadimefon were dropped because their values do not include metabolites in the total.
# assign values to application rate
#unique(vm2016_subset3$Pesticide)
vm2016_subset3$app_rate_g_cm2[vm2016_subset3$Pesticide=="ATZTOT"] <- 22.9e-6
vm2016_subset3$app_rate_g_cm2[vm2016_subset3$Pesticide=="Imid"] <- 5.7e-6
vm2016_subset3$app_rate_g_cm2[vm2016_subset3$Pesticide=="FipTOT"] <- 1.1e-6
vm2016_subset3$app_rate_g_cm2[vm2016_subset3$Pesticide=="TNDTOT"] <- 2.7e-6
vm2016_subset3$app_rate_g_cm2[vm2016_subset3$Pesticide=="Pendi"] <- 69.8e-6
# drop decay products that were not sprayed, keeping only parents
rows_to_drop <- which(vm2016_subset3$Parent == 0)
vm2016_subset4 <- vm2016_subset3[-rows_to_drop,]
# drop ATZ, Fip, TDN since they do not include metabolites in total
chems_to_drop <- c("ATZ","Fip","TDN")
vm2016_subset5 <- vm2016_subset4[!(vm2016_subset4$Pesticide %in% chems_to_drop),]
# now drop parent field
drop_cols <- c("Parent")
vm2016_subset6 <- vm2016_subset5[,!(names(vm2016_subset5) %in% drop_cols)]
Several column names were standardized and all columns were ordered for ease of merging with the combined data set.
# standardize column names
colnames(vm2016_subset6)
## [1] "Pesticide" "SoilType" "BodyBurden" "Soil"
## [5] "Weight" "Formulation" "sample_id" "species"
## [9] "application" "exp_duration" "source" "app_rate_g_cm2"
colnames(vm2016_subset6)[which(colnames(vm2016_subset6)=="Pesticide")]<-"chemical"
colnames(vm2016_subset6)[which(colnames(vm2016_subset6)=="SoilType")]<-"soil_type"
colnames(vm2016_subset6)[which(colnames(vm2016_subset6)=="BodyBurden")]<-"tissue_conc_ugg"
colnames(vm2016_subset6)[which(colnames(vm2016_subset6)=="Soil")]<-"soil_conc_ugg"
colnames(vm2016_subset6)[which(colnames(vm2016_subset6)=="Weight")]<-"body_weight_g"
colnames(vm2016_subset6)[which(colnames(vm2016_subset6)=="Formulation")]<-"formulation"
# reorder columns alphabetically to help with merge
colnames(vm2016_subset6)
## [1] "chemical" "soil_type" "tissue_conc_ugg"
## [4] "soil_conc_ugg" "body_weight_g" "formulation"
## [7] "sample_id" "species" "application"
## [10] "exp_duration" "source" "app_rate_g_cm2"
vm2016_merge <- vm2016_subset6[,order(names(vm2016_subset6))]
colnames(vm2016_merge)
## [1] "app_rate_g_cm2" "application" "body_weight_g"
## [4] "chemical" "exp_duration" "formulation"
## [7] "sample_id" "soil_conc_ugg" "soil_type"
## [10] "source" "species" "tissue_conc_ugg"
# write a local copy
vm2016_merge_filename <- paste(amphibdir_data_out,"vm2016_merge.csv", sep="")
write.csv(vm2016_merge, file=vm2016_merge_filename)
The updated dimensions are:
## [1] 96 12
The Van Meter et al. 2014/2015 and Van Meter et al. 2016 data sets were combined.
The combined data set’s updated dimensions are:
## [1] 292 12
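The code that combines the two data sets is not shown in this section. One plausible approach, sketched here under that assumption, is `rbind`, which matches data frame columns by name and therefore requires both inputs to have the same set of column names. The toy data frames and their values are illustrative only:

```r
# Two toy data frames with the same column names in different order
a <- data.frame(chemical = "Imidacloprid", tissue_conc_ugg = 0.2,
                source = "rvm2015")
b <- data.frame(source = "rvm2016", chemical = "Pendi",
                tissue_conc_ugg = 0.7)

# Guard: the column name sets must match before stacking
stopifnot(setequal(names(a), names(b)))

# rbind on data frames aligns columns by name, not by position
combined <- rbind(a, b)
dim(combined)
```

Ordering the columns alphabetically beforehand, as done above, is not strictly needed for `rbind`, but it makes mismatched names easy to spot by eye.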
The metabolite products were dropped from the data set; 600 rows from the initial 1494 rows were retained.
# drop metabolite products
parent_keepers <- which(as.vector(dag2016_dehy0$parent) == as.vector(dag2016_dehy0$analyte))
dag2016_dehy1 <- dag2016_dehy0[parent_keepers,]
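The `as.vector` calls above matter because `parent` and `analyte` are factors with different level sets, and R refuses to compare such factors directly with `==`. Converting both to character first makes the comparison well defined. A small sketch with made-up rows (using `as.character`, which gives the same result as `as.vector` for factors):

```r
# Toy rows: two parent records and two metabolite records
toy <- data.frame(
  parent  = factor(c("atrazine", "atrazine", "metol", "metol")),
  analyte = factor(c("atrazine", "dea", "metol", "mesa")),
  conc    = c(1.2, 0.1, 2.3, 0.4)
)

# Keep only rows where the measured analyte is the sprayed parent itself
keep <- as.character(toy$parent) == as.character(toy$analyte)
parents_only <- toy[keep, ]
nrow(parents_only)
```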
Several column names were altered for standardization across the data set, and 7 columns were added for standardization.
## time is length of dehydration
#colnames(dag2016_dehy1)[which(colnames(dag2016_dehy1)=="time")]<-"exp_duration"
# standardize column names
colnames(dag2016_dehy1)[which(colnames(dag2016_dehy1)=="analyte")]<-"chemical"
colnames(dag2016_dehy1)[which(colnames(dag2016_dehy1)=="conc")]<-"tissue_conc_ugg"
colnames(dag2016_dehy1)[which(colnames(dag2016_dehy1)=="ID")]<-"sample_id"
colnames(dag2016_dehy1)[which(colnames(dag2016_dehy1)=="weight")]<-"body_weight_g"
# add additional columns
exp_duration <- c(rep(8,nrow(dag2016_dehy1)))
soil_type <- c(rep("PLE",nrow(dag2016_dehy1)))
application <- c(rep("Indirect",nrow(dag2016_dehy1)))
formulation <- (rep(0,nrow(dag2016_dehy1)))
app_rate_g_cm2 <- (rep(0,nrow(dag2016_dehy1)))
soil_conc_ugg <- (rep(0,nrow(dag2016_dehy1)))
source <- c(rep("dag_dehydration",nrow(dag2016_dehy1)))
dag2016_dehy2 <- cbind(dag2016_dehy1, formulation, soil_type, application,
app_rate_g_cm2, exp_duration, soil_conc_ugg, source)
The updated dimensions are:
## [1] 600 15
Multiple soil concentration observations were given the same ID. Until a many-to-one merge of soil concentrations could be executed, 300 rows were temporarily dropped. There were also 3 columns dropped.
# drop the soil rows until we can do a many-to-one merge of soil concentrations
# (the soil rows are set aside here and added back further below)
rows_to_drop <- which(dag2016_dehy2$matrix == 'soil')
dag2016_dehy3 <- dag2016_dehy2[-rows_to_drop,]
# delete the parent, time, and matrix columns
drop_cols <- c("parent","time","matrix")
dag2016_dehy4 <- dag2016_dehy3[,!(names(dag2016_dehy3) %in% drop_cols)]
The updated dimensions are:
## [1] 300 12
The application rate values were inserted, the temporarily dropped soil concentrations were added back to the current data set, and the species names were standardized.
# fill in application rates
#unique(dag2016_dehy4$chemical)
update_atrazine <- which(dag2016_dehy4$chemical == 'atrazine')
dag2016_dehy4$app_rate_g_cm2[update_atrazine] <- 0.00002395 # atrazine g/cm2
update_chloro <- which(dag2016_dehy4$chemical == 'chloro+d')
dag2016_dehy4$app_rate_g_cm2[update_chloro] <- 0.0000443 # chloro g/cm2
update_metol <- which(dag2016_dehy4$chemical == 'metol')
dag2016_dehy4$app_rate_g_cm2[update_metol] <- 0.00003101 # metol g/cm2
update_tdn <- which(dag2016_dehy4$chemical == 'tdn')
dag2016_dehy4$app_rate_g_cm2[update_tdn] <- 0.00000291 # tdn g/cm2
update_imid <- which(dag2016_dehy4$chemical == 'imid')
dag2016_dehy4$app_rate_g_cm2[update_imid] <- 0.00000539 # imid g/cm2
# add back in soil concentrations (in already-made soil_conc_ugg column)
dag2016_soil <- dag2016_dehy2[rows_to_drop,]
dag2016_dehy4$soil_conc_ugg <- dag2016_soil$tissue_conc_ugg
# rename species names, according to standardized names
dag2016_dehy4$species <- as.character(dag2016_dehy4$species)
dag2016_dehy4$species[dag2016_dehy4$species == "LF"] <- "Leopard frog"
dag2016_dehy4$species[dag2016_dehy4$species == "BA"] <- "Fowlers toad"
dag2016_dehy4$species <- as.factor(dag2016_dehy4$species)
The dimensions are:
## [1] 300 12
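The soil concentrations above are added back by row position, which assumes the soil rows and the tissue rows are in the same order. A key-based alternative is sketched below under loud assumptions: the toy frames, the shared `sample_id` key, and the choice to average repeated soil observations are all hypothetical and are not taken from the original script.

```r
# Invented tissue and soil observations sharing a hypothetical key.
frogs <- data.frame(sample_id = c("s2", "s1", "s3"),
                    tissue_conc_ugg = c(0.4, 0.9, 0.2),
                    stringsAsFactors = FALSE)
soil <- data.frame(sample_id = c("s1", "s1", "s2", "s3"),
                   tissue_conc_ugg = c(10, 12, 7, 5),
                   stringsAsFactors = FALSE)

# Many-to-one step: collapse repeated soil observations per key.
# Averaging is an assumption made only for this sketch.
soil_mean <- aggregate(tissue_conc_ugg ~ sample_id, data = soil, FUN = mean)

# Look up each frog's soil value by key rather than by row position.
pos <- match(frogs$sample_id, soil_mean$sample_id)
frogs$soil_conc_ugg <- soil_mean$tissue_conc_ugg[pos]

frogs
```

With `match()`, a frog without a soil observation gets `NA` instead of silently receiving a neighbouring row's value.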
The Glinski et al. 2018a (Dehydration) data set was combined with the previously merged data sets.
The combined data set’s updated dimensions are:
## [1] 592 12
The Henson-Ramsey 2008 data set did not require any additional data cleaning. It was combined with the previously merged data sets.
The combined data set’s updated dimensions are:
## [1] 601 12
Apart from standardizing the species name and inserting the application rates, the Glinski et al. 2018b (Metabolites) data set did not require any additional data cleaning. It was combined with the previously merged data sets.
# rename species names, according to standardized names
dag2016_metabolite_merge$species <- as.character(dag2016_metabolite_merge$species)
dag2016_metabolite_merge$species[dag2016_metabolite_merge$species == "Anaxyrus_fowleri"] <- "Fowlers toad"
dag2016_metabolite_merge$species <- as.factor(dag2016_metabolite_merge$species)
# assign values to application rate
unique(dag2016_metabolite_merge$chemical)
## [1] atrazine triadimefon fipronil
## Levels: atrazine fipronil triadimefon
#dag2016_metabolite_merge$chemical[dag2016_metabolite_merge$chemical =="atrazine"] <-
#dag2016_metabolite_merge$chemical[dag2016_metabolite_merge$chemical =="triadimefon"] <-
#dag2016_metabolite_merge$chemical[dag2016_metabolite_merge$chemical =="fipronil"] <-
The combined data set’s updated dimensions are:
## [1] 661 12
Five columns were dropped from the original biomarkers data set and the names of two columns were standardized.
# drop columns
drop_cols <- c("met", "tdt", "bif", "rate", "group")
dag_biomarker_subset <- dag_biomarker[, !(names(dag_biomarker) %in% drop_cols)]
# standardize column names
colnames(dag_biomarker_subset)[which(colnames(dag_biomarker_subset)=="conc")]<-"tissue_conc_ugg"
colnames(dag_biomarker_subset)[which(colnames(dag_biomarker_subset)=="frog.weight")]<-"body_weight_g"
The updated column names and dimensions are:
## [1] "body_weight_g" "sample_id" "pesticide" "tissue_conc_ugg"
## [1] 192 4
The application rates and soil concentrations were not included in the original biomarkers data set. Both are included in the following data set:
Data Set Dimensions, Column Names, and Summary:
## [1] 136 15
## [1] "frog.weight" "SAMPLE" "Met" "TDN"
## [5] "TDL" "BIF" "soil.weight" "Met.soil"
## [9] "TDN.soil" "TDL.soil" "BIF.soil" "Rate"
## [13] "app.rate.met" "app.rate.tdn" "app.rate.bif"
## frog.weight SAMPLE Met TDN
## Min. :1.012 BIF 10 1: 1 Min. : 0.0000 Min. :0.00000
## 1st Qu.:2.749 BIF 10 2: 1 1st Qu.: 0.0000 1st Qu.:0.00000
## Median :3.164 BIF 10 3: 1 Median : 0.0000 Median :0.00000
## Mean :3.302 BIF 10 4: 1 Mean : 0.9123 Mean :0.06927
## 3rd Qu.:3.762 BIF 10 5: 1 3rd Qu.: 0.4298 3rd Qu.:0.07447
## Max. :6.784 BIF 10 6: 1 Max. :19.8798 Max. :0.55921
## (Other) :130
## TDL BIF soil.weight Met.soil
## Min. :0.00000 Min. :0.0000 Min. : 4.476 Min. :0.000
## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.: 6.731 1st Qu.:0.000
## Median :0.00000 Median :0.0000 Median : 7.772 Median :0.000
## Mean :0.02259 Mean :0.1276 Mean : 8.043 Mean :1.605
## 3rd Qu.:0.01770 3rd Qu.:0.1299 3rd Qu.: 9.050 3rd Qu.:2.265
## Max. :0.30815 Max. :1.0271 Max. :13.571 Max. :6.758
##
## TDN.soil TDL.soil BIF.soil Rate
## Min. :0.0000 Min. :0.000000 Min. :0.0000 0 :24
## 1st Qu.:0.0000 1st Qu.:0.000000 1st Qu.:0.0000 High:56
## Median :0.0000 Median :0.000000 Median :0.0000 Low :56
## Mean :0.7168 Mean :0.010160 Mean :0.7417
## 3rd Qu.:0.5312 3rd Qu.:0.007463 3rd Qu.:1.1472
## Max. :3.6300 Max. :0.061563 Max. :5.2658
##
## app.rate.met app.rate.tdn app.rate.bif
## Min. : 0.000 Min. :0.0000 Min. :0.0000
## 1st Qu.: 0.000 1st Qu.:0.0000 1st Qu.:0.0000
## Median : 0.000 Median :0.0000 Median :0.0000
## Mean :14.263 Mean :1.3389 Mean :1.6070
## 3rd Qu.: 5.511 3rd Qu.:0.5173 3rd Qu.:0.6209
## Max. :55.106 Max. :5.1730 Max. :6.2090
##
The application rates were converted from ug/cm2 to g/cm2.
dag_biomarker2_update <- replace.value(dag_biomarker2, "app.rate.met", from = 55.106, to= 5.5106e-5, verbose = TRUE)
dag_biomarker2_update <- replace.value(dag_biomarker2_update, "app.rate.met", from = 5.5106, to= 5.5106e-6, verbose = FALSE)
dag_biomarker2_update <- replace.value(dag_biomarker2_update, "app.rate.tdn", from = 5.173, to= 5.173e-6, verbose = FALSE)
dag_biomarker2_update <- replace.value(dag_biomarker2_update, "app.rate.tdn", from = .5173, to= 5.173e-7, verbose = FALSE)
dag_biomarker2_update <- replace.value(dag_biomarker2_update, "app.rate.bif", from = 6.209, to= 6.209e-6, verbose = FALSE)
dag_biomarker2_update <- replace.value(dag_biomarker2_update, "app.rate.bif", from = .6209, to= 6.209e-7, verbose = FALSE)
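Because 1 ug equals 1e-6 g, the same conversion can be expressed as a single multiplication instead of value-by-value replacement, which avoids exact matching on floating-point numbers. The sketch below uses an invented toy frame that reuses the three rate column names and the rate values shown above.

```r
# Toy frame with the rate levels from the summary above, in ug/cm2.
toy <- data.frame(app.rate.met = c(0, 5.5106, 55.106),
                  app.rate.tdn = c(0, 0.5173, 5.173),
                  app.rate.bif = c(0, 0.6209, 6.209))

rate_cols <- c("app.rate.met", "app.rate.tdn", "app.rate.bif")

# Convert ug/cm2 to g/cm2: multiply every rate column by 1e-6.
toy_g <- toy
toy_g[rate_cols] <- toy[rate_cols] * 1e-6

toy_g
```

Zero application rates stay zero, and any rate level not listed explicitly is still converted.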
A one-to-one merge was conducted based on the unique sample id for each measured pesticide (either bifenthrin, metolachlor, or triadimefon) to join the original biomarkers data set and the data set containing the application rates and soil concentrations. Vectors containing the application rates and soil concentrations were joined to the original data set.
# bif extraction
dag_biomarker_subset_bif <- dag_biomarker_subset[dag_biomarker_subset$pesticide == "bif", ]
dag_biomarker2_subset_bif <- dag_biomarker2_update[dag_biomarker2_update$BIF != 0, ]
dag_biomarker_bif_merge <- merge(x = dag_biomarker_subset_bif, y = dag_biomarker2_subset_bif,
by.x = "sample_id", by.y = "SAMPLE", all.x = TRUE)
# met extraction
dag_biomarker_subset_met <- dag_biomarker_subset[dag_biomarker_subset$pesticide == "met", ]
dag_biomarker2_subset_met <- dag_biomarker2_update[dag_biomarker2_update$Met != 0, ]
dag_biomarker_met_merge <- merge(x = dag_biomarker_subset_met, y = dag_biomarker2_subset_met,
by.x = "sample_id", by.y = "SAMPLE", all.x = TRUE)
# tdt extraction
dag_biomarker_subset_tdt <- dag_biomarker_subset[dag_biomarker_subset$pesticide == "tdt", ]
dag_biomarker2_subset_tdt <- dag_biomarker2_update[dag_biomarker2_update$TDN != 0, ]
dag_biomarker_tdt_merge <- merge(x = dag_biomarker_subset_tdt, y = dag_biomarker2_subset_tdt,
by.x = "sample_id", by.y = "SAMPLE", all.x = TRUE)
# combine bif, met, and tdt
app_bind_bmt <- c(dag_biomarker_bif_merge[,"app.rate.bif"],
dag_biomarker_met_merge[,"app.rate.met"], dag_biomarker_tdt_merge[,"app.rate.tdn"])
soil_bind_bmt <- c(dag_biomarker_bif_merge[,"BIF.soil"],
dag_biomarker_met_merge[,"Met.soil"], dag_biomarker_tdt_merge[,"TDN.soil"])
# join app and soil vectors to data set
dag_biomarker_subset2 <- dag_biomarker_subset[order(dag_biomarker_subset[, 3]),]
rownames(dag_biomarker_subset2) <- seq(length=nrow(dag_biomarker_subset2))
dag_biomarker_subset3 <- cbind(dag_biomarker_subset2, app_bind_bmt, soil_bind_bmt)
# standardize column names
colnames(dag_biomarker_subset3)[which(colnames(dag_biomarker_subset3)=="app_bind_bmt")]<-"app_rate_g_cm2"
colnames(dag_biomarker_subset3)[which(colnames(dag_biomarker_subset3)=="soil_bind_bmt")]<-"soil_conc_ugg"
The updated column names and dimensions are:
## [1] "body_weight_g" "sample_id" "pesticide" "tissue_conc_ugg"
## [5] "app_rate_g_cm2" "soil_conc_ugg"
## [1] 192 6
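The join above concatenates vectors taken from three merged frames and binds them to a re-sorted copy of the original data, so the rows must line up by position. A variation that keeps each value attached to its sample is sketched below. The toy frames and the helper `join_one()` are hypothetical, and unlike the original code the sketch does not filter the lookup rows on nonzero concentrations.

```r
# Invented observations (long format) and rates/soil values (wide format).
obs <- data.frame(sample_id = c("B2", "B1", "M1"),
                  pesticide = c("bif", "bif", "met"),
                  tissue_conc_ugg = c(0.3, 0.1, 0.8),
                  stringsAsFactors = FALSE)
rates <- data.frame(SAMPLE = c("B1", "B2", "M1"),
                    app.rate.bif = c(6.209e-7, 6.209e-6, 0),
                    app.rate.met = c(0, 0, 5.5106e-5),
                    BIF.soil = c(1.1, 5.2, 0),
                    Met.soil = c(0, 0, 6.7),
                    stringsAsFactors = FALSE)

# Hypothetical helper: merge one pesticide's rows with its rate and soil columns.
join_one <- function(code, rate_col, soil_col) {
  x <- obs[obs$pesticide == code, ]
  y <- rates[, c("SAMPLE", rate_col, soil_col)]
  names(y) <- c("sample_id", "app_rate_g_cm2", "soil_conc_ugg")
  # A one-to-one merge requires a unique key on the lookup side.
  stopifnot(!anyDuplicated(y$sample_id))
  merge(x, y, by = "sample_id", all.x = TRUE)
}

# Stack the merged frames so rates and soil values never leave their rows.
joined <- rbind(join_one("bif", "app.rate.bif", "BIF.soil"),
                join_one("met", "app.rate.met", "Met.soil"))

joined
```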
New columns were created for standardization, the columns were ordered alphabetically, and a local copy was stored.
# create new columns
application <- c(rep("soil", nrow(dag_biomarker_subset3)))
exp_duration <- c(rep(8, nrow(dag_biomarker_subset3)))
formulation <- c(rep(0, nrow(dag_biomarker_subset3)))
soil_type <- c(rep(NA, nrow(dag_biomarker_subset3)))
source <- c(rep("dag_biomarker", nrow(dag_biomarker_subset3)))
species <- c(rep("Leopard frog", nrow(dag_biomarker_subset3)))
# combine columns
dag_biomarker_subset4 <- cbind(dag_biomarker_subset3, application, exp_duration,
formulation, soil_type, source, species)
# standardize pesticide column
dag_biomarker_subset4$pesticide <- as.character(dag_biomarker_subset4$pesticide)
dag_biomarker_subset4$pesticide[dag_biomarker_subset4$pesticide == "bif"] <- "Bifenthrin"
dag_biomarker_subset4$pesticide[dag_biomarker_subset4$pesticide == "met"] <- "Metolachlor"
dag_biomarker_subset4$pesticide[dag_biomarker_subset4$pesticide == "tdt"] <- "Triadimefon"
colnames(dag_biomarker_subset4)[which(colnames(dag_biomarker_subset4)=="pesticide")]<-"chemical"
# unite function for sample id and chemical
dag_biomarker_subset5 <- unite(data = dag_biomarker_subset4, col = "sample_id", "sample_id", "chemical", sep = " ", remove = FALSE)
# order columns in abc for merge
dag_biomarker_merge <- dag_biomarker_subset5[ ,order(names(dag_biomarker_subset5))]
The updated column names and dimensions are:
## [1] "app_rate_g_cm2" "application" "body_weight_g"
## [4] "chemical" "exp_duration" "formulation"
## [7] "sample_id" "soil_conc_ugg" "soil_type"
## [10] "source" "species" "tissue_conc_ugg"
## [1] 192 12
The Glinski et al. 2018c (Biomarkers) data set was combined with the previously merged data sets.
The combined data set’s updated dimensions are:
## [1] 853 12
The Van Meter et al. 2018 (Multiple Pesticides Study) data set did not require any additional data cleaning. It was combined with the previously merged data sets.
The combined data set’s updated dimensions are:
## [1] 990 12
~~~ Emma: still need to add in app rates
The dermal routes data set did not include the body weights for the measured amphibians. These weights were included in a separate data set:
Data Set Dimensions, Column Names, and Summary:
## [1] 48 2
## [1] "Weight_g" "Sample"
## Weight_g Sample
## Min. :0.9555 Bif LF S 1 F: 1
## 1st Qu.:1.4204 Bif LF S 2 F: 1
## Median :1.7817 Bif LF S 3 F: 1
## Mean :1.7784 Bif LF S 4 F: 1
## 3rd Qu.:2.1319 Bif LF S 5 F: 1
## Max. :2.8197 Bif LF S 6 F: 1
## (Other) :42
A one-to-many merge joined the dermal routes data set to the weights data set on the Sample ID. Only rows where the Matrix is “Amphibian” have a body weight; all other rows are NA.
# merge (one-to-many) dermal routes data with weights data, based on Sample ID
dermal_routes_subset2 <- dermal_routes[order(dermal_routes$Sample.ID), ]
weights_2 <- weights[order(weights$Sample),]
dermal_routes_subset3 <- merge(dermal_routes_subset2, weights_2,
by.x = "Sample.ID", by.y = "Sample", all.x = TRUE, all.y = TRUE)
The updated dimensions are:
## [1] 192 6
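The behavior of a one-to-many merge with unmatched rows can be seen on a small invented example. The sample IDs, analyte labels, and values below are hypothetical; the point is only that one weight row is copied onto every matching measurement row, and that `all.x = TRUE` keeps unmatched rows with `NA` weights.

```r
# Invented measurements: two analyte rows per sample ID.
routes <- data.frame(Sample.ID = c("S 1 F", "S 1 F", "S 1 S", "S 1 S"),
                     Analyte = c("parent", "metabolite", "parent", "metabolite"),
                     Matrix = c("Amphibian", "Amphibian", "Soil", "Soil"),
                     Concentration = c(0.2, 0.05, 3.0, 0.4),
                     stringsAsFactors = FALSE)

# Invented weights: one row per amphibian sample, none for soil samples.
wts <- data.frame(Sample = "S 1 F", Weight_g = 1.4, stringsAsFactors = FALSE)

# One weight row matches many measurement rows; soil rows get NA.
merged <- merge(routes, wts, by.x = "Sample.ID", by.y = "Sample", all.x = TRUE)

# The row count of the "many" side should be unchanged by the merge.
stopifnot(nrow(merged) == nrow(routes))

merged
```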
The soil concentrations, where the Media and Matrix are both “Soil,” were subset from the data set to be used later in the data cleaning process. These soil concentrations (currently listed in the “Concentration” column) will be used for the soil_conc_ugg column in the cleaned data set.
# subset soil to be used later for soil concentration column (will use "Concentration" column)
soil_subset <- dermal_routes_subset2[dermal_routes_subset2$Media == "Soil", ]
soil_subset2 <- soil_subset[soil_subset$Matrix == "Soil",]
The dimensions of this soil subset are:
## [1] 48 5
Referring back to the main dermal routes data set: we are only interested in the pesticide exposures on amphibians while in soil. These rows were subset.
# want Media == soil because interested in dermal exposure in soil
dermal_routes_subset4 <- dermal_routes_subset3[dermal_routes_subset3$Media == "Soil",]
#sum(dermal_routes_subset3$Media == "Soil") # == 96
#dim(dermal_routes_subset4) # == 96 x 6
# want Matrix == Amphibian because interested in amphib exposure
dermal_routes_subset5 <- dermal_routes_subset4[dermal_routes_subset4$Matrix == "Amphibian", ]
#sum(dermal_routes_subset4$Matrix == "Amphibian") # == 48
#dim(dermal_routes_subset5) # == 48 x 6
The updated dimensions are:
## [1] 48 6
The soil concentrations were appended to the main dermal routes data set.
# add in soil concentration column, previously subset
# order by Sample.ID, then by Analyte name to match up rows for the two data sets
dermal_routes_subset6 <- dermal_routes_subset5[order(dermal_routes_subset5[,1],
dermal_routes_subset5[,2]),]
soil_subset3 <- soil_subset2[order(soil_subset2[,1], soil_subset2[,2]),]
#dim(dermal_routes_subset6) # == 48 x 6
#dim(soil_subset3) # == 48 x 5
dermal_routes_subset7 <- cbind(dermal_routes_subset6, soil_subset3$Concentration)
The updated dimensions are:
## [1] 48 7
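Appending a column by position works only if both data sets are sorted identically. A small sanity check before the append, plus naming the new column at creation time, is sketched below on invented toy frames (the IDs and values are hypothetical). Naming the column directly also avoids the auto-generated column name `soil_subset3$Concentration`.

```r
# Invented tissue and soil rows, deliberately in different orders.
frog <- data.frame(Sample.ID = c("T2", "T1"),
                   Analyte = c("BIF", "BIF"),
                   Concentration = c(0.5, 0.2),
                   stringsAsFactors = FALSE)
soil <- data.frame(Sample.ID = c("T1", "T2"),
                   Analyte = c("BIF", "BIF"),
                   Concentration = c(3.0, 4.1),
                   stringsAsFactors = FALSE)

# Sort both by sample ID, then analyte, using column names.
frog <- frog[order(frog$Sample.ID, frog$Analyte), ]
soil <- soil[order(soil$Sample.ID, soil$Analyte), ]

# Check that the rows line up before combining by position.
stopifnot(nrow(frog) == nrow(soil),
          identical(frog$Analyte, soil$Analyte))

# Assign with the final column name instead of renaming afterwards.
frog$soil_conc_ugg <- soil$Concentration

frog
```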
The metabolites were dropped from the data set. Additionally, several new columns were created for standardization, existing columns were renamed according to the naming conventions of the collated data set, and unneeded columns were dropped. Columns were ordered alphabetically for ease of merging.
# drop metabolites
rows_to_drop <- c("4-OH", "CPO", "TFSa")
dermal_routes_subset8 <- dermal_routes_subset7[!(dermal_routes_subset7$Analyte %in% rows_to_drop),]
# create new columns
app_rate_g_cm2 <- c(rep(NA, nrow(dermal_routes_subset8)))
application <- c(rep("soil", nrow(dermal_routes_subset8)))
exp_duration <- c(rep(8, nrow(dermal_routes_subset8)))
formulation <- c(rep(0, nrow(dermal_routes_subset8)))
soil_type <- c(rep("OLS", nrow(dermal_routes_subset8)))
source <- c(rep("dag_dermal_routes", nrow(dermal_routes_subset8)))
species <- c(rep("Leopard frog", nrow(dermal_routes_subset8)))
# insert application rates
#unique(dermal_routes_subset8$Analyte)
#dermal_routes_subset8$Analyte[dermal_routes_subset8$Analyte =="BIF"] <-
#dermal_routes_subset8$Analyte[dermal_routes_subset8$Analyte =="CPF"] <-
#dermal_routes_subset8$Analyte[dermal_routes_subset8$Analyte =="TFS"] <-
# alter existing column names
colnames(dermal_routes_subset8)
## [1] "Sample.ID" "Analyte"
## [3] "Media" "Matrix"
## [5] "Concentration" "Weight_g"
## [7] "soil_subset3$Concentration"
colnames(dermal_routes_subset8)[which(colnames(dermal_routes_subset8)=="Analyte")]<-"chemical"
colnames(dermal_routes_subset8)[which(colnames(dermal_routes_subset8)=="Sample.ID")]<-"sample_id"
colnames(dermal_routes_subset8)[which(colnames(dermal_routes_subset8)=="Concentration")]<-"tissue_conc_ugg"
colnames(dermal_routes_subset8)[which(colnames(dermal_routes_subset8)=="soil_subset3$Concentration")]<-"soil_conc_ugg"
colnames(dermal_routes_subset8)[which(colnames(dermal_routes_subset8)=="Weight_g")]<-"body_weight_g"
# combine columns
dermal_routes_subset9 <- cbind(dermal_routes_subset8, app_rate_g_cm2, application, exp_duration,
formulation, soil_type, source, species)
names(dermal_routes_subset9)
## [1] "sample_id" "chemical" "Media"
## [4] "Matrix" "tissue_conc_ugg" "body_weight_g"
## [7] "soil_conc_ugg" "app_rate_g_cm2" "application"
## [10] "exp_duration" "formulation" "soil_type"
## [13] "source" "species"
# drop columns
cols_to_drop <- c("Matrix", "Media")
dermal_routes_subset10 <- dermal_routes_subset9[, !(names(dermal_routes_subset9) %in% cols_to_drop)]
# order columns in abc for merge
dermal_routes_merge <- dermal_routes_subset10[ ,order(names(dermal_routes_subset10))]
The updated dimensions and column names are:
## [1] 24 12
## [1] "app_rate_g_cm2" "application" "body_weight_g"
## [4] "chemical" "exp_duration" "formulation"
## [7] "sample_id" "soil_conc_ugg" "soil_type"
## [10] "source" "species" "tissue_conc_ugg"
A local copy was saved, and the data set was combined with the collated data set.
The combined data set’s updated dimensions are:
## [1] 1014 12
Minor alterations were made to the final collated data set to standardize names of the application types and chemicals.
amphib_dermal_collated <- combined_data6
colnames(amphib_dermal_collated)
## [1] "app_rate_g_cm2" "application" "body_weight_g"
## [4] "chemical" "exp_duration" "formulation"
## [7] "sample_id" "soil_conc_ugg" "soil_type"
## [10] "source" "species" "tissue_conc_ugg"
# check to see if everything ok
summary(amphib_dermal_collated$app_rate_g_cm2) # units issues
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1.0e-06 3.0e-06 1.4e-05 1.8e-05 2.4e-05 7.0e-05 24
summary(amphib_dermal_collated$body_weight_g) # 60 NAs
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1879 1.3043 2.1247 3.3800 3.0412 50.9200
summary(amphib_dermal_collated$exp_duration)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 8.000 8.000 8.876 8.000 48.000
summary(amphib_dermal_collated$soil_conc_ugg) # 206 NAs
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.1125 2.0709 5.2459 14.3042 15.3781 238.1502 206
summary(amphib_dermal_collated$tissue_conc_ugg)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00054 0.16908 0.52573 2.51415 2.06812 72.62672
# standardize application levels
amphib_dermal_collated$application <- tolower(amphib_dermal_collated$application)
amphib_dermal_collated$application <- as.factor(amphib_dermal_collated$application)
# standardize chemical levels
amphib_dermal_collated$chemical <- as.character(amphib_dermal_collated$chemical)
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "fip"] <- "fipronil"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "BIF"] <- "bifenthrin"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "MET"] <- "metolachlor"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "MAT"] <- "malathion"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "ATZT"] <- "atrazine"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "PROPT"] <- "propiconazole"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "metol"] <- "metolachlor"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "tdn"] <- "triadimefon"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "imid"] <- "imidacloprid"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "chloro+d"] <- "chlorothalonil"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "CPF"] <- "chlorpyrifos"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "TFS"] <- "trifloxystrobin"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "FipTOT"] <- "fipronil"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "ATZTOT"] <- "atrazine"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "TNDTOT"] <- "triadimefon"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "Pendi"] <- "pendimethalin"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "Total Atrazine"] <- "atrazine"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "Total Fipronil"] <- "fipronil"
amphib_dermal_collated$chemical[amphib_dermal_collated$chemical == "Total Triadimefon"] <- "triadimefon"
amphib_dermal_collated$chemical <- tolower(amphib_dermal_collated$chemical)
amphib_dermal_collated$chemical <- as.factor(amphib_dermal_collated$chemical)
# write out file
amphib_dermal_collated_filename <- paste(amphibdir_data_out,"amphib_dermal_collated.csv", sep="")
write.csv(amphib_dermal_collated, file=amphib_dermal_collated_filename)
## [1] "app_rate_g_cm2" "application" "body_weight_g"
## [4] "chemical" "exp_duration" "formulation"
## [7] "sample_id" "soil_conc_ugg" "soil_type"
## [10] "source" "species" "tissue_conc_ugg"
## [1] 1014 12
## app_rate_g_cm2 application body_weight_g chemical
## Min. :1.0e-06 indirect :396 Min. : 0.1879 triadimefon :223
## 1st Qu.:3.0e-06 overspray: 45 1st Qu.: 1.3043 atrazine :191
## Median :1.4e-05 soil :573 Median : 2.1247 metolachlor :154
## Mean :1.8e-05 Mean : 3.3800 bifenthrin : 72
## 3rd Qu.:2.4e-05 3rd Qu.: 3.0412 fipronil : 71
## Max. :7.0e-05 Max. :50.9200 chlorothalonil: 66
## NA's :24 (Other) :237
## exp_duration formulation sample_id soil_conc_ugg
## Min. : 2.000 Min. :0.00000 ATZ_ATZT : 6 Min. : 0.1125
## 1st Qu.: 8.000 1st Qu.:0.00000 ATZD_ATZT : 6 1st Qu.: 2.0709
## Median : 8.000 Median :0.00000 ATZMA_ATZT : 6 Median : 5.2459
## Mean : 8.876 Mean :0.03582 ATZMA_MAT : 6 Mean : 14.3042
## 3rd Qu.: 8.000 3rd Qu.:0.00000 ATZMAPZ_ATZT: 6 3rd Qu.: 15.3781
## Max. :48.000 Max. :1.00000 ATZMAPZ_MAT : 6 Max. :238.1502
## NA's :9 (Other) :978 NA's :206
## soil_type source species
## PLE :544 dag_dehydration:300 Leopard frog :412
## OLS : 72 rvm2015 :196 Fowlers toad :200
## NA's:398 dag_biomarker :192 Rana_clamitans :137
## rvm2017 :137 American toad : 96
## rvm2016 : 96 Barking treefrog: 50
## dag_metabolites: 60 Green treefrog : 45
## (Other) : 33 (Other) : 74
## tissue_conc_ugg
## Min. : 0.00054
## 1st Qu.: 0.16908
## Median : 0.52573
## Mean : 2.51415
## 3rd Qu.: 2.06812
## Max. :72.62672
##
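The chemical-name standardization above repeats one assignment per abbreviation. The same recoding can be driven by a single named lookup vector, as in the sketch below. The vector `chem` is an invented sample of labels, and the lookup holds only a subset of the mappings listed above.

```r
# Invented sample of raw chemical labels, including one already standardized.
chem <- c("fip", "BIF", "ATZT", "Total Atrazine", "tdn", "Pendi", "atrazine")

# Abbreviation -> standardized name (a subset of the mappings above).
lookup <- c("fip" = "fipronil",
            "BIF" = "bifenthrin",
            "ATZT" = "atrazine",
            "Total Atrazine" = "atrazine",
            "tdn" = "triadimefon",
            "Pendi" = "pendimethalin")

# Recode only the labels found in the lookup; others pass through unchanged.
hit <- chem %in% names(lookup)
chem[hit] <- unname(lookup[chem[hit]])

# Lowercase and convert to a factor, as in the report.
chem <- factor(tolower(chem))

levels(chem)
```

Adding a new abbreviation then means adding one entry to `lookup` rather than another assignment line.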
Session Information
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 16299)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] anchors_3.0-8 MASS_7.3-51.4 rgenoud_5.8-3.0 stringr_1.4.0
## [5] tidyr_1.0.0 dplyr_0.8.3 kableExtra_1.1.0 reshape2_1.4.3
## [9] gridExtra_2.3 ggplot2_3.2.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.2 highr_0.8 pillar_1.4.2
## [4] compiler_3.6.1 plyr_1.8.4 tools_3.6.1
## [7] zeallot_0.1.0 digest_0.6.21 lifecycle_0.1.0
## [10] viridisLite_0.3.0 evaluate_0.14 tibble_2.1.3
## [13] gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.0
## [16] rstudioapi_0.10 yaml_2.2.0 xfun_0.10
## [19] xml2_1.2.2 httr_1.4.1 withr_2.1.2
## [22] knitr_1.25 vctrs_0.2.0 hms_0.5.1
## [25] webshot_0.5.1 grid_3.6.1 tidyselect_0.2.5
## [28] glue_1.3.1 R6_2.4.0 rmarkdown_1.16
## [31] purrr_0.3.3 readr_1.3.1 magrittr_1.5
## [34] ellipsis_0.3.0 backports_1.1.5 scales_1.0.0
## [37] htmltools_0.4.0 rvest_0.3.5 assertthat_0.2.1
## [40] colorspace_1.4-1 stringi_1.4.3 lazyeval_0.2.2
## [43] munsell_0.5.0 crayon_1.3.4