README.Rmd

HollyArnold 2022-11-08

Scientific Objectives

Are host-microbe interactions shaped by population structure?
Are there groups of microbes that correspond to low parasite burdens in bighorn sheep?

Generation of microbial features of the cross sectional analysis

The Bighorn grant produced roughly three phases of sample events:

“Cross sectional study”: Sheep were brought into basecamp, where samples were collected, including a single 16S and 18S sample. Diverse metadata were also collected on these sheep (see below for the list of collected metadata parameters). After microbiome sampling, the sheep at basecamp were collared and their locations were followed over time. Field samples were also collected, and genetic markers were used to identify how samples mapped to individuals. Field samples may come from sheep that were never collared or captured at basecamp. To link these two sets of samples, a “sample integrity” experiment was carried out in which fresh fecal samples were left exposed to the elements and resampled over time.
“Longitudinal study”: Sheep were sampled for a target of 9 times and at each timepoint, microbiome data was collected.
“Parasitic intervention study”: Sheep were captured and the microbiome was measured before and after an antiparasitic medication was given.

This document will describe analysis of the first microbiome samples from the cross sectional study above.

Amplicon Sequencing Variant (ASV) processing of all cross sectional samples allows for unified comparison across samples.

All samples from the sample integrity project, the field samples, and the basecamp samples were processed together to determine ASV content. By processing ASVs together, we will be able to compare any ASVs of interest between studies. ASVs for the sample integrity project will be split off after ASV determination, and cladal analysis will not be performed on them for now. The remaining two projects (field samples and basecamp samples) are split into two projects and will be processed separately to (1) determine whether any samples should be removed due to low numbers of sequences and (2) determine appropriate rarefaction methods.

Sequencing file description

The following are summary statistics for the sequencing files available for the cross sectional analysis. When two extractions were performed for a sample (i.e. a technical duplicate), the sample with the largest number of sequences was kept and the sample with fewer sequences was excluded from all further analysis.
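
The duplicate-handling rule above can be sketched in a few lines. This is a minimal illustration and not the script used for this project; the `SeqSample` record, the animal IDs, the sample names, and the read counts are all hypothetical.

```python
from typing import NamedTuple


class SeqSample(NamedTuple):
    animal_id: str     # hypothetical animal identifier
    sample_name: str   # hypothetical name of one extraction
    n_sequences: int   # number of sequences in that extraction


def keep_deepest(samples):
    """For each animal, keep the extraction with the most sequences."""
    best = {}
    for s in samples:
        current = best.get(s.animal_id)
        # Strict '>' means the first extraction seen wins a tie.
        if current is None or s.n_sequences > current.n_sequences:
            best[s.animal_id] = s
    return list(best.values())


samples = [
    SeqSample("animal_A", "animal_A_extraction1", 19550),
    SeqSample("animal_A", "animal_A_extraction2", 8120),
    SeqSample("animal_B", "animal_B_extraction1", 15200),
]
kept = keep_deepest(samples)
```

With these toy inputs, one sample per animal remains, and the shallower duplicate for `animal_A` is dropped.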

16S Sequences

Quantity	Samples	Files
16S raw fastQ files	446	892
Number of duplicated 16S sequences	4	8
Number of unique animal IDs	442	884
Number of samples post technical filter	442	884

18S Sequences

Quantity	Samples	Files
18S raw fastQ files	445	890
Number of duplicated 18S sequences	3	6
Number of unique animal IDs	442	884

Determination of parameters for Amplicon Sequencing Variants

To determine how dada2 parameters affected the unique ASVs of each project, I ran a parameter sweep to see how trim length impacts the number of unique ASVs retained for each project and overall. Numbers in parentheses in the Basecamp ASVs, Field ASVs, and Integrity ASVs columns represent the number of ASVs retained after excluding ASVs that occur <= 5 times.

FWD_TRIM	REV_TRIM	EE	N_ASVs	Mean (min - max) ASVs Retained	Basecamp ASVs	Field ASVs	Integrity ASVs	screen
250	200	2,2	9943	62% (52 - 71)	9476 (8230)	9618 (8744)	8152 (6349)	case1darwin
200	150	2,2	40741	71% (60 - 83)	24900 (9872)	26992 (10929)	20318 (7329)	case2
250	200	0,0	0	0% (0 - 0)	0	0	0	case3
200	150	0,0	0	0% (0 - 0)	0	0	0	case4
225	175	2,2	14007	67% (56 - 78)	11563 (8452)	11962 (8935)	9701 (6412)	case5
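
The parenthesized counts in the table can be reproduced from an ASV count table along the following lines. This is a sketch under assumptions: "occur <= 5 times" is read here as a total count of 5 or fewer across the samples of one project, and the function name and toy counts are hypothetical.

```python
def count_asvs(counts_by_asv, max_excluded_total=5):
    """Return (ASVs present in the project, ASVs kept after the abundance cut).

    counts_by_asv maps an ASV label to its per-sample counts within one project.
    An ASV is kept only if its total count exceeds max_excluded_total.
    """
    totals = {asv: sum(per_sample) for asv, per_sample in counts_by_asv.items()}
    present = sum(1 for total in totals.values() if total > 0)
    kept = sum(1 for total in totals.values() if total > max_excluded_total)
    return present, kept


toy_project = {
    "ASV1": [120, 0, 35],  # abundant, kept
    "ASV2": [2, 3, 0],     # total of 5, excluded
    "ASV3": [0, 6, 0],     # total of 6, kept
    "ASV4": [0, 0, 0],     # not observed in this project
}
present, kept_after_cut = count_asvs(toy_project)
```

For the toy project, three ASVs are present and two survive the cut, which would be reported as "3 (2)" in the table's notation.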

Determination of Amplicon Sequencing Variants (ASVs)

“Universal” 16S sequencing primers were utilized from the Earth Microbiome Project 16S Illumina Amplicon Protocol. Forward (5’-GTGYCAGCMGCCGCGGTAA-3’) and reverse primers (5’-GGACTACNVGGGTWTCTAAT-3’) amplified the V4 hypervariable region.

ASVs were determined using the Divisive Amplicon Denoising Algorithm (DADA2; version 1.22.0) pipeline in R (Bird Hippie; version 4.1.2). The forward primer and the reverse complement of the reverse primer were trimmed from the forward reads, and the reverse primer and the reverse complement of the forward primer were trimmed from the reverse reads, using cutadapt (version 2.6 according to the cutadapt log reproduced later in this document).
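
The reverse-complement primers mentioned above contain degenerate IUPAC bases, so complementing them needs a full IUPAC table rather than only A/C/G/T. The sketch below is illustrative and not part of the project pipeline; the function and variable names are hypothetical, while the primer sequences are the ones given above.

```python
# IUPAC nucleotide complements, including degenerate codes.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(seq):
    """Reverse-complement a primer that may contain IUPAC ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq.upper()))


FWD = "GTGYCAGCMGCCGCGGTAA"   # forward primer from the text
REV = "GGACTACNVGGGTWTCTAAT"  # reverse primer from the text

fwd_rc = reverse_complement(FWD)  # expected at the 3' end of reverse reads
rev_rc = reverse_complement(REV)  # expected at the 3' end of forward reads
```

The two results match the `-A` and `-a` adapter sequences in the cutadapt command line recorded in the log later in this document.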

Forward reads were trimmed at 250 base pairs and reverse reads were trimmed at 200 base pairs, after examining quality scores and removing reads with any ambiguous bases. Error models were fit to convergence and sample inference was performed for forward and reverse reads. Paired ends were merged, and chimeras were excluded.
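
The per-read filtering logic can be illustrated with a simplified sketch. It is not the DADA2 implementation. It assumes Phred+33 quality strings, assumes that reads shorter than the trim length are discarded, and assumes that the EE column of the parameter sweep is a maximum number of expected errors per read; all function names are hypothetical.

```python
def expected_errors(quality_string, phred_offset=33):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)


def passes_filter(seq, qual, trunc_len, max_ee, max_n=0):
    """Simplified read filter: fixed-length truncation, N check, expected errors."""
    if len(seq) < trunc_len:
        return False  # too short to truncate to the requested length
    seq, qual = seq[:trunc_len], qual[:trunc_len]
    if seq.upper().count("N") > max_n:
        return False  # ambiguous bases are not allowed
    return expected_errors(qual) <= max_ee


read = "ACGTACGTACGT"
qual = "I" * len(read)  # 'I' encodes Q40 under Phred+33

ok_ee2 = passes_filter(read, qual, trunc_len=10, max_ee=2)
ok_ee0 = passes_filter(read, qual, trunc_len=10, max_ee=0)
```

Because every finite quality score contributes a small positive error probability, no read can pass a threshold of zero expected errors. Under the assumption above, that would be consistent with the rows of the parameter sweep where EE is 0,0 and no ASVs are recovered.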

The following quality plots show the overall statistics for all samples of the cross sectional analysis considered.

A note for those reading this document on the GitLab page: please see README.pdf for the following plots, as they don’t render correctly on GitLab but do in the PDF file. These plots are automatically generated and unpolished. If desired, you can make cleaner plots from the data in data/dada2_processing/bighorn_2023-02-07_output_dada2Run.RData.

rsync -avz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/dada2.out/16S/bighorn_2023-02-07_output/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/population_structure_of_the_immunomicrobiome/data/dada2_processing/

cp params16S.txt /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/population_structure_of_the_immunomicrobiome/data/dada2_processing/

The forward error rates:

The reverse error rates:

Collector’s curves:

Heatmap of numbers of sequences kept at each step. Most are removed during quality filtering after cutting off primers.

The read depth distribution of all samples

The read length distribution of all samples. As expected, most ASVs are of similar length. Some ASVs are longer than expected; these might have merged incorrectly or are likely eukaryotic contamination, and should be filtered out by the user before proceeding with the ASVs.

Check for primers in the first 10 files before cutting. As can be seen, there is significant read-through of the reverse complement of the forward primer in the reverse reads and of the reverse complement of the reverse primer in the forward reads. This is common since CQLS doesn’t look for read-through of any primers or adapters at the end of reads.

pwd
head -n 70 data/dada2_processing/initialPrimers.txt

## /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/population_structure_of_the_immunomicrobiome
## [1] 1
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s003-index-TGACTAATGGCC-TS032-16S-1-C1_S3_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0    7535
## REV.ForwardReads       0          0       0   15976
## REV.ReverseReads       0          0       0       0
## [1] 2
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s004-index-GTGGAGTCTCAT-TS032-16S-1-D1_S4_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0    7438
## REV.ForwardReads       0          0       0   17446
## REV.ReverseReads       0          0       0       0
## [1] 3
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s005-index-TGATGTGCTAAG-TS032-16S-1-E1_S5_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0   12007
## REV.ForwardReads       0          0       0   27592
## REV.ReverseReads       0          0       0       0
## [1] 4
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s006-index-TGTGCACGCCAT-TS032-16S-1-F1_S6_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0   11470
## REV.ForwardReads       0          0       0   25932
## REV.ReverseReads       0          0       0       0
## [1] 5
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s007-index-GGTGAGCAAGCA-TS032-16S-1-G1_S7_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0   12999
## REV.ForwardReads       0          0       0   30671
## REV.ReverseReads       0          0       0       0
## [1] 6
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s008-index-CTATGTATTAGT-TS032-16S-1-H1_S8_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0   22972
## REV.ForwardReads       0          0       0   50955
## REV.ReverseReads       0          0       0       0
## [1] 7
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s011-index-CGGGACACCCGA-TS032-16S-1-C2_S11_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0    6929
## REV.ForwardReads       0          0       0   15887
## REV.ReverseReads       0          0       0       0
## [1] 8
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s012-index-ACCTTACACCTT-TS032-16S-1-D2_S12_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0   10104
## REV.ForwardReads       0          0       0   22056
## REV.ReverseReads       0          0       0       0
## [1] 9
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s013-index-GTAGTAGACCAT-TS032-16S-1-E2_S13_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0   10273
## REV.ForwardReads       0          0       0   23259
## REV.ReverseReads       0          0       0       0
## [1] 10
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s014-index-CCGGACAAGAAG-TS032-16S-1-F2_S14_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0    7589
## REV.ForwardReads       0          0       0   17116
## REV.ReverseReads       0          0       0       0
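
One way to obtain counts shaped like the tables above is to search each read for the primer in all four orientations, treating degenerate bases as character classes. The sketch below is illustrative only and is not the code that produced this output; the function names and toy reads are hypothetical.

```python
import re

# Degenerate IUPAC bases expressed as regular-expression character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def orientations(primer):
    """Return the primer in the four orientations used as table columns."""
    return {
        "Forward": primer,
        "Complement": primer.translate(COMPLEMENT),
        "Reverse": primer[::-1],
        "RevComp": primer.translate(COMPLEMENT)[::-1],
    }


def primer_hits(primer, reads):
    """Count reads (uppercase strings) containing the primer in each orientation."""
    hits = {}
    for name, oriented in orientations(primer).items():
        pattern = re.compile("".join(IUPAC_REGEX[base] for base in oriented))
        hits[name] = sum(1 for read in reads if pattern.search(read))
    return hits


FWD = "GTGYCAGCMGCCGCGGTAA"
toy_reverse_reads = [
    "ACGTACGTAC" + "TTACCGCGGCTGCTGGCAC",  # read-through into reverse-complemented FWD
    "ACGTACGTACGTACGTACGT",                 # no primer present
]
fwd_in_reverse_reads = primer_hits(FWD, toy_reverse_reads)
```

In the toy data only the RevComp orientation is found, which mirrors the pattern in the FWD.ReverseReads rows above.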

Check for primers in the first 10 files after cutting. cutadapt successfully trimmed these off!

pwd
head -n 70 data/dada2_processing/cutPrimers.txt

## /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/population_structure_of_the_immunomicrobiome
## [1] 1
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s003-index-TGACTAATGGCC-TS032-16S-1-C1_S3_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 2
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s004-index-GTGGAGTCTCAT-TS032-16S-1-D1_S4_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 3
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s005-index-TGATGTGCTAAG-TS032-16S-1-E1_S5_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 4
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s006-index-TGTGCACGCCAT-TS032-16S-1-F1_S6_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 5
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s007-index-GGTGAGCAAGCA-TS032-16S-1-G1_S7_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 6
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s008-index-CTATGTATTAGT-TS032-16S-1-H1_S8_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 7
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s011-index-CGGGACACCCGA-TS032-16S-1-C2_S11_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 8
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s012-index-ACCTTACACCTT-TS032-16S-1-D2_S12_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 9
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s013-index-GTAGTAGACCAT-TS032-16S-1-E2_S13_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0
## [1] 10
## [1] "/nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s014-index-CCGGACAAGAAG-TS032-16S-1-F2_S14_R1_001.fastq.gz"
##                  Forward Complement Reverse RevComp
## FWD.ForwardReads       0          0       0       0
## FWD.ReverseReads       0          0       0       0
## REV.ForwardReads       0          0       0       0
## REV.ReverseReads       0          0       0       0

Look at the beginning of the cutadapt output for the first file. As can be seen, most of the sequences trimmed off were around 48 base pairs long. All read pairs passed the primer-cutting step.

pwd
head -n 270 data/dada2_processing/cutAdaptOutput.txt

## /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/population_structure_of_the_immunomicrobiome
## 
## ### This is cutadapt 2.6 with Python 3.7.12
## 
## 
## ### Command line parameters: -g GTGYCAGCMGCCGCGGTAA -a ATTAGAWACCCBNGTAGTCC -G GGACTACNVGGGTWTCTAAT -A TTACCGCGGCKGCTGRCAC -n 2 -m 1 -j 100 -e 0.1 -o /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s003-index-TGACTAATGGCC-TS032-16S-1-C1_S3_R1_001.fastq.gz -p /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s003-index-TGACTAATGGCC-TS032-16S-1-C1_S3_R2_001.fastq.gz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s003-index-TGACTAATGGCC-TS032-16S-1-C1_S3_R1_001.fastq.gz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s003-index-TGACTAATGGCC-TS032-16S-1-C1_S3_R2_001.fastq.gz
## 
## 
## ### Processing reads on 100 cores in paired-end mode ...
## 
## 
## ### Finished in 7.16 s (366 us/read; 0.16 M reads/minute).
## 
## 
## ### 
## 
## 
## ### === Summary ===
## 
## 
## ### 
## 
## 
## ### Total read pairs processed:             19,550
## 
## 
## ###   Read 1 with adapter:                  18,213 (93.2%)
## 
## 
## ###   Read 2 with adapter:                  12,903 (66.0%)
## 
## 
## ### Pairs that were too short:                   0 (0.0%)
## 
## 
## ### Pairs written (passing filters):        19,550 (100.0%)
## 
## 
## ### 
## 
## 
## ### Total basepairs processed:    11,768,821 bp
## 
## 
## ###   Read 1:     5,884,550 bp
## 
## 
## ###   Read 2:     5,884,271 bp
## 
## 
## ### Total written (filtered):     10,273,440 bp (87.3%)
## 
## 
## ###   Read 1:     5,009,542 bp
## 
## 
## ###   Read 2:     5,263,898 bp
## 
## 
## ### 
## 
## 
## ### === First read: Adapter 1 ===
## 
## 
## ### 
## 
## 
## ### Sequence: GTGYCAGCMGCCGCGGTAA; Type: regular 5'; Length: 19; Trimmed: 0 times.
## 
## 
## ### 
## 
## 
## ### === First read: Adapter 2 ===
## 
## 
## ### 
## 
## 
## ### Sequence: ATTAGAWACCCBNGTAGTCC; Type: regular 3'; Length: 20; Trimmed: 18213 times.
## 
## 
## ### 
## 
## 
## ### No. of allowed errors:
## 
## 
## ### 0-9 bp: 0; 10-19 bp: 1
## 
## 
## ### 
## 
## 
## ### Bases preceding removed adapters:
## 
## 
## ###   A: 0.1%
## 
## 
## ###   C: 0.1%
## 
## 
## ###   G: 99.5%
## 
## 
## ###   T: 0.3%
## 
## 
## ###   none/other: 0.0%
## 
## 
## ### WARNING:
## 
## 
## ###     The adapter is preceded by "G" extremely often.
## 
## 
## ###     The provided adapter sequence could be incomplete at its 3' end.
## 
## 
## ### 
## 
## 
## ### Overview of removed sequences
## 
## 
## ### length   count   expect  max.err error counts
## 
## 
## ### 3    7   305.5   0   7
## 
## 
## ### 28   1   0.0 2   1
## 
## 
## ### 46   2   0.0 2   0 2
## 
## 
## ### 47   290 0.0 2   226 64
## 
## 
## ### 48   16503   0.0 2   14604 1899
## 
## 
## ### 49   1407    0.0 2   1143 264
## 
## 
## ### 50   3   0.0 2   2 1
## 
## 
## ### 
## 
## 
## ### 
## 
## 
## ### === Second read: Adapter 3 ===
## 
## 
## ### 
## 
## 
## ### Sequence: GGACTACNVGGGTWTCTAAT; Type: regular 5'; Length: 20; Trimmed: 0 times.
## 
## 
## ### 
## 
## 
## ### === Second read: Adapter 4 ===
## 
## 
## ### 
## 
## 
## ### Sequence: TTACCGCGGCKGCTGRCAC; Type: regular 3'; Length: 19; Trimmed: 13029 times.
## 
## 
## ### 
## 
## 
## ### No. of allowed errors:
## 
## 
## ### 0-9 bp: 0; 10-19 bp: 1
## 
## 
## ### 
## 
## 
## ### Bases preceding removed adapters:
## 
## 
## ###   A: 91.6%
## 
## 
## ###   C: 5.0%
## 
## 
## ###   G: 2.3%
## 
## 
## ###   T: 1.2%
## 
## 
## ###   none/other: 0.0%
## 
## 
## ### WARNING:
## 
## 
## ###     The adapter is preceded by "A" extremely often.
## 
## 
## ###     The provided adapter sequence could be incomplete at its 3' end.
## 
## 
## ### 
## 
## 
## ### Overview of removed sequences
## 
## 
## ### length   count   expect  max.err error counts
## 
## 
## ### 3    125 305.5   0   125
## 
## 
## ### 4    1   76.4    0   1
## 
## 
## ### 9    1   0.1 0   0 1
## 
## 
## ### 47   163 0.0 1   72 91
## 
## 
## ### 48   11890   0.0 1   6988 4902
## 
## 
## ### 49   846 0.0 1   474 372
## 
## 
## ### 50   3   0.0 1   1 2
## 
## 
## ### 
## 
## 
## ### 
## 
## 
## ### WARNING:
## 
## 
## ###     One or more of your adapter sequences may be incomplete.
## 
## 
## ###     Please see the detailed output above.
## 
## 
## ### This is cutadapt 2.6 with Python 3.7.12
## 
## 
## ### Command line parameters: -g GTGYCAGCMGCCGCGGTAA -a ATTAGAWACCCBNGTAGTCC -G GGACTACNVGGGTWTCTAAT -A TTACCGCGGCKGCTGRCAC -n 2 -m 1 -j 100 -e 0.1 -o /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s004-index-GTGGAGTCTCAT-TS032-16S-1-D1_S4_R1_001.fastq.gz -p /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//cut/lane1-s004-index-GTGGAGTCTCAT-TS032-16S-1-D1_S4_R2_001.fastq.gz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s004-index-GTGGAGTCTCAT-TS032-16S-1-D1_S4_R1_001.fastq.gz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep//raw.tmp.links/raw.tmp.links.16S//filtN/lane1-s004-index-GTGGAGTCTCAT-TS032-16S-1-D1_S4_R2_001.fastq.gz
## 
## 
## ### Processing reads on 100 cores in paired-end mode ...
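
The statement that most trimmed sequences were around 48 base pairs can be checked directly from the log. The sketch below copies the (length, count) rows of the "Overview of removed sequences" table for "First read: Adapter 2" by hand; the variable names and the choice of a 47 to 49 bp window are illustrative.

```python
# (length, count) rows copied from the first file's log, First read: Adapter 2.
removed = [(3, 7), (28, 1), (46, 2), (47, 290), (48, 16503), (49, 1407), (50, 3)]

total_trimmed = sum(count for _, count in removed)
near_48 = sum(count for length, count in removed if 47 <= length <= 49)
fraction_near_48 = near_48 / total_trimmed
```

The total equals the "Trimmed: 18213 times" figure reported for that adapter, and more than 99.9% of the removed sequences fall within one base of 48 bp.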

Metadata field description

CDFW_Animal_ID: The California Department of Fish and Wildlife Animal ID number. This ID number has the following properties:
- Field 1: Contains the metapopulation from which the animal was caught (e.g. PEBS = peninsular region; DEBS = Mojave region) followed by the year in which it was caught (e.g. 2020)
- Field 2: Encodes the capture “event” in which the animal was sampled. Because of how captures are structured (i.e. different locations), the same sheep will not be caught between capture events.
- Field 3: Encodes the animal ID number. Animals are sequentially numbered by how they are caught in time.
- Each CDFW_Animal_ID is unique to a particular individual; however, if the individual is re-captured the following year, it will be assigned a different CDFW_Animal_ID as described above. A permanent ID is therefore used to link the same animal across different capture years (see below).
Permanent_ID: The permanent ID number for an individual. Unlike the DEBS/PEBS number, which can change between seasons, the permanent ID is always the same for each individual regardless of capture season.
Metapopulation_Name: Describes if the individual was captured from a population from within the Mojave / Death Valley (i.e. DEBS) region or the Peninsular (i.e. PEBS) region.
Population: An English tag that is used to describe populations (as is the case for Mojave populations) or ewe groups (as is the case for Peninsular groups). The English tag can be thought of as a rough proxy for population structure, though genetic data will be better.
Capture_Date: Date animal was captured.
Season: Season animal was captured.
Age: Estimated age of animal.
Sex: Animal sex.
Lactating: Lactation presence (1) or absence (0).
Pregnant: Pregnancy presence confirmed via ultrasound (1) or absence (0).
UTM_Region: UTM coordinate region.
UTM_Easting: UTM Easting.
UTM_Northing: UTM Northing.
World_Geodetic_System: From Wikipedia (https://en.wikipedia.org/wiki/World_Geodetic_System): “The World Geodetic System (WGS) is a standard used in cartography, geodesy, and satellite navigation including GPS. The current version, WGS 84, defines an Earth-centered, Earth-fixed coordinate system and a geodetic datum, and also describes the associated Earth Gravitational Model (EGM) and World Magnetic Model (WMM). The standard is published and maintained by the United States National Geospatial-Intelligence Agency.” All coordinates were taken with WGS84 for Peninsular and for Mojave populations.
Time_Capture: Capture time start.
Time_Release: Capture time release.
Catch_Duration: Time animal was in processing.
Chase_Duration_Seconds: How long the animal was pursued with the helicopter.
Subjective_Body_Condition: Team subjective body condition score.
Field_Contagious_Ecthyma: Presence (1) or absence (0) of Contagious Ecthyma noted on capture.
Field_Sinusitis: Presence (1) or absence (0) of sinusitis noted during capture.
Field_Coughing: Presence (1) or absence (0) of coughing noted during capture.
Field_Nasal_Discharge: Presence (1) or absence (0) of nasal discharge noted during capture.
Field_Notes: Notes from the field team.
Ear_Tag_Right: Right ear tag color and number.
Ear_Tag_Left: Left ear tag color and number.
Mojave_Weight_kg: Sheep weight (kg). Available for Mojave populations only.
Mojave_Relative_Body_Condition_Score: Subjective body condition scoring in the Mojave sheep on a scale of 5.
Mojave_Teeth_Present: Teeth present notated by the Modified Triadan System.
Mojave_Tooth_Notes: Field team notes about teeth.
Mojave_Lifestage: Animal lifestage.
Mojave_Body_Length: Body length (cm) measured from nose to rump. Available for Mojave only.
Mojave_Chest_Girth_cm: Chest girth (cm). Available for Mojave only.
Mojave_Metatarsal_Length_cm: Metatarsal length (cm). Available for Mojave only.
Mojave_Neck_Circumference_Cranial_cm: Neck circumference (cm) measured just caudal to the mandibular ramus at approximately C2. Available for Mojave only.
Mojave_Neck_Circumference_Mid_cm: Neck circumference (cm) measured within the mid cervical region at approximately C4. Available for Mojave only.
Mojave_Right_Horn_Length_cm: Right horn length (cm). Available for Mojave animals only.
Mojave_Right_Horn_Diameter_cm: Right horn diameter (cm). Available for Mojave animals only.
Mojave_Left_Horn_Length_cm: Left horn length (cm). Available for Mojave animals only.
Mojave_Left_Horn_Diameter_cm: Left horn diameter (cm). Available for Mojave animals only.
US_Max_Fat_cm: Subcutaneous fat thickness was measured at its thickest point cranial to the cranial process of the tuber ischium (pin bone) and 3 cm from the spine (method is detailed in Stephenson et al 2020). Available for Mojave animals only.
US_Beta_Femorus_cm: Ultrasound measurement of beta femorus muscle. Method is detailed in Stephenson et al 2020.
US_Latissimus_Dorsi_cm: Ultrasound measurement of the latissimus dorsi muscle. Method is detailed in Stephenson et al 2020.
Ingesta_Free_Body_Fat_Percent: Ingesta free body fat (IFBFat) percent is calculated from ultrasound measurements taken at capture. Stephenson 2020 found that IFBFat exhibits strong linear relationships with in vivo and postmortem indices in bighorn sheep:
- “Estimation of IFBFat in vivo across the full range was best accommodated using two separate equations. When rump fat was measurable (7.75% IFBFat), we used ultrasound to predict body fat IFBFat (R^2 = 0.91); otherwise, we used our body condition score (R^2 = 0.77).” Note that these estimates will probably be updated for desert bighorn, as the equations were calculated with Sierra sheep, but they are a best estimate given previous studies. When the max fat measurement was greater than 0, IFBFat was calculated with the equation 13.28*x_1 + 7.78, where x_1 is the max fat pad measured as described in US_Max_Fat_cm. When the max fat measurement was equal to zero, it was calculated with the equation 3.92*x_2 - 1.48, where x_2 is equal to Mojave_Relative_Body_Condition_Score.
Ingestia_Free_Body_Fat_kg: Calculated from the line of best fit in Stephenson et al. 2020. See the note in Ingesta_Free_Body_Fat_Percent. When the max fat measurement was greater than 0, IFBFat was calculated with the equation 6.85*x_1 + 3.28, where x_1 is the max fat pad measured as described in US_Max_Fat_cm. When the max fat measurement was equal to zero, it was calculated with the equation 2.11*x_2 - 1.46, where x_2 is equal to Mojave_Relative_Body_Condition_Score. Reported R^2 values were 0.80 and 0.73, respectively.
Mojave_Udder_Size: Subjective udder size noted by the field team.
Mojave_Milk_Expressed: Presence (1) or absence (0) of animal lactating.
Mojave_Injuries_Noted: Field team notes about injuries that were noted in the field.
Mojave_Vital_Time_1 - Mojave_Vital_Time_7: Capture times where vitals are measured.
Mojave_Temp_1 - Mojave_Temp_7: Temperatures that were measured at corresponding capture times 1 - 7 above.
Mojave_Resp_1 - Mojave_Resp_7: Respiration rates that were measured at corresponding capture times 1 - 7 above.
Mojave_HR_1 - Mojave_HR_7: Heart rates that were measured at corresponding capture times 1 - 7 above.
GPS_Collar_Frequency: GPS collar frequency.
GPS_Collar_Serial_Number: GPS collar serial number.
VHF_Collar_Frequency: VHF collar frequency. VHF collars transmit at a very high frequency compared to GPS collars. They are used to track the animals in the field to gather data on long term survival because they can collect data long after the GPS collar typically dies.
VHF_Collar_Serial_Number: VHF collar serial number.
GPS_Color: GPS collar color.
VHF_Color: VHF collar color.
GPS_Measurement: GPS collar measurement around the neck.
VHF_Measurement: VHF collar measurement around the neck.
GPS_Collar_Notes: Notes from the tech team about how the collar is fitting.
GPS_Manufacturer: Company manufacturer of collars used in field.
Mojave_RIT_Bolus_ID: Rumen Implant Transmitters (RIT) ID number.
Mojave_RIT_Swallowed: RIT swallowed (1) or not (0).
Mojave_Sample_Blood: Blood sample collected (1) or missing (0).
Mojave_Pharyngeal_Swab: Pharyngeal swab collected (1) or missing (0).
Mojave_Sample_Nasal_Swab: Nasal swab collected (1) or missing (0).
Mojave_Sample_Parasite: Fecal sample for parasite composition analysis collected (1) or missing (0).
Mojave_Sample_Ear_Swab: Ear swab collected (1) or missing (0).
Mojave_Sample_Feces: Fecal sample collected (1) or missing (0).
Mojave_Sample_Hair: Hair sample collected (1) or missing (0).
Mojave_Sample_Photo: Photo taken (1) or missing (0).
Mojave_Medications_Administered: Medications administered (1) or not (0) during field collection.
Mojave_Prophylaxis_VIT_MuSE: Sheep were administered (1) or not administered (0) MU-SE on capture. MU-SE (selenium, vitamin E) is an emulsion of selenium-tocopherol for the prevention and treatment of Selenium-Tocopherol Deficiency syndrome in weanling calves and breeding beef cattle.
Mojave_Treatments_Administered: Noted if additional treatments were administered to Mojave sheep during capture.
Mojave_Sample_Collection_Location: Location where samples were processed. Field means that the helicopter team collected the samples from the sheep while it was netted down. This often happened if weather conditions (e.g. wind speed) worsened and it was not safe to transport the sheep to basecamp via helicopter. If sheep were processed in basecamp, they were transported via helicopter to basecamp and samples were collected by the team listed. Often, more samples were collected from sheep that were transported into basecamp than from sheep that were left in the field.
Mojave_Range_Code: Mojave mountain range code from the location of capture (see Mojave_Capture_Location).
Mojave_Range_County: County that the sheep were captured in.
Mojave_Capture_Location: The capture location.
Mojave_Capture_Location_Long: The location in which the sheep were captured, often reported relative to the location listed in the Mojave_Capture_Location column.
Mojave_Recorder_Name: The person that recorded data on each sheep that was captured and processed in basecamp.
Mojave_Capture_Team: Team that was involved in the capture of the sheep (helicopter team).
Mojave_Sample_Collection_Team: Names of the team that was involved in collecting the samples that were gathered from the basecamp sheep.
Mojave_Capture_Method: The method that was employed by the capture helicopter team. All sheep in the Mojave were captured via helicopter net gun.
Mojave_Slow_To_Move_Post_Capture: If sheep were returned to the field and then the helicopter team noticed that they were slow to move post release, this is indicated as a 1. Most sheep recorded during this year got up quickly on release.
Pen_Ewe_Group: The ewe group of the Peninsular region. The ewe group is a proxy for the population structure of the populations within the Peninsular region. Note that ewe group in this region is probably a much rougher proxy for population structure than that of the Mojave region. The reason is that population ranges tend to be linear and parallel to one another, so it is thought that there is much more crossover between populations in this region. The assignment of an individual sheep to a ewe group is also much more debated than the assignment of a sheep to a population within the Mojave region.
Pen_Recovery_Region: The recovery region shorthand for where the sheep were captured within the Peninsular ranges.
Pen_Recovery_Region_Full: The full name of the recovery region.
Pen_Recovery_Region_Notes: Notes regarding the region in which the sheep was captured.
Pen_Lattitude: Latitude of the GPS location of the Peninsular sheep on capture.
Pen_Longitude: Longitude of the GPS location of the Peninsular sheep on capture.
Pen_Temperature: Temperature on capture.
Pen_Field_Behavior: Notes on behavior during capture in Peninsular region.
Plate_Location: Location of sample during extraction and sequencing. Contains both Plate_ID and Well_ID.
Plate_ID: Plate ID number for sequencing.
Well_ID: Well ID number for sequencing.
Sampling_Location: Location that samples were taken from. “Basecamp” samples were either taken by the helicopter team in the field, or they were taken by scientists in the basecamp after the helicopter team transported the sheep. “Field” samples were those samples that were collected at sheep bed sites. “Longitudinal integrity samples” came from an experiment conducted to see how long after fecal sample production the microbial communities showed changes.
Barcode_18S: The 18S sample barcode used for sequencing.
Sharpton_Internal_Project_ID_18S: The internal Sharpton Lab project ID that is used to map sequencing runs to projects within the lab.
Raw_FastQ_Server_Location_18S: The location on the server where raw 18S sequences are stored.
Raw_FastQ_Forward_Name_18S: The fastq file name of the raw forward 18S reads.
Sequencing_Lane_18S: The sequencing lane that samples were run in.
Sequencing_ID_18S: The sample ID for the 18S sequences that was assigned during sequencing.
Raw_FastQ_Reverse_Name_18S: The fastq file name of the raw reverse 18S reads.
path_tmp_data_18S: A temporary back up copy of the 18S sequences used for analysis.
path_tmp_links_18S: A temporary symbolic link to the copy of the 18S used for analysis.
Lines_18S: The number of lines in the raw 18S forward sequencing file.
Barcode_16S: The 16S sample barcode used for sequencing.
Sharpton_Internal_Project_ID_16S: The internal Sharpton Lab project ID that is used to map sequencing runs to projects within the lab.
Raw_FastQ_Server_Location_16S: The location on the server where raw 16S sequences are stored.
Raw_FastQ_Forward_Name_16S: The fastq file name of the raw forward 16S reads.
Sequencing_Lane_16S: The sequencing lane that samples were run in.
Sequencing_ID_16S: The sample ID for the 16S sequences that was assigned during sequencing.
Raw_FastQ_Reverse_Name_16S: The fastq file name of the raw reverse 16S reads.
path_tmp_data_16S: A temporary back up copy of the 16S sequences used for analysis.
path_tmp_links_16S: A temporary symbolic link to the copy of the 16S used for analysis.
Lines_16S: The number of lines in the raw 16S forward sequencing file.
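The Lines_16S and Lines_18S columns can be turned into approximate read counts, which is useful when screening for low-depth samples. The following is a minimal sketch that assumes the line counts were taken from standard FASTQ files with exactly four lines per record; the function name is illustrative and not part of the project code.

```python
def fastq_read_count(line_count: int) -> int:
    """Convert a FASTQ line count into a read count.

    Assumes the standard four-line FASTQ record layout
    (header, sequence, '+', quality). This layout is an assumption
    about how Lines_16S / Lines_18S were produced.
    """
    if line_count < 0 or line_count % 4 != 0:
        raise ValueError(f"{line_count} lines is not a whole number of FASTQ records")
    return line_count // 4


# Example: a forward file with 400 lines holds 100 reads.
reads = fastq_read_count(400)
```

A count that is not divisible by four raises an error, which may point to a truncated file or a multi-line FASTQ layout.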
PCV: Packed cell volume (%). Borjesson et al. 2000 reported hematocrit reference intervals for free-ranging desert bighorn sheep of 44.3% - 56.2% for females and 33.2% - 56.3% for males.
TP: Total protein taken from capillary tube along with PCV.
Mojave_Death_Date: Death date of an animal. Dates in the death column are animals that were retrieved upon death.
Mojave_Mortality_Date: Death date of an animal. Dates in the mortality column are animals that were not retrieved upon death.
GPS_Collar_End_Date: Date that the GPS signal cut out in the event the sheep did not die but the collar stopped working. RIT data were used to confirm the date. The GPS end date is either when the GPS collar dropped off or when the GPS collar failed (stopped recording). It is also possible that it is a day or two after the mortality date so that the temperature / HR data were not truncated too soon.
RIT_End_Date: Date that RIT Collar stopped recording.
Mortality_Notes: Notes regarding Mojave_Death_Date, Mojave_Mortality_Date, GPS_Collar_End_Date, or RIT_End_Date.
VDL_Chemistry_Accession_Number: VDL report accession number for chemistry values (BUN, Cr, Glucose, Cholesterol, Triglycerides, TP, Albumin, Globulin, Bilirubin, Calcium, Chloride, CK, GGT, Mg, NEFA, P, K, SDH, Na, tCO2, TP).
Chemistry_VDL_Reference_Range_Species: The species that the VDL reports their reference ranges from. Our VDL reference ranges were set to domesticated sheep reference ranges.
VDL_Chemistry_Sample_Type: Sample type that was submitted for chemistry values.
VDL_Chemistry_Accession_Number_Specimen: The VDL’s accession number followed by a sample number linking each individual’s report to a unique number.
BUN: Blood urea nitrogen (mg/dL). VDL reference range 10 - 35 mg/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep were 5 - 28 mg/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 6.4 - 28.6 mg/dL.
Creatinine: Creatinine (mg/dL). VDL reference range 0.9 - 2.0 mg/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep were 1.6 - 2.6 mg/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 1.1 - 2.5 mg/dL.
Glucose: Glucose (mg/dL). VDL reference ranges 50 - 85 mg/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep were 95 - 185 mg/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 78.8 - 222.4 mg/dL.
Cholesterol: Cholesterol (mg/dL). VDL reference ranges 40 - 76 mg/dL. Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 28.6 - 76.0 mg/dL.
Triglycerides: Triglycerides (mg/dL). No standard reported reference range by VDL. Southeast Oregon bighorn sheep reference values (nonparametric) are 71.2 - 536.4 mg/dL.
Total_Protein: Total Protein (g/dL). VDL reference ranges are 5.5 - 7.5 g/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for total protein was 6.0 - 9.4 (n = 184) for adult sheep. Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 4.94 - 8.72 g/dL.
Albumin: Albumin (g/dL). VDL reference ranges are 2.5 - 3.9 g/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for albumin was 2.8 - 3.7 g/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 2.54 - 4.42 g/dL.
Globulins: Globulins (g/dL). Globulins were not calculated by the VDL, nor were reference ranges reported. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for globulins were 2.8 - 6.1 g/dL (n = 184) for adults and 0.8 - 1.3 g/dL for young animals (n = 16). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile.
Albumin_Globulin_Ratio: The albumin to globulin ratio. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for albumin:globulin ratios was 0.5 - 1.2 (n = 184) for adults and was 0.8 - 1.3 for young animals (n = 16). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile.
Bilirubin: Total bilirubin (mg/dL). VDL reference ranges are 0.0 - 0.5 mg/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for total bilirubin were 0.0 - 0.1 mg/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 0.10 - 0.40 mg/dL.
CK: Creatine Kinase (U/L). VDL reference ranges are 50 - 150 U/L. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for CK were 175 - 2300 U/L (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 16.8 - 1868.4 U/L.
GGT: Gamma-glutamyl Transferase (U/L). VDL reference ranges are 30 - 94 U/L. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for GGT are 20 - 130 U/L (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 14.0 - 78.6 U/L.
AST: Aspartate transaminase (U/L). VDL reference ranges are 60 - 110 U/L. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for AST were 78 - 312 U/L (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 84 - 310.6 U/L.
Sodium: Sodium (mEq/L). VDL reference ranges are 145 - 155 mEq/L. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for sodium were 145 - 160 mmol/L (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 117.6 - 178.8 mEq/L.
Potassium: Potassium (mEq/L). VDL reference ranges are 4.5 - 6.0 mEq/L. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for potassium were 3.8 - 6.3 mmol/L (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 3.60 - 7.28 mEq/L.
Chloride: Chloride (mEq/L). VDL reference ranges are 95 - 112 mEq/L. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for Chloride were 89 - 107 mmol/L (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 78.2 - 115.6 mEq/L.
Calcium: Calcium (mg/dL). VDL reference ranges are 8.5 - 12.0 mg/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for calcium were 9.3 - 11.5 mg/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 7.34 - 12.60 mg/dL.
Phosphorus: Phosphorus (mg/dL). VDL reference ranges are 5.0 - 7.5 mg/dL. In a study by Borjesson et al. 2000, reference intervals for free-ranging bighorn sheep for phosphorus were 4.0 - 9.3 mg/dL (n = 200). Reference values for Borjesson et al. 2000 were calculated with nonparametric analysis using the central 90th percentile. Southeast Oregon bighorn sheep reference values (nonparametric) are 2.96 - 8.62 mg/dL.
Magnesium: Magnesium (mg/dL). VDL reference ranges are 2.2 - 2.8 mg/dL. No reference ranges were reported for Mg in Borjesson et al. 2000. Southeast Oregon bighorn sheep reference values (nonparametric) are 2 - 3.66 mg/dL.
tCO2: Total Carbon Dioxide (mEq/L). VDL reference ranges are 21.0 - 28.0. No reference ranges were reported for tCO2 in Borjesson et al. 2000. Southeast Oregon bighorn sheep reference values (nonparametric) are 0.44 - 15.0 mEq/L.
BHbA: Beta-hydroxybutyrate (mg/dL). No reported reference ranges at VDL. No reference ranges were reported for BHbA in Borjesson et al. 2000. Southeast Oregon bighorn sheep reference values (nonparametric) are 1.29 - 3.63 mg/dL.
Non_Esterfied_Fatty_Acids: Non-esterified fatty acids (mEq/L). No reported VDL reference ranges. No reported ranges for California bighorn sheep. No reference ranges were reported for non-esterified fatty acids in Borjesson et al. 2000.
Anion_Gap: Anion gap reported in mEq/L. No reported reference range by VDL. Normal anion gap reported for bovines is 19 - 26. Based on the VDL reference ranges for sodium, potassium, chloride, and tCO2 listed above (with tCO2 standing in for HCO3-), the normal anion gap would be the following:
- Anion Gap = (Na+ + K+) - (Cl- + HCO3-)
- Min = (Min+) - (Max-) = (145 + 4.5) - (112 + 28) = 9.5
- Max = (Max+) - (Min-) = (155 + 6.0) - (95 + 21) = 45
- Southeast Oregon bighorn sheep reference values (nonparametric) are 34.4 to 64.0.
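The anion gap bounds worked out above can be reproduced with a short sketch. It assumes the sodium, potassium, chloride, and tCO2 intervals are the VDL values listed in the entries above, and it uses tCO2 in place of HCO3-, as the worked numbers do. The function and argument names are illustrative only.

```python
def anion_gap(na: float, k: float, cl: float, hco3: float) -> float:
    """Anion gap = (Na+ + K+) - (Cl- + HCO3-), all in mEq/L."""
    return (na + k) - (cl + hco3)


def anion_gap_bounds(na, k, cl, hco3):
    """Widest possible anion gap interval given (low, high) reference intervals.

    The minimum pairs the lowest cations with the highest anions;
    the maximum pairs the highest cations with the lowest anions.
    """
    low = anion_gap(na[0], k[0], cl[1], hco3[1])
    high = anion_gap(na[1], k[1], cl[0], hco3[0])
    return low, high


# VDL intervals copied from the Sodium, Potassium, Chloride, and tCO2 entries.
vdl_bounds = anion_gap_bounds(
    na=(145, 155), k=(4.5, 6.0), cl=(95, 112), hco3=(21.0, 28.0)
)
print(vdl_bounds)  # (9.5, 45.0)
```

Because each bound combines the extremes of four separate intervals, the result is a deliberately wide envelope rather than a clinical reference interval.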
SDH: Sorbitol dehydrogenase (U/L). VDL reference ranges are 6.0 - 27.0 U/L. Southeast Oregon bighorn sheep reference values (nonparametric) are 11.16 - 114.79 U/L.
Chemistry_Laboratory: Laboratory that ran blood chemistry values. VDL is the Carlson College of Veterinary Medicine Veterinary Diagnostic Lab.
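Because the chemistry entries above list up to three reference intervals per analyte, the same measurement can be classified differently depending on the source. The sketch below illustrates this for BUN using the intervals quoted in the BUN entry. The helper names and the example value of 30 mg/dL are hypothetical, and treating the interval endpoints as inclusive is an assumption.

```python
# BUN reference intervals (mg/dL) copied from the BUN entry above.
BUN_INTERVALS_MG_DL = {
    "VDL (domestic sheep)": (10.0, 35.0),
    "Borjesson et al. 2000": (5.0, 28.0),
    "Southeast Oregon bighorn": (6.4, 28.6),
}


def classify(value: float, low: float, high: float) -> str:
    """Label a value relative to an inclusive reference interval."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "within"


def classify_against_all(value: float, intervals: dict) -> dict:
    """Classify one value against every reference interval in the mapping."""
    return {
        source: classify(value, low, high)
        for source, (low, high) in intervals.items()
    }


# A hypothetical BUN of 30 mg/dL is within the VDL interval
# but above both bighorn-specific intervals.
result = classify_against_all(30.0, BUN_INTERVALS_MG_DL)
```

Keeping the intervals in one mapping per analyte makes it explicit which reference population a "high" or "low" flag refers to.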
CBC_Personnel: Initials of person performing CBC differential count.
Neutrophils: The percent neutrophils on WBC differential count.
Lymphocytes: The percent lymphocytes on WBC differential count.
Monocytes: The percent Monocytes on WBC differential count.
Eosinophils: The percent Eosinophils on WBC differential count.
Basophils: The percent Basophils on WBC differential count.
Bands: The percent banded neutrophils on WBC differential count.
Metamyelocytes: The percent Metamyelocytes on WBC differential count.
Neutrophil_Toxicity: A subjective rating of degree of neutrophil toxicity as determined by clinical pathologist. Possible values are 0 (no toxic changes noted), 1, 2, and 3 (highest degree of toxic changes noted).
Lymphocyte Reactivity: A subjective rating of degree of lymphocyte reactivity as determined by clinical pathologist. Possible values are 0 (no reactive changes noted), 1, 2, and 3 (highest degree of reactive changes noted).
Anaplasma: A rating of the degree of Anaplasma infection present in the blood smear, where 0 means no Anaplasma infection was noted and 3 means the highest numbers of Anaplasma were noted in the peripheral blood sample.
CBC_Differential_Total: Sum of percentages obtained for differential WBC count. Should sum to 100%.
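The check that the differential sums to 100% can be sketched as follows. This assumes that CBC_Differential_Total is the sum of the seven percentage columns listed above and that a small tolerance is acceptable for rounding. Both are assumptions, and the function names and tolerance value are illustrative.

```python
# Percentage columns assumed to make up the WBC differential.
DIFFERENTIAL_COLUMNS = [
    "Neutrophils",
    "Lymphocytes",
    "Monocytes",
    "Eosinophils",
    "Basophils",
    "Bands",
    "Metamyelocytes",
]


def differential_total(row: dict) -> float:
    """Sum the differential percentages, treating missing cells as 0."""
    return sum(float(row.get(column) or 0) for column in DIFFERENTIAL_COLUMNS)


def differential_is_complete(row: dict, tolerance: float = 0.5) -> bool:
    """True when the differential sums to 100% within the given tolerance."""
    return abs(differential_total(row) - 100.0) <= tolerance


# Hypothetical sheep record for illustration only.
example_row = {
    "Neutrophils": 45,
    "Lymphocytes": 48,
    "Monocytes": 3,
    "Eosinophils": 3,
    "Basophils": 1,
    "Bands": 0,
    "Metamyelocytes": 0,
}
```

Rows that fail the check are candidates for a transcription error in one of the differential columns.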
Fecal_Egg_Count_Sample_Weight_g: Grams of fecal sample used for fecal egg counts.
Fecal_Egg_Count_Ascaris: Fecal egg count ascaris.
Fecal_Egg_Count_Strongyle: Fecal egg count strongyle.
Fecal_Egg_Count_Coccidia: Fecal egg count coccidia.
Fecal_Egg_Count_Trichuris: Fecal egg count trichuris.
Fecal_Egg_Count_Pinworm: Fecal egg count pinworm.
Fecal_Egg_Count_Cestode: Fecal egg count cestode.
Fecal_Egg_Count_Process_Date: Date fecal egg counts were performed.
Fecal_Egg_Count_Notes: Notes from the field technician about fecal egg counts.
Serology_AM_ELISA:
Se: Selenium ppm in whole blood EDTA from UC Davis CAHFS Report D2014031. Reporting limit is the lowest routinely quantified concentration of an analyte in a sample. The analyte may be detected, but not quantified at concentrations that are below the reporting limit. Sample volumes less than requested might result in reporting limits that are higher than those listed. Reporting limit 0.010 ppm. Reference range reported is 0.08 to 0.5.
Se_Specimen: The sample that was used for by the laboratory for selenium.
Se_Lab: Lab that reported Se column.
Se_Lab_Accession: Lab accession number from Se_Lab.
Se_Units: Units that Se is reported in.
BRSV_Status: Bovine Respiratory Syncytial Virus Antibody Immunofluorescence Assay (IFA) results. The suggested clinical interpretation is that a titer of 1:40 is evidence of exposure / infection in domesticated livestock. Titers are classified into semi-quantitative categories:
- “Low” := 4 - 32
- “Medium” := 40 - 160
- “High” := 320 - 5120
Results are reported from UC Davis CAHFS Report D2014031. See BRSV_Titer result.
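The semi-quantitative BRSV categories above can be expressed as a small lookup. This sketch assumes titers are stored as reciprocals (for example 40 for 1:40). Values outside the listed ranges return no category rather than being guessed. The names are illustrative.

```python
from typing import Optional

# Reciprocal-titer bounds copied from the BRSV_Status entry above.
BRSV_CATEGORIES = (
    ("Low", 4, 32),
    ("Medium", 40, 160),
    ("High", 320, 5120),
)


def brsv_category(reciprocal_titer: int) -> Optional[str]:
    """Map a reciprocal BRSV titer to its semi-quantitative category.

    Returns None when the titer falls outside every listed range.
    """
    for label, low, high in BRSV_CATEGORIES:
        if low <= reciprocal_titer <= high:
            return label
    return None


# A titer of 1:40 is the suggested threshold for exposure / infection.
category = brsv_category(40)
```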
BRSV_Titer: Titer reported along with BRSV_Status.
BRSV_Specimen: Sample used for BRSV_Status.
BRSV_Lab: Laboratory that ran BRSV_Status results are from UC Davis CAHFS Report D2014031.
BRSV_Lab_Accession: Lab accession number from BRSV_Status.
BRSV_Test: Test type for results presented in the BRSV_Status and BRSV_Titer columns.
CE_Status: Contagious Ecthyma antibody results reported via complement fixation (CF). Based on clinical interpretation, the CDFW may report these as:
- “1” := CF result is negative at 1:5
- “2” := CF result is anticomplementary (AC) or has non-specific agglutination.
- “3” := CF result is positive
Note that for the current dataset, I encoded CE_Status as reported by UC Davis CAHFS Report D2014031 and not by CDFW semi-quantitative values discussed above for clinical interpretation. See paired column CE_CF.
CE_CF: Complement fixation number paired with CE_Status above.
CE_Specimen: Sample type used for CE_Status.
CE_Lab: Lab that ran the test for CE_Status above.
CE_Test: Test type used for CE_Status.
BT_EHD_Status: Test result from Bluetongue Virus / Epizootic Hemorrhagic Disease antibody agar gel immunodiffusion (AGID) assay as reported by UC Davis CAHFS Report D2014031. Results are reported as positive or negative only.
BT_EHD_Specimen: Specimen type for BT_EHD_Status.
BT_EHD_Lab: Lab reporting BT_EHD_Status.
BT_EHD_Lab_Accession: Accession number of report for BT_EHD_Status.
BT_EHD_Test: Test type of BT_EHD_Status.
BT_Status: Test result from Bluetongue virus antibody competitive enzyme-linked immunosorbent assay (cELISA) as reported by UC Davis CAHFS report D2014031. Results are reported as positive or negative only.
BT_Specimen: Specimen type for BT_Status.
BT_Lab: Lab reporting BT_Status.
BT_Lab_Accession: Accession number of report for BT_Status.
BT_Test: Test type of BT_Status.
PI3_Status: Parainfluenza 3 (PI-3) antibody hemagglutination inhibition (HI) as reported by UC Davis CAHFS report D2014031. The suggested clinical interpretation doesn’t differ based on the results shown here.
PI3_Titer: Titer result for PI3_Status.
PI3_Specimen: Specimen type for PI3_Status.
PI3_Lab: Laboratory that reports PI3_Status.
PI3_Lab_Accession: Lab accession number for PI3_Status.
PI3_Test: Test type for PI3_Status.
Am_Status:
Am_Specimen:
Am_Lab:
Am_Lab_Accession:
Am_Test:
IBR_Status:
IBR_Titer:
IBR_Specimen:
IBR_Lab:
IBR_Lab_Accession:
IBR_Test:
BVDV1_Status:
BVDV1_Titer:
BVDV1_Specimen:
BVDV1_Lab:
BVDV1_Lab_Accession:
BVDV1_Test:
BVDV2_Status:
BVDV2_Titer:
BVDV2_Specimen:
BVDV2_Lab:
BVDV2_Lab_Accession:
BVDV2_Test:
Ch_Status:
Ch_Titer:
Ch_Specimen:
Ch_Lab:
Ch_Lab_Accession:
Ch_Test:

infectious_bovine_rhinotracheitis_titer bovine_viral_diarrhea_titer serum neutralization parainfluenza titer parainfluenza isolation parainfluenza HI bluetongue ELISA AGID BT isolation epizootic hemorrhagic disease isolation brucellosis lepto each of lepto titers

bovine resp syncytial virus ovine progressive pneumonia anaplasma card anaplasma ELISA toxoplasmosis titer chlamydia orf Mycoplasma ovipneumoniae WSUT diagnostic and WSUE diagnostic

Mycoplasma ovipneumoniae https://tests.waddl.vetmed.wsu.edu/Tests/Details/8147

Metadata file list

TODO add in a description for all metadata files here.

California Bighorn Sheep Reference Intervals: Confidence intervals for chemistry data parameters for California bighorn sheep. The particular population of California bighorn sheep is geographically located in our Oregon study system.
Bighorn Sheep Chemistry Reference Ranges: A file of calculated chemistry reference intervals calculated from various bighorn sheep populations.
Desert Bighorn Sheep Fecal Egg Counts 2020: A file of fecal egg counts performed in the field for desert bighorn sheep.
Toxicology_Serology_Disease_Panel_CAHFS_UCDavis_D2014031_HKA: Serology results from UCDavis report number D2014031. Results have been crosschecked by HKA and MFW.
Toxicology_Serology_Disease_Panel_Clinical_Interpretation: Clinical interpretation of what titer values mean from serology results. Note that the status reported by UCDavis (e.g. “positive”) might not match the values that they suggest here for use as positive or negative animals. Annotated this readme file from the UCDavis disease panel results with this info.
Serology Mojave 1980-2020: Latest serology data taken from Sara Carpenter email sent on 3/10/2023 with attachment “Serology Mojave 1980-2017_SC 2.csv”. I did not integrate this into my analysis because it is encoded in a manner specific for integration with the other disease panel results that are encoded by the CDFW.

Codes for positive and negative results include:

- 0 := Not done or sample was contaminated
- 1 := Negative
- 2 := Suspect
- 3 := Positive

Codes for titer results include:

- 0 := Not done
- 1 := Negative

Titers are the number of the titer’s reciprocal (i.e. 64 for 1:64)

Age

- 0 := Not known
- 1 := 0 to 2 years
- 2 := >2 years

Sex

- 0 := Not known
- 1 := Male
- 2 := Female
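The code tables above can be applied with plain lookups when reading the encoded serology file. The sketch below is illustrative only. The dictionary and function names are hypothetical, and it assumes any titer value other than 0 or 1 is a reciprocal titer, as described above.

```python
# Code tables copied from the lists above.
RESULT_CODES = {
    0: "Not done or sample was contaminated",
    1: "Negative",
    2: "Suspect",
    3: "Positive",
}
AGE_CODES = {0: "Not known", 1: "0 to 2 years", 2: ">2 years"}
SEX_CODES = {0: "Not known", 1: "Male", 2: "Female"}


def decode_titer(code: int) -> str:
    """Decode a titer cell: 0 and 1 are status codes, anything else is a reciprocal."""
    if code == 0:
        return "Not done"
    if code == 1:
        return "Negative"
    return f"1:{code}"


# Example: a positive result with a reciprocal titer of 64 in an adult female.
decoded = (RESULT_CODES[3], decode_titer(64), AGE_CODES[2], SEX_CODES[2])
```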