README.Rmd

Holly Arnold 2022-10-30

What are you here for?

Raw Data Mapping

You’re here because you are looking for raw data from a sequencing project produced by the Sharpton lab core. See the Raw Data Files Section.

Data Analysis Results

You’re here because you are looking for analysis outputs or descriptions of methods for a manuscript from Arnold. See the Data Analysis Results section.

Computational Infrastructure

You’re here because you need to set up a bash profile, learn some basic UNIX commands, set up conda, or find out where you can access the RStudio cloud. See the Helpful computational resources section.

Project Download Record

You’re here because you are looking for details on how data files were transferred from one place to another. See Project Download Record. You probably aren’t looking for this.

Raw Data Files

This section is a comprehensive list of the raw data files that I have transferred after sequencing. Each project below is described with a key value; use that key to find the corresponding sub-directory within raw_data_mapping/. Read through this GitLab page to find where you are mentioned.

Raw data are deposited in sequencing project folders that are named by internal Sharpton Lab run identifiers. Currently, projects are located here: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/.

Project Descriptions

Projects live in /nfs3/Sharpton_Lab/prod/prod_restructure/projects/, each within a project folder identified by a three-digit “TS” number unless otherwise specified.
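As a quick orientation, the project folders can be listed from the shell. This is a minimal sketch rather than part of the lab's tooling: the function name list_ts_projects is hypothetical, and it assumes project folders start with “TS” followed by three digits, so suffixed names such as TS032A also match.

```shell
# List project folders named TS<three digits>[suffix] inside a projects directory.
# The default directory is the one given in this README; pass another path to override it.
list_ts_projects() {
    local projects_dir="${1:-/nfs3/Sharpton_Lab/prod/prod_restructure/projects}"
    local path
    for path in "$projects_dir"/TS[0-9][0-9][0-9]*; do
        # -d follows symbolic links, so linked project folders are listed too
        if [ -d "$path" ]; then
            basename "$path"
        fi
    done
}
```

On the server, running list_ts_projects with no argument prints one project folder name per line. Folders that still carry a CQLS identifier (for example TSOA33, which has the letter O after TS) are intentionally not matched.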

Cross-sectional Microbial-Parasite-Immune Interactions of the Mojave and Peninsular Bighorn Sheep populations (2022).

Project Lead: Holly Arnold
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu); Brianna Beechler (breebeechler@gmail.com); Anna Jolles (anna.jolles@oregonstate.edu); Sara Carpenter (carpesar@oregonstate.edu): Sample DNA extraction; Luke Weinstein (weinstel@oregonstate.edu): Generation of immune data; Leigh Combrink (leighcombrink@arizona.edu): field sample collection; Kristin Kasschau (Kristin.Kasschau@oregonstate.edu): 18S and 16S sample prep.
Project Description: Determine how the population structure of Mojave and Peninsular bighorn sheep associates with microbe-immune and microbe-parasite interactions.
Sample Description: Mojave and Peninsular bighorn sheep microbiome (16S), parasite (18S), and immunology data were collected in 2022 at basecamp.
Raw Sample Map Location: raw_data_mapping/bighorn_project_2022_run_1/bighorn_sheep_2022_raw_fastq_mapping_file.numbers
Sample_Group: basecamp
Sample_ID: PEBS2020-<###>-<###> and DEBS2020-<###>-<###>
FastQ_ID: Numbers represent the plate ID from where the sample was extracted and submitted to CQLS.
Sample_Duplicated: Some samples have duplicated extractions. If so, they are labelled yes.
Raw Data Fastq Location: TS032A or TS032B (Raw data initially stored in TSOA33 and TSOB1)
Notes: Files located at raw_data_mapping/bighorn_project_2022_run_1/: (1) bighorn_sheep_2022_raw_fastq_mapping_file: Map fastQ raw data file well identifier to sheep ID and to project ID, (2) raw_illumina_links_email_confirmation.txt: Email with initial CQLS raw data location, (3) TS032A-18S_barcodes.xlsx: Barcode mapping file wells 4, 1, 3 and (4) TS032A-18S_barcodes.xlsx: Barcode mapping file wells 5, 2, 3.

Cross-sectional description of Peninsular Bighorn Sheep populations

Project Lead: Sakshi Nulkar (nulkars@oregonstate.edu)
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu); Brianna Beechler (breebeechler@gmail.com); Anna Jolles (anna.jolles@oregonstate.edu); Holly Arnold (Holly.K.Arnold@gmail.com): Informatics support; Sara Carpenter (carpesar@oregonstate.edu): Sample DNA extraction; Leigh Combrink (leighcombrink@arizona.edu): field sample collection; Kristin Kasschau (Kristin.Kasschau@oregonstate.edu): 18S and 16S sample prep.
Project Description: Descriptive study of Peninsular bighorn sheep from samples collected in the field
Sample Description: Peninsular bighorn sheep fecal samples were collected in the field in 2022.
Raw Sample Map Location: raw_data_mapping/bighorn_project_2022_run_1/bighorn_sheep_2022_raw_fastq_mapping_file.numbers
Sample_Group: field
Sample_ID: PEBS_<####>
FastQ_ID: Numbers represent the plate ID from where the sample was extracted and submitted to CQLS.
Sample_Duplicated: Some samples have duplicated extractions. If so, they are labelled yes.
Raw Data Fastq Location: TS032A or TS032B (Raw data initially stored in TSOA33 and TSOB1). TS032 contains the raw 16S data.

Determination of Longitudinal sample integrity of bighorn sheep fecal samples collected from the field

Project Lead: Leigh Combrink (leighcombrink@arizona.edu)
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu); Brianna Beechler (breebeechler@gmail.com); Anna Jolles (anna.jolles@oregonstate.edu)
Project Description: Descriptive study of Peninsular bighorn sheep from samples collected in the field
Sample Description: Peninsular bighorn sheep fecal samples were collected in the field in 2022.
Raw Sample Map Location: raw_data_mapping/bighorn_project_2022_run_1/bighorn_sheep_2022_raw_fastq_mapping_file.numbers
Sample_Group: sample_integrity
Sample_ID: PEBS_<####>
FastQ_ID: Numbers represent the plate ID from where the sample was extracted and submitted to CQLS.
Sample_Duplicated: Some samples have duplicated extractions. If so, they are labelled yes.
Raw Data Fastq Location: TS032A or TS032B (Raw data initially stored in TSOA33 and TSOB1)

Scalebrain Data

Project Lead: Ed Davis (ed@cqls.oregonstate.edu)
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu)
Project Description: Scalebrain data
Sample Description: Unknown
Raw Data Fastq Location: TS027.

EMC2 Data

Project Lead: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu)
Project Support:
Raw Data Fastq Location: TS031

EMC2 Data

Project Lead: Sara Wolf (wolfs2@oregonstate.edu)
Project Support:
Raw Data Fastq Location: TS035
Notes: The PCR blank from TS031 was PCRd with the TS035 data. TS031_EMC2_PCR_blank is the PCR blank for both sets of samples.

BEE21 Data

Project Lead: Seb Singleton (singlese@oregonstate.edu)
Project Support:
Raw Data Fastq Location: TS036
Notes: TS035, TS036, TS037 and TS038 were all run together and so directories TS036-8 are linked to TS035.

Microgreens data

Project Lead: Carmen Wong (Carmen.Wong@oregonstate.edu)
Project Support:
Raw Data Fastq Location: TS037
Notes: TS035, TS036, TS037 and TS038 were all run together and so directories TS036-8 are linked to TS035.

PIB2 Data

Project Lead: Seb Singleton (singlese@oregonstate.edu)
Project Support:
Raw Data Fastq Location: TS038
Notes: TS035, TS036, TS037 and TS038 were all run together and so directories TS036-8 are linked to TS035.

Bighorn sheep project 2021 and 2022 16S

Project Lead: Holly Arnold (holly.k.arnold@gmail.com)
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu)
Project Description: Bighorn sheep data for longitudinal (2021) and for parasite intervention experiment (2022)
Sample Description:
Raw Data Fastq Location: TS039 - MiSeq Run 1046, 1047, 1050

Bighorn sheep project 18S

Project Lead: Holly Arnold (holly.k.arnold@gmail.com)
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu)
Project Description: Bighorn sheep data for longitudinal (2021) and for parasite intervention experiment (2022) 18S
Sample Description:
Raw Data Fastq Location: TS039 - 1063, 1064, 1065

PFAS (TS040 and TS040Rename)

Project Lead: Ebony Stretch (stretche@oregonstate.edu)
Project Support: Thomas Sharpton (Thomas.Sharpton@oregonstate.edu); Robert Tanguay (tanguayr@oregonstate.edu); Kristin Kasschau (Kristin.Kasschau@oregonstate.edu): 16S sample prep.
Project Description: PFAS
Server Location: /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS040Rename/ and /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS040/

Data Analysis Results

Sample Integrity project bighorn sheep 2020

Project Lead: Leigh Combrink (leighcombrink@arizona.edu)
Methods Description: Extractions (Sara Carpenter): All samples used for microbiome analyses were stored at -80 °C until processing for extraction. For DNA extraction, samples were taken from the freezer and thawed at 37 °C. After thawing, a section of fecal pellet weighing between 0.08 and 0.1 g was separated and added to the Qiagen Inc. DNeasy PowerSoil Kit 96 well plate (Hilden, Germany). The filled plates were stored again at -80 °C until extraction. #TODO Add methods on ASV generation
Objects: phyloseq_sheep_2020_16S_sample_integrity.rds **
References:

Bighorn sheep field samples 2020

Project Lead: Sakshi Nulkar (nulkars@oregonstate.edu)
Methods Description: Extractions (Sara Carpenter): All samples used for microbiome analyses were stored at -80 °C until processing for extraction. For DNA extraction, samples were taken from the freezer and thawed at 37 °C. After thawing, a section of fecal pellet weighing between 0.08 and 0.1 g was separated and added to the Qiagen Inc. DNeasy PowerSoil Kit 96 well plate (Hilden, Germany). The filled plates were stored again at -80 °C until extraction. #TODO Add methods on ASV generation
Objects: phyloseq_sheep_2020_16S_field_samples.rds **
References:

Helpful computational resources

This section compiles resources and tips to help you work successfully with the computing infrastructure during your time in the Sharpton Lab. It shares several links for learning the command line (if you need to), describes the Sharpton Lab infrastructure setup, and finally describes how to set up the .bashrc and .bash_profile in your home directory so that you can access all the features of the CQLS infrastructure. CQLS finds it easier to provide support if we follow guidelines that keep setups similar between users. I have made every effort to point readers to how-to files that are often not easy or intuitive to find. Let me know if I should include something else here.

Getting Started

There are a variety of guides out there that can get you started with the command line in general.

How to get started with the CQLS infrastructure: https://tips.cgrb.oregonstate.edu/posts/the-cgrb-infrastructure-and-you/ .
Getting started with the command line: https://open.oregonstate.education/computationalbiology/chapter/the-command-line-and-filesystem/.
Working with some basic unix commands: https://astrobiomike.github.io/unix/
Command line best practices: https://github.com/jlevy/the-art-of-command-line
CQLS user portal for gitlab: https://gitlab.cgrb.oregonstate.edu/users/sign_in

Send more suggestions for helpful links for the next new lab member who joins, and I will add them here.

Sharpton Lab file infrastructure

Here is a list of documents created by Keaton describing the Sharpton Lab product (prod) file structure. Everything that lives under prod/ is backed up regularly, so prod/ is the place for data and code that cannot be reproduced easily:

data that we have generated
code that you write
any results that depend on randomization and cannot be reproduced (better: call set.seed() in R so that they can be)
public data that requires a large amount of work to re-assemble

Things that should not be stored in the prod folder include

publicly available data
software - software should be installed in your own directory at /nfs3/Sharpton_Lab/tmp/src/

Sharpton Lab File Structure

Sharpton Lab Data Life cycle

Setting up how you access software on Darwin

As a lab, and as a user, you have access to many different computational resources.

In general, as a lab, we have space on /nfs3/Sharpton_Lab/. This will be where you store most of the things you produce during your time.
As a user, you also have your own folder that is allocated to you at /home/micro/user. Your home directory is limited in space, so it’s best not to store things here, including software installs (we will talk about that shortly). Because this folder is set as your home directory, it is where we modify the files that control how you interact with the other programs on the compute infrastructure.
You can interact with Darwin through the RStudio cloud: https://rstudio-darwin.cqls.oregonstate.edu/auth-sign-in?appUri=%2F
You should learn to version control the software that you develop using the CQLS GitLab, which is OSU's counterpart to GitHub: https://gitlab.cgrb.oregonstate.edu/users/sign_in

The bash profile

Every time you log into darwin, you start out in your home directory. You can return to your home directory at any time by simply typing cd. Your home directory is located at /home/micro/username/, or ~/. Within your home directory, you have a file that sets your bash profile. The bash profile is a great tool for making the terminal easier and quicker to use. It can also make the terminal look prettier than the default mode.

The .bash_profile is a configuration file for the bash shell, which you access via the terminal. Before you make any changes to your bash profile, you should make a backup copy first, for example .bash_profile_back. Note that the bash profile is a hidden file, which is why the file name begins with a “.”. To see this file, you will have to type ls -a to show hidden files. The .bash_profile should be one of them.

# Look at the current .bash_profile
cat .bash_profile

# Save a copy
cp .bash_profile .bash_profile_back

# Look at the base .bash_profile

cat .bash_profile
# Jan 20 2022
# .bash_profile

Here is what is stored in a default .bash_profile created for a user on darwin.

# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
unset USERNAME

The bashrc file

As you can see, the .bash_profile sources another file called the .bashrc file. For most things, you should make changes within the .bashrc file, rather than to the .bash_profile. This includes things like setting alias names, prompt modification, and setting your path variable. The .bashrc file lives in the home directory as well. On the CQLS infrastructure, here is a standard .bashrc profile that they begin with for each user.

# .bashrc base file
#       Source the standard .bashrc file
source /local/cluster/etc/std.bashrc
#       Add your own personal changes following this line:

This script sources the standard bash file for any user. After this, we have access to any programs that are stored in /local/cluster/bin/. These programs are installed and kept updated by the CQLS. The benefit is that they can install one instance of the software and multiple users can access it. The downside is that the software might not be as customizable because it’s installed for everyone.

We get it, things get messy, you just want that program to run! Pretty soon, you have modifications to your $PATH variable in both the .bash_profile and .bashrc file, with multiple aliases calling different programs (maybe I am the only guilty party here!). Well, no wonder the default behavior of the compute infrastructure isn’t working for you! If you ever need to “reset” your home directory to the default settings, you now have the default .bash_profile and .bashrc file listed here. Don’t forget to back up your current files before making changes to the .bash_profile and .bashrc file.
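Before resetting anything, it can help to see what has accumulated in $PATH. The two helper functions below are a hypothetical sketch (their names are not part of the CQLS setup); they only read the variable and never change it.

```shell
# Print one PATH entry per line, in the order the shell searches them.
# With no argument, the current $PATH is used.
path_entries() {
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}

# Print each entry that occurs more than once, a common sign that
# both .bash_profile and .bashrc are appending the same directory.
duplicate_path_entries() {
    path_entries "${1:-$PATH}" | sort | uniq -d
}
```

Running duplicate_path_entries with no argument after logging in prints nothing when every directory appears in $PATH only once.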

Change prompt

In general, you should try and store things into discrete groups within your .bashrc file. First, let’s change what our prompt looks like. For example, right now, my current profile looks like this on default.

-bash-4.2$

That’s kind of ugly. Why don’t we change it to reflect our user name, which machine (host) we are on, and which directory we are currently in? To do that, we add these lines under the default code in the .bashrc file:

## Change bash prompt 
export PS1="Arnold@\h [\W]$ "

This change results in the prompt looking like this. We have our user name, the machine we are on, and then, within the brackets, the current directory. Because we are in our home directory, it’s just a tilde.

Arnold@darwin [~]$
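The prompt above hard-codes the name “Arnold”. As a hedged alternative, bash's standard prompt escapes can fill in the login name automatically, so the same line works in anyone's .bashrc. This sketch assumes the shell is bash.

```shell
## Change bash prompt (variant that does not hard-code the user name)
# \u = login name, \h = host name up to the first '.',
# \W = last component of the current directory (~ in your home directory),
# \$ = '$' for a regular user and '#' for root
export PS1='\u@\h [\W]\$ '
```

The single quotes matter: they keep the backslash escapes intact so that bash expands them each time it draws the prompt.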

There are a billion modifications that you can do with your .bashrc file. And you probably don’t want to learn all the short hand for things. Here is a .bashrc generator that allows you to figure out the prompt you would like and then generates the bash code to be added to the .bashrc file https://bashrcgenerator.com/ .

Alias

The next lines in your .bashrc file should contain a list of your shortcuts, called aliases. In an effort to keep things tidy, I’ve decided to create another file called .alias, and then just have the .bashrc source this file. So I’ve added a line of code at the end of the .bashrc file to source the .alias file.

## Get shortcuts
source ~/.alias

Now, we have to make the .alias file! In general, you should use shortcuts to make your life easier. An alias is a keyword that stands for any command that you could run in bash. Good examples include a command that changes directories to a specified location, or a command for a program that you use all the time. For example, if we work on a particular project all the time, we could add this line to the .alias file:

alias mouse='cd /nfs3/Sharpton_Lab/prod/projects/mouse_behavior_metaanalysis/projects/'

Then in the command line, we can just type “mouse”, and voilà! We are suddenly in the directory we want to be in. This way we don’t have to navigate to the project folder by hand in every new terminal. That is exhausting! Another thing we might add are commands for software we use all the time. For example, if you hate remembering the git commands, then why not just make aliases for them here?
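A few git shortcuts of that kind might look like the sketch below. The alias names are hypothetical, chosen only for illustration; pick names that do not collide with programs you already use.

```shell
# Hypothetical git shortcuts for the ~/.alias file
alias gst='git status'                               # what has changed?
alias gad='git add'                                  # usage: gad file.R
alias gcm='git commit -m'                            # usage: gcm "describe the change"
alias glg='git log --oneline --graph --decorate'     # compact history view
```

After adding these lines, run source ~/.alias (or open a new terminal) and type alias with no arguments to see every shortcut that is currently defined.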

Bad examples of aliases include shortcuts to special programs installed within conda environments. Such an alias starts software that might not be configured correctly for the environment you are currently running in. To use this software, you should activate its conda environment instead.

R

We use R all the time, so it’s important to know how to start R, how to switch to a new R version, and how to install packages. By default, packages install into your /home/micro/ directory, but in time you will likely reach the space capacity of this folder, so you should install software somewhere else. In the Sharpton Lab, everything under prod/ is backed up, but because software can be downloaded again, there is no reason to back it up on prod; it should be installed in a directory under your name within the /nfs3/Sharpton_Lab/tmp/src/ folder.

The CQLS installs many softwares including R, but they do not maintain R packages. So, in our .bashrc profile, we need to put the desired R version in our path, then determine a destination for any libraries we download:

## Set up R path in .bashrc
export PATH=/local/cluster/R-4.1.0/bin:${PATH} # Tell which is the default R to use
export R_LIBS=/nfs3/Sharpton_Lab/tmp/src/arnoldhk/R/library/4.1.0 # where are user libraries stored
unset R_LIBS_USER # unset this - it's a relic from when everyone in the lab was using the same R version

Then, within your src/ folder, you should create an R/library/4.1.0 folder where we will install all packages for R version 4.1.0. This means that when the next version of R comes out, you can just create a new folder for the next set of R libraries, for example R/library/4.1.1, and then update the path to point to R version 4.1.1. Ed has made a great resource describing this here: https://software.cqls.oregonstate.edu/tips/posts/using-system-r-with-user-installed-packages/.
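Creating the versioned library folder can be sketched as a small helper. The function name make_r_library is hypothetical, and the layout it assumes (a personal folder under /nfs3/Sharpton_Lab/tmp/src/ containing R/library/<version>) is taken from the R_LIBS line in this README.

```shell
# Create the versioned R library folder and print its path.
#   $1 = your personal src folder, e.g. /nfs3/Sharpton_Lab/tmp/src/<your_user_name>
#   $2 = R version, e.g. 4.1.0
make_r_library() {
    local src_root="$1"
    local r_version="$2"
    local lib_dir="$src_root/R/library/$r_version"
    mkdir -p "$lib_dir"        # -p also creates R/ and R/library/ if they are missing
    printf '%s\n' "$lib_dir"   # print the path so it can be copied into R_LIBS
}
```

The printed path is the value to use for R_LIBS in the .bashrc. When a new R version arrives, calling the helper again with the new version number creates a fresh, empty library next to the old one.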

iGraph

Many R libraries that run phylogenetic analyses depend on igraph. Unfortunately, igraph had an issue compiling on Darwin because some C++ files were missing. The following solved the problem.

# Using /local/cluster/R-4.1.0/bin
install.packages("remotes")
remotes::install_cran("igraph") # the CRAN package name is all lower case

# remotes is also handy because you don't have to remember the Bioconductor (biocLite/BiocManager) and git commands
remotes::install_bioc("")
remotes::install_git("")

R cloud

The R cloud is available here: https://rstudio-darwin.cqls.oregonstate.edu

Conda

The next software that you will likely use is conda. Within your src/ folder, you should also create a conda/ folder to hold conda files. The conda/ folder should contain an envs and a pkgs folder to store conda environments and packages. Within your .bashrc file, you should add these lines. They should be located last in the .bashrc file.

## Conda - keep last in .bashrc
# >>> conda initialize >>> 
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/local/cluster/miniconda3_base/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/local/cluster/miniconda3_base/etc/profile.d/conda.sh" ]; then
        . "/local/cluster/miniconda3_base/etc/profile.d/conda.sh"
    else
        export PATH="/local/cluster/miniconda3_base/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Last, you should add the paths of these src folders to the .condarc file that is located in the home directory:

auto_activate_base: false
channels:
  - conda-forge
  - bioconda
  - defaults
envs_dirs:
  - /nfs3/Sharpton_Lab/tmp/src/arnoldhk/conda/envs
pkgs_dirs:
  - /nfs3/Sharpton_Lab/tmp/src/arnoldhk/conda/pkgs

Within the ~/.conda/ folder in your home directory, there is a file called environments.txt. This should have a list of environments and a link to where they are stored. If you move where conda environments or conda packages are stored, you will have to update this file as well as the .condarc file. Here is a tutorial on how to get started with conda: https://software.cqls.oregonstate.edu/tips/posts/conda-tutorial/.
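The envs and pkgs folders named in the .condarc need to exist before conda can use them. A minimal sketch for creating them follows; the function name make_conda_dirs is hypothetical, and the argument is assumed to be your personal folder under /nfs3/Sharpton_Lab/tmp/src/.

```shell
# Create the conda/envs and conda/pkgs folders inside a personal src folder.
#   $1 = your personal src folder, e.g. /nfs3/Sharpton_Lab/tmp/src/<your_user_name>
make_conda_dirs() {
    local src_root="$1"
    mkdir -p "$src_root/conda/envs" "$src_root/conda/pkgs"
    # Print the two paths so they can be pasted into envs_dirs and pkgs_dirs in ~/.condarc
    printf '%s\n' "$src_root/conda/envs" "$src_root/conda/pkgs"
}
```

Once the .condarc is edited, conda config --show envs_dirs pkgs_dirs reports the locations conda will actually use, which is a quick way to confirm that the file is being read.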

If you feel that the conda package would be used by more than just yourself, you can have conda programs installed by CQLS. For example, humann3 was installed for everyone. This is then activated by writing:

source /local/cluster/humann3/activate.sh

If you want to install your own conda environment, you can just call conda, and it will now install to the appropriate location under src/:

conda create -n test_env plotly=4.4.1 notebook=6.0.1 ipywidgets=7.5.1
conda activate test_env

Mamba has been installed on the conda base and can be used to solve environments more quickly. For example, you could install the test_env above more quickly by running

mamba create -n test_env plotly=4.4.1 notebook=6.0.1 ipywidgets=7.5.1

Conda and running a specific version of R

If you want to run a specific version of R from within conda, then you can do the following.

# See what R we are calling in base
which R

# See what R_LIBS we are pointing to
echo $R_LIBS

# Activate the conda environment that we want to use R in
source activate my_program

# Navigate to the base conda directory environment.
cd /nfs3/Sharpton_Lab/tmp/src/arnoldhk/conda/envs/my_program
/local/cluster/conda/conda_R_setup.sh

# Now, the version of R should be what we want
which R

# And R_LIBS is now pointing to somewhere else
echo $R_LIBS

Project Download Record

This serves as a record of how Arnold transferred files for each of the projects listed above.

####################
# Sheep 2020 18S Data
# 
####################
rsync -avz --dry-run /nfs2/hts/external/illumina/miseq/221007_M04034_0030_000000000-KDV5D/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TSOA33/FastQ/

rsync -avz --dry-run /nfs2/hts/external/illumina/miseq/221007_M70296_0001_000000000-KD3GT/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TSOB1/FastQ/

rsync -avz /nfs2/hts/external/illumina/miseq/221007_M04034_0030_000000000-KDV5D/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TSOA33/FastQ/

rsync -avz /nfs2/hts/external/illumina/miseq/221007_M70296_0001_000000000-KD3GT/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TSOB1/FastQ/

# Update the internal CQLS identifiers to Internal Sharpton Lab identifiers for the project.
mv TSOA33/ TS032A/
mv TSOB1/ TS032B/

####################
# Scalebrain Data
# Jan 5th, 2022
####################
rsync -avz --dry-run /nfs2/hts/miseq/2022/221224_M01498_1005_000000000-KM76M/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS027/

rsync -avz /nfs2/hts/miseq/2022/221224_M01498_1005_000000000-KM76M/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS027/
#sent 4,700,564,569 bytes  received 45,385 bytes  4,823,612.06 bytes/sec
#total size is 4,843,480,736  speedup is 1.03

rsync -avz /nfs2/hts/miseq/2022/221224_M01498_1005_000000000-KM76M/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS027/
#sending incremental file list

#sent 171,335 bytes  received 21 bytes  31,155.64 bytes/sec
#total size is 4,843,480,736  speedup is 28,265.60

####################
# Sheep 16S data (TS032)
# Jan 5th, 2022
####################
#CWD: /nfs3/Sharpton_Lab/prod/prod_restructure/projects
mkdir TS032

rsync -avz --dry-run /nfs2/hts/miseq/2022/221209_M01498_0998_000000000-KL5LW/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS032/

rsync -avz /nfs2/hts/miseq/2022/221209_M01498_0998_000000000-KL5LW/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS032/

rsync -avz /nfs2/hts/miseq/2022/221209_M01498_0998_000000000-KL5LW/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS032/
#sending incremental file list

#sent 218,464 bytes  received 21 bytes  48,552.22 bytes/sec
#total size is 7,991,109,808  speedup is 36,575.10

####################
# EMC2, HBH, BEE21, Microgreens, PIB2 data
# Feb 7th, 2023
####################

rsync -avz /nfs2/hts/miseq/230203_M01498_1021_000000000-KPN3Y/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS035/

#sending incremental file list

#sent 83036 bytes  received 17 bytes  166106.00 bytes/sec
#total size is 4923200099  speedup is 59277.81

####################
# Sample Integrity Project for Leigh
# May 6th, 2023
####################

rsync -avz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/dada2.out/16S/bighorn_sheep_2020_2023-05-05_output/phyloseq_sheep_2020_16S_sample_integrity.rds /nfs3/Sharpton_Lab/WEB_DOWNLOADS/arnold/

####################
# Field Samples for Sakshi and Arnold
# May 6th, 2023
####################

rsync -avz /nfs3/Sharpton_Lab/prod/prod_restructure/projects/arnoldhk/2022_Bighorn_Sheep/dada2.out/16S/bighorn_sheep_2020_2023-05-05_output/phyloseq_sheep_2020_16S_field_samples.rds /nfs3/Sharpton_Lab/WEB_DOWNLOADS/arnold/


####################
# Sheep Data 2021 and 2022 (TS039) 16S
# June 4th, 2023
####################
rsync -avz /nfs2/hts/miseq/230508_M01498_1050_000000000-KWWNR/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS039/

#sent 181,383 bytes  received 21 bytes  120,936.00 bytes/sec
#total size is 9,647,518,063  speedup is 53,182.50 

####################
# Sheep Data 2021 and 2022 (TS039) 18S
# July 10th, 2023
####################
# Run 2
rsync -avz /nfs2/hts/miseq/230706_M01498_1064_000000000-L5NCH/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS039/MiSeq_Run_1064/
#sent 190,710 bytes  received 21 bytes  127,154.00 bytes/sec
#total size is 3,302,511,100  speedup is 17,315.02

# Run 1
rsync -avz /nfs2/hts/miseq/230705_M01498_1063_000000000-L5N6B/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS039/MiSeq_Run_1063/
#sent 188,606 bytes  received 21 bytes  377,254.00 bytes/sec
#total size is 2,706,575,210  speedup is 14,348.82



# Run 3
rsync -avz /nfs2/hts/miseq/230707_M01498_1065_000000000-L523W/L1/ /nfs3/Sharpton_Lab/prod/prod_restructure/projects/TS039/MiSeq_Run_1065/
#sent 190,912 bytes  received 21 bytes  381,866.00 bytes/sec
#total size is 3,185,530,667  speedup is 16,684.02

#############
## PFAS
## October 5th, 2023
#############
rsync -avz /nfs2/hts/nextseq/230708_VH00571_285_AACJLKMM5/L1-0mm-rename/ TS040Rename/
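Most of the records above follow the same pattern: a dry run, the real copy, and a second rsync pass to confirm that nothing is left to send. That pattern can be sketched as one function. The name transfer_run is hypothetical, and the final checksum pass is an added assumption that is stricter than the size-and-time re-run recorded above.

```shell
# Copy one sequencing run into a project folder and confirm the copy.
#   $1 = source directory; keep the trailing slash to copy its contents, not the folder itself
#   $2 = destination project directory
transfer_run() {
    local src="$1"
    local dest="$2"
    mkdir -p "$dest"

    # 1. Preview what would be copied without writing anything
    rsync -avz --dry-run "$src" "$dest" || return 1

    # 2. Copy for real
    rsync -avz "$src" "$dest" || return 1

    # 3. Verify: compare checksums and list any item that would still be updated
    local pending
    pending="$(rsync -az --checksum --dry-run --itemize-changes "$src" "$dest")"
    if [ -n "$pending" ]; then
        printf 'Items still differing:\n%s\n' "$pending" >&2
        return 1
    fi
}
```

A call has the same shape as the records above: the run folder ending in L1/ as the first argument and the TS project folder as the second. The function returns a non-zero status if any step fails, so it can be followed by && to chain a rename or a log entry.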
