Load Packages

Let's start by loading the packages we need.
Notice that the here package reports its starting point: my folder (Joy) inside Joy_worm_images.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.1     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(here)
## here() starts at /Users/grad/Box/Joy_worm_images/Joy

Using the here package

Let's see how the here package works.
We can use the function list.files to list files in a given directory…
This will give us information about what is in your folder.

Notice - I am using the notation here::here()… this is called specifying the namespace. It lets other people who read your code know which package a function belongs to. So in this case we are using the here() function from the here package.

list.files(here::here())
## [1] "20200519_jordan_H01.csv" "20200521_jordan_H02.csv"
## [3] "20200526_jordan_H03.csv" "Joy.Rproj"              
## [5] "Presentations"           "R scripts"
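
To make the package::function notation described above more concrete, here is a minimal sketch using only functions that ship with R. The vector x is made up for illustration; the point is simply that the explicit form names the package the function comes from.

```r
# A made-up vector, used only for illustration
x <- c(4, 8, 15, 16, 23, 42)

implicit <- median(x)          # R searches the attached packages for median()
explicit <- stats::median(x)   # explicitly use median() from the stats package

identical(implicit, explicit)  # both calls run the same function

# The explicit form also removes ambiguity when two packages share a
# function name, such as filter() in dplyr and stats (see the conflicts
# message printed when tidyverse was loaded).
moving_avg <- stats::filter(1:5, rep(1 / 3, 3))
```

Writing stats::filter() guarantees you get the stats version even after dplyr has masked it.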

Cool, so what if we want to read a file in the folder? This is where tidyverse comes in!
Remember, tidyverse is essentially a universe of R packages…
We will use the read_csv() function present in the readr package (housed in tidyverse).

Here’s what that looks like:

readr::read_csv(here::here("20200519_jordan_H01.csv"))
## Warning: Missing column names filled in: 'X1' [1]
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   X1 = col_double(),
##   Label = col_character(),
##   Area = col_double(),
##   Angle = col_double(),
##   Length = col_double()
## )
First 6 rows of dataframe
X1 Label Area Angle Length
1 p01-growth-H01-2X_B01.TIF 81 0.000 80.435
2 p01-growth-H01-2X_B01.TIF 5 0.000 3.875
3 p01-growth-H01-2X_B01.TIF 77 0.000 76.811
4 p01-growth-H01-2X_B01.TIF 6 36.027 5.101
5 p01-growth-H01-2X_B01.TIF 84 0.000 84.011
6 p01-growth-H01-2X_B01.TIF 6 48.991 5.080

Using the here package with tidyverse

But what if there is more than one file in your directory? (In your case, there is.)
We would not want to write out each file name… instead, let's assign all the file paths to a variable and then read them.

# I don't want to include "Joy.Rproj" or the folders "Presentations" and "R scripts", so I am keeping only the first three entries, unlike above
# I am also adding in "full.names = TRUE" to get the full pathname for each file (ensures we won't have issues later on)
files <- list.files(here::here(), full.names = TRUE)[1:3]
files
## [1] "/Users/grad/Box/Joy_worm_images/Joy/20200519_jordan_H01.csv"
## [2] "/Users/grad/Box/Joy_worm_images/Joy/20200521_jordan_H02.csv"
## [3] "/Users/grad/Box/Joy_worm_images/Joy/20200526_jordan_H03.csv"
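
Selecting by position with [1:3] works as long as the .csv files sort first. As an alternative sketch (not the approach used in this walkthrough), list.files() can match on the file extension through its pattern argument. The folder below is a throwaway stand-in created in a temporary directory, so demo_dir and the empty files in it are assumptions for illustration only.

```r
# Build a temporary folder that mimics the layout shown above (stand-in files)
demo_dir <- file.path(tempdir(), "here_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("20200519_jordan_H01.csv",
                                  "20200521_jordan_H02.csv",
                                  "20200526_jordan_H03.csv",
                                  "Joy.Rproj")))
dir.create(file.path(demo_dir, "Presentations"), showWarnings = FALSE)

# pattern is a regular expression: "\\.csv$" matches names ending in ".csv"
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
basename(csv_files)
```

In your own project the equivalent call would likely be list.files(here::here(), pattern = "\\.csv$", full.names = TRUE), which keeps working if other files or folders are added later.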

Now we are going to use a function in the purrr package (also part of tidyverse) called map_dfr() to go through all the elements in the variable files and read each. Don’t worry too much about the notation used for purrr… if you are curious to know more about purrr feel free to reach out to me.

worms <- purrr::map_dfr(files, ~readr::read_csv(.x))
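
If the purrr notation feels opaque: ~readr::read_csv(.x) is shorthand for a small function that takes one file path (.x) and reads it. Conceptually, map_dfr() applies that function to every path and then stacks the results by row. The sketch below shows the same two steps in base R; it is a rough equivalent for illustration, not how purrr is implemented, and the two tiny CSV files are made-up stand-ins written to a temporary folder.

```r
# Write two tiny CSV files to a temporary folder (made-up stand-in data)
demo_dir <- file.path(tempdir(), "map_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(Label = "a.TIF", Length = c(80.4, 3.9)),
          file.path(demo_dir, "file1.csv"), row.names = FALSE)
write.csv(data.frame(Label = "b.TIF", Length = c(76.8, 5.1, 84.0)),
          file.path(demo_dir, "file2.csv"), row.names = FALSE)

demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Step 1: apply the reading function to every path, giving a list of data frames
tables <- lapply(demo_files, read.csv)

# Step 2: stack the data frames by row into one data frame
combined <- do.call(rbind, tables)
dim(combined)
```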

Great! Now let's check out the dataframe we just loaded in…

First 10 rows of the worms dataframe
X1 Label Area Angle Length
1 p01-growth-H01-2X_B01.TIF 81 0.000 80.435
2 p01-growth-H01-2X_B01.TIF 5 0.000 3.875
3 p01-growth-H01-2X_B01.TIF 77 0.000 76.811
4 p01-growth-H01-2X_B01.TIF 6 36.027 5.101
5 p01-growth-H01-2X_B01.TIF 84 0.000 84.011
6 p01-growth-H01-2X_B01.TIF 6 48.991 5.080
7 p01-growth-H01-2X_B01.TIF 71 0.000 70.411
8 p01-growth-H01-2X_B01.TIF 5 -51.072 4.178
9 p01-growth-H01-2X_B01.TIF 86 0.000 85.893
10 p01-growth-H01-2X_B01.TIF 6 56.310 4.807

And there you have it! We have identified files in our directory that we are interested in, and we have used purrr and readr to read in each file.

Give it a try yourself…

We can use the above basics to look at data in your own folder.
Begin by listing files. For example, let me pretend I’m working in Izzy’s folder. Let’s list the files found here:

# you will not write a path here. I must do this because I'm entering a folder that is not my own
# your code should look like: list.files(here::here())
list.files("/Users/grad/Box/Joy_worm_images/Izzy")
##  [1] "20200604_Izzy_H01.csv"                       
##  [2] "20200604_Izzy_H02.csv"                       
##  [3] "20200607_Izzy_H03.csv"                       
##  [4] "20200607_Izzy_H05.csv"                       
##  [5] "20200610_Izzy_H04.csv"                       
##  [6] "20200621_Izzy_H06.csv"                       
##  [7] "20200621_Izzy_H07.csv"                       
##  [8] "20200621_Izzy_H08.csv"                       
##  [9] "20200621_Izzy_H09.csv"                       
## [10] "20200622_Izzy_H10.csv"                       
## [11] "20200622_Izzy_H11.csv"                       
## [12] "20200623_Izzy_H12.csv"                       
## [13] "20200623_Izzy_H13.csv"                       
## [14] "20200629_Izzy_H14.csv"                       
## [15] "20200629_Izzy_H15.csv"                       
## [16] "20200630_Izzy_H16.csv"                       
## [17] "20200630_Izzy_H17.csv"                       
## [18] "20200630_Izzy_H18.csv"                       
## [19] "20200630_Izzy_H19.csv"                       
## [20] "20200630_Izzy_H20.csv"                       
## [21] "20200705_Izzy_H21.csv"                       
## [22] "20200705_Izzy_H22.csv"                       
## [23] "20200705_Izzy_H23.csv"                       
## [24] "20200706_Izzy_H24.csv"                       
## [25] "20200706_Izzy_H25.csv"                       
## [26] "20200706_Izzy_H26.csv"                       
## [27] "20200706_Izzy_H27.csv"                       
## [28] "20200706_Izzy_H28.csv"                       
## [29] "20200706_Izzy_H29.csv"                       
## [30] "20200706_Izzy_H30.csv"                       
## [31] "20200707_Izzy_H31.csv"                       
## [32] "20200707_Izzy_H32.csv"                       
## [33] "20200707_Izzy_H33.csv"                       
## [34] "20200709_Izzy_H34.csv"                       
## [35] "20200709_Izzy_H35.csv"                       
## [36] "20200709_Izzy_H36.csv"                       
## [37] "20200709_Izzy_H37.csv"                       
## [38] "20200709_Izzy_H38.csv"                       
## [39] "20200709_Izzy_H39.csv"                       
## [40] "20200709_Izzy_H40.csv"                       
## [41] "20200713_Izzy_H41.csv"                       
## [42] "20200713_Izzy_H42.csv"                       
## [43] "20200715_Izzy_H43.csv"                       
## [44] "20200715_Izzy_H44.csv"                       
## [45] "20200715_Izzy_H45.csv"                       
## [46] "20200715_Izzy_H46.csv"                       
## [47] "20200715_Izzy_H47.csv"                       
## [48] "20200715_Izzy_H48.csv"                       
## [49] "20200715_Izzy_H49.csv"                       
## [50] "20200715_Izzy_H50.csv"                       
## [51] "20200715_Izzy_H51.csv"                       
## [52] "20200717_Izzy_H52.csv"                       
## [53] "20200718_Izzy_H53.csv"                       
## [54] "20200718_Izzy_H54.csv"                       
## [55] "20200719_Izzy_H55.csv"                       
## [56] "20200720_Izzy_H56.csv"                       
## [57] "20200720_Izzy_H57.csv"                       
## [58] "20200721_Izzy_H58.csv"                       
## [59] "20200721_Izzy_H59.csv"                       
## [60] "20200721_Izzy_H60.csv"                       
## [61] "20200721_Izzy_H61.csv"                       
## [62] "20200722_Izzy_H62.csv"                       
## [63] "20200722_Izzy_H63.csv"                       
## [64] "20200722_Izzy_H64.csv"                       
## [65] "20200722_Izzy_H65.csv"                       
## [66] "20200722_Izzy_H66.csv"                       
## [67] "20200722_Izzy_H67.csv"                       
## [68] "20200723_Izzy_H68.csv"                       
## [69] "20200723_Izzy_H69.csv"                       
## [70] "20200723_Izzy_H70.csv"                       
## [71] "20200723_Izzy_H71.csv"                       
## [72] "20200723_Izzy_H72.csv"                       
## [73] "Data 1 - Big FIve Personality Traits.numbers"
## [74] "Izzy.Rproj"                                  
## [75] "TidyData.R"

Notice that the last three entries are NOT .csv files. You do not want to read these, so we need to tell R which files to read, similar to what I did before…

list.files("/Users/grad/Box/Joy_worm_images/Izzy")[1:60]
##  [1] "20200604_Izzy_H01.csv" "20200604_Izzy_H02.csv" "20200607_Izzy_H03.csv"
##  [4] "20200607_Izzy_H05.csv" "20200610_Izzy_H04.csv" "20200621_Izzy_H06.csv"
##  [7] "20200621_Izzy_H07.csv" "20200621_Izzy_H08.csv" "20200621_Izzy_H09.csv"
## [10] "20200622_Izzy_H10.csv" "20200622_Izzy_H11.csv" "20200623_Izzy_H12.csv"
## [13] "20200623_Izzy_H13.csv" "20200629_Izzy_H14.csv" "20200629_Izzy_H15.csv"
## [16] "20200630_Izzy_H16.csv" "20200630_Izzy_H17.csv" "20200630_Izzy_H18.csv"
## [19] "20200630_Izzy_H19.csv" "20200630_Izzy_H20.csv" "20200705_Izzy_H21.csv"
## [22] "20200705_Izzy_H22.csv" "20200705_Izzy_H23.csv" "20200706_Izzy_H24.csv"
## [25] "20200706_Izzy_H25.csv" "20200706_Izzy_H26.csv" "20200706_Izzy_H27.csv"
## [28] "20200706_Izzy_H28.csv" "20200706_Izzy_H29.csv" "20200706_Izzy_H30.csv"
## [31] "20200707_Izzy_H31.csv" "20200707_Izzy_H32.csv" "20200707_Izzy_H33.csv"
## [34] "20200709_Izzy_H34.csv" "20200709_Izzy_H35.csv" "20200709_Izzy_H36.csv"
## [37] "20200709_Izzy_H37.csv" "20200709_Izzy_H38.csv" "20200709_Izzy_H39.csv"
## [40] "20200709_Izzy_H40.csv" "20200713_Izzy_H41.csv" "20200713_Izzy_H42.csv"
## [43] "20200715_Izzy_H43.csv" "20200715_Izzy_H44.csv" "20200715_Izzy_H45.csv"
## [46] "20200715_Izzy_H46.csv" "20200715_Izzy_H47.csv" "20200715_Izzy_H48.csv"
## [49] "20200715_Izzy_H49.csv" "20200715_Izzy_H50.csv" "20200715_Izzy_H51.csv"
## [52] "20200717_Izzy_H52.csv" "20200718_Izzy_H53.csv" "20200718_Izzy_H54.csv"
## [55] "20200719_Izzy_H55.csv" "20200720_Izzy_H56.csv" "20200720_Izzy_H57.csv"
## [58] "20200721_Izzy_H58.csv" "20200721_Izzy_H59.csv" "20200721_Izzy_H60.csv"
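
A positional range such as [1:60] only picks up the files you intend if it matches the number of .csv files in the folder, so a quick count can be a useful check. Here is a minimal sketch; file_names holds just a handful of names copied from the listing above as a stand-in for the full folder.

```r
# A few names copied from the listing above, standing in for the full folder
file_names <- c("20200604_Izzy_H01.csv",
                "20200723_Izzy_H72.csv",
                "Data 1 - Big FIve Personality Traits.numbers",
                "Izzy.Rproj",
                "TidyData.R")

# tools::file_ext() returns the extension of each file name (without the dot)
is_csv <- tools::file_ext(file_names) == "csv"

sum(is_csv)          # how many .csv files there are
file_names[is_csv]   # only the .csv files
```

With your real folder you could replace file_names with list.files(here::here()) and use the count as the upper end of the range.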

Awesome, okay. Now let's assign these to a variable so we can use them in the next step:

# remember we want to use the full names to avoid downstream problems
files <- list.files("/Users/grad/Box/Joy_worm_images/Izzy", full.names = TRUE)[1:60]
## your code will look like this: files <- list.files(here::here(), full.names = TRUE)[1:__]

Now let's read in all the files:

worms <- purrr::map_dfr(files, ~readr::read_csv(.x))

Aaand here we are:

First 10 rows of Izzy’s data
X1 Label Area Angle Length
1 p01-growth-H01-2X_F01.TIF 79 0.000 78.675
2 p01-growth-H01-2X_F01.TIF 7 -69.444 5.696
3 p01-growth-H01-2X_F01.TIF 83 0.000 82.100
4 p01-growth-H01-2X_F01.TIF 6 29.982 5.003
5 p01-growth-H01-2X_F01.TIF 65 0.000 64.593
6 p01-growth-H01-2X_F01.TIF 5 0.000 4.868
7 p01-growth-H01-2X_F01.TIF 86 0.000 85.628
8 p01-growth-H01-2X_F01.TIF 4 36.870 2.500
9 p01-growth-H01-2X_F01.TIF 81 0.000 80.141
10 p01-growth-H01-2X_F01.TIF 6 59.349 5.231

Explore your data

Try some of these functions:
1. colnames(worms) - gives the names of columns
2. dim(worms) - outputs the number of rows then columns
3. summary(worms) - outputs the summary statistics
4. str(worms) - shows the structure of the dataframe (column types and a preview of the values)
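
If you want to see what these functions return before running them on your own data, here is a small sketch. worms_demo is a stand-in typed in by hand from the first three rows of the worms data shown earlier; it is not the full dataset.

```r
# The first three rows shown for the worms data earlier, typed in by hand
worms_demo <- data.frame(
  X1     = c(1, 2, 3),
  Label  = rep("p01-growth-H01-2X_B01.TIF", 3),
  Area   = c(81, 5, 77),
  Angle  = c(0, 0, 0),
  Length = c(80.435, 3.875, 76.811)
)

colnames(worms_demo)  # names of the five columns
dim(worms_demo)       # number of rows, then number of columns
summary(worms_demo)   # per-column summary statistics
str(worms_demo)       # column types and a preview of the values
```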

Next week we will work to tidy and process our data into a suitable format for plotting.