Install R and RStudio. Get RStudio running on your computer and get familiar with the layout.
I installed R and RStudio - otherwise I wouldn’t have been able to make this Quarto document! I used the instructions provided for Lab 0 on our class website.
2.
Make a folder (directory) on your computer for this course and then make an RStudio project for this course that runs from that folder.
I created a folder!
Here it is :)
3.
Make a new Quarto document. This is where you will do the lab assignment that you will turn in. Make a title and subject headers for each question. Copy the instructions and then add your work below.
You’re reading the Quarto document right now!
4.
Install packages and load packages
I installed the packages from the Packages pane in RStudio.
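The same step can also be done from the console. This is only a minimal sketch, assuming both packages are installed from CRAN; `missing_packages` is a hypothetical helper I made up for illustration, not part of the lab.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("palmerpenguins", "tidyverse")
to_install <- missing_packages(needed)

# Only install what is missing, and only in an interactive session
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling packages every time the document is rendered.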
library(palmerpenguins)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
dat <- vroom(...)
problems(dat)
Rows: 5223 Columns: 23
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): method, type, life.history, species
dbl (19): site, lat, transect, diver, percent.of.cover, l, w, area, pale, bl...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
This pulls the .csv data file off the site online and names it “those_data”!
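To see what the parsing warning and the column specification message mean, here is a small sketch with made-up CSV text standing in for the real file (the values and the three column names are chosen only for illustration). It assumes the readr package is installed.

```r
library(readr)

# Made-up CSV text standing in for the real file; I() marks it as literal data
csv_text <- I("site,species,percent.of.cover\n1,SSID,0.3\n2,MCOM,oops\n")

# Giving the column types up front also quiets the column specification message
dat <- read_csv(
  csv_text,
  col_types = cols(
    site = col_double(),
    species = col_character(),
    percent.of.cover = col_double()
  )
)

# Values that could not be parsed become NA, and problems() lists them
issues <- problems(dat)
print(issues)
```

Running `problems()` on the real data frame in the same way should show which rows triggered the warning.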
str(those_data) # Tells us the "attributes" of each column and the type of data that is in it. e.g. "num" is numbers!
head(those_data) # Shows us just the top 6 rows
# A tibble: 6 × 23
method site type lat transect diver life.history species percent.of.cover
<chr> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <dbl>
1 Video 1 Back … 3 1 4 Stress Tole… SSID 0.3
2 Video 1 Back … 3 1 4 <NA> MCOM 0.9
3 Video 1 Back … 3 1 4 Stress Tole… SSID 0.8
4 Video 1 Back … 3 1 4 Generalist OFAV 0.9
5 Video 1 Back … 3 1 4 Weedy UTEN 0.3
6 Video 1 Back … 3 1 4 Stress Tole… SSID 0.85
# ℹ 14 more variables: l <dbl>, w <dbl>, area <dbl>, pale <dbl>,
# bleached <dbl>, total.pb <dbl>, percent.pb <dbl>, new <dbl>, trans <dbl>,
# old <dbl>, total.mort <dbl>, percent.mort <dbl>, disease <dbl>,
# percentdisease <dbl>
tail(those_data) # Shows us just the bottom 6 rows
# A tibble: 6 × 23
method site type lat transect diver life.history species percent.of.cover
<chr> <dbl> <chr> <dbl> <dbl> <dbl> <chr> <chr> <dbl>
1 AGRRA 14 Patch… 1 6 3 Weedy PAST NA
2 AGRRA 14 Patch… 1 6 3 <NA> PDIV NA
3 AGRRA 14 Patch… 1 6 3 Stress Tole… SSID NA
4 AGRRA 14 Patch… 1 6 3 <NA> MCOM NA
5 AGRRA 14 Patch… 1 6 3 Weedy PAST NA
6 AGRRA 14 Patch… 1 6 3 Weedy PAST NA
# ℹ 14 more variables: l <dbl>, w <dbl>, area <dbl>, pale <dbl>,
# bleached <dbl>, total.pb <dbl>, percent.pb <dbl>, new <dbl>, trans <dbl>,
# old <dbl>, total.mort <dbl>, percent.mort <dbl>, disease <dbl>,
# percentdisease <dbl>
summary(those_data) # Gives statistics like mean and median for each column
method site type lat
Length:5223 Min. : 1.000 Length:5223 Min. :1.000
Class :character 1st Qu.: 4.000 Class :character 1st Qu.:2.000
Mode :character Median : 7.000 Mode :character Median :3.000
Mean : 7.366 Mean :3.196
3rd Qu.:11.000 3rd Qu.:4.000
Max. :14.000 Max. :5.000
transect diver life.history species
Min. :1.000 Min. :1.000 Length:5223 Length:5223
1st Qu.:2.000 1st Qu.:2.000 Class :character Class :character
Median :4.000 Median :4.000 Mode :character Mode :character
Mean :3.545 Mean :3.433
3rd Qu.:5.000 3rd Qu.:5.000
Max. :7.000 Max. :5.000
percent.of.cover l w area
Min. :0.1000 Min. : 2.57 Min. : 1.500 Min. : 4.50
1st Qu.:0.6500 1st Qu.: 8.75 1st Qu.: 6.924 1st Qu.: 54.93
Median :0.8000 Median : 14.22 Median : 10.723 Median : 150.00
Mean :0.7351 Mean : 22.19 Mean : 17.707 Mean : 750.69
3rd Qu.:0.9000 3rd Qu.: 26.92 3rd Qu.: 20.000 3rd Qu.: 506.42
Max. :7.0000 Max. :600.00 Max. :4250.000 Max. :216750.00
NA's :2286 NA's :1 NA's :1
pale bleached total.pb percent.pb
Min. :0.00000 Min. :0.000000 Min. :0.00000 Min. : 0.000
1st Qu.:0.00000 1st Qu.:0.000000 1st Qu.:0.00000 1st Qu.: 0.000
Median :0.00000 Median :0.000000 Median :0.00000 Median : 0.000
Mean :0.03855 Mean :0.009736 Mean :0.04829 Mean : 4.829
3rd Qu.:0.00000 3rd Qu.:0.000000 3rd Qu.:0.00000 3rd Qu.: 0.000
Max. :1.00000 Max. :1.000000 Max. :1.15000 Max. :115.000
new trans old total.mort
Min. :0.00000 Min. :0.00000 Min. :0.00000 Min. : 1.000
1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.:0.00000 1st Qu.: 2.000
Median :0.00000 Median :0.00000 Median :0.00000 Median : 2.000
Mean :0.01295 Mean :0.01374 Mean :0.02827 Mean : 3.099
3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.:0.00000 3rd Qu.: 2.000
Max. :1.00000 Max. :0.95000 Max. :0.95000 Max. :22.000
NA's :1 NA's :1 NA's :4
percent.mort disease percentdisease
Min. : 0.000 Min. :0.00000 Min. : 0.000
1st Qu.: 0.000 1st Qu.:0.00000 1st Qu.: 0.000
Median : 0.000 Median :0.00000 Median : 0.000
Mean : 5.495 Mean :0.02834 Mean : 2.834
3rd Qu.: 0.000 3rd Qu.:0.00000 3rd Qu.: 0.000
Max. :100.000 Max. :1.00000 Max. :100.000
NA's :1
nrow(those_data) # Number of rows in dataset
[1] 5223
ncol(those_data) # Number of columns in dataset
[1] 23
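Two small additions to the checks above, sketched on a made-up toy data frame (not the real data): `dim()` gives the row and column counts in one call, and `colSums(is.na())` counts the missing values that `summary()` reports as NA's.

```r
# Toy data frame with a few missing values (values are made up)
toy <- data.frame(
  site = c(1, 1, 2, 2),
  species = c("SSID", "MCOM", NA, "PAST"),
  percent.of.cover = c(0.3, NA, 0.8, NA)
)

size <- dim(toy)                  # number of rows, then number of columns
na_counts <- colSums(is.na(toy))  # number of missing values in each column

print(size)
print(na_counts)
```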
6.
Change a numeric column to a factor. (And add a new column!)
# change data to a factor
those_data$transect <- as.factor(those_data$transect) # This tells R to view the data in this column as a "factor" (i.e., as qualitative)
those_data$site <- as.factor(those_data$site)
those_data$site2 <- as.factor(those_data$site) # This makes a new column based off the site column that is considered a "factor"
str(those_data)
# to make a new column instead, do dataframe$columnnew <- as.factor(dataframe$column)
# or dataframe$columnnew <- "whatever i want in there"
Like we did in class, I chose to do this with site. To make sure I had the hang of it, I also did it with transect.
I also wanted to be sure I could switch them back…
those_data$site <- as.numeric(as.character(those_data$site))
those_data$transect <- as.numeric(as.character(those_data$transect)) # These commands set the data to be considered as quantitative again
str(those_data)
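The `as.character()` step in the middle matters. A factor stores integer codes for its levels, so calling `as.numeric()` directly returns the codes rather than the original numbers. A small sketch with made-up site numbers:

```r
# Made-up site numbers, stored as a factor
site <- as.factor(c(10, 20, 20, 40))

codes <- as.numeric(site)                 # level codes: 1 2 2 3
values <- as.numeric(as.character(site))  # original numbers: 10 20 20 40

print(codes)
print(values)
```

With sites numbered 1 to 14 and no gaps, the two results would happen to match, which is why the shortcut can look fine until the numbering has gaps.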
Save data on your computer and read the file back into R!
write_csv(those_data, file = "those_data.csv") # This saves those_data as a new file locally on my computer
the_same_data <- read_csv("those_data.csv") # Then I read the same file back into R under a new name
Rows: 5223 Columns: 24
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): method, type, life.history, species
dbl (20): site, lat, transect, diver, percent.of.cover, l, w, area, pale, bl...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
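One thing the column specification above seems to show: there are now 20 dbl columns, so the new site2 column apparently came back as a number rather than a factor. A CSV file only stores text, so the factor label is lost on the way out. Here is a sketch of that round trip on a made-up toy data frame, assuming readr is installed and using a temporary file instead of my real one.

```r
library(readr)

# Toy data frame with a numeric column and a factor copy of it (values are made up)
toy <- data.frame(site = c(1, 2, 2))
toy$site2 <- as.factor(toy$site)

# Write to a temporary file and read it back
tmp <- tempfile(fileext = ".csv")
write_csv(toy, tmp)
back <- read_csv(tmp, show_col_types = FALSE)

# The factor comes back as a plain number column
came_back_as_factor <- is.factor(back$site2)

# Asking for a factor when reading restores it
back_fct <- read_csv(
  tmp,
  col_types = cols(site = col_double(), site2 = col_factor())
)
restored_as_factor <- is.factor(back_fct$site2)

print(came_back_as_factor)
print(restored_as_factor)
```

Setting `col_types` on the read is one option; converting with `as.factor()` after reading, as in question 6, would work as well.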