Working Directories, Variable Types, & Workflow Tips

library(tidyverse)
library(palmerpenguins)
library(here)
library(janitor)

Working Directory

  • what is a working directory? getwd()
  • how do you set it? setwd(), or in RStudio Session → Set Working Directory, OR use an R Project
  • what is an R Project?
  • I always use R projects to keep everything organized
  • in order to read in a file, it must be located within your current working directory or you will have to explicitly tell R how to navigate to the correct location
  • Jenny’s blog on the here package
# read in forensic data - note the data file is in my wd
forensic <- read_csv("data_forensic.csv") # tibble "tidy dataframe"
# forensic <- read.csv("data_forensic.csv") # this also works, but reads in your data as a dataframe instead of a tibble
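A minimal sketch of how getwd() and the here package fit together. The "data" subfolder is a hypothetical location used only for illustration; in these notes the file sits directly in the working directory.

```r
library(here)

# where is R currently looking for files?
getwd()

# here() builds a file path starting from the project root
# (normally the folder that holds the .Rproj file)
# NOTE: "data" is a hypothetical subfolder, not part of the original notes
forensic_path <- here("data", "data_forensic.csv")
forensic_path

# check that the file exists before trying to read it
file.exists(forensic_path)
```

If file.exists() returns TRUE, the same path can be passed to read_csv(), and it should keep working even if a script inside the project changes the working directory.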

Tidyverse

  • what is tidyverse?
  • tibbles vs dataframes
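A small sketch of one practical difference between tibbles and dataframes, using the built-in iris data. The object names (iris_df, iris_tbl, col_df, col_tbl) are made up for this example.

```r
library(tibble)

iris_df  <- iris             # base R dataframe
iris_tbl <- as_tibble(iris)  # the same data as a tibble

class(iris_df)   # a dataframe only
class(iris_tbl)  # a tibble is still a dataframe, with extra classes on top

# pulling out one column with [ , ]
col_df  <- iris_df[, 1]   # a dataframe drops to a plain numeric vector
col_tbl <- iris_tbl[, 1]  # a tibble stays a one-column tibble

is.numeric(col_df)
is_tibble(col_tbl)
```

Tibbles also print only the first 10 rows along with each column's type, which is easier to read for larger data sets.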
# iris vs penguins data

# iris = dataframe
# penguins = tibble 

mydata <- iris %>%
  as_tibble() %>% # turns the dataframe into a tibble
  clean_names() %>% # cleans the variable names (gets rid of white space, capital letters, etc)
  rename(penguin_species = species, # rename specific columns
         s_length = sepal_length) # new name = old name

glimpse(mydata) # useful function to get a "glimpse" of your data and variable types
## Rows: 150
## Columns: 5
## $ s_length        <dbl> 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4,~
## $ sepal_width     <dbl> 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7,~
## $ petal_length    <dbl> 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5,~
## $ petal_width     <dbl> 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2,~
## $ penguin_species <fct> setosa, setosa, setosa, setosa, setosa, setosa, setosa~
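To see what each step of the pipe above does to the column names, it can help to compare the names before and after. This is only a sketch; names_before, names_clean and names_final are names invented for the example.

```r
library(dplyr)
library(janitor)

# original names use capital letters and dots
names_before <- names(iris)

# clean_names() converts them to lower-case snake_case
names_clean <- iris %>%
  clean_names() %>%
  names()

# rename() then changes selected columns: new name = old name
names_final <- iris %>%
  clean_names() %>%
  rename(s_length = sepal_length) %>%
  names()

names_before
names_clean
names_final
```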

Variable types

  • different types of variables (e.g., dbl, int, fct, chr, lgl, date)
  • converting variable types
  • factor variable trap: be careful converting a factor to numeric
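A quick sketch of what the variable types listed above look like in practice. The tibble types_demo and its column names are invented for illustration.

```r
library(tibble)

# one column of each type listed above (made-up values)
types_demo <- tibble(
  a_dbl  = c(1.5, 2.5),                            # double (numeric with decimals)
  an_int = c(1L, 2L),                              # integer (note the L suffix)
  a_fct  = factor(c("low", "high")),               # factor (categorical)
  a_chr  = c("x", "y"),                            # character (text)
  a_lgl  = c(TRUE, FALSE),                         # logical
  a_date = as.Date(c("2021-01-01", "2021-01-02"))  # date
)

types_demo                 # printing a tibble shows the type under each column name
sapply(types_demo, class)  # base R class of each column
```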
# levels() will only work with factors
levels(penguins$species)
## [1] "Adelie"    "Chinstrap" "Gentoo"
levels(penguins$sex)
## [1] "female" "male"
penguins$species <- as.character(penguins$species) # convert factor to character
penguins$bill_length_mm <- as.factor(penguins$bill_length_mm) # convert numeric to factor

# convert factor to numeric... this is the trap!!
penguins$bill_length_mm2 <- as.numeric(penguins$bill_length_mm) # returns the integer level codes (1, 2, 3, ...), not the measurements
penguins$bill_length_mm3 <- as.numeric(as.character(penguins$bill_length_mm)) # make character first to recover the original values
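The trap is easier to see on a tiny made-up vector. This is a sketch with invented values (x, f, wrong, right and also_right are not part of the penguins data):

```r
# three made-up measurements stored as a factor
x <- c(39.1, 18.7, 181)
f <- factor(x)

levels(f)  # the levels are the sorted unique values, stored as text

# the trap: as.numeric() on a factor returns the position of each level
wrong <- as.numeric(f)

# converting to character first recovers the original numbers
right <- as.numeric(as.character(f))

# an equivalent approach described in ?factor
also_right <- as.numeric(levels(f))[f]

wrong
right
also_right
```

Here wrong holds the level codes 2, 1, 3, while right and also_right hold the original values 39.1, 18.7 and 181.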

Workflow Tips

  • when you’re starting a new project, create an R project where you will keep all of your files
  • don’t use white spaces when naming files or variables (programming languages don’t like white spaces)
  • every time you want to work on that project, you simply open up the .Rproj file and your working directory is automatically set
  • load libraries at the start of your document
  • read in your data
  • clean data names
  • glimpse() to see all variables and variable types
  • are there any variables you need to convert to a different type?
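The steps above can be sketched end to end as follows. To keep the example self-contained it writes a small made-up CSV to a temporary file; in a real project you would read your own file from the project folder instead. The file contents and the names csv_path and dat are assumptions for illustration.

```r
# 1. load libraries at the start
library(readr)
library(dplyr)
library(janitor)

# hypothetical example file, written to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Sample ID,Body Mass (g),Sex",
             "1,3750,male",
             "2,3800,female",
             "3,3250,female"),
           csv_path)

# 2. read in the data, 3. clean the names, 4. convert types where needed
dat <- read_csv(csv_path, show_col_types = FALSE) %>%
  clean_names() %>%               # "Body Mass (g)" becomes body_mass_g
  mutate(sex = as.factor(sex))    # sex is read in as chr; a factor suits a categorical variable

# 5. check all variables and their types
glimpse(dat)
```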