Working Directories, Variable Types, & Workflow Tips
library(tidyverse)
library(palmerpenguins)
library(here)
library(janitor)
- cheatsheet google doc (feel free to add to this!)
Working Directory
- what is a working directory?
getwd()
- how do you set it?
setwd() or Session → Set Working Directory, OR use an R Project
- what is an R Project?
- I always use R projects to keep everything organized
- in order to read in a file, it must be located within your current working directory or you will have to explicitly tell R how to navigate to the correct location
- Jenny’s blog on the here package
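A minimal sketch of the two ways to point R at a file, assuming the here package is installed. The "data" subfolder is a hypothetical layout used only for illustration; in these notes the file sits directly in the working directory.

```r
library(here)
library(readr)

# where is R currently looking for files?
getwd()

# here() builds a path starting from the project root
# (typically the folder that holds your .Rproj file)
here()

# hypothetical layout: the csv lives in a "data" subfolder of the project
path <- here("data", "data_forensic.csv")
path

# only try to read the file if it really exists at that (assumed) location
if (file.exists(path)) {
  forensic <- read_csv(path)
}
```

Because the path is built from the project root, it should keep working when the project folder is moved or shared, which is not true of a hard-coded setwd() call.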
# read in forensic data - note the data file is in my wd
forensic <- read_csv("data_forensic.csv") # tibble "tidy dataframe"
# forensic <- read.csv("data_forensic.csv") # this also works, but reads in your data as a dataframe instead of a tibble
Tidyverse
- what is tidyverse?
- tibbles vs dataframes
# iris vs penguins data
# iris = dataframe
# penguins = tibble
mydata <- iris %>%
  as_tibble() %>% # turns the dataframe into a tibble
  clean_names() %>% # cleans the variable names (gets rid of white space, capital letters, etc.)
  rename(penguin_species = species, # rename specific columns
         s_length = sepal_length) # new name = old name
glimpse(mydata) # useful function to get a "glimpse" of your data and variable types
## Rows: 150
## Columns: 5
## $ s_length <dbl> 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4,~
## $ sepal_width <dbl> 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7,~
## $ petal_length <dbl> 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5,~
## $ petal_width <dbl> 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2,~
## $ penguin_species <fct> setosa, setosa, setosa, setosa, setosa, setosa, setosa~
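To make the tibble vs dataframe point concrete, here is a small sketch that checks what each object actually is. It only uses the built-in iris data and the penguins data from palmerpenguins; iris_tbl is a name made up for this example.

```r
library(tibble)
library(palmerpenguins)

class(iris)      # a plain data.frame
class(penguins)  # a tibble (which is still a data.frame underneath)

is_tibble(iris)      # FALSE
is_tibble(penguins)  # TRUE

# one practical difference: printing a tibble shows only the first rows
# plus the type of each column, while a data.frame prints every row
iris_tbl <- as_tibble(iris)
iris_tbl
```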
Variable types
- different types of variables (e.g., dbl, int, fct, chr, lgl, date)
- converting variable types
- factor variable trap: be careful converting a factor to numeric
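The factor trap is easiest to see on a tiny made-up vector (the values below are invented for illustration, not taken from the penguins data). A factor stores integer codes that point at its levels, so as.numeric() on a factor returns those codes rather than the numbers you see printed.

```r
# a factor whose labels look like numbers
x <- factor(c("3.2", "10.5", "7.1"))

levels(x)  # levels are sorted as text, so "10.5" comes first

# the trap: this returns the level codes, not the values
codes <- as.numeric(x)
codes

# the fix: go through character first
values <- as.numeric(as.character(x))
values
```

Here codes comes out as 2 1 3 while values gives back 3.2 10.5 7.1, so the two results are not interchangeable.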
# levels() will only work with factors
levels(penguins$species)
## [1] "Adelie" "Chinstrap" "Gentoo"
levels(penguins$sex)
## [1] "female" "male"
penguins$species <- as.character(penguins$species) # convert factor to character
penguins$bill_length_mm <- as.factor(penguins$bill_length_mm) # convert numeric to factor
# convert factor to numeric... this is the trap!!
penguins$bill_length_mm2 <- as.numeric(penguins$bill_length_mm) # returns the internal level codes (1, 2, 3, ...), not the original measurements
penguins$bill_length_mm3 <- as.numeric(as.character(penguins$bill_length_mm)) # make character first!
Workflow Tips
- when you’re starting a new project, create an R project where you will keep all of your files
- don’t use white spaces when naming files or variables (programming languages don’t like white spaces)
- every time you want to work on that project, you simply open up the .Rproj file and your working directory is automatically set
- load libraries at the start of your document
- read in your data
- clean data names
- glimpse() to see all variables and variable types
- are there any variables you need to convert to a different type?
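Put together, the steps above might look like the sketch below. It uses penguins_raw from palmerpenguins as a stand-in for a file you would normally read in with read_csv(), because its column names are messy enough to show what clean_names() does. The object name dat and the choice of which columns to convert are assumptions made for this example.

```r
# 1. load libraries at the start of the document
library(dplyr)
library(janitor)
library(palmerpenguins)

# 2. "read in" the data (stand-in for read_csv("your_file.csv"))
dat <- penguins_raw

# 3. clean the variable names (e.g. "Culmen Length (mm)" becomes culmen_length_mm)
dat <- dat %>%
  clean_names()

# 4. look at all variables and their types
glimpse(dat)

# 5. convert variables that came in as the wrong type
#    (here: grouping variables stored as character become factors)
dat <- dat %>%
  mutate(across(c(species, island, sex), as.factor))

levels(dat$island)
```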