── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(dbplyr)
Attaching package: 'dbplyr'
The following objects are masked from 'package:dplyr':
ident, sql
Week 1
Terrifying R Pumpkin
Introduction to R
Advantages to R:
Flexible
Allows analysis of almost any type of data
Extremely efficient; packages readily permit sharing of code
Freely distributed
Basic Points:
R is command driven
It requires you to type a command after the command prompt (>). If your command is not complete, R issues a continuation prompt (‘+’).
You write in the script window
R is case sensitive
Commands in R are called functions
The up arrow key (↑) can be used to bring up previous commands
The $ symbol is used to select a particular column within a data frame (e.g. df$var1)
Using # allows you to write text that is not executed
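A few of the basic points above can be tried directly in the console. This is only a minimal sketch: the data frame df and its columns var1 and var2 are made-up names for illustration.

```r
# Anything after a # is a comment and is not executed
df <- data.frame(var1 = c(1, 2, 3), var2 = c("a", "b", "c"))  # hypothetical data

# The $ symbol selects one column from the data frame
df$var1

# Commands are functions: a name followed by arguments in brackets
mean(df$var1)

# R is case sensitive: mean() exists, but Mean() is not a base R function
# Mean(df$var1)   # would stop with an error (could not find function)
```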
Objects
Statistical analyses are based around creating and manipulating objects.
a <- 1
a + 1
[1] 2
b <- a+1
We have created an object ‘a’ with the value 1. <- is an assignment operator that assigns whatever is on the right to whatever is on the left.
We have added 1 to a, giving 2. This exciting result is lost unless we recreate it, so we store it in another object, b.
Objects can contain anything: vectors, matrices, data frames, lists, script. Data frames and vectors are the most common. Consider the frequency of sightings of five avian raptor species over the last month. Species is categorical; frequency is numeric.
freq <- c(18, 5, 11, 3, 0)
# we have created a vector object 'freq'; c() is referred to as the 'concatenate' function, combining all elements into a vector
species <- c("buzzard", "hobby", "kestrel", "merlin", "redkite")
# another type of object is a data frame, which you could consider a 'table' of information
spec_freq <- data.frame(species, freq)
spec_freq
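Once spec_freq exists, a few base R functions help to check what it contains. A minimal sketch, repeating the raptor vectors so that it runs on its own:

```r
freq <- c(18, 5, 11, 3, 0)
species <- c("buzzard", "hobby", "kestrel", "merlin", "redkite")
spec_freq <- data.frame(species, freq)

str(spec_freq)        # structure: column names and types
nrow(spec_freq)       # number of rows (one per species)
spec_freq$freq        # the freq column as a vector
sum(spec_freq$freq)   # total sightings across all species

# rows where more than five sightings were recorded
spec_freq[spec_freq$freq > 5, ]
```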
In this exercise we will use the package psych to report basic summary statistics. install.packages() installs it, but you also need to load it using library(). If you close R, you need to load the packages you need again. Note: in the text the data set is called cyan, but I have imported it as cyanistes.
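A minimal sketch of the install-then-load pattern, assuming the psych package can be installed on your machine. describe() is the psych function for summary statistics; the freq vector is borrowed from the raptor example purely for illustration (it is not the cyanistes data).

```r
# install.packages("psych")   # run once, then comment out

freq <- c(18, 5, 11, 3, 0)    # illustrative data only

# load the package in every new R session before using its functions
if (requireNamespace("psych", quietly = TRUE)) {
  library(psych)
  print(describe(freq))   # n, mean, sd, median, min, max and more
} else {
  print(summary(freq))    # base R fallback if psych is not installed
}
```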
Week 3 Data analyses
Using R for Graduate Students by Y. Wendy Huynh
Data set is called diamonds
We can check our current working directory by typing getwd() in the console. Section 3 shows creating a new project
getwd()
[1] "/Users/wendycolgan/Documents/Msc research Methods"
Installing Packages
For this guide we need:
tidyverse - broad and useful; includes graphing (ggplot2) and user-friendly formatting (dplyr)
afex - statistics, especially ANOVAs
emmeans - another stats package, for calculating the estimated marginal means for post hoc analyses
writexl - writes R data to Excel files
readxl - reads Excel files into R
ggthemes - additional tools for managing graphing aesthetics
She recommends inactivating install.packages() after it has been executed once, by highlighting the line and using Ctrl + Shift + C. All the above info is in R packages and uses a separate script I have created.
Note that install.packages() needs the package name in quotation marks (""), but library() does not. An R convention.
library(tidyverse)
library(afex)
Loading required package: lme4
Loading required package: Matrix
Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':
expand, pack, unpack
************
Welcome to afex. For support visit: http://afex.singmann.science/
- Functions for ANOVAs: aov_car(), aov_ez(), and aov_4()
- Methods for calculating p-values with mixed(): 'S', 'KR', 'LRT', and 'PB'
- 'afex_aov' and 'mixed' objects can be passed to emmeans() for follow-up tests
- Get and set global package options with: afex_options()
- Set sum-to-zero contrasts globally: set_sum_contrasts()
- For example analyses see: browseVignettes("afex")
************
Attaching package: 'afex'
The following object is masked from 'package:lme4':
lmer
library(emmeans)
Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'
library(writexl)
library(readxl)
library(ggthemes)
Updating the R version
Do this every 3 months. To check which version you are running, execute getRversion(); see section 3.2.5 of Wendy's book (week 3).
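A short sketch of checking the installed version. The threshold "4.0.0" is an arbitrary value chosen for illustration, not a recommendation from the book.

```r
current <- getRversion()   # the running R version, as a comparable object
print(current)

# version objects can be compared with a version string
if (current < "4.0.0") {
  message("This R installation is older than 4.0.0; consider updating.")
}

R.version.string           # the same information as a readable sentence
```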
Types of R file
.Rproj - your project file in RStudio; automatically sets your working directory
.R - script file; your previously saved code, like a Word document
.Rhistory - keeps track of all your commands in a session
Hot Keys
Command - Function
Ctrl + Enter - Executes/runs code
Ctrl + Shift + C - Turns the current line of code into a comment. Inactivated lines will not be read and executed as normal code.
Ctrl + Left/Right Arrow - Moves your cursor all the way to the left/right
Ctrl + Shift + Left/Right Arrow - Highlights entire line to the left/right of your cursor
Alt + Left/Right Arrow - Moves your cursor one chunk of letters at a time
Alt + Shift + Left/Right Arrow - Highlights one chunk of letters at a time
Ctrl + A - Select all text
Ctrl + S - Saves the file. Do this every few minutes to save your progress.
Ctrl + Shift + R - Creates headers for your R script. You can easily navigate between labeled sections of your R script.
Ctrl + Shift + M - Shortcut for %>%, also known as a “pipe”. The pipe symbol is frequently used in the tidyverse-style syntax. More on pipes later.
Alt + Shift + K - Displays all programmed keyboard shortcuts for R.
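As a small preview of the pipe, the sketch below passes the built-in mtcars data through two dplyr steps. It assumes dplyr (part of the tidyverse) is installed; the object name heavy_cars and the cut-off of 5 are made up for illustration.

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  # read %>% as "and then": take mtcars, and then keep heavy cars,
  # and then keep only two columns
  heavy_cars <- mtcars %>%
    filter(wt > 5) %>%
    select(mpg, wt)

  print(heavy_cars)
}
```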
Some R conventions
Naming
Choose descriptive and concise names
Do not start with a number or a symbol
Avoid punctuation other than . and _; other punctuation is understood as special commands (a hyphen, for example, is read as a minus sign)
No spaces
Be consistent, e.g. all objects in capitals or all objects in lower case
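The naming rules can be illustrated with a short sketch. The object names are invented for this example; make.names() is a base R function that shows how R would repair an invalid name.

```r
raptor_freq <- c(18, 5, 11)    # descriptive, lower case, underscore instead of a space
raptor.freq <- raptor_freq     # a dot is also allowed

# These would fail if uncommented:
# 2nd_count <- 1       # starts with a number
# raptor freq <- 1     # contains a space
# raptor-freq <- 1     # the hyphen is read as a minus sign

# make.names() turns an invalid name into a valid one
make.names("2nd count")
```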
Built in Symbols
Symbol: =
Definition: A single equals sign means "is defined as", NOT "equals". It can be used to define variable names within a dataset (most common use) or objects/datasets (not recommended).
Example: variable = 2 translates to: the object labeled variable is defined as the number two

Symbol: <-
Definition: Defines objects/datasets. The single equals sign (=) should only be used to define an object within a larger dataset. The arrow (<-) is used to define objects that are not part of larger datasets.
Example: dataset <- 2 translates to: the object labeled dataset is defined as the number two

Symbol: ==
Definition: Equal(s)
Example: variable == 2 translates to: retrieve values from the object labeled variable that equal two

Symbol: !=
Definition: Does not equal
Example: variable != 2 translates to: retrieve values from the object labeled variable that do not equal two

Symbol: >
Definition: Greater than
Example: variable > 2 translates to: retrieve values that are greater than two from the object labeled variable

Symbol: >=
Definition: Greater than or equal to
Example: variable >= 2 translates to: retrieve values that are greater than or equal to two from the object labeled variable

Symbol: <
Definition: Less than
Example: variable < 2 translates to: retrieve values that are less than two from the object labeled variable

Symbol: <=
Definition: Less than or equal to
Example: variable <= 2 translates to: retrieve values that are less than or equal to two from the object labeled variable

Symbol: %in%
Definition: Determines whether values match (TRUE/FALSE). The format for use is x %in% table, where x and table are vectors.
Example: 2 %in% c(2,3,4) translates to: does the value on the left match anything in the vector on the right? This returns TRUE because the left value has a match among the right values. c(2,3,4) %in% 2 returns TRUE FALSE FALSE, as each value on the left returns TRUE/FALSE for whether it matches the value on the right of %in%.

Symbol: |
Definition: Translates to the word "or". | can be used to retrieve matching values that meet at least one condition.
Example: variableA > 5 | variableB == TRUE translates to: retrieve values where variableA is greater than five or where the values of variableB are TRUE

Symbol: &
Definition: Translates to the word "and". & can be used to retrieve matching values that meet all conditions listed.
Example: variableA > 5 & variableC == "male" translates to: retrieve values where variableA is greater than five and where the values of variableC equal "male"
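The symbols can be tried on small vectors. This is a minimal sketch; the values stored in variable, variableA and variableC are invented for illustration.

```r
variable <- c(1, 2, 3, 4)

variable == 2             # FALSE TRUE FALSE FALSE
variable[variable != 2]   # 1 3 4
variable[variable >= 3]   # 3 4

2 %in% c(2, 3, 4)         # TRUE
c(2, 3, 4) %in% 2         # TRUE FALSE FALSE

variableA <- c(3, 6, 9)
variableC <- c("male", "female", "male")

variableA > 5 & variableC == "male"   # FALSE FALSE TRUE
variableA > 5 | variableC == "male"   # TRUE TRUE TRUE
```

Comparisons return TRUE/FALSE for every element; putting the comparison inside square brackets retrieves the matching values themselves.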
Built in Data sets
You can list these with data(). Examples are mtcars, diamonds and midwest (diamonds and midwest come with ggplot2).
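A brief sketch of finding and loading a built-in data set, using mtcars because it ships with base R. The object name available is made up for this example.

```r
available <- data()                      # information on data sets in the loaded packages
head(available$results[, "Item"])        # names of the first few data sets

data(mtcars)    # loads mtcars into the environment
head(mtcars)    # first six rows
dim(mtcars)     # number of rows and columns
```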
Common Errors
Capitalization
Misspelling
Closing punctuation
Continuing punctuation
Conflicting code (try emptying the environment)
Libraries not loaded - load them every time you come back into R
The unsaved object
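Some of these errors are easy to reproduce on purpose. The sketch below uses tryCatch() only so that the script keeps running after the deliberate mistake; the object names are invented for illustration.

```r
freq <- c(18, 5, 11, 3, 0)

# Capitalization: Freq is not the same object as freq
msg <- tryCatch(mean(Freq), error = function(e) conditionMessage(e))
msg

# Closing punctuation: every opening bracket needs its partner
# mean(freq      # unfinished; R would show the + continuation prompt
mean(freq)

# Conflicting code: emptying the environment removes ALL objects, so use with care
# rm(list = ls())
```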
Tidyverse and Diamonds
10 variables in total (three ordered, one integer and six numeric)
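The variable types can be checked directly. A minimal sketch, assuming ggplot2 (which supplies diamonds) is installed; var_types is a name made up for this example.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  diamonds <- ggplot2::diamonds

  print(dim(diamonds))   # rows and columns

  # first class of each column, then a count of each type
  var_types <- sapply(diamonds, function(column) class(column)[1])
  print(table(var_types))
}
```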
Make an appropriate question, e.g. is there an association between habitat and ladybird morphotype? This may help, e.g., to see whether morphotype is associated with habitat or not.
Titles
A global meta-analysis of the impacts of tree plantations on biodiversity
This is a good title. The words clearly tell the reader what they did. Consider a title as an advertisement!
Bad title, e.g.:
Assessment of interspecific interactions in plant communities: an illustration from the cold desert saltbush grasslands of North America
Word salad.
Good: short, informative, jargon-free, broad keywords
Bad: long, over-informative, too specialised, narrow non-keywords
Don’t put a spoiler in the title, e.g. Large grazers suppress a foundational plant and reduce soil carbon concentration in eastern US saltmarshes
Interrogative titles can be good as a hook, e.g. How does cruciate ligament rupture treatment affect range of motion in dogs?
The ‘hook’: On the hope for biodiversity-friendly tropical landscapes
Burning biodiversity: fuelwood harvesting causes forest degradation in human-dominated tropical landscapes. This was one of Felipe's; he would now take out the bits in italics.