Researchmethodsdocwk6on

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(dbplyr)

Attaching package: 'dbplyr'

The following objects are masked from 'package:dplyr':

    ident, sql

Week 1

Figure: Terrifying R pumpkin (a scary R pumpkin in green)

Introduction to R

Advantages to R:

  • Flexible

  • Allows analysis of almost any type of data

  • Extremely efficient; packages readily permit sharing of code

  • Freely distributed

Basic Points:

R is command driven

  • It requires you to type a command after a command prompt >. If your command is not complete R issues a continuation prompt (‘+’)

  • You write in the script window

  • R is case sensitive

  • Commands in R are called functions

  • The up arrow key (↑) can be used to bring up previous commands

  • The $ symbol is used to select a particular column within a data set called a data frame (e.g. df$var1)

  • Using # allows you to write text that is not executed
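
A few of the basic points above can be sketched in a short script. The object name total is made up for illustration; nothing here comes from the course data.

```r
# Anything after a hash is a comment and is not executed.
total <- sum(1, 2, 3)   # sum() is a function; its result is stored in 'total'
total

# R is case sensitive: 'total' and 'Total' are different names.
exists("total")
exists("Total")
```

Typing Total at the prompt would give an "object not found" error, because only the lower-case name has been created.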

Objects

Statistical analyses are based around creating and manipulating objects.

a <- 1
a + 1
[1] 2
b <- a+1

We have created an object ‘a’ with the value 1. <- is an assignment operator that assigns whatever is on the right to whatever is on the left.

We have added 1 to a, giving 2. This exciting result is lost unless we recreate it, so we can store it in another object (b)!

Objects can contain anything - vectors, matrices, dataframes, lists, script. Data frames and vectors are the most common. Consider the frequency of sightings of five avian raptor species over the last month. Species is categorical; frequency is numeric.

freq <- c(18,5,11,3,0)
#we have created a vector object 'freq'; c() is referred to as the 'concatenate' function, combining all elements into a vector.
species <- c("buzzard", "hobby", "kestrel", "merlin", "redkite")
#another type of object is a dataframe which you could consider a 'table' of information
spec_freq <- data.frame(species,freq)
spec_freq
  species freq
1 buzzard   18
2   hobby    5
3 kestrel   11
4  merlin    3
5 redkite    0
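
As a small follow-up sketch, the $ symbol mentioned in the basic points can be used on this data frame. The names total_sightings and common_species, and the threshold of 10, are chosen only for illustration.

```r
# Rebuild the data frame from the example above so this block runs on its own.
freq <- c(18, 5, 11, 3, 0)
species <- c("buzzard", "hobby", "kestrel", "merlin", "redkite")
spec_freq <- data.frame(species, freq)

# $ pulls one column out of the data frame as a vector.
spec_freq$freq

# Total number of sightings across all five species.
total_sightings <- sum(spec_freq$freq)
total_sightings

# Species seen more than 10 times (the threshold of 10 is arbitrary).
common_species <- spec_freq$species[spec_freq$freq > 10]
common_species
```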

Setting the working directory

Cyanistes - a data set on Eurasian blue tits

In this exercise we will use the package psych to report basic summary statistics. Install Packages (install.packages()) installs it; however, you also need to load it using library(). If you close R, you need to load the packages you need again. Note that in the text the data set is called cyan, but I have imported it as cyanistes.
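
A minimal sketch of the install-then-load pattern with psych. The cyanistes data are not reproduced in these notes, so the built-in mtcars data set stands in here; that substitution, the choice of columns and the name stats_cols are assumptions for illustration only.

```r
# install.packages("psych")   # run once per computer, then leave commented out

stats_cols <- c("mpg", "hp", "wt")   # arbitrary numeric columns from mtcars

if (requireNamespace("psych", quietly = TRUE)) {
  library(psych)                      # must be repeated in every new R session
  print(describe(mtcars[, stats_cols]))
} else {
  message("psych is not installed; run install.packages(\"psych\") first")
}
```

describe() reports basic summary statistics (n, mean, sd, median and so on) for each column; with the real data the call would be describe(cyanistes).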

Week 3 Data analyses

Using R for Graduate Students by Y. Wendy Huynh

Data set is called diamonds

We can check our current working directory by typing getwd() in the console. Section 3 shows how to create a new project.

getwd()
[1] "/Users/wendycolgan/Documents/Msc research Methods"
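
If the working directory needs changing by hand, setwd() does that. The sketch below uses R's temporary folder rather than a real project path so that it runs anywhere; old_dir is a made-up name.

```r
old_dir <- getwd()      # remember where we started
setwd(tempdir())        # tempdir() is a scratch folder R creates for each session
getwd()                 # now reports the temporary folder
setwd(old_dir)          # switch back to the original working directory
```

When working inside an RStudio project (.Rproj) this is rarely needed, because opening the project sets the working directory automatically.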

Installing Packages

For this guide we need:

  • tidyverse - broad and useful; includes graphing (ggplot2) and user-friendly data formatting (dplyr)

  • afex - statistics, especially ANOVAs

  • emmeans - another stats package, for calculating estimated marginal means for post hoc analyses

  • writexl - writes Excel files from R data

  • readxl - reads Excel files into R

  • ggthemes - additional tools for managing graphing aesthetics

She recommends inactivating install.packages() after it has been executed once, by highlighting the line and using Ctrl + Shift + C. All the above info is in "R packages", a separate script I have created.

Note that install.packages() needs quotation marks ("") around the package name but library() does not. An R convention.
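
The quoting convention can be sketched as follows. The stats package is used for the library() calls only because it ships with R, so the block runs without installing anything.

```r
# install.packages("tidyverse")   # quotes required; run once, then comment out

# library() accepts the package name with or without quotes.
library(stats)
library("stats")
```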

library(tidyverse)
library(afex)
Loading required package: lme4
Loading required package: Matrix

Attaching package: 'Matrix'
The following objects are masked from 'package:tidyr':

    expand, pack, unpack
************
Welcome to afex. For support visit: http://afex.singmann.science/
- Functions for ANOVAs: aov_car(), aov_ez(), and aov_4()
- Methods for calculating p-values with mixed(): 'S', 'KR', 'LRT', and 'PB'
- 'afex_aov' and 'mixed' objects can be passed to emmeans() for follow-up tests
- Get and set global package options with: afex_options()
- Set sum-to-zero contrasts globally: set_sum_contrasts()
- For example analyses see: browseVignettes("afex")
************

Attaching package: 'afex'
The following object is masked from 'package:lme4':

    lmer
library(emmeans)
Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'
library(writexl)
library(readxl)
library(ggthemes)

Updating the R version

Do this every 3 months. To see which version you currently have, execute getRversion(); see section 3.2.5 of Wendy's book in week 3.
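
A quick sketch of checking the installed version. The "4.0.0" used in the comparison is an arbitrary threshold for illustration, not a course requirement.

```r
getRversion()              # version number only, as a comparable object
R.version.string           # full description including the release date
getRversion() >= "4.0.0"   # is the installed R version 4.0.0 or newer?
```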

Types of R file

.Rproj - this is your project file in RStudio; it automatically sets your working directory

.R - script file; your previously saved code, like a Word doc

.Rhistory - keeps track of all your commands in a session

Hot Keys

Commands                          Function
Ctrl + Enter                      Executes/runs code
Ctrl + Shift + C                  Turns the current line of code into a comment. Inactivated lines will not be read and executed as normal code.
Ctrl + Left/Right Arrow           Moves your cursor all the way to the left/right
Ctrl + Shift + Left/Right Arrow   Highlights entire line to the left/right of your cursor
Alt + Left/Right Arrow            Moves your cursor one chunk of letters at a time
Alt + Shift + Left/Right Arrow    Highlights one chunk of letters at a time
Ctrl + A                          Select all text
Ctrl + S                          Saves the file. Do this every few minutes to save your progress.
Ctrl + Shift + R                  Creates headers for your R script. You can easily navigate between labeled sections of your R script.
Ctrl + Shift + M                  Shortcut for %>%, also known as a “pipe”. The pipe symbol is frequently used in the tidyverse-style syntax. More on pipes later.
Alt + Shift + K                   Displays all programmed keyboard shortcuts for R.

Some R conventions

Naming

  • Choose descriptive and concise names

  • Do not start with a number or a symbol

  • Avoid punctuation other than . and _ (a hyphen is read as a minus sign); other punctuation is understood as special commands

  • No spaces

  • Consistency e.g. all objects in capitals or all objects lower case
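
One way to check whether a name follows R's rules is to compare it with what make.names() would turn it into. This is only a sketch; the candidate names are invented for illustration.

```r
candidates <- c("bird_count", "bird.count", "bird-count", "2birds", "bird count")

# make.names() converts any string into a syntactically valid name,
# so a candidate is valid exactly when make.names() leaves it unchanged.
is_valid <- candidates == make.names(candidates)

data.frame(candidates, is_valid)
```

Names containing a hyphen or a space, or starting with a number, come back as not valid.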

Built in Symbols

Symbol: =
Definition: A single equals sign means “is defined as”, NOT “equals”. This can be used to define variable names within a dataset (most common use) or objects/datasets (not recommended use).
Example: variable = 2 translates to: the object labeled variable is defined as the number two

Symbol: <-
Definition: Defines objects/datasets. The single equals sign (=) should only be used to define an object within a larger dataset. The arrow (<-) is used to define objects that are not part of larger datasets.
Example: dataset <- 2 translates to: the object labeled dataset is defined as the number two

Symbol: ==
Definition: Equal(s)
Example: variable == 2 translates to: retrieve values from the object labeled variable that equal two

Symbol: !=
Definition: Does not equal
Example: variable != 2 translates to: retrieve values from the object labeled variable that do not equal two

Symbol: >
Definition: Greater than
Example: variable > 2 translates to: retrieve values that are greater than two from the object labeled variable

Symbol: >=
Definition: Greater than or equal to
Example: variable >= 2 translates to: retrieve values that are greater than or equal to two from the object labeled variable

Symbol: <
Definition: Less than
Example: variable < 2 translates to: retrieve values that are less than two from the object labeled variable

Symbol: <=
Definition: Less than or equal to
Example: variable <= 2 translates to: retrieve values that are less than or equal to two from the object labeled variable

Symbol: %in%
Definition: Determines whether values match (TRUE/FALSE). The format for use is x %in% table, where x and table are vectors.
Example: 2 %in% c(2,3,4) translates to: does the value on the left match anything in the vector on the right? This would return TRUE because the left value has a match among the right value(s). c(2,3,4) %in% 2 would return TRUE FALSE FALSE, as each value on the left returns TRUE/FALSE for whether it matches the value on the right of %in%.

Symbol: |
Definition: Translates to the word “or”. | can be used to retrieve matching values that meet at least one condition.
Example: variableA > 5 | variableB == TRUE translates to: retrieve values where variableA is greater than five or where the values of variableB are TRUE.

Symbol: &
Definition: Translates to the word “and”. & can be used to retrieve matching values that meet all conditions listed.
Example: variableA > 5 & variableC == "male" translates to: retrieve values where variableA is greater than five and where the values of variableC equal "male"
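
The comparison and logical symbols can be tried on a small vector. The values in variable are made up for illustration.

```r
variable <- c(1, 2, 3, 4)          # made-up values for illustration

variable == 2                      # TRUE only where the value equals two
variable[variable != 2]            # keep the values that are not two
variable[variable >= 2]            # keep the values that are two or more

2 %in% c(2, 3, 4)                  # is 2 found anywhere in the vector?
c(2, 3, 4) %in% 2                  # one TRUE/FALSE per value on the left

# | keeps values meeting at least one condition, & keeps values meeting both.
either <- variable[variable < 2 | variable > 3]
both   <- variable[variable > 1 & variable < 4]
either
both
```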

Built in Data sets

R comes with built-in data sets, e.g. mtcars, diamonds and midwest. You can list or load them with data().
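
A short sketch using mtcars, which ships with base R. Note that diamonds and midwest come with ggplot2, so they only become available after library(ggplot2) or library(tidyverse).

```r
data(mtcars)     # loads the built-in mtcars data set into the environment
head(mtcars)     # first six rows
dim(mtcars)      # number of rows and columns
```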

Common Errors

  • Capitalization

  • Mis-spelling

  • Closing punctuation

  • Continuing Punctuation

  • Conflicting code (try emptying the environment)

  • Libraries not loaded - load them every time you come back into R

  • The unsaved object

Tidyverse and Diamonds

10 variables in total (three ordered factors, one integer and six numeric)

view() lets us see the data

str() lets us see the structure

?diamonds lets us see further descriptions

library(tidyverse)
view(diamonds)
str(diamonds)
tibble [53,940 × 10] (S3: tbl_df/tbl/data.frame)
 $ carat  : num [1:53940] 0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
 $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
 $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
 $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
 $ depth  : num [1:53940] 61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
 $ table  : num [1:53940] 55 61 65 58 58 57 57 55 61 61 ...
 $ price  : int [1:53940] 326 326 327 334 335 336 336 337 337 338 ...
 $ x      : num [1:53940] 3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
 $ y      : num [1:53940] 3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
 $ z      : num [1:53940] 2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
names(diamonds)
 [1] "carat"   "cut"     "color"   "clarity" "depth"   "table"   "price"  
 [8] "x"       "y"       "z"      
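
The count of variable types can be checked directly from the data rather than by reading the str() output. This is a sketch that assumes ggplot2 is installed (it is part of the tidyverse); col_types is a made-up name.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Take the first class of each column ("ordered", "integer" or "numeric").
  col_types <- sapply(ggplot2::diamonds, function(col) class(col)[1])
  print(table(col_types))
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first")
}
```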

Week 4

Week 5

Week 6

Figure: Elephant in Lower Zambezi (elephant under a tree in the sun)

Covers

  • When to use frequency tests

  • Working with tables and graphs

  • Goodness of fit tests

  • Independence tests

  • Homogeneity tests

Make an appropriate question, e.g. is there an association between habitat and ladybird morphotype? This may help, e.g. to see whether morphotype is associated with habitat or not.
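
As a rough sketch of a goodness of fit test, the raptor sightings from Week 1 can be passed to chisq.test(). The null hypothesis used here, that all five species are seen equally often, is an assumption chosen purely for illustration; gof is a made-up object name.

```r
# Raptor sightings from the Week 1 example.
freq <- c(18, 5, 11, 3, 0)
names(freq) <- c("buzzard", "hobby", "kestrel", "merlin", "redkite")

# With a single vector of counts, chisq.test() runs a goodness of fit test.
# By default the expected proportions are equal across categories.
gof <- chisq.test(freq)

gof$expected    # expected count per species under equal frequencies
gof             # test statistic, degrees of freedom and p-value
```

For an independence question such as habitat against morphotype, the same function takes a two-way table of counts instead of a single vector.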

Titles

A global meta-analysis of the impacts of tree plantations on biodiversity

This is a good title. The words clearly convey what the authors did. Consider a title as an advertisement!

Bad title e.g.

Assessment of interspecific interactions in plant communities: an illustration from the cold desert saltbush grasslands of North America

Word salad.

Good: short, informative, jargon-free, broad keywords

Bad: long, over-informative, too specialised, narrow non-keywords

Don’t put a spoiler in the title, e.g. Large grazers suppress a foundational plant and reduce soil carbon concentration in eastern US saltmarshes

Interrogative titles can be good - a hook, e.g. How does cruciate ligament rupture treatment affect range of motion in dogs?

The ‘hook’: On the hope for biodiversity-friendly tropical landscapes

Burning biodiversity: fuelwood harvesting causes forest degradation in human-dominated tropical landscapes - was one of Felipe’s; he would now take out the bits in italics