One of the first maps I made for my thesis
2016-06-13

See pct.bike
scholar-searches1
Source: r4stats.com

scholar-searches2
Source: r4stats.com
Source: Data Camp
jobs
Source: revolution analytics
What is RStudio?
Key shortcuts in RStudio:
| Command | Action |
|---|---|
| Alt + Shift + K | Show shortcuts |
| Ctrl + Enter | Run current line of code |
| Ctrl + R | Run all lines of code in the script |
| Tab | Autocomplete* |
RStudio is not just about R - it's a productivity suite!
shortcuts <- data.frame(
  Command = c("Alt + Shift + K",
              "Ctrl + Enter",
              "Ctrl + R",
              "Tab"),
  Action = c("Show shortcuts",
             "Run current line of code",
             "Run all lines of code in the script",
             "Autocomplete*"))
knitr::kable(shortcuts) # kable() comes from the knitr package
There are 7,000+ 'add-on' packages to 'supercharge' R.
Easiest way to install them, from RStudio:
Tools -> Install Packages
or using keyboard shortcuts:
Alt + T ... then k
Can be installed and loaded in 6 lines of code:
pkgs <- c("devtools", "shiny", "rgdal", "rgeos", "ggmap", "raster")
install.packages(pkgs) # install the official packages!
library(devtools) # enables installation of leaflet
gh_pkgs <- c("rstudio/leaflet", "robinlovelace/stplanr")
install_github(gh_pkgs) # install packages on github
lapply(c(pkgs, "leaflet", "stplanr"), library, character.only = TRUE) # load them all
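When a set-up script is run more than once, it can help to install only what is missing. The following is a minimal sketch under that assumption; the helper name `missing_pkgs` is hypothetical (not part of any package), and the install call is left commented out so the block has no side effects.

```r
# Hypothetical helper: which of the requested packages are not yet installed?
missing_pkgs <- function(wanted) {
  setdiff(wanted, rownames(installed.packages()))
}

# Base packages ship with R, so nothing should be reported as missing here
to_install <- missing_pkgs(c("stats", "utils"))
to_install

# Uncomment to install only what is missing (needs an internet connection):
# if (length(to_install) > 0) install.packages(to_install)
```

Passing the `pkgs` vector from the slide above instead of the base packages would list whichever of those still need installing.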
RStudio has four main window 'panes':
The Source pane, for editing, saving, and dispatching R code to the console (top left).
The Console pane. Any code entered here is processed by R, line by line (bottom left).
The Environment pane (top right) contains information about the current objects loaded in the workspace including their class, dimension (if they are a data frame) and name.
The Files pane (bottom right) contains a simple file browser, a Plots tab, Help and Package tabs and a Viewer.
You are developing a project to visualise data. Test out the multi-panel RStudio workflow by following the steps below:
Create a new folder for the input data using the Files pane.
Type in read.c in the Source pane and hit Enter to make the function read.csv() autocomplete. Then type ", which will autocomplete to "".
Execute the full command with Ctrl-Enter:
url = "https://www.census.gov/2010census/csv/pop_change.csv"
pop_change = read.csv(url, skip = 2)
Use the Environment pane to click on the data object pop_change.
Use the Console to test different plot commands to visualise the data, saving the code you want to keep back into the Source pane, as pop_change.R.
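To see what the `skip` argument in the command above does without needing a download, here is a small sketch that reads CSV text held in a string. The data and the names `csv_text` and `toy` are made up for illustration; they are not the census file.

```r
# Made-up CSV text standing in for a downloaded file:
# two title lines, then the header, then the data
csv_text <- "Table title
Generated for illustration only
region,pop_1910,pop_1960
North,100,250
South,80,300"

# skip = 2 drops the two title lines so that line 3 becomes the header
toy <- read.csv(text = csv_text, skip = 2)
names(toy)
nrow(toy)
```

Without `skip = 2`, the first title line would be treated as the header and the columns would not be parsed as intended.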
In the far top-right of RStudio there is a diminutive drop-down menu illustrated with R inside a transparent box.
Projects
Set the working directory automatically. setwd(), a common source of error for R users, is rarely if ever needed.
The last previously open file is loaded into the Source pane.
The Files tab displays the associated files and folders in the project, allowing you to quickly find your previous work.
Any settings associated with the project, such as Git settings, are loaded. This assists with collaboration and project-specific set-up.
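Because a project sets the working directory, file paths can be written relative to the project folder. A minimal sketch, assuming a hypothetical layout with a `data` folder holding the input file:

```r
# Inside a project, getwd() already points at the project folder
getwd()

# Hypothetical layout: a data folder holding the input file
data_path <- file.path("data", "pop_change.csv")
data_path

# file.exists() is a cheap check before trying to read the file
file.exists(data_path)
```

The same relative path then works on any computer the project folder is copied to, which is the main reason `setwd()` can be avoided.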
But good code style has a number of advantages:
- Ease of reading
- Consistency
- Collaboration
Example: <- vs = for assignment (mostly interchangeable)
Anything that exists in R is an object. Let's create some with the <- symbol (= does the same job, before you ask!)
vector_logical <- c(TRUE, TRUE, FALSE)
vector_character <- c("yes", "yes", "Hello!")
vector_numeric <- c(1, 3, 9.9)
class(vector_logical) # what are the other object classes?
## [1] "logical"
Use the "Environment tab" (top right in RStudio) to see these
R has a hierarchy of data classes (logical, integer, numeric, character). When classes are combined, the result takes the most general class present:
a <- TRUE
b <- 1:5
c <- pi
d <- "Hello Leeds"
class(a)
class(b)
class(c)
class(d)
ab <- c(a, b)
ab
## [1] 1 1 2 3 4 5
class(ab)
## [1] "integer"
x = 1:5
class(x)
## [1] "integer"
x = c(x, 6.1)
class(x)
## [1] "numeric"
x = c(x, "hello") # what is the class of x now?
Test: what is the dimension of objects we created in the last slide?
Python lists are not vectorised by default: + joins them end to end rather than adding element by element, hence:
a = [1,2,3]
b = [9,8,6]
print(a + b)
## [1, 2, 3, 9, 8, 6]
R is vectorised, meaning that operations such as + are applied element by element automatically:
a = c(1,2,3)
b = c(9,8,6)
a + b
## [1] 10 10 9
?matmult for more on matrix multiplication
x <- c(1, 2, 5)
for(i in x){
  print(i^2)
}
## [1] 1
## [1] 4
## [1] 25
Creating a new vector based on x
for(i in seq_along(x)){ # seq_along() is safe even if x is empty
  if(i == 1) {
    x2 <- x[i]^2
  } else {
    x2 <- c(x2, x[i]^2)
  }
}
x2
## [1] 1 4 25
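The loop above grows `x2` one element at a time. As a hedged comparison, the sketch below gets the same result in three other ways: a vectorised expression, `sapply()`, and a loop that fills a pre-allocated vector. The names `x2_vec`, `x2_sapply` and `x2_loop` are illustrative only.

```r
x <- c(1, 2, 5)

# Vectorised: the operation is applied to every element at once
x2_vec <- x^2

# sapply(): a functional alternative to the explicit loop
x2_sapply <- sapply(x, function(i) i^2)

# Loop with a pre-allocated result, avoiding c() on every iteration
x2_loop <- numeric(length(x))
for (i in seq_along(x)) {
  x2_loop[i] <- x[i]^2
}

identical(x2_vec, x2_sapply)
identical(x2_vec, x2_loop)
```

For simple arithmetic the vectorised form is usually the shortest and the fastest; loops remain useful when each step depends on the previous one.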
class(c(a, b))
## [1] "numeric"
class(c(a, c))
## [1] "numeric"
class(c(b, d))
## [1] "character"
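The results above follow from coercion: `c()` converts all of its inputs to the most general class present. A short sketch using literal values (so that it does not depend on the objects created earlier):

```r
# c() coerces to the most general class present:
# logical -> integer -> numeric -> character
class(c(TRUE, 1L))            # "integer"
class(c(TRUE, 1L, 2.5))       # "numeric"
class(c(TRUE, 1L, 2.5, "a"))  # "character"

# Coercion explains why logicals can be summed: TRUE counts as 1
sum(c(TRUE, TRUE, FALSE))     # 2

# Explicit coercion that cannot succeed gives NA (with a warning)
suppressWarnings(as.numeric("hello"))
```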
x <- 1:5
y <- 2:6
plot(x, y)
x <- seq(1,2, by = 0.2)
length(x)
## [1] 6
x <- seq(1, 2, length.out = 5)
length(x)
## [1] 5
The fundamental data object in R.
Create them with data.frame()
data.frame(vector_logical, vector_character, n = vector_numeric)
##   vector_logical vector_character   n
## 1           TRUE              yes 1.0
## 2           TRUE              yes 3.0
## 3          FALSE           Hello! 9.9
Oops - we forgot to assign that. Tap UP or Ctrl-UP in the console, then enter:
df <- data.frame(vector_logical, vector_character, n = vector_numeric)
| | Homogeneous | Heterogeneous |
|---|---|---|
| 1d | Atomic vector | List |
| 2d | Matrix | Data frame |
| nd | Array | |
Source: Wickham (2014)
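As an illustration of the table, the sketch below creates one object of each kind and checks its dimensions with `dim()`. The object names (`vec`, `lst`, `mat`, `dat`, `arr`) are arbitrary.

```r
vec <- c(1, 2, 3)                             # 1d, homogeneous: atomic vector
lst <- list(1, "a", TRUE)                     # 1d, heterogeneous: list
mat <- matrix(1:6, nrow = 2)                  # 2d, homogeneous: matrix
dat <- data.frame(n = 1:2, s = c("a", "b"))   # 2d, heterogeneous: data frame
arr <- array(1:24, dim = c(2, 3, 4))          # nd, homogeneous: array

dim(vec)   # NULL: vectors and lists have a length, not dimensions
dim(mat)   # 2 3
dim(dat)   # 2 2
dim(arr)   # 2 3 4
```

A data frame is stored as a list of equal-length columns, which is why each column can hold a different class.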
To ask R what objects it has, we can use ls().
(Anything that happens is a function)
ls()
##  [1] "a"                "ab"               "b"
##  [4] "c"                "d"                "df"
##  [7] "i"                "pkgs"             "pop_change"
## [10] "shortcuts"        "url"              "vector_character"
## [13] "vector_logical"   "vector_numeric"   "x"
## [16] "x2"               "y"
Now we can automate the question: what class?
obs <- ls()[grep("ve", ls())]
sapply(X = mget(obs), FUN = class)
## vector_character   vector_logical   vector_numeric
##      "character"        "logical"        "numeric"
To find out what just happened, we can use R's internal help
The most commonly used help functions are:
help(apply) # get help on apply
?apply      # shorthand for help(apply)
?sapply
??apply     # search all installed help pages for "apply"
The *apply family of functions are R's internal for loops. What about get()?
?get
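A small sketch of how `get()`, `mget()` and `sapply()` fit together, using two throwaway objects (`obj_int` and `obj_chr` are made-up names for this example):

```r
obj_int <- 1:3
obj_chr <- "text"

# get() looks up one object by its name, given as a string
get("obj_int")

# mget() does the same for several names and returns a named list
objs <- mget(c("obj_int", "obj_chr"))

# sapply() calls class() on each list element and simplifies the result
classes <- sapply(objs, class)
classes
```

This mirrors the earlier `sapply(X = mget(obs), FUN = class)` call, where `obs` held the matching names returned by `ls()`.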
Square brackets [], appended to the object name, subset data.
A comma separates each dimension; nothing means everything:
df[1,] # all of the 1st row of df
##   vector_logical vector_character n
## 1           TRUE              yes 1
In a 2d dataset, the following selects the value in the 3rd row of the 3rd column:
df[3,3]
## [1] 9.9
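A few more subsetting patterns, sketched on a small made-up data frame (`toy` is a hypothetical object, separate from `df`):

```r
toy <- data.frame(id = 1:3, n = c(1, 3, 9.9))

toy[, "n"]          # one column by name, returned as a vector
toy[toy$n > 2, ]    # rows where a condition holds
toy[-1, ]           # everything except the first row
toy[2:3, "id"]      # rows 2 to 3 of a single column
```

In each case the part before the comma selects rows and the part after it selects columns.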
New columns can be created as follows:
df$new_col = NA
Or as a function of old ones:
df$new_col = df$vector_logical + df$n
plot() is polymorphic. Try plot(df) and ?plot:
## Help on topic 'plot' was found in the following packages:
##
##   Package               Library
##   raster                /home/robin/R/x86_64-pc-linux-gnu-library/3.3
##   graphics              /usr/lib/R/library
##
##
## Using the first match ...
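"Polymorphic" here means that a function behaves differently depending on the class of its input. One common mechanism for this in R is S3 method dispatch. The sketch below shows the idea with a hypothetical generic called `describe()`; it is an illustration of the mechanism, not a reproduction of how `plot()` itself is written.

```r
# A hypothetical generic, dispatching on the class of its first argument
describe <- function(x, ...) UseMethod("describe")

# One method per class, plus a fallback
describe.default    <- function(x, ...) "some other object"
describe.numeric    <- function(x, ...) "a numeric vector"
describe.data.frame <- function(x, ...) "a data frame"

describe(c(1.5, 2))            # uses describe.numeric
describe(data.frame(a = 1))    # uses describe.data.frame
describe("text")               # falls back to describe.default
```

Running `methods(plot)` in the console lists the `plot` methods available in the current session.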
u = "https://www.census.gov/2010census/csv/pop_change.csv"
pop_change = read.csv(u, skip = 2)
plot(pop_change$X1910_POPULATION, pop_change$X1960_POPULATION,
xlim = c(0, 10e6), ylim = c(0, 10e6))
library(ggplot2)
ggplot(pop_change) +
  geom_point(aes(X1910_POPULATION, X1960_POPULATION, color = STATE_OR_REGION)) # columns are looked up in pop_change
To access the materials, please see here: https://www.dropbox.com/sh/akwx5cw611c3rz2/AAC8zgQARcdO80sJqHihnbvta?dl=0
Wickham, Hadley. 2014. Advanced R. CRC Press. http://www.crcpress.com/product/isbn/9781466586963. http://adv-r.had.co.nz.