# Session 1
## S
### Julian Flowers
### 03-01-2022 (updated: 2022-01-05)

---

# Getting Started

* Install R from https://www.r-project.org/
* Install the latest version of the RStudio IDE from https://www.rstudio.com/products/rstudio/download/

## Optional

* Set up a GitHub account e.g. https://github.com/julianflowers12
* Set up an RPubs account https://rpubs.com/users/new
* Open a browser with Google
* Open a browser with Stack Overflow

---

### Power of R

---

## Some basics

- You usually need to install add-on packages first
    + `install.packages("package name")`
    
- First lines of code
    + `install.packages("pacman")`        ## download and install a universal package manager
    + `library(pacman)`                   ## load into R
    + `p_load(tidyverse)`                 ## install and load `tidyverse` - more later

```r
install.packages("pacman", repos = "https://cran.rstudio.com" )
```

```
## 
## The downloaded binary packages are in
##  /var/folders/bk/jrqs03tx5mq9s28mhml5xzhm0000gn/T//RtmpC6NC0l/downloaded_packages
```

```r
library(pacman)
p_load(tidyverse, viridis, gganimate, tweenr)
```
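As a rough illustration of what `p_load()` saves you from typing, the install-if-missing-then-load pattern can be sketched in base R. The helper name `load_or_install` is hypothetical, and this is a simplified stand-in rather than pacman's actual implementation; the CRAN mirror is the one used above.

```r
# Hypothetical helper: a simplified stand-in for what p_load() does,
# not pacman's actual implementation.
load_or_install <- function(pkgs, repos = "https://cran.rstudio.com") {
  for (pkg in pkgs) {
    # requireNamespace() returns FALSE (rather than erroring) if pkg is absent
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg, repos = repos)
    }
    # character.only = TRUE lets library() take the name from a variable
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# "stats" and "utils" ship with R, so nothing is downloaded here
load_or_install(c("stats", "utils"))
```

With pacman installed, `p_load(tidyverse)` covers the same ground in a single call.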

---

### Key ideas

* Tidy data and data wrangling
* End-to-end
* Automation
* Reproducibility
* Open
  + Data
  + Source
  + Code
* Sharing

---

### R difficulties

* Multiple ways of achieving same result
* Dependencies
* Learning curve

---
### Examples

- In the code chunk below:
    + We read data from the Coronavirus Dashboard API as a CSV file via `read_csv()`
    + (The dataset is daily test positivity by lower-tier local authority)
    + We use the `head()` function to show the first 6 rows of `df1`
    + We use the *pipe* operator `%>%`
    + The data is a *data frame*, in this case a `tibble`

```r
df1 <- read_csv("https://api.coronavirus.data.gov.uk/v2/data?areaType=ltla&metric=uniqueCasePositivityBySpecimenDateRollingSum&format=csv", show_col_types = FALSE)
df1 %>%
  head()
```

```
## # A tibble: 6 × 5
##   areaCode  areaName             areaType date       uniqueCasePositivityBySpec…
##   <chr>     <chr>                <chr>    <date>                           <dbl>
## 1 E06000003 Redcar and Cleveland ltla     2021-12-26                        20.3
## 2 E07000040 East Devon           ltla     2021-12-26                        13.8
## 3 E07000090 Havant               ltla     2021-12-26                        21.7
## 4 E07000214 Surrey Heath         ltla     2021-12-26                        22.2
## 5 E07000229 Worthing             ltla     2021-12-26                        19  
## 6 E08000001 Bolton               ltla     2021-12-26                        30
```
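To make the pipe and `head()` concrete without calling the API, here is a small sketch using `mtcars`, a data set built into R. It is only an illustration and is not part of the original session code.

```r
library(magrittr)  # provides %>%; library(tidyverse) also makes %>% available

# mtcars is a data set built into R, used here only for illustration
first_rows_piped <- mtcars %>% head()   # pipe: pass mtcars as the first argument
first_rows_plain <- head(mtcars)        # the same call written without the pipe

# Both forms give the same result
identical(first_rows_piped, first_rows_plain)

# head() takes an optional n to change how many rows are returned
mtcars %>% head(n = 3) %>% nrow()
```

Reading `x %>% f()` as "take `x`, then apply `f`" makes longer chains easier to follow.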

---
### Let's plot some of the data

```r
df1 %>%
  ## filter row-wise; `str_detect()` is a good strategy for filtering among large numbers of text categories
  filter(str_detect(areaName, "Leeds")) %>%
  ggplot(aes(date, uniqueCasePositivityBySpecimenDateRollingSum)) +
  geom_line(colour = "darkblue") +
  geom_smooth(method = "loess", span = .3) +
  labs(title = "Test positivity") + theme(plot.title.position = "plot")
```

```
## `geom_smooth()` using formula 'y ~ x'
```
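The row filter in the chunk above relies on `str_detect()`. A minimal sketch of how it behaves on its own, using made-up area names that are assumptions for illustration and not taken from the dataset:

```r
library(stringr)

# Made-up area names, for illustration only
area_names <- c("Leeds", "Bolton", "East Devon", "North Devon")

# str_detect() returns one TRUE/FALSE per element; filter() keeps the TRUE rows
is_devon <- str_detect(area_names, "Devon")
area_names[is_devon]

# The pattern is a regular expression, so "^E" means "starts with E"
starts_with_e <- area_names[str_detect(area_names, "^E")]
starts_with_e
```

Because the match is on part of the string, one pattern can select a whole family of related categories.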

![](session1-xaringan_files/figure-html/unnamed-chunk-3-1.png)

---

### Further plots
![](session1-xaringan_files/figure-html/unnamed-chunk-4-1.png)

---

### Map code

```r
library(tmap); library(sf)

s2020 <- "https://opendata.arcgis.com/datasets/69d8b52032024edf87561fb60fe07c85_0.geojson"

shp2020 <- st_read(s2020, quiet = TRUE)  ## read the boundary file (GeoJSON) as an sf object

shp2020 <- filter(shp2020, str_detect(LAD20CD, "^E"))

shp2020 <- shp2020 %>% left_join(df1, by = c("LAD20CD" = "areaCode"))

shp2020_nov <- filter(shp2020, date >= as.Date("2021-12-01"))  ## keep dates from 1 December 2021 onwards

g <- ggplot(shp2020_nov) +
  geom_sf(aes(fill = uniqueCasePositivityBySpecimenDateRollingSum, 
  colour = uniqueCasePositivityBySpecimenDateRollingSum) )+
  coord_sf() +
  scale_fill_viridis(direction = -1, name = "Test positivity (%)", 
  option = "inferno") +
  scale_colour_viridis(direction = -1, name = "Test positivity (%)", 
  option = "inferno") +
  theme_void() +
  facet_wrap(~date, ncol = 8)

```
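The join and date filter in the map code can be tried on small tables first. The sketch below uses made-up codes and values (assumptions for illustration only) to show how `by = c("LAD20CD" = "areaCode")` matches differently named key columns, and what happens to areas with no matching data.

```r
library(dplyr)

# Toy stand-ins for the boundary table and the positivity data (values are made up)
areas <- data.frame(LAD20CD = c("E01", "E02", "E03"),
                    name    = c("A", "B", "C"))
rates <- data.frame(areaCode   = c("E01", "E01", "E02"),
                    date       = as.Date(c("2021-11-30", "2021-12-01", "2021-12-01")),
                    positivity = c(10, 12, 20))

# Match the left table's LAD20CD column to the right table's areaCode column
joined <- left_join(areas, rates, by = c("LAD20CD" = "areaCode"))

# left_join() keeps every row of `areas`; E03 has no data, so its values are NA
joined

# Comparing against an explicit Date avoids relying on string-to-Date coercion;
# filter() also drops rows where the comparison is NA
recent <- filter(joined, date >= as.Date("2021-12-01"))
recent
```

In the real data each area has many dates, so the join repeats each boundary once per date, which is what lets the map be faceted by date.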

---

### Map
![](session1-xaringan_files/figure-html/unnamed-chunk-6-1.png)

---
### Small multiples

```r
p + facet_wrap(~areaName, ncol = 8)
```
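The object `p` is not defined in the code shown on these slides. As a self-contained sketch of the same idea, the example below builds a hypothetical base plot `p_toy` from made-up data and then splits it into small multiples; the names and values are assumptions for illustration.

```r
library(ggplot2)

# Made-up data: three areas observed on four dates (illustration only)
toy <- data.frame(
  areaName = rep(c("A", "B", "C"), each = 4),
  date     = rep(as.Date("2021-12-01") + 0:3, times = 3),
  value    = c(10, 12, 15, 14, 20, 22, 21, 25, 5, 6, 8, 9)
)

# Hypothetical base plot; the slide's own `p` is built in code not shown here
p_toy <- ggplot(toy, aes(date, value)) +
  geom_line(colour = "darkblue")

# facet_wrap() draws one panel per areaName, laid out in up to 3 columns
p_small_multiples <- p_toy + facet_wrap(~areaName, ncol = 3)
p_small_multiples
```

By default all panels share the same axes, which makes it easy to compare levels across areas.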

![](session1-xaringan_files/figure-html/unnamed-chunk-8-1.png)

---