R-workshop-series-plan

P.Adames
July 13, 2020

The vision

Contribute a mini-series of R-centric workshops to the CalgaryR community with the following main goals:

Explain the R programming language in a concise manner to beginners
Introduce R as a tool for computation and data manipulation
Highlight, not hide, what makes native R different and useful
Introduce the abstractions that R makes possible by design
1. Environments and scopes
2. Closures
3. Non-standard evaluation
Show how modern R libraries allow data manipulation
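
The three abstractions listed above can be previewed in a few lines of base R. This is only a minimal sketch, not workshop material from the plan itself: the names make_counter, counter, and show_expr are hypothetical and chosen just for illustration.

```r
# Closure: make_counter() returns a function that remembers `count`
# between calls. (make_counter is a hypothetical name for this sketch.)
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # `<<-` assigns in the enclosing environment
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()

# Environments and scopes: the closure carries its enclosing environment,
# and that environment is where `count` lives.
counter_env  <- environment(counter)
stored_count <- get("count", envir = counter_env)

# Non-standard evaluation: substitute() captures the expression the caller
# typed instead of evaluating it. (show_expr is a hypothetical helper.)
show_expr <- function(x) deparse(substitute(x))
captured  <- show_expr(a + b)
```

Note that show_expr(a + b) works even though neither a nor b is defined, because the argument is never evaluated.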

The audience

Anybody with an interest in learning an expressive computational language for statistical and data analysis, data science, and data visualization.

Anyone curious about what modern R tools offer to solve old and new problems alike.

Anybody willing to put in some time to answer the question: what makes R look so similar to most C-like languages, yet sometimes behave so unexpectedly differently?

The workshop's philosophy

Use the right tool for the job.

Not all [computer] languages are created equal.

Some are meant to look prettier and read almost like natural (English) language. Others are extremely easy for the uninitiated to use but grow progressively more complex when used to solve specialized tasks. Yet others are expressive and focused: docile in the hands of the expert, but harsh in the hands of the unprepared.

R-Workshop Series Plan

The basic data types in R (You need to know these four)
Vectorization (Did you know that R, like MATLAB, is a vectorized language for data?)
Environments (what they are and how they work)
Lexical scoping and what it can do for you (use cases)
Libraries (all you ever wanted to know and never asked)
Reproducible R data analysis: Knitr-RStudio and Jupyter (via Anaconda)
Did you know R is a functional language? Here is why that can be good news.
Object-oriented R (S3 and S4 object models, can you code without ever knowing what they really are?)
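
As a rough preview of three of the topics above (vectorization, functional style, and S3), here is a small base R sketch. It is an illustration under assumptions, not the planned workshop content: the describe generic and the "workshop" class are hypothetical names invented for this example.

```r
# Vectorization: arithmetic applies to every element, no explicit loop.
x <- c(1, 2, 3, 4)
doubled <- x * 2

# Functional style: functions are values that can be passed to other functions.
squares <- sapply(x, function(v) v^2)
total   <- Reduce(`+`, x)

# S3: a generic function dispatches on the class attribute of its argument.
# (describe and the "workshop" class are hypothetical, for illustration only.)
describe <- function(x, ...) UseMethod("describe")
describe.default  <- function(x, ...) "something else"
describe.workshop <- function(x, ...) paste("Workshop:", x$title)

w <- structure(list(title = "Closures"), class = "workshop")
described_w     <- describe(w)    # dispatches to describe.workshop
described_other <- describe(42)   # falls back to describe.default
```

The S3 part shows why it is possible to use R for a long time without noticing the object system: calling describe(w) looks like an ordinary function call, and the dispatch happens behind the scenes.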

R-Workshop Series Plan (Cont.)

The Tidyverse implementation of data analysis workflows, Part 1 (tidy data)
The Tidyverse implementation of data analysis workflows, Part 2 (table transformations, column-based ops)
The Tidyverse implementation of data analysis workflows, Part 3 (Mutate, Summarize, Group, Nest)
Part I: The Caret implementation of Machine Learning workflows
Part II: The Caret implementation of ML pipelines and recipes
R and DSLs (What's a DSL anyway?)
Part I: R vs. Python. R native vs. Python native
Part II: R vs. Python. Tidyverse/caret/ggplot2 vs. NumPy/pandas/scikit-learn/Matplotlib/Seaborn
Part III: R vs. Python. Use cases. Research/prototyping/production/packages/communities/documentation
Part IV: R vs. Python. Notebooks/Jupyter/Spark/Kaggle/AWS

Here is a list of books used as references for this workshop series:

R in Action, Data Analysis and Graphics with R, 2nd Ed. Robert I. Kabacoff. Manning 2015
Advanced R. Hadley Wickham. CRC Press, 2015
ggplot2, Elegant Graphics for Data Analysis. Hadley Wickham. Springer, 2009
Text Mining in Practice with R. Ted Kwartler. Wiley 2017
Probability, Decisions, and Games, a gentle introduction using R. Abel Rodriguez and Bruno Mendes. Wiley 2018
Statistical Data Cleaning with Applications in R. Mark van der Loo and Edwin de Jonge. Wiley 2018
Deep Learning with R. Francois Chollet with J.J. Allaire. Manning 2018