The R universe
Faculty of Humanities, Education and Social Sciences (FHSE), University of Luxembourg
Getting to know each other and the seminar.
Assistant professor in lifespan developmental psychology (FHSE).
Research interests in ageing, culture, mental health, and digital technology.
What do you study?
Your experience with R?
What are your expectations?
In person, in Belval.
13:00 - 17:00.
Topics:
A combination of input and practical sessions.
Use the short book associated with the seminar https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/.
Reason 1: Integrated workflow through programming IDE
Reason 2: Open source
Reason 3: Universe of possibilities
2 ECTS.
ECTS conditional on evaluation and attendance.
Evaluation is through homework to be delivered after each topic (see next slide).
Feedback given in person or via email no later than 5 days thereafter.
Homework is to be submitted as R script files, .Rmd or .qmd documents, unless otherwise specified.
If you need the ECTS, please approach me at the end of the seminar.
This is half the job already!
We need:
a laptop, preferably
software: R, RStudio, quarto
version control: git
online repositories: GitHub, shinyapps.io, quartopub.com
R and the R console
RStudio
quarto
Open accounts on GitHub, shinyapps.io, and quartopub.com.
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency (https://git-scm.com/).
Command line in the Terminal.
GitHub Desktop with a user interface: https://desktop.github.com/
Tip
We will use the command line throughout the seminar.
To work with git and the online repository GitHub, we need to open a communication channel between the local machine and the online repository.
Please follow the instructions on the official GitHub website:
For Mac, press here
For Windows, press here
For Linux, press here
Tip
The SSH key setup may take some time. That is okay.
The action of uploading the written code or files from the local machine to the online repository.
Uploads the code and records the push in the history of the project.
Opposite of git push.
Copies and transfers the project history from the online repository to the local machine.
Commits a project update so that the update is recorded in the project history.
Download, install, and set up the work environment.
If you’ve already done it yourself, please help others.
Throughout the seminar, we use datasets that come pre-installed with R, such as mtcars, and a sub-sample from Stanciu et al. (2017).
We will progressively build an example code, from automated reports to shiny applications.
We also use metadata contained in the Microsoft Excel file movies.xlsx.
Stanciu et al. (2017) studied how people stereotyped varying social groups in terms of warmth and competence across several regions in Romania. For this seminar, we will use data from a sample of n = 100 participants selected at random from the reported data set.
ppn: participant number,
gen: self-reported gender of participant as female (1) or male (2),
age: chronological age, self-reported in years,
res: region or residence of the participant,
res_other: open-ended question regarding region or residence of the participant,
men_warm: participant’s stereotypical evaluation of men in terms of warmth,
men_comp: participant’s stereotypical evaluation of men in terms of competence,
wom_warm: participant’s stereotypical evaluation of women in terms of warmth,
wom_comp: participant’s stereotypical evaluation of women in terms of competence.
Stereotypical evaluations were assessed on Likert scales with these answer options:
1 = strongly disagree,
2 = disagree,
3 = undecided,
4 = agree,
5 = strongly agree.
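The answer options can be attached to the numeric codes with factor(). This is a minimal sketch, assuming a handful of made-up responses (they are not values from Stanciu et al., 2017) and the hypothetical object names men_warm_demo, likert_labels, and men_warm_labelled:

```r
# made-up responses for illustration only; not taken from the real data
men_warm_demo <- c(4, 2, 5, 3, 4)

# the five answer options, in the order of their numeric codes
likert_labels <- c("strongly disagree", "disagree", "undecided",
                   "agree", "strongly agree")

# attach the labels to the codes 1 through 5
men_warm_labelled <- factor(men_warm_demo, levels = 1:5, labels = likert_labels)

# count how often each answer option was chosen
table(men_warm_labelled)
```

Keeping the numeric codes in one object and the labelled version in another means means and other statistics can still be computed on the numbers.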
Download the Stanciu et al. (2017) sub-sample and the .xlsx file from the seminar book, R beyond data analysis.
R scripts
Working in the console can get messy, and the work cannot be saved or reproduced.
R scripts are the solution!
Organize your work routine
When you are sure you wrote the right code in the console, copy-paste it into an R script to have it saved.
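What "saved and reproducible" means can be sketched as follows. The file location (a temporary file) and the object names script_path, x, and x_mean are assumptions for illustration. In practice the script would be a .R file inside your project folder.

```r
# write two lines of code into a script file (a temporary file is used here)
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- c(1, 3, 66, 9, 121)",
             "x_mean <- mean(x)"),
           script_path)

# run the saved script; the objects it creates become available
source(script_path)
x_mean
```

Because the code lives in a file, it can be re-run at any time with source() and gives the same result.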
The basics, the absolute very basics.
R is a programming language derived from S, a language developed at Bell Labs whose commercial implementation is S-PLUS.
Developed in 1991 at the University of Auckland (NZ) by Ross Ihaka and Robert Gentleman.
In 1995, it became open source thanks to contributions by Martin Mächler.
Integrated in RStudio.
Where R Markdown and quarto documents, as well as shiny apps, are programmed.
History provides access to code that was run in the console or inside chunks in the Source panel.
Tutorial contains tutorials. First install the package learnr, as indicated.
Environment displays the objects available in the work environment.
R console and Terminal integrated.
Here, in the Terminal tab, is where we use the git command line.
Ideally, we always use projects - Rprojects or quarto projects.
Most important tabs are:
Files works like Windows Explorer or Mac OS Finder.
Plots previews graphs generated through code.
Packages provides manual access to installing and updating packages.
Open the R console and RStudio. Navigate between the two and familiarize yourself with them.
R coding
Introduces the logic of working with objects in R.
We will use either the console panel or R code chunks integrated in enhanced documents (Source panel) in RStudio.
Stores information, code, and results of code into an object.
Simplifies the workflow because we can perform operations on the object as a whole or on specific elements of the object.
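A minimal sketch of this logic, assuming the hypothetical object names scores, scores_doubled, second_score, and scores_mean:

```r
# store information in an object with the assignment operator <-
scores <- c(2, 4, 6)

# operate on the object as a whole
scores_doubled <- scores * 2

# operate on a specific element of the object (here the second one)
second_score <- scores[2]

# the result of code can be stored in an object as well
scores_mean <- mean(scores)
```

Typing the name of an object in the console, for example scores_doubled, prints its content.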
Vectors are scalable objects, meaning that they can hold multiple elements.
Numeric
Character string
[1] "A" "Ab" "This or that" "C" "d"
Logical
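The three vector types can be sketched as follows. The numeric and character values are the ones that appear in the printed output in this section. The logical values and all object names (vec_num, vec_chr, vec_log) are assumptions for illustration.

```r
# numeric vector
vec_num <- c(1, 3, 66, 9, 121)

# character string vector
vec_chr <- c("A", "Ab", "This or that", "C", "d")

# logical vector (values made up for illustration)
vec_log <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# check the type of each vector
class(vec_num)
class(vec_chr)
class(vec_log)

# access a single element, here the third one
vec_chr[3]
```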
Combine multiple vectors and have varying internal structures.
Can be saved and imported as SPSS .sav, Microsoft Excel .xlsx, R .RData files. Also, .dat, .csv, .asci and so on.
'data.frame': 5 obs. of 2 variables:
$ col1: num 1 3 66 9 121
$ col2: chr "A" "Ab" "This or that" "C" ...
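A data frame with this structure can be built from two vectors. This is a sketch that reproduces the structure printed above. The object name df_demo is an assumption, since the original code is not part of this text.

```r
# combine a numeric and a character vector into a data frame
df_demo <- data.frame(
  col1 = c(1, 3, 66, 9, 121),
  col2 = c("A", "Ab", "This or that", "C", "d"),
  stringsAsFactors = FALSE  # keep col2 as character strings
)

# inspect the internal structure of the data frame
str(df_demo)

# access one column, then one element of that column
df_demo$col1
df_demo$col1[3]
```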
Functions are labeled as such and have a unique code structure: function(){}.
() defines the function arguments.
{} contains the body of the function.
Tip
The word function is a must!
Let us have a first look at functions.
We use the pre-installed data iris and mtcars.
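The definition of func1, which is applied in the code below, does not appear in this text. Here is a minimal sketch that reproduces the printed results, assuming base R only and the hypothetical argument names df and n:

```r
# func1: keep the numeric columns of a data frame and add n to every value
func1 <- function(df, n) {   # () holds the arguments
  # {} holds the body of the function
  numeric_cols <- vapply(df, is.numeric, logical(1))
  df[, numeric_cols, drop = FALSE] + n
}
```

Because the function only assumes a data frame with at least one numeric column, the same code works for both iris and mtcars.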
iris df
# first six rows of the pre-installed dataset iris
# the numeric columns are columns 1 through 4
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
# we apply the function and add 3 to all numeric columns of iris
# we only ask to see the first six rows of the outcome using head()
head(func1(iris, 3))
Sepal.Length Sepal.Width Petal.Length Petal.Width
1 8.1 6.5 4.4 3.2
2 7.9 6.0 4.4 3.2
3 7.7 6.2 4.3 3.2
4 7.6 6.1 4.5 3.2
5 8.0 6.6 4.4 3.2
6 8.4 6.9 4.7 3.4
# we apply the function and add 99 to all numeric columns of another pre-installed dataset 'mtcars'
# we only ask to see the first six rows of the outcome using head()
head(func1(mtcars, 99))
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 120.0 105 259 209 102.90 101.620 115.46 99 100 103 103
Mazda RX4 Wag 120.0 105 259 209 102.90 101.875 116.02 99 100 103 103
Datsun 710 121.8 103 207 192 102.85 101.320 117.61 100 100 103 100
Hornet 4 Drive 120.4 105 357 209 102.08 102.215 118.44 100 99 102 100
Hornet Sportabout 117.7 107 459 274 102.15 102.440 116.02 99 99 102 101
Valiant 117.1 105 324 204 101.76 102.460 119.22 100 99 102 100
An R package contains code, documentation, and sometimes even data.
Packages typically do not come pre-installed, so they need to be installed before use.
# we might have to set up a mirror first!
# mirror is the website from which r will install packages
r <- getOption("repos")
r["CRAN"] <-"https://cloud.r-project.org/"
options(repos=r)
# installs `tidyverse`
install.packages("tidyverse")
# loads the package into the current R session
# this step is crucial if you want access to all the functions the package contains
library(tidyverse)
pacman package
We can install multiple packages at once.
But the package pacman must first be installed using the simple method above.
(Almost) every package has a dedicated website. Visit it for usage examples and to find out which functions the package contains. For example, https://www.tidyverse.org/.
Call the package documentation by typing in a question mark followed by the name of the package or function contained in a package. For example:
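The following sketch shows what such calls look like. It uses the base function head() and the package stats as stand-ins; both ship with R, so nothing needs to be installed.

```r
# open the documentation of a function
?head

# equivalent call using help()
help("head")

# documentation index of a whole package
help(package = "stats")
```

For a function from a package that is installed but not loaded, the package can be named explicitly in the form help("name", package = "pkg").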
%>% operator
The pipe operator %>% condenses into a single expression what would otherwise be a long chain of steps, each creating an object that is then subjected to a new operation.
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
# we filter on the column cyl so
# that only cars with cyl < 5 are displayed
mtcars %>% filter(cyl < 5)
mpg cyl disp hp drat wt qsec vs am gear carb
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
Without %>%.
mpg cyl disp hp drat wt qsec vs am gear carb
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.9 1 1 5 2
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.6 1 1 4 2
With %>%.
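The code behind the two outputs above is not part of this text. The sketch below shows one way to obtain them. It assumes the steps were a filter on cyl followed by tail() with two rows, and it relies on the package dplyr (part of tidyverse) for filter() and %>%. The object names without_pipe and with_pipe are hypothetical.

```r
library(dplyr)  # provides filter() and the pipe operator %>%

# without %>%: calls are nested and read from the inside out
without_pipe <- tail(filter(mtcars, cyl < 5), 2)

# with %>%: the same steps read from left to right
with_pipe <- mtcars %>%
  filter(cyl < 5) %>%
  tail(2)

# check that both versions give the same result
identical(without_pipe, with_pipe)
```

The piped version avoids both nesting and intermediate objects, which matters more as the chain of steps grows.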
Use something complex but stable or something simple but unstable?!
Base R is complex but stable.
Packages are simple(r) to use but depend on the community for their maintenance.
Use groundhog
To circumvent the danger of packages changing over time, one could use the package groundhog, which loads packages as they were available on a given date. Read more at https://groundhogr.com/.
Write a function using base R or packages.
Think of something simple and repetitive that can be applied to either the iris or the mtcars dataset, or to both.
Tip
This exercise paves the way for Day 4 when we work on shiny apps.
With Dr. Ranjit SINGH from GESIS - Leibniz Institute for the Social Sciences, I prepared a workshop on R for beginners. All the material is open access via GitHub.
Clone the repository to your local machine and do the exercises.
Navigate first to the page of the repository and then clone it to your local machine: https://github.com/adrianvstanciu/rworkshop_open.
If anyone needs the 2 ECTS, please let me know.
Of course, if you want an assignment without needing the ECTS, also approach me.
r
Made r open source
Figure 3: The R console
An Integrated Development Environment (IDE).