About

Getting to know each other and the seminar.

Myself

Assistant professor in lifespan developmental psychology (FHSE).

Research interests in ageing, culture, mental health, and digital technology.

Your turn…

  • What do you study?

  • Your experience with R?

  • What are your expectations?

This seminar

In person in Belval.

13:00 - 17:00.

Topics:

  • March 12 & March 19: Intro to R universe and infrastructure preparation
  • March 26 & April 14: Automatization
  • April 21: Online content publication
  • April 30: Web applications (Shiny apps)

This seminar

A combination of input and practical sessions.

Use the short book associated with the seminar https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/.

Why this seminar

  • Reason 1: Integrated workflow through programming IDE

  • Reason 2: Open source

  • Reason 3: Universe of possibilities

Seminar admin & ECTS

2 ECTS credits.

ECTS credits are conditional on evaluation and attendance.

Evaluation is through homework to be delivered after each topic (see next slide).

Feedback given in person or via email no later than 5 days thereafter.

Homework

Homework is to be submitted as R script files, .Rmd or .qmd documents unless otherwise specified.

If you need the ECTS credits, please approach me at the end of the seminar.

Set up

This is half the job already!

Tools & Equipment

We need:

  • a laptop preferably

  • software: R, RStudio, Quarto

  • git et co.: git

  • online repositories: GitHub, shinyapps.io, quartopub.com

Set up – Software

  1. Install R and the R console:

https://cran.r-project.org/

  2. Install RStudio:

https://posit.co/downloads/

  3. Install Quarto:

https://quarto.org/docs/get-started/

Set up – git

  1. Install for Windows:

https://gitforwindows.org/

  2. Install for Mac or Linux using Homebrew:

https://brew.sh/

Set up – online repositories

Open accounts on:

  1. GitHub

https://github.com/

  2. shinyapps.io

https://www.shinyapps.io/

  3. quartopub.com

https://quartopub.com/

git basics

How Microsoft Copilot sees git

Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency https://git-scm.com/.

Command line in the Terminal.

GitHub Desktop with a user interface: https://desktop.github.com/

Tip

We will use the command line throughout the seminar.

git basics

To work with git and the online repository GitHub, we need to open a communication channel between the local machine and the online repository.

the SSH Key

Please follow the instructions on the official GitHub website:

For Mac, press here

For Windows, press here

For Linux, press here

Tip

The SSH key set up may take some time. That is okay.

git basics – push

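git add .                    # stages all changed files for the next commit
git commit -m "COMMIT HERE"  # records the staged changes in the project history
git push                     # uploads the new commits to the online repository

Uploads the committed code or files from the local machine to the online repository.

The push makes these commits part of the project history on the online repository.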

git basics – pull

git pull

The opposite of git push: downloads new commits from the online repository to the local machine.

git basics – clone

git clone {repository here}

Copies the project, including its full history, from the online repository to the local machine.

git basics – commit

git commit -m "COMMIT HERE"

Records a project update as a snapshot in the local project history.

DIY – Set up

Download, install, and set up the work environment.

If you’ve already done it yourself, please help others.

The illustrative example

Throughout the seminar, we use pre-installed R datasets like mtcars and a sub-sample from Stanciu et al. (2017).

We will progressively build example code, from automated reports to Shiny applications.

We also use metadata contained in a Microsoft Excel file, movies.xlsx.

The paper

Stanciu et al. (2017) studied how people stereotyped varying social groups in terms of warmth and competence across several regions in Romania. For this seminar, we will use data from a sample of n = 100 participants selected at random from the reported data set.

The variables

  • ppn participant number,
  • gen self-reported gender of participant as female (1) or male (2),
  • age chronological age as it was self-reported in years,
  • res region of residence of the participant,
  • res_other open-ended question regarding the region of residence of the participant,
  • men_warm participant’s stereotypical evaluation of men in terms of warmth,
  • men_comp participant’s stereotypical evaluation of men in terms of competence,
  • wom_warm participant’s stereotypical evaluation of women in terms of warmth,
  • wom_comp participant’s stereotypical evaluation of women in terms of competence.

The measurement scale

Stereotypical evaluations were assessed on Likert scales with these answer options:

1 = strongly disagree,

2 = disagree,

3 = undecided,

4 = agree,

5 = strongly agree.
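To get a feel for the variables and the answer scale before downloading the real data, here is a minimal sketch with simulated values. The data frame toy and its random entries are made up for illustration; only the variable names and codes follow the description above.

```r
# simulated stand-in for the seminar data (NOT the Stanciu et al. data)
set.seed(1)  # makes the simulated values reproducible
n <- 10
toy <- data.frame(
  ppn      = seq_len(n),                       # participant number
  gen      = sample(1:2, n, replace = TRUE),   # 1 = female, 2 = male
  men_warm = sample(1:5, n, replace = TRUE),   # ratings on the 1-5 scale
  wom_warm = sample(1:5, n, replace = TRUE)
)

# label the gender codes as described in the variable list
toy$gen <- factor(toy$gen, levels = c(1, 2), labels = c("female", "male"))

# check that all ratings stay within the 1-5 answer scale
stopifnot(all(toy$men_warm %in% 1:5), all(toy$wom_warm %in% 1:5))

# mean warmth rating per target group
colMeans(toy[, c("men_warm", "wom_warm")])
```

The same checks can later be run on the real sub-sample once it is imported.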

Download seminar resources

Download the Stanciu et al. (2017) sub-sample and the .xlsx file from the seminar book R beyond data analysis:

https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/

r scripts

Work in the console can get messy and is not saved, so it cannot easily be reproduced.

R scripts are the solution!

Figure 1: Anatomy of an R script file

Organize your work routine

When you are sure you wrote the right code in the console, copy-paste it into an R script to have it saved.
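A script could be organized along these lines. This is only one possible layout; the file name and the section labels are hypothetical, and the analysis uses the pre-installed dataset mtcars.

```r
# ---- analysis_mtcars.R (hypothetical file name) ----
# Purpose: describe fuel consumption in the pre-installed dataset mtcars

# 1. Set up -------------------------------------------------------------
data(mtcars)  # loads the dataset into the work environment

# 2. Analysis -----------------------------------------------------------
# mean miles per gallon for each number of cylinders
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# 3. Output -------------------------------------------------------------
print(mpg_by_cyl)
```

Comments (lines starting with #) are ignored by R and document what each step does.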

R

R is a programming language derived from S, a language developed at Bell Labs.

Developed in 1991 at the University of Auckland (NZ) by Ross Ihaka and Robert Gentleman.

In 1995, it became open source thanks to contributions by Martin Mächler.

Figure 2: The creators of R and the contributor who made R open source as we know it today (panels: Creators of R; Made R open source)

R console

Integrated in RStudio.

RStudio

An Integrated Development Environment (IDE).

Source panel

Where R Markdown and Quarto documents and Shiny apps are written.

Environment panel

History provides access to code that was run in the console or inside chunks run from the Source panel.

Tutorial contains tutorials. First install the package learnr, as indicated.

Environment displays the objects (e.g. data, values, functions) available in the work environment.

Console/Terminal panel

R console and Terminal integrated.

Here – Terminal tab – is where we use the git command line.

Ideally, we always use projects - Rprojects or quarto projects.

Files/Plots panel

Most important tabs are:

Files works like Windows Explorer or Mac OS Finder.

Plots previews graphs generated through code.

Packages provides manual access to installing and updating packages.

Familiarize yourself

Open the r console and RStudio. Navigate between the two and familiarize yourself with them.

Basics of r coding

Introduces the logic of working with objects in r.

We will use either the console panel or r code chunks integrated in enhanced documents (source panel) in RStudio.

Objects

An object stores information, code, or the results of code.

Simplifies the workflow because we can perform operations on the object as a whole or on specific elements of the object.

# no object created
2+2
[1] 4
# object is first created and then printed
sum<-2+2
sum
[1] 4
# creates a second object called mean
mean<-mean(c(1,2,5,7,8,9))
mean
[1] 5.333333
# and then adds the two objects 'sum' and 'mean' together
result<-sum+mean
result
[1] 9.333333
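The object names sum and mean used above coincide with the names of base R functions. R still finds the functions, but distinct names are easier to read. A minimal sketch with the hypothetical names total and avg:

```r
# distinct names avoid confusion with the base functions sum() and mean()
total <- 2 + 2
avg <- mean(c(1, 2, 5, 7, 8, 9))

# combine the two objects
result <- total + avg
result

# remove an object that is no longer needed
rm(total)
exists("total", inherits = FALSE)  # checks whether 'total' is still defined here
```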

Vectors

Vectors are scalable objects meaning that they can hold multiple elements.

Numeric

# example of numeric vectors
vec1<-c(1,3,66,9,121)
vec1
[1]   1   3  66   9 121

Character string

# example of character string vector
vec2<-c("A","Ab","This or that","C","d")
vec2
[1] "A"            "Ab"           "This or that" "C"            "d"           

Logical

# example of logical vector
vec3<-c(TRUE,TRUE, FALSE, TRUE)
vec3
[1]  TRUE  TRUE FALSE  TRUE
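A few base R operations that work on any vector, shown here as a short sketch. The names vec_num and vec_mixed are chosen for illustration only.

```r
vec_num <- c(1, 3, 66, 9, 121)

length(vec_num)        # number of elements
class(vec_num)         # type of the elements
vec_num[3]             # third element (R counts from 1)
vec_num[vec_num > 5]   # only the elements larger than 5

# a vector holds one type only: mixing types converts
# everything to the most flexible type (here: character)
vec_mixed <- c(1, "A", TRUE)
class(vec_mixed)
```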

Data tables

Data tables combine multiple vectors, which may be of different types, and can have varying internal structures.

Can be saved and imported as SPSS .sav, Microsoft Excel .xlsx, R .RData files. Also, .dat, .csv, .asci and so on.

Simple example

# create a simple data table
df<-data.frame(col1=vec1,
                  col2=vec2)
df
  col1         col2
1    1            A
2    3           Ab
3   66 This or that
4    9            C
5  121            d

Accessing df elements

# access col1
df[,1]
[1]   1   3  66   9 121
# access first row
df[1,]
  col1 col2
1    1    A
# access entry at first row and col1
df[1,1]
[1] 1
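Columns can also be accessed by name, and rows can be selected with a condition. The sketch below rebuilds the same small table under the hypothetical name df_demo so that it runs on its own.

```r
df_demo <- data.frame(col1 = c(1, 3, 66, 9, 121),
                      col2 = c("A", "Ab", "This or that", "C", "d"),
                      stringsAsFactors = FALSE)

df_demo$col1                        # column by name
df_demo[["col2"]]                   # same idea, name given as a string
df_demo[df_demo$col1 > 5, ]         # rows where col1 is larger than 5
df_demo[df_demo$col1 > 5, "col2"]   # col2 for those rows only
```

Access by name is usually more robust than access by position, because it keeps working when columns are reordered.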

Actions on df

# checks the elements of the data table
str(df)
'data.frame':   5 obs. of  2 variables:
 $ col1: num  1 3 66 9 121
 $ col2: chr  "A" "Ab" "This or that" "C" ...
# provides a summary of the data table
summary(df)
      col1         col2          
 Min.   :  1   Length:5          
 1st Qu.:  3   Class :character  
 Median :  9   Mode  :character  
 Mean   : 40                     
 3rd Qu.: 66                     
 Max.   :121                     

Actions on specific elements

# an addition on the numeric vector of the data table
df[,1]+100
[1] 101 103 166 109 221
# an addition on the numeric element at the intersection of row 5 and column 1
df[5,1]+123
[1] 244
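Saving and re-importing a data table can be sketched with base R alone, here for .csv and .RData files. The files are written to a temporary location, and df_demo is a hypothetical example table. SPSS and Excel files typically require additional packages (for example haven and readxl), which are not used here.

```r
df_demo <- data.frame(col1 = c(1, 3, 66),
                      col2 = c("A", "Ab", "C"),
                      stringsAsFactors = FALSE)

# .csv: plain text, readable by most software
csv_file <- tempfile(fileext = ".csv")
write.csv(df_demo, csv_file, row.names = FALSE)
df_csv <- read.csv(csv_file, stringsAsFactors = FALSE)

# .RData: R's own format, keeps the object exactly as it was
rdata_file <- tempfile(fileext = ".RData")
save(df_demo, file = rdata_file)
rm(df_demo)        # remove the object from the work environment ...
load(rdata_file)   # ... and restore it under its original name

all.equal(df_demo, df_csv)
```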

Functions

Functions are declared with the keyword function and have a distinctive code structure: function(){}.

() defines the function arguments.

{} contains the function body, that is, the code that is run when the function is called.

Tip

The word function is a must!
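The structure can be seen in a minimal sketch. The function name add_constant and its arguments x and n are made up for illustration.

```r
# name <- function(arguments) { body }
add_constant <- function(x, n = 1) {
  x + n   # the value of the last expression is returned
}

add_constant(5)               # uses the default n = 1
add_constant(5, n = 10)       # overrides the default
add_constant(c(1, 2, 3), 2)   # works element-wise on vectors
```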

Functions

Let us have a first look at functions.

We use the pre-installed data iris and mtcars.

Inspect iris df

# first six rows of the pre-installed dataset iris
# numerical columns are then columns 1 through 4
head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
# adds 3 to all numerical columns
head(iris[,1:4] + 3)

# add 77 to all numerical columns 
head(iris[,1:4] + 77)

Write simple function

# this function takes two arguments: a dataset 'df' and a constant 'n'
func1<-function(df,n){
  
  tmp <- Filter(is.numeric, df) # we first filter the dataframe for numeric columns
  
  tmp + n # we then add the constant to all the numeric columns
}

Apply function

# we apply the function and add 3 to all numeric columns of iris
# we only ask to see the first six rows of the outcome using head()
head(func1(iris,3))
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1          8.1         6.5          4.4         3.2
2          7.9         6.0          4.4         3.2
3          7.7         6.2          4.3         3.2
4          7.6         6.1          4.5         3.2
5          8.0         6.6          4.4         3.2
6          8.4         6.9          4.7         3.4
# we apply the function and add 99 to all numeric columns of another pre-installed dataset 'mtcars'
# we only ask to see the first six rows of the outcome using head()
head(func1(mtcars,99))
                    mpg cyl disp  hp   drat      wt   qsec  vs  am gear carb
Mazda RX4         120.0 105  259 209 102.90 101.620 115.46  99 100  103  103
Mazda RX4 Wag     120.0 105  259 209 102.90 101.875 116.02  99 100  103  103
Datsun 710        121.8 103  207 192 102.85 101.320 117.61 100 100  103  100
Hornet 4 Drive    120.4 105  357 209 102.08 102.215 118.44 100  99  102  100
Hornet Sportabout 117.7 107  459 274 102.15 102.440 116.02  99  99  102  101
Valiant           117.1 105  324 204 101.76 102.460 119.22 100  99  102  100

Packages

An R package contains code, documentation, and sometimes even data.

Packages typically do not come pre-installed, so they need to be installed before use.

Simple installation

# we might have to set up a mirror first!
# mirror is the website from which r will install packages
r <- getOption("repos")
r["CRAN"] <-"https://cloud.r-project.org/"
options(repos=r)

# installs `tidyverse`
install.packages("tidyverse")

# loads the package into the current R session
# this step is crucial if you want access to all the functions it contains
library(tidyverse)

Use pacman package

We can install multiple packages at once.

But the package pacman must be first installed using the simple method above.

# first, we install the `pacman` package
install.packages("pacman")

# then, we use the function `p_load` from the `pacman` package to install (if needed) and load the `tidyverse`, `rmarkdown`, `bookdown`, `quarto`, and `shiny` packages
pacman::p_load(tidyverse,rmarkdown,bookdown,quarto,shiny)
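To avoid re-installing packages that are already present, one can first check which ones are missing. The following is a base R sketch; the helper missing_pkgs is hypothetical and not part of pacman. The example deliberately checks two packages that ship with R, so nothing is downloaded when it runs.

```r
# returns the names of the packages that are not yet installed
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# 'stats' and 'utils' ship with R; replace them with the packages you need
to_install <- missing_pkgs(c("stats", "utils"))

# install only what is missing
if (length(to_install) > 0) {
  install.packages(to_install)
}
```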

Tips

(Almost) Every package has a designated website. Visit the package website for examples on how to use and also to identify the functions contained. For example https://www.tidyverse.org/.

Call the package documentation by typing in a question mark followed by the name of the package or function contained in a package. For example:

# Make sure the package is installed and activated 
library(tidyverse)
# Call the help, which opens in the Help tab of the Files/Plots panel on the lower right side of the screen
?tidyverse

pipe %>% operator

The pipe operator %>% chains several steps into a single expression. Without it, each step would require creating an intermediate object that is then passed on to the next operation.

# first, inspect the dataset mtcars

head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Single filter

# we filter the column cyl such 
# that only cars with a cyl < 5 are displayed
mtcars %>% filter(cyl < 5)
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona  21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2  26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Multiple filters

Without %>%.

a<-filter(mtcars, cyl < 5)
b<-filter(a, hp > 100)
b
              mpg cyl  disp  hp drat    wt qsec vs am gear carb
Lotus Europa 30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
Volvo 142E   21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2

With %>%.

mtcars %>% filter(cyl < 5) %>% filter(hp > 100)
              mpg cyl  disp  hp drat    wt qsec vs am gear carb
Lotus Europa 30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
Volvo 142E   21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2
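The same two-step filter can also be written without any package, using the native pipe |> and subset() from base R. This sketch assumes R version 4.1 or later, where the native pipe is available.

```r
# base R alternative: native pipe |> with subset()
result <- mtcars |>
  subset(cyl < 5) |>
  subset(hp > 100)

result
```

For simple chains like this one, |> and %>% behave the same way; the object on the left is passed as the first argument of the function on the right.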

Base R vs. Packages

Use something complex but stable or something simple but unstable?!

Base R vs. Packages

Base R is complex but stable.

Packages are simple(r) to use but depend on the community for their maintenance.

Use groundhog

To circumvent this danger, one could use the package groundhog. Read more on https://groundhogr.com/.
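Independently of groundhog, it helps to record which versions were used when the code was last run. This is a minimal base R sketch; the package stats is used only because it ships with every R installation.

```r
# which R version is running?
r_version <- R.version.string

# which version of a given package is installed?
stats_version <- packageVersion("stats")

cat("R version:    ", r_version, "\n")
cat("stats version:", as.character(stats_version), "\n")

# sessionInfo() prints the full picture, including all loaded packages
```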

DIY – Write a function

Write a function using base r or packages.

Think of something simple and repetitive that can be applied to either the iris or the mtcars dataset. Or both.

Tip

This exercise paves the way for Day 4 when we work on shiny apps.

DIY – Clone a repository

With Dr. Ranjit SINGH from GESIS - Leibniz Institute for the Social Sciences, I prepared a workshop on r for beginners. All the material is open access via GitHub.

Clone the repository on your local machine and do the exercises.

Navigate first to the page of the repository and then clone it to your local machine: https://github.com/adrianvstanciu/rworkshop_open.

Assignment

If anyone needs the 2 ECTS credits, please let me know.

Of course, if you want an assignment without needing the ECTS credits, also approach me.

References

Stanciu, A., Cohrs, C. J., Hanke, K., & Gavreliuc, A. (2017). Within-culture variation in the content of stereotypes: Application and development of the stereotype content model in an Eastern European culture. The Journal of Social Psychology, 157(5), 611–628. https://doi.org/10.1080/00224545.2016.1262812