About myself

Assistant professor in lifespan developmental psychology (University of Luxembourg).

Research interests in ageing, culture, mental health, technology.

I have been using R since 2021.

More on my website: https://adrianstanciu.eu.

About this workshop

A combination of input and practical sessions.

Use the available online book https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/.

Why I think the approach covered in this workshop has value:

  1. Integrated workflow through programming IDE

  2. Open source

  3. Universe of possibilities

Required material:

Do it yourself – DIY

Please follow the instructions to set up your work environment.

  • Install git first
  • Then link GitHub with your local machine via an encrypted channel
  • You should now have access to git command line directly in the Terminal in RStudio (see next slides)

Origins

R is a programming language derived from S, which was distributed as a commercial package (S-PLUS).

Developed in 1991 at the University of Auckland (NZ) by Ross Ihaka and Robert Gentleman.

In 1995, it became open source thanks to contributions by Martin Mächler.

Figure 1: Creators of R as we know it today (panels: creators of R; made R open source).

R and RStudio

RStudio is an Integrated Development Environment (IDE) for R. Its panes are:
  • Source: Text, code, images are rendered into a final output document
  • Environment/History: An overview of what is available in the work environment (packages and code)
  • Console/Terminal: Direct access to R (Console) and git command line (Terminal)
  • Files/Plots/Pkgs/Help: Direct access to input and output material

Lines of code

Working with objects in R.

# creates an object called mean
# (it shares its name with the function mean(); R still finds the function when it is called)
mean<-mean(c(1,2,5,7,8,9))
mean
[1] 5.333333
# and then adds a constant to the object
result<-mean+3
result
[1] 8.333333

Programming languages including R use vectors.

Vectors are scalable objects and can be numeric, character, or logical. A vector holds a single type: if types are combined, R converts all elements to a common type.
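A small sketch of what happens when types are mixed inside c(). The object names (mixed, num_logical, mixed_list) are made up for this illustration.

```r
# mixing numbers, text and logicals: every element becomes character
mixed <- c(1, "a", TRUE)
class(mixed)
mixed

# mixing numbers and logicals: TRUE/FALSE are converted to 1/0
num_logical <- c(1, TRUE, FALSE)
class(num_logical)

# a list keeps the original type of each element
mixed_list <- list(1, "a", TRUE)
sapply(mixed_list, class)
```

If elements of different types need to stay together unchanged, a list (or a data table with one column per type) is usually the better container.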

# example of numeric vectors
vec1<-c(1,3,66,9,121)
vec1
[1]   1   3  66   9 121
# example of character string vector
vec2<-c("A","Ab","This or that","C","d")
vec2
[1] "A"            "Ab"           "This or that" "C"            "d"           
# example of logical vector
vec3<-c(TRUE,TRUE, FALSE, TRUE)
vec3
[1]  TRUE  TRUE FALSE  TRUE

Observations for measured variables are stored as vectors in datasets.

Datasets have different formats: .sav (SPSS native), .xlsx (Microsoft Excel native), .RData (R native), as well as .dat, .csv, .asci (cross platform).

Datasets created or modified in R must be saved as R objects or exported in specific file formats if needed at a later time.
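As a minimal sketch of saving and exporting, assuming a small made-up data table called dat and temporary files (so nothing on your machine is overwritten). It uses saveRDS(), which stores a single R object, as one alternative to the .RData format; write.csv() produces a cross-platform file.

```r
# a small example data table (the name `dat` is only for illustration)
dat <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# save as an R object and read it back
rds_file <- tempfile(fileext = ".rds")
saveRDS(dat, rds_file)
dat_rds <- readRDS(rds_file)

# export as a cross-platform .csv file and read it back
csv_file <- tempfile(fileext = ".csv")
write.csv(dat, csv_file, row.names = FALSE)
dat_csv <- read.csv(csv_file)

# check that nothing was lost on the way
identical(dat, dat_rds)
all.equal(dat, dat_csv)
```

In a real project you would replace tempfile() with a path inside your project folder.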

# create a simple data table
df<-data.frame(col1=vec1,
                  col2=vec2)
df
  col1         col2
1    1            A
2    3           Ab
3   66 This or that
4    9            C
5  121            d

One can then access specific observations in the dataset.

# access col1
df[,1]
[1]   1   3  66   9 121
# access first row
df[1,]
  col1 col2
1    1    A
# access entry at first row and col1
df[1,1]
[1] 1
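Positions are not the only way to access observations. The sketch below rebuilds the same data table and accesses it by column name and by a logical condition; it is an illustration, not the only possible syntax.

```r
# the same data table as above, rebuilt so the example runs on its own
df <- data.frame(col1 = c(1, 3, 66, 9, 121),
                 col2 = c("A", "Ab", "This or that", "C", "d"))

# access col1 by name (same result as df[, 1])
df$col1

# access col2 by name with double square brackets
df[["col2"]]

# access all rows in which col1 is larger than 5
df[df$col1 > 5, ]

# access the entry at row 3 of col2
df[3, "col2"]
```

Accessing by name tends to be safer than accessing by position, because the code keeps working when columns are reordered.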

One can furthermore ask for dataset summaries.

# checks the elements of the data table
str(df)
'data.frame':   5 obs. of  2 variables:
 $ col1: num  1 3 66 9 121
 $ col2: chr  "A" "Ab" "This or that" "C" ...
# provides a summary of the data table
summary(df)
      col1         col2          
 Min.   :  1   Length:5          
 1st Qu.:  3   Class :character  
 Median :  9   Mode  :character  
 Mean   : 40                     
 3rd Qu.: 66                     
 Max.   :121                     

We can perform operations on specific elements of the dataset.

# an addition on the numeric vector of the data table
df[,1]+100
[1] 101 103 166 109 221
# an addition on the numeric element at the intersection of row 5 and column 1
df[5,1]+123
[1] 244

Functions

Functions are powerful tools in R. With functions we can automate specific redundant tasks.

Identifying which tasks are redundant is one step towards understanding programming.

We can create our own functions following this simple structure.

function(){}.

() defines the function arguments.

{} contains the function body, that is, the code that is run when the function is called.
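A minimal sketch of this structure, using a hypothetical function called add_constant with two arguments, one of which has a default value.

```r
# `x` and `n` are the arguments; `n = 1` sets a default value for n
add_constant <- function(x, n = 1) {
  # the body: the value of the last expression is returned
  x + n
}

# uses the default n = 1
add_constant(c(1, 2, 3))

# overrides the default
add_constant(c(1, 2, 3), n = 10)
```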

We create a simple function that adds a constant to numeric columns in datasets.

We use the pre-installed datasets in R: iris and mtcars.

What we’d like to do.

  1. Identify numeric vectors.
  2. Add a specific constant to numeric vectors.

# adds 3 to all numerical columns
head(iris[,1:4] + 3)

# add 77 to all numerical columns 
head(iris[,1:4] + 77)

If we wrote this into a function, it would look like this.

# this function takes two arguments: a dataset 'df' and a constant 'n'
func1<-function(df,n){
  
  tmp <- Filter(is.numeric, df) # we first filter the dataframe for numeric columns
  
  tmp + n # we then add the constant to all the numeric columns
}

Let’s see if it works.

# we apply the function and add 3 to all numeric columns of iris
# we only ask to see the first six rows of the outcome using head()
# note that here we've combined two functions
## func1() which is the one we've created is first computed
## then head() is applied to the results of the previous function
head(func1(iris,3))
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1          8.1         6.5          4.4         3.2
2          7.9         6.0          4.4         3.2
3          7.7         6.2          4.3         3.2
4          7.6         6.1          4.5         3.2
5          8.0         6.6          4.4         3.2
6          8.4         6.9          4.7         3.4
# we apply the function and add 99 to all numeric columns of another pre-installed dataset 'mtcars'
# we only ask to see the first six rows of the outcome using head()
head(func1(mtcars,99))
                    mpg cyl disp  hp   drat      wt   qsec  vs  am gear carb
Mazda RX4         120.0 105  259 209 102.90 101.620 115.46  99 100  103  103
Mazda RX4 Wag     120.0 105  259 209 102.90 101.875 116.02  99 100  103  103
Datsun 710        121.8 103  207 192 102.85 101.320 117.61 100 100  103  100
Hornet 4 Drive    120.4 105  357 209 102.08 102.215 118.44 100  99  102  100
Hornet Sportabout 117.7 107  459 274 102.15 102.440 116.02  99  99  102  101
Valiant           117.1 105  324 204 101.76 102.460 119.22 100  99  102  100

Packages

An R package contains code, documentation, and sometimes even data.

Packages typically do not come pre-installed, so they need to be installed before use.

We should first know which packages we need, then install them, and finally activate them in our work environment.

# we might have to set up a mirror first!
# mirror is the website from which r will install packages
r <- getOption("repos")
r["CRAN"] <-"https://cloud.r-project.org/"
options(repos=r)

# installs `tidyverse`
install.packages("tidyverse")

# loads the package into the current R session
# this step is needed in every new session before you can use the functions it contains
library(tidyverse)
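Installing takes time, so it can help to check first which packages are already present. The sketch below uses a hypothetical helper called is_installed; the package names are the ones used in this workshop, and the install line is commented out on purpose.

```r
# hypothetical helper: TRUE if a package is already installed
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# packages we would like to have
pkgs <- c("tidyverse", "shiny")

# keep only the ones that are not installed yet
missing_pkgs <- pkgs[!vapply(pkgs, is_installed, logical(1))]

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install them
}
```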

R Packages with websites

(Almost) every package has a dedicated website. Visit the package website for usage examples and for an overview of the functions it contains. For example https://www.tidyverse.org/.

pipe %>% operator

The pipe operator %>% chains a sequence of steps into a single expression. Without it, each step would require creating an intermediate object that is then passed to the next operation.

It comes from the package magrittr and is made available when tidyverse is loaded.

It simplifies the workflow considerably.

Let’s apply multiple filters to a dataset. We are only interested in the final output.

mtcars %>% # we use the pre-installed data mtcars
  filter(cyl < 5) %>% # we apply the 1st filter on column cyl
  filter(hp > 100) # we apply filter 2 on the results of the filtered data
              mpg cyl  disp  hp drat    wt qsec vs am gear carb
Lotus Europa 30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
Volvo 142E   21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2
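For comparison, here is a sketch of the same two filters written without %>%. It uses base R only: first with intermediate objects, then with the native pipe |>, which assumes R version 4.1 or newer. The object names step1, step2 and piped are made up for this illustration.

```r
# without a pipe: each step creates an intermediate object
step1 <- mtcars[mtcars$cyl < 5, ]   # 1st filter on column cyl
step2 <- step1[step1$hp > 100, ]    # 2nd filter on the filtered data

# with the native pipe (base R, version 4.1 or newer)
piped <- mtcars |>
  subset(cyl < 5) |>
  subset(hp > 100)

# both approaches select the same cars
rownames(step2)
rownames(piped)
```

The piped version avoids the intermediate objects step1 and step2, which is the convenience the pipe operator offers.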

Introduction to website creation

Read this for further details.

https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/publish.html#website

Self-publishing online with R is one way to integrate different work routines towards a larger goal, such as communicating your own research or presenting data science consultancy work, to name just two examples.

We can only create static websites!

My website, for instance: https://adrianstanciu.eu/

We need

  • git and a GitHub account (see DIY)
  • quarto (see DIY)
  • a clear-to-us final product
  • imagination

We also need to install these packages first.

# set up CRAN mirror to download packages
r <- getOption("repos")
r["CRAN"] <-"https://cloud.r-project.org/"
options(repos=r)

# install packages
install.packages("tidyverse") # suite of functions
install.packages("rmarkdown") # helpful tools for qmd editing
install.packages("shiny") # needed for web applications

# activate packages
library(tidyverse)
library(shiny)
library(rmarkdown)

Assuming we have all the DIY in place, we create a working, albeit rudimentary, website structure using quarto website projects.

Note that this structure already looks like a website, but it must be deployed on an online server (for example GitHub) before it becomes a real website that anyone can access from anywhere.

We will not cover deployment in this workshop. Details are given in the R beyond online book (see previous slides).

To create a quarto website project, open RStudio, navigate to File / New Project and then select New Directory from the list.

Scroll down to Quarto Website, click on this option, follow the instructions, and a rudimentary website structure will be automatically generated in the chosen directory.

More details are given in the R beyond online book (see previous slides).

The default quarto website structure contains the following: .qmd, .yml, .css as well as a folder “_site”.

Let’s explore them one by one.

.qmd stands for quarto markdown which is a specific type of markdown language document.

Markdown documents are documents that can render text, code, images, and so on, into a single desired final document format: html, pdf, docx, etc.

This is the source file that we will edit. For the basics of qmd editing, see this website: https://quarto.org/docs/authoring/markdown-basics.html.

.yml stands for YAML Ain’t Markup Language, and is used for final document formatting purposes.

Note the .yml file formats the entire website.

Meanwhile, the yaml header (see first lines of the .qmd) formats each individual page.

Note at line 4 of the .yml file the description of the project: website. Here we specify attributes of the website, such as which pages (individual .qmd files) to render and in which order.

Note at line 12 of the .yml file the format section. Here we specify the final output file type: html.

We also specify formatting attributes for these final files. The css entry (line 15) is especially important because this file allows us to apply custom formatting to the website.
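To make the line references concrete, the sketch below writes a .yml file modelled on what a newly generated quarto website project typically contains and prints it with line numbers. The title, theme and page names are assumptions and may differ in your project or quarto version; the file is written to a temporary folder so that no real project file is overwritten.

```r
# content modelled on a default quarto website configuration (assumed values)
quarto_yml <- c(
  "project:",
  "  type: website",
  "",
  "website:",
  "  title: \"My website\"",
  "  navbar:",
  "    left:",
  "      - href: index.qmd",
  "        text: Home",
  "      - about.qmd",
  "",
  "format:",
  "  html:",
  "    theme: cosmo",
  "    css: styles.css",
  "    toc: true"
)

# write the file to a temporary folder
yml_file <- file.path(tempdir(), "_quarto.yml")
writeLines(quarto_yml, yml_file)

# print the file with line numbers
cat(sprintf("%2d  %s", seq_along(quarto_yml), quarto_yml), sep = "\n")
```

In this sketch the website section starts at line 4, the format section at line 12, and the css entry sits at line 15.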

.css stands for Cascading Style Sheets, a style sheet language used to specify formatting options for HTML documents.

One can create custom css styles for their website or use pre-defined templates from the Internet.

Warning

Always be careful what you download from the Internet!

Quick deployment

Assuming you have an account on https://quartopub.com/, you can deploy your website so that everyone, everywhere has access to it.

We can deploy directly from Terminal (lower-left panel in RStudio) to quartopub.com using

quarto render

then

quarto publish quarto-pub

Note that this approach will deploy your website without a custom domain!

To use a custom domain you’d need to deploy via GitHub (see DIY).

Introduction to web application creation

Read this for further details.

https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/apps.html

Not this kind of applications:

Applications in the common sense - Spotify, Instagram, etc.

But rather:

Applications serving a supportive role for your research, your skills or expertise.

Applications wrapped around repetitive code.

Web application examples

Predicted as observed by Dr. Julian Kohnke. Read paper by Witte et al. (2022).

Web application examples

Values in Europe by Dr. Maksim Rudnev.

Web application examples

An example of, yes, web application, but no, not a very user friendly one (from my work, sic!).

Social perception during COVID-19 by Jessica A. Herzig. Read paper by Friehs et al. (2022).

To create a working, yet rudimentary, web application, navigate to File / New Project, then select New Directory, and finally Shiny Application.

Follow the rest of the given instructions.

More details on setting up the work environment, along with further tips, are given in the R beyond online book (see previous slides).

Make sure you’ve installed the package shiny (see previous slides).

Shiny App structure

Shiny apps have a user interface (UI) that is wrapped around code that runs in the background on a server.

When programming a shiny app, we therefore need to program both the design (UI) and the code that runs on the server (server).

# copy and paste these lines of code in the Console
# a shiny app will be automatically shown

library(shiny)
runExample("01_hello")

UI

library(shiny)
library(bslib)

# Define UI for app that draws a histogram ----
ui <- page_sidebar(
  
  # App title ----
  title = "Hello Shiny!",
  
  # Sidebar panel for inputs ----
  sidebar = sidebar(
   
    # Input: Slider for the number of bins ----
    sliderInput(
      inputId = "bins",
      label = "Number of bins:",
      min = 1,
      max = 50,
      value = 30
    )
  ),
  # Output: Histogram ----
  plotOutput(outputId = "distPlot")
)

Let’s discuss the main elements one-by-one.

Code snippet from the official shiny app website https://shiny.posit.co/r/getstarted/shiny-basics/lesson1/

The UI part makes a shiny app attractive to the audience and, if programmed right, can engage the audience in an interactive and dynamic manner.

Programming the UI part requires a bit of orientation toward the audience for which the app is designed.

Pre-work

Before coding the app itself, think of these and similar questions:

What are the minimum skills required by your audience to operate the app?

What theoretical and practical expertise is expected from the audience to intuitively navigate the app?

In the UI part, we need to refer to objects from the server part.

If the UI does not refer to the server objects correctly, the app might still run, but the audience will not see the corresponding output.

Commas and brackets!

Make sure that you always use commas and close the brackets appropriately. Otherwise, the design might not look as intended or the entire code might even break.

Take some time to decide what you want to include in the app and what your audience needs.

For example, do you want the audience to view plots or tables, and if yes, do you want these to be interactive?

What code do you need to write on the server part and what is the final R object that you'd want to be displayed for the audience via the UI?

Figure 2: UI code and corresponding shiny outcome.

Refer to this material to start building your UI: https://adrian-stanciu.quarto.pub/r-beyond-data-analysis/apps.html

Server

# Define server logic required to draw a histogram ----
server <- function(input, output) {

  # Histogram of the Old Faithful Geyser Data ----
  # with requested number of bins
  # This expression that generates a histogram is wrapped in a call
  # to renderPlot to indicate that:
  #
  # 1. It is "reactive" and therefore should be automatically
  #    re-executed when inputs (input$bins) change
  # 2. Its output type is a plot
  output$distPlot <- renderPlot({

    x    <- faithful$waiting
    bins <- seq(min(x), max(x), length.out = input$bins + 1)

    hist(x, breaks = bins, col = "#007bc2", border = "white",
         xlab = "Waiting time to next eruption (in mins)",
         main = "Histogram of waiting times")

    })

}

Let’s discuss the main elements one-by-one.

Code snippet from the official shiny app website https://shiny.posit.co/r/getstarted/shiny-basics/lesson1/

The server part makes a shiny app, well, work.

Here is where code is written to import, clean, manipulate and analyse data, metadata and all sorts of other things.

One way that I find helpful to think of the server part is to see it as the old-school R coding on my local machine.

What does the server do?!

The distinction between UI and server is less intuitive when we run the shiny app on the local machine. It becomes crucial, however, when we deploy the app on online repositories, as we will see shortly.

By structuring the app code into a UI part and a server part, we tell the respective online servers how to read and render the code.

The code for the server is a custom function. It is a very large and complex one, but still a custom function.

Custom function

Remember, custom functions look like function(){}.

The server function takes two arguments:

input signals what comes from the UI. That is, what the user of the app is inputting via the UI.

output signals what goes from the server to the UI. That is, what the user views as a result of interacting with the app.
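The link between the two arguments and the UI runs through ids. Below is a minimal sketch, assuming the package shiny is installed; the ids "n_points" and "scatter" are made up for this illustration. What the UI registers as inputId = "n_points" arrives on the server as input$n_points, and what the server stores in output$scatter is displayed by plotOutput(outputId = "scatter").

```r
library(shiny)

# UI: one input (a slider) and one output (a plot)
ui <- fluidPage(
  sliderInput(inputId = "n_points", label = "Number of points:",
              min = 10, max = 200, value = 50),
  plotOutput(outputId = "scatter")
)

# server: reads input$n_points, writes output$scatter
server <- function(input, output) {
  output$scatter <- renderPlot({
    plot(rnorm(input$n_points), rnorm(input$n_points),
         xlab = "x", ylab = "y")
  })
}

# combine both parts into one app object
app <- shinyApp(ui = ui, server = server)

# start the app only when R is used interactively
if (interactive()) runApp(app)
```

If the id in plotOutput() and the name after output$ do not match, the app still starts, but the plot area stays empty.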

The code and data for this shiny app example can be downloaded from the R beyond book. (See previous slides)

Watch out for reactive objects!

Reactive objects are plain R code wrapped inside an object that the server needs to compute.

Mind the brackets

Remember to always call it as such. It is an R object all right, but it is called like a function: reactiveobject().

Reactive objects are written inside reactive({YOUR SERVER CODE HERE}).

It is a specific code chunk for the server.

# A reactive object taken from the example app
# See previous slides
  
  tempdf <- reactive({
    # define the input object for data manipulation
    choice <- input$stereotype
    
    # takes the loaded data (a step not shown here)
    # then, manipulates the data
    # note that the output was not assigned to an object in the code
    # the output is assigned to the reactive object "tempdf()"
    dfex %>% 
      sjlabelled::remove_all_labels() %>% 
      pivot_longer(contains("warm") | contains("comp")) %>% 
      filter(name %in% choice)
  
  })
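A reactive object is only useful once it is called somewhere else on the server. The sketch below, assuming the package shiny is installed, reuses the histogram example from the earlier slides; the reactive object name breaks is made up for this illustration.

```r
library(shiny)

server <- function(input, output, session) {

  # reactive object: recomputed whenever input$bins changes
  breaks <- reactive({
    x <- faithful$waiting
    seq(min(x), max(x), length.out = input$bins + 1)
  })

  output$distPlot <- renderPlot({
    # mind the brackets: breaks() returns the current value,
    # whereas breaks without brackets is the reactive object itself
    hist(faithful$waiting, breaks = breaks(),
         xlab = "Waiting time to next eruption (in mins)",
         main = "Histogram of waiting times")
  })
}
```

Forgetting the brackets is a common source of errors, because R then passes the reactive object itself rather than its value to the plotting function.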

Run the app

Running the app locally is as simple as pressing the button Run App.

This in fact calls a function written inside the app.R script.

This function is shinyApp(ui = ui, server = server).

Deploy the app

When you are satisfied with the app after inspecting it on your local machine, you can now deploy it online so that everyone, everywhere can access it.

We would need a dedicated server and, of course, an access account on that server.

One efficient and smooth way to deploy a shiny app online is to use the dedicated server https://www.shinyapps.io/.

Practical session

Start building your website or a shiny app.

Use the remaining time to become more familiar with either of these options: web applications or website.

You have access to the slides also in HTML format from this site: https://rpubs.com/adrianstanciu/r-beyond-ecp.

I will walk around the room and answer any questions.

Thank you!

Did you know that the capybara is the biggest rodent in the world?

Contact: adrian.stanciu[at]uni.lu

Reference list

Friehs, M. T., Kotzur, P. F., Kraus, C., Schemmerling, M., Stanciu, A., et al. (2022). Warmth and competence perceptions of key protagonists are associated with containment measures during the COVID-19 pandemic: Evidence from 35 countries. Scientific Reports, 12, 21277. https://doi.org/10.1038/s41598-022-25228-9
Witte, E. H., Stanciu, A., & Zenker, F. (2022). Predicted as observed? How to identify empirically adequate theoretical constructs. Frontiers in Psychology, 13, 980261. https://doi.org/10.3389/fpsyg.2022.980261