Agenda

  1. Set up R and RStudio
  2. Understand the layout and functionality of RStudio
  3. Experiment with basic R expressions and data types
  4. Try out RStudio.cloud

Short survey

Introduction

R is a programming environment

  • uses a well-developed but simple programming language
  • allows for rapid development of new tools according to user demand
  • supports data analytics tasks

Downloading and installing R

RGui

RGui is an interactive R environment that comes with the R installation, but it is very basic and not very user-friendly.

RStudio

RStudio is a development environment for R, and provides many advanced features to improve efficiency and ease of use for R users.

Downloading and installing RStudio

Getting started with RStudio

RStudio: console panel

This is the most important panel, because this is where R actually executes commands and prints their results.

RStudio: editor panel

Collections of commands (scripts) can be edited and saved.

RStudio: a typical workflow

RStudio: a typical workflow (cont.)

There are ways to speed up the workflow:

  • If you don’t select any code, RStudio runs the line where the blinking cursor is

  • Instead of clicking the “Run” icon, you can just use the keyboard shortcut: Ctrl + Enter

RStudio: run the whole R script

RStudio: save your R script

RStudio: customization

Menu (on the top) → Tools → Global Options

Operators: Arithmetic

7 + 5
## [1] 12
7 - 5
## [1] 2
7 * 5
## [1] 35
7 / 5
## [1] 1.4

Like a calculator, R also has many functions that let you do more sophisticated manipulations.

round(2.05)
## [1] 2
factorial(3)  # 3! = 3 * 2 * 1
## [1] 6
sqrt(9)       # square root 
## [1] 3
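
As a small supplement to the four operators above, R also has operators for exponentiation, integer division, and remainder. These three are not on the original slide and are shown only as a quick sketch:

```r
7 ^ 2     # exponentiation
## [1] 49
7 %/% 5   # integer division (quotient only)
## [1] 1
7 %% 5    # remainder (modulo)
## [1] 2
```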

Commenting out code

Use # (number sign) to comment your code

# 2+2
# commented out and will not be evaluated
  • R will ignore anything in a line that follows #
  • It is a good idea to comment your code so that others can understand what you are trying to do.
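
A minimal sketch of the two common comment placements, a full-line comment and a comment that follows code on the same line (the expressions are arbitrary illustrations):

```r
# A full-line comment: R ignores this entire line
2 + 2   # a comment can also follow code on the same line
## [1] 4
# 3 + 3   (commented out, so it is not evaluated and prints nothing)
```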

Getting Help

There will be many occasions where you want to learn more about a built-in command or function. Type help(function_name) or ?function_name to get more information. For example:

help(factorial)
?factorial

Use two question marks ?? to search the whole help database, especially when you don’t know the exact function name. For example,

??read
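
Two further lookups can complement the help commands above. This is only a sketch: args() and apropos() are standard R functions, but the search pattern "^read" is an arbitrary illustration, and the output is omitted because it depends on which packages are loaded.

```r
args(paste)        # show the arguments that paste() accepts
apropos("^read")   # list loaded objects whose names start with "read"
```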

Your turn (30 seconds)

What do you think this command will return?

factorial(round(2.05) + 1)

R always works from the innermost parentheses to the outermost (just like a calculator)

factorial(round(2.05) + 1)

→ factorial(2 + 1)

→ factorial(3)

→ 6
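
The same evaluation order can be reproduced by hand, storing each intermediate result before moving outward. This sketch uses the assignment operator, which is introduced later in these slides; the names inner and middle are hypothetical.

```r
inner <- round(2.05)    # innermost call is evaluated first
inner
## [1] 2
middle <- inner + 1     # then the addition
middle
## [1] 3
factorial(middle)       # finally the outermost call
## [1] 6
```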

Data types

R can recognize different types of data:

  • numbers
  • character strings (text)
  • logical
  • factor

Numeric

Any number, no quotes.

Appropriate for math.

1 + 1
200000
sqrt(9)
class(0.3)   
## [1] "numeric"
# "class()" is a function that shows the data type of an input

Character

Any symbols surrounded by single quotes (') or double quotes (")

class("Unstructured Data Management")
## [1] "character"
nchar('Unstructured Data Management')
## [1] 28
toupper("Unstructured Data Management")
## [1] "UNSTRUCTURED DATA MANAGEMENT"
paste("Unstructured", "Data", "Management", sep="_") 
## [1] "Unstructured_Data_Management"
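
A few more string functions in the same spirit, as a brief sketch. tolower(), substr(), and paste0() are base R functions that do not appear on the original slide.

```r
tolower("Unstructured Data Management")
## [1] "unstructured data management"
substr("Unstructured Data Management", 1, 12)   # characters 1 through 12
## [1] "Unstructured"
paste("Data", "Management")                     # default separator is a space
## [1] "Data Management"
paste0("Data", "Management")                    # paste0() uses no separator
## [1] "DataManagement"
```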

Your turn

How many characters are in the following strings:

Use paste to join the following words so that the result looks like How#are#you? (hint: ?paste)

  • How
  • are
  • you?

Logical

Logical values are either TRUE or FALSE (Note: they are uppercase).

2 + 3 == 5   # use '==' to check whether two values are equal
## [1] TRUE
3 < 2
## [1] FALSE
TRUE == T    # T is shorthand for TRUE; F is shorthand for FALSE
## [1] TRUE
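
Logical values can also be negated and combined. The sketch below uses the base R operators !=, !, | and &, which are not on the original slide. Note that T and F are ordinary variables that can be reassigned, whereas TRUE and FALSE are reserved words, so spelling them out is the safer habit.

```r
5 != 3              # not equal
## [1] TRUE
!TRUE               # negation
## [1] FALSE
(3 < 2) | (2 < 3)   # OR: TRUE if at least one side is TRUE
## [1] TRUE
(3 < 2) & (2 < 3)   # AND: TRUE only if both sides are TRUE
## [1] FALSE
```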

Factor

R’s form of categorical data. Stored as integer codes with a set of labels (i.e. levels)

states <- factor(c("FL", "GA", "AZ"))
states
## [1] FL GA AZ
## Levels: AZ FL GA
class(states)
## [1] "factor"
levels(states)
## [1] "AZ" "FL" "GA"
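
To make the "integer codes plus labels" idea concrete, the sketch below exposes the codes behind the same states factor. Each code is the position of the value within levels(states), which are sorted alphabetically by default.

```r
states <- factor(c("FL", "GA", "AZ"))
as.integer(states)                   # FL is level 2, GA is level 3, AZ is level 1
## [1] 2 3 1
nlevels(states)                      # number of distinct levels
## [1] 3
levels(states)[as.integer(states)]   # map the codes back to their labels
## [1] "FL" "GA" "AZ"
```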

More about types

The is.XYZ(x) functions return TRUE/FALSE for whether x is of type XYZ

The as.XYZ(x) functions (try to) “cast” x to type XYZ, that is, to translate it sensibly into an XYZ-type value

is.integer(1.5)
## [1] FALSE
is.numeric(7)
## [1] TRUE
is.character(7)
## [1] FALSE
is.character("7")
## [1] TRUE

as.character(5/2)
## [1] "2.5"
as.numeric(as.character(5/2))
## [1] 2.5
2 * as.numeric(as.character(5/2))
## [1] 5
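
A few edge cases show what "tries to" means in practice. This is a sketch with arbitrary inputs; the L suffix for integer literals is standard R but does not appear in the original slides.

```r
is.integer(7)        # a plain 7 is stored as a double, not an integer
## [1] FALSE
is.integer(7L)       # the L suffix creates an integer
## [1] TRUE
as.integer(2.9)      # conversion truncates toward zero; it does not round
## [1] 2
as.numeric("abc")    # no sensible translation: the result is NA, with a warning
## [1] NA
as.logical("TRUE")
## [1] TRUE
```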

Data can have names

We can give names to data objects; these give us variables

Variables are created with the assignment operator, <- or =

Be careful: R is a case-sensitive language. FOO, Foo, and foo are three different variables!

x = 2      # use the equal sign to assign value
y <- 3     # you can also use an arrow to assign value
x          # print the value of a variable by typing its name
## [1] 2
x * y
## [1] 6

The assignment operator also changes values:

x
## [1] 2
x <- 8
x
## [1] 8

Using names and variables makes code easier to design, easier to debug, less prone to bugs, easier to improve, and easier for others to read
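
A variable can also be updated from its own current value, and names that differ only in case refer to different variables. A minimal sketch with arbitrary values:

```r
x <- 8
x <- x + 1    # the right-hand side is evaluated first, then stored back in x
x
## [1] 9
X <- 100      # X is a separate variable from x
x + X
## [1] 109
```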

Variable names

Variable names cannot begin with a number or an underscore. It is wise to avoid special characters, except for the period (.) and underscore (_)

Examples of valid names:
  • a
  • b
  • FOO
  • my_var
  • .day

Examples of invalid names:
  • 1
  • 2nd
  • ^mean
  • !bad
  • $
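
One way to check the lists above is with base R's make.names(), which rewrites a string into a syntactically valid name. The helper is_valid_name below is hypothetical, written for this illustration: it treats a name as valid when make.names() leaves it unchanged.

```r
# Hypothetical helper: TRUE when make.names() does not need to alter the name
is_valid_name <- function(name) make.names(name) == name

is_valid_name(c("a", "FOO", "my_var", ".day"))
## [1] TRUE TRUE TRUE TRUE
is_valid_name(c("1", "2nd", "^mean", "!bad", "$"))
## [1] FALSE FALSE FALSE FALSE FALSE
make.names("2nd")   # R's suggested repair for an invalid name
## [1] "X2nd"
```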

Notes about writing your R code

A command can spread across multiple lines. This can often improve readability.

x = paste("How", "are", "you?",   # what happens if you only enter this line
          sep=" ")
x
## [1] "How are you?"
  • We can put multiple commands in the same line, but they need to be separated by a semicolon (;)
a = 1; b = 2
a + b
## [1] 3

Your turn

  1. Create variables f_name and l_name whose values are your own first and last names
  2. Get the number of characters in f_name and l_name and save them to length_f_name and length_l_name respectively
  3. Use the paste() function to get your whole name
  4. Multiply length_f_name by length_l_name
  5. Divide length_f_name by length_l_name
  6. Show if length_f_name is greater than length_l_name

RStudio.cloud

RStudio.cloud (https://rstudio.cloud/) provides a similar interface in a web browser, with no installation required.

The free tier (Cloud Free) can handle most lightweight operations

Notes (1/3)

  1. The R code from this lab will be posted under the “In class R script” section of the content area.

  2. No lab assignment for this week.

  3. Remember to install R and the RStudio IDE before next class.

    • The first lab assignments are due by Friday 4:30 pm on the week of the second class.

Notes (2/3): Learn R, in R

The swirl package lets you learn R by following examples inside the console. It may feel a bit retro, but it’s one of the recommended ways to get started with R while familiarizing yourself with its interface.

install.packages("swirl")
library("swirl")
swirl()

Notes (3/3): Learn R, from GSU Library workshop