Agenda

  1. Set up R and RStudio
  2. Understand the layout and functionality of RStudio
  3. Experiment with basic R expressions and data types

Introduction

R is a programming environment that:

  • uses a well-developed but simple programming language
  • allows for rapid development of new tools according to user demand
  • supports data analytics tasks

Downloading and installing R

RGui

RGui is an interactive R environment that comes with the R installation, but it is very basic and not very user-friendly.

RStudio

RStudio is a development environment for R, and provides many advanced features to improve efficiency and ease of use for R users.

Downloading and installing RStudio

Getting started with RStudio

RStudio: console panel

This is the most important panel, because this is where R actually executes your commands and prints the results.

RStudio: editor panel

Collections of commands (scripts) can be edited and saved.

RStudio: a typical workflow

RStudio: a typical workflow (cont.)

There are ways to speed up the workflow:

  • If you don’t select any code, RStudio will run the line that contains the blinking cursor

  • Instead of clicking the “Run” icon, you can use the keyboard shortcut Ctrl + Enter (Cmd + Return on macOS)

RStudio: run the whole R script

RStudio: save your R script

RStudio: customization

Menu (on the top) -> Tools -> Global Options

Operators: Arithmetic

7 + 5
## [1] 12
7 - 5
## [1] 2
7 * 5
## [1] 35
7 / 5
## [1] 1.4
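
A few more arithmetic operators, as a small sketch. The operators ^, %/% and %% are standard R but are not covered in the original slides; the numbers are only for illustration.

```r
7 ^ 2        # exponentiation
## [1] 49
7 %/% 5      # integer division
## [1] 1
7 %% 5       # remainder (modulo)
## [1] 2
7 + 5 * 2    # multiplication is done before addition
## [1] 17
(7 + 5) * 2  # parentheses change the order
## [1] 24
```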

Like a calculator, R also has many functions that let you do more sophisticated manipulations.

round(2.05)
## [1] 2
factorial(3)  # 3! = 3 * 2 * 1
## [1] 6
sqrt(9)       # square root 
## [1] 3
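
Many functions accept extra arguments that adjust what they do. The calls below are a minimal illustration; the digits and base arguments are standard R but do not appear in the original slides.

```r
round(3.14159, digits = 2)  # keep two decimal places
## [1] 3.14
log(100, base = 10)         # logarithm with base 10
## [1] 2
abs(-4)                     # absolute value
## [1] 4
```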

Note:

  • Use # (the number sign) to comment your code
  • R ignores everything on a line after #
  • It is a good idea to comment your code so that others can understand what you are trying to do.

Getting Help

There will be many occasions where you want to learn more about a built-in command or function. Type help(function_name) or ?function_name to get more information. For example:

help(factorial)
?factorial

Use two question marks to search the whole help database, which is especially useful when you don’t know the exact function name. For example:

??read
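
Two other helpers can be handy alongside help() and ??. This is only a sketch: args() and apropos() are standard R functions, but they are not part of the original slides, and the exact output depends on your R version and loaded packages.

```r
args(round)        # show the arguments that round() accepts
apropos("^read")   # list objects whose names start with "read"
```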

Your turn (30 seconds)

What do you think this command will return?

factorial(round(2.05) + 1)

R always works from the innermost parentheses to the outermost (just like a calculator)

factorial(round(2.05) + 1)

–> factorial(2 + 1)

–> factorial(3)

–> 6
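
The same inside-out rule applies when several functions are combined. The expression below is a made-up practice example, evaluated one step at a time.

```r
round(8.7)                               # innermost calls first
## [1] 9
factorial(3)
## [1] 6
round(8.7) + factorial(3) + 1            # then the arithmetic inside sqrt()
## [1] 16
sqrt(round(8.7) + factorial(3) + 1)      # finally the outermost call
## [1] 4
```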

Data types

R can recognize different types of data:

  • numbers
  • character strings (text)
  • logical
  • factor

Numeric

Any number, no quotes.

Appropriate for math.

1 + 1
200000
sqrt(9)
class(0.3)  # "class" is a function that shows the data type of its input
## [1] "numeric"

Character

Any symbols surrounded by single quotes (') or double quotes (")

class("Unstructured Data Management")
## [1] "character"
nchar('Unstructured Data Management')
## [1] 28
toupper("Unstructured Data Management")
## [1] "UNSTRUCTURED DATA MANAGEMENT"
paste("Unstructured", "Data", "Management", sep="_") 
## [1] "Unstructured_Data_Management"
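
A few more string functions, shown as a short sketch. tolower(), substr() and paste0() are standard R but are not covered in the original slides.

```r
tolower("Unstructured Data Management")
## [1] "unstructured data management"
substr("Unstructured Data Management", 1, 12)  # characters 1 through 12
## [1] "Unstructured"
paste0("Data", "Management")                   # paste with no separator
## [1] "DataManagement"
```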

Your turn

How many characters are in the following strings:

  • 123
  • Email
  • rzhang6@gsu.edu

Use paste to join the following words so that the result looks like How#are#you? (hint: ?paste)

  • How
  • are
  • you?

Logical

Logical values are either TRUE or FALSE (Note: they are uppercase).

2 + 3 == 5   # use '==' to check whether two values are equal
## [1] TRUE
3 < 2
## [1] FALSE
TRUE == T    # T is shorthand for TRUE; F is shorthand for FALSE
## [1] TRUE
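
Logical values can be combined and negated. The operators below are standard R, shown here as an illustration that goes slightly beyond the original slides.

```r
TRUE & FALSE   # AND: TRUE only if both sides are TRUE
## [1] FALSE
TRUE | FALSE   # OR: TRUE if at least one side is TRUE
## [1] TRUE
!TRUE          # NOT
## [1] FALSE
5 != 3         # "not equal to"
## [1] TRUE
class(3 < 2)
## [1] "logical"
```

Unlike TRUE and FALSE, the names T and F are ordinary variables that can be overwritten, so spelling out TRUE and FALSE is the safer habit.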

Factor

R’s form of categorical data. Stored as integer codes together with a set of labels (i.e., levels).

states <- factor(c("FL", "GA", "AZ"))
states
## [1] FL GA AZ
## Levels: AZ FL GA
class(states)
## [1] "factor"
levels(states)
## [1] "AZ" "FL" "GA"
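
To see the integer codes behind a factor, convert it with as.integer(). A small sketch that reuses the states example; nlevels() is a standard R function not used in the original slides.

```r
states <- factor(c("FL", "GA", "AZ"))
as.integer(states)    # position of each value in the sorted levels AZ, FL, GA
## [1] 2 3 1
nlevels(states)       # number of distinct levels
## [1] 3
as.character(states)  # back to plain text
## [1] "FL" "GA" "AZ"
```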

More about types

The is.XYZ(x) functions return TRUE or FALSE depending on whether x is of type XYZ

The as.XYZ(x) functions (try to) “cast” x to type XYZ, that is, to translate it sensibly into an XYZ-type value

is.integer(1.5)
## [1] FALSE
is.numeric(7)
## [1] TRUE
is.character(7)
## [1] FALSE
is.character("7")
## [1] TRUE

as.character(5/2)
## [1] "2.5"
as.numeric(as.character(5/2))
## [1] 2.5
2 * as.numeric(as.character(5/2))
## [1] 5
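
Casting does not always succeed, and some results may be surprising. The cases below are an illustrative sketch; the L suffix for integer literals is standard R but is not introduced in the original slides.

```r
as.numeric("abc")   # cannot be translated: the result is NA, with a warning
as.integer(2.9)     # the decimal part is dropped, not rounded
## [1] 2
as.numeric(TRUE)    # TRUE becomes 1, FALSE becomes 0
## [1] 1
is.integer(7)       # a plain 7 is stored as a decimal (double) number
## [1] FALSE
is.integer(7L)      # the L suffix creates a true integer
## [1] TRUE
```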

Data can have names

We can give names to data objects; these named objects are variables.

Variables are created with the assignment operator, <- or =

Be careful: R is a case-sensitive language. FOO, Foo, and foo are three different variables!

x = 2      # use the equal sign to assign value
y <- 3     # you can also use an arrow to assign value
x          # print the value of a variable by typing its name
## [1] 2
x * y
## [1] 6

The assignment operator also changes values:

x
## [1] 2
x <- 8
x
## [1] 8
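
Assignment copies the current value; it does not link two variables together. A minimal sketch with made-up values:

```r
x <- 2
y <- x      # y gets the current value of x
x <- 8      # reassigning x does not change y
y
## [1] 2
x <- x + 1  # the right-hand side is evaluated first, then stored in x
x
## [1] 9
```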

Using names and variables makes code easier to design, easier to debug, less prone to bugs, easier to improve, and easier for others to read.

Variable names

Variable names cannot begin with a number or an underscore. It is wise to avoid special characters other than the period (.) and underscore (_).

Examples of valid names:

  • a
  • b
  • FOO
  • my_var
  • .day

Examples of invalid names:

  • 1
  • 2nd
  • ^mean
  • !bad
  • $
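
One possible way to check a name is to compare it with the output of make.names(), a standard R function that rewrites invalid names into valid ones: a name that comes back unchanged was already valid. This is a sketch, not part of the original slides, and the candidates vector is just an illustration.

```r
candidates <- c("a", "FOO", "my_var", ".day", "2nd", "^mean", "!bad")
make.names(candidates) == candidates   # TRUE means the name is valid as written
## [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
```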

Notes about writing your R code

  • A command can spread across multiple lines. This can often improve readability.
x = paste("How", "are", "you?",   # what happens if you enter only this line?
          sep=" ")
x
## [1] "How are you?"
  • We can put multiple commands on the same line, but they need to be separated by a semicolon (;)
a = 1; b = 2
a + b
## [1] 3
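
R decides whether a command continues by checking whether the line so far is complete. If it is not, the console shows a + prompt and waits for the rest. The sketch below, with made-up variable names, shows why it matters where you break a line.

```r
total <- 1 +       # the trailing + tells R the command continues
  2 + 3
total
## [1] 6

partial <- 1       # this line is already complete, so the command ends here
+ 2 + 3            # evaluated and printed as a separate command
## [1] 5
partial
## [1] 1
```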

Your turn

  1. Create variables f_name and l_name whose values are your own first and last names
  2. Get the number of characters in f_name and l_name and save them to length_f_name and length_l_name, respectively
  3. Use the paste() function to build your full name
  4. Multiply length_f_name by length_l_name
  5. Divide length_f_name by length_l_name
  6. Show whether length_f_name is greater than length_l_name

Notes

  1. The instructor will share the R code from this lab and post it under the “In class R script” section of the content area.

  2. No lab assignment for this week.