Introduction to R for Social Scientists

rm(list=ls())

Today:

Coding skills for social scientists
Research & Data organization

R-architecture

R exists as a base package with a reasonable amount of functionality for data analysis and graphs. The beauty of R: it is free, versatile, and expandable through packages that supply canned functions (pre-built implementations of various models).

Any user can write a function or even a package. Don’t know what package you need? The internet is your friend. Try StackExchange or CRAN (https://cran.r-project.org/). Each package has its own page on CRAN, for example https://cran.r-project.org/web/packages/ggplot2/

math<-function(x) {
  x+1
  }

values<-c(0:10)
values

##  [1]  0  1  2  3  4  5  6  7  8  9 10

math(values)

##  [1]  1  2  3  4  5  6  7  8  9 10 11

curve(math, xlim=c(0,12), ylab="Function Values") # curve() draws the plot itself; no plot() wrapper is needed
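
As a small extension of the function above, here is a sketch of a function with a second argument that has a default value. The name add_constant and its constant argument are made up for illustration; they are not part of base R.

```r
# Hypothetical helper: add a constant to every element of x (defaults to 1)
add_constant <- function(x, constant = 1) {
  x + constant
}

values <- 0:10
add_constant(values)                # default constant: same result as math(values)
add_constant(values, constant = 5)  # shifts every value by 5
```

Giving an argument a default keeps the simple call short while still allowing a different value when needed.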

RStudio Environment TOUR

Work space
Menus
Command line

pibic<-c("Isabela", "Victor Hugo", "Ana", "Rob"); pibic
pibic2<-pibic[pibic != "Rob"]; pibic2
pibic<-c(pibic, "André"); pibic

R iS CASe SeNsItiVe
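
A quick way to see what case sensitivity means in practice (a minimal sketch; it assumes no object called Pibic exists in your workspace):

```r
pibic <- c("Isabela", "Victor Hugo", "Ana")

exists("pibic")   # the object was created with this exact name
exists("Pibic")   # a different capitalization is a different name

# The same applies to values stored in character vectors
"ana" == "Ana"            # upper and lower case do not match
tolower("Ana") == "ana"   # they match once both sides are lower case
```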

Creating Scripts

Lesson 1: Comment your code using the ‘#’ symbol. To mark a section, end a comment line with ---- or ##### (RStudio can then fold the section open or closed).

#####################################
# Begin the script file with a label for your future self:
# File Name: IntroR - Class 1
# Author: Robert Vidigal
# Version Date: 15 Nov 2021
# Data Creation / Results / Figures
#####################################

Working directory (wd)

# Next, set your working directory. This way you can save throughout without naming new file paths.
# A working directory lets R know where to look for files and data you reference, and where to deposit output you save
getwd()
setwd("~")
# For this to work, the folder must already exist on your drive (here "PIBIC2021" inside "Dropbox", or whatever you'd like to call it).
setwd("~/Dropbox/PIBIC2021")

# If you are using Windows, use your C:/ drive
setwd("C:/Section") # etc
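
If you are not sure the folder exists, you can check for it and create it from R before calling setwd(). The sketch below uses a made-up folder name inside R's temporary directory so that it runs on any machine; replace it with your own project path.

```r
old_wd <- getwd()   # remember where we started

# Hypothetical project folder, created inside R's temporary directory
project_dir <- file.path(tempdir(), "IntroR_example")
if (!dir.exists(project_dir)) {
  dir.create(project_dir, recursive = TRUE)
}

setwd(project_dir)  # move into the project folder
getwd()             # confirm the change
list.files()        # a newly created folder has no files yet

setwd(old_wd)       # go back to the original directory
```

file.path() builds the path with the right separator for your operating system, which avoids the Windows versus Mac differences mentioned above.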

item <- 5 # assignment
rm(item) # Remove an item
ls() # Check what variables remain in the workspace with 'ls()' (that is list objects)
rm(list = ls()) # remove everything from environment
history() # Shows the most recent commands (25 by default)

Installing packages

If you’ve never used a package before, you may have to install it first:

# General pattern: install.packages("package.name")
install.packages("car")
install.packages("psych")

# install package which exists at github.com/username/packagename
# devtools::install_github("username/packagename")

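library(psych) # Loads the package; stops with an error if it is not installed
require(car) # Also loads the package, but only warns and returns FALSE if it is missing

# To call a single function without loading the whole package, use the pattern:
# package::function()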

# list vignettes available for a specific package
vignette(package = "dplyr")
# view specific vignette
vignette("grid")
# view all vignettes on your computer
vignette()
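
To avoid reinstalling a package every time a script runs, one common pattern is to install only when the package is missing. This is a sketch, not the only way to do it; it uses "stats", which ships with R, so nothing is downloaded when you try it.

```r
# "stats" comes with R; swap in the package you need, e.g. "psych" or "car"
pkg <- "stats"

if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)   # only reached when the package is missing
}

library(pkg, character.only = TRUE)  # character.only lets library() take a string

# Call a function through its namespace, without relying on library()
stats::median(c(1, 3, 5, 7, 9))
```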

Math Operators

+ # Adds things together
- # Subtracts things
* # Multiplies things
/ # Divides things
^ or ** # Exponentiation (i.e., to the power of, so x^2 or x**2 is x squared, x^3 is x cubed, and so on)
< # Less than
<= # Less than or equal to
> # Greater than
>= # Greater than or equal to
== # Exactly equal to
# (this might confuse you because you’ll be used to using ‘=’ as the symbol for ‘equals’,
# but in R a comparison uses ‘==’; a single '=' assigns a value to an object)
!= # Not equal to
  
x=c(1,2,3,4,5,6,NA)
!x # Not x
x | y # x OR y (e.g., partido == "PSTU" | partido == "PSOL", or partido %in% c("PSTU", "PSOL", "PCB", "PCO"))
x & y # x AND y (e.g., gender == "F" & partido == "PT")
is.na(x) # Tests whether each value is NA (Not Available, i.e., missing)
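
The logical operators are easier to follow on actual values. The vectors below are made-up example data, not a real dataset.

```r
# Hypothetical example data
partido <- c("PT", "PSOL", "PSTU", NA, "PCB")
gender  <- c("F", "M", "F", "F", "M")

# OR: each comparison must be written out in full
partido == "PSTU" | partido == "PSOL"

# %in% is a shorter way to test against several values
partido %in% c("PSTU", "PSOL", "PCB", "PCO")

# AND: both conditions must hold
gender == "F" & partido == "PT"

# is.na() flags the missing entries
is.na(partido)
```

Note that comparisons with == return NA where the value is missing, while %in% returns FALSE there.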

Math Review

# Negative Numbers, Squares, Square Roots, Sums
# Adding negative numbers
5 + -3 # 2
-4 + 7 # 3
-5 + -6 # -11
1 + -3 # -2

# Negative multiplied by a positive is a negative
# The order does not matter!
7 * -3 # -21
-8 * 4 # -32

# Two negatives multiplied is a positive
-2 * -5    
-1 * -9

# Two positive multiplied are also a positive
4 * 7    

# Squares look like this:
6^2
6**2

(3-4)^2 # 1
3^2 - 4^2 # -7 (not the same as squaring the difference)

# Fractions
2/10
(2+1)/10

# Square root
?sqrt()
sqrt(49)

# Sum of all values
x=c(1,3,5,7,9)
1+3+5+7+9
sum(x)
sum(x)+2
sum(x+2) # x[1]+2 + x[2]+2 + ... + x[n]+2
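
Real data often contain missing values, and sum() behaves differently when they are present. A short sketch using the na.rm argument of sum() and mean():

```r
x <- c(1, 3, 5, 7, 9, NA)

sum(x)                 # NA: a single missing value makes the total unknown
sum(x, na.rm = TRUE)   # 25: missing values are dropped before summing
mean(x, na.rm = TRUE)  # 5: average of the five observed values
sum(is.na(x))          # 1: number of missing values
```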

There are many ways to enter data into R

#First, you can enter it manually
#Use '<-' to assign data to a variable/object

# Scalar example
scalar<-2
X<-3

# Vector example
# (avoid naming an object "vector": that is also the name of a base R function)
set<-c(1,2,3,4,5,6)
vector.1 <- 1:10

# Matrix example
matrix.1 <- matrix(1:12, nrow = 4)
matrix.2<-matrix(set, nrow=3, ncol=2, byrow=FALSE) # Spell out FALSE; the shortcut F can be overwritten
matrix.3<-diag(3)

# Dataframe example
df <- data.frame(item1 = 1:18, item2 = LETTERS[1:18])

# List example
# (avoid naming an object "list": that is also the name of the function that creates it)
list.1 <- list(item1 = 1:10, item2 = LETTERS[1:18])
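
Once the objects exist, you pull values out of them by position or by name. The sketch below rebuilds small objects under made-up names (vec, mat, dat, lst) so it can be run on its own.

```r
vec <- c(10, 20, 30, 40)
mat <- matrix(1:12, nrow = 4)   # filled column by column
dat <- data.frame(item1 = 1:3, item2 = LETTERS[1:3])
lst <- list(item1 = 1:10, item2 = LETTERS[1:18])

vec[2]             # second element of the vector
mat[2, 3]          # row 2, column 3 of the matrix
dim(mat)           # number of rows and columns
dat$item2          # a column of the data frame, by name
dat[2, ]           # second row of the data frame
lst[["item1"]]     # one element of the list, by name
length(lst$item2)  # list elements can have different lengths
```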

Need help? You can always ask R

help(matrix)
?diag
??matrix

These help pages will usually tell you at least which arguments a function takes. If you need more help, turn to Google, Stack Overflow, and CRAN.
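
If you only want to check a function's arguments, you do not need to open the full help page. A minimal sketch using base R tools:

```r
args(matrix)            # prints the argument list with default values
names(formals(matrix))  # just the argument names
apropos("^read\\.")     # names of available functions that start with "read."
```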

Entering data in a new data frame

minidf<-data.frame(
  "name"=c("Sofia", "Pedro", "Juan", "Benjamin", "Frederico", "Igor", "Beatriz", "Anderson"), # Individuals
  "age"=c(23, 22, 44, 56, 90, 18, 77, 31), # In years
  "female"=c(0, 1, 0, 1, 0, 0, 0, 1), # Female 1, Male 0
  "income"=c(2500, 5000, 2800, 6000, 9000, 2000, 3300, 7700), # Income in R$
  "ideology"=c(1, 3, 4, 5, 4, 5, 6, 6), # 1 to 7
  "neuroticism"=c(.22, .14, .77, .99, .67, .45, .65, .5), # 0 to 1
  "polknow"=c(2, 3, 4, 5, 6, 7, 0, 4), # 0 to 7
  "riskaverse"=c(NA, NA, 1, 5, 5, 5, 3, 2) # 0 to 5
)
minidf
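
After creating a data frame, it is worth inspecting it before doing anything else. The sketch below uses a shortened copy of the data (first four people, four variables) under the made-up name small, so it runs on its own.

```r
small <- data.frame(
  name = c("Sofia", "Pedro", "Juan", "Benjamin"),
  age = c(23, 22, 44, 56),
  income = c(2500, 5000, 2800, 6000),
  riskaverse = c(NA, NA, 1, 5)
)

str(small)                            # variable types and first values
summary(small$age)                    # minimum, quartiles, mean, maximum
nrow(small)                           # number of observations
small[small$age > 30, ]               # rows for people older than 30
mean(small$income)                    # average income
mean(small$riskaverse, na.rm = TRUE)  # average that skips the missing values
```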

Saving data

write.csv(minidf, "~/Dropbox/PIBIC2021/data/minidf.csv", row.names = FALSE) # CSV
write.table(minidf, "~/Dropbox/PIBIC2021/data/minidf2.csv", sep=";") # Semicolon-delimited text (opens in Excel)
foreign::write.dta(minidf, "~/Dropbox/PIBIC2021/data/minidf.dta") # STATA
save(minidf, file="~/Dropbox/PIBIC2021/data/minidf.RData") # RData

# The same export after loading the package, so the foreign:: prefix is not needed
library(foreign)
write.dta(minidf, "~/Dropbox/PIBIC2021/data/minidf.dta") # STATA

Importing and loading DATA!

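?read.table() # Delimited text files, including CSV
?load() # RData

# Packages "foreign", "haven", "rio", and "readstata13" read SPSS and Stata files.
# The calls below only list the relevant functions; each needs a file path as its first argument.

# SPSS
foreign::read.spss()
haven::read_spss()
rio::import() # Guesses the format from the file extension

# Stata
foreign::read.dta()
haven::read_dta()
readstata13::read.dta13()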

The other main way to enter data is to pull in a dataset from outside R

These datasets can be in .csv, .dta, and other formats.

DFMINI<-read.csv("~/Dropbox/PIBIC2021/data/minidf.csv", header=TRUE, sep=",")
DFMINI2<-read.csv("~/Dropbox/PIBIC2021/data/minidf2.csv", header=TRUE, sep=";")
load("~/Dropbox/PIBIC2021/data/minidf.RData")

DFMINI_STATA<-foreign::read.dta("~/Dropbox/PIBIC2021/data/minidf.dta")

lapopBR19<-read.csv("~/Dropbox/PIBIC2021/data/LAPOP2019_work.csv", header=TRUE, sep=",") # CSV
databr<-readstata13::read.dta13("~/Dropbox/PIBIC2021/data/Brazil_LAPOP_AmericasBarometer2019.dta", 
                                generate.factors = TRUE, 
                                nonint.factors = TRUE) # DTA
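
To check that saving and loading work as expected, you can write a small data frame to disk and read it back. This sketch uses a made-up two-row data frame and temporary files instead of the Dropbox paths above, so it runs on any machine.

```r
toy <- data.frame(name = c("Sofia", "Pedro"), age = c(23, 22))

# Temporary files stand in for paths such as "~/Dropbox/PIBIC2021/data/..."
csv_path   <- tempfile(fileext = ".csv")
rdata_path <- tempfile(fileext = ".RData")

write.csv(toy, csv_path, row.names = FALSE)
toy_csv <- read.csv(csv_path, header = TRUE, sep = ",")

save(toy, file = rdata_path)
rm(toy)            # remove the object to show that load() brings it back
load(rdata_path)   # restores the object under its original name, "toy"

all.equal(toy, toy_csv)  # checks whether the CSV copy matches the original
```

Note the difference: read.csv() returns a data frame that you assign to a name of your choice, while load() recreates objects under the names they had when they were saved.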

HELPFUL SOURCES WHEN CODING

The main project website: http://www.r-project.org/
Quick-R, an excellent introductory website: http://www.statmethods.net/index.htm
John Fox’s R Commander website: http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/