Installing and maintaining R

In the command below, "PackageName" stands for the name of the package you want to install.

install.packages("PackageName")

Once a package is installed, you do not need to install it again for that R installation, although you may want to update it from time to time. However, each time you start R, you have to load the packages you need into memory:

library(dplyr)
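Because installation is a one-time step and loading happens every session, a script can check whether a package is already present before installing it. The following is a minimal sketch; `needs_install` is a hypothetical helper name introduced here for illustration, and the install line is commented out because it needs an internet connection.

```r
# Hypothetical helper: TRUE when the package is not yet installed.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# The "stats" package ships with R, so it never needs installing.
print(needs_install("stats"))

# Install dplyr only when it is missing, then load it for this session.
# if (needs_install("dplyr")) install.packages("dplyr")
# library(dplyr)
```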

Help and documentation

To get help on a specific function, for example mean:

?"mean"
## starting httpd help server ... done

To get help on a package, for example dplyr:

help(package = "dplyr")
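When you do not remember the exact name of a function, or only want a quick look at its arguments, base R offers a couple of other lookups. This is a small sketch using `apropos()` and `args()`; the search pattern is just an example.

```r
# List objects whose names start with "mean".
matches <- apropos("^mean")
print(matches)

# Show the arguments a function accepts without opening its help page.
print(args(sd))

# To search the help pages by topic instead of by name, try:
# help.search("standard deviation")
```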

Let’s play

Now, let’s explore how R works with data. First, R can be your calculator:

1+2
## [1] 3
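The other arithmetic operators work the same way. Here is a short sketch of a few that come up often; the numbers are arbitrary examples.

```r
print(7 / 2)        # ordinary division
print(7 %/% 2)      # integer division
print(7 %% 2)       # remainder
print(2^3)          # exponentiation
print(sqrt(16))     # square root
print((1 + 2) * 3)  # parentheses control the order of operations
```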

Now, let’s apply what we learned today: measures of central tendency. First, create a variable holding the quiz scores:

quiz <- c(80, 82, 87, 87, 88, 88, 88, 88, 90, 90, 91, 92, 96, 96, 96, 100)

Mean, median, max, and min

mean(quiz)
## [1] 89.9375
median(quiz)
## [1] 89
max(quiz)
## [1] 100
min(quiz)
## [1] 80
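To see what these functions do, the same values can be computed by hand. The sketch below assumes the quiz vector from above; the `manual_*` names are introduced here for illustration. It also computes the mode, for which base R has no built-in function; note that `which.max()` returns only the first value if several are tied for most frequent.

```r
quiz <- c(80, 82, 87, 87, 88, 88, 88, 88, 90, 90, 91, 92, 96, 96, 96, 100)

# Mean: sum of the scores divided by how many there are.
manual_mean <- sum(quiz) / length(quiz)

# Median: with an even number of scores, average the two middle sorted values.
sorted <- sort(quiz)
n <- length(sorted)
manual_median <- (sorted[n / 2] + sorted[n / 2 + 1]) / 2

# Mode: tabulate the scores and pick the most frequent one.
counts <- table(quiz)
manual_mode <- as.numeric(names(counts)[which.max(counts)])

print(c(mean = manual_mean, median = manual_median, mode = manual_mode))
```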

Use the summary() function, which reports the minimum, first quartile, median, mean, third quartile, and maximum in one call:

summary(quiz)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   80.00   87.75   89.00   89.94   93.00  100.00
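The quartiles in this output can also be requested directly. As a minimal sketch, `quantile()` returns the 1st and 3rd quartiles, and `IQR()` returns the distance between them; both use R's default quantile method.

```r
quiz <- c(80, 82, 87, 87, 88, 88, 88, 88, 90, 90, 91, 92, 96, 96, 96, 100)

# 25th and 75th percentiles (the 1st and 3rd quartiles).
quartiles <- quantile(quiz, probs = c(0.25, 0.75))
print(quartiles)

# Interquartile range: 3rd quartile minus 1st quartile.
iqr <- IQR(quiz)
print(iqr)
```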

Variance and standard deviation

Let’s try variance and standard deviation too. Note that var() and sd() compute the sample versions, dividing by n - 1 rather than n.

var(quiz)
## [1] 27.39583
sd(quiz)
## [1] 5.234103
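The same numbers can be reproduced step by step, which makes the n - 1 denominator visible. This is a sketch assuming the quiz vector from above; the variable names are illustrative.

```r
quiz <- c(80, 82, 87, 87, 88, 88, 88, 88, 90, 90, 91, 92, 96, 96, 96, 100)

# Squared distance of each score from the mean.
squared_dev <- (quiz - mean(quiz))^2

# Sample variance: sum of squared deviations divided by n - 1.
manual_var <- sum(squared_dev) / (length(quiz) - 1)

# Standard deviation: square root of the variance.
manual_sd <- sqrt(manual_var)

print(manual_var)
print(manual_sd)
```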

Data visualization

Graphs are a useful way to explore data. This course does not cover data visualization, but you can try it yourself. The second example below uses the ggplot2 package, which you may need to install first.

hist(quiz) # base R function; does not require the ggplot2 package

library(ggplot2)
qplot(quiz) # qplot() still works but is deprecated; see the warning below
## Warning: `qplot()` was deprecated in ggplot2 3.4.0.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
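Both messages in the output above can be avoided. As a rough sketch, the code below sets the bins explicitly in base R and then builds the same histogram with ggplot() instead of the deprecated qplot(). The 5-point bin width and the column name `score` are choices made here for illustration, and the ggplot2 part runs only if the package is installed.

```r
quiz <- c(80, 82, 87, 87, 88, 88, 88, 88, 90, 90, 91, 92, 96, 96, 96, 100)

# Base R: draw the histogram with 5-point bins from 80 to 100.
h <- hist(quiz, breaks = seq(80, 100, by = 5))
print(h$counts)  # number of scores falling in each bin

# ggplot2: the data must be in a data frame, and binwidth is set explicitly.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(score = quiz), aes(x = score)) +
    geom_histogram(binwidth = 5, boundary = 80)
  print(p)
}
```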

Well done!

Now you know how to play with quantitative data in R!