I am not expecting you to understand everything that is going on. We are mostly having a little fun before diving into more code examples.
#install.packages("tidyverse")
#install.packages("cowplot")
#install.packages("RColorBrewer")
#install.packages("readxl")
library("tidyverse")
library("cowplot")
library("RColorBrewer")
library("readxl")
I created an object called exp_activities and stored my 6 different activities in it.
exp_activities <- c("Book Reviews",
"Get my life together",
"Research",
"Sew or something else crafty",
"Something fun",
"Workout")
exp_activities # look at object to check work
## [1] "Book Reviews" "Get my life together"
## [3] "Research" "Sew or something else crafty"
## [5] "Something fun" "Workout"
I created another object to store my five activities that actually happened.
reality_activities <- c( "Book Reviews",
"Crochet a blanket",
"Lesson Plan",
"Stare at the Wall",
"Unintentional Naps")
reality_activities
## [1] "Book Reviews" "Crochet a blanket" "Lesson Plan"
## [4] "Stare at the Wall" "Unintentional Naps"
Now I want to assign a time value to both my expected and my actual activities.
# numbers indicate a percent of my total "free" time
exp_timespent <- c(20,
25,
35,
15,
5,
5)
class(exp_timespent)
## [1] "numeric"
exp_timespent
## [1] 20 25 35 15 5 5
reality_timespent <- c(10,
15,
35,
5,
35)
reality_timespent
## [1] 10 15 35 5 35
lockdown_exp <- data.frame(exp_activities, exp_timespent )
lockdown_exp
## exp_activities exp_timespent
## 1 Book Reviews 20
## 2 Get my life together 25
## 3 Research 35
## 4 Sew or something else crafty 15
## 5 Something fun 5
## 6 Workout 5
lockdown_reality <- data.frame( reality_activities, reality_timespent)
lockdown_reality
## reality_activities reality_timespent
## 1 Book Reviews 10
## 2 Crochet a blanket 15
## 3 Lesson Plan 35
## 4 Stare at the Wall 5
## 5 Unintentional Naps 35
expectations <-
ggplot( lockdown_exp,
aes( x = "", y = exp_timespent, fill = exp_activities)) +
geom_bar( stat = "identity", # Makes a stacked bar graph
color = "white", size = 4) +
coord_polar( "y", start = 0 ) + # But then puts it on a circle
theme_void() +
theme( legend.position = "bottom", legend.title = element_blank(), legend.direction = "vertical",
plot.title = element_text(hjust = 0.5, size=15, face="bold"))+
scale_fill_brewer(palette="RdPu", direction = -1 ) +
ggtitle("Expectations")
expectations
reality <-
ggplot( lockdown_reality,
aes( x = "", y = reality_timespent, fill = reality_activities)) +
geom_bar( stat = "identity", color = "white", size = 4) +
coord_polar( "y", start = 0 ) +
theme_void() +
theme( legend.position = "bottom", legend.title = element_blank(), legend.direction = "vertical",
plot.title = element_text(hjust = 0.5, size=15, face="bold"))+
scale_fill_brewer(palette="BuPu", direction = -1) +
ggtitle("Reality")
reality
full_plot <- plot_grid( expectations, reality)
full_plot
First of all, we need to install a few packages. A package is a bundle of software: a collection of functions that you can use in R.
You can install packages in the following way:
# This is the main package we are using in this class
install.packages("tidyverse")
# This package provides you with color combinations for graphs
install.packages("RColorBrewer")
# This package allows you to arrange multiple plots in one figure
install.packages("cowplot")
Once you have the packages installed, you can use them by loading them from your library:
library(tidyverse)
library(cowplot)
library(RColorBrewer)
Always start your script by listing the packages that are needed to reproduce it.
You might install more packages as you work on a dataset, depending on your needs, but make sure to list them at the very top of your script.
This ensures that anyone can reproduce your code.
We will discuss packages more as we go.
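One way to keep the top of a script reproducible is to check which packages are missing before installing anything. The lines below are only a sketch using base R's requireNamespace(); the object names needed_packages, is_installed and missing_packages are made up for this illustration.

```r
# Packages this script relies on (the same ones installed above)
needed_packages <- c("tidyverse", "cowplot", "RColorBrewer")

# requireNamespace() returns TRUE if a package is installed, FALSE otherwise
is_installed <- vapply(needed_packages, requireNamespace,
                       logical(1), quietly = TRUE)

# Keep only the names of packages that are not installed yet
missing_packages <- needed_packages[!is_installed]
missing_packages # look at object to check work

# Uncomment the next line to install whatever is missing
# if (length(missing_packages) > 0) install.packages(missing_packages)
```

The install line is left commented out so that running the script never downloads anything by surprise.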
Now that we have our packages, we start by creating a small set of activities that you were expecting to do during the lockdown:
exp_activities <- c("Book Reviews",
"Get my life together",
"Research",
"Sew or something else crafty",
"Something fun",
"Workout")
To see what we are doing, type:
exp_activities
We now create a list of activities that we actually performed during the lockdown.
Note what changes: we change the object name (reality_activities) and the list of activities that will be stored in it.
Note what doesn’t change: the <- c() part of the code stays the same.
reality_activities <- c( "Book Reviews",
"Crochet a blanket",
"Lesson Plan",
"Stare at the Wall",
"Unintentional Naps")
reality_activities
## [1] "Book Reviews" "Crochet a blanket" "Lesson Plan"
## [4] "Stare at the Wall" "Unintentional Naps"
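If you want to check what is stored in an object beyond printing it, a few base R functions help. This is a small optional sketch that reuses the reality_activities object from above.

```r
reality_activities <- c("Book Reviews",
                        "Crochet a blanket",
                        "Lesson Plan",
                        "Stare at the Wall",
                        "Unintentional Naps")

length(reality_activities) # how many items are stored: 5
class(reality_activities)  # what kind of values they are: "character"
reality_activities[2]      # the second item: "Crochet a blanket"
```

Square brackets pick out items by position, counting from 1.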
Now that we have the two lists of activities, we need to report how much time we spent on each of them.
Numbers represent a percent of my total “free time” that I spent doing the activity.
exp_timespent <- c(20,
25,
35,
15,
5,
5)
exp_timespent
## [1] 20 25 35 15 5 5
We do the same for the “reality” activities:
reality_timespent <- c(10,
15,
35,
5,
35)
Note that you can also write the code in one line as long as you respect the “spacing rules”. Start noticing them!
reality_timespent <- c(10, 15, 35, 5, 35)
Let’s check our output:
reality_timespent
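Before combining anything, it can be worth a quick sanity check on the numbers. The sketch below uses only sum() and length() on the two vectors defined above; it is an optional check, not part of the original steps.

```r
exp_timespent <- c(20, 25, 35, 15, 5, 5)
reality_timespent <- c(10, 15, 35, 5, 35)

# Do the percentages add up?
sum(reality_timespent) # 100
sum(exp_timespent)     # 105, so treat these as rough shares

# Each vector needs one number per activity
length(exp_timespent)     # 6, one for each expected activity
length(reality_timespent) # 5, one for each actual activity
```

The length check matters because data.frame() expects the columns it combines to have matching lengths.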
Now that we have all the information, we combine it into two datasets.
One dataset about expectations…
lockdown_exp <- data.frame(exp_activities, exp_timespent )
lockdown_exp
## exp_activities exp_timespent
## 1 Book Reviews 20
## 2 Get my life together 25
## 3 Research 35
## 4 Sew or something else crafty 15
## 5 Something fun 5
## 6 Workout 5
…and one about reality
lockdown_reality <- data.frame(reality_activities, reality_timespent)
lockdown_reality
## reality_activities reality_timespent
## 1 Book Reviews 10
## 2 Crochet a blanket 15
## 3 Lesson Plan 35
## 4 Stare at the Wall 5
## 5 Unintentional Naps 35
We can see the datasets that we just created
lockdown_exp
## exp_activities exp_timespent
## 1 Book Reviews 20
## 2 Get my life together 25
## 3 Research 35
## 4 Sew or something else crafty 15
## 5 Something fun 5
## 6 Workout 5
lockdown_reality
## reality_activities reality_timespent
## 1 Book Reviews 10
## 2 Crochet a blanket 15
## 3 Lesson Plan 35
## 4 Stare at the Wall 5
## 5 Unintentional Naps 35
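Printing the whole dataset works well when it is this small. For larger data, a few base R functions give a quicker overview. This is an illustrative sketch that rebuilds lockdown_reality so it can run on its own.

```r
lockdown_reality <- data.frame(
  reality_activities = c("Book Reviews", "Crochet a blanket", "Lesson Plan",
                         "Stare at the Wall", "Unintentional Naps"),
  reality_timespent = c(10, 15, 35, 5, 35)
)

nrow(lockdown_reality)  # number of rows (observations): 5
names(lockdown_reality) # column names: "reality_activities" "reality_timespent"
str(lockdown_reality)   # compact summary of each column and its type

# The $ sign pulls out a single column as a vector
lockdown_reality$reality_timespent
```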
We first create the “Expectations” plot.
library(ggplot2) # ggplot2 provides the ggplot() commands for graphing;
# it is loaded automatically with tidyverse
expectations <-
ggplot( lockdown_exp,
aes( x = "", y = exp_timespent, fill = exp_activities)) +
geom_bar( stat = "identity", # Makes a stacked bar graph
color = "white", size = 4) +
coord_polar( "y", start = 0 ) + # But then puts it on a circle
theme_void() +
theme( legend.position = "bottom", legend.title = element_blank(), legend.direction = "vertical",
plot.title = element_text(hjust = 0.5, size=15, face="bold"))+
scale_fill_brewer(palette="RdPu", direction = -1 ) +
ggtitle("Expectations")
expectations
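To see why the comments say "stacked bar graph" and "puts it on a circle", it can help to build the plot in two steps. The sketch below is a stripped-down version of the code above (no theme or colors); the names bar_version and pie_version are made up for this illustration.

```r
library(ggplot2)

lockdown_exp <- data.frame(
  exp_activities = c("Book Reviews", "Get my life together", "Research",
                     "Sew or something else crafty", "Something fun", "Workout"),
  exp_timespent = c(20, 25, 35, 15, 5, 5)
)

# Step 1: one stacked bar (x = "" puts every activity into the same bar)
bar_version <- ggplot(lockdown_exp,
                      aes(x = "", y = exp_timespent, fill = exp_activities)) +
  geom_bar(stat = "identity", color = "white")

# Step 2: the same bar, wrapped around a circle, becomes a pie chart
pie_version <- bar_version + coord_polar("y", start = 0)

bar_version # view the stacked bar
pie_version # view the pie chart
```

Because a plot is stored in an object, you can keep adding layers to it with +, which is what step 2 does.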
Now, we create the “reality” plot.
Look at the code…anything familiar? What changes from step 8? What are some ‘intuitive’ steps?
# Create a pie chart graph representing the activities done in reality.
reality <-
ggplot(lockdown_reality,
# Call the dataset that I want to use "lockdown_reality"
aes(x = "", y = reality_timespent, # y represents amount for each category
fill = reality_activities)) + # fill represents the names of the categories
geom_bar( stat = "identity", color = "white", size = 2) + # Creates stacked bar chart
coord_polar( "y", start = 0 ) + # puts bar chart on a circle axis (becomes pie chart)
theme_void() + # gets rid of extra grid stuff
theme(legend.position = "bottom", # Create a legend and put it at the bottom
legend.title = element_blank(), # Eliminates legend title
legend.direction = "vertical", # legend is vertical list
plot.title = element_text(hjust = 0.5, size=22, face="bold")) +
scale_fill_brewer(palette="BuPu", direction = -1) + # flips order of Color theme
ggtitle("Reality")
reality # view graph
Your code could also be spaced like the chunk below with comments between lines:
reality <-
ggplot( lockdown_reality,
aes( x = "", y = reality_timespent, fill = reality_activities)) +
geom_bar( stat = "identity", color = "white", size = 4) +
coord_polar( "y", start = 0 ) +
theme_void() +
theme( legend.position = "bottom",
legend.title = element_blank(),
legend.direction = "vertical",
plot.title = element_text(hjust = 0.5, size=15, face="bold")) +
scale_fill_brewer(palette="BuPu", direction = -1) +
ggtitle("Reality")
# View graph
reality
We can conclude by putting the two plots together.
The plot_grid() function comes from the cowplot package.
full_plot <- plot_grid(expectations, reality) # combine two graphs into one image, side by side
full_plot # View plot
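plot_grid() takes a few optional arguments that change the layout. The sketch below uses two tiny stand-in plots built from made-up data (toy, plot_one and plot_two are hypothetical names) so that it runs on its own; the same arguments should work with expectations and reality.

```r
library(ggplot2)
library(cowplot)

# Two small stand-in plots (made-up data, just to have something to arrange)
toy <- data.frame(activity = c("A", "B"), share = c(60, 40))
plot_one <- ggplot(toy, aes(x = activity, y = share)) + geom_col() + ggtitle("One")
plot_two <- ggplot(toy, aes(x = activity, y = share)) + geom_col() + ggtitle("Two")

side_by_side <- plot_grid(plot_one, plot_two, ncol = 2) # two columns
stacked <- plot_grid(plot_one, plot_two, ncol = 1)      # one on top of the other
labelled <- plot_grid(plot_one, plot_two, labels = c("A", "B")) # adds panel labels

side_by_side # view plot

# To save the combined figure (the file name here is only an example):
# ggsave("lockdown_plots.png", side_by_side, width = 10, height = 5)
```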
The Tidyverse includes core functions for modifying data:
select() allows us to select particular variables. Returns a subset of COLUMNS.
filter() allows us to select particular observations, much like Excel’s filter tool. Returns a subset of ROWS.
arrange() allows us to sort observations, much like Excel’s sort tool.
mutate() allows us to add or change variables.
group_by() allows us to group by a category within a variable.
summarize() aggregates measures and works with group_by().
These functions follow a common syntax that is designed to work with a convenient Tidyverse tool called the “pipe” operator.
The pipe operator is part of the Tidyverse and is written %>%. Recall that an operator is just a symbol like + or * that performs some function on whatever comes before it and whatever comes after it.
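As a rough illustration of how these functions chain together with the pipe, here is a sketch on the lockdown_reality data. Two things are assumptions made only for this example: the 40 hours of free time used in mutate(), and the kind column that sorts each activity into a category.

```r
library(dplyr)

lockdown_reality <- data.frame(
  reality_activities = c("Book Reviews", "Crochet a blanket", "Lesson Plan",
                         "Stare at the Wall", "Unintentional Naps"),
  reality_timespent = c(10, 15, 35, 5, 35),
  kind = c("work", "fun", "work", "rest", "rest") # hypothetical categories
)

# Read the pipe as "and then": take the data, and then filter, and then ...
big_items <- lockdown_reality %>%
  filter(reality_timespent >= 15) %>%             # keep rows with at least 15 percent
  arrange(desc(reality_timespent)) %>%            # sort from largest to smallest
  mutate(hours = reality_timespent * 40 / 100) %>% # assumes 40 free hours in total
  select(reality_activities, hours)               # keep only these two columns
big_items

# group_by() and summarize() work together: one total per category
by_kind <- lockdown_reality %>%
  group_by(kind) %>%
  summarize(total = sum(reality_timespent))
by_kind
```

Each function takes the result of the line before it as its first input, which is why the dataset name appears only once at the top of each chain.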
Now open the Krauth example!