title: “PUAD 5140” subtitle: “Working with your data” author: “type here your name” date: “September 20, 2022” output: html_document: toc: yes toc_float: yes theme: cerulean highlight: haddock toc_depth: 3 pdf_document: toc: yes html_notebook: toc: yes toc_float: yes number_sections: yes highlight: haddock toc_depth: 3 —

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter.

First things first:

It’s good practice to first get your packages load it to confirm they are all installed and working

install.packages("devtools")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
devtools::install_github("kosukeimai/qss-package")
## Skipping install of 'qss' from a github remote, the SHA1 (59933ec0) has not changed since last install.
##   Use `force = TRUE` to force installation
install.packages("tidyverse")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
install.packages("tinytex")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
#tinytex::install_tinytex() ##for mac
install.packages("qss")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
## Warning: package 'qss' is not available for this version of R
## 
## A version of this package for your version of R might be available elsewhere,
## see the ideas at
## https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
#install.packages("glimpse")
install.packages("janitor")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
#devtools::install_github("ddauber/r4np")
install.packages("ggthemes")
## Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.2'
## (as 'lib' is unspecified)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(qss)
library(gapminder)
#library(r4np)
library(ggthemes)

A Sweet Example:

Load the data

data_url<-("https://raw.githubusercontent.com/fivethirtyeight/data/master/candy-power-ranking/candy-data.csv")
candy.raw<-read_csv(data_url)
## Rows: 85 Columns: 13
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): competitorname
## dbl (12): chocolate, fruity, caramel, peanutyalmondy, nougat, crispedricewaf...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
candy.raw
## # A tibble: 85 × 13
##    competito…¹ choco…² fruity caramel peanu…³ nougat crisp…⁴  hard   bar pluri…⁵
##    <chr>         <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl> <dbl> <dbl>   <dbl>
##  1 100 Grand         1      0       1       0      0       1     0     1       0
##  2 3 Musketee…       1      0       0       0      1       0     0     1       0
##  3 One dime          0      0       0       0      0       0     0     0       0
##  4 One quarter       0      0       0       0      0       0     0     0       0
##  5 Air Heads         0      1       0       0      0       0     0     0       0
##  6 Almond Joy        1      0       0       1      0       0     0     1       0
##  7 Baby Ruth         1      0       1       1      1       0     0     1       0
##  8 Boston Bak…       0      0       0       1      0       0     0     0       1
##  9 Candy Corn        0      0       0       0      0       0     0     0       1
## 10 Caramel Ap…       0      1       1       0      0       0     0     0       0
## # … with 75 more rows, 3 more variables: sugarpercent <dbl>,
## #   pricepercent <dbl>, winpercent <dbl>, and abbreviated variable names
## #   ¹​competitorname, ²​chocolate, ³​peanutyalmondy, ⁴​crispedricewafer, ⁵​pluribus

Inspect the data

Let’s look at the number of rows and columns:

dim(candy.raw)
## [1] 85 13

And get a sneak-peak of your dataframe:

head(candy.raw)
## # A tibble: 6 × 13
##   competitor…¹ choco…² fruity caramel peanu…³ nougat crisp…⁴  hard   bar pluri…⁵
##   <chr>          <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl> <dbl> <dbl>   <dbl>
## 1 100 Grand          1      0       1       0      0       1     0     1       0
## 2 3 Musketeers       1      0       0       0      1       0     0     1       0
## 3 One dime           0      0       0       0      0       0     0     0       0
## 4 One quarter        0      0       0       0      0       0     0     0       0
## 5 Air Heads          0      1       0       0      0       0     0     0       0
## 6 Almond Joy         1      0       0       1      0       0     0     1       0
## # … with 3 more variables: sugarpercent <dbl>, pricepercent <dbl>,
## #   winpercent <dbl>, and abbreviated variable names ¹​competitorname,
## #   ²​chocolate, ³​peanutyalmondy, ⁴​crispedricewafer, ⁵​pluribus

Prep your data in a new copy

candy.data<- candy.raw
head(candy.data)
## # A tibble: 6 × 13
##   competitor…¹ choco…² fruity caramel peanu…³ nougat crisp…⁴  hard   bar pluri…⁵
##   <chr>          <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl> <dbl> <dbl>   <dbl>
## 1 100 Grand          1      0       1       0      0       1     0     1       0
## 2 3 Musketeers       1      0       0       0      1       0     0     1       0
## 3 One dime           0      0       0       0      0       0     0     0       0
## 4 One quarter        0      0       0       0      0       0     0     0       0
## 5 Air Heads          0      1       0       0      0       0     0     0       0
## 6 Almond Joy         1      0       0       1      0       0     0     1       0
## # … with 3 more variables: sugarpercent <dbl>, pricepercent <dbl>,
## #   winpercent <dbl>, and abbreviated variable names ¹​competitorname,
## #   ²​chocolate, ³​peanutyalmondy, ⁴​crispedricewafer, ⁵​pluribus

Rename a variable

candy.data %>% rename(candy_name =competitorname,
                                   hard_candy=hard)
## # A tibble: 85 × 13
##    candy_n…¹ choco…² fruity caramel peanu…³ nougat crisp…⁴ hard_…⁵   bar pluri…⁶
##    <chr>       <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl>   <dbl> <dbl>   <dbl>
##  1 100 Grand       1      0       1       0      0       1       0     1       0
##  2 3 Musket…       1      0       0       0      1       0       0     1       0
##  3 One dime        0      0       0       0      0       0       0     0       0
##  4 One quar…       0      0       0       0      0       0       0     0       0
##  5 Air Heads       0      1       0       0      0       0       0     0       0
##  6 Almond J…       1      0       0       1      0       0       0     1       0
##  7 Baby Ruth       1      0       1       1      1       0       0     1       0
##  8 Boston B…       0      0       0       1      0       0       0     0       1
##  9 Candy Co…       0      0       0       0      0       0       0     0       1
## 10 Caramel …       0      1       1       0      0       0       0     0       0
## # … with 75 more rows, 3 more variables: sugarpercent <dbl>,
## #   pricepercent <dbl>, winpercent <dbl>, and abbreviated variable names
## #   ¹​candy_name, ²​chocolate, ³​peanutyalmondy, ⁴​crispedricewafer, ⁵​hard_candy,
## #   ⁶​pluribus
candy.data <- candy.data %>% rename(candy_name =competitorname,
                                   hard_candy=hard)

head(candy.data)
## # A tibble: 6 × 13
##   candy_name choco…¹ fruity caramel peanu…² nougat crisp…³ hard_…⁴   bar pluri…⁵
##   <chr>        <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl>   <dbl> <dbl>   <dbl>
## 1 100 Grand        1      0       1       0      0       1       0     1       0
## 2 3 Muskete…       1      0       0       0      1       0       0     1       0
## 3 One dime         0      0       0       0      0       0       0     0       0
## 4 One quart…       0      0       0       0      0       0       0     0       0
## 5 Air Heads        0      1       0       0      0       0       0     0       0
## 6 Almond Joy       1      0       0       1      0       0       0     1       0
## # … with 3 more variables: sugarpercent <dbl>, pricepercent <dbl>,
## #   winpercent <dbl>, and abbreviated variable names ¹​chocolate,
## #   ²​peanutyalmondy, ³​crispedricewafer, ⁴​hard_candy, ⁵​pluribus

Some useful data prep/data wrangling examples

Exclude the variable pluribus from our dataframe

candy.data<- candy.data %>% select(-pluribus)

Select a subset of variables [candy_name, chocolate, sugarpercent,winpercent, caramel] in a new dataframe named candy.data2

candy.data2<-candy.data %>% select (candy_name, chocolate, sugarpercent,winpercent, caramel)
head(candy.data2)
## # A tibble: 6 × 5
##   candy_name   chocolate sugarpercent winpercent caramel
##   <chr>            <dbl>        <dbl>      <dbl>   <dbl>
## 1 100 Grand            1        0.732       67.0       1
## 2 3 Musketeers         1        0.604       67.6       0
## 3 One dime             0        0.011       32.3       0
## 4 One quarter          0        0.011       46.1       0
## 5 Air Heads            0        0.906       52.3       0
## 6 Almond Joy           1        0.465       50.3       0

Filter a subset of rows that meet a condition (only candys that are chocolate )

candy.data2<-candy.data2 %>% filter(chocolate==1)

Filter a subset of rows that meet two expressions (candy is chocolate and caramel)

candy.data3<-candy.data2 %>% filter(chocolate==1 & caramel==1)

Create a new variable named total that adds the number of ingredients for each candy name in the dataframe candy.data2:

candy.data <- candy.data %>% mutate(total = chocolate + fruity + caramel + peanutyalmondy  + nougat + crispedricewafer + hard_candy + bar)
head(candy.data)
## # A tibble: 6 × 13
##   candy_name choco…¹ fruity caramel peanu…² nougat crisp…³ hard_…⁴   bar sugar…⁵
##   <chr>        <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl>   <dbl> <dbl>   <dbl>
## 1 100 Grand        1      0       1       0      0       1       0     1   0.732
## 2 3 Muskete…       1      0       0       0      1       0       0     1   0.604
## 3 One dime         0      0       0       0      0       0       0     0   0.011
## 4 One quart…       0      0       0       0      0       0       0     0   0.011
## 5 Air Heads        0      1       0       0      0       0       0     0   0.906
## 6 Almond Joy       1      0       0       1      0       0       0     1   0.465
## # … with 3 more variables: pricepercent <dbl>, winpercent <dbl>, total <dbl>,
## #   and abbreviated variable names ¹​chocolate, ²​peanutyalmondy,
## #   ³​crispedricewafer, ⁴​hard_candy, ⁵​sugarpercent

Hmmmm… that took too long! Let’s find a shorter way to do this last step. Let’s use the handy rowwise(). Rowwise() makes it easy to summarise values across columns within one row.

candy.data <- candy.data %>% rowwise() %>% mutate(total3 = sum(c_across(chocolate:bar)))
candy.data
## # A tibble: 85 × 14
## # Rowwise: 
##    candy_n…¹ choco…² fruity caramel peanu…³ nougat crisp…⁴ hard_…⁵   bar sugar…⁶
##    <chr>       <dbl>  <dbl>   <dbl>   <dbl>  <dbl>   <dbl>   <dbl> <dbl>   <dbl>
##  1 100 Grand       1      0       1       0      0       1       0     1   0.732
##  2 3 Musket…       1      0       0       0      1       0       0     1   0.604
##  3 One dime        0      0       0       0      0       0       0     0   0.011
##  4 One quar…       0      0       0       0      0       0       0     0   0.011
##  5 Air Heads       0      1       0       0      0       0       0     0   0.906
##  6 Almond J…       1      0       0       1      0       0       0     1   0.465
##  7 Baby Ruth       1      0       1       1      1       0       0     1   0.604
##  8 Boston B…       0      0       0       1      0       0       0     0   0.313
##  9 Candy Co…       0      0       0       0      0       0       0     0   0.906
## 10 Caramel …       0      1       1       0      0       0       0     0   0.604
## # … with 75 more rows, 4 more variables: pricepercent <dbl>, winpercent <dbl>,
## #   total <dbl>, total3 <dbl>, and abbreviated variable names ¹​candy_name,
## #   ²​chocolate, ³​peanutyalmondy, ⁴​crispedricewafer, ⁵​hard_candy, ⁶​sugarpercent

Look at a summary of the data:

now let’s explore a summary of this new var. Note I am not assigning anything back to my candy.data dataframe

And if we want to have a summary broken down by groups (say those candy names that have chocolate and those that don’t):

candy.data %>% count (chocolate ,total)  %>% 
  group_by(chocolate) %>% 
    # count creates a column called 'n'
    mutate(percent = n / sum(n) * 100)
## # A tibble: 9 × 4
## # Groups:   chocolate [2]
##   chocolate total     n percent
##       <dbl> <dbl> <int>   <dbl>
## 1         0     0     5   10.4 
## 2         0     1    28   58.3 
## 3         0     2    14   29.2 
## 4         0     3     1    2.08
## 5         1     1     7   18.9 
## 6         1     2    13   35.1 
## 7         1     3    10   27.0 
## 8         1     4     4   10.8 
## 9         1     5     3    8.11
candy.data %>%  
  group_by(chocolate) %>%  
  summarize(nougatcount = sum(nougat), caremelcount =sum(caramel), barcount =sum(bar))
## # A tibble: 2 × 4
##   chocolate nougatcount caremelcount barcount
##       <dbl>       <dbl>        <dbl>    <dbl>
## 1         0           1            4        1
## 2         1           6           10       20

Create a new variable that contains two categories: high, when the sugar content is greter or equal than 0.55 and low, when sugarpercent is lower than 0.55

candy.data2<-candy.data2 %>% mutate(choc_sugar = case_when(sugarpercent>=0.55 ~ 'high',
                                                           sugarpercent<0.55 ~ 'low'))

Export a file

Let’s export the file we just created as a comma-separated file.

write.csv(candy.data2 )
## "","candy_name","chocolate","sugarpercent","winpercent","caramel","choc_sugar"
## "1","100 Grand",1,0.73199999,66.971725,1,"high"
## "2","3 Musketeers",1,0.60399997,67.602936,0,"high"
## "3","Almond Joy",1,0.465,50.347546,0,"low"
## "4","Baby Ruth",1,0.60399997,56.914547,1,"high"
## "5","Charleston Chew",1,0.60399997,38.975037,0,"high"
## "6","HersheyÕs Kisses",1,0.127,55.375454,0,"low"
## "7","HersheyÕs Krackel",1,0.43000001,62.284481,0,"low"
## "8","HersheyÕs Milk Chocolate",1,0.43000001,56.490501,0,"low"
## "9","HersheyÕs Special Dark",1,0.43000001,59.236122,0,"low"
## "10","Junior Mints",1,0.197,57.21925,0,"low"
## "11","Kit Kat",1,0.31299999,76.7686,0,"low"
## "12","Peanut butter M&MÕs",1,0.82499999,71.46505,0,"high"
## "13","M&MÕs",1,0.82499999,66.574585,0,"high"
## "14","Milk Duds",1,0.30199999,55.064072,1,"low"
## "15","Milky Way",1,0.60399997,73.099556,1,"high"
## "16","Milky Way Midnight",1,0.73199999,60.800701,1,"high"
## "17","Milky Way Simply Caramel",1,0.96499997,64.35334,1,"high"
## "18","Mounds",1,0.31299999,47.829754,0,"low"
## "19","Mr Good Bar",1,0.31299999,54.526451,0,"low"
## "20","Nestle Butterfinger",1,0.60399997,70.735641,0,"high"
## "21","Nestle Crunch",1,0.31299999,66.47068,0,"low"
## "22","Peanut M&Ms",1,0.59299999,69.483788,0,"high"
## "23","ReeseÕs Miniatures",1,0.034000002,81.866257,0,"low"
## "24","ReeseÕs Peanut Butter cup",1,0.72000003,84.18029,0,"high"
## "25","ReeseÕs pieces",1,0.40599999,73.43499,0,"low"
## "26","ReeseÕs stuffed with pieces",1,0.98799998,72.887901,0,"high"
## "27","Rolo",1,0.86000001,65.716286,1,"high"
## "28","Sixlets",1,0.22,34.722,0,"low"
## "29","Nestle Smarties",1,0.26699999,37.887188,0,"low"
## "30","Snickers",1,0.546,76.673782,1,"low"
## "31","Snickers Crisper",1,0.60399997,59.529251,1,"high"
## "32","Tootsie Pop",1,0.60399997,48.982651,0,"high"
## "33","Tootsie Roll Juniors",1,0.31299999,43.068897,0,"low"
## "34","Tootsie Roll Midgies",1,0.17399999,45.736748,0,"low"
## "35","Tootsie Roll Snack Bars",1,0.465,49.653503,0,"low"
## "36","Twix",1,0.546,81.642914,1,"low"
## "37","Whoppers",1,0.87199998,49.524113,0,"high"

QUIZ -Now it’s your turn!

1. Import data from the QSS package directly from Github.

First, remember you need to create a dataframe -object- to assign the data downloaded from GitHub. Name the object (your new dataframe) election.raw. And remember to first create an object named election.url and assign to it this link: https://raw.githubusercontent.com/kosukeimai/qss-package/master/data-raw/csv/elections.csv. [hint1: remember to create a chunk below that contains your code for this step] [hint2: Use the dplyr function to read csv files]

data_url<-("https://raw.githubusercontent.com/kosukeimai/qss-package/master/data-raw/csv/elections.csv")
election.raw<-read_csv(data_url)
## Rows: 43494 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): state, county
## dbl (4): year, rep, dem, other
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Explore the imported data:

First explore the number of columns and rows in a new chunk. [hint: use dim]

dim(election.raw)
## [1] 43494     6

Now take a peak at the first rows of your dataframe in a new chunk. [hint: use head()]

head(election.raw)
## # A tibble: 6 × 6
##    year state   county    rep   dem other
##   <dbl> <chr>   <chr>   <dbl> <dbl> <dbl>
## 1  1960 alabama autauga  1149  1324    89
## 2  1960 alabama baldwin  4812  5647   187
## 3  1960 alabama barbour  1166  2148    37
## 4  1960 alabama bibb     1052  1697     4
## 5  1960 alabama blount   2557  3404    19
## 6  1960 alabama bullock   412   757     6

3. Prep your data in a new copy

Insert a new chunk and create a copy of your dataframe: election.clean

election.clean<- election.raw
head(election.clean)
## # A tibble: 6 × 6
##    year state   county    rep   dem other
##   <dbl> <chr>   <chr>   <dbl> <dbl> <dbl>
## 1  1960 alabama autauga  1149  1324    89
## 2  1960 alabama baldwin  4812  5647   187
## 3  1960 alabama barbour  1166  2148    37
## 4  1960 alabama bibb     1052  1697     4
## 5  1960 alabama blount   2557  3404    19
## 6  1960 alabama bullock   412   757     6

Create a new variable with the total of all votes for democrats, republicans and other [hint use mutate to create a new variable totalvotes add these three variables, remember to use the operator: +]

election.clean <- election.clean %>% mutate(totalvotes = rep + dem + other)
head(election.clean)
## # A tibble: 6 × 7
##    year state   county    rep   dem other totalvotes
##   <dbl> <chr>   <chr>   <dbl> <dbl> <dbl>      <dbl>
## 1  1960 alabama autauga  1149  1324    89       2562
## 2  1960 alabama baldwin  4812  5647   187      10646
## 3  1960 alabama barbour  1166  2148    37       3351
## 4  1960 alabama bibb     1052  1697     4       2753
## 5  1960 alabama blount   2557  3404    19       5980
## 6  1960 alabama bullock   412   757     6       1175

Exclude (filter out) the variable “other” from the election.clean dataframe [hint: use select and the operator:-]

election.clean<- election.clean %>% select(-other)
head(election.clean)
## # A tibble: 6 × 6
##    year state   county    rep   dem totalvotes
##   <dbl> <chr>   <chr>   <dbl> <dbl>      <dbl>
## 1  1960 alabama autauga  1149  1324       2562
## 2  1960 alabama baldwin  4812  5647      10646
## 3  1960 alabama barbour  1166  2148       3351
## 4  1960 alabama bibb     1052  1697       2753
## 5  1960 alabama blount   2557  3404       5980
## 6  1960 alabama bullock   412   757       1175

Create a new variable demprop in the election.clean dataframe with the proportion of dem votes out of the total [hint: use dem/totalvotes]

election.clean <- election.clean %>% mutate(demprop = dem/totalvotes)
head(election.clean)
## # A tibble: 6 × 7
##    year state   county    rep   dem totalvotes demprop
##   <dbl> <chr>   <chr>   <dbl> <dbl>      <dbl>   <dbl>
## 1  1960 alabama autauga  1149  1324       2562   0.517
## 2  1960 alabama baldwin  4812  5647      10646   0.530
## 3  1960 alabama barbour  1166  2148       3351   0.641
## 4  1960 alabama bibb     1052  1697       2753   0.616
## 5  1960 alabama blount   2557  3404       5980   0.569
## 6  1960 alabama bullock   412   757       1175   0.644

Now let’s turn that into a percent. Replace the proportions in demprop with the percent [hint: use mutate and the SAME variable name -demprop-, and use demprop*100 to derive the percents]

election.clean <- election.clean %>% mutate(demprop = demprop*100)
head(election.clean)
## # A tibble: 6 × 7
##    year state   county    rep   dem totalvotes demprop
##   <dbl> <chr>   <chr>   <dbl> <dbl>      <dbl>   <dbl>
## 1  1960 alabama autauga  1149  1324       2562    51.7
## 2  1960 alabama baldwin  4812  5647      10646    53.0
## 3  1960 alabama barbour  1166  2148       3351    64.1
## 4  1960 alabama bibb     1052  1697       2753    61.6
## 5  1960 alabama blount   2557  3404       5980    56.9
## 6  1960 alabama bullock   412   757       1175    64.4

Now let’s create a new dataframe election.ca that contains only a subset of our election.clean dataframe. Use filter to keep only observations in California [hint: use filter(variable ==“value”)]

election.ca <- election.clean 

Let’s summarize this data. Create a summary table for your election.ca dataframe that groups observations by county [hint: use group_by and the variable that contains the counties]. the summary table should include the number of total votes, total rep votes and total dem votes [hint: use summarize(sum1 =sum(var1), sum2 =sum(var2), sum3= sum(var3))]

election.ca %>%  
  group_by(county) %>%  
  summarize(totalvotes = sum(totalvotes), totalrepvotes =sum(rep), totaldemvotes =sum(dem))
## # A tibble: 1,835 × 4
##    county    totalvotes totalrepvotes totaldemvotes
##    <chr>          <dbl>         <dbl>         <dbl>
##  1 abbeville     101181         46864         48665
##  2 acadia        300951        153571        126933
##  3 accomack      156255         81121         65891
##  4 ada          1374333        802136        488826
##  5 adair         377151        221346        141565
##  6 adams        2998998       1550243       1296437
##  7 addison       187109         79985         95448
##  8 aiken         570328        358094        190050
##  9 aitkin        103645         45022         53018
## 10 alachua       914130        375357        491492
## # … with 1,825 more rows

###Bonus point:

Try creating a density plot for a subset of the states: using ggplot:

#QUIZ ENDS

A few notes:

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.