Title- Tidyverse Recipes

Introduction

In this assignment we try out some Tidyverse recipes. As part of this, I demonstrate the unnest() function from tidyr, one of the Tidyverse packages.

Load Libraries

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyverse)

## -- Attaching packages -------------------------------------------------------------------------------- tidyverse 1.2.1 --

## v ggplot2 3.0.0     v readr   1.1.1
## v tibble  1.4.2     v purrr   0.2.5
## v tidyr   0.8.1     v stringr 1.3.1
## v ggplot2 3.0.0     v forcats 0.3.0

## -- Conflicts ----------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(tidyr)
library(knitr)
library(kableExtra)

unnest()

unnest() is used when a column holds lists of items and you want each list element to become its own row.

unnest(data, ..., .drop = NA, .id = NULL, .sep = NULL, .preserve = NULL)

data - a data frame.
... - the columns to unnest; defaults to all list-columns.
.drop - whether additional list-columns should be dropped.
.id - data frame identifier; if supplied, creates a new column with that name, giving a unique identifier.
.sep - separator to use in the names of unnested data frame columns, which combine the name of the original list-column with the names from the nested data frame.
.preserve - list-columns to preserve in the output.
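As a minimal sketch of the behaviour described above, the toy tibble below holds a list-column and unnest() turns each list element into its own row. The data and the names films and flat are hypothetical and do not come from the biopics file. The column is passed positionally, which should work in tidyr 0.8.1 (the version loaded here) as well as in later releases, where the remaining arguments of the signature above were changed.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data: one row per film, with a list-column of directors
films <- tibble(
  title = c("Film A", "Film B"),
  director = list(c("Ann", "Bob"), "Cy")
)

# Each element of the list-column becomes its own row;
# the other columns are repeated as needed
flat <- unnest(films, director)
flat
```

"Film A" has two directors, so it appears on two rows of flat, while "Film B" stays on one.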

Example

biopics <- read_csv("https://raw.githubusercontent.com/vijay564/R-Maincode/master/tidyverse_recipies.csv") %>% 
    # Keep only rows whose "director" entry contains a comma, i.e. lists more than one name
    filter(str_detect(director, ".\\,.")) %>%
    # Select a few columns of the dataframe for demonstration purposes
    select(title, country, director)

## Parsed with column specification:
## cols(
##   title = col_character(),
##   site = col_character(),
##   country = col_character(),
##   year_release = col_integer(),
##   box_office = col_character(),
##   director = col_character(),
##   number_of_subjects = col_integer(),
##   subject = col_character(),
##   type_of_subject = col_character(),
##   race_known = col_character(),
##   subject_race = col_character(),
##   person_of_color = col_integer(),
##   subject_sex = col_character(),
##   lead_actor_actress = col_character()
## )
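To see what the filter step keeps, here is a small sketch of the same str_detect() pattern applied to a few sample strings. The values in directors are made up for illustration (only the first resembles an entry in the file), and has_many is a hypothetical name.

```r
library(stringr)

# Hypothetical sample values
directors <- c("Melvin Frank, Norman Panama", "Jane Doe", "Trailing comma,")

# ".\\,." requires a comma with at least one character on each side
has_many <- str_detect(directors, ".\\,.")
has_many
```

Only the first string has a comma with a character on both sides, so it is the only one the filter would keep.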

head(biopics, 3) %>% 
    kable("html")  %>% 
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

title	country	director
Above and Beyond	US	Melvin Frank, Norman Panama
American Splendor	US	Shari Springer Berman, Robert Pulcini
Burning Blue	US	D.M.W. Greer With: Trent Ford, Tammy Blanchard, Morgan Spector

# Unnest the "director" column twice, splitting on two different separators
dir_unnest <- unnest(biopics, director = strsplit(director, ",")) %>% 
    unnest(director = strsplit(director, ":"))

# Remove " With" (a whitespace character followed by the word "With")
dir_unnest$director <- str_replace(dir_unnest$director, "[[:space:]]With", "")

head(dir_unnest, 8) %>% 
    kable("html")  %>% 
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

title	country	director
Above and Beyond	US	Melvin Frank
Above and Beyond	US	Norman Panama
American Splendor	US	Shari Springer Berman
American Splendor	US	Robert Pulcini
Burning Blue	US	D.M.W. Greer
Burning Blue	US	Trent Ford
Burning Blue	US	Tammy Blanchard
Burning Blue	US	Morgan Spector
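
As an alternative sketch, not the approach used above, tidyr's separate_rows() can split a delimited column into rows in one step. The example rebuilds the three rows shown earlier as a small tibble so it runs on its own. The names biopics_sample and dir_rows and the separator regex are my own assumptions; the regex treats both a comma and " With:" as separators and also consumes the surrounding whitespace, so no clean-up step is needed afterwards.

```r
library(tibble)
library(tidyr)

# The three sample rows shown above, typed in by hand
biopics_sample <- tibble(
  title = c("Above and Beyond", "American Splendor", "Burning Blue"),
  country = c("US", "US", "US"),
  director = c(
    "Melvin Frank, Norman Panama",
    "Shari Springer Berman, Robert Pulcini",
    "D.M.W. Greer With: Trent Ford, Tammy Blanchard, Morgan Spector"
  )
)

# Split on either a comma or " With:", swallowing adjacent whitespace
dir_rows <- separate_rows(biopics_sample, director, sep = ",\\s*|\\s+With:\\s*")
dir_rows
```

This yields the same eight title/director pairs as the table above, with no leading spaces left in the director column.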

Tidyverse_Recipies

Vijaya Cherukuri

December 12, 2018