Due Date: 11:59pm, Oct 25

Group Homework

  • You will work with your group to complete this assignment.

  • Upload your HTML file to RPubs and include the link when you submit your files on Collab.

  • Submit your group’s shared .Rmd AND “knitted” .html files on Collab.

  • Note that this HTML file has been uploaded to RPubs.

  • Your “knitted .html” submission must be created from your “group .Rmd” and must be knitted on your own computer.

  • Confirm this by including the following comment in your submission text box: “Honor Pledge: I have recreated my group submission using the tools I have installed on my own computer.”

  • Name the files with a group name and YOUR name for your submission.

  • Each group member must be able to submit this assignment as created from their own computer. If only some members of the group submit the required files, those group members must additionally provide a supplemental explanation along with their submission as to why other students in their group have not completed this assignment.

Part 1

Part 1: Instruction

  • Use the EuStockMarkets data, which contains the daily closing prices of major European stock indices: Germany DAX (Ibis), Switzerland SMI, France CAC, and UK FTSE. Then, create a plot with one line per index showing how each index’s daily closing price changes over time.

  • Please use the function gather from package tidyr to transform the data from a wide to a long format. For more info, refer to our lecture materials on data formats (i.e., DS3003_dataformat_facets_note.pdf, DS3003_dataformat_facets_code.rmd, or DS3003_dataformat_facets_code.html).

  • Use function plot_ly from package plotly to create a line plot.

Part 1: Results

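library(dplyr)   # %>% and mutate
library(tidyr)   # gather
library(plotly)  # plot_ly

stocks <- as.data.frame(EuStockMarkets) %>%
  gather(index, price) %>% # wide to long: one row per index and trading day
  mutate(time = rep(as.numeric(time(EuStockMarkets)), 4)) # repeat the time axis once per index
plot_ly(stocks, x = ~time, y = ~price, color = ~index, type = "scatter", mode = "lines")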
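
The code above relies on the repeated time vector lining up with the stacked prices. gather stacks the columns one after another, so repeating the time axis once per column should match. The sketch below checks this under that assumption; the names `wide` and `stocks_check` are introduced here only for illustration.

```r
library(tidyr)

# Rebuild the long table without pipes so each step is easy to inspect
wide <- as.data.frame(EuStockMarkets)
stocks_check <- gather(wide, index, price)
stocks_check$time <- rep(as.numeric(time(EuStockMarkets)), ncol(wide))

# Each index should own one full copy of the time axis
table(stocks_check$index)

# Every index should start at the same time stamp
tapply(stocks_check$time, stocks_check$index, min)
```

If the counts per index differ, or the starting time stamps disagree, the time column is misaligned and the lines in the plot would be shifted.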

Part 2

Part 2: Instruction

  • Use a dataset from a data repository (e.g., Kaggle) that records measurements under different conditions, like the iris data. For more info on the iris data, run ?iris.

  • Briefly describe the dataset you’re using for this assignment (e.g., how to access the data, context, sample, and variables).

  • Transform the dataset from a wide to a long format. Produce any ggplot where the key variable is used in function facet_grid or facet_wrap.

  • One group member will present the R code and plots for Part 2 in class on Oct. 26 (Tue). If you are the presenter, please e-mail your RPubs link to the instructor by 11:59pm, Oct 25.
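
The wide-to-long and faceting steps in the instructions can be sketched on the built-in iris data that the instructions mention. This is only an illustration of the pattern, not the required solution; the names `iris_long`, `measure`, and `value` are chosen here for the example.

```r
library(tidyr)
library(ggplot2)

# iris is wide: four measurement columns plus Species
iris_long <- gather(iris, key = measure, value = value,
                    Sepal.Length:Petal.Width)

# One panel per measurement; the key column drives the facets
p <- ggplot(iris_long, aes(x = Species, y = value)) +
  geom_boxplot() +
  facet_wrap(~ measure)
p
```

Swapping facet_wrap(~ measure) for facet_grid(~ measure) lays the same panels out in a single row.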

Part 2: Data Description

This dataset came from Kaggle; it was originally collected by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network. It records penguin attributes such as bill length, bill depth, flipper length, body mass, and sex, along with species, island, and year. There are 344 rows and 9 columns, including the row-index column X.

Part 2: Results

peng <- read.csv("penguins.csv") # file obtained from Kaggle
# Stack the three length measurements into key/value columns
longpeng <- gather(peng, key = peng_att, value = measurement,
                   bill_length_mm:flipper_length_mm)

head(peng)
##   X species    island bill_length_mm bill_depth_mm flipper_length_mm
## 1 1  Adelie Torgersen           39.1          18.7               181
## 2 2  Adelie Torgersen           39.5          17.4               186
## 3 3  Adelie Torgersen           40.3          18.0               195
## 4 4  Adelie Torgersen             NA            NA                NA
## 5 5  Adelie Torgersen           36.7          19.3               193
## 6 6  Adelie Torgersen           39.3          20.6               190
##   body_mass_g    sex year
## 1        3750   male 2007
## 2        3800 female 2007
## 3        3250 female 2007
## 4          NA   <NA> 2007
## 5        3450 female 2007
## 6        3650   male 2007

head(longpeng)
##   X species    island body_mass_g    sex year       peng_att measurement
## 1 1  Adelie Torgersen        3750   male 2007 bill_length_mm        39.1
## 2 2  Adelie Torgersen        3800 female 2007 bill_length_mm        39.5
## 3 3  Adelie Torgersen        3250 female 2007 bill_length_mm        40.3
## 4 4  Adelie Torgersen          NA   <NA> 2007 bill_length_mm          NA
## 5 5  Adelie Torgersen        3450 female 2007 bill_length_mm        36.7
## 6 6  Adelie Torgersen        3650   male 2007 bill_length_mm        39.3
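
Row 4 in the output above carries its NA into the long table. If such rows are not wanted, gather has an na.rm argument. A minimal sketch on a small hand-typed data frame follows; `toy`, `long_all`, and `long_clean` are hypothetical names, and the values are copied from rows 3 and 4 of the head(peng) output.

```r
library(tidyr)

# Rows 3 and 4 of the head(peng) output, typed in by hand for illustration
toy <- data.frame(
  species = c("Adelie", "Adelie"),
  bill_length_mm = c(40.3, NA),
  bill_depth_mm = c(18.0, NA),
  flipper_length_mm = c(195, NA)
)

# Default: NA measurements are kept as rows
long_all <- gather(toy, key = peng_att, value = measurement,
                   bill_length_mm:flipper_length_mm)

# na.rm = TRUE drops rows whose measurement is NA
long_clean <- gather(toy, key = peng_att, value = measurement,
                     bill_length_mm:flipper_length_mm, na.rm = TRUE)

nrow(long_all)
nrow(long_clean)
```

Dropping the NA rows before plotting should also avoid the warning ggplot2 gives when it removes non-finite values.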

ggplot(longpeng, aes(x=species, y=measurement)) + geom_boxplot() +
   facet_grid(~ peng_att) + theme_classic()

Flipper length is plotted separately from the other two measurements because its values are much larger than those of bill depth and bill length, which compresses the bill boxplots when all three share one y-axis.
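
An alternative to splitting the data is to keep all three measurements in one plot and let each panel use its own y-axis. The sketch below assumes this is acceptable for the assignment; `toy`, `toy_long`, and `p_free` are hypothetical names, and the data frame is a hand-typed stand-in for penguins.csv using values from the head(peng) output.

```r
library(tidyr)
library(ggplot2)

# Hand-typed stand-in for penguins.csv (non-missing rows from head(peng))
toy <- data.frame(
  species = "Adelie",
  bill_length_mm = c(39.1, 39.5, 40.3, 36.7, 39.3),
  bill_depth_mm = c(18.7, 17.4, 18.0, 19.3, 20.6),
  flipper_length_mm = c(181, 186, 195, 193, 190)
)
toy_long <- gather(toy, key = peng_att, value = measurement,
                   bill_length_mm:flipper_length_mm)

# scales = "free_y" gives each panel its own y-axis range
p_free <- ggplot(toy_long, aes(x = species, y = measurement)) +
  geom_boxplot() +
  facet_wrap(~ peng_att, scales = "free_y") +
  theme_classic()
p_free
```

The trade-off is that box heights can no longer be compared across panels, since each panel has a different axis range.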

longpeng2 <- gather(peng, key = peng_att, value = measurement,
                    bill_length_mm:bill_depth_mm)

ggplot(longpeng2, aes(x=species, y=measurement)) + geom_boxplot() +
   facet_wrap(~ peng_att) + theme_classic()

longpeng3 <- gather(peng, key = peng_att, value = measurement,
                    flipper_length_mm)

ggplot(longpeng3, aes(x=species, y=measurement)) + geom_boxplot() +
   facet_wrap(~ peng_att) + theme_classic()