Due Date: 11:59pm, Oct 25
Your “knitted .html” submission must be knitted from your “group .Rmd” on your own computer.
Confirm this with the following comment included in your submission text box: “Honor Pledge: I have recreated my group submission using the tools I have installed on my own computer”
Name the files with a group name and YOUR name for your submission.
Each group member must be able to submit this assignment as created from their own computer. If only some members of the group submit the required files, those group members must additionally provide a supplemental explanation along with their submission as to why other students in their group have not completed this assignment.
Use the EuStockMarkets data that contains the daily closing prices of major European stock indices: Germany DAX (Ibis), Switzerland SMI, France CAC, and UK FTSE. Then, create multiple lines that show changes of each index’s daily closing prices over time.
Please use function gather from package tidyr to transform the data from a wide to a long format. For more info, refer to our lecture materials on data formats (i.e., DS3003_dataformat_facets_note.pdf, DS3003_dataformat_facets_code.rmd, or DS3003_dataformat_facets_code.html).
Use function plot_ly from package plotly to create a line plot.
library(tidyr) # load tidyr package
library(plotly) # load plotly package
data(EuStockMarkets) # load EuStockMarkets
dat <- as.data.frame(EuStockMarkets) # coerce it to a data frame
dat$time <- time(EuStockMarkets) # add `time` variable
# add your codes
# use gather to transform data from wide to long format
long_dat <- dat %>% gather(StockIndex, Price, c(DAX, SMI, CAC, FTSE))
head(long_dat)
##       time StockIndex   Price
## 1 1991.496        DAX 1628.75
## 2 1991.500        DAX 1613.63
## 3 1991.504        DAX 1606.51
## 4 1991.508        DAX 1621.04
## 5 1991.512        DAX 1618.16
## 6 1991.515        DAX 1610.61
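The same wide-to-long reshape can also be written with pivot_longer, tidyr's newer counterpart to gather. The following is only a sketch of that alternative, not the method the assignment asks for. Two things here are assumptions rather than part of the original code: the name long_alt, and storing time as a plain numeric (decimal year) instead of a ts object.

```r
library(tidyr)

data(EuStockMarkets)
dat <- as.data.frame(EuStockMarkets)
# Assumption: keep time as a plain numeric vector (decimal years)
dat$time <- as.numeric(time(EuStockMarkets))

# pivot_longer() stacks the four index columns into name/value pairs
long_alt <- pivot_longer(
  dat,
  cols = c(DAX, SMI, CAC, FTSE),
  names_to = "StockIndex",
  values_to = "Price"
)

# Sanity check: each of the 4 index columns contributes one row per day
nrow(long_alt) == 4 * nrow(dat)
```

Unlike gather, pivot_longer returns the rows interleaved by day rather than stacked index by index, but each index's rows are still in time order.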
# use plot_ly to create a line plot and show multiple lines that show
# changes in each index's daily closing prices over time
line_plot <- plot_ly(x = long_dat$time, y = long_dat$Price,
                     type = 'scatter', mode = 'lines',
                     color = long_dat$StockIndex) %>%
  layout(title = "Daily Closing Prices vs. Time",
         xaxis = list(title = 'Time'),
         yaxis = list(title = 'Price'),
         legend = list(title = list(text = 'Stock Index')))
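As a possible variation on the plot above, plot_ly also accepts a data frame together with formulas (~column), which avoids repeating long_dat$ for every aesthetic. This is a sketch under the same setup as the code above, except that time is stored as a plain numeric vector, which is an assumption made here for simplicity.

```r
library(tidyr)
library(plotly)

data(EuStockMarkets)
dat <- as.data.frame(EuStockMarkets)
dat$time <- as.numeric(time(EuStockMarkets))  # assumption: numeric decimal year
long_dat <- gather(dat, StockIndex, Price, c(DAX, SMI, CAC, FTSE))

# Pass the data frame once and refer to its columns with formulas
line_plot <- plot_ly(
  long_dat,
  x = ~time, y = ~Price, color = ~StockIndex,
  type = "scatter", mode = "lines"
) %>%
  layout(
    title = "Daily Closing Prices vs. Time",
    xaxis = list(title = "Time"),
    yaxis = list(title = "Price"),
    legend = list(title = list(text = "Stock Index"))
  )
```

Evaluating line_plot on a line of its own in an R Markdown chunk renders the interactive plot in the knitted output.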
Use a dataset from a data repository (e.g., Kaggle) that gives measurements under different conditions, like the iris data. For more info on the iris data, use ?iris.
Briefly describe the dataset you’re using for this assignment (e.g., means to access data, context, sample, variables, etc…).
Transform the dataset from a wide to a long format. Produce any ggplot where the key variable is used in function facet_grid or facet_wrap.
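For illustration only, the two steps above can be sketched on the built-in iris data that the prompt mentions. The names long_iris, measure, value, and iris_plot are chosen here and are not part of the assignment.

```r
library(tidyr)
library(ggplot2)

# iris is wide: one column per measurement, plus Species
long_iris <- gather(iris, key = "measure", value = "value", -Species)

# The key variable (measure) defines the facets: one panel per measurement
iris_plot <- ggplot(long_iris, aes(x = Species, y = value)) +
  geom_boxplot() +
  facet_wrap(~ measure)
```

facet_wrap lays the panels out in a grid and is convenient for a single key variable; facet_grid(~ measure) would place them in one row instead.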
One of the group members will present R codes and plots for Part 2 in class on Oct. 26 (Tue). Please e-mail the instructor with your RPubs link if you’re a presenter by 11:59pm, Oct 25.
This dataset is titled penguins and was collected by Dr. Kristen Gorman and the Palmer Station in Antarctica. The dataset can be found at this link. After rows with missing values are removed, the dataset as loaded here has 333 observations and 9 columns: a row index (X) plus the variables species, island, bill length, bill depth, flipper length, body mass, sex, and year. The variable species is similar to the Species variable in the iris dataset in that it defines the different conditions (Adélie, Chinstrap, and Gentoo). The variables bill length, bill depth, and flipper length are measurements in mm recorded for each penguin. This dataset is in wide form and can be transformed into long form with the gather() function.
# add your codes
penguins <- read.csv('penguins.csv')
# omit NA values
penguins <- na.omit(penguins)
# gather data into long format based on species (similar to iris ex.)
long_penguins <- penguins %>% gather(key=penguin_att,
value=measurement, c(bill_length_mm, bill_depth_mm,
flipper_length_mm))
head(long_penguins)
##   X species    island body_mass_g    sex year    penguin_att measurement
## 1 1  Adelie Torgersen        3750   male 2007 bill_length_mm        39.1
## 2 2  Adelie Torgersen        3800 female 2007 bill_length_mm        39.5
## 3 3  Adelie Torgersen        3250 female 2007 bill_length_mm        40.3
## 4 5  Adelie Torgersen        3450 female 2007 bill_length_mm        36.7
## 5 6  Adelie Torgersen        3650   male 2007 bill_length_mm        39.3
## 6 7  Adelie Torgersen        3625 female 2007 bill_length_mm        38.9
plot <- ggplot(long_penguins,
aes(x=species, y=measurement)) +
geom_boxplot() +
facet_grid(~ penguin_att) +
theme_classic() +
labs(title = "Penguin Attribute Measurements for Each Species")