# Load all necessary packages here
library(ggplot2)
library(dplyr)
library(lubridate)
library(ical)

Setup

Administrative info:

  • Section:
  • My partner:
  • My partner’s question about how they use their time:

How to use this document

First, upload the .Rmd file from Moodle to RStudio. Also be sure to upload 192.ics, which contains some sample data. Once both files are uploaded, open the .Rmd file. Click the “Knit” button at the top of the window to put it all together and see if things work.

Once it’s working, edit some of the commands or text in RStudio and click “Knit” again; a window will pop up showing the updated document. Keep doing this periodically as you make changes, both to see what’s happening and to make sure that what you’ve entered works the way you wanted. Eventually, you can even use this document to generate the write-up you will turn in.

Instructions

This assignment involves the following three key steps:

  1. Collect data:
    1. Identify a question about how you use your time that you feel comfortable sharing with your partner and me. Feel free to make up data.
    2. Start logging time in an electronic calendar app like Google Calendar, macOS Calendar, or Outlook. If you already use a calendar app, we suggest you create a new calendar dedicated to this activity. That way your pre-existing calendars will be kept private.
  2. Exchange data and analyze your partner’s data:
    1. Export your calendar to an .ics file.
    2. Exchange your question and .ics data with your partner.
    3. Import your partner’s calendar data into R.
  3. Write a 400-word reflection piece on this experience. You can write it in Google Docs, include the images you want from this document, and export to PDF. You could also write it all here, and click the right side of the “Knit” button, where there’s a small downward-pointing triangle. You’ll see a menu that lets you select “Knit to PDF”. In particular, make sure you address the following questions.
    1. What was your partner’s question, and what is your answer?
    2. What difficulties in the data collection & analysis process did you encounter?
    3. When providing your data, what expectations do you have of the person receiving your data?
    4. As someone who receives and analyzes others’ data, what are your ethical responsibilities?

You will definitely encounter different types of problems during Steps 1 and 2 above. I therefore recommend you iterate through Steps 1 and 2 in their entirety early and often. That way you can identify and address any problems early on. For a full demonstration of Steps 1 and 2, watch this 6m56s YouTube screencast.

Example assignments

Here is a blog post written by a statistics student who completed a similar assignment. You will notice that her analysis is very in-depth, and that she analyzed her own data.

Tips for data collection

Don’t collect data on everything! That will eventually make you sick of it. Instead, keep your question in mind from the start. What data do you need to answer it? If it turns out you need more than you can collect, it’s ok to narrow the scope of your question.

Importing the calendar

In class, I demonstrated how you can upload this file and the .ics file into RStudio. Once you do this, the code that appears here gets the calendar data into R to analyze. The lines with # briefly explain what’s happening. If there is more analysis or wrangling that you want to do, feel free to ask me how, and I can just give you the code you need. The OPTIONAL lines give instructions for two things you might want to do. You can include them by removing the # symbol. You may have to include a pipe %>% to chain things together if it doesn’t appear.

# DO THIS: Replace "192.ics" with the name of your calendar file here
calendar_data <- "192.ics" %>% 
  # Use ical package to import into R
  ical_parse_df() %>% 
  # Convert to "tibble" data frame format
  as_tibble() %>% 
  # Use lubridate package to wrangle dates, times, and timezones. For a list of
  # timezones run the following command in R: OlsonNames()
  mutate(
    start_datetime = with_tz(start, tzone = "America/New_York"),
    end_datetime = with_tz(end, tzone = "America/New_York"),
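    # Compute duration explicitly in minutes, so the units do not depend on how
    # long the shortest event happens to be
    duration = as.numeric(difftime(end_datetime, start_datetime, units = "mins")),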
    date = floor_date(start_datetime, unit = "day")
  ) %>%
  # Convert calendar entry to all lowercase and rename:
  mutate(activity = tolower(summary)) %>%
  # Do data wrangling to compute number of minutes and hours per day
  group_by(date, activity) %>%
  summarize(duration = sum(duration) %>% as.numeric()) # %>%
  # OPTIONAL: After looking at your data, change the time interval length units
  # if necessary. The line below converts duration from minutes to hours.
  # mutate(hours = duration/60)# %>% 
  # OPTIONAL: Filter out rows to only include certain dates:
  # filter("2019-09-01" <= date, date <= "2019-09-06")
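If you enable both OPTIONAL lines, the end of the pipeline might look like the sketch below. It does not read an .ics file. Instead it uses a small stand-in table, `sample_data`, with the same columns and values as the sample output shown in this document; the names `sample_data` and `with_optional_steps` are made up for illustration. The date limits are written with `as.POSIXct()` and an explicit timezone (UTC here, as an assumption) so that the comparison does not depend on your computer’s timezone setting.

```r
library(dplyr)

# Stand-in for calendar_data: same columns, duration in minutes.
sample_data <- tibble(
  date = as.POSIXct(
    c("2019-09-02", "2019-09-02", "2019-09-03", "2019-09-04", "2019-09-04",
      "2019-09-05", "2019-09-06", "2019-09-06", "2019-09-07", "2019-09-07"),
    tz = "UTC"
  ),
  activity = c("sleep", "study", "exercise", "sleep", "study",
               "sleep", "exercise", "study", "exercise", "sleep"),
  duration = c(480, 60, 60, 960, 180, 540, 30, 90, 30, 540)
)

with_optional_steps <- sample_data %>%
  # OPTIONAL step 1: add a column with the duration in hours
  mutate(hours = duration / 60) %>%
  # OPTIONAL step 2: keep only rows within a range of dates
  filter(
    as.POSIXct("2019-09-01", tz = "UTC") <= date,
    date <= as.POSIXct("2019-09-06", tz = "UTC")
  )

with_optional_steps
```

Note that each step is joined to the previous one with `%>%`, and that the last step has no `%>%` after it.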

This line makes sure your data was uploaded by displaying the first few rows:

calendar_data
date activity duration
2019-09-02 sleep 480
2019-09-02 study 60
2019-09-03 exercise 60
2019-09-04 sleep 960
2019-09-04 study 180
2019-09-05 sleep 540
2019-09-06 exercise 30
2019-09-06 study 90
2019-09-07 exercise 30
2019-09-07 sleep 540

Using glimpse() from the dplyr package gives you an alternative look at your data. It also gives you the type of data each column is: <dttm> being date-time, <chr> being character (i.e. text), and <dbl> being double i.e. decimal numerical values.

glimpse(calendar_data)
## Rows: 10
## Columns: 3
## Groups: date [6]
## $ date     <dttm> 2019-09-02, 2019-09-02, 2019-09-03, 2019-09-04, 2019-09-04,…
## $ activity <chr> "sleep", "study", "exercise", "sleep", "study", "sleep", "ex…
## $ duration <dbl> 480, 60, 60, 960, 180, 540, 30, 90, 30, 540

RStudio’s spreadsheet viewer lets you see all the data at once. In the top-right window labeled “Environment”, if you click “calendar_data” it will open up a spreadsheet. Note that because this code chunk is set to eval=FALSE, R Markdown will not “evaluate” it when you knit; the chunk is simply skipped.

View(calendar_data)

Analysis

Here you can play around with the data as you like. Remember that when you want to see the result of your code, you can click “Knit” and RStudio will show you the output.

The space below is for you to make a rough draft of your write-up. You could write everything here, click the black downward-pointing arrow next to “Knit”, choose “Knit to PDF”, and submit that. Or you could write everything up in a Google Doc, copy and paste your images from this document, export as a PDF, and submit that.

Describe the question here.

Describe data visualization #1 (that you’ll create below) here:

# Write your code to create data visualization #1 here. Be sure to label your
# axes and include a title to explain your graphic. 
ggplot(calendar_data, mapping=aes(x=date, y=duration, fill=activity))+geom_col()
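The comment above asks for axis labels and a title. One way to add them is with `labs()` from ggplot2, as in this sketch. It runs on a stand-in table, `sample_data`, built from the sample output shown in this document, and the label text is only a suggestion; the y-axis label assumes that duration is measured in minutes.

```r
library(ggplot2)
library(dplyr)

# Stand-in for calendar_data: same columns, duration in minutes.
sample_data <- tibble(
  date = as.POSIXct(
    c("2019-09-02", "2019-09-02", "2019-09-03", "2019-09-04", "2019-09-04",
      "2019-09-05", "2019-09-06", "2019-09-06", "2019-09-07", "2019-09-07"),
    tz = "UTC"
  ),
  activity = c("sleep", "study", "exercise", "sleep", "study",
               "sleep", "exercise", "study", "exercise", "sleep"),
  duration = c(480, 60, 60, 960, 180, 540, 30, 90, 30, 540)
)

# Same stacked bar chart as above, plus a title and readable labels.
p <- ggplot(sample_data, mapping = aes(x = date, y = duration, fill = activity)) +
  geom_col() +
  labs(
    title = "Time logged per day, by activity",
    x = "Date",
    y = "Duration (minutes)",
    fill = "Activity"
  )

# Typing the name of the plot displays it.
p
```

To use this with your partner’s data, replace `sample_data` with `calendar_data` in the `ggplot()` call.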

Describe data visualization #2 (that you’ll create below) here:

# Write your code to create data visualization #2 here. Be sure to label your
# axes and include a title to explain your graphic. 
sleephours <- calendar_data %>% filter(activity=="sleep")
ggplot(sleephours, mapping=aes(x=date, y=duration))+geom_line()

Notice that this line graph looks strange; it’s because we’re missing data for Sept. 3 and Sept. 6! There was no “sleep” event for ggplot to use. Keep this in mind as you collect data.
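If your partner’s data has gaps like this, one possible approach is to make the missing days explicit before plotting. The sketch below builds a table with one row per calendar day and joins the sleep data onto it, so days without a “sleep” event get a duration of `NA`. The line then breaks at those days instead of being drawn straight across them, and `geom_point()` marks the days that do have data. The `sleephours` table here is a stand-in copied from the sample output above, and the names `all_days` and `sleep_by_day` are made up for illustration. ggplot may print a warning about removed rows; that refers to the `NA` days.

```r
library(ggplot2)
library(dplyr)

# Stand-in for the filtered sleep data: no rows for Sept. 3 or Sept. 6.
sleephours <- tibble(
  date = as.POSIXct(
    c("2019-09-02", "2019-09-04", "2019-09-05", "2019-09-07"),
    tz = "UTC"
  ),
  activity = "sleep",
  duration = c(480, 960, 540, 540)
)

# One row for every day between the first and last logged date.
all_days <- tibble(
  date = seq(min(sleephours$date), max(sleephours$date), by = "day")
)

# Days without a sleep event end up with duration = NA.
sleep_by_day <- all_days %>%
  left_join(sleephours, by = "date")

p <- ggplot(sleep_by_day, mapping = aes(x = date, y = duration)) +
  geom_line() +
  geom_point()

p
```

Whether a missing day should stay `NA` or be counted as zero depends on what it means in your partner’s data: a forgotten entry is not the same thing as no sleep.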

Describe how these visualizations answer the question here.