Check-ins: name, how you’re doing, a win for this week
Announcements
Quiz
Go through quiz answers and discuss
Review homework
Break
Group time to discuss story ideas so far
Let’s look for some data
Check out
Grading
Extra credit opportunity: 10 points if you attend at least one of the four sessions of the Arnolt Center’s AI symposium on Thursday. I will be there, so come say hi and I will give you the points. I especially encourage you to attend the 4:15–5:30 p.m. session, “Work smarter: Leveraging AI for investigative journalism.”
Please add the CSVs you imported in Notebook_04 to your Canvas submission for that assignment, so that I can run the notebook with your real data. You won’t be penalized for the ‘late’ submission.
Which function did you use in your homework along with group_by() and mathematical functions (such as sum()) to calculate a statistic for various groupings of data within a dataframe? (1 pt)
a: calculate()
b: summarise()
c: stats()
In R, with boolean data, also known as logical data, FALSE is equivalent to which number? (1 pt)
a: -1
b: 0
c: 1
In “What data can’t do”, how does Hannah Fry repeatedly refer to the complexities of data and math? (1 pt)
a: It’s counting things
b: It’s writing equations
c: It’s using a supercomputer
True or false: In addition to other causes, sea level rise is sometimes caused in part by sinking land masses. (1 pt)
Like the metre, the minute, or the meridian that runs through Greenwich, England, "sea level" is best thought of as a social and historical construct, the result of an inherently arbitrary decision taken by generations of people doing their best to make sense of a strange and chaotic world.
10-year census vs yearly American Community Survey
By default, a join function joins on identically named columns, meaning the variable/column name is the same in both datasets
Or you can use the by argument to tell it that tmax in the x dataframe should be matched to T.MAX in the y dataframe
If your key columns aren’t truly identical (the same value is written differently, or a value appears more than once), the join will do funny things like adding extra rows or columns, or filling values with NA
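A minimal sketch of the points above, using made-up data. The tables x, y and y_dup, and the station and label columns, are hypothetical and exist only for illustration; tmax and T.MAX are the column names from the example above.

```r
library(dplyr)

# Hypothetical example data: two small tables of daily high temperatures.
x <- tibble(
  station = c("A", "B", "C"),
  tmax    = c(31, 28, 35)
)
y <- tibble(
  T.MAX = c(31, 35, 40),
  label = c("warm", "hot", "extreme")
)

# The key columns have different names, so spell out the match with `by`.
# Rows of x with no match in y get NA in the columns that come from y.
joined <- left_join(x, y, by = c("tmax" = "T.MAX"))

# If the key columns shared a name, `by` could be left out and dplyr
# would join on the common column (it prints a message saying which one).
y_same <- rename(y, tmax = T.MAX)
auto_joined <- left_join(x, y_same)

# A key value that appears more than once in y adds rows to the result:
# the x row with tmax = 31 is repeated once per matching y row.
y_dup <- tibble(
  T.MAX = c(31, 31),
  label = c("warm", "mild")
)
more_rows <- left_join(x, y_dup, by = c("tmax" = "T.MAX"))
```

Comparing nrow(x) with nrow(more_rows) after a join is a quick way to notice that rows were added.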
left_join(), right_join(), inner_join(), full_join(): all are dplyr functions
Find it in help under “mutate-joins”
For a diagram and another person’s explanation of joins: https://medium.com/@imanjokko/data-analysis-in-r-series-vi-joining-data-using-dplyr-fc0a83f0f064
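To see how the four join types differ, here is a small sketch with made-up data. The temps and rain tables and their columns are hypothetical; station "B" appears only in temps and station "D" appears only in rain.

```r
library(dplyr)

# Hypothetical example data for comparing join types.
temps <- tibble(station = c("A", "B", "C"), tmax   = c(31, 28, 35))
rain  <- tibble(station = c("A", "C", "D"), precip = c(0.2, 0.0, 1.1))

# Keep every row of temps; precip is NA where rain has no match.
left  <- left_join(temps, rain, by = "station")

# Keep every row of rain; tmax is NA where temps has no match.
right <- right_join(temps, rain, by = "station")

# Keep only the stations found in both tables.
inner <- inner_join(temps, rain, by = "station")

# Keep every station from either table, with NA filling the gaps.
full  <- full_join(temps, rain, by = "station")
```

Printing each result and checking which stations survive is a good way to decide which join fits your story's data.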
Head over to class5_notebook.Rmd
Check in with your group
Let’s find and read in some data
Reading to be announced - will send an email via Canvas
Coding notebook: Due Feb 17 @ 5PM (on Canvas)
Story-based checkpoint:
List five potential datasets you might use in your story (in Potential Data Sources section)
Choose two of those datasets to use in the Selected Data Sources section, and complete the fields for those two datasets (link, methodology, who gathered the data, etc.).
Decoding Climate Change: Unlocking the Power of Programming for Data Journalism