2026-4-20

Groups!

##           group 1           group 2           group 3         group 4         group 5
## 1 Kang, Christine     Moore, Allana   Randall, Javion                    Barga, Jolie
## 2     Ong, Alyssa Mahoney, Brigette       Qin, Celine Batson, Anthony Bell, Mary Rose
## 3   Leahy, Olivia      Myoung, Sein       Smith, Reid    Mendoza, Ava  Knowles, Genny
## 4  Devir, Lindsey       Pham, Canon Wolfenstein, Luci   Pacheco, Alex Kang, Christine

Warm-up

  • Using the definition provided, what data could you gather or use to measure gentrification in Bay Area neighborhoods?
  • Consider both custom-made and ready-made sources

Today’s Class

  • Warm-up: neighborhood change
  • Gathering data in the 21st century
  • Activity: data modeling
  • Data models

Wednesday’s Class

  • Using APIs
  • Web Scraping
  • Data Models
  • Prediction

Office Hours

  • Fridays, 1:30pm-3:00pm (Tyler)
  • Tuesdays, 10:30am-12:00pm (Yao)

Miscellaneous

  • Week 4 optional readings
  • Final project proposal upcoming!
  • Topic/methods open

Gathering Data in the Twenty-first Century

Traditional Methods for Studying Neighborhood Change

  • What are traditional methods for studying neighborhood change?
  • Probabilistic surveys! (recruit respondents on social media)
  • Census (use R package to pull data)
  • Qualitative interviews/ethnographic observations (observe online behavior)
  • Question: Does computation change the way we do these methods?

New Methods for Studying Neighborhood Change

  • What are new methods for studying neighborhood change?
  • Wiki surveys (flexible surveys with user input), e.g., allourideas.org
  • Ecological Momentary Assessments (survey people in real time)
  • Gamification (fun surveys)

New (and old) Methods for Studying Neighborhood Change

  • Use Application Programming Interfaces (APIs) to gather census data (e.g., the tidycensus R package)
  • Open-access measures (e.g., the Urban Displacement Project)
  • Wiki surveys (flexible surveys with user input), e.g., allourideas.org
  • Gathering text from online sources (scraping)
  • Gathering images or data from online sources
  • Link surveys to gathered data
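
As a rough sketch of what an API request looks like under the hood, the Python snippet below builds (but does not send) a query URL for the U.S. Census Bureau's data API. The dataset (2022 ACS 5-year estimates), the variable code B01003_001E (total population), and the state code 06 (California) are assumptions picked for illustration, and build_census_url is a made-up helper name. Packages such as tidycensus wrap requests of this kind, and heavier use of the real API generally calls for an API key.

```python
from urllib.parse import urlencode

# Assumed endpoint: the 2022 American Community Survey 5-year estimates.
BASE_URL = "https://api.census.gov/data/2022/acs/acs5"


def build_census_url(variables, geography, within):
    """Return a query URL as a string (nothing is sent over the network)."""
    params = {
        "get": ",".join(variables),
        "for": geography,
        "in": within,
    }
    # safe=":*," keeps colons, asterisks, and commas readable in the URL.
    return BASE_URL + "?" + urlencode(params, safe=":*,")


url = build_census_url(
    variables=["NAME", "B01003_001E"],  # B01003_001E: total population
    geography="county:*",               # every county...
    within="state:06",                  # ...in California (FIPS code 06)
)
print(url)
```

Pasting the printed URL into a browser should return one row per California county; swapping the variable code or geography changes what comes back.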

New (and old) Methods for Studying Neighborhood Change

  • Takeaway: many computational social science approaches blur the lines between new and old methods
  • Let’s look at some examples!

Gentrification

  • Much debate about occurrence and extent of gentrification
  • However: empirical evidence of neighborhood change is limited (surveys, census data)

Using Google Street View to Study Gentrification

  • Hwang’s solution: use Google Street View to look at neighborhood change
  • Combined these data with Census data
  • Compared these estimates to earlier Chicago gentrification estimates

Using Google Street View to Study Gentrification

  • In pairs: discuss whether a Google Street View approach would be effective for studying gentrification in the Bay Area

Incarceration in the US

  • The US incarceration system is large (more on this soon)
  • Little is known about how people re-enter society after incarceration

Using Cell Phone Surveys to Understand Re-Integration

  • Sugie’s solution: provide cell phones to study participants leaving prison
  • Conduct “Ecological Momentary Assessments” (daily surveys) to assess well-being, job search, and more

Using Cell Phone Surveys to Understand Re-Integration

  • In pairs: discuss how Sugie’s approach helps us learn more about human behaviors, and any potential challenges to gathering data this way

Data Modeling activity

Data Modeling

  • Plot some (or all) of the incarceration data
  • How would you predict what rates will be in 2030?

Data Modeling

Why Model Data?

  • As social scientists, we often want to go beyond descriptions of social processes
  • We may want to make explanations/generalizations or even predictions
  • In our incarceration data, write an example of:
  1. A description
  2. An explanation/generalization
  3. A prediction

Why Model Data?

  1. Description: In the U.S., between 1970 and 2010, the incarceration rate increased from 161 to 731 (persons per 100,000).
  2. Explanation/Generalization: As time passes, modern societies increasingly rely on carceral solutions to social problems, and incarceration rates increase.
  3. Prediction: In 2030, the U.S. incarceration rate will reach 1,000 (persons per 100,000).

Linear Models

  • Estimate the line that minimizes the sum of squared “residuals” (the vertical gaps between each observed point and the line)
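
Written out, for observations (x_i, y_i) and a candidate line with intercept β0 and slope β1 (notation chosen here for illustration, not taken from the slides), the residual for each point and the least-squares criterion are as follows. The last line gives the standard closed-form solution for a single predictor.

```latex
e_i = y_i - (\beta_0 + \beta_1 x_i), \qquad
(\hat\beta_0, \hat\beta_1) = \arg\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n} e_i^2

\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}
```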

Linear Models

  • We could calculate a line of best fit from our incarceration data
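
A minimal Python sketch of that calculation follows. The only data points it uses are the two rates quoted in these slides (161 in 1970 and 731 in 2010, per 100,000); the full incarceration series from class would go in their place. With just two points the fitted line passes through both exactly, so treat the 2030 figure as an illustration of the mechanics rather than a forecast. The function name fit_line is made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# The two rates quoted in the slides (per 100,000); replace with the full series.
years = [1970, 2010]
rates = [161, 731]

intercept, slope = fit_line(years, rates)
predicted_2030 = intercept + slope * 2030  # extend the line 20 years past the data

print(f"slope: {slope:.2f} per year")                          # slope: 14.25 per year
print(f"predicted 2030 rate: {predicted_2030:.0f} per 100,000")  # 1016 per 100,000
```

Extending a line beyond the observed years assumes the past trend continues unchanged, which is exactly the assumption worth questioning when comparing models.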

Non-Linear Models

  • Polynomials: quadratic, cubic, etc.
  • Tree-based models (e.g., random forests)
  • Neural networks
  • And more

Splitting Our Data

  • Data scientists often split their data into training and test sets
  • Goal: choose the model that is likely to predict well out of sample
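
One way to sketch the idea in Python, using only the standard library. The data here are synthetic (generated from a straight line plus noise), and the 75/25 split, the seeds, and the helper names are arbitrary choices for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and cut it into (training, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(points):
    """Least-squares (intercept, slope) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(points, intercept, slope):
    """Average squared gap between observed y and the line's prediction."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points) / len(points)


# Synthetic data: y = 2x + 5 plus random noise (not real measurements).
rng = random.Random(42)
data = [(x, 2 * x + 5 + rng.gauss(0, 3)) for x in range(40)]

train, test = train_test_split(data)
intercept, slope = fit_line(train)  # the model only ever sees the training set
print("train MSE:", mean_squared_error(train, intercept, slope))
print("test MSE: ", mean_squared_error(test, intercept, slope))
```

Repeating the last three lines for each candidate model (say, a line versus a quadratic) and comparing test errors is what guides the choice: a model that fits the training set well but does much worse on the test set is unlikely to predict well out of sample.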

Choosing the Right Model

  • What is the structure of our data? (numeric, character, binary, etc.)
  • What do we want to do (describe, explain, predict)?

More on Modeling

  • This class is short!
  • Optional readings have more information on types of modeling

Final Project Proposal!

Final Project Proposal

  • Writing Exercise: describe one idea for a final project
  • What data will you use? (traditional, new, mix)
  • What is the structure of your data (numeric, character, binary, etc.)?
  • What do you want to do (describe, explain, predict)?
  • What type of model will you use (none, linear, non-linear)?