URBAN DATA ANALYSIS, MAPPING AND VISUALIZATION Section A | Fall 2024
Sara Hodges (she/her/hers) | hodgess@newschool.edu
Class: Mondays, 9am - 11:40am
Building: Parsons 2 W 13th St
Room: 1108
Each week:
- Lecture and/or Reading Discussion
- Homework Assignment Questions
- Lab
Ongoing:
- Quantitative Research Project using R, on the topic of your choosing
Canvas
R is a powerful programming language and software environment to handle data. You will use it to:
From R for Data Science by Hadley Wickham & Garrett Grolemund
R is free, and open source
CRAN: software repository for R + gatekeepers for new packages
RStudio: the application we will use to write and run R code, made by Posit (the company formerly named RStudio)
Install R
Install RStudio
I use Quarto to make slides and an e-book for this course.
A Quarto document contains text and executable R code together.
Text in a box is a code chunk, something you can execute in R.
The output from the code chunk displays below the box:
[1] "this is the output from the code chunk"
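The code inside the box is not reproduced above. A minimal sketch of a chunk that would produce this output, assuming a plain print() call (the original chunk may have differed):

```r
# Assumed chunk contents: print() writes the string to the output area
print("this is the output from the code chunk")
```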
The Console is where you can type code that executes immediately, and where you view the output.
Type into your console, and then press enter:
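The expression to type is not shown above; any R expression works. A minimal sketch, with the arithmetic chosen only for illustration:

```r
# Type this at the > prompt and press Enter; R prints the result right away
2 + 3
```

The Console should reply with `[1] 5`. The `[1]` is R's label for the first element of the result.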
Create new objects (variables):
variable_name <- value of the variable
<- is the ‘assignment operator’; it works like an equal sign.
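A short sketch of assignment in practice. The object names and values here are hypothetical, chosen only to illustrate the pattern:

```r
# Hypothetical object names and values, chosen only for illustration
class_size <- 20                      # store a number
course_name <- "Urban Data Analysis"  # store text (a character string)

# Typing an object's name prints its value
class_size

# Objects can be used in calculations and to build new objects
double_size <- class_size * 2
double_size
```

After running these lines, the new objects should appear in the Environment pane.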
Notice
- <- (the assignment operator)
- # tells R not to run the rest of that line, so you can use it to write comments
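A small sketch of how # behaves; the object name total is hypothetical:

```r
# This whole line is a comment, so R skips it
total <- 10 + 5   # R runs the code on the left and ignores this note

# total <- 999    # commented out: this assignment never runs
total
```

Putting # in front of a line is also a quick way to switch code off temporarily without deleting it.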
Use the Source Pane to write scripts so that your work is saved, or to open and run existing scripts.
First, we will set up our files so that they are organized and everyone in the class has the same file structure.
- Right-click the methods1 folder to download to your computer
- Open two File Explorer (Windows) or Finder (Mac) windows
- In one, navigate to your Downloads folder
- Double-click on the *methods1-xxxxx.zip* file to unzip the files; you will now see a methods1 folder in your Downloads folder
- In the other, navigate to the place where you want your methods1 work to live
- Drag the methods1 folder from Downloads into that location
Projects are a good way to keep track of all of the files for a specific task or project. We’ll create projects for each class in this course.
In-class exercise: Create a project
Create Project
Notice
There are lots of useful tabs in this pane
The Files window works like File Explorer (Windows) or Finder (Mac).
Notice
You should be in your part1 folder to see your script
The Plots window displays the charts and maps you create.
The Packages window lists the packages you have installed and provides a user interface to search for other packages and install them.
Packages are collections of functions and datasets developed by the R community to expand the things you can do in R.
Some packages, such as the tidyverse, have become the backbone of analysis in R.
install.packages('tidyverse')
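Installing and loading a package are separate steps: you install once per computer, and load in every new R session. A minimal sketch; the requireNamespace() guard is an addition for illustration so the install is skipped when the package is already present:

```r
# Install once per computer (needs an internet connection)
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Load at the start of every R session or script that uses it
library(tidyverse)
```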
The Help window is where you learn about packages and functions.
Two ways to open documentation:
??readr
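Only one of the two ways survives above. A sketch of the usual lookups, assuming the other is a single ?; mean is used here because it is available without loading any package:

```r
# ? opens the help page for a function in a loaded package
?mean

# help() is the same lookup written as a function call
help("mean")

# ?? searches the help pages of all installed packages for a term
??readr
```

A single ? only finds functions from packages that are currently loaded, so a lookup such as ?read_csv would need library(readr) or library(tidyverse) to be run first.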
The Environment shows all of the objects that you have in your workspace
If you are following along, you should have at least 4 objects in your Environment.
Now we’ll import our first dataset into R using the read_csv function:
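The import call itself is not shown here, and neither is the path to the course's CSV file. The sketch below therefore builds a small stand-in file from the first two rows shown later in this section; demo_path and demo are hypothetical names:

```r
library(readr)

# Stand-in file: four columns from the first two rows shown in this section,
# written to a temporary CSV so the example runs on its own
demo_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "district_id,district,state,enroll",
    "1700105,A-C CENTRAL CUSD 262,Illinois,369",
    "2700106,A.C.G.C. PUBLIC SCHOOL DISTRICT,Minnesota,889"
  ),
  demo_path
)

# read_csv() reads the file and returns a dataframe; <- stores it under a name.
# col_types = "cccd" reads the first three columns as text and enroll as a
# number, so district_id is not converted into a number.
demo <- read_csv(demo_path, col_types = "cccd")
demo
```

In class, the same pattern would point at the course's own CSV file and store the result as ed22.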
Notice
Data tables are called dataframes in R.
Let’s explore our first dataframe by typing some functions in our script:
Rows: 13,025
Columns: 16
$ district_id <chr> "1700105", "2700106", "4500690", "5500030", "4807…
$ district <chr> "A-C CENTRAL CUSD 262", "A.C.G.C. PUBLIC SCHOOL D…
$ state <chr> "Illinois", "Minnesota", "South Carolina", "Wisco…
$ postal <chr> "IL", "MN", "SC", "WI", "TX", "CA", "ID", "MS", "…
$ county <chr> "Cass County", "Meeker County", "Abbeville County…
$ conum <chr> "17017", "27093", "45001", "55019", "48217", "060…
$ enroll <dbl> 369, 889, 2946, 772, 283, 18889, 690, 1043, 3262,…
$ native_enroll <dbl> NA, 4, 3, 1, 0, 41, 5, 0, 75, 267, 0, 6, 37, 2, 1…
$ aapi_enroll <dbl> NA, 6, 11, NA, 0, 6641, 1, 9, 52, 191, 1, 4, 186,…
$ latinx_enroll <dbl> 6, 72, 48, 476, 40, 8583, 420, 0, 1152, 334, 431,…
$ black_enroll <dbl> 4, 1, 984, 8, 4, 1414, 6, 1005, 34, 100, 14, 6, 1…
$ white_enroll <dbl> 351, 770, 1831, 278, 234, 979, 245, 13, 1695, 326…
$ hawpi_enroll <dbl> NA, NA, 2, NA, 0, 98, 1, 0, 8, 19, 0, NA, 30, NA,…
$ two_plus_race_enroll <dbl> 8, 36, 67, 9, 5, 1133, 12, 16, 246, 246, 11, 57, …
$ pct_bipoc <dbl> 0.049, 0.134, 0.378, 0.640, 0.173, 0.948, 0.645, …
$ fte <dbl> 41.50, 68.18, 222.87, 58.97, 23.56, 842.75, 50.11…
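The output above has the layout printed by glimpse() from dplyr; the call itself is not shown, so treat that as an assumption. A sketch of a few exploring functions on a small stand-in dataframe (ed_demo is a hypothetical name; its values are copied from the output above):

```r
library(dplyr)

# Stand-in for ed22: four rows and four columns copied from the output above
ed_demo <- tibble(
  district_id = c("1700105", "2700106", "4500690", "5500030"),
  state       = c("Illinois", "Minnesota", "South Carolina", "Wisconsin"),
  enroll      = c(369, 889, 2946, 772),
  pct_bipoc   = c(0.049, 0.134, 0.378, 0.640)
)

glimpse(ed_demo)          # one line per column: name, type, first values
nrow(ed_demo)             # number of rows
names(ed_demo)            # column names
summary(ed_demo$enroll)   # minimum, quartiles, mean, and maximum of one column
```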
View the dataframe by clicking on it in the Environment pane or typing View(ed22) in the Console.
district_id | district | state | postal | county | conum | enroll | native_enroll | aapi_enroll | latinx_enroll | black_enroll | white_enroll | hawpi_enroll | two_plus_race_enroll | pct_bipoc | fte |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1700105 | A-C CENTRAL CUSD 262 | Illinois | IL | Cass County | 17017 | 369 | NA | NA | 6 | 4 | 351 | NA | 8 | 0.049 | 41.50 |
2700106 | A.C.G.C. PUBLIC SCHOOL DISTRICT | Minnesota | MN | Meeker County | 27093 | 889 | 4 | 6 | 72 | 1 | 770 | NA | 36 | 0.134 | 68.18 |
Notice
When you view the dataframe you can:
- Use the Filter button to view rows by filtering one column

You should always look at the metadata (information about your dataset) so that you understand what you’re looking at and the limitations of the data.
Import data/raw/school_district_demographics_metadata.xlsx to see the definitions of each column.
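The import can also be done in code. A hedged sketch using read_excel() from the readxl package, which is an assumption since the slides may intend RStudio's Import Dataset menu instead; metadata_path and metadata are hypothetical names, and the path is relative to your project folder:

```r
library(readxl)

# Path as given above, relative to the project folder
metadata_path <- "data/raw/school_district_demographics_metadata.xlsx"

# Only read the file if it is where we expect it to be
if (file.exists(metadata_path)) {
  metadata <- read_excel(metadata_path)  # reads the first sheet by default
  metadata
} else {
  message("File not found. Check that your project is open in the right folder.")
}
```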
Now look at the ed22 dataframe to answer some questions by filtering and sorting the dataframe.
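Filtering and sorting can be done with the buttons in the viewer, or in code. A minimal sketch with dplyr's filter() and arrange() on a small stand-in dataframe; ed_demo, high_bipoc and by_enroll are hypothetical names, the values are copied from the first four rows of ed22 shown earlier, and the 0.3 cutoff is arbitrary:

```r
library(dplyr)

# Stand-in for ed22: values copied from the first four rows shown earlier
ed_demo <- tibble(
  district_id = c("1700105", "2700106", "4500690", "5500030"),
  state       = c("Illinois", "Minnesota", "South Carolina", "Wisconsin"),
  enroll      = c(369, 889, 2946, 772),
  pct_bipoc   = c(0.049, 0.134, 0.378, 0.640)
)

# Filter: keep only the rows where pct_bipoc is above 0.3
high_bipoc <- filter(ed_demo, pct_bipoc > 0.3)
high_bipoc

# Sort: order the rows by enrollment, largest first
by_enroll <- arrange(ed_demo, desc(enroll))
by_enroll
```

Neither function changes ed_demo itself; each returns a new dataframe, which is why the results are stored under new names.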
See assignments for week 1 in Canvas.