Assignment 3

Instructions

Cornelia has the following explicit criteria for her college:

* A violent crime rate below the median of the dataset.
* Accredited by the Southern Association of Colleges and Schools Commission on Colleges.
* The midpoint of SAT scores at the institution (math) above the median of the dataset.

You’ll need to drop any data that has NA values for the three criteria she provided.

Cornelia would like a list of colleges showing the name, city, violent crime rate, midpoint SAT math score, and accrediting agency for colleges that meet her criteria, sorted by the midpoint SAT math score from highest to lowest. She would also like a count of the number of institutions that meet her criteria.

Data Import

  1. Crime Data
library(readr)
library(dplyr)
library(tidyverse)
Crime_2015 <- read_csv("Crime_2015.csv")
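# Preview the dataset. Without a wt argument, top_n() ranks by the last column (City),
# so this shows the 5 rows with the highest City values, not the first 5 rows of the file.
top_n(Crime_2015,5)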
## # A tibble: 5 x 12
##   MSA   ViolentCrime Murder  Rape Robbery AggravatedAssau… PropertyCrime
##   <chr>        <dbl>  <dbl> <dbl>   <dbl>            <dbl>         <dbl>
## 1 Worc…         435.    1.6  31.8    73               328.         1746.
## 2 Yaki…         293.    8.4  31.3    77               177.         3537.
## 3 York…         218.    4.3  23.3    64.3             126.         1551.
## 4 Yuba…         372.    3.5  37      81.6             250.         2589.
## 5 Yuma…         327.    4.9  34.6    44.8             243.         2250.
## # … with 5 more variables: Burglary <dbl>, Theft <dbl>,
## #   MotorVehicleTheft <dbl>, State <chr>, City <chr>
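
When `top_n()` is called without a weighting column, it ranks rows by the last column of the table, whereas `head()` returns rows in file order. The sketch below contrasts the two on a small invented data frame; `df` and its values are hypothetical and are not taken from Crime_2015.csv.

```r
library(dplyr)

# Invented data for illustration only (not from Crime_2015.csv)
df <- data.frame(
  id = 1:6,
  City = c("Akron", "Yuma", "Boise", "York", "Dayton", "Erie"),
  stringsAsFactors = FALSE
)

# First two rows in file order
firstRows <- head(df, 2)

# Two rows with the highest values in the last column (City)
lastCities <- top_n(df, 2)

firstRows
lastCities
```

If the goal is simply to preview the data, `head()` is the more direct choice.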
  2. College Data
library(readr)
library(dplyr) 
CollegeScorecard <- read_csv("CollegeScorecard.csv") %>% rename(City=CITY)
# Join the two tables by city name (the join ignores state, so same-named cities in different states will also match)
collegesTable <- Crime_2015 %>% inner_join(CollegeScorecard, by="City")
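
Joining on the city name alone can pair a college with a crime record from a different state when two cities share a name. The sketch below shows the effect on invented data and one possible way to tighten the join. The column name `STABBR` and the assumption that both tables encode states the same way are hypothetical and would need to be checked against the actual files.

```r
library(dplyr)

# Invented stand-ins for the two datasets (values are not real)
crime <- data.frame(
  City = c("Springfield", "Springfield"),
  State = c("IL", "MO"),
  ViolentCrime = c(500, 300),
  stringsAsFactors = FALSE
)
colleges <- data.frame(
  INSTNM = c("College A", "College B"),
  City = c("Springfield", "Springfield"),
  STABBR = c("IL", "MO"),  # assumed name of the state column
  stringsAsFactors = FALSE
)

# Joining on city name alone pairs every Springfield with every Springfield
byCity <- inner_join(crime, colleges, by = "City")

# Joining on city and state keeps only rows where both agree
byCityState <- inner_join(crime, colleges, by = c("City", "State" = "STABBR"))

nrow(byCity)
nrow(byCityState)
```

In this toy case the city-only join yields four rows, two of them mismatched, while the city-and-state join yields the two correct pairs.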

Arithmetic needed

# Compute the median violent crime rate of the joined table, ignoring NA values
violentCrimeNonNA <- collegesTable$ViolentCrime[complete.cases(collegesTable$ViolentCrime)] #Remove NA
medianCrimeRate <- median(violentCrimeNonNA)
medianCrimeRate
## [1] 398.2
accredAg <- "Southern Association of Colleges and Schools Commission on Colleges"
satNonNA <- collegesTable$SATMTMID[complete.cases(collegesTable$SATMTMID)] #Remove NA
medianSAT <- median(satNonNA)
medianSAT
## [1] 529

The median SAT math score is 529. The median violent crime rate is 398.2.
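
The NA handling used for both medians can also be written with the `na.rm` argument of `median()`. A minimal sketch on made-up values follows; the vector `rates` is hypothetical and not drawn from the dataset.

```r
# Hypothetical crime rates with one missing value (not real data)
rates <- c(210.5, NA, 398.2, 455.0, 120.3)

# Approach used in this report: drop NA values first, then take the median
viaCompleteCases <- median(rates[complete.cases(rates)])

# Equivalent shortcut: let median() skip the NA values itself
viaNaRm <- median(rates, na.rm = TRUE)

# Without any NA handling, the median of a vector containing NA is NA
withNA <- median(rates)

viaCompleteCases
viaNaRm
withNA
```

Both approaches give the same number; `na.rm = TRUE` just avoids the intermediate vector.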

Filter and order

listOfCollegesCrimePass <- collegesTable %>% filter(ViolentCrime < medianCrimeRate)
listOfCollegesSATPass <- listOfCollegesCrimePass %>% filter(SATMTMID > medianSAT)  
listOfCollegesAccred <- listOfCollegesSATPass %>% filter(AccredAgency == accredAg)
# Keep name, city, midpoint SAT math score, violent crime rate, and accrediting agency
finalList <- listOfCollegesAccred[c("INSTNM", "City","SATMTMID","ViolentCrime","AccredAgency")] %>% arrange(desc(SATMTMID))
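
The three filters, the column selection, and the sort can also be chained in a single pipeline. Because `filter()` drops rows where a condition evaluates to NA, rows with missing values in any of the three criteria are removed along the way. The sketch below uses an invented table; `toy`, the thresholds 180 and 550, and the agency label "X" are placeholders, not values from the assignment.

```r
library(dplyr)

# Invented rows standing in for the joined table (not real data)
toy <- data.frame(
  INSTNM = c("A", "B", "C", "D", "E"),
  ViolentCrime = c(100, 200, NA, 150, 120),
  SATMTMID = c(600, 500, 650, NA, 700),
  AccredAgency = c("X", "X", "X", "X", "X"),
  stringsAsFactors = FALSE
)

result <- toy %>%
  # Rows where any comparison is NA (C and D here) are dropped by filter()
  filter(ViolentCrime < 180, SATMTMID > 550, AccredAgency == "X") %>%
  select(INSTNM, SATMTMID, ViolentCrime, AccredAgency) %>%
  # Highest SAT math midpoint first
  arrange(desc(SATMTMID))

result
```

Only rows E and A survive, in that order: B fails the crime threshold, and C and D are dropped because one of their criteria is NA.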

Let’s organize this in a nicer table

library(DT)
datatable(finalList)

Total Number of Institutions

numberOfColleges <- nrow(finalList)
numberOfColleges
## [1] 36

The total number of colleges that meet her criteria is 36.