library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidyr)
library(ggplot2)
library(pastecs)
##
## Attaching package: 'pastecs'
##
## The following objects are masked from 'package:dplyr':
##
## first, last
##
## The following object is masked from 'package:tidyr':
##
## extract
library(readxl)
Quant_Data_Set <- read_excel("C:/Users/carmo/Downloads/Quant Data Set.xlsx")
View(Quant_Data_Set)
I start by removing any rows that contain values listed as “N/A”.
Clean_Data <- drop_na(Quant_Data_Set)
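One thing worth checking at this step is whether the “N/A” entries arrived as true missing values or as text. The sketch below uses a small made-up data frame (the column names and numbers are hypothetical, not from the Quant Data Set) and base R only. It assumes the spreadsheet might store the literal string "N/A", which R does not treat as missing.

```r
# Toy data, invented for illustration: "N/A" stored as text in one column
toy <- data.frame(
  community = c("A", "B", "C", "D"),
  funding   = c("1000", "N/A", "2500", "400"),
  homeless  = c(50, 20, NA, 10),
  stringsAsFactors = FALSE
)

# The literal string "N/A" is not a missing value as far as R is concerned
is.na(toy$funding)

# Convert the text marker to a real NA, then restore the numeric type
toy$funding[toy$funding == "N/A"] <- NA
toy$funding <- as.numeric(toy$funding)

# complete.cases() keeps only rows with no NA in any column,
# the same row-wise rule that tidyr::drop_na() applies
toy_clean <- toy[complete.cases(toy), ]
n_dropped <- nrow(toy) - nrow(toy_clean)
n_dropped
```

If the real file does hold text markers, read_excel() has an `na` argument that can convert them on import, so that drop_na() then sees them as missing.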
After removing the “N/A” values, I run the pairs function on the
entire dataset in order to easily visualize positive or negative
correlations among all variables in the dataset.
pairs(Clean_Data)

I do notice many positive correlations in this data set. However,
most of them are specific to funding. For instance, there is a positive
correlation between the Overall Funding a community receives and each
bucket of funding within that overall amount. As my topic is specific
to understanding the relationship between funding and the rate of
homelessness, I chose to view the correlation of the two. As neither
variable is normally distributed, I took the log of both and used the
logged values before running cor and pairs.
Clean_Pairs_Data <- Clean_Data %>% select(`Total Amount Awarded`, `Overall Homeless`)
Clean_Log <- log(Clean_Pairs_Data)
cor(Clean_Log)
## Total Amount Awarded Overall Homeless
## Total Amount Awarded 1.0000000 0.6914389
## Overall Homeless 0.6914389 1.0000000
pairs(Clean_Log)
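The reason for logging skewed variables can be illustrated on simulated data. This is a minimal sketch, not the real funding data: `funding_sim` and the `skewness()` helper are hypothetical names introduced here, and the values are drawn from a log-normal distribution purely to mimic a long right tail.

```r
# Simulated right-skewed values (NOT the real funding data)
set.seed(42)
funding_sim <- rlnorm(200, meanlog = 12, sdlog = 1)

# Simple sample skewness: close to 0 for a symmetric distribution,
# large and positive for a long right tail
skewness <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^3)
}

skew_raw <- skewness(funding_sim)
skew_log <- skewness(log(funding_sim))

# log() is only finite for positive values; a zero becomes -Inf,
# which cor() cannot handle sensibly, so check before transforming
all_positive <- all(funding_sim > 0)

c(raw = skew_raw, log = skew_log)
```

If either real column contains zeros, log() would produce -Inf for those rows, so a check like `all_positive` is worth running on the actual data before relying on the logged correlations.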

cor(Clean_Log, method = 'spearman')
## Total Amount Awarded Overall Homeless
## Total Amount Awarded 1.0000000 0.6878916
## Overall Homeless 0.6878916 1.0000000
cor(Clean_Log, method = 'kendall')
## Total Amount Awarded Overall Homeless
## Total Amount Awarded 1.0000000 0.5004718
## Overall Homeless 0.5004718 1.0000000
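Spearman and Kendall are computed from ranks, and taking a log does not change the order of positive values, so these two coefficients should come out the same on the raw and the logged data. Only Pearson depends on the transform. The sketch below shows this on simulated data; `funding_sim` and `homeless_sim` are invented stand-ins, not the real columns.

```r
# Simulated positive, right-skewed data (NOT the real data set)
set.seed(1)
funding_sim  <- rlnorm(100, meanlog = 12, sdlog = 1)
homeless_sim <- sqrt(funding_sim) * rlnorm(100, meanlog = 0, sdlog = 0.5)

# Pearson uses the actual values, so the log transform changes it
pearson_raw <- cor(funding_sim, homeless_sim)
pearson_log <- cor(log(funding_sim), log(homeless_sim))

# Spearman and Kendall use only the ordering of the values,
# which log() preserves for positive numbers
spearman_raw <- cor(funding_sim, homeless_sim, method = "spearman")
spearman_log <- cor(log(funding_sim), log(homeless_sim), method = "spearman")
kendall_raw  <- cor(funding_sim, homeless_sim, method = "kendall")
kendall_log  <- cor(log(funding_sim), log(homeless_sim), method = "kendall")

rbind(
  pearson  = c(raw = pearson_raw,  log = pearson_log),
  spearman = c(raw = spearman_raw, log = spearman_log),
  kendall  = c(raw = kendall_raw,  log = kendall_log)
)
```

In practice this means the Spearman and Kendall results could equally be run on the unlogged columns, which gives a check that does not depend on the choice of transform.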
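All three correlations indicate a positive correlation between the
Total Amount of Funding a community receives and the Overall
Homelessness in that community (0.69 Pearson, 0.69 Spearman, 0.50
Kendall, all on the logged values). Basically, communities that receive
more funding tend to have more homelessness. The coefficient measures
the strength of that association, not a rate: it does not say that each
unit of funding corresponds to 0.69 units of homelessness.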
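A "units of homelessness per unit of funding" statement would come from a regression slope rather than from a correlation. The sketch below is one possible way to see the difference, using simulated logged values (the names `log_funding` and `log_homeless` and all numbers are assumptions for illustration, not results from this data set).

```r
# Simulated logged values (NOT the real data)
set.seed(7)
log_funding  <- rnorm(150, mean = 12, sd = 1)
log_homeless <- 1 + 0.6 * log_funding + rnorm(150, sd = 0.6)

# Correlation: strength and direction of the linear association
r <- cor(log_funding, log_homeless)

# Simple linear regression: the slope is the rate of change
fit   <- lm(log_homeless ~ log_funding)
slope <- unname(coef(fit)["log_funding"])

# The two are related but not equal: slope = r * sd(y) / sd(x)
slope_from_r <- r * sd(log_homeless) / sd(log_funding)

c(correlation = r, slope = slope, slope_from_r = slope_from_r)
```

Because both variables are logged, a slope from such a model would be read in percentage terms: a 1% difference in funding is associated with roughly a slope-percent difference in homelessness. Neither the slope nor the correlation says which one causes the other.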