This guide walks you through adding local data as layers in your Cancer InFocus (CIF) dashboards. By local data, we mean data you hold yourself and want to add permanently to one or more CIF dashboards.
To add local data to your CIF dashboards, you will first need to have
the data files prepared and available somewhere on your workstation. For
the data to display properly, you will need it to match the counties or
Census tracts present in your catchment area. In particular, the data
will need to contain the same county or Census tract FIPS codes as are
present in the other data files you acquire from CancerInFocus.org. This
information should be included in a column named FIPS. Your
data should also contain columns named County and
State (if county-level data) or Tract,
County, and State (if Census tract-level
data).
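Before going further, it can help to confirm that the FIPS codes in your local file line up with those in your CIF data. The sketch below is one way to do that. The vectors `cif_fips` and `local_fips` are hypothetical stand-ins for the FIPS columns of a CancerInFocus.org file and of your own file, and the codes are only examples.

```r
# Hypothetical FIPS codes, for illustration only.
# cif_fips:   FIPS column from a data file acquired from CancerInFocus.org
# local_fips: FIPS column from your local data file
cif_fips   <- c("21067", "21113", "21239")
local_fips <- c("21067", "21113", "21999")

# Codes in the local file that do not appear in the CIF data
unmatched <- setdiff(local_fips, cif_fips)

# Catchment-area codes for which the local file has no rows
uncovered <- setdiff(cif_fips, local_fips)

if (length(unmatched) > 0) {
  message("Local FIPS codes not found in CIF data: ",
          paste(unmatched, collapse = ", "))
}
if (length(uncovered) > 0) {
  message("CIF FIPS codes with no local data: ",
          paste(uncovered, collapse = ", "))
}
```

If the FIPS column is stored as a number in one file and as text in the other, convert both to the same type before comparing.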
The remainder of the file will include the variable(s) you wish to
add. We recommend presenting these variables in long format,
that is, with one row per area per variable. In long format, the final two
columns should be named measure and value and
filled in accordingly.
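As an illustration of this layout, the sketch below builds a small county-level dataset in long format and checks its columns. The measure names and values are invented, and the way County and State are written here is an assumption; follow whatever convention your CIF data files use.

```r
# Hypothetical county-level local data in long format:
# one row per county per measure. All values are invented.
local_example <- data.frame(
  FIPS    = c("21067", "21067", "21113", "21113"),
  County  = c("Fayette County", "Fayette County",
              "Jessamine County", "Jessamine County"),
  State   = c("KY", "KY", "KY", "KY"),
  measure = c("YPLL", "Food Insecurity", "YPLL", "Food Insecurity"),
  value   = c(7500, 0.12, 6900, 0.10),
  stringsAsFactors = FALSE
)

# Check that every required column is present
required_cols <- c("FIPS", "County", "State", "measure", "value")
missing_cols  <- setdiff(required_cols, names(local_example))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Check that measure and value are the final two columns
last_two <- tail(names(local_example), 2)
stopifnot(identical(last_two, c("measure", "value")))
```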
Note: In this tutorial we assume your data is in long format. If your
data is in wide format (e.g. with a new column for every new variable)
then you will need to add a pivot to the step where the local data file
gets read into create_shapefiles_v5-3.R.
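For wide-format data, the pivot might look like the following sketch, which uses pivot_longer() from the tidyr package. The data frame `wide_data`, its variable columns, and all values are hypothetical.

```r
library(tidyr)

# Hypothetical wide-format data: one column per variable
wide_data <- data.frame(
  FIPS            = c("21067", "21113"),
  County          = c("Fayette County", "Jessamine County"),
  State           = c("KY", "KY"),
  YPLL            = c(7500, 6900),
  Food_Insecurity = c(0.12, 0.10),
  stringsAsFactors = FALSE
)

# Gather every column except the identifiers into measure/value pairs
long_data <- pivot_longer(
  wide_data,
  cols      = -c(FIPS, County, State),
  names_to  = "measure",
  values_to = "value"
)
```

Each county now has one row per variable, with the former column names stored in measure.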
Next, you will need to prepare a file called
local_measures.csv that contains information about how your
new data should be displayed. A blank version of this file is available
in the CIF GitHub repo. This file should contain four columns as
described below:
measure – The name of the variable as found in the
measure column of your prepared dataset from Step 1.
def – How the variable should be displayed in the map
legend and elsewhere on your CIF dashboards. For instance, if the
variable you’re adding is for years of potential life lost, you may want
to use something like “Years of Potential Life Lost per 100,000
People”.
fmt – How variable values should be formatted when
displayed in legends, etc. The available formats are pct if
the variable represents a percentage or int if the variable
represents an integer or real number value.
source – The data source for the local variable being
added.
Once you have filled in this file with the appropriate values for all
variables being added, save it in the same folder as
CIFvars_v5-3.R.
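If you prefer to build this file in R rather than by hand, a minimal sketch follows. The measure names, display text, and sources are hypothetical, and the file is written to a temporary folder here only so the example is safe to run; in practice you would save it next to CIFvars_v5-3.R.

```r
# Hypothetical entries for two local variables
local_measures <- data.frame(
  measure = c("YPLL", "Food Insecurity"),
  def     = c("Years of Potential Life Lost per 100,000 People",
              "Food Insecurity"),
  fmt     = c("int", "pct"),
  source  = c("Hypothetical source A", "Hypothetical source B"),
  stringsAsFactors = FALSE
)

# Only the two formats described above are allowed
stopifnot(all(local_measures$fmt %in% c("pct", "int")))

# Written to a temporary folder for this example only
out_file <- file.path(tempdir(), "local_measures.csv")
write.csv(local_measures, out_file, row.names = FALSE)

# Read it back to confirm the four columns survived the round trip
check <- read.csv(out_file, header = TRUE, stringsAsFactors = FALSE)
```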
Now that the necessary files have been prepared, you will begin
incorporating the local data into the data pre-processing for your CIF
dashboards, which takes place in create_shapefiles_v5-3.R. The first step
is to read information about the new variable(s) into the script and
combine it with the existing nice_names dataframe. To do this, add the
following code immediately after the variable assignment for nice_names
(at approximately line 22):
# Read display information for the local variable(s) and append it to nice_names
more_names = read.csv('local_measures.csv', header = T)
nice_names = rbind(nice_names, more_names)
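Because rbind() requires both dataframes to have the same column names, a quick check before combining can save some debugging. The sketch below uses small hypothetical stand-ins for nice_names and more_names, and it assumes nice_names carries the same four columns as local_measures.csv.

```r
# Hypothetical stand-in for the existing nice_names dataframe
nice_names <- data.frame(
  measure = "Total",
  def     = "Total Population",
  fmt     = "int",
  source  = "Hypothetical source",
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the contents of local_measures.csv
more_names <- data.frame(
  measure = c("YPLL", "Food Insecurity"),
  def     = c("Years of Potential Life Lost per 100,000 People",
              "Food Insecurity"),
  fmt     = c("int", "pct"),
  source  = c("Hypothetical source A", "Hypothetical source B"),
  stringsAsFactors = FALSE
)

# rbind() on dataframes fails if the column names differ
stopifnot(setequal(names(nice_names), names(more_names)))

nice_names <- rbind(nice_names, more_names)

# A measure listed twice would duplicate rows in later joins
dup_measures <- nice_names$measure[duplicated(nice_names$measure)]
if (length(dup_measures) > 0) {
  warning("Duplicate measures: ", paste(dup_measures, collapse = ", "))
}
```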
Finally, you are ready to read the local data file prepared in
Step 1 into create_shapefiles_v5-3.R. If this data is at the county
level, add the following code after the chunk preparing the hf_county
dataframe. If your local data is at the Census tract level, add it
after the chunk preparing the hf_tract dataframe.
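# Read the local data file (long format) and prepare it for the dashboards
local_data = read.csv('usr/dir/file.csv', header = T) %>%
  # Underscores in measure names become spaces; set the category for the new variable(s)
  mutate(measure = str_replace_all(measure, "_", " "),
         cat = "Category",
         RE = NA,
         Sex = NA) %>%
  # Attach the display information from nice_names
  left_join(nice_names, by="measure") %>%
  # Build display labels: pct values are multiplied by 100, int values get thousands separators
  mutate(lbl = case_when(
    fmt == "pct" ~ paste0(round(value*100, 1), "%"),
    fmt == "int" ~ prettyNum(round(value, 2), big.mark = ",")
  )) %>%
  # Drop rows whose measure has no entry in nice_names
  filter(def != "NA") %>%
  select(cat, everything())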
A few notes on the above:
usr/dir/file.csv should be replaced by the
path and file name for the local data file on your workstation.
cat = "Category" in the
mutate() function should be replaced with the appropriate
category for your new variable(s). You can use an existing category in
the data or create a new one. The existing categories are
“Sociodemographics”, “Economics & Insurance”, “Environment”,
“Housing & Transportation”, “Disparities”, “Social Vulnerability
Index”, “Screening & Risk Factors”, “Other Health Factors”, “Cancer
Incidence (age-adj per 100k)”, “Cancer Incidence (% late-stage
diagnosis)”, and “Cancer Mortality (age-adj per 100k)”.
If your data is in wide format, you will need to add a
pivot_longer() function after the read.csv()
function in the above to put your data into the appropriate format for
the script.
After reading in the local data you will need to combine it with the
existing data to incorporate it into your dashboards. This is
accomplished by adding the newly assigned local_data
dataframe to the bind_rows() function for the variable
all_county (if adding county-level data) or
all_tract (if adding Census tract-level data).
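To see how the pieces fit together, the sketch below runs the read-in steps on a small in-memory dataset and then binds the result to existing data. Everything here is a stand-in: `raw_local` replaces the read.csv() call, `existing_county` represents whatever dataframes your bind_rows() call already contains, and the category, measures, and values are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical display information (normally built earlier in the script)
nice_names <- data.frame(
  measure = c("YPLL", "Food Insecurity"),
  def     = c("Years of Potential Life Lost per 100,000 People",
              "Food Insecurity"),
  fmt     = c("int", "pct"),
  source  = c("Hypothetical source A", "Hypothetical source B"),
  stringsAsFactors = FALSE
)

# Hypothetical local data, standing in for read.csv('usr/dir/file.csv')
raw_local <- data.frame(
  FIPS    = c("21067", "21067"),
  County  = c("Fayette County", "Fayette County"),
  State   = c("KY", "KY"),
  measure = c("YPLL", "Food_Insecurity"),
  value   = c(7500, 0.12),
  stringsAsFactors = FALSE
)

# Same steps as the code above, applied to the toy data
local_data <- raw_local %>%
  mutate(measure = str_replace_all(measure, "_", " "),
         cat = "Other Health Factors",
         RE = NA,
         Sex = NA) %>%
  left_join(nice_names, by = "measure") %>%
  mutate(lbl = case_when(
    fmt == "pct" ~ paste0(round(value * 100, 1), "%"),
    fmt == "int" ~ prettyNum(round(value, 2), big.mark = ",")
  )) %>%
  filter(def != "NA") %>%
  select(cat, everything())

# Hypothetical stand-in for a dataframe already passed to bind_rows()
existing_county <- data.frame(
  cat = "Sociodemographics", FIPS = "21067",
  County = "Fayette County", State = "KY",
  measure = "Total", value = 320000, RE = NA, Sex = NA,
  def = "Total Population", fmt = "int",
  source = "Hypothetical source", lbl = "320,000",
  stringsAsFactors = FALSE
)

# Add local_data as one more argument to the existing bind_rows() call
all_county <- bind_rows(existing_county, local_data)
```

Note that a value of 0.12 with fmt set to pct is labelled as 12%, so percentage variables should be stored as proportions.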
By completing the above steps, you should be able to add local data into your CIF dashboards.
Note: if the local data you are adding is intended to replace
existing data (e.g. if you are bringing in new cancer rates to replace
those provided by Cancer InFocus), you will need to remove the
corresponding dataframes from the all_county or
all_tract variable assignments, as appropriate.
If you run into any issues, or have any comments for improving this tutorial, please contact us at CancerInFocus@uky.edu.