The following guide walks you through how to add custom data layers to your Cancer InFocus (CIF) dashboards.
To add custom data to your CIF dashboards, you will first need to
have the data files prepared and available somewhere on your
workstation. For the data to display properly, you will need it to match
the counties or Census tracts present in your catchment area. In
particular, the data will need to contain the same county or Census
tract FIPS codes as are present in the other data files you acquire from
CancerInFocus.org. This information should be included in a column named
FIPS. Your data should also contain columns named County and State (if
county-level data) or Tract, County, and State (if Census tract-level
data).
The remainder of the file will include the variable(s) you wish to
add. It is recommended that these variables be presented in long format,
that is, with one observation per row. In long format, the final two
columns should be named measure and value and filled in accordingly.
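As an illustration of the long format described above, the following sketch builds a small county-level data frame in R and checks its column layout. The FIPS codes are those of two Kentucky counties, but the measure names and values are placeholders invented for this example, and the way County and State are written here is an assumption; match whatever your other CIF data files use.

```r
# Toy county-level data in long format: one row per county per measure.
# Measure names and values are placeholders, not real data.
custom_long <- data.frame(
  # Stored as character so leading zeros are kept; match the type used in your CIF files.
  FIPS    = c("21067", "21067", "21111", "21111"),
  County  = c("Fayette County", "Fayette County", "Jefferson County", "Jefferson County"),
  State   = c("KY", "KY", "KY", "KY"),
  measure = c("ypll", "pct_example", "ypll", "pct_example"),
  value   = c(7654.3, 0.12, 8123.9, 0.15),
  stringsAsFactors = FALSE
)

# Check that the required columns are present and that measure and value come last.
required <- c("FIPS", "County", "State", "measure", "value")
missing_cols <- setdiff(required, names(custom_long))
stopifnot(length(missing_cols) == 0)
stopifnot(identical(tail(names(custom_long), 2), c("measure", "value")))
```

For Census tract-level data, the same layout would apply with an additional Tract column alongside County and State.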
Note: In this tutorial we assume your data is in long format. If your
data is in wide format (e.g. with a new column for every new variable),
then you will need to add a pivot to the step where the custom data file
gets read into create_shapefiles_v5-2.R.
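If your data starts out in wide format, a pivot along the following lines may be enough. This is a minimal sketch using pivot_longer() from the tidyr package on a toy data frame; the column names measure_a and measure_b and their values are hypothetical, and the identifier columns are assumed to be FIPS, County, and State.

```r
library(tidyr)

# Toy wide-format data: one column per variable (placeholder names and values).
custom_wide <- data.frame(
  FIPS      = c("21067", "21111"),
  County    = c("Fayette County", "Jefferson County"),
  State     = c("KY", "KY"),
  measure_a = c(0.12, 0.15),
  measure_b = c(3400, 5100)
)

# Gather every column except the identifiers into measure/value pairs.
custom_long <- pivot_longer(
  custom_wide,
  cols      = -c(FIPS, County, State),
  names_to  = "measure",
  values_to = "value"
)
```

The result has one row per county per variable, with measure and value as the final two columns.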
Next, you will need to prepare a file called custom_measures.csv
that contains information about how your new data should be displayed.
A blank version of this file is available in the CIF GitHub repo. This
file should contain four columns as described below:
measure – The name of the variable as found in the measure column of
your prepared dataset from Step 1.

def – How the variable should be displayed in the map legend and
elsewhere on your CIF dashboards. For instance, if the variable you’re
adding is for years of potential life lost, you may want to use
something like “Years of Potential Life Lost per 100,000 People”.

fmt – How variable values should be formatted when displayed in
legends, etc. The available formats are pct if the variable represents
a percentage or int if the variable represents an integer or real
number value.

source – The data source for the custom variable being added.

Once you have filled in this file with the appropriate values for all
variables being added, save it in the folder you used for the pathCIF
variable in CIFvars_v5-2.R.
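The file can be filled in by hand, but as a rough sketch it could also be written from R. The measure names, display text, and sources below are placeholders. One detail follows from the read-in code later in this guide: that code replaces underscores in the measure column with spaces before joining, so a variable stored as pct_example in the data is listed here as "pct example".

```r
# Hypothetical display settings for two custom variables (placeholder names and sources).
custom_measures <- data.frame(
  measure = c("ypll", "pct example"),
  def     = c("Years of Potential Life Lost per 100,000 People", "Example Percentage"),
  fmt     = c("int", "pct"),
  source  = c("Placeholder source A", "Placeholder source B")
)

# fmt must be one of the two formats described above.
stopifnot(all(custom_measures$fmt %in% c("pct", "int")))

# Written to a temporary folder here; in practice, save into the folder used for pathCIF.
out_file <- file.path(tempdir(), "custom_measures.csv")
write.csv(custom_measures, out_file, row.names = FALSE)
```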
create_shapefiles_v5-2.R
Now that the necessary files have been prepared, you will begin
incorporating the custom data into the data pre-processing for your CIF
dashboards. The first step in doing this is to read information about
the new variable(s) into the script and combine them with the existing
nice_names dataframe. To do this, add the following code immediately
after the variable assignment for nice_names (which occurs at
approximately line 22):
# Read the display settings for the custom variables and append them to nice_names
more_names = read.csv('custom_measures.csv', header = TRUE)
nice_names = rbind(nice_names, more_names)
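Because rbind() on data frames requires matching column names, a quick check before combining can catch a misnamed column in custom_measures.csv. The sketch below uses stand-in data frames; it assumes nice_names has the same four columns as custom_measures.csv, which is not confirmed here, and all values are placeholders.

```r
# Stand-in for the script's nice_names data frame; its real contents are assumed here.
nice_names <- data.frame(
  measure = "existing measure",
  def     = "Existing Measure",
  fmt     = "int",
  source  = "Placeholder source"
)

# Stand-in for the rows read from custom_measures.csv.
more_names <- data.frame(
  measure = "ypll",
  def     = "Years of Potential Life Lost per 100,000 People",
  fmt     = "int",
  source  = "Placeholder source"
)

# rbind() on data frames needs the same column names in both inputs.
stopifnot(setequal(names(nice_names), names(more_names)))
nice_names <- rbind(nice_names, more_names)

# A measure name listed twice would duplicate rows in a later join on measure.
stopifnot(!anyDuplicated(nice_names$measure))
```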
create_shapefiles_v5-2.R
Finally, you are ready to read in the custom data file prepared in
Step 1. If this data is at the county level, the following code should
be added after the chunk preparing the hf_county dataframe. If your
custom data is at the Census tract level, this code should be added
after the chunk preparing the hf_tract dataframe.
# Read the custom data and attach display information from nice_names
custom_data = read.csv('usr/dir/file.csv', header = TRUE) %>%
  mutate(measure = str_replace_all(measure, "_", " "),
         cat = "Category",
         RE = NA,
         Sex = NA) %>%
  left_join(nice_names, by = "measure") %>%
  # Build the display label according to the fmt column
  mutate(lbl = case_when(
    fmt == "pct" ~ paste0(round(value*100, 1), "%"),
    fmt == "int" ~ prettyNum(round(value, 2), big.mark = ",")
  )) %>%
  # Drop measures that have no entry in nice_names
  filter(!is.na(def)) %>%
  select(cat, everything())
A few notes on the above:

usr/dir/file.csv should be replaced by the path and file name for the
custom data file on your workstation.

cat = "Category" in the mutate() function should be replaced with the
appropriate category for your new variable(s). You can use an existing
category in the data or create a new one. The existing categories are
“Sociodemographics”, “Economics & Insurance”, “Environment”,
“Housing & Transportation”, “Disparities”, “Social Vulnerability
Index”, “Screening & Risk Factors”, “Other Health Factors”, “Cancer
Incidence (age-adj per 100k)”, and “Cancer Mortality (age-adj per
100k)”.

If your data is in wide format, you will need to add a pivot_longer()
function after the read.csv() function
in the above to put your data into the appropriate format for
the script.

After reading in the custom data, you will need to combine it with the
existing data to incorporate it into your dashboards. This is
accomplished by adding the newly assigned custom_data dataframe to the
rbind() function for the variable all_county (if adding county-level
data) or all_tract (if adding Census tract-level data).
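To see how the read-in and combine steps behave, here is a small self-contained sketch that runs the same pipeline on toy data instead of a file. All values, measure names, and sources are placeholders, and the existing data frame is only a stand-in for whatever is already combined in all_county, assumed to have the same columns as custom_data.

```r
library(dplyr)
library(stringr)

# Stand-in lookup table; in the script this is nice_names after adding custom_measures.csv.
nice_names <- data.frame(
  measure = c("ypll", "pct example"),
  def     = c("Years of Potential Life Lost per 100,000 People", "Example Percentage"),
  fmt     = c("int", "pct"),
  source  = c("Placeholder source A", "Placeholder source B")
)

# Toy custom data (placeholder values). "not_listed" has no entry in nice_names.
raw <- data.frame(
  FIPS    = c("21067", "21067", "21067"),
  County  = "Fayette County",
  State   = "KY",
  measure = c("ypll", "pct_example", "not_listed"),
  value   = c(7654.321, 0.1234, 5)
)

custom_data <- raw %>%
  mutate(measure = str_replace_all(measure, "_", " "),
         cat = "Category", RE = NA, Sex = NA) %>%
  left_join(nice_names, by = "measure") %>%
  mutate(lbl = case_when(
    fmt == "pct" ~ paste0(round(value * 100, 1), "%"),
    fmt == "int" ~ prettyNum(round(value, 2), big.mark = ",")
  )) %>%
  filter(!is.na(def)) %>%   # drop measures with no match in nice_names
  select(cat, everything())

# Stand-in for the data already in all_county (empty here, same columns as custom_data).
existing <- custom_data[0, ]
all_county <- rbind(existing, custom_data)

print(custom_data[, c("measure", "value", "lbl")])
```

Note that values with the pct format are multiplied by 100 when the label is built, so they are expected to be stored as proportions (0.1234 rather than 12.34), and that any measure missing from nice_names is dropped without a warning.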
By completing the above steps, you should be able to add custom data into your CIF dashboards.
Note: if the custom data you are adding is intended to replace
existing data (e.g. if you are bringing in new cancer rates to replace
those provided by Cancer InFocus), you will need to remove the
corresponding dataframes from the all_county or all_tract variable
assignments, as appropriate.
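As a rough sketch of what that removal looks like, the example below uses hypothetical data frame names (sociodemo_county and cancer_inc_county are not taken from the script) with placeholder values. The idea is simply to leave the replaced data frame out of the rbind() call so its measures do not appear twice.

```r
# Hypothetical stand-ins for data frames already combined in the script.
sociodemo_county  <- data.frame(cat = "Sociodemographics",
                                measure = "example", value = 1)
cancer_inc_county <- data.frame(cat = "Cancer Incidence (age-adj per 100k)",
                                measure = "example rate", value = 2)

# Custom data intended to replace the existing incidence rates.
custom_data <- data.frame(cat = "Cancer Incidence (age-adj per 100k)",
                          measure = "example rate", value = 3)

# Before: all_county <- rbind(sociodemo_county, cancer_inc_county, custom_data)
# After: the data frame being replaced is left out of the call.
all_county <- rbind(sociodemo_county, custom_data)
```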
If you run into any issues, or have any comments for improving this tutorial, please contact us at CancerInFocus@uky.edu.