This session, we’ll learn about working with high-resolution data and variable transformations in Data Studio.
We address the use of:
We’ll largely use text data, however some numeric data are included, as well.
Recall from the previous session that our reports need a data source with the right variable type.
We learned how to:
You can review the previous session here.
For this session, we’ll use text (), geospatial, and numeric (quantitative) data.
These data are scraped from the National Literacy Directory (NLD) and include:
Served)Programming)Search Results for the National Literacy Directory
The National Literacy Directory contains thousands of adult literacy programs across the U.S.
Data Collection: Although search results are limited to a 25-mile radius, you can override this in the URL.
Overriding the Maximum Radius of Search Results
Wow! Just change “radius=25” to any distance. An amateur could scrape the entire database in half an hour!
(Don’t do this, though).
Access the Practice Data: Access the data in the Google Sheet: “Local Literacy Programs”.
File and Make a Copy to copy the Google Sheet to your Google DriveConnect the Practice Data: Visit the Google Data Studio homepage.
Blank Report or Create and ReportCreate New Data SourceSelect on “Google Sheets”ConnectModify Field Types: Double-check each Field (i.e. variables) and Type (e.g. “Text”, “Number”).
Type mostly correctly
Website should be set to Type “URL”Email should remain as Type “text”
Type “URL”, it links to the domain pageAdd to Report and confirm again with Add to ReportShould you need to re-edit fields, click Resources and Manage added data sources in the Data Studio menu.
Tables are great for accompanying a visualization with higher-resolution data. Keep in mind:
Also remember this: Visualizations are seen. Tables are read. Let this guide your aesthetic choices.
Tables with Values Comprised of Long Strings of Text
Open-ended questions in surveys may result in large strings of text. Tables can manage them.
Tables with Multiple Fields
Visualizations usually can’t capture this degree of resolution.
Creating a table is a relatively simple procedure.
Insert in the Data Studio menuData Studio will automatically populate the table with fields from your data.
Creating a Table in Data Studio
Simply click “Insert”, choose “Table”, and place anywhere.
Selecting fields involves clicking-and-dragging fields (variables) in one of four slots:
Populate Tables by Clicking-and-Dragging Fields
Ensuring your table contains the data you want to show is the foremost step to making an awesome table.
Customizing table data is fun and easy. There are a few key options to keep in mind in the Data Tab:
Customizing table stye is funner and easier. Keep the following key options in mind in the Style Tab:
A Paginated Table
Used in conjunction with “Rows per Page”, pagination allows a static table that is navigable with arrows.
A Non-Paginated, Scrollable Table
Depending on how many “Rows per Page” you allow, non-paginated tables allow users to scroll up and down.
Now that we know how to create and modify a table, we can further modify the variables it shows.
We do this in the Edit Field page. Reach it by:
Resource tab in the Data Studio menuEDIT under ActionsADD A FIELDThe Edit Fields Page
Select “ADD A FIELD” in the upper-right corner to create new variables from existing fields.
Add new fields in the “New Field” page by using formulas and functions from existing variables:
The “New Field” Page
If there ever was an easy way to transform variables, you’re currently looking at it. Don’t sweat.
The easiest transformations are arithmetic. These use operators you already know and love. For example:
+ for addition- for subtraction* for multiplication/ for division^ for exponents() for precedenceAnd with these operators, we have to remember Operator Precedence, i.e. PEMDAS or the Order of Operations, i.e.
Please excuse my dear Aunt Sally.
Consider the following:
\[ x + 2 = 4 \]
Solve for x.
What did you get?
Recall that field Served contains values for total clients served by each organization.
Now consider this:
\[ Served \cdot 2 = x \]
What would x be?
Double the total clients. Or “Total Clients times two”.
Arithmetic operations work exactly the same way, using fields instead of variables.
We have two “number” fields: Served (total clients served) and Programming (total cost of programs).
Served and / and Programming in the “Formula” field+Field NameYou can format the new variable as “Currency” in the “Edit Fields” page.
Creating a New Field: SROI
Simply use existing field names and arithmetic operators to perform arithmetic transformations.
An Updated Table Including Our New Variable
Once saved, you can click-and-drag this variable right into the “Metrics” slot to update your table.
Numeric transformations, includng statistical transformations are easy to achieve with functions. For example:
MAX()MIN()MEDIAN()AVG()COUNT()COUNT_DISTINCT()LOG()LOG10()POWER()PERCENTILE()ROUND()SUM()STDDEV()SQRT()SIN()VARIANCE()Typing the beginning of these in the “Formula” field on the “New Field” page will provide an autocomplete list.
What’s more, selecting one of these functions gives you all the instructions you need to use it!
Qualitative transformations of existing variables can be a bit more advanced, but there are some easy, neat tricks.
Regular expressions, or regex, is outside the scope of this tutorial, but we can use some basic techniques.
We can generate sentences that change with our selected data using function CONCAT().
CONCAT() is short for “concatenation”, which sort of means “to paste”CONCAT("This organization has served ", Served, " clients in total.")
Contextualizing a Variable with Autogenerated Text
You can use multiple variables with the CONCAT() function and create very sophisticated, autogenerated text.
Autogenerated Text in a Table
This is just one sentence. Think of this as a building block for elaborate information architecture.
Every shred of qualitative data in this image is an autogenerated table.
This report uses a series of CONCAT(), regex, and if-then functions to autoformat all data from survey responses.
Instructions: Using function CONCAT(), create a new variable named Address by concatenating:
Address Line 1Address Line 2We can use the regular expressions function REGEXP_EXTRACT() to pull the city and state from each line.
REGEX_EXTRACT(), e.g., it may not work with REXP_EXTRACT()Here, we’ll use variable Address Line 2 to extract city and state and convert to a type “geo” field. Here:
Address Line 2In English, it’s saying “extract any character, any number of times, until you hit the end of ’ NY’”.
REGEXP_EXTRACT(Address Line 2, "(.* NY)"))
Using Regular Expressions to Extract City, State
The metacharacter . means “any character”, while * means “any number of times”.
Extracted “City” Variable Using Regular Expressions
Note that the regular expression only extracted text up to the end of “NY”.
Extracted City & State Converted to a Dot Map
By converted our extracted text to type “geo”, we can visualize it in a dot map of New York State.
Regular Expressions in Google Apps: Learn more about string manipulation with Google’s RE2 documentation.
Google Functions: Learn more about functions in Google apps with Google Sheets function list.
Score cards are pretty easy to make compared to string manipulation.
Insert from the Data Studio menuSUM()Notably, you can make “Compact Numbers” in the Style Tab, which convers, e.g., “121,200” to “121.2 K”.
A Score Card for Total Clients Served
Score Cards help track key performance indicators and other values you should keep an eye on.