1 Introduction

This session, we’ll learn about working with high-resolution data and variable transformations in Data Studio.

We address the use of:

  • Filters, which allows our audience to drill down into particular data
  • Data Tables, which provide data at higher resolutions than visualizations
  • Score Cards, which highlight key performance indicators to keep an eye on
  • Transformations, which create new data from existing variables

We’ll largely use text data, however some numeric data are included, as well.


1.1 Review

Recall from the previous session that our reports need a data source with the right variable type.

We learned how to:

  1. Modify global layout options, like multi-page reports, canvas size, and navigation
  2. Modify report themes like daytime and nighttime modes, default color palletes, etc.
  3. Create geospatial visualizations using geo maps such as choropleth and dot maps
  4. Use color effectively, including gradient, divergent, and discrete palletes
  5. Use tools to extract colors from images, autogenerate new palletes, and testing accessibility

You can review the previous session here.


2 Practice Data

For this session, we’ll use text (), geospatial, and numeric (quantitative) data.

These data are scraped from the National Literacy Directory (NLD) and include:

  • 22 organizations within 28 miles of the Central New York Community Foundation
  • Organization’s name, address lines, phone, point of contact, email, and website URL
  • A randomly generated number of total clients served, between 1,000 and 10,000 (Served)
  • A randomly dollar amount in total programming costs between $100 K and $5 M (Programming)


Search Results for the National Literacy Directory

The National Literacy Directory contains thousands of adult literacy programs across the U.S.


Data Collection: Although search results are limited to a 25-mile radius, you can override this in the URL.


Overriding the Maximum Radius of Search Results

Wow! Just change “radius=25” to any distance. An amateur could scrape the entire database in half an hour!

(Don’t do this, though).


2.1 Connecting to Data Studio

Access the Practice Data: Access the data in the Google Sheet: “Local Literacy Programs”.

  • Click on File and Make a Copy to copy the Google Sheet to your Google Drive

Connect the Practice Data: Visit the Google Data Studio homepage.

  • Select Blank Report or Create and Report
  • Name your report by replacing “Untitled Report”
  • Click on Create New Data Source
  • Click Select on “Google Sheets”
  • Click “Local Literacy Programs”
  • Select Connect


2.2 Editing Table Fields

Modify Field Types: Double-check each Field (i.e. variables) and Type (e.g. “Text”, “Number”).

  • Google Data Studio will likely guess Type mostly correctly
    • Variable Website should be set to Type “URL”
    • Variable Email should remain as Type “text”
      • If an email is Type “URL”, it links to the domain page
  • Click Add to Report and confirm again with Add to Report

Should you need to re-edit fields, click Resources and Manage added data sources in the Data Studio menu.


3 Data Tables

Tables are great for accompanying a visualization with higher-resolution data. Keep in mind:

  • Tables can include virtually any field (i.e. variable)
  • “Number” variables should usually be assigned to the Metric slots
  • “Text” and “URL” variables should usually be assigned to Dimension slots
  • “Text” variables can be quite large, especially for “Open” survey responses
  • The Date Range Dimension allows filtering tables by dates/times

Also remember this: Visualizations are seen. Tables are read. Let this guide your aesthetic choices.


Tables with Values Comprised of Long Strings of Text

Open-ended questions in surveys may result in large strings of text. Tables can manage them.


Tables with Multiple Fields

Visualizations usually can’t capture this degree of resolution.


3.1 Creating Tables

Creating a table is a relatively simple procedure.

  1. Click on Insert in the Data Studio menu
  2. Select “Table”
  3. Place it

Data Studio will automatically populate the table with fields from your data.


Creating a Table in Data Studio

Simply click “Insert”, choose “Table”, and place anywhere.


3.2 Selecting Variables

Selecting fields involves clicking-and-dragging fields (variables) in one of four slots:

  • Date Range Dimensions accept datetime variables and allow filtering table records by date
  • Dimensions accept most variables, but usually work best with type “text”, “geo”, and “URL”
  • Metrics also accept most variables, but are ideal for type quantitative data
    • If you insert a qualitative variable, you can use aggregation functions, e.g. occurrences
    • Those familiar with SQL, R, or other data science languages should feel at home with these
  • Sort and Secondary Sort accept any quantiative or qualitative variable
    • Quantitative variables sort numerically and qualitative variables sort alphabetically
    • Ascending order is from least to greatest (quantitative), or alphabetically (qualitative)
    • Descending order is from greatest to least (quantitative), or alphabetically (qualitative)


Populate Tables by Clicking-and-Dragging Fields

Ensuring your table contains the data you want to show is the foremost step to making an awesome table.


3.3 Customizing Data

Customizing table data is fun and easy. There are a few key options to keep in mind in the Data Tab:

  • Summary Rows provide totals of quantitative variables. Here, it totals programming costs and clients served.
  • Rows per Page sets the maximum number of records on each table “page”, navigable with the lower-right arrows.
  • Default Date Range filters data for a fixed period of time.
    • The date range may be specified between certain dates
    • The date range may cover a set time range ending at present day, e.g. x days, weeks, quarters, or years


3.4 Customizing Style

Customizing table stye is funner and easier. Keep the following key options in mind in the Style Tab:

  • Show Header reveals or removes field (variable) names for each column
  • Wrap Text (Header) allows headers to widen according to the length of variable labels
  • Wrap Text (Body) allows rows to widen according to the length of text strings
  • Odd Row Color and Even Row Color allow every other row to be highlighted, increasing interpretability
  • Row Numbers provides numbers for each record in your table; useful for multi-page tables
  • Show Pagination (Footer) allows navigation of table pages with arrows
    • This is used in conjunction with Rows per Page in the Data Tab
    • Essentially, this makes your table scrollable or static


A Paginated Table

Used in conjunction with “Rows per Page”, pagination allows a static table that is navigable with arrows.


A Non-Paginated, Scrollable Table

Depending on how many “Rows per Page” you allow, non-paginated tables allow users to scroll up and down.


4 New Fields

Now that we know how to create and modify a table, we can further modify the variables it shows.

We do this in the Edit Field page. Reach it by:

  • Selecting the Resource tab in the Data Studio menu
  • Clicking “Manage added data sources”
  • Clicking EDIT under Actions
  • Selecting ADD A FIELD


The Edit Fields Page

Select “ADD A FIELD” in the upper-right corner to create new variables from existing fields.


4.1 Adding New Fields

Add new fields in the “New Field” page by using formulas and functions from existing variables:

  • Available Fields provides a searchable list of existing fields which you can click-and-drag
  • Field Name allows you to name your new variable; you can always rename it in Data Studio
  • Formula is where functions and operators are used to create transformations


The “New Field” Page

If there ever was an easy way to transform variables, you’re currently looking at it. Don’t sweat.


4.2 Arithmetic Operators

The easiest transformations are arithmetic. These use operators you already know and love. For example:

  • + for addition
  • - for subtraction
  • * for multiplication
  • / for division
  • ^ for exponents
  • () for precedence

And with these operators, we have to remember Operator Precedence, i.e. PEMDAS or the Order of Operations, i.e.

Please excuse my dear Aunt Sally.

  1. Parentheses
  2. Exponents
  3. Multiplication
  4. Division
  5. Addition
  6. Subtraction


4.2.1 A Motivating Example

Consider the following:

\[ x + 2 = 4 \]

Solve for x.

What did you get?


Recall that field Served contains values for total clients served by each organization.

Now consider this:

\[ Served \cdot 2 = x \]

What would x be?

Double the total clients. Or “Total Clients times two”.

Arithmetic operations work exactly the same way, using fields instead of variables.


4.3 Arithmetic Transformations

We have two “number” fields: Served (total clients served) and Programming (total cost of programs).

  • Therefore, we can calculate Social Return on Investment, or SROI
  • That is, insert Served and / and Programming in the “Formula” field
  • You can type this out, click-and-drag, or highlight the variable and select +
  • Name this new variable “SROI” in Field Name
  • Select “Save”

You can format the new variable as “Currency” in the “Edit Fields” page.


Creating a New Field: SROI

Simply use existing field names and arithmetic operators to perform arithmetic transformations.


An Updated Table Including Our New Variable

Once saved, you can click-and-drag this variable right into the “Metrics” slot to update your table.


4.4 More Numeric Transformations

Numeric transformations, includng statistical transformations are easy to achieve with functions. For example:

  • MAX()
  • MIN()
  • MEDIAN()
  • AVG()
  • COUNT()
  • COUNT_DISTINCT()
  • LOG()
  • LOG10()
  • POWER()
  • PERCENTILE()
  • ROUND()
  • SUM()
  • STDDEV()
  • SQRT()
  • SIN()
  • VARIANCE()

Typing the beginning of these in the “Formula” field on the “New Field” page will provide an autocomplete list.

What’s more, selecting one of these functions gives you all the instructions you need to use it!


4.5 Qualitative Transformations

Qualitative transformations of existing variables can be a bit more advanced, but there are some easy, neat tricks.

  • Values with qualitative or “text” data are called strings
  • Transforming qualitative variables is known as string manipulation
  • Advanced pattern detextion, matching, extraction, and replacement may be done with regular expressions

Regular expressions, or regex, is outside the scope of this tutorial, but we can use some basic techniques.


4.5.1 Generating Sentences

We can generate sentences that change with our selected data using function CONCAT().

  • CONCAT() is short for “concatenation”, which sort of means “to paste”
  • We just type in desired text in quotations and include the field name
  • Separate the quotations and variable names with commas
  • Your variable doesn’t have spaces, so include those!
CONCAT("This organization has served ", Served, " clients in total.")


Contextualizing a Variable with Autogenerated Text

You can use multiple variables with the CONCAT() function and create very sophisticated, autogenerated text.


Autogenerated Text in a Table

This is just one sentence. Think of this as a building block for elaborate information architecture.


Every shred of qualitative data in this image is an autogenerated table.

This report uses a series of CONCAT(), regex, and if-then functions to autoformat all data from survey responses.


4.5.2 Applied Practice

Instructions: Using function CONCAT(), create a new variable named Address by concatenating:

  • Field Address Line 1
  • Field Address Line 2


4.5.3 Extracting Text

We can use the regular expressions function REGEXP_EXTRACT() to pull the city and state from each line.

  • Regular expressions are daunting at first
  • There daunting after that, too
  • They’re also somewhat frustrating in Data Studio and are extremely particular
    • I.e. if it works in Google Sheets with REGEX_EXTRACT(), e.g., it may not work with REXP_EXTRACT()

Here, we’ll use variable Address Line 2 to extract city and state and convert to a type “geo” field. Here:

  • The first argument is your field, Address Line 2
  • The second is a regular expression

In English, it’s saying “extract any character, any number of times, until you hit the end of ’ NY’”.

REGEXP_EXTRACT(Address Line 2, "(.* NY)"))


Using Regular Expressions to Extract City, State

The metacharacter . means “any character”, while * means “any number of times”.


Extracted “City” Variable Using Regular Expressions

Note that the regular expression only extracted text up to the end of “NY”.


Extracted City & State Converted to a Dot Map

By converted our extracted text to type “geo”, we can visualize it in a dot map of New York State.


4.6 Further Resources

Regular Expressions in Google Apps: Learn more about string manipulation with Google’s RE2 documentation.

Google Functions: Learn more about functions in Google apps with Google Sheets function list.


5 Score Cards

Score cards are pretty easy to make compared to string manipulation.

  • Select Insert from the Data Studio menu
  • Place the score card on your canvas
  • Resize and stylize as appropriate
  • Select an aggregation function, e.g. SUM()

Notably, you can make “Compact Numbers” in the Style Tab, which convers, e.g., “121,200” to “121.2 K”.


A Score Card for Total Clients Served

Score Cards help track key performance indicators and other values you should keep an eye on.