1 Welcome

Welcome to Picture This: Applied Practice in Data Visualization.

This page contains a host of materials for practicing data visualization, including:

  • Links to the original presentation
  • Thirteen preprocessed datasets for applied practice
  • Dataset documentation to explore and understand your data
  • Resources for color, accessibility, and other tools to polish your viz
  • Tutorials to learn about and connect your data to Google Data Studio & MS Excel




2 Presentation

Click here to download Tuesday’s presentation, Picture This: Best Practices in Data Visualization.



Recall the main takeaways from Tuesday’s presentation:

  • You’re already an expert (i.e. you know what looks good).
  • You’re always persuading (i.e. every chart is manipulative).
  • Respect your audience (i.e. their intelligence, time, etc.).
    • Use the best “data ink” and limit eye movement
    • Rely on conventions (e.g. “up” is “North”, “green” is “good”)
    • Lose the junk (i.e. ink should only aid understanding)
    • Add polish (i.e. spend 5-10 extra minutes on a viz)



You can view the code to process our practice datasets and write this page by visiting the GitHub repository.

Lastly, click here to view the handout on choosing the best visualization tooling.





3 Practice Datasets

Feel free to use your own data or peruse the available practice datasets and documentation below.

Data were selected for their variety to support several approaches to data visualization.

Some datasets are moderately large, others are quite small, but all are clean.

  • Variable names are formatted for case, punctuation, and descriptiveness
  • Dates are in ISO format (YYYY-MM-DD) and are also split into separate Month and Year variables
  • Coordinates are concatenated into pairs covering each possible longitude-latitude ordering
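
The kind of cleaning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the GitHub repository: the function names (clean_name, date_parts, coordinate_strings) and the output labels are hypothetical, and the sample values are copied from the previews further down this page.

```python
from datetime import date


def clean_name(raw: str) -> str:
    """Turn a raw variable name like 'sleep_total' into 'Sleep Total'."""
    return raw.replace("_", " ").replace(".", " ").strip().title()


def date_parts(iso_text: str) -> dict:
    """Parse an ISO date (YYYY-MM-DD) and split out the year and month name."""
    d = date.fromisoformat(iso_text)
    # %B gives the full month name; it is English under Python's default locale.
    return {"Date": d.isoformat(), "Year": d.year, "Month": d.strftime("%B")}


def coordinate_strings(longitude: float, latitude: float) -> dict:
    """Concatenate a coordinate pair in both possible orders."""
    return {
        "Longitude, Latitude": f"{longitude}, {latitude}",
        "Latitude, Longitude": f"{latitude}, {longitude}",
    }


if __name__ == "__main__":
    print(clean_name("sleep_total"))
    print(date_parts("1967-07-01"))
    print(coordinate_strings(-95.4373883, 29.6779015))
```

Keeping both coordinate orders is handy because some mapping tools expect "latitude, longitude" while others expect the reverse.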


It’s up to you to find the stories in the data.




3.1 Mammalian Sleep Patterns

Total Observations: 83

Total Variables: 11

Recommended Stories: Comparisons

Sleep patterns of 83 mammals, including their common name, genus, diet, total sleep state, REM state, and awake state (in hours), as well as body and brain weight.

View the documentation here.


Mammal                Genus     Vore       Order      Status  Total Sleep  REM Cycle
Arctic fox            Vulpes    Carnivore  Carnivora          12.5
Bottle-nosed dolphin  Tursiops  Carnivore  Cetacea            5.2
Caspian seal          Phoca     Carnivore  Carnivora  VU      3.5          0.4
Cheetah               Acinonyx  Carnivore  Carnivora  LC      12.1
Common porpoise       Phocoena  Carnivore  Cetacea    VU      5.6


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.2 U.S. Economics

Total Observations: 574

Total Variables: 6

Recommended Stories: Timeseries

Historical data on economic indicators in the United States from 1967 to 2015, including total population, personal savings rates, personal consumption expenditures, unemployment rates, and median unemployment duration (in weeks).

View the documentation here.


Date        Year  Month      Population   Consumption       Savings  Duration  Unemployed
1967-07-01  1967  July       198,712,000  $506,700,000,000  12.6     4.5       2,944,000
1967-08-01  1967  August     198,911,000  $509,800,000,000  12.6     4.7       2,945,000
1967-09-01  1967  September  199,113,000  $515,600,000,000  11.9     4.6       2,958,000
1967-10-01  1967  October    199,311,000  $512,200,000,000  12.9     4.9       3,143,000
1967-11-01  1967  November   199,498,000  $517,400,000,000  12.8     4.7       3,066,000


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.3 Australian Wine

Total Observations: 176

Total Variables: 6

Recommended Stories: Timeseries

Historical data on monthly wine sales in Australia by total bottles from 1980 to 1994.

View the documentation here (pp. 137-138).


Year  Month     Date        Year & Month    Year & Month (ISO)  Bottles
1980  January   1980-01-01  1980, January   1980-01             15,136
1980  February  1980-02-01  1980, February  1980-02             16,733
1980  March     1980-03-01  1980, March     1980-03             20,016
1980  April     1980-04-01  1980, April     1980-04             17,708
1980  May       1980-05-01  1980, May       1980-05             18,019


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.4 Houston Crime

Total Observations: 86,314

Total Variables: 19

Recommended Stories: Geospatial, Timeseries

Dates and times of violent crimes reported in Houston, Texas, including census blocks, cop beats, and longitude-latitude coordinates.

View the documentation here.


Date        Offense             Street     Day     Longitude    Latitude
2010-01-01  Murder              Marlive    Friday  -95.4373883  29.6779015
2010-01-01  Robbery             Telephone  Friday  -95.2988769  29.6917121
2010-01-01  Aggravated Assault  Wickview   Friday  -95.455864   29.5992174
2010-01-01  Aggravated Assault  Ashland    Friday  -95.4033373  29.7902425
2010-01-01  Aggravated Assault  Canyon     Friday  -95.3779081  29.6706341


Open it here in Google Sheets to copy and connect it to Google Data Studio.

This is a big table! If you’re only using Microsoft Excel, click here to directly download as .CSV.




3.5 Anderson’s Irises

Total Observations: 150

Total Variables: 6

Recommended Stories: Comparisons

Edgar Anderson’s 150 samples of three species of Iris flowers (Setosa, Versicolor, and Virginica), including the dimensions of their petals and sepals.

View the documentation here.


Iris  Species  Sepal Length  Sepal Width  Petal Length  Petal Width
1     Setosa   5.1           3.5          1.4           0.2
2     Setosa   4.9           3.0          1.4           0.2
3     Setosa   4.7           3.2          1.3           0.2
4     Setosa   4.6           3.1          1.5           0.2
5     Setosa   5.0           3.6          1.4           0.2


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.6 College Score Cards

Total Observations: 7,112

Total Variables: 21

Recommended Stories: Geospatial, Timeseries

Reduced from 1,978 variables, the U.S. Department of Education’s College Score Cards contain key data points on post-secondary institutions, including ACT and SAT scores, admissions rates, total undergrads, tuition revenue, and average faculty salaries, as well as longitude-latitude coordinates.

View the documentation here.

Find the data portal here.


ID      Institution                          City        State  Longitude   Latitude
100654  Alabama A & M University             Normal      AL     -86.568502  34.783368
100663  University of Alabama at Birmingham  Birmingham  AL     -86.799345  33.505697
100690  Amridge University                   Montgomery  AL     -86.17401   32.362609
100706  University of Alabama in Huntsville  Huntsville  AL     -86.640449  34.724557
100724  Alabama State University             Montgomery  AL     -86.295677  32.364317


Open it here in Google Sheets to copy and connect it to Google Data Studio.

This is a big table! If you’re only using Microsoft Excel, click here to directly download as .CSV.




3.7 Australian Wool

Total Observations: 119

Total Variables: 3

Recommended Stories: Timeseries

Quarterly woolen yarn sales in tons from 1965 to 1994.

View the documentation here (p. 138).


Year  Quarter  Tons
1965  1        6,172
1965  2        6,709
1965  3        6,633
1965  4        6,660
1966  1        6,786


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.8 Fuel Efficiency

Total Observations: 234

Total Variables: 11

Recommended Stories: Comparisons

Comparison data on the fuel economy of 38 popular cars from 1999 and 2008, including make, model, engine displacement, cylinders, transmission type, class, and city/highway mileage.

View the documentation here.


Make  Model  Year  Cylinders  Transmission  City  Highway
Audi  A4     1999  4          Manual        18    29
Audi  A4     1999  4          Manual        21    29
Audi  A4     2008  4          Manual        20    31
Audi  A4     2008  4          Manual        21    30
Audi  A4     1999  6          Manual        16    26


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.9 53,000 Diamonds

Total Observations: 53,940

Total Variables: 9

Recommended Stories: Comparisons

A massive set of 53,940 diamonds described by carat weight, dimensions (length, width, and depth), and professionally assessed color, cut, and clarity.

View the documentation here.


Cut    Carats  Color  Clarity  Price (USD)  Length  Width  Depth
Ideal  3.50    H      I1       $12,587      9.65    9.59   6.03
Ideal  3.22    I      I1       $12,545      9.49    9.42   5.92
Ideal  3.01    J      SI2      $16,037      9.25    9.20   5.69
Ideal  3.01    J      I1       $16,538      8.99    8.93   5.86
Ideal  2.75    D      I1       $13,156      9.04    8.98   5.49


Open it here in Google Sheets to copy and connect it to Google Data Studio.

This is a big table! If you’re only using Microsoft Excel, click here to directly download as .CSV.




3.10 Texas Real Estate

Total Observations: 8,602

Total Variables: 11

Recommended Stories: Comparisons, Geospatial, Timeseries

Historical data on Texas real estate sales from 2000 to 2015, including city, state, date of sale, total sales, total value (USD), median price, and estimated duration to sell all listed properties.

View the documentation here.


Year  City     State  Month     Total Sales  Total Value  Median Price
2000  Abilene  Texas  January   72           $5,380,000   $71,400
2000  Abilene  Texas  February  98           $6,505,000   $58,700
2000  Abilene  Texas  March     130          $9,285,000   $58,100
2000  Abilene  Texas  April     98           $9,730,000   $68,600
2000  Abilene  Texas  May       141          $10,590,000  $67,300


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.11 Daily Gold Prices

Total Observations: 1,108

Total Variables: 4

Recommended Stories: Timeseries

Daily morning valuations of gold in USD from 1985 to 1988.

View the documentation here (pp. 84-85).


Date        Year  Month    USD
1985-01-01  1985  January  $306.25
1985-01-02  1985  January  $299.50
1985-01-03  1985  January  $303.45
1985-01-04  1985  January  $296.75
1985-01-05  1985  January  $304.40


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.12 Chick Weights

Total Observations: 71

Total Variables: 3

Recommended Stories: Comparisons

Experiment results from randomly assigning newborn chicks to different diets, like soybean and meatmeal, and measuring their weight.

View the documentation here.


Chick  Feed Group  Grams
1      Horsebean   179
2      Horsebean   160
3      Horsebean   136
4      Horsebean   227
5      Horsebean   217


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




3.13 Classic Cars

Total Observations: 32

Total Variables: 12

Recommended Stories: Comparisons

Specs of 32 classic cars from a 1974 issue of U.S. magazine Motor Trend, including fuel economy, cylinders, gears, weight, engine displacement, horsepower, and transmission.

View the documentation here.


Model              MPG   Cylinders  Displacement  Horsepower  Pounds  Transmission
Mazda RX4          21.0  6          160           110         2,620   Manual
Mazda RX4 Wag      21.0  6          160           110         2,875   Manual
Datsun 710         22.8  4          108           93          2,320   Manual
Hornet 4 Drive     21.4  6          258           110         3,215   Automatic
Hornet Sportabout  18.7  8          360           175         3,440   Automatic


Open it here in Google Sheets to copy and connect it to Google Data Studio or download as a .CSV.




4 Connecting to Data Studio

Google Data Studio is a point-and-click, drag-and-drop application for creating interactive reporting tools.

In order to create a report and connect your data, you must:

  1. Sign into your Google account
  2. Copy or upload a Google Sheets table to your Google Drive
  3. Create a new report in Google Data Studio (with one click)
  4. Select your connection application (Google Sheets)
  5. Make sure your variables are the correct class and add your new “connection”




4.1 Copy an Existing Dataset

Open up a dataset in Google Sheets.

  1. Click “File” and “Make a copy…” (see Step 1.1)
  2. Make sure “Folder” is set to “My Drive” (see Step 1.2)
  3. Click “OK”


Step 1.1: Select “Make a copy…” in “File”



Step 1.2: Make sure “Folder” is set to “My Drive” and click “OK”





4.2 Create & Connect Reports

In order to create a new report and connect it to your data, visit the Data Studio home page.

  1. Click “Start a New Report” (see Step 2.1)
  2. Select “Create New Data Source” (see Step 2.2)
  3. Select your connection method (see Step 2.3)


Step 2.1: Click “Start a New Report”



Step 2.2: Click “Create New Data Source”



Step 2.3: Select “Google Sheets” as your connection method





4.3 Edit Your Connection

It’s the final stretch. You just have to choose your table and check your variables.

  1. Select the spreadsheet, worksheet, and options from Google Sheets (see Step 3.1)
  2. Click “OK”, then check that the “Type” of each variable is correct (see Step 3.2)
  3. Click “Add to Report”


Step 3.1: Select the correct “Spreadsheet” (B) and options (D)



Step 3.2: Check each variable “Type” and select “Add to Report”
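
Before connecting a table, it can help to preview which type each variable is likely to be. The following is a minimal Python sketch, not part of Data Studio: the labels "Number", "Date", and "Text" and the functions guess_type and infer_types are hypothetical names chosen for illustration, and the sample rows are copied from the Daily Gold Prices preview above.

```python
import csv
import io
from datetime import date

# Two rows copied from the Daily Gold Prices preview, written as CSV text.
SAMPLE = """Date,Year,Month,USD
1985-01-01,1985,January,$306.25
1985-01-02,1985,January,$299.50
"""


def _is_number(value: str) -> bool:
    """True if the text parses as a number once '$' and ',' are removed."""
    try:
        float(value.replace(",", "").lstrip("$"))
        return True
    except ValueError:
        return False


def _is_iso_date(value: str) -> bool:
    """True if the text parses as an ISO date such as 1985-01-01."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def guess_type(values: list) -> str:
    """Label a column 'Number', 'Date', or 'Text' from its non-empty cells."""
    cells = [v for v in values if v.strip()]
    if not cells:
        return "Text"
    # Numbers are checked first so that plain years like 1985 stay numeric.
    if all(_is_number(v) for v in cells):
        return "Number"
    if all(_is_iso_date(v) for v in cells):
        return "Date"
    return "Text"


def infer_types(csv_text: str) -> dict:
    """Map each column name in a CSV string to its guessed type."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {name: guess_type([row[name] for row in rows]) for name in reader.fieldnames}


if __name__ == "__main__":
    for column, kind in infer_types(SAMPLE).items():
        print(f"{column}: {kind}")
```

If a guess here disagrees with the “Type” shown in Step 3.2, that variable is worth a second look before clicking “Add to Report”.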




5 Visualization Resources

The following are just a few resources for creating effective visualizations with any tooling.




5.1 Choose the Best Viz

Which visualization is best depends on the shape and type of your data. Data Viz Project can help.





5.2 Extract Colors

HTML Color Codes allows you to upload images and logos to extract their precise colors.





5.3 Generate Palettes

Coolors is an excellent tool for generating complementary and gradient color palettes.
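
To get a feel for what "complementary" and "gradient" mean in practice, here is a rough Python sketch. It is not how any particular palette tool works internally: it assumes a complement is the hue rotated halfway around the color wheel and a gradient is a straight-line blend between two colors. The function names are hypothetical.

```python
import colorsys


def hex_to_rgb(hex_code: str) -> tuple:
    """Convert '#RRGGBB' to a tuple of three integers from 0 to 255."""
    hex_code = hex_code.lstrip("#")
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb: tuple) -> str:
    """Convert three integers from 0 to 255 back to '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def complement(hex_code: str) -> str:
    """Rotate the hue by half a turn, keeping lightness and saturation."""
    r, g, b = (channel / 255 for channel in hex_to_rgb(hex_code))
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    rotated = colorsys.hls_to_rgb((hue + 0.5) % 1.0, lightness, saturation)
    return rgb_to_hex(tuple(round(channel * 255) for channel in rotated))


def gradient(start_hex: str, end_hex: str, steps: int) -> list:
    """Blend linearly from start to end in `steps` colors (steps >= 2)."""
    start, end = hex_to_rgb(start_hex), hex_to_rgb(end_hex)
    palette = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the first color, 1.0 at the last
        blended = tuple(round(a + (b - a) * t) for a, b in zip(start, end))
        palette.append(rgb_to_hex(blended))
    return palette


if __name__ == "__main__":
    print(complement("#FF0000"))
    print(gradient("#000000", "#FFFFFF", 3))
```

The hex codes it prints can be pasted into the color pickers of most reporting tools.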





5.4 Mind Accessibility

If using light fonts or dark backgrounds, check with WebAIM that the contrast is still visually accessible.
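
Contrast checkers of this kind report a ratio between text and background colors. The Python sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio, where 4.5:1 is the Level AA minimum for normal-size text. The function names are hypothetical, and this is an illustration rather than WebAIM's own code.

```python
def relative_luminance(hex_code: str) -> float:
    """Relative luminance of an sRGB color given as '#RRGGBB' (WCAG 2.x)."""
    hex_code = hex_code.lstrip("#")
    channels = [int(hex_code[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma curve for each channel before weighting.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio from 1 (identical colors) to 21 (black on white)."""
    first = relative_luminance(foreground)
    second = relative_luminance(background)
    lighter, darker = max(first, second), min(first, second)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str) -> bool:
    """True if the pair meets the 4.5:1 AA minimum for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5


if __name__ == "__main__":
    for text_color in ("#000000", "#767676", "#777777"):
        ratio = contrast_ratio(text_color, "#FFFFFF")
        print(text_color, round(ratio, 2), passes_aa(text_color, "#FFFFFF"))
```

Large or bold text has a lower AA minimum of 3:1, so a light gray that fails for body text may still be acceptable for a chart title.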





5.5 Use Legible Fonts

Though Google Data Studio fonts are limited, they’re extremely legible.

Consider downloading some font families to complement your reports and presentations in Google Fonts.

Tuesday’s presentation used “Quicksand” - it’s dope. You can easily look up how to install new fonts.





6 Contact

Feel free to contact me at any time with questions or for help.




7 Thanks

Thank you to everyone who’s participated in Picture This: Best Practices in Data Visualization.

Special thanks to Dori Neptune for facilitating all logistics - we couldn’t have done it without her.

Lastly, thanks to the following and their leadership for trusting me in front of an audience: