Data Visualisation Class Week 1

Chapter 1: Design and Integrity

Dr James Baglin

Last updated: 29 January, 2020

How to use these slides

Viewing slides…

  • Press ‘f’ to enable fullscreen mode.
  • Press ‘o’ or ‘Esc’ to enable overview mode.
  • Press ‘Esc’ to exit any of these modes.
  • Hold down ‘Alt’ and click on any element to zoom in. ‘Alt’ + click anywhere to zoom back out.
  • Use the Search box (top right) to search for keywords in the presentation.

Printing slides…

Coming soon

Slides coming soon…

Teaching Team

Course Coordinator

  • Dr James Baglin (Weeks 2-12)
  • Senior Lecturer, School of Science, Mathematical Sciences
  • Email: james.baglin@rmit.edu.au
  • Phone: 03 9925 6118
  • Office: Building 8, Level 9, Room 69

Lecturer/Tutors

Quiz!

Class Discussion

Why do we visualise data?

  • Andy Kirk (2012) defined data visualisation as “the representation and presentation of data that exploits our visual perception abilities in order to amplify cognition” (p.17)
  • Why do we visualise?
    • Exploration: identifying interesting and important features (Buja et al. 2009)
    • Assisting informal inference (Bakker et al. 2008)
    • Assisting in teaching statistics (Paparistodemou and Meletiou-Mavrotheris 2008)
    • Exploiting our visual processing power to rapidly process, compare and identify trends or interesting features in the data (Kirk 2012)
    • Making data more appealing

Why do we visualise data? Cont.

“The greatest value of a picture is when it forces us to notice what we never expected to see.” - John W. Tukey, Exploratory Data Analysis (1977)



Course Orientation

  • Postgrads: This course assumes you have completed MATH1324 Introduction to Statistics (or equivalent) and you are familiar with R.
  • MATH2349 Data Preprocessing is advantageous, but not required.
  • Undergraduates: MATH2200 Introduction to Probability and Statistics, MATH2201 Basic Statistical Methodologies, MATH2202 Data Preparation for Analytics, and familiarity with R will be highly beneficial.
  • If your R is rusty, please work through the R Bootcamp


Course Orientation Cont.

  • Class time:
    • Announcements and Questions
    • Demonstration
    • Challenge Exercise/Revision Quiz
    • Supervised self-directed learning
  • Course Schedules
  • Flexible Learning: MATH2270 and MATH2237 have been designed to run online. Classes for MATH2270 are recorded and available in Canvas under the Echo360 link.
  • Students in MATH2237 can watch the MATH2270 recordings.

Slack

  • Slack is the communication tool for both MATH2270 and MATH2237.

https://datavis-math2270-1910.slack.com

Activity - Turning Tables

  • The following table was taken from the Australian Energy Update Report 2018, published on the Department of the Environment and Energy website.
  • How does Victoria stack up against other states in terms of renewable energy use?
  • Turn the table into a visualisation - use any data visualisation technology you are familiar with.
  • Post your visualisations to Slack under the #exercises channel.

Reflection

  • Reflect on the exercise and discuss the following questions:
    • How did you approach the task?
    • What did you find difficult?
    • What decisions were you faced with?
    • Were you happy with your design?

What will I learn in this course?

  • Chapter 1: The Craft of Data Visualisation
  • Chapter 2: A Grammar for Data Visualisation
  • Chapter 3: Visual Perception
  • Chapter 4: Colour
  • Chapter 5: Univariate and Bivariate Data Visualisations
  • Chapter 6: Multivariate Data Visualisations
  • Chapter 7: Spatial Data Visualisation
  • Chapter 8: Getting Interactive
  • Chapter 9: Data Visualisation Apps
  • Chapter 10: Dashboards

What technology will I learn?

  • Technology (Open source):
    • ggplot2 (General data visualisation)
    • Plotly (D3 interactive visualisation)
    • Leaflet (Spatial)
    • Shiny (Interactive application development)

A Visual Design Process - Andy Kirk (2012)

  • Objectives:
    • Strive for form and function
    • Justify everything you do
    • Keep it accessible and intuitive
    • Avoid deceiving the viewer
  • Purpose - Why are you visualising the data?
  • Editorial Focus - What is the narrative?
  • Options - What decisions need to be made?
  • Methods - What method should you use? Existing or new?
  • Construction - Getting the job done. What technology should I use?
  • Evaluation - Did you achieve your goals?

Trifecta Check-up

Kaiser Fung, author of Junk Charts, provides a simple and powerful framework for evaluating a data visualisation, called the Trifecta Check-up. It asks three questions: What is the question? What does the data say? What does the visual say?

Critique

  • Critique the following data visualisation by Greg Laden according to the Trifecta Check-up.

Arctic Death Spiral

  • You can read Kaiser’s review here.

Course Feedback

  • I take feedback seriously.
  • You can see a history of student feedback on Canvas -> Modules -> Welcome & Orientation -> Student Feedback.
  • CES feedback comes too late to benefit the current class.
  • You can also fill out this form (available from the Student Feedback page on Canvas) at any stage of the semester.
  • Often, I can solve issues right away :)

Assessment

  • Course assessment comprises the following:
    • Assignments (50%): Three challenging assignments spread throughout the course. All three are open-ended. Work on data visualisation topics that interest you.
    • Exam (50%): Supervised, paper-based exam during the exam period. Details to follow later in the semester.

Assessment Cont.

  • All assignments are due on Mondays at 11:59pm AEST, unless otherwise stated.
  • Assignment 1 is available on Canvas.

Practice Makes Perfect

Each chapter will include revision and practice material.

  • Chapter Quizzes: Each chapter will have a short review quiz to assess your knowledge and help you prepare for the exam.
  • Challenge Exercises: These activities are fun and great for developing your data visualisation skills. Show off your solutions on Slack.

Completion of these activities will not be assessed, but they will greatly assist your preparation for the assignments and exam.

Class Time

Use the remaining class time to work on the following:

References

Bakker, A., P. Kent, J. Derry, R. Noss, and C. Hoyles. 2008. “Statistical inference at work: Statistical process control as an example.” Statistics Education Research Journal 7 (2): 130–45. http://www.stat.auckland.ac.nz/serj.

Buja, A., D. Cook, H. Hofmann, M. Lawrence, E.-K. Lee, D. F. Swayne, and H. Wickham. 2009. “Statistical inference for exploratory data analysis and model diagnostics.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 367 (1906): 4361–83. doi:10.1098/rsta.2009.0120.

Kirk, A. 2012. Data visualization: a successful design process. Birmingham, UK: Packt Publishing Ltd.

Paparistodemou, E., and M. Meletiou-Mavrotheris. 2008. “Developing young students’ informal inference skills in data analysis.” Statistics Education Research Journal 7 (2): 83–106. http://www.stat.auckland.ac.nz/serj.

Tukey, J. W. 1977. Exploratory data analysis. Reading, MA: Addison-Wesley.