Day 1

Introduction

Something has changed

Data used to be something that was collected in lab notebooks or in government offices and published in printed tables. Recording each piece of data required human intervention; accessing each piece required human interpretation.

Nowadays, data is recorded automatically and collected in vast volumes. It is easily copied and transferred.

What is Data?

Data are measurements together with the context in which they were taken: information collected for a purpose.

Major types of information being collected nowadays:

What is “Big Data”?

The goal of computing on data is to reduce it to a form that supports decisions and judgments. We'll call this a “presentation” of data.

Some goals for data presentations:

Goals for this course

One way to think about this is as the “grammar of data.” We want you to know what the nouns and verbs are, and which combinations make sense.

What we won't do

Style of the class

You are pioneers

Today's Agenda

Log in to R. Basic aspects of syntax:
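
As a rough illustration of the kind of syntax meant here, the sketch below shows assignment, vectors, function calls, and named arguments in base R. The variable names and values are made up for illustration and are not from the course materials.

```r
# Assignment: store a value under a name with <-
x <- 3
y <- x^2 + 1                     # arithmetic on a named value

# Vectors: c() combines several values into one object
heights <- c(150, 162, 171)      # made-up values, for illustration only

# Functions take their arguments in parentheses
avg <- mean(heights)

# Named arguments adjust how a function behaves
rounded <- round(avg, digits = 1)

print(y)
print(rounded)
```

Typing a name on its own at the console, such as `heights`, displays the value stored under it.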

Create an R Markdown document about basic syntax.

Tabular data

Structure of tabular data:

Basic operations on data frames: examples with the small data sets CPS85, KidsFeet, and Utilities.
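
A minimal sketch of some basic data frame operations, using base R only. To keep it self-contained, it builds a tiny made-up table called `kids` (a hypothetical stand-in, not the actual KidsFeet data); the same calls should work on any data frame.

```r
# A tiny made-up table: each row is a case, each column is a variable.
# These values are invented for illustration.
kids <- data.frame(
  name   = c("A", "B", "C", "D"),
  sex    = c("B", "G", "G", "B"),
  length = c(24.4, 25.4, 23.0, 26.1),
  width  = c(8.4, 8.8, 9.0, 9.3)
)

nrow(kids)       # number of cases (rows)
names(kids)      # names of the variables (columns)
head(kids, 2)    # the first two cases

# Pull out one variable as a vector with $
kids$length

# Keep only the cases that meet a condition
wide <- subset(kids, width > 8.5)

# Summarize a single variable
mean_length <- mean(kids$length)
```

The original table is left unchanged by `subset()`; the result is a new data frame stored under the name `wide`.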

Today's graphics modality: The scatter plot

Quick scatter plots: xyplot
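
A quick scatter plot with `xyplot` might look like the sketch below. `xyplot` comes from the lattice package, and the formula `y ~ x` names the vertical and horizontal variables. The `kids` table is made up for illustration and is an assumption, not course data.

```r
library(lattice)   # xyplot() is provided by the lattice package

# Made-up values, for illustration only
kids <- data.frame(
  sex    = c("B", "G", "G", "B"),
  length = c(24.4, 25.4, 23.0, 26.1),
  width  = c(8.4, 8.8, 9.0, 9.3)
)

# width on the vertical axis, length on the horizontal axis;
# groups = colors the points by a third variable
p <- xyplot(width ~ length, data = kids,
            groups = sex, auto.key = TRUE)

print(p)   # draw the plot
```

At the console the plot is drawn automatically when the call is typed directly; storing it in `p` and printing it is mainly useful inside scripts.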

Graphical principles of scatter plots

Elaborate scatter plots with mScatter:

Use of R Markdown

Larger data: FAOsimple country data

How to make a scatter plot.

Scaling and logarithms.
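
One way to see why logarithms help with scaling: when values span several orders of magnitude, a log axis spaces them evenly instead of crowding the small values together. The sketch below uses a made-up `countries` table (hypothetical values, not the FAOsimple data) and the `scales` argument of lattice's `xyplot`.

```r
library(lattice)

# Made-up values spanning several orders of magnitude (illustration only)
countries <- data.frame(
  population = c(1e5, 1e6, 1e7, 1e8, 1e9),
  area       = c(2e2, 5e3, 3e4, 8e5, 9e6)
)

# log10() turns each factor of 10 into a step of 1
log_pop <- log10(countries$population)

# Plain axes: most points bunch up near the origin
p_linear <- xyplot(area ~ population, data = countries)

# Log axes: request base-10 scaling for both x and y
p_log <- xyplot(area ~ population, data = countries,
                scales = list(x = list(log = 10),
                              y = list(log = 10)))

print(p_log)
```

Comparing `print(p_linear)` with `print(p_log)` shows the difference the axis scaling makes.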

Work through the in-class activity.