What’s Covered

   


Intro to Basics

How it Works

  • This section has some pretty juicy R charts.
  • This was hot stuff back in 1996.
    • Many of these chart styles are still useful for data exploration.
    • Though I would not use base R graphics to make them.
    • I’d start with ggplot2 or ggvis.
    • Still, it’s fun to see these charts. Don’t worry about the code to make them for now.

Arithmetic with R

R handles the usual arithmetic: addition, subtraction, multiplication, division, exponentiation, and modulo.

## [1] 10
## [1] 0
## [1] 15
## [1] 5
## [1] 32
## [1] 1
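
The chunk that produced this output is not shown. Here is a minimal sketch that prints the same six values, assuming the usual tour of operators. The operands are my guess, especially on the modulo line.

```r
# Operands are assumptions chosen to reproduce the printed results
5 + 5        # addition
5 - 5        # subtraction
3 * 5        # multiplication
(5 + 5) / 2  # division
2^5          # exponentiation
7 %% 2       # modulo: remainder of 7 divided by 2
```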

Apples and oranges

  • This would fail because you can’t add a character value to a number.
  • Left as is, RStudio won’t even knit the document.
  • The chunk throws an error.
  • I have eval=F on the chunk, so it is skipped, the document knits, and the error is pasted below.

Error in my_apples + my_oranges : non-numeric argument to binary operator
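
The original assignments are not shown, so this is a sketch that assumes `my_oranges` held text such as "six". It reproduces the error and captures the message with `tryCatch()` so that execution continues.

```r
my_apples <- 5
my_oranges <- "six"   # assumed value; any character string triggers the error

# tryCatch() returns the error message instead of halting execution
msg <- tryCatch(
  my_apples + my_oranges,
  error = function(e) conditionMessage(e)
)
print(msg)

# With numeric values on both sides the addition works
my_oranges <- 6
my_fruit <- my_apples + my_oranges
print(my_fruit)
```

Another option in knitr is the chunk option `error = TRUE`, which prints the error in the output and keeps knitting.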

Vectors

Create a vector

## [1] "Here we go!"
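
A minimal sketch of creating vectors with `c()`. The variable names are placeholders I picked; only the printed string comes from the output above.

```r
# c() combines values into a vector; all elements share one type
numeric_vector <- c(1, 10, 49)
character_vector <- c("a", "b", "c")
boolean_vector <- c(TRUE, FALSE, TRUE)

# A single string is a character vector of length one
vegas <- "Here we go!"
print(vegas)
```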

Naming a vector

##  Mon Tues  Wed Thur  Fri 
##  140  -50   20 -120  240
##  Mon Tues  Wed Thur  Fri 
##  -24  -50  100 -350   10
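
The values and day labels below are read off the output. The variable names are assumptions, since the source chunk is missing.

```r
poker_vector <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)

# names() attaches a label to each element
days_vector <- c("Mon", "Tues", "Wed", "Thur", "Fri")
names(poker_vector) <- days_vector
names(roulette_vector) <- days_vector

print(poker_vector)
print(roulette_vector)
```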

Vector selection: the good times (2)

##   Tuesday Wednesday  Thursday 
##       -50        20      -120

Vector selection: the good times (3)

##   Tuesday Wednesday  Thursday    Friday 
##       -50       100      -350        10
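
A sketch of the two selections above, assuming vectors named `poker_vector` and `roulette_vector` with full day names (the values are taken from the printed output).

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
poker_vector <- setNames(c(140, -50, 20, -120, 240), days)
roulette_vector <- setNames(c(-24, -50, 100, -350, 10), days)

# Select by position with a vector of indices
poker_midweek <- poker_vector[c(2, 3, 4)]
print(poker_midweek)

# The colon operator builds a run of indices: 2, 3, 4, 5
roulette_selection <- roulette_vector[2:5]
print(roulette_selection)

# Selecting by name gives the same elements as selecting by position
poker_by_name <- poker_vector[c("Tuesday", "Wednesday", "Thursday")]
```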

Selection by comparison - Step 1

##    Monday   Tuesday Wednesday  Thursday    Friday 
##      TRUE     FALSE      TRUE     FALSE      TRUE
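
A comparison on a vector returns one logical value per element. This sketch assumes the same `poker_vector` as before and also shows the likely next step of using the logical vector to subset.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
poker_vector <- setNames(c(140, -50, 20, -120, 240), days)

# Step 1: which days had positive winnings?
selection_vector <- poker_vector > 0
print(selection_vector)

# A logical vector inside [] keeps only the TRUE positions
poker_winning_days <- poker_vector[selection_vector]
print(poker_winning_days)
```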

Matrices

What’s a matrix?

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
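
This output is what `matrix()` gives when the numbers 1 to 9 are filled row by row. A sketch, with `m` as a placeholder name:

```r
# byrow = TRUE fills the matrix across rows instead of down columns
m <- matrix(1:9, byrow = TRUE, nrow = 3)
print(m)

# dim() reports rows and columns
dim(m)
```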

Adding a column for the worldwide box office

##                            US non-US worldwide_vector
## A New Hope              461.0  314.4            775.4
## The Empire Strikes Back 290.5  247.9            538.4
## Return of the Jedi      309.3  165.8            475.1
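
A sketch of how this table could be built, assuming the row totals were stored in a variable called `worldwide_vector` (that name shows up as the column header, which is what `cbind()` does with a named variable).

```r
box_office <- c(461.0, 314.4, 290.5, 247.9, 309.3, 165.8)
star_wars_matrix <- matrix(
  box_office, nrow = 3, byrow = TRUE,
  dimnames = list(
    c("A New Hope", "The Empire Strikes Back", "Return of the Jedi"),
    c("US", "non-US")
  )
)

# rowSums() adds up each row: US plus non-US per film
worldwide_vector <- rowSums(star_wars_matrix)

# cbind() appends the totals as a new column
all_wars_matrix <- cbind(star_wars_matrix, worldwide_vector)
print(all_wars_matrix)
```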

Adding a row

  • In the course, star_wars_matrix2 came preloaded.
  • Here I just copy the original.
  • This works to show what rbind does.

##                            US non-US
## A New Hope              461.0  314.4
## The Empire Strikes Back 290.5  247.9
## Return of the Jedi      309.3  165.8
##                         US non-US
## The Phantom Menace   474.5  552.5
## Attack of the Clones 310.7  338.7
## Revenge of the Sith  380.3  468.5
##                            US non-US
## A New Hope              461.0  314.4
## The Empire Strikes Back 290.5  247.9
## Return of the Jedi      309.3  165.8
## The Phantom Menace      474.5  552.5
## Attack of the Clones    310.7  338.7
## Revenge of the Sith     380.3  468.5
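
A sketch of the `rbind()` step using the numbers printed above. The helper `make_trilogy()` is hypothetical, added only to keep the example short.

```r
# Hypothetical helper: build a 3 x 2 box office matrix from values and titles
make_trilogy <- function(values, titles) {
  matrix(values, nrow = 3, byrow = TRUE,
         dimnames = list(titles, c("US", "non-US")))
}

star_wars_matrix <- make_trilogy(
  c(461.0, 314.4, 290.5, 247.9, 309.3, 165.8),
  c("A New Hope", "The Empire Strikes Back", "Return of the Jedi")
)
star_wars_matrix2 <- make_trilogy(
  c(474.5, 552.5, 310.7, 338.7, 380.3, 468.5),
  c("The Phantom Menace", "Attack of the Clones", "Revenge of the Sith")
)

# rbind() stacks the rows of the second matrix under the first
all_wars_matrix <- rbind(star_wars_matrix, star_wars_matrix2)
print(all_wars_matrix)
```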

The total box office revenue for the entire saga

##                            US non-US
## A New Hope              461.0  314.4
## The Empire Strikes Back 290.5  247.9
## Return of the Jedi      309.3  165.8
## The Phantom Menace      474.5  552.5
## Attack of the Clones    310.7  338.7
## Revenge of the Sith     380.3  468.5
##     US non-US 
## 2226.3 2087.8
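
The totals come from summing each column. A sketch with `colSums()`, with the six-film matrix rebuilt directly from the printed values:

```r
titles <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi",
            "The Phantom Menace", "Attack of the Clones", "Revenge of the Sith")
all_wars_matrix <- matrix(
  c(461.0, 314.4, 290.5, 247.9, 309.3, 165.8,
    474.5, 552.5, 310.7, 338.7, 380.3, 468.5),
  nrow = 6, byrow = TRUE,
  dimnames = list(titles, c("US", "non-US"))
)

# colSums() returns one total per column, keeping the column names
total_revenue_vector <- colSums(all_wars_matrix)
print(total_revenue_vector)
```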

A little arithmetic with matrices

##                            US non-US
## A New Hope              92.20  62.88
## The Empire Strikes Back 58.10  49.58
## Return of the Jedi      61.86  33.16
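
Arithmetic on a matrix applies to every element. The output above matches dividing the box office figures by 5, which I am assuming was a flat ticket price.

```r
star_wars_matrix <- matrix(
  c(461.0, 314.4, 290.5, 247.9, 309.3, 165.8),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("A New Hope", "The Empire Strikes Back", "Return of the Jedi"),
    c("US", "non-US")
  )
)

# Dividing by a single number divides every cell by it
ticket_price <- 5   # assumed; it is the divisor that reproduces the output
visitors <- star_wars_matrix / ticket_price
print(visitors)
```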

A little arithmetic with matrices (2)

##                      US non-US
## The Phantom Menace    5      5
## Attack of the Clones  6      6
## Revenge of the Sith   7      7
##                               US   non-US
## A New Hope              92.20000 62.88000
## The Empire Strikes Back 48.41667 41.31667
## Return of the Jedi      44.18571 23.68571
## [1] 61.60079
## [1] 42.62746
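
Dividing one matrix by another of the same shape works cell by cell. This sketch assumes the price matrix shown above and then averages each column, which reproduces the two means. The price row names are left off since the result takes its names from the first matrix.

```r
star_wars_matrix <- matrix(
  c(461.0, 314.4, 290.5, 247.9, 309.3, 165.8),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("A New Hope", "The Empire Strikes Back", "Return of the Jedi"),
    c("US", "non-US")
  )
)
ticket_prices_matrix <- matrix(
  c(5, 5, 6, 6, 7, 7), nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("US", "non-US"))
)

# Element-wise division: each cell divided by the matching cell
visitors <- star_wars_matrix / ticket_prices_matrix
print(visitors)

# Average of each column
average_us_visitor <- mean(visitors[, "US"])
average_non_us_visitor <- mean(visitors[, "non-US"])
print(average_us_visitor)
print(average_non_us_visitor)
```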

   


Factors

What’s a factor and why would you use it?

R uses factors for categorical variables!

What’s a factor and why would you use it? (2)

## [1] Male   Female Female Male   Male  
## Levels: Female Male
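
A sketch of turning a character vector into a factor. The variable names are assumptions; the values match the output.

```r
gender_vector <- c("Male", "Female", "Female", "Male", "Male")

# factor() finds the distinct values and stores them as levels,
# sorted alphabetically by default
factor_gender_vector <- factor(gender_vector)
print(factor_gender_vector)

levels(factor_gender_vector)
```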

Summarizing a factor

##    Length     Class      Mode 
##         5 character character
## Female   Male 
##      2      3
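
The two outputs show why the factor is handy: `summary()` on the plain character vector only describes the container, while on the factor it counts each level. A sketch with the same assumed names:

```r
gender_vector <- c("Male", "Female", "Female", "Male", "Male")
factor_gender_vector <- factor(gender_vector)

# Character vector: length, class and mode only
summary(gender_vector)

# Factor: number of observations per level
gender_counts <- summary(factor_gender_vector)
print(gender_counts)
```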

Data frames

What’s a data frame?

##                      mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
## Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
## Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
## Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
## Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
## Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
## Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
## Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
## Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
## Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
## Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
## Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
## AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
## Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
## Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
## Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
## Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
## Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
## Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
## Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
## Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
## Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

Quick, have a look at your data set

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Have a look at the structure

## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
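
`str()` gives the compact overview shown above: the number of observations, the number of variables, and the type and first values of each column.

```r
# Structure of the built-in mtcars data frame
str(mtcars)

# The same facts, one at a time
n_obs <- nrow(mtcars)
n_vars <- ncol(mtcars)
col_types <- sapply(mtcars, class)
print(col_types)
```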

Create a data frame

##   planets               type diameter rotation rings
## 1 Mercury Terrestrial planet    0.382    58.64 FALSE
## 2   Venus Terrestrial planet    0.949  -243.02 FALSE
## 3   Earth Terrestrial planet    1.000     1.00 FALSE
## 4    Mars Terrestrial planet    0.532     1.03 FALSE
## 5 Jupiter          Gas giant   11.209     0.41  TRUE
## 6  Saturn          Gas giant    9.449     0.43  TRUE
## 7  Uranus          Gas giant    4.007    -0.72  TRUE
## 8 Neptune          Gas giant    3.883     0.67  TRUE
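
A sketch of building this data frame from five vectors of equal length. The values are copied from the output; the vector names are assumptions.

```r
planets <- c("Mercury", "Venus", "Earth", "Mars",
             "Jupiter", "Saturn", "Uranus", "Neptune")
type <- c(rep("Terrestrial planet", 4), rep("Gas giant", 4))
diameter <- c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883)
rotation <- c(58.64, -243.02, 1.00, 1.03, 0.41, 0.43, -0.72, 0.67)
rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)

# Each vector becomes a column, named after the variable
planets_df <- data.frame(planets, type, diameter, rotation, rings)
print(planets_df)
```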

Create a data frame (2)

## 'data.frame':    8 obs. of  5 variables:
##  $ planets : Factor w/ 8 levels "Earth","Jupiter",..: 4 8 1 3 2 6 7 5
##  $ type    : Factor w/ 2 levels "Gas giant","Terrestrial planet": 2 2 2 2 1 1 1 1
##  $ diameter: num  0.382 0.949 1 0.532 11.209 ...
##  $ rotation: num  58.64 -243.02 1 1.03 0.41 ...
##  $ rings   : logi  FALSE FALSE FALSE FALSE TRUE TRUE ...
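
The `Factor` columns in this output reflect the old default of `data.frame()`. Since R 4.0.0 text columns stay as character unless you ask otherwise, so to get the structure above on a current R you would pass `stringsAsFactors = TRUE`. A sketch with just the two text columns:

```r
planets <- c("Mercury", "Venus", "Earth", "Mars",
             "Jupiter", "Saturn", "Uranus", "Neptune")
type <- c(rep("Terrestrial planet", 4), rep("Gas giant", 4))

# Convert text columns to factors explicitly
planets_df <- data.frame(planets, type, stringsAsFactors = TRUE)
str(planets_df)

# Integer codes behind the factor, matching the str() output
planet_codes <- as.integer(planets_df$planets)
print(planet_codes)
```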

Selection of data frame elements

##   planets               type diameter rotation rings
## 1 Mercury Terrestrial planet    0.382    58.64 FALSE
## 2   Venus Terrestrial planet    0.949  -243.02 FALSE
## 3   Earth Terrestrial planet    1.000     1.00 FALSE
##   planets      type diameter rotation rings
## 6  Saturn Gas giant    9.449     0.43  TRUE
## 7  Uranus Gas giant    4.007    -0.72  TRUE
## 8 Neptune Gas giant    3.883     0.67  TRUE
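
Data frames are indexed as `[rows, columns]`, and leaving the column part empty keeps every column. A sketch of the two selections above, with assumed variable names:

```r
planets_df <- data.frame(
  planets = c("Mercury", "Venus", "Earth", "Mars",
              "Jupiter", "Saturn", "Uranus", "Neptune"),
  type = c(rep("Terrestrial planet", 4), rep("Gas giant", 4)),
  diameter = c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883),
  rotation = c(58.64, -243.02, 1.00, 1.03, 0.41, 0.43, -0.72, 0.67),
  rings = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Rows 1 to 3, all columns
closest_planets_df <- planets_df[1:3, ]
print(closest_planets_df)

# Rows 6 to 8, all columns
furthest_planets_df <- planets_df[6:8, ]
print(furthest_planets_df)

# A single column by name
planets_df$diameter
```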

Sorting your data frame

## [1] 5 6 7 8 3 2 4 1
##   planets               type diameter rotation rings
## 5 Jupiter          Gas giant   11.209     0.41  TRUE
## 6  Saturn          Gas giant    9.449     0.43  TRUE
## 7  Uranus          Gas giant    4.007    -0.72  TRUE
## 8 Neptune          Gas giant    3.883     0.67  TRUE
## 3   Earth Terrestrial planet    1.000     1.00 FALSE
## 2   Venus Terrestrial planet    0.949  -243.02 FALSE
## 4    Mars Terrestrial planet    0.532     1.03 FALSE
## 1 Mercury Terrestrial planet    0.382    58.64 FALSE
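
`order()` returns the row positions that would sort a vector, and those positions then go in the row slot of the data frame. A sketch that sorts by diameter, largest first, which matches the positions printed above:

```r
planets_df <- data.frame(
  planets = c("Mercury", "Venus", "Earth", "Mars",
              "Jupiter", "Saturn", "Uranus", "Neptune"),
  type = c(rep("Terrestrial planet", 4), rep("Gas giant", 4)),
  diameter = c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883),
  rotation = c(58.64, -243.02, 1.00, 1.03, 0.41, 0.43, -0.72, 0.67),
  rings = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Positions of the rows from largest to smallest diameter
positions <- order(planets_df$diameter, decreasing = TRUE)
print(positions)

# Reorder the rows with those positions
largest_first_df <- planets_df[positions, ]
print(largest_first_df)
```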

   


Lists

Lists, why would you need them?

Lists are useful when you need to keep objects of different types and sizes together under one name.

Lists, why would you need them? (2)

  • A list in R is similar to your to-do list at work or school:
    • the different items on that list most likely differ in length, characteristic, type of activity that has to be done, …
  • A list in R allows you to gather a variety of objects under one name (that is, the name of the list) in an ordered way.
    • These objects can be matrices, vectors, data frames, even other lists, etc.
    • It is not even required that these objects are related to each other in any way.
  • You could say that a list is some kind of super data type:
    • you can store practically any piece of information in it!

Creating a list

##  [1]  1  2  3  4  5  6  7  8  9 10
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
##                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
## [[1]]
##  [1]  1  2  3  4  5  6  7  8  9 10
## 
## [[2]]
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## 
## [[3]]
##                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
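
A sketch of the steps behind this output: make a vector, a matrix and a data frame, then bundle them with `list()`. The variable names are assumptions.

```r
my_vector <- 1:10
my_matrix <- matrix(1:9, ncol = 3)   # filled column by column
my_df <- mtcars[1:10, ]              # first ten rows of the built-in data

# list() keeps each object intact, whatever its type or size
my_list <- list(my_vector, my_matrix, my_df)
print(my_list)
```

Without names, the components are reached by position, for example `my_list[[2]]` for the matrix.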

Creating a named list

## $vec
##  [1]  1  2  3  4  5  6  7  8  9 10
## 
## $mat
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
## 
## $df
##                    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
## Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
## Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
## Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
## Merc 280          19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
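
Naming the components makes the list easier to read and lets you use `$`. A sketch using the names `vec`, `mat` and `df` from the output; the other variable names are assumptions.

```r
my_vector <- 1:10
my_matrix <- matrix(1:9, ncol = 3)
my_df <- mtcars[1:10, ]

# Names can be given when the list is created
my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)
print(my_list)

# Or set afterwards on an unnamed list
my_list2 <- list(my_vector, my_matrix, my_df)
names(my_list2) <- c("vec", "mat", "df")
```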

Creating a named list (2)

## $moviename
## [1] "The Shining"
## 
## $actors
## [1] "Jack Nicholson"   "Shelley Duvall"   "Danny Lloyd"      "Scatman Crothers"
## [5] "Barry Nelson"    
## 
## $reviews
##   scores sources                                              comments
## 1    4.5   IMDb1                     Best Horror Film I Have Ever Seen
## 2    4.0   IMDb2 A truly brilliant and scary film from Stanley Kubrick
## 3    5.0   IMDb3                 A masterpiece of psychological horror
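
A sketch of how this list could be put together, with the values copied from the output. The intermediate variable names are assumptions.

```r
mov <- "The Shining"
act <- c("Jack Nicholson", "Shelley Duvall", "Danny Lloyd",
         "Scatman Crothers", "Barry Nelson")
rev <- data.frame(
  scores = c(4.5, 4.0, 5.0),
  sources = c("IMDb1", "IMDb2", "IMDb3"),
  comments = c("Best Horror Film I Have Ever Seen",
               "A truly brilliant and scary film from Stanley Kubrick",
               "A masterpiece of psychological horror")
)

# A string, a character vector and a data frame under one name
shining_list <- list(moviename = mov, actors = act, reviews = rev)
print(shining_list)
```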

Selecting elements from a list

## [1] "Barry Nelson"
##   scores sources                                              comments
## 2      4   IMDb2 A truly brilliant and scary film from Stanley Kubrick
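
List components are pulled out with `$name` or `[[ ]]`, and ordinary indexing then applies to what comes back. A sketch of the two selections above, assuming the list is called `shining_list`:

```r
shining_list <- list(
  moviename = "The Shining",
  actors = c("Jack Nicholson", "Shelley Duvall", "Danny Lloyd",
             "Scatman Crothers", "Barry Nelson"),
  reviews = data.frame(
    scores = c(4.5, 4.0, 5.0),
    sources = c("IMDb1", "IMDb2", "IMDb3"),
    comments = c("Best Horror Film I Have Ever Seen",
                 "A truly brilliant and scary film from Stanley Kubrick",
                 "A masterpiece of psychological horror")
  )
)

# Last element of the actors vector
last_actor <- shining_list$actors[length(shining_list$actors)]
print(last_actor)

# Second row of the reviews data frame
second_review <- shining_list[["reviews"]][2, ]
print(second_review)
```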

Adding more movie information to the list

## List of 4
##  $ moviename: chr "The Shining"
##  $ actors   : chr [1:5] "Jack Nicholson" "Shelley Duvall" "Danny Lloyd" "Scatman Crothers" ...
##  $ reviews  :'data.frame':   3 obs. of  3 variables:
##   ..$ scores  : num [1:3] 4.5 4 5
##   ..$ sources : Factor w/ 3 levels "IMDb1","IMDb2",..: 1 2 3
##   ..$ comments: Factor w/ 3 levels "A masterpiece of psychological horror",..: 3 2 1
##  $ year     : num 1980
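
`c()` also works on lists, so a new named component can be appended that way. A sketch that adds `year`; the name `shining_list_full` is an assumption.

```r
shining_list <- list(
  moviename = "The Shining",
  actors = c("Jack Nicholson", "Shelley Duvall", "Danny Lloyd",
             "Scatman Crothers", "Barry Nelson"),
  reviews = data.frame(scores = c(4.5, 4.0, 5.0),
                       sources = c("IMDb1", "IMDb2", "IMDb3"))
)

# Append a new named component to the existing list
shining_list_full <- c(shining_list, year = 1980)

# str() gives a compact view of the result
str(shining_list_full)
```

Assigning directly with `shining_list$year <- 1980` gives the same result by modifying the list in place.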