AS - Day 1

Introduction to R - Session 2

Dr. J. Kavanagh

2023-09-09

Displaying Research Findings

A useful package for visualising research findings as charts and graphs is ‘ggplot2’. It is included in the ‘tidyverse’ package and follows the guidelines of the ‘Layered Grammar of Graphics’.

The key layers are usually given as data, aesthetics, geometries, facets, statistics, coordinates and themes.

Brief tutorial for ggplot2

We’re going to use elements of Prof. Chris Brunsdon’s introductory lecture on ggplot2, available here.

To start, load the tidyverse and the mtcars sample dataset.

library(tidyverse)  # provides %>%, mutate() and ggplot()
data(mtcars)

mtcars %>% head()
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

ggplot2 (cont.)

Next we’re going to make cylinders an ordinal factor. To show what this means, here is a short example:

orders <- c("grande", "ventil", "tall", "grande", "ventil", "tall")
unique(orders)
## [1] "grande" "ventil" "tall"
new_orders_factor <- orders %>% factor(levels = c("tall", "grande", "ventil"))

new_orders_factor
## [1] grande ventil tall   grande ventil tall  
## Levels: tall grande ventil

To check that a vector has been properly converted to a factor, use the is.factor() function.

is.factor(new_orders_factor)
## [1] TRUE
is.factor(orders)
## [1] FALSE

Now we’re going to convert the cyl column to an ordered factor, with its levels in numerical order.

# mutate() replaces the cyl column; assigning back to mtcars keeps the change
mtcars <- mtcars %>% mutate(cyl = factor(cyl, ordered = TRUE, levels = c(4, 6, 8)))
mtcars %>% head(n = 6)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
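
As a brief aside, the practical effect of ordered = TRUE can be sketched as follows. The object name cyl_ord is purely illustrative and not part of the session; the example assumes only the built-in mtcars data.

```r
# Illustrative only: cyl_ord is a hypothetical name, not from the session
data(mtcars)
cyl_ord <- factor(mtcars$cyl, ordered = TRUE, levels = c(4, 6, 8))

is.ordered(cyl_ord)      # checks that the factor carries an ordering
levels(cyl_ord)          # levels in the order they were supplied
table(cyl_ord)           # number of cars for each cylinder count

# Ordered factors can be compared with < and >; unordered factors cannot
cyl_ord[1] < cyl_ord[5]  # Mazda RX4 (6 cylinders) vs Hornet Sportabout (8)
```

The ordering matters later on, because plotting and modelling functions can treat an ordered factor as ranked categories rather than as arbitrary labels.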

Histograms

Here is an example of a histogram using ggplot2 syntax. aes() is short for ‘aesthetic’, meaning a mapping between a variable and a characteristic of the plot; here x maps mpg to the x-axis. Histograms are a very useful way to show the distribution of a single numeric variable, and binwidth sets the width of each bar in the units of that variable.

mtcars %>% ggplot(aes(x=mpg)) + geom_histogram(binwidth = 5)

You can save these as gg objects, for example:

mtcars %>% ggplot(aes(x=mpg)) + geom_histogram(binwidth = 5) -> my_plot
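
To make the idea of a bin width more concrete, here is a minimal sketch that counts the cars falling into 5 mpg bins using base R's hist() with plotting switched off. The bin edges (10 to 35) and the names edges, bins and bin_table are assumptions chosen for illustration. ggplot2 may align its bins differently by default (the center and boundary arguments of geom_histogram() control this), so its bar heights need not match these counts exactly.

```r
data(mtcars)

# Bin edges every 5 mpg, from 10 to 35, which covers the full range of mpg
edges <- seq(10, 35, by = 5)

# plot = FALSE returns the bin counts without drawing anything
bins <- hist(mtcars$mpg, breaks = edges, plot = FALSE)

# One row per bin: lower edge, upper edge and number of cars
bin_table <- data.frame(lower = head(edges, -1),
                        upper = edges[-1],
                        count = bins$counts)
bin_table
```

A smaller bin width gives more, narrower bars and shows finer detail, while a larger one smooths the picture at the cost of detail.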

Themes

You can now add labels and themes to the gg object. The ggthemes package provides additional themes, for example:

library(ggthemes)  # provides theme_economist_white()
my_plot + xlab("Miles per Gallon") + ylab("Number of Cars") + theme_economist_white()

Scatterplot

Another type of visualisation is a scatterplot. Adding the geom_smooth() function draws a smoothed line that highlights the overall trend.

my_scatplot <- ggplot(mtcars,aes(x=wt,y=mpg)) + geom_point()

my_scatplot + xlab('Weight (x 1000lbs)') + ylab('Miles per Gallon') + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Scatterplot - Additional variables

We can also map a third variable, cylinders, to the colour of the points, in addition to miles per gallon and weight.

my_scatplot <- ggplot(mtcars,aes(x=wt,y=mpg,col=cyl)) + geom_point()

my_scatplot + labs(x='Weight (x1000lbs)',y='Miles per Gallon',colour='Number of\n Cylinders')
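
How the legend looks depends on the class of the colour variable. When cyl is still stored as a number, ggplot2 draws a continuous colour bar; when it is a factor, each cylinder count gets its own colour. The sketch below wraps cyl in factor() inside aes(), which is one way (assumed here, not taken from the session) to get the discrete legend without altering mtcars. The name p_discrete is illustrative.

```r
library(ggplot2)
data(mtcars)

# factor(cyl) is computed inside aes(), so mtcars itself is left unchanged
p_discrete <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (x1000lbs)",
       y = "Miles per Gallon",
       colour = "Number of\nCylinders")

p_discrete
```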

Facets

The facet_grid() command splits a plot into separate panels, one for each level of a variable (here cyl), so that subsets of a dataset can be compared side by side.

my_scatplot <- ggplot(mtcars,aes(x=wt,y=mpg,col=cyl)) + geom_point()

my_scatplot + facet_grid(~cyl)
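
Facets can also be arranged by two variables at once. The following is a small sketch, assuming the built-in mtcars data, that puts transmission type (am) on the rows and cylinders on the columns, and shows facet_wrap() as an alternative for a single variable. The names base_plot, p_grid and p_wrap are illustrative.

```r
library(ggplot2)
data(mtcars)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Rows by transmission (am: 0 = automatic, 1 = manual), columns by cylinders
p_grid <- base_plot + facet_grid(am ~ cyl)

# facet_wrap() lays out the panels for one variable and wraps them over rows
p_wrap <- base_plot + facet_wrap(~cyl, ncol = 2)

p_grid
```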

Joining Datasets

Returning to the judges datasets, note the number of rows and columns, and the class of each column, as shown by the glimpse() command.

glimpse(judges_appointments)
## Rows: 4,202
## Columns: 15
## $ judge_id                       <int> 3419, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11…
## $ court_name                     <chr> "U. S. District Court, Southern Distric…
## $ court_type                     <chr> "USDC", "USDC", "USDC", "USDC", "USDC",…
## $ president_name                 <chr> "Barack Obama", "Franklin D. Roosevelt"…
## $ president_party                <chr> "Democratic", "Democratic", "Republican…
## $ nomination_date                <chr> "07/28/2011", "02/03/1936", "01/06/1880…
## $ predecessor_last_name          <chr> "Kaplan", "new", "Ketcham", "McFadden",…
## $ predecessor_first_name         <chr> "Lewis A.", NA, "Winthrop", "Frank H.",…
## $ senate_confirmation_date       <chr> "03/22/2012", "02/12/1936", "01/14/1880…
## $ commission_date                <chr> "03/23/2012", "02/15/1936", "01/14/1880…
## $ chief_judge_begin              <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ chief_judge_end                <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ retirement_from_active_service <chr> NA, "02/15/1966", NA, "05/31/1996", "02…
## $ termination_date               <chr> NA, "05/28/1971", "02/09/1891", NA, "12…
## $ termination_reason             <chr> NA, "Death", "Appointment to Another Ju…
glimpse(judges_people)
## Rows: 3,532
## Columns: 13
## $ judge_id         <int> 3419, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 2989, 32…
## $ name_first       <chr> "Ronnie", "Matthew", "Marcus", "William", "Harold", "…
## $ name_middle      <chr> NA, "T.", "Wilson", "Marsh", "Arnold", "Waldo", "L.",…
## $ name_last        <chr> "Abrams", "Abruzzo", "Acheson", "Acker", "Ackerman", …
## $ name_suffix      <chr> NA, NA, NA, "Jr.", NA, NA, NA, NA, NA, NA, NA, NA, "J…
## $ birth_date       <int> 1968, 1889, 1828, 1927, 1928, 1926, 1925, 1887, 1921,…
## $ birthplace_city  <chr> "New York", "Brooklyn", "Washington", "Birmingham", "…
## $ birthplace_state <chr> "NY", "NY", "PA", "AL", "NJ", "FL", "NY", "IL", "PA",…
## $ death_date       <int> NA, 1971, 1906, NA, 2009, 1984, NA, 1956, NA, 1916, 1…
## $ death_city       <chr> NA, "Potomac", "Pittsburgh", NA, "West Orange", "Spri…
## $ death_state      <chr> NA, "MD", "PA", NA, "NJ", "IL", NA, NA, NA, "MO", "MS…
## $ gender           <chr> "F", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M"…
## $ race             <chr> "White", "White", "White", "White", "White", "White",…

Join by Rows & Columns

Using base R, it is possible to join datasets either by row or by column.

# Create data frame
df <- data.frame(a=c(1, 3, 3, 4, 5),
                 b=c(7, 7, 8, 3, 2),
                 c=c(3, 3, 6, 6, 8))

df
##   a b c
## 1 1 7 3
## 2 3 7 3
## 3 3 8 6
## 4 4 3 6
## 5 5 2 8
# Create a vector to add as a new row

df2 <- c(11, 14, 17)

df2
## [1] 11 14 17
# rbind() stacks the vector beneath df as a new row; this works because its length (3) matches the number of columns in df
df_new <- rbind(df, df2)

df_new
##    a  b  c
## 1  1  7  3
## 2  3  7  3
## 3  3  8  6
## 4  4  3  6
## 5  5  2  8
## 6 11 14 17

This is an example of how to join dataframes by column. Here the vector must have one value for each row of the dataframe.

# Create data frame
df <- data.frame(a=c(1, 3, 3, 4, 5),
                 b=c(7, 7, 8, 3, 2),
                 c=c(3, 3, 6, 6, 8))

df
##   a b c
## 1 1 7 3
## 2 3 7 3
## 3 3 8 6
## 4 4 3 6
## 5 5 2 8
# Define vector
df2 <- c(11, 14, 16, 17, 22)
df2
## [1] 11 14 16 17 22
# cbind vector to data frame
df_new <- cbind(df, df2)

df_new
##   a b c df2
## 1 1 7 3  11
## 2 3 7 3  14
## 3 3 8 6  16
## 4 4 3 6  17
## 5 5 2 8  22
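
The same two functions also work with two dataframes, provided the shapes agree. The sketch below uses small hypothetical dataframes (df_a, df_b) to show a row bind, a column bind with a named column, and what happens when the number of rows does not match.

```r
df_a <- data.frame(a = c(1, 3), b = c(7, 7))
df_b <- data.frame(a = c(4, 5), b = c(3, 2))

# rbind() needs the same column names in both dataframes
stacked <- rbind(df_a, df_b)

# Naming the argument gives the new column a readable name
side <- cbind(df_a, c = c(3, 6))

# cbind() fails when the numbers of rows differ; capture the error for display
mismatch <- tryCatch(
  cbind(df_a, data.frame(z = 1:3)),
  error = function(e) "error: differing number of rows"
)

stacked
side
mismatch
```

Note that rbind() and cbind() only glue tables together by position. When rows need to be matched on a key, such as judge_id, a join is the safer tool.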

Dates in R

There are a number of different dates included in the judges_unified dataframe. However, as the glimpse() output shows, these variables are stored as character strings (<chr>) rather than as dates.

# Date of Nomination
judges_unified$nomination_date %>% head()
## [1] "07/28/2011" "02/03/1936" "01/06/1880" "07/22/1982" "09/28/1979"
## [6] "06/18/1976"
# Date of Confirmation
judges_unified$senate_confirmation_date %>% head()
## [1] "03/22/2012" "02/12/1936" "01/14/1880" "08/18/1982" "10/31/1979"
## [6] "07/02/1976"
# Date of Commission
judges_unified$commission_date %>% head()
## [1] "03/23/2012" "02/15/1936" "01/14/1880" "08/18/1982" "11/02/1979"
## [6] "07/02/1976"
# Date of Termination
judges_unified$termination_date %>% head()
## [1] NA           "05/28/1971" "02/09/1891" NA           "12/02/2009"
## [6] "03/31/1979"

Adjusting the dates

There are a number of different ways to convert dates. Because these dates are consistently written as month/day/year, we can use the mdy() command from the ‘lubridate’ package to make a relatively simple change.

# Create some sample dates 
begin <- c("May 11, 1996", "September 12, 2001", "July 1, 1988")
end <- c("7/8/97","10/23/02","1/4/91")
class(begin)
## [1] "character"
class(end)
## [1] "character"
(begin <- mdy(begin))
## [1] "1996-05-11" "2001-09-12" "1988-07-01"
(end <- mdy(end))
## [1] "1997-07-08" "2002-10-23" "1991-01-04"
class(begin)
## [1] "Date"
class(end)
## [1] "Date"
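
For comparison, the same conversion can be sketched in base R with as.Date() and an explicit format string. This is an alternative to mdy(), not the method used in the session; the object names are illustrative.

```r
# Two nomination dates in month/day/year form, as in the judges data
nominated <- c("07/28/2011", "02/03/1936")

# %m = month, %d = day, %Y = four-digit year
as_dates <- as.Date(nominated, format = "%m/%d/%Y")
as_dates
class(as_dates)

# %y reads a two-digit year; R maps 69-99 to 19xx and 00-68 to 20xx
short_year <- as.Date("7/8/97", format = "%m/%d/%y")
short_year
```

The base R route requires the format to be spelled out for each style of date, whereas mdy() accepts several month/day/year layouts without further instruction.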

Creating Dates variables for the Judges dataset

Use the mdy() command to convert each date column, and verify the results with the class() command.

mdy(judges_unified$nomination_date) -> judges_unified$nomination_date
class(judges_unified$nomination_date)
## [1] "Date"
mdy(judges_unified$senate_confirmation_date) -> judges_unified$senate_confirmation_date
class(judges_unified$senate_confirmation_date)
## [1] "Date"
mdy(judges_unified$commission_date) -> judges_unified$commission_date
class(judges_unified$commission_date)
## [1] "Date"
mdy(judges_unified$termination_date) -> judges_unified$termination_date
class(judges_unified$termination_date)
## [1] "Date"
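
The four conversions above follow the same pattern, so they could also be written in one step. The sketch below assumes dplyr's across() together with mdy(), and uses a small hypothetical dataframe (judges_toy) in place of judges_unified so that it runs on its own.

```r
library(dplyr)
library(lubridate)

# Toy dataframe with two of the date columns stored as text
judges_toy <- data.frame(
  nomination_date  = c("07/28/2011", "02/03/1936"),
  termination_date = c(NA, "05/28/1971")
)

# Apply mdy() to every column whose name ends in "_date"
judges_toy <- judges_toy %>%
  mutate(across(ends_with("_date"), mdy))

# Check the class of each column
sapply(judges_toy, class)
```

Selecting by name suffix is convenient here, but it assumes every column ending in "_date" really is in month/day/year form; columns such as birth_date, which hold only a year, would need to be excluded.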

Creating specific date dataframes

First you need to create a new dataframe that provides the number of nominations per day.

# count() creates a new dataframe with one row per date; you will need to rename its columns
judges_unified %>% count(nomination_date) -> judges_nominations_date

judges_nominations_date
## # A tibble: 2,036 × 2
##    nomination_date     n
##    <date>          <int>
##  1 1789-09-24         13
##  2 1789-09-25          2
##  3 1790-02-08          4
##  4 1790-06-11          1
##  5 1790-07-02          1
##  6 1790-08-02          1
##  7 1790-12-17          2
##  8 1791-03-04          1
##  9 1791-10-31          2
## 10 1792-01-12          1
## # ℹ 2,026 more rows

Changing Column Names

This is vital because you will create several smaller dataframes that are joined later, and count() names its result column n every time. Giving each count column its own name prevents clashes and confusion in later steps.

# Rename the columns
colnames(judges_nominations_date) <- c("Date", "Nominations")

# Check your results
judges_nominations_date
## # A tibble: 2,036 × 2
##    Date       Nominations
##    <date>           <int>
##  1 1789-09-24          13
##  2 1789-09-25           2
##  3 1790-02-08           4
##  4 1790-06-11           1
##  5 1790-07-02           1
##  6 1790-08-02           1
##  7 1790-12-17           2
##  8 1791-03-04           1
##  9 1791-10-31           2
## 10 1792-01-12           1
## # ℹ 2,026 more rows
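
Counting and renaming can also be combined in a single pipeline. This is a minimal sketch, assuming dplyr's count() with its name argument followed by rename(); the toy dataframe and the name nominations_by_date are illustrative.

```r
library(dplyr)

# Toy data: three nominations over two days
toy <- data.frame(
  nomination_date = as.Date(c("1789-09-24", "1789-09-24", "1789-09-25"))
)

nominations_by_date <- toy %>%
  count(nomination_date, name = "Nominations") %>%  # name the count column
  rename(Date = nomination_date)                     # new_name = old_name

nominations_by_date
```

Unlike assigning to colnames(), rename() refers to each column by name, so the result does not depend on the order of the columns.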

Using floor_date()

Group nomination dates into years using the floor_date() command from the ‘lubridate’ package. It rounds each date down to the start of a chosen unit, such as day, month or year. First create a yearly dataframe for each date variable, starting with nominations.

judges_nominations_date %>% group_by(year=floor_date(Date, "year")) %>% 
  summarize(No_of_Nominations=sum(Nominations)) -> judges_nominations_yearly

judges_nominations_yearly
## # A tibble: 220 × 2
##    year       No_of_Nominations
##    <date>                 <int>
##  1 1789-01-01                15
##  2 1790-01-01                 9
##  3 1791-01-01                 3
##  4 1792-01-01                 1
##  5 1793-01-01                 2
##  6 1794-01-01                 1
##  7 1795-01-01                 2
##  8 1796-01-01                 5
##  9 1797-01-01                 1
## 10 1798-01-01                 2
## # ℹ 210 more rows
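
To see what floor_date() does to individual values, here is a short sketch using two of the nomination dates from the output above. The vector name d is illustrative.

```r
library(lubridate)

d <- as.Date(c("1789-09-24", "1790-02-08"))

floor_date(d, "month")  # first day of each date's month
floor_date(d, "year")   # first day of each date's year
year(d)                 # just the year, as a number
```

floor_date() keeps the result as a Date, which is why the year column above shows values such as 1789-01-01. If a plain number is preferred, year() returns it directly.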

Repeat this process for the commission and termination dates. First, create a new dataframe for judges’ terminations.

# count() creates a new dataframe with one row per date; you will need to rename its columns
judges_unified %>% count(termination_date) -> judges_terminations_date

judges_terminations_date
## # A tibble: 2,498 × 2
##    termination_date     n
##    <date>           <int>
##  1 1790-05-18           1
##  2 1790-08-16           1
##  3 1790-10-12           1
##  4 1791-03-05           1
##  5 1791-05-09           1
##  6 1792-01-04           1
##  7 1793-01-01           1
##  8 1793-01-16           1
##  9 1794-03-17           1
## 10 1794-06-09           1
## # ℹ 2,488 more rows

Next, create a new dataframe for judges’ commissions.

# count() creates a new dataframe with one row per date; you will need to rename its columns
judges_unified %>% count(commission_date) -> judges_commissions_date

judges_commissions_date
## # A tibble: 2,066 × 2
##    commission_date     n
##    <date>          <int>
##  1 1789-09-26         12
##  2 1789-09-27          1
##  3 1789-09-29          1
##  4 1789-09-30          1
##  5 1790-02-10          4
##  6 1790-06-14          1
##  7 1790-07-03          1
##  8 1790-08-03          1
##  9 1790-12-20          2
## 10 1791-03-04          1
## # ℹ 2,056 more rows

Repeat the earlier renaming and grouping steps for the other two date columns in the judges_unified dataset.

# Rename the columns
colnames(judges_terminations_date) <- c("Date", "Terminations")

# Check your results
judges_terminations_date
## # A tibble: 2,498 × 2
##    Date       Terminations
##    <date>            <int>
##  1 1790-05-18            1
##  2 1790-08-16            1
##  3 1790-10-12            1
##  4 1791-03-05            1
##  5 1791-05-09            1
##  6 1792-01-04            1
##  7 1793-01-01            1
##  8 1793-01-16            1
##  9 1794-03-17            1
## 10 1794-06-09            1
## # ℹ 2,488 more rows
# Rename the columns
colnames(judges_commissions_date) <- c("Date", "Commissions")

# Check your results
judges_commissions_date
## # A tibble: 2,066 × 2
##    Date       Commissions
##    <date>           <int>
##  1 1789-09-26          12
##  2 1789-09-27           1
##  3 1789-09-29           1
##  4 1789-09-30           1
##  5 1790-02-10           4
##  6 1790-06-14           1
##  7 1790-07-03           1
##  8 1790-08-03           1
##  9 1790-12-20           2
## 10 1791-03-04           1
## # ℹ 2,056 more rows
judges_terminations_date %>% group_by(year=floor_date(Date, "year")) %>% 
  summarize(No_of_Terminations=sum(Terminations)) -> judges_terminations_yearly

judges_terminations_yearly
## # A tibble: 222 × 2
##    year       No_of_Terminations
##    <date>                  <int>
##  1 1790-01-01                  3
##  2 1791-01-01                  2
##  3 1792-01-01                  1
##  4 1793-01-01                  2
##  5 1794-01-01                  2
##  6 1795-01-01                  4
##  7 1796-01-01                  3
##  8 1797-01-01                  1
##  9 1798-01-01                  2
## 10 1799-01-01                  2
## # ℹ 212 more rows
judges_commissions_date %>% group_by(year=floor_date(Date, "year")) %>% 
  summarize(No_of_Commissions=sum(Commissions)) -> judges_commissions_yearly

judges_commissions_yearly
## # A tibble: 219 × 2
##    year       No_of_Commissions
##    <date>                 <int>
##  1 1789-01-01                15
##  2 1790-01-01                 9
##  3 1791-01-01                 3
##  4 1792-01-01                 1
##  5 1793-01-01                 1
##  6 1794-01-01                 3
##  7 1795-01-01                 1
##  8 1796-01-01                 4
##  9 1797-01-01                 3
## 10 1798-01-01                 2
## # ℹ 209 more rows
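
Because the same count, rename and group steps are repeated for each date column, they could be wrapped in a small helper function. The sketch below is one possible version: yearly_counts() is a hypothetical name, and it counts rows per year directly rather than summing daily counts, which gives the same totals. It also drops missing dates first, which is a design choice of this sketch rather than something the steps above do.

```r
library(dplyr)
library(lubridate)

# Hypothetical helper: yearly counts for one date column
yearly_counts <- function(data, date_col, count_name) {
  data %>%
    filter(!is.na({{ date_col }})) %>%   # ignore rows with no date
    count(year = floor_date({{ date_col }}, "year"), name = count_name)
}

# Toy data standing in for judges_unified
toy <- data.frame(
  commission_date  = as.Date(c("1789-09-26", "1789-09-27", "1790-02-10")),
  termination_date = as.Date(c("1790-05-18", NA, "1791-03-05"))
)

commissions_yearly  <- yearly_counts(toy, commission_date, "No_of_Commissions")
terminations_yearly <- yearly_counts(toy, termination_date, "No_of_Terminations")

commissions_yearly
terminations_yearly
```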

Merging the data points

To merge the three yearly counts (Nominations, Commissions and Terminations), it is first necessary to create a dataframe covering the entire year range: in this case 1789 to 2014, or 226 years.

# This creates a new dataframe with one row per year
all_dates <- data.frame(year=seq(as.Date("1789-01-01"), by="year", length.out=226))

all_dates
##           year
## 1   1789-01-01
## 2   1790-01-01
## 3   1791-01-01
## 4   1792-01-01
## 5   1793-01-01
## 6   1794-01-01
## 7   1795-01-01
## 8   1796-01-01
## 9   1797-01-01
## 10  1798-01-01
## 11  1799-01-01
## 12  1800-01-01
## 13  1801-01-01
## 14  1802-01-01
## 15  1803-01-01
## 16  1804-01-01
## 17  1805-01-01
## 18  1806-01-01
## 19  1807-01-01
## 20  1808-01-01
## 21  1809-01-01
## 22  1810-01-01
## 23  1811-01-01
## 24  1812-01-01
## 25  1813-01-01
## 26  1814-01-01
## 27  1815-01-01
## 28  1816-01-01
## 29  1817-01-01
## 30  1818-01-01
## 31  1819-01-01
## 32  1820-01-01
## 33  1821-01-01
## 34  1822-01-01
## 35  1823-01-01
## 36  1824-01-01
## 37  1825-01-01
## 38  1826-01-01
## 39  1827-01-01
## 40  1828-01-01
## 41  1829-01-01
## 42  1830-01-01
## 43  1831-01-01
## 44  1832-01-01
## 45  1833-01-01
## 46  1834-01-01
## 47  1835-01-01
## 48  1836-01-01
## 49  1837-01-01
## 50  1838-01-01
## 51  1839-01-01
## 52  1840-01-01
## 53  1841-01-01
## 54  1842-01-01
## 55  1843-01-01
## 56  1844-01-01
## 57  1845-01-01
## 58  1846-01-01
## 59  1847-01-01
## 60  1848-01-01
## 61  1849-01-01
## 62  1850-01-01
## 63  1851-01-01
## 64  1852-01-01
## 65  1853-01-01
## 66  1854-01-01
## 67  1855-01-01
## 68  1856-01-01
## 69  1857-01-01
## 70  1858-01-01
## 71  1859-01-01
## 72  1860-01-01
## 73  1861-01-01
## 74  1862-01-01
## 75  1863-01-01
## 76  1864-01-01
## 77  1865-01-01
## 78  1866-01-01
## 79  1867-01-01
## 80  1868-01-01
## 81  1869-01-01
## 82  1870-01-01
## 83  1871-01-01
## 84  1872-01-01
## 85  1873-01-01
## 86  1874-01-01
## 87  1875-01-01
## 88  1876-01-01
## 89  1877-01-01
## 90  1878-01-01
## 91  1879-01-01
## 92  1880-01-01
## 93  1881-01-01
## 94  1882-01-01
## 95  1883-01-01
## 96  1884-01-01
## 97  1885-01-01
## 98  1886-01-01
## 99  1887-01-01
## 100 1888-01-01
## 101 1889-01-01
## 102 1890-01-01
## 103 1891-01-01
## 104 1892-01-01
## 105 1893-01-01
## 106 1894-01-01
## 107 1895-01-01
## 108 1896-01-01
## 109 1897-01-01
## 110 1898-01-01
## 111 1899-01-01
## 112 1900-01-01
## 113 1901-01-01
## 114 1902-01-01
## 115 1903-01-01
## 116 1904-01-01
## 117 1905-01-01
## 118 1906-01-01
## 119 1907-01-01
## 120 1908-01-01
## 121 1909-01-01
## 122 1910-01-01
## 123 1911-01-01
## 124 1912-01-01
## 125 1913-01-01
## 126 1914-01-01
## 127 1915-01-01
## 128 1916-01-01
## 129 1917-01-01
## 130 1918-01-01
## 131 1919-01-01
## 132 1920-01-01
## 133 1921-01-01
## 134 1922-01-01
## 135 1923-01-01
## 136 1924-01-01
## 137 1925-01-01
## 138 1926-01-01
## 139 1927-01-01
## 140 1928-01-01
## 141 1929-01-01
## 142 1930-01-01
## 143 1931-01-01
## 144 1932-01-01
## 145 1933-01-01
## 146 1934-01-01
## 147 1935-01-01
## 148 1936-01-01
## 149 1937-01-01
## 150 1938-01-01
## 151 1939-01-01
## 152 1940-01-01
## 153 1941-01-01
## 154 1942-01-01
## 155 1943-01-01
## 156 1944-01-01
## 157 1945-01-01
## 158 1946-01-01
## 159 1947-01-01
## 160 1948-01-01
## 161 1949-01-01
## 162 1950-01-01
## 163 1951-01-01
## 164 1952-01-01
## 165 1953-01-01
## 166 1954-01-01
## 167 1955-01-01
## 168 1956-01-01
## 169 1957-01-01
## 170 1958-01-01
## 171 1959-01-01
## 172 1960-01-01
## 173 1961-01-01
## 174 1962-01-01
## 175 1963-01-01
## 176 1964-01-01
## 177 1965-01-01
## 178 1966-01-01
## 179 1967-01-01
## 180 1968-01-01
## 181 1969-01-01
## 182 1970-01-01
## 183 1971-01-01
## 184 1972-01-01
## 185 1973-01-01
## 186 1974-01-01
## 187 1975-01-01
## 188 1976-01-01
## 189 1977-01-01
## 190 1978-01-01
## 191 1979-01-01
## 192 1980-01-01
## 193 1981-01-01
## 194 1982-01-01
## 195 1983-01-01
## 196 1984-01-01
## 197 1985-01-01
## 198 1986-01-01
## 199 1987-01-01
## 200 1988-01-01
## 201 1989-01-01
## 202 1990-01-01
## 203 1991-01-01
## 204 1992-01-01
## 205 1993-01-01
## 206 1994-01-01
## 207 1995-01-01
## 208 1996-01-01
## 209 1997-01-01
## 210 1998-01-01
## 211 1999-01-01
## 212 2000-01-01
## 213 2001-01-01
## 214 2002-01-01
## 215 2003-01-01
## 216 2004-01-01
## 217 2005-01-01
## 218 2006-01-01
## 219 2007-01-01
## 220 2008-01-01
## 221 2009-01-01
## 222 2010-01-01
## 223 2011-01-01
## 224 2012-01-01
## 225 2013-01-01
## 226 2014-01-01
# Use the anti_join() command to show the missing dates
anti_join(all_dates, judges_commissions_yearly, by="year") -> missing_dates_commissions

missing_dates_commissions
##         year
## 1 1800-01-01
## 2 1805-01-01
## 3 1808-01-01
## 4 1810-01-01
## 5 1828-01-01
## 6 1831-01-01
## 7 1833-01-01
## 8 1843-01-01
# Merge the missing dates and the yearly judges commissions, keeping all rows from both
merge(judges_commissions_yearly, missing_dates_commissions, by="year", all.x = TRUE, all.y = TRUE) -> judges_commissions_yearly

Repeat this process for Nominations & Terminations.

# Use the anti_join() command to show the missing dates
anti_join(all_dates, judges_nominations_yearly, by="year") -> missing_dates_nominations

missing_dates_nominations
##         year
## 1 1800-01-01
## 2 1808-01-01
## 3 1810-01-01
## 4 1814-01-01
## 5 1827-01-01
## 6 1838-01-01
## 7 1843-01-01
# Merge the missing dates and the yearly judges nominations, keeping all rows from both
merge(judges_nominations_yearly, missing_dates_nominations, by="year", all.x = TRUE, all.y = TRUE) -> judges_nominations_yearly
# Use the anti_join() command to show the missing dates
anti_join(all_dates, judges_terminations_yearly, by="year") -> missing_dates_terminations

missing_dates_terminations
##         year
## 1 1789-01-01
## 2 1807-01-01
## 3 1808-01-01
## 4 1817-01-01
## 5 1827-01-01
# Merge the missing dates and the yearly judges terminations, keeping all rows from both
merge(judges_terminations_yearly, missing_dates_terminations, by="year", all.x = TRUE, all.y = TRUE) -> judges_terminations_yearly
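
The two-step approach of finding the missing years and merging them back can also be sketched as a single merge against the full list of years. The example below uses base R only and a four-year toy range; the names all_years, yearly and filled are illustrative. Replacing NA with 0 is shown as an optional extra step, on the assumption that a year with no record means no events; the steps above leave these years as NA.

```r
# Full range of years (toy version: four years instead of 226)
all_years <- data.frame(
  year = seq(as.Date("1789-01-01"), by = "year", length.out = 4)
)

# Yearly counts with two years missing
yearly <- data.frame(
  year = as.Date(c("1789-01-01", "1791-01-01")),
  n    = c(15, 3)
)

# all.x = TRUE keeps every year from all_years; unmatched years get NA
filled <- merge(all_years, yearly, by = "year", all.x = TRUE)

# Optional: treat years with no record as zero events
filled$n[is.na(filled$n)] <- 0

filled
```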

Merging the three data points

Now that each of the datasets is the same length, they can be joined together. It is good practice to join the datasets in sequence, starting with Nominations and Commissions.

inner_join(judges_nominations_yearly, judges_commissions_yearly, by="year") -> nominations_commissions_yearly

nominations_commissions_yearly
##           year No_of_Nominations No_of_Commissions
## 1   1789-01-01                15                15
## 2   1790-01-01                 9                 9
## 3   1791-01-01                 3                 3
## 4   1792-01-01                 1                 1
## 5   1793-01-01                 2                 1
## 6   1794-01-01                 1                 3
## 7   1795-01-01                 2                 1
## 8   1796-01-01                 5                 4
## 9   1797-01-01                 1                 3
## 10  1798-01-01                 2                 2
## 11  1799-01-01                 2                 2
## 12  1800-01-01                NA                NA
## 13  1801-01-01                18                21
## 14  1802-01-01                 7                10
## 15  1803-01-01                 2                 2
## 16  1804-01-01                 3                 3
## 17  1805-01-01                 1                NA
## 18  1806-01-01                 5                 5
## 19  1807-01-01                 1                 2
## 20  1808-01-01                NA                NA
## 21  1809-01-01                 1                 1
## 22  1810-01-01                NA                NA
## 23  1811-01-01                 3                 3
## 24  1812-01-01                 5                 5
## 25  1813-01-01                 2                 2
## 26  1814-01-01                NA                 2
## 27  1815-01-01                 1                 1
## 28  1816-01-01                 1                 1
## 29  1817-01-01                 2                 2
## 30  1818-01-01                 3                 4
## 31  1819-01-01                 3                 4
## 32  1820-01-01                 3                 3
## 33  1821-01-01                 2                 1
## 34  1822-01-01                 2                 3
## 35  1823-01-01                 4                 6
## 36  1824-01-01                 5                 5
## 37  1825-01-01                 3                 3
## 38  1826-01-01                 7                 8
## 39  1827-01-01                NA                 1
## 40  1828-01-01                 2                NA
## 41  1829-01-01                 4                 5
## 42  1830-01-01                 3                 3
## 43  1831-01-01                 1                NA
## 44  1832-01-01                 1                 2
## 45  1833-01-01                 2                NA
## 46  1834-01-01                 3                 4
## 47  1835-01-01                 3                 1
## 48  1836-01-01                 8                 9
## 49  1837-01-01                 4                 5
## 50  1838-01-01                NA                 2
## 51  1839-01-01                 2                 4
## 52  1840-01-01                 4                 4
## 53  1841-01-01                 6                 6
## 54  1842-01-01                 3                 3
## 55  1843-01-01                NA                NA
## 56  1844-01-01                 1                 1
## 57  1845-01-01                 4                 2
## 58  1846-01-01                 4                 7
## 59  1847-01-01                 2                 3
## 60  1848-01-01                 1                 3
## 61  1849-01-01                 5                 4
## 62  1850-01-01                 2                 4
## 63  1851-01-01                 2                 4
## 64  1852-01-01                 2                 3
## 65  1853-01-01                 6                 5
## 66  1854-01-01                 1                 2
## 67  1855-01-01                 7                 9
## 68  1856-01-01                 4                 4
## 69  1857-01-01                 4                 5
## 70  1858-01-01                 4                 5
## 71  1859-01-01                 2                 2
## 72  1860-01-01                 5                 5
## 73  1861-01-01                 8                 7
## 74  1862-01-01                 8                 9
## 75  1863-01-01                10                10
## 76  1864-01-01                12                15
## 77  1865-01-01                 9                 6
## 78  1866-01-01                 5                10
## 79  1867-01-01                 2                 2
## 80  1868-01-01                 2                 2
## 81  1869-01-01                10                 8
## 82  1870-01-01                13                14
## 83  1871-01-01                 4                 6
## 84  1872-01-01                 4                 5
## 85  1873-01-01                 4                 3
## 86  1874-01-01                 4                 5
## 87  1875-01-01                 7                 7
## 88  1876-01-01                 2                 2
## 89  1877-01-01                 7                 7
## 90  1878-01-01                 5                 5
## 91  1879-01-01                 9                 9
## 92  1880-01-01                 6                 6
## 93  1881-01-01                 7                 8
## 94  1882-01-01                 9                10
## 95  1883-01-01                 5                 3
## 96  1884-01-01                 6                 8
## 97  1885-01-01                 4                 3
## 98  1886-01-01                 6                 4
## 99  1887-01-01                 5                 5
## 100 1888-01-01                 5                 9
## 101 1889-01-01                 6                 2
## 102 1890-01-01                 9                13
## 103 1891-01-01                21                14
## 104 1892-01-01                17                32
## 105 1893-01-01                14                14
## 106 1894-01-01                 6                 6
## 107 1895-01-01                 7                 9
## 108 1896-01-01                10                 7
## 109 1897-01-01                 8                 9
## 110 1898-01-01                 6                 5
## 111 1899-01-01                15                15
## 112 1900-01-01                 7                 5
## 113 1901-01-01                12                13
## 114 1902-01-01                12                14
## 115 1903-01-01                13                14
## 116 1904-01-01                 9                 7
## 117 1905-01-01                26                27
## 118 1906-01-01                11                11
## 119 1907-01-01                15                12
## 120 1908-01-01                 6                 6
## 121 1909-01-01                15                15
## 122 1910-01-01                33                22
## 123 1911-01-01                16                28
## 124 1912-01-01                14                13
## 125 1913-01-01                 9                 9
## 126 1914-01-01                15                14
## 127 1915-01-01                 4                 5
## 128 1916-01-01                15                15
## 129 1917-01-01                 9                10
## 130 1918-01-01                12                11
## 131 1919-01-01                12                13
## 132 1920-01-01                 6                 6
## 133 1921-01-01                15                14
## 134 1922-01-01                20                15
## 135 1923-01-01                23                25
## 136 1924-01-01                12                16
## 137 1925-01-01                26                25
## 138 1926-01-01                 5                12
## 139 1927-01-01                12                10
## 140 1928-01-01                23                27
## 141 1929-01-01                29                35
## 142 1930-01-01                16                10
## 143 1931-01-01                26                22
## 144 1932-01-01                 9                18
## 145 1933-01-01                 9                 9
## 146 1934-01-01                 9                 9
## 147 1935-01-01                17                16
## 148 1936-01-01                13                14
## 149 1937-01-01                33                33
## 150 1938-01-01                 9                 9
## 151 1939-01-01                35                34
## 152 1940-01-01                32                32
## 153 1941-01-01                26                21
## 154 1942-01-01                 9                15
## 155 1943-01-01                16                16
## 156 1944-01-01                 9                 9
## 157 1945-01-01                23                22
## 158 1946-01-01                17                18
## 159 1947-01-01                17                15
## 160 1948-01-01                 3                 7
## 161 1949-01-01                30                30
## 162 1950-01-01                44                39
## 163 1951-01-01                14                17
## 164 1952-01-01                 5                 5
## 165 1953-01-01                10                 9
## 166 1954-01-01                47                47
## 167 1955-01-01                21                21
## 168 1956-01-01                24                24
## 169 1957-01-01                21                21
## 170 1958-01-01                17                17
## 171 1959-01-01                37                36
## 172 1960-01-01                12                12
## 173 1961-01-01                63                62
## 174 1962-01-01                58                62
## 175 1963-01-01                16                16
## 176 1964-01-01                24                23
## 177 1965-01-01                31                35
## 178 1966-01-01                66                81
## 179 1967-01-01                39                38
## 180 1968-01-01                27                28
## 181 1969-01-01                26                26
## 182 1970-01-01                66                65
## 183 1971-01-01                74                71
## 184 1972-01-01                30                35
## 185 1973-01-01                23                23
## 186 1974-01-01                36                36
## 187 1975-01-01                22                20
## 188 1976-01-01                29                31
## 189 1977-01-01                32                32
## 190 1978-01-01                34                34
## 191 1979-01-01               152               141
## 192 1980-01-01                48                75
## 193 1981-01-01                48                62
## 194 1982-01-01                44                62
## 195 1983-01-01                35                36
## 196 1984-01-01                44                44
## 197 1985-01-01                86                85
## 198 1986-01-01                46                47
## 199 1987-01-01                64                44
## 200 1988-01-01                21                41
## 201 1989-01-01                24                15
## 202 1990-01-01                48                57
## 203 1991-01-01                89                58
## 204 1992-01-01                33                64
## 205 1993-01-01                48                29
## 206 1994-01-01                83               102
## 207 1995-01-01                69                54
## 208 1996-01-01                 7                22
## 209 1997-01-01                65                37
## 210 1998-01-01                37                65
## 211 1999-01-01                48                34
## 212 2000-01-01                25                39
## 213 2001-01-01                54                30
## 214 2002-01-01                48                72
## 215 2003-01-01                94                69
## 216 2004-01-01                11                35
## 217 2005-01-01                33                18
## 218 2006-01-01                21                37
## 219 2007-01-01                51                38
## 220 2008-01-01                17                30
## 221 2009-01-01                32                10
## 222 2010-01-01                30                51
## 223 2011-01-01                97                61
## 224 2012-01-01                15                49
## 225 2013-01-01                46                48
## 226 2014-01-01                61                62
## 227       <NA>               160                33

Next, join the final data column, Terminations, to the combined nominations and commissions table, matching rows on year.

nominations_commissions_terminations_yearly <- inner_join(nominations_commissions_yearly, judges_terminations_yearly, by = "year")

nominations_commissions_terminations_yearly
##           year No_of_Nominations No_of_Commissions No_of_Terminations
## 1   1789-01-01                15                15                 NA
## 2   1790-01-01                 9                 9                  3
## 3   1791-01-01                 3                 3                  2
## 4   1792-01-01                 1                 1                  1
## 5   1793-01-01                 2                 1                  2
## 6   1794-01-01                 1                 3                  2
## 7   1795-01-01                 2                 1                  4
## 8   1796-01-01                 5                 4                  3
## 9   1797-01-01                 1                 3                  1
## 10  1798-01-01                 2                 2                  2
## 11  1799-01-01                 2                 2                  2
## 12  1800-01-01                NA                NA                  1
## 13  1801-01-01                18                21                  6
## 14  1802-01-01                 7                10                 20
## 15  1803-01-01                 2                 2                  1
## 16  1804-01-01                 3                 3                  2
## 17  1805-01-01                 1                NA                  1
## 18  1806-01-01                 5                 5                  5
## 19  1807-01-01                 1                 2                 NA
## 20  1808-01-01                NA                NA                 NA
## 21  1809-01-01                 1                 1                  1
## 22  1810-01-01                NA                NA                  2
## 23  1811-01-01                 3                 3                  1
## 24  1812-01-01                 5                 5                  4
## 25  1813-01-01                 2                 2                  2
## 26  1814-01-01                NA                 2                  3
## 27  1815-01-01                 1                 1                  1
## 28  1816-01-01                 1                 1                  1
## 29  1817-01-01                 2                 2                 NA
## 30  1818-01-01                 3                 4                  3
## 31  1819-01-01                 3                 4                  3
## 32  1820-01-01                 3                 3                  1
## 33  1821-01-01                 2                 1                  1
## 34  1822-01-01                 2                 3                  2
## 35  1823-01-01                 4                 6                  4
## 36  1824-01-01                 5                 5                  6
## 37  1825-01-01                 3                 3                  4
## 38  1826-01-01                 7                 8                  7
## 39  1827-01-01                NA                 1                 NA
## 40  1828-01-01                 2                NA                  5
## 41  1829-01-01                 4                 5                  2
## 42  1830-01-01                 3                 3                  2
## 43  1831-01-01                 1                NA                  1
## 44  1832-01-01                 1                 2                  1
## 45  1833-01-01                 2                NA                  3
## 46  1834-01-01                 3                 4                  3
## 47  1835-01-01                 3                 1                  3
## 48  1836-01-01                 8                 9                  5
## 49  1837-01-01                 4                 5                  2
## 50  1838-01-01                NA                 2                  3
## 51  1839-01-01                 2                 4                  3
## 52  1840-01-01                 4                 4                  1
## 53  1841-01-01                 6                 6                  6
## 54  1842-01-01                 3                 3                  3
## 55  1843-01-01                NA                NA                  1
## 56  1844-01-01                 1                 1                  2
## 57  1845-01-01                 4                 2                  5
## 58  1846-01-01                 4                 7                  1
## 59  1847-01-01                 2                 3                  1
## 60  1848-01-01                 1                 3                  1
## 61  1849-01-01                 5                 4                  5
## 62  1850-01-01                 2                 4                  1
## 63  1851-01-01                 2                 4                  3
## 64  1852-01-01                 2                 3                  3
## 65  1853-01-01                 6                 5                  5
## 66  1854-01-01                 1                 2                  1
## 67  1855-01-01                 7                 9                  6
## 68  1856-01-01                 4                 4                  1
## 69  1857-01-01                 4                 5                  5
## 70  1858-01-01                 4                 5                  2
## 71  1859-01-01                 2                 2                  5
## 72  1860-01-01                 5                 5                  3
## 73  1861-01-01                 8                 7                 20
## 74  1862-01-01                 8                 9                  5
## 75  1863-01-01                10                10                 11
## 76  1864-01-01                12                15                  8
## 77  1865-01-01                 9                 6                  2
## 78  1866-01-01                 5                10                  6
## 79  1867-01-01                 2                 2                  2
## 80  1868-01-01                 2                 2                  1
## 81  1869-01-01                10                 8                  4
## 82  1870-01-01                13                14                  8
## 83  1871-01-01                 4                 6                  6
## 84  1872-01-01                 4                 5                  3
## 85  1873-01-01                 4                 3                  5
## 86  1874-01-01                 4                 5                  7
## 87  1875-01-01                 7                 7                  3
## 88  1876-01-01                 2                 2                  2
## 89  1877-01-01                 7                 7                  5
## 90  1878-01-01                 5                 5                  5
## 91  1879-01-01                 9                 9                  8
## 92  1880-01-01                 6                 6                  6
## 93  1881-01-01                 7                 8                  8
## 94  1882-01-01                 9                10                  9
## 95  1883-01-01                 5                 3                  5
## 96  1884-01-01                 6                 8                  5
## 97  1885-01-01                 4                 3                  3
## 98  1886-01-01                 6                 4                  6
## 99  1887-01-01                 5                 5                  5
## 100 1888-01-01                 5                 9                  5
## 101 1889-01-01                 6                 2                  4
## 102 1890-01-01                 9                13                  6
## 103 1891-01-01                21                14                 10
## 104 1892-01-01                17                32                  8
## 105 1893-01-01                14                14                 12
## 106 1894-01-01                 6                 6                  1
## 107 1895-01-01                 7                 9                  5
## 108 1896-01-01                10                 7                  9
## 109 1897-01-01                 8                 9                 10
## 110 1898-01-01                 6                 5                  6
## 111 1899-01-01                15                15                  8
## 112 1900-01-01                 7                 5                  5
## 113 1901-01-01                12                13                 10
## 114 1902-01-01                12                14                  9
## 115 1903-01-01                13                14                 10
## 116 1904-01-01                 9                 7                  6
## 117 1905-01-01                26                27                 19
## 118 1906-01-01                11                11                 12
## 119 1907-01-01                15                12                  9
## 120 1908-01-01                 6                 6                  7
## 121 1909-01-01                15                15                 14
## 122 1910-01-01                33                22                 13
## 123 1911-01-01                16                28                 50
## 124 1912-01-01                14                13                 10
## 125 1913-01-01                 9                 9                 16
## 126 1914-01-01                15                14                 13
## 127 1915-01-01                 4                 5                  8
## 128 1916-01-01                15                15                 15
## 129 1917-01-01                 9                10                  6
## 130 1918-01-01                12                11                 11
## 131 1919-01-01                12                13                  8
## 132 1920-01-01                 6                 6                  7
## 133 1921-01-01                15                14                 11
## 134 1922-01-01                20                15                 11
## 135 1923-01-01                23                25                 10
## 136 1924-01-01                12                16                 16
## 137 1925-01-01                26                25                 17
## 138 1926-01-01                 5                12                  6
## 139 1927-01-01                12                10                 10
## 140 1928-01-01                23                27                 16
## 141 1929-01-01                29                35                 18
## 142 1930-01-01                16                10                 16
## 143 1931-01-01                26                22                 19
## 144 1932-01-01                 9                18                  9
## 145 1933-01-01                 9                 9                 11
## 146 1934-01-01                 9                 9                  7
## 147 1935-01-01                17                16                 11
## 148 1936-01-01                13                14                  6
## 149 1937-01-01                33                33                 12
## 150 1938-01-01                 9                 9                 14
## 151 1939-01-01                35                34                 15
## 152 1940-01-01                32                32                 15
## 153 1941-01-01                26                21                 18
## 154 1942-01-01                 9                15                  8
## 155 1943-01-01                16                16                 14
## 156 1944-01-01                 9                 9                 14
## 157 1945-01-01                23                22                 19
## 158 1946-01-01                17                18                 14
## 159 1947-01-01                17                15                 13
## 160 1948-01-01                 3                 7                 22
## 161 1949-01-01                30                30                 21
## 162 1950-01-01                44                39                 14
## 163 1951-01-01                14                17                 10
## 164 1952-01-01                 5                 5                 15
## 165 1953-01-01                10                 9                 17
## 166 1954-01-01                47                47                 13
## 167 1955-01-01                21                21                 15
## 168 1956-01-01                24                24                  9
## 169 1957-01-01                21                21                 16
## 170 1958-01-01                17                17                 21
## 171 1959-01-01                37                36                 16
## 172 1960-01-01                12                12                 17
## 173 1961-01-01                63                62                 19
## 174 1962-01-01                58                62                 21
## 175 1963-01-01                16                16                 21
## 176 1964-01-01                24                23                 19
## 177 1965-01-01                31                35                 29
## 178 1966-01-01                66                81                 40
## 179 1967-01-01                39                38                 18
## 180 1968-01-01                27                28                 14
## 181 1969-01-01                26                26                 23
## 182 1970-01-01                66                65                 20
## 183 1971-01-01                74                71                 24
## 184 1972-01-01                30                35                 24
## 185 1973-01-01                23                23                 18
## 186 1974-01-01                36                36                 34
## 187 1975-01-01                22                20                 28
## 188 1976-01-01                29                31                 27
## 189 1977-01-01                32                32                 18
## 190 1978-01-01                34                34                 26
## 191 1979-01-01               152               141                 37
## 192 1980-01-01                48                75                 29
## 193 1981-01-01                48                62                 40
## 194 1982-01-01                44                62                 38
## 195 1983-01-01                35                36                 23
## 196 1984-01-01                44                44                 24
## 197 1985-01-01                86                85                 25
## 198 1986-01-01                46                47                 29
## 199 1987-01-01                64                44                 27
## 200 1988-01-01                21                41                 31
## 201 1989-01-01                24                15                 29
## 202 1990-01-01                48                57                 39
## 203 1991-01-01                89                58                 28
## 204 1992-01-01                33                64                 30
## 205 1993-01-01                48                29                 25
## 206 1994-01-01                83               102                 27
## 207 1995-01-01                69                54                 39
## 208 1996-01-01                 7                22                 29
## 209 1997-01-01                65                37                 26
## 210 1998-01-01                37                65                 40
## 211 1999-01-01                48                34                 34
## 212 2000-01-01                25                39                 32
## 213 2001-01-01                54                30                 34
## 214 2002-01-01                48                72                 38
## 215 2003-01-01                94                69                 29
## 216 2004-01-01                11                35                 34
## 217 2005-01-01                33                18                 30
## 218 2006-01-01                21                37                 30
## 219 2007-01-01                51                38                 30
## 220 2008-01-01                17                30                 30
## 221 2009-01-01                32                10                 35
## 222 2010-01-01                30                51                 37
## 223 2011-01-01                97                61                 47
## 224 2012-01-01                15                49                 33
## 225 2013-01-01                46                48                 32
## 226 2014-01-01                61                62                 31
## 227       <NA>               160                33               1374
# Remember to rename the columns
colnames(nominations_commissions_terminations_yearly) <- c("Year", "Nominations", "Commissions", "Terminations")
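
To see what inner_join() does with years that appear in only one of the two tables, here is a minimal sketch on made-up data. The toy data frames `toy_noms` and `toy_terms` and their values are hypothetical and are not taken from the judges dataset; full_join() is shown only for comparison.

```r
library(dplyr)

# Hypothetical toy tables (not the judges data)
toy_noms  <- data.frame(year = c(1789, 1790, 1791),
                        No_of_Nominations = c(15, 9, 3))
toy_terms <- data.frame(year = c(1790, 1791, 1792),
                        No_of_Terminations = c(3, 2, 1))

# inner_join() keeps only the years found in BOTH tables
toy_inner <- inner_join(toy_noms, toy_terms, by = "year")

# full_join() keeps every year and fills the gaps with NA
toy_full <- full_join(toy_noms, toy_terms, by = "year")

toy_inner
toy_full
```

If years seem to go missing after a join, comparing nrow() before and after, as above, is a quick check.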

Explore the new dataframes

Using ggplot() we can display the findings as a simple line chart.

judges_commissions_yearly %>% ggplot(aes(x=year, y=No_of_Commissions)) + geom_line(linewidth = 0.8)
## Warning: Removed 1 row containing missing values (`geom_line()`).
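
The warning most likely refers to the single row whose year is NA (the last row in the tables printed earlier). One way to avoid it, sketched here on a hypothetical stand-in data frame `toy_commissions` rather than the real judges_commissions_yearly, is to drop that row before plotting.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for judges_commissions_yearly, with one NA year
toy_commissions <- data.frame(
  year = as.Date(c("1789-01-01", "1790-01-01", "1791-01-01", NA)),
  No_of_Commissions = c(15, 9, 3, 33)
)

# Keep only the rows that have a valid year
toy_clean <- toy_commissions %>% filter(!is.na(year))

# The same basic plot now has no missing x values to remove
p <- toy_clean %>%
  ggplot(aes(x = year, y = No_of_Commissions)) +
  geom_line(linewidth = 0.8)
p
```

Whether dropping the row is appropriate depends on why the year is missing, so it is worth inspecting that row first.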

Creating a clearer visualisation

Note how, with some additional information added to the basic plot, we can create a much more effective graph.

judges_commissions_yearly %>% 
  ggplot(aes(x = year, y = No_of_Commissions)) + 
  geom_line(linewidth = 0.8) + 
  labs(title = "Judicial Commissions - 1789-2014",
       tag = "Figure 1", 
       x = "Year",
       y = "No.") +
  scale_x_date(date_breaks = "65 years", date_labels = "%Y") +
  theme_classic() +
  theme(axis.text.x = element_text(colour = "darkslategrey", size = 16), 
        axis.text.y = element_text(colour = "darkslategrey", size = 16),
        legend.background = element_rect(fill = "white", linewidth = 4, colour = "white"),
        legend.justification = c(0, 1),
        legend.position = c(0.9, 1),
        text = element_text(family = "Georgia"),
        plot.title = element_text(size = 18, margin = margin(b = 10)),
        plot.subtitle = element_text(size = 12, color = "darkslategrey", margin = margin(b = 25)),
        plot.caption = element_text(size = 8, margin = margin(t = 10), color = "grey70", hjust = 0))
## Warning: Removed 1 row containing missing values (`geom_line()`).

Class Exercise

Create a bar graph of the yearly nominations and terminations of judges.

Create a comparative line chart or bar chart of the yearly nominations, commissions and terminations.
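
For the comparative chart, one common approach (not the only one) is to reshape the wide table into long format so that ggplot can map the type of event to colour. The sketch below uses a hypothetical toy data frame `toy_wide` with made-up values; the new column names `Event` and `Count` are also assumptions chosen for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data in the same wide layout as the renamed table
toy_wide <- data.frame(
  Year = as.Date(c("1789-01-01", "1790-01-01", "1791-01-01")),
  Nominations = c(15, 9, 3),
  Commissions = c(15, 9, 3),
  Terminations = c(1, 3, 2)
)

# One row per Year/Event pair instead of one column per event
toy_long <- toy_wide %>%
  pivot_longer(cols = c(Nominations, Commissions, Terminations),
               names_to = "Event",
               values_to = "Count")

# One line per event type, distinguished by colour
p <- toy_long %>%
  ggplot(aes(x = Year, y = Count, colour = Event)) +
  geom_line(linewidth = 0.8)
p
```

Swapping geom_line() for geom_col(position = "dodge") and mapping `fill = Event` instead of `colour` would give a grouped bar chart from the same long table.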