Dr. J. Kavanagh
2023-09-14
Here is a clear and straightforward way to import data into RStudio
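A minimal sketch of one such import, using base R's `read.csv()`. The file here is a small temporary CSV written on the spot so the example runs anywhere; in practice the path would point at the downloaded receipts file. The column names and values are hypothetical, loosely modelled on the output shown later in this section.

```r
# Write a tiny stand-in CSV so the example is self-contained
# (hypothetical rows; the real file is not reproduced here)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "commodity_and_head_of_duty,year,net_receipts",
  "Alcohols Beer - Import,2020,100",
  "Alcohols Beer - Home,2020,200"
), csv_path)

# Import the file into a data frame
receipts <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick checks that the import worked
head(receipts)
str(receipts)
```

In RStudio the same result can be reached through the Import Dataset button in the Environment pane, which generates a similar call.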
Using some of the lessons from Day 1, explore the dataset
## year n
## 1 2003 30
## 2 2004 30
## 3 2005 29
## 4 2006 30
## 5 2007 28
## 6 2008 28
## 7 2009 28
## 8 2010 35
## 9 2011 35
## 10 2012 35
## 11 2013 36
## 12 2014 36
## 13 2015 36
## 14 2016 35
## 15 2017 35
## 16 2018 36
## 17 2019 36
## 18 2020 36
## Rows: 594
## Columns: 3
## $ commodity_and_head_of_duty <chr> "Alcohols Beer - Import", "Alcohols Beer - …
## $ year <int> 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2…
## $ net_receipts_.m <int> 183235512, 167879917, 12701490, 40340740, 2…
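The two summaries above (a row count per year and a column overview) could be produced along the following lines. This is a sketch on a toy data frame, assuming `dplyr` is installed; the object name `receipts` and its values are placeholders, not the course data.

```r
library(dplyr)

# Toy stand-in for the receipts data (hypothetical values)
receipts <- data.frame(
  commodity_and_head_of_duty = c("Beer - Import", "Beer - Home", "Beer - Import"),
  year = c(2020L, 2020L, 2019L),
  net_receipts = c(100L, 200L, 90L)
)

# Number of rows recorded for each year
rows_per_year <- count(receipts, year)
rows_per_year

# Column names, types and the first few values of each column
glimpse(receipts)
```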
receipts_alcohol_unified <- rbind(receipts_alcohol_home, receipts_alcohol_import)
receipts_alcohol_unified %>% ggplot(aes(x=Year, y=Tax, colour=Type, group=Type)) + geom_line()
## Type Year Tax
## 12 Alcohols Beer - Home 2009 278308367
## 13 Alcohols Beer - Home 2008 309303625
## 14 Alcohols Beer - Home 2007 306918459
## 15 Alcohols Beer - Home 2006 349185923
## 16 Alcohols Beer - Home 2005 361930565
## 17 Alcohols Beer - Home 2004 378081957
## Type Year Tax
## 14 Alcohols Beer - Import 2007 157883543
## 15 Alcohols Beer - Import 2006 111507924
## 16 Alcohols Beer - Import 2005 95377167
## 17 Alcohols Beer - Import 2004 80113005
## 18 Alcohols Beer - Import 2003 82770969
## 19 Alcohols Beer - Import 2003 372619049
In the receipts_alcohol_import dataframe, the data creators entered the year 2003 twice (rows 18 and 19). The year in the last row therefore needs correcting; the output below shows it changed to 2002.
## [1] 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006
## [16] 2005 2004 2003 2003
## [1] 2003
## [1] 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006
## [16] 2005 2004 2003 2002
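One way to make this correction, sketched in base R on a toy copy of the data. It assumes, as the corrected output above suggests, that the repeated 2003 should read 2002; the `Tax` values are placeholders rather than the real receipts.

```r
# Toy stand-in for receipts_alcohol_import: only the Year column matters here
receipts_alcohol_import <- data.frame(
  Type = "Alcohols Beer - Import",
  Year = c(2020:2003, 2003),  # 19 rows, with 2003 repeated at the end
  Tax  = seq_len(19)          # placeholder values, not the real receipts
)

# Locate the row(s) where a year appears for the second time
dup_rows <- which(duplicated(receipts_alcohol_import$Year))
receipts_alcohol_import$Year[dup_rows]

# Overwrite the repeated year (assumption: the intended value is 2002)
receipts_alcohol_import$Year[dup_rows] <- 2002
receipts_alcohol_import$Year
```

Indexing by `duplicated()` rather than by a hard-coded row number keeps the fix valid if the rows are later reordered.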
We will reuse the tidyverse library, together with the additional HistData package, to reproduce Minard’s map of Napoleon’s campaign.
install.packages(c("tidyverse", "ggthemes", "HistData", "lubridate", "gridExtra"))
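Once installed, the package has to be attached and its datasets loaded before they can be printed or plotted. A short sketch, assuming HistData is already installed:

```r
library(HistData)

# Load the troop, city and temperature tables used for Minard's map
data(Minard.troops)
data(Minard.cities)
data(Minard.temp)

# Inspect the structure of the troop data before plotting
str(Minard.troops)
head(Minard.troops)
```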
## long lat survivors direction group
## 1 24.0 54.9 340000 A 1
## 2 24.5 55.0 340000 A 1
## 3 25.5 54.5 340000 A 1
## 4 26.0 54.7 320000 A 1
## 5 27.0 54.8 300000 A 1
## 6 28.0 54.9 280000 A 1
## 7 28.5 55.0 240000 A 1
## 8 29.0 55.1 210000 A 1
## 9 30.0 55.2 180000 A 1
## 10 30.3 55.3 175000 A 1
## 11 32.0 54.8 145000 A 1
## 12 33.2 54.9 140000 A 1
## 13 34.4 55.5 127100 A 1
## 14 35.5 55.4 100000 A 1
## 15 36.0 55.5 100000 A 1
## 16 37.6 55.8 100000 A 1
## 17 37.7 55.7 100000 R 1
## 18 37.5 55.7 98000 R 1
## 19 37.0 55.0 97000 R 1
## 20 36.8 55.0 96000 R 1
## 21 35.4 55.3 87000 R 1
## 22 34.3 55.2 55000 R 1
## 23 33.3 54.8 37000 R 1
## 24 32.0 54.6 24000 R 1
## 25 30.4 54.4 20000 R 1
## 26 29.2 54.3 20000 R 1
## 27 28.5 54.2 20000 R 1
## 28 28.3 54.3 20000 R 1
## 29 27.5 54.5 20000 R 1
## 30 26.8 54.3 12000 R 1
## 31 26.4 54.4 14000 R 1
## 32 25.0 54.4 8000 R 1
## 33 24.4 54.4 4000 R 1
## 34 24.2 54.4 4000 R 1
## 35 24.1 54.4 4000 R 1
## 36 24.0 55.1 60000 A 2
## 37 24.5 55.2 60000 A 2
## 38 25.5 54.7 60000 A 2
## 39 26.6 55.7 40000 A 2
## 40 27.4 55.6 33000 A 2
## 41 28.7 55.5 33000 A 2
## 42 28.7 55.5 33000 R 2
## 43 29.2 54.2 30000 R 2
## 44 28.5 54.1 30000 R 2
## 45 28.3 54.2 28000 R 2
## 46 24.0 55.2 22000 A 3
## 47 24.5 55.3 22000 A 3
## 48 24.6 55.8 6000 A 3
## 49 24.6 55.8 6000 R 3
## 50 24.2 54.4 6000 R 3
## 51 24.1 54.4 6000 R 3
## plot path of troops, and another layer for city names
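plot_troops <- ggplot(Minard.troops, aes(long, lat)) +
  geom_path(aes(linewidth = survivors, colour = direction, group = group),
            lineend = "round", linejoin = "round")
# City-name layer and legend breaks: both are used below but not defined in the
# code shown, so these definitions are assumed (they follow the HistData example)
plot_cities <- geom_text(aes(label = city), size = 4, data = Minard.cities)
breaks <- c(1, 2, 3) * 10^5
# Create a new gg object
# scale_linewidth() is used because the path maps survivors to linewidth, not size
plot_minard <- plot_troops + plot_cities +
  scale_linewidth("Survivors", range = c(1, 10),
                  breaks = breaks, labels = scales::comma(breaks)) +
  scale_color_manual("Direction",
                     values = c("grey50", "red"),
                     labels=c("Advance", "Retreat")) +
  coord_cartesian(xlim = c(24, 38)) +
  xlab(NULL) +
  ylab("Latitude") +
  ggtitle("Napoleon's March on Moscow") +
  theme_bw() +
  theme(legend.position=c(.8, .2), legend.box="horizontal")
## plot temperature vs. longitude, with labels for dates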
plot_temp <- ggplot(Minard.temp, aes(long, temp)) +
  geom_path(color="grey", linewidth=1.5) +
  geom_point(size=2) +
  geom_text(aes(label=date)) +
  xlab("Longitude") + ylab("Temperature") +
  coord_cartesian(xlim = c(24, 38)) +
  theme_bw()
## Warning: Removed 1 rows containing missing values (`geom_text()`).
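The troop map and the temperature plot share the same longitude range, so they can be stacked into a single figure. The sketch below uses `grid.arrange()` from gridExtra, which was installed earlier; that this is the intended final step is an assumption. The two plots are simplified stand-ins (named `top` and `bottom` here for illustration) so the example runs on its own.

```r
library(ggplot2)
library(gridExtra)
library(HistData)

data(Minard.troops)
data(Minard.temp)

# Simplified stand-in for the troop map
top <- ggplot(Minard.troops, aes(long, lat)) +
  geom_path(aes(linewidth = survivors, colour = direction, group = group)) +
  coord_cartesian(xlim = c(24, 38))

# Simplified stand-in for the temperature plot
bottom <- ggplot(Minard.temp, aes(long, temp)) +
  geom_path(colour = "grey") +
  coord_cartesian(xlim = c(24, 38))

# Stack the plots, giving the map three times the height of the temperature panel
combined <- grid.arrange(top, bottom, nrow = 2, heights = c(3, 1))
```

With the full plots from this section, the call would be `grid.arrange(plot_minard, plot_temp, nrow = 2, heights = c(3, 1))`.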