library(tidyverse)
library(ggplot2)
library(RColorBrewer)
setwd("C:/Users/enomc/OneDrive - montgomerycollege.edu/Documents/Data Science")
nations <- read_csv("nations.csv")
Nations homework
##load the dataset
# Add total GDP in trillions: GDP per capita times population, divided by 10^12
national <- nations |>
  mutate(gdp = (gdp_percap * population) / 10^12)
# Keep only the four countries used in the first chart
nationals3 <- national |>
  filter(country %in% c("Angola", "Armenia", "Argentina", "Austria"))
national
# A tibble: 5,275 × 11
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AD AND Andorra 1996 NA 64291 10.9 2.8
2 AD AND Andorra 1994 NA 62707 10.9 3.2
3 AD AND Andorra 2003 NA 74783 10.3 2
4 AD AND Andorra 1990 NA 54511 11.9 4.3
5 AD AND Andorra 2009 NA 85474 9.9 1.7
6 AD AND Andorra 2011 NA 82326 NA 1.6
7 AD AND Andorra 2004 NA 78337 10.9 2
8 AD AND Andorra 2010 NA 84419 9.8 1.7
9 AD AND Andorra 2001 NA 67770 11.8 2.1
10 AD AND Andorra 2002 NA 71046 11.2 2.1
# ℹ 5,265 more rows
# ℹ 3 more variables: region <chr>, income <chr>, gdp <dbl>
# Line-and-point chart of GDP per capita over time, one color per country
aplot <- nationals3 |>
  ggplot(aes(x = year, y = gdp_percap, color = country)) +
  geom_line() +
  geom_point() +
  scale_color_brewer(palette = "Set1") +
  labs(color = "Country",
       y = "GDP per Capita",
       title = "Austria's road to the largest economy",
       caption = "Source: Nations dataset")
aplot
Warning: Removed 25 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 25 rows containing missing values or values outside the scale range
(`geom_point()`).
# Total GDP (in trillions) per region and year, ignoring missing values
cplot <- national |>
  group_by(region, year) |>
  summarise(sum_GDP = sum(gdp, na.rm = TRUE))
`summarise()` has grouped output by 'region'. You can override using the
`.groups` argument.
cplot
# A tibble: 175 × 3
# Groups: region [7]
region year sum_GDP
<chr> <dbl> <dbl>
1 East Asia & Pacific 1990 5.52
2 East Asia & Pacific 1991 6.03
3 East Asia & Pacific 1992 6.50
4 East Asia & Pacific 1993 7.04
5 East Asia & Pacific 1994 7.64
6 East Asia & Pacific 1995 8.29
7 East Asia & Pacific 1996 8.96
8 East Asia & Pacific 1997 9.55
9 East Asia & Pacific 1998 9.60
10 East Asia & Pacific 1999 10.1
# ℹ 165 more rows
# Stacked area chart of regional GDP, with white outlines between regions
bplot <- cplot |>
  ggplot(aes(x = year, y = sum_GDP, fill = region)) +
  geom_area(position = "stack", stat = "identity", color = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(fill = "Region",
       y = "Gross Domestic Product in Trillions",
       title = "GDP by Region",
       caption = "Source: Nations dataset")
bplot
First, I loaded the required packages and the nations dataset in my first chunk. Then I used mutate() from dplyr to create a new data frame called national. It adds a gdp column by multiplying GDP per capita by population in the nations dataset and dividing by a trillion (10^12). I used ggplot to draw both of my charts, and for the first chart I filtered the data to four countries (Angola, Armenia, Argentina, and Austria). I added geom_line() and geom_point() layers and used the Set1 ColorBrewer palette. For the second chart, I used dplyr again to group by region and year, then called summarise() with sum(gdp, na.rm = TRUE) so that missing (NA) values are ignored and the totals can still be calculated.
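To show why na.rm = TRUE matters, here is a minimal sketch on a small made-up table. The `toy` data frame, its region names, and all of its values are assumptions for illustration and are not taken from nations.csv. It applies the same GDP-in-trillions calculation and then compares the regional sums with and without na.rm = TRUE.

```r
library(dplyr)

# Hypothetical toy data (values invented for illustration, not from nations.csv)
toy <- tibble(
  region     = c("A", "A", "B", "B"),
  gdp_percap = c(10000, NA, 20000, 30000),
  population = c(1e8, 5e7, 2e8, 1e8)
)

# Same calculation as in the homework: total GDP in trillions
toy <- toy |>
  mutate(gdp = (gdp_percap * population) / 10^12)

# Without na.rm, a single missing value makes the whole regional sum NA
sum_with_na <- toy |>
  group_by(region) |>
  summarise(sum_GDP = sum(gdp))

# With na.rm = TRUE, missing values are dropped before summing
sum_without_na <- toy |>
  group_by(region) |>
  summarise(sum_GDP = sum(gdp, na.rm = TRUE))

print(sum_with_na)
print(sum_without_na)
```

In this sketch, region A has one missing GDP per capita value, so its sum is NA in the first summary but is still computed from the remaining row in the second. A region whose rows are all NA would sum to 0 with na.rm = TRUE, which is worth keeping in mind when reading the area chart.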