I focus on the unemployment rate of the United States of America for this assignment.
library(wbstats)
library(ggplot2)
str(wb_cachelist, max.level = 1)
## List of 7
## $ countries :'data.frame': 304 obs. of 18 variables:
## $ indicators :'data.frame': 16978 obs. of 7 variables:
## $ sources :'data.frame': 43 obs. of 8 variables:
## $ datacatalog:'data.frame': 238 obs. of 29 variables:
## $ topics :'data.frame': 21 obs. of 3 variables:
## $ income :'data.frame': 7 obs. of 3 variables:
## $ lending :'data.frame': 4 obs. of 3 variables:
new_cache <- wbcache()
unemploy_vars <- wbsearch(pattern = "unemployment")
head(unemploy_vars)
## indicatorID
## 35 WP15177.9
## 36 WP15177.8
## 37 WP15177.7
## 38 WP15177.6
## 39 WP15177.5
## 40 WP15177.4
## indicator
## 35 Received government transfers in the past year, income, richest 60% (% ages 15+) [w2]
## 36 Received government transfers in the past year, income, poorest 40% (% ages 15+) [w2]
## 37 Received government transfers in the past year, secondary education or more (% ages 15+) [w2]
## 38 Received government transfers in the past year, primary education or less (% ages 15+) [w2]
## 39 Received government transfers in the past year, older adults (% ages 25+) [w2]
## 40 Received government transfers in the past year, young adults (% ages 15-24) [w2]
unEmpRate <- wb(indicator = "SL.UEM.TOTL.NE.ZS")
head(unEmpRate)
## iso3c date value indicatorID
## 4 ARB 2016 9.559937 SL.UEM.TOTL.NE.ZS
## 6 ARB 2014 10.526193 SL.UEM.TOTL.NE.ZS
## 8 ARB 2012 10.674805 SL.UEM.TOTL.NE.ZS
## 9 ARB 2011 11.146858 SL.UEM.TOTL.NE.ZS
## 10 ARB 2010 9.301470 SL.UEM.TOTL.NE.ZS
## 11 ARB 2009 9.501470 SL.UEM.TOTL.NE.ZS
## indicator iso2c
## 4 Unemployment, total (% of total labor force) (national estimate) 1A
## 6 Unemployment, total (% of total labor force) (national estimate) 1A
## 8 Unemployment, total (% of total labor force) (national estimate) 1A
## 9 Unemployment, total (% of total labor force) (national estimate) 1A
## 10 Unemployment, total (% of total labor force) (national estimate) 1A
## 11 Unemployment, total (% of total labor force) (national estimate) 1A
## country
## 4 Arab World
## 6 Arab World
## 8 Arab World
## 9 Arab World
## 10 Arab World
## 11 Arab World
class(unEmpRate)
## [1] "data.frame"
str(unEmpRate)
## 'data.frame': 4311 obs. of 7 variables:
## $ iso3c : chr "ARB" "ARB" "ARB" "ARB" ...
## $ date : chr "2016" "2014" "2012" "2011" ...
## $ value : num 9.56 10.53 10.67 11.15 9.3 ...
## $ indicatorID: chr "SL.UEM.TOTL.NE.ZS" "SL.UEM.TOTL.NE.ZS" "SL.UEM.TOTL.NE.ZS" "SL.UEM.TOTL.NE.ZS" ...
## $ indicator : chr "Unemployment, total (% of total labor force) (national estimate)" "Unemployment, total (% of total labor force) (national estimate)" "Unemployment, total (% of total labor force) (national estimate)" "Unemployment, total (% of total labor force) (national estimate)" ...
## $ iso2c : chr "1A" "1A" "1A" "1A" ...
## $ country : chr "Arab World" "Arab World" "Arab World" "Arab World" ...
usaunEmpRate <- wb(country="USA", indicator = "SL.UEM.TOTL.NE.ZS")
head(usaunEmpRate[1:3])
## iso3c date value
## 2 USA 2018 3.8956
## 3 USA 2017 4.3552
## 4 USA 2016 4.8692
## 5 USA 2015 5.2800
## 6 USA 2014 6.1675
## 7 USA 2013 7.3749
unEmpRateUSA <- wb(country=c("USA"), indicator = "SL.UEM.TOTL.NE.ZS")
g <- ggplot(unEmpRateUSA, aes(x = as.numeric(date), y = value, color = country)) +
  geom_line() +
  labs(title = "Unemployment Rate in USA",
       x = "Date",
       y = "Percent")
g
The plot shows that the unemployment rate peaked at around 10% twice, around 1982 and around 2009, following the early 1980s recession and the Great Recession of 2008. Since both were severe global economic recessions, I want to compare the United States with countries in Europe and Asia, specifically the United Kingdom and Japan.
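The peak year can also be read off the data instead of the plot. The following is a minimal sketch, assuming a data frame shaped like the `wb()` result. The name `usa_sample` is hypothetical, and its six rows are copied from the `head()` output shown earlier, so the result only covers 2013 to 2018 and not the 1982 or 2009 peaks.

```r
# Stand-in for unEmpRateUSA: the six rows printed by head() earlier.
usa_sample <- data.frame(
  date  = c("2018", "2017", "2016", "2015", "2014", "2013"),
  value = c(3.8956, 4.3552, 4.8692, 5.2800, 6.1675, 7.3749),
  stringsAsFactors = FALSE
)

# wb() returns 'date' as character, so convert it before using it as a year.
usa_sample$year <- as.numeric(usa_sample$date)

# Row with the highest unemployment rate in the sample.
peak <- usa_sample[which.max(usa_sample$value), c("year", "value")]
print(peak)
```

Running the same two lines on the full `unEmpRateUSA` data frame would give the highest rate over the whole series.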
# Named after the three countries actually requested: USA, Japan, United Kingdom
unEmpRateUSJPNGBR <- wb(country = c("USA", "JPN", "GBR"), indicator = "SL.UEM.TOTL.NE.ZS")
h <- ggplot(unEmpRateUSJPNGBR, aes(x = as.numeric(date), y = value, color = country)) +
  geom_line() +
  labs(title = "Unemployment Rate Comparisons",
       x = "Date",
       y = "Percent")
h
The recession at the beginning of the 1980s hit the United Kingdom the hardest. The unemployment rate of the United Kingdom has gradually decreased since 1982, while the United States went through several cycles but also shows an overall decline. The United States exited the recession relatively early. Japan followed a different pattern from the United Kingdom and the United States: its unemployment rate peaked around 2001 at about 5%.
I often play tennis, so my keyword for this assignment is “Tennis”. I would like to see whether its popularity is growing or decreasing and to review periodic variations over the year.
library(gtrendsR)
library(reshape2)
googleData = gtrends(c("Tennis"), gprop = "web", time = "today 12-m")[[1]]
googleData = dcast(googleData, date ~ keyword + geo, value.var = "hits")
plot(googleData, type="l")
The graph peaks around September 1 to 7 because the US Open begins on the last Monday in August and ends on the second Sunday in September. The second peak, around July, appears because the other US Open Series tournaments begin in July.
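The scheduling rule stated above (last Monday in August to second Sunday in September) can be turned into concrete dates for a given year. This is a small sketch in base R. The function names are hypothetical, and the year 2019 in the usage line is only an assumed example, not a value taken from the data.

```r
# All days of one month as a Date vector.
days_in_month <- function(year, month, n_days) {
  seq(as.Date(sprintf("%d-%02d-01", year, month)), by = "day", length.out = n_days)
}

# Last Monday in August (POSIXlt wday: 0 = Sunday, 1 = Monday).
last_monday_august <- function(year) {
  days <- days_in_month(year, 8, 31)
  max(days[as.POSIXlt(days)$wday == 1])
}

# Second Sunday in September.
second_sunday_september <- function(year) {
  days <- days_in_month(year, 9, 30)
  days[as.POSIXlt(days)$wday == 0][2]
}

# Example year is an assumption for illustration.
us_open_start <- last_monday_august(2019)
us_open_end   <- second_sunday_september(2019)
print(c(us_open_start, us_open_end))
```

Comparing these two dates with the week of the highest `hits` value is one way to check whether the peak really falls inside the tournament.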
Now I wonder whether the other countries that host Grand Slam tournaments show different trends for the keyword “tennis”. There are four Grand Slam tournaments: the Australian Open, the French Open, Wimbledon (England), and the US Open. Therefore, I selected three more countries: Australia, France, and England (queried with the geo code GB for the United Kingdom).
google.trends = gtrends(c("Tennis"), geo = c("US", "AU", "GB", "FR"), gprop = "web", time = "today 12-m")
plot(google.trends)
Let’s check the dates of each Grand Slam.
Australia peaks in January, France peaks in June, and England peaks in July. In each country, interest is highest around its own Grand Slam.
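The peak month per country can be extracted in code rather than read from the plot. Below is a minimal sketch, assuming a long data frame with `date`, `hits`, and `geo` columns like the `interest_over_time` element returned by `gtrends()`. The helper `peak_month_by_geo` and the data frame `toy` are hypothetical, and the numbers in `toy` are placeholders, not Google Trends results.

```r
# Hypothetical helper: month ("01".."12") of the highest 'hits' value per 'geo'.
peak_month_by_geo <- function(df) {
  parts <- split(df, df$geo)
  vapply(parts,
         function(d) format(d$date[which.max(d$hits)], "%m"),
         character(1))
}

# Placeholder values only, NOT real Google Trends results.
toy <- data.frame(
  date = as.Date(c("2019-01-20", "2019-07-07", "2019-01-20", "2019-07-07")),
  hits = c(100, 20, 15, 100),
  geo  = c("AU", "AU", "GB", "GB"),
  stringsAsFactors = FALSE
)

peaks <- peak_month_by_geo(toy)
print(peaks)
```

With real data this could be called as `peak_month_by_geo(google.trends$interest_over_time)`, provided `hits` is numeric. If it arrives as text, it would need converting first.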
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(gtrendsR)
library(maps)
world <- map_data("world")
world %>%
  mutate(region = replace(region, region == "USA", "United States")) %>%
  mutate(region = replace(region, region == "UK", "United Kingdom")) -> world
tennis_world <- gtrends("tennis", time = "today 12-m")
# create data frame for plotting
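tennis_world$interest_by_country %>%
  filter(location %in% world$region) %>%
  # hits may arrive as text (e.g. "<1"), so convert before comparing
  mutate(region = location, hits = suppressWarnings(as.numeric(hits))) %>%
  filter(!is.na(hits), hits > 0) %>%
  select(region, hits) -> my_df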
ggplot() +
  geom_map(data = world,
           map = world,
           aes(x = long, y = lat, map_id = region),
           fill = "#ffffff", color = "#ffffff", size = 0.15) +
  geom_map(data = my_df,
           map = world,
           aes(fill = hits, map_id = region),
           color = "#ffffff", size = 0.15) +
  scale_fill_continuous(low = 'grey', high = 'red') +
  theme(axis.ticks = element_blank(),
        axis.text = element_blank(),
        axis.title = element_blank())