For this task, I am extracting the number of scientific and technical journal articles by country from the World Bank. This indicator counts articles published in the fields of physics, biology, chemistry, mathematics, clinical medicine, biomedical research, engineering and technology, and earth and space sciences. I chose five countries (the US, Canada, China, India, and Germany) to see the trend in publications over the years. It would be interesting to see the number of publications in each field as well, and it would be great if publications in economics were included.
# Packages needed
library(wbstats)
library(ggplot2)
# extract data from the World Bank API from 2000 to 2018 by country
ForData <- wb_data(country = c("USA", "Canada", "China", "India", "Germany"),
                   indicator = "IP.JRN.ARTC.SC",
                   start_date = 2000, end_date = 2018)
str(ForData)
## tibble [95 x 9] (S3: tbl_df/tbl/data.frame)
## $ iso2c : chr [1:95] "CA" "CA" "CA" "CA" ...
## $ iso3c : chr [1:95] "CAN" "CAN" "CAN" "CAN" ...
## $ country : chr [1:95] "Canada" "Canada" "Canada" "Canada" ...
## $ date : num [1:95] 2000 2001 2002 2003 2004 ...
## $ IP.JRN.ARTC.SC: num [1:95] 33854 33374 36153 38040 41974 ...
## ..- attr(*, "label")= chr "Scientific and technical journal articles"
## $ unit : chr [1:95] NA NA NA NA ...
## $ obs_status : chr [1:95] NA NA NA NA ...
## $ footnote : chr [1:95] NA NA NA NA ...
## $ last_updated : Date[1:95], format: "2021-03-19" "2021-03-19" ...
# Checking min and max number of overall publications
min(ForData$IP.JRN.ARTC.SC)
## [1] 21770.72
max(ForData$IP.JRN.ARTC.SC)
## [1] 528263.2
# plot the data
ggplot(ForData) +
  geom_point(aes(x = date, y = IP.JRN.ARTC.SC, color = country)) +
  labs(title = "Number of scientific publications by country",
       x = "Date",
       y = "Number of publications") +
  scale_y_continuous(breaks = seq(20000, 550000, 50000))
The results show an increase in the number of publications for all five countries over time. China stands out in particular: it has the highest number of publications, and its count increased by more than 460,000 over the 18-year period.
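The size of that increase can be checked directly from the data rather than read off the plot. Below is a minimal sketch, assuming a data frame shaped like ForData (a country column, a date column, and one value column). The helper name publication_change and the toy numbers are made up for illustration and are not World Bank figures.

```r
# Hypothetical helper: change in a value column between two years, per country
publication_change <- function(df, value_col, from_year, to_year) {
  start <- df[df$date == from_year, c("country", value_col)]
  end   <- df[df$date == to_year,   c("country", value_col)]
  # Join the two years by country; the shared value column gets suffixes
  out <- merge(start, end, by = "country", suffixes = c("_start", "_end"))
  out$change <- out[[paste0(value_col, "_end")]] - out[[paste0(value_col, "_start")]]
  # Largest increase first
  out[order(-out$change), c("country", "change")]
}

# Toy data (made-up numbers), same layout as ForData
toy <- data.frame(
  country  = c("A", "A", "B", "B"),
  date     = c(2000, 2018, 2000, 2018),
  articles = c(100, 450, 200, 260)
)
publication_change(toy, "articles", 2000, 2018)
```

With the real data, the call would presumably be publication_change(ForData, "IP.JRN.ARTC.SC", 2000, 2018).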
For this task, I am revisiting my first assignment on the WallStreetBets stocks. Starting in January 2021, the prices of stocks such as GameStop (GME) and AMC Theatres (AMC) rose sharply without any good news and beyond the actual value of the companies.
In response, some trading platforms such as Robinhood and Webull restricted how many shares of these stocks could be traded. Many individual investors and even politicians questioned this action. The platforms then lifted their restrictions, which led to a jump in stock prices and drew more investors to those stocks.
So, I am looking at the number of searches for the terms ‘Robinhood’ and ‘Webull’ and for the WallStreetBets stocks GME and AMC during that period.
# part 2: These packages are needed
library(gtrendsR)
library(reshape2)
library(dplyr)
# Extract Google Trends data
googleData = gtrends(c("Robinhood", "GME", "AMC", "webull"), gprop = "web",
                     geo = c("US"), time = "2021-01-01 2021-03-18")[[1]]
googleData = dcast(googleData, date ~ keyword + geo, value.var = "hits")
str(googleData)
## 'data.frame': 77 obs. of 5 variables:
## $ date : POSIXct, format: "2021-01-01" "2021-01-02" ...
## $ AMC_US : chr "3" "3" "3" "2" ...
## $ GME_US : chr "<1" "<1" "<1" "<1" ...
## $ Robinhood_US: chr "1" "3" "3" "3" ...
## $ webull_US : chr "<1" "<1" "<1" "1" ...
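The str() output shows that every keyword column is stored as character, because Google Trends reports very low values as the string "<1". Below is a minimal sketch of one way to turn such a column into numbers. The helper name hits_to_numeric is hypothetical, and mapping "<1" to 0 is simply the choice made in this analysis (a value such as 0.5 would be another defensible choice).

```r
# Hypothetical helper: convert a Google Trends hits column to numeric,
# replacing the "<1" marker with a chosen value (0 by default)
hits_to_numeric <- function(x, below_one = 0) {
  x <- as.character(x)
  x[which(x == "<1")] <- as.character(below_one)  # which() skips NA entries
  as.numeric(x)
}

hits_to_numeric(c("3", "<1", "12", "<1"))
```

Applied to every keyword column at once, this could look like googleData[-1] <- lapply(googleData[-1], hits_to_numeric), assuming date is the first column as in the output above.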
# Make "<1" values equal to 0 (they only occur in GME and webull) and convert every
# keyword column to numeric, since str() shows they are all stored as character
trendCols <- setdiff(names(googleData), "date")
googleData[trendCols] <- lapply(googleData[trendCols],
                                function(x) as.numeric(ifelse(x == "<1", 0, x)))
# Plotting search trends
plot(googleData$date, googleData$GME_US, type = "l", col = "red",
     ylim = c(0, 100), # Google Trends hits are a relative index from 0 to 100
     ylab = "Relative search interest", xlab = "Date",
     main = "Searches of financial companies and wall street bets over time")
lines(googleData$date, googleData$AMC_US, type = "l", col = "blue")
lines(googleData$date, googleData$Robinhood_US, type = "l", col = "darkgreen")
lines(googleData$date, googleData$webull_US, type = "l", col = "black")
legend("topright", legend = c("GME", "AMC", "Robinhood", "Webull"),
       col = c("red", "blue", "darkgreen", "black"), lwd = 1, cex = 0.8)
As the graph shows, GME and AMC are searched the most, followed by Robinhood. Webull searches are not as high as Robinhood searches, probably because the majority of individual investors use Robinhood for trading. It is interesting to see, especially in the first three waves, that the stock searches are followed by the Robinhood searches.
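Whether one search series really trails another can be checked with a cross-correlation instead of by eye. Below is a minimal sketch using ccf() from base R's stats package on synthetic series (random numbers, not the Google Trends data). The names leader and follower are made up, and follower is constructed to trail leader by two steps so that the expected answer is known in advance.

```r
set.seed(1)
base_series <- rnorm(62)
leader   <- base_series[3:62]  # leader[t] = base_series[t + 2]
follower <- base_series[1:60]  # follower[t] = leader[t - 2], so it trails by two steps

# ccf(x, y) estimates cor(x[t + k], y[t]); a peak at a negative k means x leads y
cc <- ccf(leader, follower, lag.max = 5, plot = FALSE)
best_lag <- cc$lag[which.max(cc$acf)]
best_lag
```

Here the peak falls at lag -2, which recovers the two-step delay built into the toy series. With the real data, the same call could be run on googleData$GME_US and googleData$Robinhood_US once those columns are numeric.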
Next, I looked into the relationship between the search trends of the WallStreetBets stocks and their prices (read directly from Yahoo Finance).
# packages required
library(tidyquant)
library(dplyr)
# read GME and AMC price data from Yahoo Finance and use the adjusted price
gme <- tq_get("GME",
              from = "2021-01-01",
              to = "2021-03-19",
              get = "stock.prices")
max(gme$adjusted) # maximum price
## [1] 347.51
amc <- tq_get("AMC",
              from = "2021-01-01",
              to = "2021-03-19",
              get = "stock.prices")
min(amc$adjusted) # minimum price
## [1] 1.98
# Note: There is no weekend price data
# For plotting later, drop the weekend rows from the Google Trends data
googleData$date <- as.Date(format(googleData$date, "%Y-%m-%d")) # match the Date type of gme$date
googleData <- googleData %>% semi_join(gme, by = "date") # keeps only dates that also appear in gme
gme$googGME <- googleData$GME_US # include trend data in the price data frame
gme$googAMC <- googleData$AMC_US
# plot the graph
par(mfrow = c(1, 2))
plot(gme$date, gme$adjusted, type = "l", col = "red", ylim = c(1, 350),
     ylab = "GME trend and price", xlab = "Date")
lines(gme$date, gme$googGME, type = "l", col = "blue")
legend("topright", legend = c("stock prices", "google trend"),
       col = c("red", "blue"), lwd = 1, cex = 0.5)
plot(amc$date, amc$adjusted, type = "l", col = "red", ylim = c(1, 90),
     ylab = "AMC trend and price", xlab = "Date")
lines(gme$date, gme$googAMC, type = "l", col = "blue")
legend("topright", c("stock prices", "google trend"),
       cex = 0.5, col = c("red", "blue"), lwd = 1)
We can see that the Google Trends series closely tracks the stock price fluctuations. Both AMC and GME received increasing attention from investors, and the number of searches and the prices show the same pattern: when the price skyrocketed and then dropped quickly, so did the search trend. This suggests that Google Trends data could also be useful for forecasting stock market movements, although the plots here only show that the two series move together.
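To put a number on how closely the two series move together, the trend and price data can be joined by date and then correlated. Below is a minimal sketch with made-up values. The helper name trend_price_cor and the column names hits and adjusted in the toy data frames are assumptions for illustration. Joining by date, rather than copying columns by position, also guards against the two tables having different rows.

```r
# Hypothetical helper: rank correlation between search interest and price
trend_price_cor <- function(trend, price, method = "spearman") {
  both <- merge(trend, price, by = "date")  # inner join: keeps only shared dates
  cor(both$hits, both$adjusted, method = method)
}

# Toy data (made-up numbers): daily trend values, prices on trading days only
trend <- data.frame(
  date = as.Date("2021-01-04") + 0:6,
  hits = c(1, 2, 5, 40, 100, 60, 20)
)
price <- data.frame(
  date     = as.Date("2021-01-04") + c(0:4, 7),  # no rows for the weekend (Jan 9-10)
  adjusted = c(17, 18, 20, 150, 340, 200)
)
trend_price_cor(trend, price)
```

A high correlation only says that the series move together on the same days; it does not show that searches come before price changes.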