R Markdown

Businesses and individuals make better decisions when they work from reliable, up-to-date information. One way to obtain it is to collect financial data from sources such as Yahoo Finance and Google Trends, then apply time series analysis to reveal patterns that can inform those decisions.

This tutorial walks through retrieving financial data from Yahoo Finance and Google Trends with R and RStudio. We begin with why web scraping matters in financial analysis and how it can be used to extract relevant data from the web. We then cover the specifics of retrieving Yahoo Finance and Google Trends data and the steps involved in time series analysis.

We take a step-by-step approach, starting with installing the R packages needed for web scraping and time series analysis. Next, we demonstrate how to extract data from Yahoo Finance and Google Trends, and how to clean, format, and process the extracted data so that it is suitable for time series analysis.
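The retrieval and cleaning steps described above might look roughly like the sketch below. It assumes the quantmod and gtrendsR packages for the downloads; the helper names (`fetch_close`, `fetch_hits`, `to_monthly`), the start date, the keyword `"amazon"`, and the choice to average values by calendar month are all illustrative assumptions rather than the tutorial's confirmed code. The ticker `AMZN` and the column names `stock.close` and `hits` mirror the output shown later in this document. The two download helpers need network access and are only defined here, not called; the cleaning step runs on small made-up data.

```r
# Sketch only: helper names and default arguments are hypothetical.

# Download daily prices from Yahoo Finance and keep the closing price.
# Needs the quantmod package and network access; not called below.
fetch_close <- function(symbol = "AMZN", from = "2018-01-01") {
  prices <- quantmod::getSymbols(symbol, src = "yahoo", from = from,
                                 auto.assign = FALSE)
  data.frame(date = as.Date(zoo::index(prices)),
             stock.close = as.numeric(quantmod::Cl(prices)))
}

# Download Google Trends interest for a keyword.
# Needs the gtrendsR package and network access; not called below.
fetch_hits <- function(keyword = "amazon", time = "today+5-y") {
  iot <- gtrendsR::gtrends(keyword = keyword, time = time)$interest_over_time
  hits <- as.character(iot$hits)
  hits[hits == "<1"] <- "0"  # assumption: treat "<1" as zero interest
  data.frame(date = as.Date(iot$date), hits = as.numeric(hits))
}

# Collapse a series to one row per calendar month (mean of the values).
to_monthly <- function(df, value_col) {
  month <- as.Date(format(df$date, "%Y-%m-01"))
  out <- aggregate(df[[value_col]], by = list(date = month), FUN = mean)
  names(out)[2] <- value_col
  out
}

# Made-up data standing in for the two downloads.
daily <- data.frame(
  date = as.Date(c("2023-01-03", "2023-01-17", "2023-02-01", "2023-02-15")),
  stock.close = c(86, 96, 103, 101)
)
weekly <- data.frame(
  date = as.Date(c("2023-01-01", "2023-01-08", "2023-02-05", "2023-02-12")),
  hits = c(40, 44, 50, 54)
)

# Align both series on the month and join them by date.
stock <- merge(to_monthly(daily, "stock.close"),
               to_monthly(weekly, "hits"),
               by = "date")
print(stock)
```

With real data, `daily` and `weekly` would be replaced by the results of `fetch_close()` and `fetch_hits()`. Aggregating both series to the same monthly grid before joining avoids losing rows when the two sources report on different days.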

Finally, we apply time series techniques such as forecasting and trend analysis to the extracted financial data and show how the results can be visualized with R’s graphics capabilities. By the end of this tutorial, you should be able to collect financial data from the web and analyze it as a time series in R and RStudio.
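As a rough illustration of what trend analysis and forecasting can look like, the sketch below uses only base R's stats package. The series is synthetic, and both the moving-average trend and the Holt smoothing forecast are example techniques chosen for this sketch, not necessarily the ones used later in the tutorial. The smoothing weights `alpha` and `beta` are fixed, arbitrary values picked to keep the example deterministic.

```r
# Synthetic monthly closing prices stand in for real data (assumption).
set.seed(42)
n <- 48
close <- 100 + 0.8 * seq_len(n) + rnorm(n, sd = 3)

# Monthly time series starting in January 2019.
y <- ts(close, start = c(2019, 1), frequency = 12)

# Trend analysis: centred 12-month moving average (2x12 MA weights).
# stats::filter is spelled out because dplyr masks filter() when attached.
w <- c(0.5, rep(1, 11), 0.5) / 12
trend <- stats::filter(y, w, sides = 2)

# Forecasting: Holt's linear (double exponential) smoothing, no seasonality.
# alpha and beta are illustrative; omit them to let HoltWinters() estimate.
fit <- HoltWinters(y, alpha = 0.3, beta = 0.1, gamma = FALSE)
fc <- predict(fit, n.ahead = 6)

# Plot the series, its smoothed trend, and the six-month forecast.
plot(y, xlim = c(2019, 2023.5), ylim = range(y, fc), ylab = "Close")
lines(trend, col = "blue", lwd = 2)
lines(fc, col = "red", lty = 2)
```

The moving average leaves the first and last six months undefined because it needs a full window on both sides, which is why the blue trend line is shorter than the series itself.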

## [1] "AMZN"
## Joining with `by = join_by(date)`
##  Time-Series [1:61, 1:3] from 1 to 6: 82 89 96.3 88.8 94.7 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : NULL
##   ..$ : chr [1:3] "stock.close" "month" "hits"
## 
## Call:
## lm(formula = stock.close ~ hits, data = stock)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.790 -31.263  -1.593  29.399  46.464 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 128.40170    5.29101  24.268   <2e-16 ***
## hits          0.05422    0.15329   0.354    0.725    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 30.92 on 59 degrees of freedom
## Multiple R-squared:  0.002116,   Adjusted R-squared:  -0.0148 
## F-statistic: 0.1251 on 1 and 59 DF,  p-value: 0.7248
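The figures in this summary can be cross-checked by hand. In a simple linear regression with one predictor, the t value is the estimate divided by its standard error, the F-statistic equals the squared t value, and R-squared equals F / (F + residual degrees of freedom). The short sketch below recomputes these from the rounded numbers printed above, so small rounding differences are expected.

```r
# Values copied from the printed summary of lm(stock.close ~ hits).
est <- 0.05422   # slope estimate for hits
se  <- 0.15329   # standard error of the slope
df  <- 59        # residual degrees of freedom

t_val <- est / se                  # approx. 0.354
f_val <- t_val^2                   # approx. 0.1251
r2    <- f_val / (f_val + df)      # approx. 0.002116
p_val <- 2 * pt(-abs(t_val), df)   # two-sided p-value, approx. 0.725

round(c(t = t_val, F = f_val, R2 = r2, p = p_val), 4)
```

Read together, these numbers say that search interest (`hits`) explains about 0.2% of the variation in the closing price in this sample, and that the slope is not statistically distinguishable from zero at conventional significance levels.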

## `geom_smooth()` using formula = 'y ~ x'

References:

gtrendsR reference manual (CRAN) - https://cran.r-project.org/web/packages/gtrendsR/gtrendsR.pdf

gtrendsR 1.5.1 documentation (RDocumentation) - https://www.rdocumentation.org/packages/gtrendsR/versions/1.5.1