Stock Analysis

1. Stock price concepts

  1. What is a stock?

A stock is a type of security that signifies ownership in a corporation and represents a claim on part of the corporation’s assets and earnings. Stocks are bought and sold on stock exchanges.

  2. How can we trade a stock?

To trade a stock, an investor must first open a brokerage account. The investor then places buy or sell orders through the broker, which executes them on a stock exchange on the investor’s behalf. Common order types are market orders, limit orders, and stop orders. On the major US exchanges, regular trading hours run from 9:30 am to 4:00 pm Eastern Time.

  3. Explain the colors of stock trading boards.

On many stock trading boards and dashboards, colors indicate the current price movement: green for a price increase, red for a price decrease, and a neutral color (blue in some tools) for no significant change. Heat-map style displays often use a darker shade for a larger move. Conventions vary by market, so check the legend of the board you are reading.

  4. Why do some people invest in stocks?

Some people invest in stocks to earn dividends from profitable companies, to benefit from capital gains as stock prices appreciate, and to diversify their investment portfolios. Stocks are seen as a way to build wealth through ownership in growing businesses, and the potential for high long-term returns is the main incentive despite the risk of loss.

2. Daily stock prices

2.1 Obtaining data from R packages

  1. Write R code to obtain stock prices for 4 companies (NETFLIX, TESLA, VINFAST, GENERAL MOTORS) over the last 4 months.
library(tidyverse)
library(tidyquant)
library(quantmod)
library(plotly)
library(DT)
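# Yahoo Finance tickers for Netflix, Tesla, VinFast, and General Motors
symbols <- c("NFLX", "TSLA", "VFS", "GM")

# Download daily prices from 2024-01-18 up to the most recent trading day
prices <- tq_get(symbols, get = "stock.prices", from = "2024-01-18")

prices %>% datatable()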
  2. Write R code to plot stock prices for the same 4 companies (NETFLIX, TESLA, VINFAST, GENERAL MOTORS) over the last 4 months.
prices %>%
    group_by(symbol) %>% 
    ggplot(aes(x = date, y = adjusted, color = symbol)) +
    geom_line(linewidth = 1) +
    labs(
        title = "Daily Stock Prices",
        x = "", 
        y = "Adjusted Prices", 
        color = ""
    ) +
    facet_wrap(~ symbol, ncol = 2, scales = "free_y") +
    scale_y_continuous(labels = scales::dollar) +
    theme_tq() + 
    scale_color_tq()

3. Daily stock return

3.1 Hand calculation

  1. Compute by hand the daily arithmetic return for VINFAST in the last 5 trading days.
prices %>% filter(date <= "2024-05-17", date >= "2024-05-13", symbol =="VFS")
## # A tibble: 5 × 8
##   symbol date        open  high   low close   volume adjusted
##   <chr>  <date>     <dbl> <dbl> <dbl> <dbl>    <dbl>    <dbl>
## 1 VFS    2024-05-13  3.20  4.64  3.08  4.56 15396300     4.56
## 2 VFS    2024-05-14  4.65  4.99  4     4.11  8503000     4.11
## 3 VFS    2024-05-15  4.21  4.29  4     4.24  2090400     4.24
## 4 VFS    2024-05-16  4.23  4.87  4.23  4.42  5959300     4.42
## 5 VFS    2024-05-17  4.35  4.88  4.24  4.88  3143700     4.88
  2. Compute by hand the arithmetic return for VINFAST over the whole period of the last 5 trading days.
  3. Compute by hand the daily log return for VINFAST in the last 5 trading days.
  4. Compute by hand the log return for VINFAST over the whole period of the last 5 trading days.
# Daily log return: log(P_t / P_{t-1}) = log(P_t) - log(P_{t-1})
daily_log_returns_5d <- prices %>% filter(date <= "2024-05-18", date >= "2024-05-13", symbol == "VFS") %>%
    group_by(symbol) %>%
    mutate(daily_log_return = log(adjusted) - log(lag(adjusted, 1))) %>%
    select(symbol, date, adjusted, daily_log_return) %>%
    na.omit() # Drop the first day, which has no previous price

daily_log_returns_5d %>% datatable()
daily_arithmetic_return_5d <- prices %>% filter(date <= "2024-05-18", date >= "2024-05-13", symbol =="VFS") %>%
    group_by(symbol) %>%
    mutate(daily_arithmetic_return = (adjusted - lag(adjusted, 1)) / lag(adjusted, 1)) %>%
    select(symbol, date, adjusted, daily_arithmetic_return) %>%
    na.omit()

daily_arithmetic_return_5d %>% datatable()
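
The four hand calculations asked for in this section can be checked with a few lines of base R. This is a minimal sketch that retypes the five adjusted closing prices from the VFS table above. The variable names are illustrative and are not part of the original analysis.

```r
# Adjusted closing prices for VFS, 2024-05-13 to 2024-05-17 (from the table above)
adjusted <- c(4.56, 4.11, 4.24, 4.42, 4.88)

# Daily arithmetic return: (P_t - P_{t-1}) / P_{t-1}
daily_arithmetic <- diff(adjusted) / head(adjusted, -1)

# Daily log return: log(P_t / P_{t-1})
daily_log <- diff(log(adjusted))

# Whole-period returns use only the first and the last price
period_arithmetic <- tail(adjusted, 1) / adjusted[1] - 1
period_log <- log(tail(adjusted, 1) / adjusted[1])

# Consistency checks: daily log returns add up to the period log return,
# while daily arithmetic returns compound to the period arithmetic return
stopifnot(isTRUE(all.equal(sum(daily_log), period_log)))
stopifnot(isTRUE(all.equal(prod(1 + daily_arithmetic) - 1, period_arithmetic)))

round(daily_arithmetic, 4)
round(daily_log, 4)
round(c(period_arithmetic = period_arithmetic, period_log = period_log), 4)
```

Five prices give four daily returns, because the first day has no previous price to compare against.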

3.2 Code practice

  1. Write R code to compute and visualize the daily arithmetic return for VINFAST in the last 4 months.
daily_arithmetic_return_4m <- prices %>% filter(date < "2024-05-18", date >= "2024-01-18", symbol =="VFS") %>%
    group_by(symbol) %>%
    mutate(arithmetic_return = (adjusted - lag(adjusted, 1)) / lag(adjusted, 1)) %>%
    select(symbol, date, adjusted, arithmetic_return) %>%
    na.omit()

daily_arithmetic_return_4m %>% datatable()
daily_arithmetic_return_4m %>% ggplot(aes(x=date, y=arithmetic_return, col=symbol))+geom_line()+facet_grid(symbol~.)

  2. Write R code to compute and visualize the daily log return for VINFAST in the last 4 months.
daily_log_returns_4m <- prices %>% filter(date < "2024-05-18", date >= "2024-01-18", symbol == "VFS") %>%
    group_by(symbol) %>%
    mutate(daily_log_return=log(adjusted)-log(lag(adjusted,1))) %>% 
    select(symbol, date, adjusted,daily_log_return)%>%
    na.omit() # Remove the first row, whose daily_log_return is NA

daily_log_returns_4m %>% datatable()
daily_log_returns_4m %>% ggplot(aes(x=date, y=daily_log_return, col=symbol))+geom_line()+facet_grid(symbol~.)
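
The two return definitions used in this section are linked by log return = log(1 + arithmetic return). The sketch below uses made-up arithmetic returns (hypothetical values, not taken from the downloaded data) to show that the two measures nearly coincide for small moves and drift apart for large ones.

```r
# Illustrative arithmetic returns (hypothetical values, not from the data)
arithmetic_return <- c(0.001, 0.01, 0.05, -0.10, 0.25)

# Log return implied by each arithmetic return
log_return <- log(1 + arithmetic_return)

# Converting back recovers the arithmetic return
recovered <- exp(log_return) - 1
stopifnot(isTRUE(all.equal(recovered, arithmetic_return)))

# The gap between the two measures grows with the size of the move
data.frame(arithmetic_return, log_return, gap = arithmetic_return - log_return)
```

The log return is never larger than the arithmetic return, which matters for a volatile stock such as VFS where daily moves of several percent are common.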

4. Daily stock volatility

4.2 Code practice

Compute and visualize the sample daily volatility for 4 stocks NETFLIX, TESLA, VINFAST, GENERAL MOTORS in the last 4 months. Which one has the greatest volatility?

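# Daily log returns for the four stocks over the 4-month window
daily_log_returns_4s_4m <- prices %>% 
  filter(date < "2024-05-18", date >= "2024-01-18", symbol %in% c("NFLX", "TSLA", "VFS", "GM")) %>%
  group_by(symbol) %>%
  mutate(daily_log_return = log(adjusted) - log(lag(adjusted, 1))) %>%   
  select(symbol, date, adjusted, daily_log_return) %>%
  na.omit()

# Working copy used in the steps below (still grouped by symbol)
data <- daily_log_returns_4s_4m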

data %>% 
  select(symbol,date,daily_log_return)%>% 
  pivot_wider(names_from = symbol, values_from = daily_log_return) %>% ## showing 4 stocks by 4 columns
  na.omit()
## # A tibble: 84 × 5
##    date           NFLX     TSLA      VFS        GM
##    <date>        <dbl>    <dbl>    <dbl>     <dbl>
##  1 2024-01-19 -0.00487  0.00146  0.0619   0.0265  
##  2 2024-01-22  0.00570 -0.0161   0.0263  -0.00536 
##  3 2024-01-23  0.0133   0.00163 -0.00325 -0.00255 
##  4 2024-01-24  0.102   -0.00628 -0.0382  -0.0152  
##  5 2024-01-25  0.0310  -0.129    0.0349   0.0132  
##  6 2024-01-26  0.0149   0.00339 -0.0299   0.000569
##  7 2024-01-29  0.00937  0.0411   0.0282   0.00595 
##  8 2024-01-30 -0.0227   0.00345 -0.0182   0.0751  
##  9 2024-01-31  0.00224 -0.0227  -0.00837  0.0169  
## 10 2024-02-01  0.00601  0.00835 -0.00337  0.00180 
## # ℹ 74 more rows
data %>% ggplot(aes(x=date, y=daily_log_return, col=symbol))+geom_line()+facet_grid(symbol~.)

data%>% 
  mutate(diff=daily_log_return-mean(daily_log_return)) %>% 
  summarize(volatility = sqrt(sum(diff^2) / (n() - 1)))
## # A tibble: 4 × 2
##   symbol volatility
##   <chr>       <dbl>
## 1 GM         0.0158
## 2 NFLX       0.0220
## 3 TSLA       0.0364
## 4 VFS        0.0640
VFS has the greatest sample daily volatility over this period (about 0.064), followed by TSLA (0.036), NFLX (0.022), and GM (0.016).
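
The formula inside `summarize()` above is the sample standard deviation, so base R's `sd()` gives the same number. The sketch below checks this on the first ten VFS log returns, retyped from the table printed earlier in this section. Because it uses only ten rounded values, its result will differ from the full-sample figure of 0.0640.

```r
# First ten daily log returns of VFS, copied from the table printed above
vfs_log_return <- c(0.0619, 0.0263, -0.00325, -0.0382, 0.0349,
                    -0.0299, 0.0282, -0.0182, -0.00837, -0.00337)

# Manual formula, as in the summarize() call: sqrt(sum of squared deviations / (n - 1))
deviation <- vfs_log_return - mean(vfs_log_return)
volatility_manual <- sqrt(sum(deviation^2) / (length(vfs_log_return) - 1))

# Built-in sample standard deviation
volatility_sd <- sd(vfs_log_return)

stopifnot(isTRUE(all.equal(volatility_manual, volatility_sd)))
volatility_sd
```

In the grouped pipeline, `summarize(volatility = sd(daily_log_return))` would therefore be an equivalent and shorter way to write the same step.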

5. Daily covariance between two stocks

Consider 4 stocks NETFLIX, TESLA, VINFAST, GENERAL MOTORS in the last 4 months.

  1. Draw a scatter plot of the log returns for each pair of stocks.
data1=daily_log_returns_4s_4m %>% select(symbol,date,daily_log_return)%>% pivot_wider(names_from = symbol, values_from = daily_log_return)

data1
## # A tibble: 84 × 5
##    date           NFLX     TSLA      VFS        GM
##    <date>        <dbl>    <dbl>    <dbl>     <dbl>
##  1 2024-01-19 -0.00487  0.00146  0.0619   0.0265  
##  2 2024-01-22  0.00570 -0.0161   0.0263  -0.00536 
##  3 2024-01-23  0.0133   0.00163 -0.00325 -0.00255 
##  4 2024-01-24  0.102   -0.00628 -0.0382  -0.0152  
##  5 2024-01-25  0.0310  -0.129    0.0349   0.0132  
##  6 2024-01-26  0.0149   0.00339 -0.0299   0.000569
##  7 2024-01-29  0.00937  0.0411   0.0282   0.00595 
##  8 2024-01-30 -0.0227   0.00345 -0.0182   0.0751  
##  9 2024-01-31  0.00224 -0.0227  -0.00837  0.0169  
## 10 2024-02-01  0.00601  0.00835 -0.00337  0.00180 
## # ℹ 74 more rows
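# data1 has no missing rows, so every observation is plotted
plot(data1$NFLX, data1$TSLA, xlab = "NFLX", ylab = "TSLA")

plot(data1$NFLX, data1$VFS, xlab = "NFLX", ylab = "VFS")

plot(data1$NFLX, data1$GM, xlab = "NFLX", ylab = "GM")

plot(data1$VFS, data1$TSLA, xlab = "VFS", ylab = "TSLA")

plot(data1$VFS, data1$GM, xlab = "VFS", ylab = "GM")

plot(data1$GM, data1$TSLA, xlab = "GM", ylab = "TSLA")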

  2. Compute the sample daily covariance for each pair of the 4 stocks.
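# Same 4-month window as above; data1 starts on 2024-01-19, so the lower bound does not drop any rows
data2 <- data1 %>% filter(date < "2024-05-18", date >= "2024-01-18")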
data2 %>% datatable()
# Sample daily covariance of NFLX and VFS
# Method 1: from the definition
sum((data2$NFLX - mean(data2$NFLX)) * (data2$VFS - mean(data2$VFS))) / (nrow(data2) - 1)
## [1] 0.0002011746
# Method 2: built-in cov()
cov(data2$NFLX, data2$VFS)
## [1] 0.0002011746
# Sample daily covariance of NFLX and TSLA
# Method 1:
sum((data2$NFLX - mean(data2$NFLX)) * (data2$TSLA - mean(data2$TSLA))) / (nrow(data2) - 1)
## [1] 3.499522e-05
# Method 2:
cov(data2$NFLX, data2$TSLA)
## [1] 3.499522e-05
# Sample daily covariance of NFLX and GM
# Method 1:
sum((data2$NFLX - mean(data2$NFLX)) * (data2$GM - mean(data2$GM))) / (nrow(data2) - 1)
## [1] 3.798974e-06
# Method 2:
cov(data2$NFLX, data2$GM)
## [1] 3.798974e-06
# Sample daily covariance of VFS and GM
# Method 1:
sum((data2$VFS - mean(data2$VFS)) * (data2$GM - mean(data2$GM))) / (nrow(data2) - 1)
## [1] 0.0002024637
# Method 2:
cov(data2$VFS, data2$GM)
## [1] 0.0002024637
# Sample daily covariance of VFS and TSLA
# Method 1:
sum((data2$VFS - mean(data2$VFS)) * (data2$TSLA - mean(data2$TSLA))) / (nrow(data2) - 1)
## [1] 0.0006529491
# Method 2:
cov(data2$VFS, data2$TSLA)
## [1] 0.0006529491
# Sample daily covariance of GM and TSLA
# Method 1:
sum((data2$GM - mean(data2$GM)) * (data2$TSLA - mean(data2$TSLA))) / (nrow(data2) - 1)
## [1] 8.277625e-05
# Method 2:
cov(data2$GM, data2$TSLA)
## [1] 8.277625e-05
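
All six pairwise covariances can also be obtained in one call, because `cov()` applied to a matrix (or a numeric data frame) returns the full covariance matrix. The sketch below retypes the first ten rows of the log-return table printed earlier, so its numbers will not match the full 84-row results above. The object names `returns` and `cov_matrix` are illustrative.

```r
# First ten rows of the log-return table printed above (NFLX, TSLA, VFS, GM)
returns <- cbind(
  NFLX = c(-0.00487, 0.00570, 0.0133, 0.102, 0.0310,
           0.0149, 0.00937, -0.0227, 0.00224, 0.00601),
  TSLA = c(0.00146, -0.0161, 0.00163, -0.00628, -0.129,
           0.00339, 0.0411, 0.00345, -0.0227, 0.00835),
  VFS  = c(0.0619, 0.0263, -0.00325, -0.0382, 0.0349,
           -0.0299, 0.0282, -0.0182, -0.00837, -0.00337),
  GM   = c(0.0265, -0.00536, -0.00255, -0.0152, 0.0132,
           0.000569, 0.00595, 0.0751, 0.0169, 0.00180)
)

# cov() on a matrix returns the sample covariance of every pair of columns
cov_matrix <- cov(returns)

# Off-diagonal entries match the pairwise calls; the diagonal holds the variances
stopifnot(isTRUE(all.equal(cov_matrix["NFLX", "VFS"],
                           cov(returns[, "NFLX"], returns[, "VFS"]))))
stopifnot(isTRUE(all.equal(unname(diag(cov_matrix)),
                           unname(apply(returns, 2, var)))))

round(cov_matrix, 6)
```

With the objects from this section, a call along the lines of `cov(data2[, c("NFLX", "TSLA", "VFS", "GM")])` should produce the same six values as the pairwise computations above.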

6. Stock selection

Choose 3 Vietnamese stocks that you would like to invest in. Visualize and explain the stocks’ performance (return, volatility, and covariance between them) over the last 3 years.

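# Tickers as entered by the author. Assumption to verify: on Yahoo Finance,
# Vietnamese listings may need an exchange suffix (for example "FPT.VN");
# without it these symbols may resolve to unrelated listings.
symbols <- c("FPT", "PVD", "BMP")
prices1 <- tq_get(symbols, get = "stock.prices", from = "2021-01-01")
prices1 %>% datatable()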
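
Once the prices are downloaded, the return, volatility, and covariance asked for in this section follow the same steps as sections 3 to 5. The sketch below wraps them in a hypothetical base R helper, `performance_summary()`, which is not part of the original analysis. It assumes a long data frame with `symbol`, `date`, and `adjusted` columns (the layout returned by `tq_get()`) and that every symbol has prices on the same dates. It is demonstrated on made-up toy prices rather than real market data.

```r
# Hypothetical helper: summary statistics of daily log returns per symbol.
# Assumes columns symbol, date, adjusted and identical dates for all symbols.
performance_summary <- function(prices_long) {
  # Order by symbol and date so that diff() compares consecutive days
  prices_long <- prices_long[order(prices_long$symbol, prices_long$date), ]

  # Daily log returns for each symbol
  returns <- lapply(split(prices_long$adjusted, prices_long$symbol),
                    function(p) diff(log(p)))

  # One column per symbol (valid only if all symbols share the same dates)
  returns_matrix <- do.call(cbind, returns)

  list(
    mean_daily_return = colMeans(returns_matrix),
    daily_volatility  = apply(returns_matrix, 2, sd),
    covariance        = cov(returns_matrix)
  )
}

# Toy data with made-up prices for two hypothetical symbols
toy <- data.frame(
  symbol   = rep(c("AAA", "BBB"), each = 4),
  date     = rep(as.Date("2024-01-01") + 0:3, times = 2),
  adjusted = c(10, 11, 10.5, 12, 20, 19, 21, 22)
)

result <- performance_summary(toy)
result
```

Applied to the downloaded data, the call would be `performance_summary(prices1)`, provided the three symbols resolve to the intended listings and share the same trading dates.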