google and second set to shares.#first data
google = read.csv("google_stock_price.csv")
#second data
shares = read.csv("alphabet_quarterly_shares_outstanding.csv")
dim function, we know
that the google has 4842 observations and 7 variables.
Meanwhile shares has 59 observations and 2 variables.#first data
dim(google)
## [1] 4842 7
#second data
dim(shares)
## [1] 59 2
google, we have 7 observations which are:
Date (character): This column represents the
calender date when the data about the stock is recorded.
Open (numeric): This column represents the first
recorded price of the stock for a trading session.
High(numeric): The high price represents the highest
traded price of the stock during a given trading session. It reflects
the peak value that the stock reached during the day.
Low (numeric): The low price is the lowest traded
price of the stock during a specific trading session. It indicates the
minimum value that the stock reached during the day.
Close (numeric): The closing price is the last
traded price of the stock at the end of a trading session. It reflects
the final value at which the stock was traded before the market
closes.
Adj Close (numeric): The adjusted closing price
accounts for corporate actions, such as dividends, stock splits, and new
stock offerings, that may affect the stock’s price but are not directly
related to its performance. The adjusted close is often used to assess
the stock’s performance over time.
Volume (integer): Volume represents the total number
of shares traded during a specific time period. It gives an indication
of the level of market activity and liquidity for that stock. High
volume often suggests increased investor interest, while low volume may
indicate less active trading.
In our second data, shares, we have 2 observations which
are:
Date (character): This column represents the
calender date when the data about the shares outstanding is
recorded.
Shares (numeric): This column represents the
company’s shares outstanding at that time. Shares outstanding refer to
the total number of a company’s shares of stock that are currently held
by shareholders, including both institutional investors and individual
investors.
How has the stock market price for Google fluctuated over
time?
What does the historical annual stock price data for Alphabet
reveal? Which year boasts the highest or lowest average stock
price?
What is the annual price change in Alphabet’s stock price,
and which year exhibits the widest range (lowest to highest)?
How does Alphabet’s stock price differ (highest minus lowest)
per day for the entire 2023 records, and when did the highest range
occur?
During which periods of the year 2023 is the closing price
higher than the opening price, and vice versa?
Is there a prevailing trend where the closing price
consistently surpasses the opening price, or vice versa?
How does Alphabet’s moving average change over 50, 100, and
200 days, and does it mirror the movement of the close price?
How has Alphabet’s cumulative return evolved over
time?
What is Alphabet’s average monthly return, and which month
experiences the highest and lowest average return?
How does Alphabet’s average daily return vary within a week,
and which day witnesses the highest and lowest average
return?
What is Alphabet’s average daily trading volume within a
week, and which day experiences the highest or lowest average
volume?
What are the average daily highest and lowest prices for
Alphabet within a week, and which day exhibits the highest average high
price and the lowest average low price?
How has the stock market volume for Google changed over
time?
How do the variables in our dataset correlate with each
other, and are there any variables that significantly influence one
another?
How has Alphabet’s volatility changed over time, and during
which period did it reach its peak?
How is Alphabet’s volatility distributed in a density curve,
and does it adhere to a normal distribution?
To what extent does Alphabet’s volatility align with a normal
distribution?
What is Alphabet’s shares outstanding per quarter from 2009
to 2023, and when does the shares outstanding reach its peak?
How does Alphabet’s market capitalization change over time, and when does it reach its peak?
#first data
knitr::kable(summary(google), caption = 'Table 4.1 Summary of "google"')
| Date | Open | High | Low | Close | Adj.Close | Volume | |
|---|---|---|---|---|---|---|---|
| Length:4842 | Min. : 2.47 | Min. : 2.534 | Min. : 2.39 | Min. : 2.491 | Min. : 2.491 | Min. :1.584e+05 | |
| Class :character | 1st Qu.: 12.82 | 1st Qu.: 12.949 | 1st Qu.: 12.70 | 1st Qu.: 12.828 | 1st Qu.: 12.828 | 1st Qu.:2.867e+07 | |
| Mode :character | Median : 26.47 | Median : 26.668 | Median : 26.26 | Median : 26.470 | Median : 26.470 | Median :6.202e+07 | |
| NA | Mean : 41.16 | Mean : 41.602 | Mean : 40.75 | Mean : 41.181 | Mean : 41.181 | Mean :1.192e+08 | |
| NA | 3rd Qu.: 57.20 | 3rd Qu.: 57.730 | 3rd Qu.: 56.63 | 3rd Qu.: 57.315 | 3rd Qu.: 57.315 | 3rd Qu.:1.470e+08 | |
| NA | Max. :151.86 | Max. :152.100 | Max. :149.89 | Max. :150.709 | Max. :150.709 | Max. :1.651e+09 |
#second data
knitr::kable(summary(shares), caption = 'Table 4.2 Summary of "shares"')
| Date | Shares | |
|---|---|---|
| Length:59 | Length:59 | |
| Class :character | Class :character | |
| Mode :character | Mode :character |
data.frame to tibble for
simplicity. We also transform variable Date type in
google and shares from character type to Date
type. Next, we transform variable Shares in
shares to match the unit that we want. Next, we create
google_volume, a derivative of google that
only contains the date and volume of our stock price. Meanwhile
google_price excludes volume, containing only different
types of stock prices, and google_2023 contains only stock
prices recorded in 2023.#first data: turn data type into tibble
google = as_tibble(google)
#first data: variable transformation
google = google %>%
mutate(Date = ymd(Date))
#first data: isolate variable "Date" and "Volume"
google_volume = google %>%
select(Date, Volume)
#first data: containing variables other than "Volume"
google_price = google %>%
select(-c(Volume)) %>%
pivot_longer(-Date,
names_to="Category",
values_to="Price")
#first data: containing only 2023 price records
google_2023 = google %>%
filter(year(Date) == 2023)
#second data: turn data type into tibble
shares = as_tibble(shares)
#second data: variable transformation
shares <- shares %>%
mutate(Shares = as.numeric(gsub(",", "", Shares)) / 10**3) %>%
mutate(Date = mdy(Date))
google, we found that the
variable Close and Adj. Close have the same
value. Hence we decided to remove the latter.#check identical values
identical(google$Close, google$Adj.Close)
## [1] TRUE
#remove variable
google = google %>%
select(-Adj.Close)
#first data
identical(length(unique(google$Date)),
nrow(google))
## [1] TRUE
#second data
identical(length(unique(shares$Date)),
nrow(shares))
## [1] TRUE
par(mfrow=c(1,6))
boxplot(google$Open, main = "'Open'")
boxplot(google$Close, main = "'Close'")
boxplot(google$High, main = "'High'")
boxplot(google$Low, main = "'Low'")
boxplot(google$Volume, main = "'Volume'")
boxplot(shares$Shares, main = "'Shares'")
| Throughout our data processing, we use these libraries and fonts to
improve our data visualization.
library(dslabs)
library(dplyr)
library(tidyverse)
library(lubridate)
library(gridExtra)
library(tidyr)
library(plotly)
library(extrafont)
library(reshape2)
font_import(pattern = "lmroman*")
## Importing fonts may take a few minutes, depending on the number of fonts and the speed of the system.
## Continue? [y/n]
loadfonts()
library("TTR")
| Problem: How has the stock market price for
Google fluctuated over time?
| Explanation: The stock price represents the
monetary value assigned to a single share of a company’s stock,
determined by market forces such as supply and demand. Each year label
on the x-axis marks the first recorded closing price at the beginning of
that specific year.
| Brief Analysis: Our plot reveals a
consistent upward trajectory over the years, with a notable peak in late
2021, marking the highest closing price. However, a substantial decline
is observed from the commencement of 2022 until 2023. Interestingly, the
stock demonstrates a renewed upward trend from the early months of 2023,
showcasing dynamic market movements and suggesting a resilient growth
pattern amid fluctuations.
a1 = google_price %>%
filter(Category == "Close") %>%
ggplot(aes(Date, Price)) +
geom_line(col = "olivedrab")
a1 +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y") +
labs(y = "Price (in US Dollar) \n",
x = "\n Date",
title = "Alphabet Stock Market Price\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(axis.text.x = element_text(angle = 30,
hjust = 1),
text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"))
| Problem: What does the historical annual
stock price data for Alphabet reveal? Which year boasts the highest or
lowest average stock price?
| Explanation: The table provides a
comprehensive overview of Alphabet’s historical annual stock price data,
featuring key metrics for each year. The “Year” column indicates the
respective calendar year. “Average Stock Price” represents the mean
value of stock prices throughout the year. “Year Open” and “Year Close”
signify the opening and closing prices for the year, respectively. “Year
High” and “Year Low” highlight the maximum and minimum stock prices
during the year. The “Annual Change (%)” column reflects the percentage
change in stock price from the opening to the closing of the year. This
table is crucial for investors, analysts, and stakeholders as it
encapsulates vital information for evaluating historical stock
performance, identifying trends, and making informed investment
decisions.
| Brief Analysis: The analysis of Alphabet’s
historical annual stock price data reveals significant trends and
variations. The year 2021 stands out with the highest average stock
price, while 2005 marks the lowest. Notably, the year 2009 experienced a
remarkable annual change, reaching almost 101 percent compared to the
previous year. The annual change for each year showcases substantial
variability, reflecting the dynamic nature of market conditions. Recent
years, particularly 2022, witnessed a decline in annual change after
three consecutive years of positive shifts.
t1 = google %>%
mutate(Year = year(Date),
Month = month(Date),
Date_in_a_month = mday(Date),
Date_in_a_year = yday(Date)) %>%
filter(!Year %in% c(2004, 2023)) %>%
group_by(Year) %>% #closing price only state in the table
summarize("Average Stock Price" = mean(Close),
Year_Open = Open[Date == min(Date)],
Year_High = max(High),
Year_Low = min(Low),
Year_Close = Close[Date == max(Date)],
Annual_Change = (Year_Close - Year_Open) / Year_Open * 100) %>%
arrange(desc(Year))
knitr::kable(t1,
col.names = c("Year",
"Average Stock Price",
"Year Open",
"Year High",
"Year Low",
"Year Close",
"Annual Change (%)"),
caption = 'Table 2.1 Alphabet Historical Annual Stock Price Data')
| Year | Average Stock Price | Year Open | Year High | Year Low | Year Close | Annual Change (%) |
|---|---|---|---|---|---|---|
| 2022 | 115.193719 | 144.475494 | 152.10001 | 83.449997 | 88.730003 | -38.584738 |
| 2021 | 125.530687 | 87.876999 | 151.85001 | 84.949997 | 144.679504 | 64.638650 |
| 2020 | 74.070191 | 67.077499 | 92.36000 | 50.676800 | 87.594002 | 30.586266 |
| 2019 | 59.419653 | 50.828499 | 68.25000 | 50.703499 | 66.850998 | 31.522668 |
| 2018 | 55.661257 | 52.417000 | 63.69450 | 48.505501 | 51.780499 | -1.214303 |
| 2017 | 46.089042 | 38.940498 | 53.92450 | 38.790001 | 52.320000 | 34.358834 |
| 2016 | 37.174335 | 37.150002 | 40.83400 | 33.153000 | 38.591000 | 3.878864 |
| 2015 | 30.100284 | 26.378079 | 38.99900 | 24.311253 | 37.944000 | 43.846718 |
| 2014 | 27.990164 | 27.782366 | 30.60728 | 24.383057 | 26.247936 | -5.523034 |
| 2013 | 22.023519 | 17.918339 | 27.92035 | 17.323069 | 27.913124 | 55.779643 |
| 2012 | 16.010401 | 16.262545 | 19.28721 | 13.861045 | 17.618462 | 8.337668 |
| 2011 | 14.171237 | 14.856315 | 16.10862 | 11.781341 | 16.087200 | 8.285268 |
| 2010 | 13.340574 | 15.615220 | 15.71236 | 10.800268 | 14.793799 | -5.260385 |
| 2009 | 10.951161 | 7.686190 | 15.59131 | 7.042354 | 15.441621 | 100.900844 |
| 2008 | 11.577903 | 17.257067 | 17.36915 | 6.159413 | 7.662529 | -55.597732 |
| 2007 | 13.418465 | 11.606496 | 18.61124 | 10.884203 | 17.222446 | 48.386271 |
| 2006 | 10.241246 | 10.523555 | 12.77711 | 8.257798 | 11.469011 | 8.984193 |
| 2005 | 6.918047 | 4.916571 | 11.11359 | 4.298140 | 10.332770 | 110.162124 |
| Problem: What is the annual price change in
Alphabet’s stock price, and which year exhibits the widest range (lowest
to highest)?
| Explanation: Plotting Alphabet’s stock
market annual price range provides a concise overview of how the stock’s
price has fluctuated throughout each year. This is important for
investors and analysts to gauge the volatility and stability of
Alphabet’s stock over time.
| Brief Analysis: The analysis of Alphabet’s
stock market annual price range reveals distinct trends over the years.
Initially, the range between the highest and lowest prices remained
relatively stable during the early years of trading. However, in 2021,
there was a notable surge in the highest price compared to previous
years, indicating a potential uptrend in market sentiment or company
performance. Interestingly, the widest range was observed in 2022,
suggesting increased volatility or uncertainty in the market during that
period.
a13 = google_price %>%
mutate(Year = year(Date)) %>%
filter(Category %in% c("Low", "High")) %>%
ggplot(aes(Year, Price, group = Year)) +
geom_boxplot(col = "purple")
a13 +
labs(y = "Price (in US Dollar) \n",
x = "\n Year",
title = "Alphabet Annual Price Range\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"))
| Problem: How does Alphabet’s stock price
differ (highest minus lowest) per day for the entire 2023 records, and
when did the highest range occur?
| Explanation: Plotting Alphabet’s stock
market daily price range in 2023 provides a granular view of how the
stock’s price fluctuated on a day-to-day basis throughout the year. This
information is crucial for investors and traders as it allows them to
assess the intraday volatility, identify patterns or trends, and make
timely decisions regarding buying, selling, or holding the stock.
| Brief Analysis: The analysis of Alphabet’s
stock market daily price range in 2023 reveals consistent patterns, with
ranges typically peaking at the beginning and end of each month while
dipping during the mid-month period, except for March. Interestingly,
March deviates from this trend, indicating potential market-specific
factors influencing price movements during that month. Notably, the
lowest range occurred at the end of February, suggesting decreased
volatility or stability in the market during that period, while the
highest range was observed at the beginning of June, signifying
heightened intraday price fluctuations.
a14 = google_2023 %>%
mutate(price_range = High - Low) %>%
ggplot(aes(Date, price_range)) +
geom_bar(stat = "identity",
col = "olivedrab")
a14 +
scale_x_date(date_breaks = "1 month",
date_labels = "%b") +
labs(y = "Price Difference (in US Dollar) \n",
x = "\n Date",
title = "Alphabet Daily Price Range in 2023\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"))
| Problem: During which periods of the year
2023 is the closing price higher than the opening price, and vice
versa?
| Explanation: Candlestick charts are a visual
representation of stock prices, displaying the open, high, low, and
close prices for a specific time period. Each “candlestick” on the chart
represents the trading activity within that timeframe. The body of the
candlestick is colored, typically green or red, to indicate whether the
closing price is higher or lower than the opening price. A green (or
hollow) candlestick suggests a bullish market, where the close is higher
than the open, symbolizing upward price movement. Conversely, a red (or
filled) candlestick indicates a bearish market, signifying a close lower
than the open and a potential downward trend.
| Brief Analysis: Our plot reveals a
consistent upward trajectory, aligning with the broader trend observed
in the stock market. Noteworthy instances include a substantial
reduction in price range around February and the end of October,
indicative of potential market stability or specific influencing factors
during those periods. The presence of a cluster of green candles in late
April and mid-May suggests bullish market sentiment, corresponding to
periods of increased closing prices compared to openings. Conversely,
pronounced occurrences of red candles at the beginning of February, the
end of July, and the end of October signal bearish tendencies.
a3 = google_2023 %>%
plot_ly(x = ~Date,
type = "candlestick",
open = google_2023$Open,
close = google_2023$Close,
high = google_2023$High,
low = google_2023$Low)
a3 %>%
layout(title = list(text = "Alphabet Stock Market Analysis: Candlestick Representation",
font = list(size = 25)),
xaxis = list(rangeslider = list(visible = F)),
yaxis = list(title = 'Price (in US Dollar)'),
font = list(family = 'Latin Modern'),
margin = list(
l = 100,
r = 70,
b = 80,
t = 100
))
| Problem: Is there a prevailing trend where
the closing price consistently surpasses the opening price, or vice
versa?
| Explanation: The comparison of Alphabet’s
open and close frequencies provides valuable insights into the stock’s
daily trading dynamics. An abundance of records where the close price
consistently exceeds the open may suggest sustained positive market
sentiment, while frequent instances of the open price surpassing the
close could indicate potential bearish trends.
| Brief Analysis: The analysis of the plot
depicting Alphabet’s open and close frequencies reveals a relatively
balanced distribution, indicating a comparable occurrence of instances
where the close price is higher than the open and vice versa. The near
equilibrium between these occurrences suggests a dynamic and fluctuating
market sentiment, with neither bullish nor bearish trends dominating
consistently.
a4 = google %>%
mutate(Case = ifelse(Open > Close,
"Open > Close",
"Close > Open")) %>%
select(Case) %>%
group_by(Case) %>%
summarize(Count = n() / 4842*100) %>%
ggplot(aes(x = Case,y = Count, fill = Case)) +
geom_bar(stat = "identity",
width = 0.6)
a4 +
labs(y = "Percentage \n",
x = "\n Frequency Comparison",
title = "Alphabet Open and Close Frequencies\n",
caption = "\nSource: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
legend.position = "none",
axis.title.y=element_blank(),
axis.text.y=element_blank(),
axis.ticks.y=element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.border = element_blank(),
axis.line.x = element_line(),
axis.ticks = element_blank(),
axis.title.x=element_blank()) +
coord_cartesian(ylim=c(0,65)) +
geom_text(aes(x = "Close > Open",
y = 55, label = "50,4%")) +
geom_text(aes(x = "Open > Close",
y = 54, label = "49,6%")) +
scale_fill_manual(values = c("Close > Open" = "olivedrab",
"Open > Close" = "firebrick"))
| Problem: How does Alphabet’s moving average
change over 50, 100, and 200 days, and does it mirror the movement of
the close price?
| Explanation: Plotting Alphabet’s stock
market moving average changes over 50, 100, and 200 days provides
insights into the stock’s trend and volatility over different time
horizons. Moving averages are calculated by averaging the closing prices
of a stock over a specified period, such as 50, 100, or 200 days. These
averages smooth out short-term fluctuations, revealing underlying trends
in the stock’s price movement. This analysis is crucial for investors as
it helps identify long-term trends, short-term fluctuations, and
potential buy or sell signals.
| Brief Analysis: The analysis of the plot
reveals that the moving average changes over 50, 100, and 200 days
exhibit a strong similarity to the close price plot across all observed
days. This suggests that the moving averages effectively capture the
underlying trend and volatility of Alphabet’s stock price over various
time horizons.
a12 = google %>%
mutate(SMA_50 = SMA(Close, n = 50),
SMA_100 = SMA(Close, n=100),
SMA_200 = SMA(Close, n = 200)) %>%
pivot_longer(-Date,
names_to = "Category",
values_to = "Value") %>%
filter(Category %in% c("Close",
"SMA_50",
"SMA_100",
"SMA_200")) %>%
filter(!is.na(Value)) %>%
ggplot(aes(Date, Value, col = Category)) +
geom_line()
a12 +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y") +
labs(y = "Price (in US Dollar) \n",
x = "\n Date",
title = "Alphabet Moving Average\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
axis.text.x = element_text(angle = 30,
hjust = 1)) +
scale_color_manual(
values = c("Close" = "red",
"SMA_50" = "blue",
"SMA_100" = "green",
"SMA_200" = "yellow"), # Specify colors
breaks = c("Close",
"SMA_50",
"SMA_100",
"SMA_200"),
labels = c("Close",
"SMA for 50 days",
"SMA for 100 days",
"SMA for 200 days")
) +
theme(
legend.position = c(0.215, 0.695),
legend.background =
element_rect(fill = "white",
color = "black"))
| Problem: How has Alphabet’s cumulative
return evolved over time?
| Explanation: The cumulative return plot for
Alphabet provides a snapshot of the overall performance and growth of
the stock over the specified time period. Cumulative return in stock
market analysis refers to the average return over a period of time, such
as a day, week, or month. This visualization is crucial for investors as
it illustrates the overall profitability of an investment over time.
Positive cumulative returns indicate profit, while negative values
suggest losses.
| Brief Analysis: The cumulative return plot
for Alphabet Inc. mirrors the movement observed in the stock market
price plot, displaying a pattern of fluctuations over time. As the stock
market experiences variations in value, the cumulative return, which
accumulates these changes, naturally exhibits corresponding
oscillations. This alignment is expected, as the cumulative return
serves as a cumulative measure of the stock’s overall performance.
a7 = google %>%
mutate(daily_return = Close / lag(Close) - 1) %>%
filter(!is.na(daily_return)) %>%
mutate(cumulative_return = cumprod(1 + daily_return) - 1) %>%
ggplot(aes(Date, cumulative_return))+
geom_line(col = "#003366", linewidth = 1)+
geom_bar(stat = "identity",
fill = "#FF8C00")
a7 +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y") +
labs(y = "Price (in US Dollar) \n",
x = "\n Date",
title = "Alphabet Cumulative Return\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
axis.text.x = element_text(angle = 30,
hjust = 1))
| Problem: What is Alphabet’s average monthly
return, and which month experiences the highest and lowest average
return?
| Explanation: Plotting Alphabet’s stock
market average monthly return provides a summarized view of the stock’s
performance on a monthly basis, showing the average change in price over
each month. Monthly return refers to the percentage change in the
stock’s price over a single month, calculated as the difference between
the closing price at the end of the month and the closing price at the
beginning of the month, divided by the initial price and multiplied by
100. This information is crucial for investors as it allows them to
assess the stock’s trend over time, identify seasonal patterns or
trends, and evaluate the stock’s overall volatility and stability.
| Brief Analysis: The analysis of Alphabet’s
stock market average monthly return unveils several key insights.
Firstly, the returns tend to fluctuate, indicating dynamic market
conditions. Interestingly, the last quarter exhibits stability and
positivity in returns, suggesting potential market resilience or
favorable company developments. Notably, October emerges as a standout
month with the highest return, nearly reaching 0.004, possibly driven by
positive market catalysts or strong financial performance. Conversely,
February stands out with the lowest return, dipping to -0.001,
indicating a period of underperformance.
a15 = google %>%
mutate(daily_return = Close / lag(Close) - 1,
Month = month(Date)) %>%
filter(!is.na(daily_return)) %>%
group_by(Month) %>%
summarize(avg = mean(daily_return)) %>%
ggplot(aes(factor(Month), avg)) +
geom_bar(stat = "identity",
aes(fill = avg < 0))
a15 +
scale_fill_manual(values = c("olivedrab","firebrick"),
guide = FALSE) +
scale_x_discrete(
labels = month.abb) +
labs(y = "Return \n",
x = "\n Month",
title = "Alphabet Average Monthly Return\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 20, "pt"),
legend.position = "none")
| Problem: How does Alphabet’s average daily
return vary within a week, and which day witnesses the highest and
lowest average return?
| Explanation: Plotting Alphabet’s stock
market average daily return provides a snapshot of the stock’s
performance on a day-to-day basis, showcasing the average change in
price over each trading day. Daily return refers to the percentage
change in the stock’s price over a single trading day, calculated as the
difference between the closing price at the end of the day and the
closing price at the beginning of the day, divided by the initial price
and multiplied by 100.
| Brief Analysis: The analysis of Alphabet’s
stock market average daily return reveals notable patterns in its
performance throughout the trading week. Firstly, returns tend to surge
on Tuesday and Wednesday, with the highest peak observed on Wednesday.
This trend suggests increased investor activity and potentially positive
market sentiment mid-week. Conversely, Monday stands out with the lowest
return, reaching only 0.0003, indicating a subdued start to the trading
week.
a16 = google %>%
mutate(daily_return = Close / lag(Close) - 1,
Day = factor(weekdays(Date),
levels = c("Monday",
"Tuesday",
"Wednesday",
"Thursday",
"Friday"))) %>%
filter(!is.na(daily_return)) %>%
group_by(Day) %>%
summarize(avg = mean(daily_return)) %>%
ggplot(aes(Day, avg)) +
geom_bar(stat = "identity",
aes(fill = Day))
a16 +
labs(y = "Return \n",
x = "\n Day",
title = "Alphabet Day of the Week Effect\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 20, "pt"),
legend.position = "none")
| Problem: What is Alphabet’s average daily
trading volume within a week, and which day experiences the highest or
lowest average volume?
| Explanation: Analyzing Alphabet’s average
daily trading volume within a week provides insights into the variations
in trading volume on specific days, helping identify potential trends or
anomalies. Monitoring these fluctuations is crucial for investors and
traders as it can influence decision-making, such as determining optimal
entry and exit points based on the market’s liquidity and activity
levels.
| Brief Analysis: The analysis of the plot
reveals that Friday stands out with the highest volume average,
indicating increased market activity, possibly influenced by end-of-week
trading strategies. In contrast, Monday records the lowest volume
average, suggesting a subdued start to the trading week. The remaining
days exhibit relatively similar average volume levels.
a5 = google_volume %>%
mutate(Day = wday(Date, label = TRUE, abb = FALSE)) %>%
group_by(Day) %>%
summarize(Average_per_Day = mean(Volume) / 10**8) %>%
ggplot(aes(Day, Average_per_Day))+
geom_bar(stat = "identity", aes(fill = Day))+
scale_y_continuous(trans = "log10")
a5 +
labs(y = "Volume (in Hundred Millions) \n",
x = "\n Day",
title = "Alphabet Daily Volume Average\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
legend.position = "none")
| Problem: What are the average daily highest
and lowest prices for Alphabet within a week, and which day exhibits the
highest average high price and the lowest average low price?
| Explanation: The plotted average daily
highest and lowest prices for Alphabet within a week offer insights into
the stock’s weekly performance dynamics. This analysis is crucial as it
provides a concise overview of the price behavior throughout the trading
week. Identifying patterns and trends helps market participants
comprehend the stock’s weekly volatility and potential trading
opportunities.
| Brief Analysis: The analysis of the plot
reveals distinct patterns in Alphabet’s average daily high and low
prices within a week. Notably, Thursday stands out as the day with the
highest average highest and average lowest prices, reflecting potential
market activity peaks. In contrast, Monday exhibits the lowest averages
for both highest and lowest prices. The remaining days of the week show
a relatively consistent pattern.
a6 = google %>%
mutate(Day = wday(Date,
label = TRUE,
abb = FALSE)) %>%
group_by(Day) %>%
summarize(average_high = mean(High),
average_low = mean(Low)) %>%
pivot_longer(-Day,
names_to = "Category",
values_to = "Price") %>%
mutate(Category = as.factor(Category)) %>%
ggplot(aes(Day, Price,
group = Category,
fill = Category)) +
geom_bar(stat="identity",
position = "identity") +
coord_cartesian(ylim = c(40,42))
a6 +
labs(y = "Price (in US Dollar) \n",
x = "\n Day",
title = "Alphabet Daily Open and Close Average\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
legend.position = "bottom") +
scale_fill_manual(values = c("average_high" = "#003366",
"average_low" = "#FFD700"),
name = "Price Types: ",
labels = c("Average High",
"Average Low"))
| Problem: How has the stock market volume for
Google changed over time?
| Explanation: Stock market volume refers to
the total number of shares traded within a specific period, typically
during a trading day. It is a crucial indicator of market activity and
liquidity. High volume often accompanies significant price movements.
Each year label on the x-axis marks the first recorded closing price at
the beginning of that specific year.
| Brief Analysis: Our plot reveals a general
decreasing trend over the years, marked by notable fluctuations in
shorter periods. Notably, the volume peaked at the beginning of 2006,
reflecting heightened market activity during that period. However, a
significant decline is observed towards the end of 2022, suggesting
potential shifts in investor sentiment or changes in market
dynamics.
a2 = google_volume %>%
ggplot(aes(Date, Volume / 10^9)) +
geom_line(col = "#1f77b4")
a2 +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y") +
labs(y = "Volume (in Billions) \n",
x = "\n Date",
title = "Alphabet Stock Market Volume\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(axis.text.x = element_text(angle = 30,
hjust = 1),
text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"))
| Problem: How do the variables in our dataset
correlate with each other, and are there any variables that
significantly influence one another?
| Explanation: The correlation analysis of
Alphabet’s stock market variables, including open, close, high, low, and
volume, reveals the strength and direction of relationships between
variables. The correlation coefficient ranges from -1 to 1, indicating
perfect positive (implying that as one variable increases, the other
also increases proportionally), perfect negative (suggesting that as one
variable increases, the other decreases proportionally), or no linear
correlation (signifies a weak or no linear correlation between the
variables.)
| Brief Analysis: The correlation analysis of
Alphabet’s stock market variables reveals a perfect positive
relationship (correlation coefficient of 1) among open, close, high, and
low prices, indicating a strong linear association. On the other hand,
volume demonstrates a moderate negative correlation (approximately
-0.47) with the rest of the variables.
#new data: Alphabet's Variables Correlation Matrix
correlation_matrix = google %>%
select(-Date) %>%
cor()
correlation_melted = melt(correlation_matrix)
a11 = correlation_melted %>%
ggplot(aes(Var1, Var2, fill = value,
label = sprintf("%.2f", value))) +
geom_tile(color = "white")
a11 +
geom_text(size = 3, color = "white",
show.legend = FALSE) +
labs(y = "",
x = "",
title = "Variable Correlation Analysis",
subtitle = "\n1: perfect positive, -1: perfect negative, ~0: non-linear \n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
scale_fill_gradient(low = "#8B6969", high = "firebrick") +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
legend.position = "none")
| Problem: How has Alphabet’s volatility
changed over time, and during which period did it reach its peak?
| Explanation: Alphabet Inc.’s volatility
analysis plot captures the fluctuations and variations in the stock’s
price over time. Stock market volatility is the measure of how far the
current price of an asset deviates from its average past prices,
indicating the strength or conviction behind a price movement. A higher
volatility implies greater price variability, indicating a potentially
riskier asset. Understanding volatility is fundamental for effective
risk management and optimizing investment portfolios.
| Brief Analysis: The volatility analysis for
Alphabet Inc. reveals noteworthy trends in the stock’s price
fluctuation. Notably, volatility reached its zenith in the early months
of 2008, coinciding with a period of heightened market uncertainty.
Subsequently, it experienced a significant downturn at the close of 2008
and the onset of 2020, possibly indicative of more stable market
conditions during those periods. The overall pattern showcases
considerable fluctuations, highlighting the dynamic nature of the
stock’s price movements over the analyzed timeframe. Interestingly,
recent observations in 2023 exhibit periods with volatility registering
below zero, suggesting potential anomalies (outliers) or unique market
dynamics during that period.
a8 = google %>%
mutate(daily_return = Close / lag(Close) - 1) %>%
filter(!is.na(daily_return)) %>%
ggplot(aes(Date, daily_return)) +
geom_line(col = "#4169E1")
a8 +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y") +
labs(y = "Volatility \n",
x = "\n Date",
title = "Alphabet Volatility Analysis",
subtitle = "\nNegative volatility is uncommon and may indicate anomalies. \n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"),
axis.text.x = element_text(angle = 30,
hjust = 1))
| Problem: How is Alphabet’s volatility
distributed in a density curve, and does it adhere to a normal
distribution?
| Explanation: Analyzing Alphabet’s volatility
distribution with a density curve allows us to understand the pattern of
volatility fluctuations over time. Comparing it to a normal density
curve helps identify deviations from expected market behavior. This is
crucial for assessing whether Alphabet’s volatility follows a typical
market trend or if there are unique patterns requiring attention.
| Brief Analysis: The analysis of Alphabet’s
volatility distribution reveals a peak higher than the normal
distribution, suggesting periods of intensified market activity.
Additionally, the distribution appears slimmer, indicating less
variability than the expected normal market conditions. This can also be
caused by anomalies incurred inside our data.
#new data: Alphabet's daily return
google_dr = google %>%
mutate(daily_return = Close / lag(Close) - 1) %>%
filter(!is.na(daily_return))
a9 = google_dr %>%
ggplot(aes(daily_return)) +
geom_density(aes(col = "Volatility Dens."),
size = 1) +
stat_function(
fun = dnorm,
args = list(mean = mean(google_dr$daily_return),
sd = sd(google_dr$daily_return)),
size = 1,
linetype = 1,
aes(color = "Normal Dist.")
)
a9 +
labs(y = "Density \n",
x = "\n Daily Return",
title = "Alphabet Volatility Distribution\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt")) +
scale_color_manual(values = c("#003366","firebrick"),
name = "Density Curve") +
theme(
legend.position = c(0.87, 0.8),
legend.background =
element_rect(fill = "white",
color = "black"))
| Problem: To what extent does Alphabet’s
volatility align with a normal distribution?
| Explanation: Plotting volatility
distribution using a QQ plot is important to assess whether the
distribution of volatility follows a normal distribution. The QQ plot
visually compares the quantiles of the observed data with the quantiles
of a theoretical normal distribution. If the points on the QQ plot
closely align with a straight line, it suggests that the data is
approximately normally distributed. This analysis is crucial in
understanding the statistical properties of the volatility data and can
guide further modeling or analysis decisions based on the assumed
distribution.
| Brief Analysis: The analysis of Alphabet’s
stock market volatility distribution using a QQ plot reveals a
significant deviation from normality, indicating that a substantial
portion of the data does not conform to a standard normal distribution.
This departure from normality is likely influenced by potential outliers
within the dataset or dynamic market at that time.
#new data: Alphabet's daily return
daily_return = google %>%
mutate(daily_return = Close / lag(Close) - 1) %>%
filter(!is.na(daily_return)) %>%
select(daily_return)
a10 = daily_return %>%
ggplot(aes(sample = daily_return)) +
stat_qq(col= "#FF8C00") +
stat_qq_line()
a10 +
labs(y = "",
x = "",
title = "Alphabet Volatility Normality\n",
caption = "Source: Kaggle, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 30, "pt"))
| Problem: How does Alphabet’s market
capitalization change over time, and when does it reach its peak?
| Explanation: Market capitalization (market
cap) is the total market value of a company’s outstanding shares of
stock, calculated by multiplying the current share price by the total
number of outstanding shares. The objective of plotting market
capitalization is to track the company’s valuation over time, reflecting
its growth, investor sentiment, and market conditions.
| Brief Analysis: Alphabet’s market
capitalization reached its peak at the beginning of 2022, surpassing
2000 (units). From 2009 to 2020, the market cap showed a gradual
increase, followed by a rapid rise from 2020 to its peak in early 2022.
In 2022-2023, there was a significant decline, with the market cap
almost losing 500 units within a year. However, since the beginning of
2023, there has been a recovery trend, indicating a gradual increase in
market capitalization.
a18 = left_join(shares, google_price, by = "Date") %>%
filter(!is.na(Price) & Category == "Close") %>%
mutate(Market_cap = Shares * Price) %>%
ggplot(aes(Date, Market_cap)) +
geom_line(size = 1,
col = "#4169E1")
a18 +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y") +
labs(y = "",
x = "\n Date",
title = "Alphabet Market Capitalization\n",
caption = "Source: Kaggle and macrotrends.net, 2023\n Data Visualization by Radithya Chaidir") +
theme_bw() +
theme(text = element_text(family = "LM Roman 10"),
plot.title = element_text(size = 20,
hjust = 0.5),
plot.margin = margin(30, 30, 20, 20, "pt"),
axis.text.x = element_text(angle = 30,
hjust = 1))
[1] S. Navami. (2024). Complete Exploratory Data Analysis (EDA) using Python. Medium.
[2] Joshi, S. (2023). Google Stock Price (2004-2023). Kaggle. Retrieved January, 2024, from here.
[3] Slamet, I., Laela, N., & P. (2016). Analysis of Calendar Anomaly in Indonesia Stock Market using Stochastic Dominance. AIP Publishing.
[4] Alphabet Shares Outstanding 2010-2024|GOOGL. Retrieved January, 2024, from here.
[5] Selvam, M., Raja, M., & Mozhi, P. (2007). Forecasting the Time Volatility of Emerging Asian Stock Market Index. Asia Pasific Business Review.