---
title: "Exam 1 Practice"
author: Stefan Schweitzer A01209755
output: html_notebook
---
```{r}
rm(list = ls())
```


## Hypothesis Testing

Using the getSymbols command, download MONTHLY prices (from Yahoo) for Walmart Inc. (WMT) from Jan 2019 to Aug 2021.

```{r}
library(quantmod)
getSymbols(c("WMT"), from="2019-01-01", to= "2021-08-30", periodicity="monthly", src="yahoo")
```
1.- Calculate simple and continuously compounded (cc) returns of WMT stock (you can create new R objects or new columns in the price objects)

```{r}
returns.zoo <-diff(log(Ad(WMT))) 
returns.df <- as.data.frame(diff(log(Ad(WMT))))
colnames(returns.df) <- "r_WMT"
```
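
The two return definitions can be sketched on a small price vector. The prices below are made up for illustration (they are not WMT data), and the sketch uses base R only.

```r
# Hypothetical month-end adjusted prices (not WMT data)
p <- c(100, 104, 101, 108)

# Simple returns: P_t / P_{t-1} - 1
R <- p[-1] / p[-length(p)] - 1

# Continuously compounded returns: log(P_t) - log(P_{t-1})
r <- diff(log(p))

print(round(cbind(R = R, r = r), 4))

# cc returns add up over time to the log of the total price ratio
all.equal(sum(r), log(p[length(p)] / p[1]))
```

Each price series of length N produces N - 1 returns, which is why the first row of `returns.df` is NA.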

Summary
```{r}
summary(returns.df)
```
```{r}
summary(returns.zoo)
```
```{r}
library(PerformanceAnalytics)
```

```{r}
table.Stats(returns.df$r_WMT)
```

```{r}
mean_r_WMT <- mean(returns.df$r_WMT, na.rm=TRUE) 
sd_r_WMT <- sd(returns.df$r_WMT, na.rm=TRUE) 
var_r_WMT <- var(returns.df$r_WMT, na.rm=TRUE) 
cat("Mean =", mean_r_WMT)
```
```{r}
cat("Standard deviation = ", sd_r_WMT)
```
```{r}
cat("Variance = ", var_r_WMT)
```
```{r}
# Simple returns: P_t / P_{t-1} - 1 (lag() on an xts object takes k, not n)
returns.df$R_WMT <- as.numeric(Ad(WMT) / lag(Ad(WMT), k = 1) - 1)
table.Stats(returns.df$R_WMT)
```

```{r}
mean(returns.df$R_WMT, na.rm=TRUE)
```
```{r}
sd(returns.df$R_WMT, na.rm=TRUE)
```
```{r}
var(returns.df$R_WMT, na.rm=TRUE)
```
```{r}
mean_R_WMT = mean(returns.df$R_WMT, na.rm=TRUE)
cat("Mean of simple returns: ", mean_R_WMT,"\n")
```
```{r}
mean_r_WMT = mean(returns.df$r_WMT, na.rm=TRUE) 
cat("Mean cc return: ", mean_r_WMT)
```
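WE CAN SEE THAT THE MEAN OF SIMPLE MONTHLY RETURNS IS HIGHER THAN THE MEAN OF CC RETURNS (1.65648% VS 1.548398%). THIS IS EXPECTED, BECAUSE THE CC RETURN IS `r = log(1 + R)`, WHICH IS NEVER LARGER THAN THE SIMPLE RETURN `R`.
FOR RETURNS OUTSIDE THE RANGE OF -5% TO +5%, THE DIFFERENCE BETWEEN THE TWO MEASURES BECOMES INCREASINGLY SIGNIFICANT.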
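
The size of the gap between the two measures can be illustrated with a minimal sketch. The simple returns below are hypothetical values chosen for illustration, not WMT data.

```r
# Hypothetical simple returns, chosen only for illustration (not WMT data)
R <- c(-0.20, -0.05, -0.01, 0.01, 0.05, 0.20)

# Continuously compounded return implied by each simple return
r <- log(1 + R)

# The gap R - r is never negative and grows with the size of the return
gap <- R - r
print(round(data.frame(R = R, r = r, gap = gap), 5))

# Converting back recovers the simple returns
all.equal(exp(r) - 1, R)
```

For returns within -5% to +5% the gap stays close to zero, while at -20% or +20% it is roughly two percentage points.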


2.- Do a histogram for the simple returns of WMT. INTERPRET this histogram
```{r}
# The question asks for simple returns, so plot R_WMT rather than the cc returns
hist(as.numeric(returns.df$R_WMT), main="Histogram of WMT monthly returns", xlab="Simple returns", col="dark blue")
```
ONE CAN TELL THAT THE MOST FREQUENT MONTHLY RETURNS OF WALMART ARE BETWEEN -5% AND +5%. IN AROUND 10 OR MORE MONTHS OF THE SAMPLE, WALMART OFFERED MONTHLY RETURNS BETWEEN 0% AND 10%. IN AROUND 10 MONTHS, WALMART OFFERED NEGATIVE RETURNS BETWEEN -5% AND 0%.
THE HISTOGRAM IS UNIMODAL AND ROUGHLY SYMMETRIC, WITH A SLIGHT SKEW TO THE RIGHT-HAND SIDE.
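
The visual reading of a histogram can be backed up with numbers. The sketch below uses simulated returns (an assumption for illustration, not WMT data) and computes the 5%-wide bin counts and the sample skewness by hand in base R.

```r
set.seed(123)
# Simulated monthly returns, for illustration only (not WMT data)
x <- rnorm(32, mean = 0.015, sd = 0.05)

# Bin counts for 5%-wide bins, without drawing the plot
pct <- seq(-30, 30, by = 5)
h <- hist(x, breaks = pct / 100, plot = FALSE)
counts <- setNames(h$counts, paste0(head(pct, -1), "% to ", tail(pct, -1), "%"))
print(counts[counts > 0])

# Sample skewness: third central moment over the cubed standard deviation
skew <- mean((x - mean(x))^3) / (mean((x - mean(x))^2))^(3/2)
skew  # > 0 suggests a longer right tail, < 0 a longer left tail
```

A skewness close to zero is consistent with a roughly symmetric histogram, and a clearly positive value with a right skew.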


3.- Is the average of WMT cc returns significantly higher than 1%? NO. THE DATA DO NOT INDICATE THAT WMT'S AVERAGE CC RETURN IS SIGNIFICANTLY HIGHER THAN ONE PERCENT; THE RESULT I OBTAINED WAS 0.8%.

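A) Which is the hypothesis test you need for this (Null and alternative hypothesis)?

A one-sided, one-sample t-test of the mean cc return against 1%:

H0: mean(r_WMT) = 0.01

Ha: mean(r_WMT) > 0.01

```{r}
# Test the monthly cc returns (not the price series) against mu = 1%
ttest_WMT <- t.test(returns.df$r_WMT, mu = 0.01, alternative = "greater")
ttest_WMT
```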

```{r}
class(ttest_WMT)
```
```{r}
names(ttest_WMT)
```

B) Manually calculate the corresponding t-tests and show your calculation (without using the t.test function). Also do the hypothesis test using the t.test function. INTERPRET YOUR RESULTS

xa = value of the variable of study obtained from the data

x0 = value of the variable of study according to the null hypothesis (H0)

t = (xa - x0) / standard error(xa)

In this case, the variable of study is the mean of cc returns, so xa = mean(r_WMT) and x0 = 0.01:

t = (mean(r_WMT) - 0.01) / se

The numerator is the difference between the mean return in the data and the mean return under the NULL HYPOTHESIS, which is 1%. The denominator is the standard error of the mean (se), which is the standard deviation of the returns divided by the square root of N, the number of non-missing returns:

se = sd(r_WMT) / sqrt(N)

t = (mean(r_WMT) - 0.01) / (sd(r_WMT) / sqrt(N))

The t value is therefore the distance between the sample mean return and the hypothetical value, measured in standard errors of the mean.

```{r}
# Use the cc returns, dropping the NA created by diff()
r_WMT <- na.omit(returns.df$r_WMT)
N <- length(r_WMT)
se_WMT <- sd(r_WMT) / sqrt(N)
se_WMT
```
```{r}
mean_WMT <- mean(r_WMT)
```

```{r}
t_value_WMT <- (mean_WMT - 0.01) / se_WMT
t_value_WMT
```

IF THE T VALUE IS BIGGER THAN ABOUT 2, THERE IS ENOUGH STATISTICAL EVIDENCE THAT THE MEAN OF CC RETURNS IS LARGER THAN 1%, AND WE REJECT THE NULL HYPOTHESIS IN FAVOR OF THE ALTERNATIVE.

IF THE T VALUE IS SMALLER THAN THAT, WE CANNOT REJECT THE NULL HYPOTHESIS. THIS IS THE CASE REPORTED IN THE ANSWER TO QUESTION 3: THE MEAN CC RETURN OF WMT IS NOT SIGNIFICANTLY HIGHER THAN 1%.
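
A manual t calculation can be cross-checked against `t.test()`. The sketch below is a minimal example on simulated returns (not WMT data), assuming a null value of 1% as in question 3. The names `mu0`, `t_manual` and `t_crit` are illustrative.

```r
set.seed(42)
# Simulated monthly cc returns, for illustration only (not WMT data)
r <- rnorm(31, mean = 0.015, sd = 0.05)
mu0 <- 0.01  # value of the mean under H0

# Manual one-sample t statistic
N <- length(r)
se <- sd(r) / sqrt(N)
t_manual <- (mean(r) - mu0) / se

# One-sided p-value and 5% critical value from the t distribution with N - 1 df
p_manual <- pt(t_manual, df = N - 1, lower.tail = FALSE)
t_crit <- qt(0.95, df = N - 1)

# Same test with the built-in function
tt <- t.test(r, mu = mu0, alternative = "greater")

cat("t (manual) =", t_manual, " t (t.test) =", unname(tt$statistic), "\n")
cat("p (manual) =", p_manual, " p (t.test) =", tt$p.value, "\n")
cat("Reject H0 at 5%?", t_manual > t_crit, "\n")
```

The "bigger than 2" rule of thumb is a rounded version of the critical value. For a one-sided test at 5% with 30 degrees of freedom the exact cutoff is about 1.70.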


## Regression model

```{r}
library(quantmod)
getSymbols(c("CEMEXCPO.MX", "^MXX"), from="2018-01-01", to="2020-08-30", periodicity="daily", src="yahoo")
```
```{r}
head(CEMEXCPO.MX)
```

```{r}
tail(CEMEXCPO.MX)
```
```{r}
head(MXX)
```
```{r}
tail(MXX)
```


1.- Run a market regression model for CEMEX. Write the code you need (in R chunks) and show the output of the regression.

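```{r}
# Daily cc returns of the stock and of the market index
r_CEMEX <- na.omit(diff(log(Ad(CEMEXCPO.MX))))
r_MXX <- na.omit(diff(log(Ad(MXX))))
# Keep only the dates where both returns are available
all_rets <- na.omit(merge(r_CEMEX, r_MXX))
colnames(all_rets) <- c("CEMEXCPO.MX", "MXX")
```
```{r}
plot.default(x=as.numeric(all_rets$MXX), y=as.numeric(all_rets$CEMEXCPO.MX), xlab="MXX returns", ylab="CEMEX returns")
```

```{r}
# Market model: stock returns regressed on market returns
reg <- lm(CEMEXCPO.MX ~ MXX, data = as.data.frame(all_rets))
summary(reg)
```

```{r}
sumsquares <- anova(reg)
sumsquares
```

```{r}
VaR(all_rets$CEMEXCPO.MX)
```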
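
How the coefficients of a market model are read can be sketched on simulated data. The returns below are generated with an assumed alpha of 0.0002 and beta of 1.3. They are not CEMEX or MXX data, and the variable names are illustrative.

```r
set.seed(1)
# Simulated daily returns, for illustration only (not CEMEX or MXX data)
n <- 500
r_mkt <- rnorm(n, mean = 0.0003, sd = 0.01)
# Assumed "true" alpha and beta used to generate the stock returns
r_stock <- 0.0002 + 1.3 * r_mkt + rnorm(n, sd = 0.015)

# Market model: stock return = b0 + b1 * market return + error
fit <- lm(r_stock ~ r_mkt)
coefs <- summary(fit)$coefficients
print(coefs)  # estimate, std. error, t value and p-value for b0 and b1

# 95% confidence intervals for b0 (alpha) and b1 (beta)
ci <- confint(fit, level = 0.95)
print(ci)
```

The row `(Intercept)` is the estimate of alpha (b0) and the row `r_mkt` is the estimate of beta (b1). A confidence interval that excludes zero corresponds to a coefficient that is significant at the 5% level.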


2.- QUESTIONS REGRESSION MODEL

a- WHAT IS THE REGRESSION EQUATION OF THIS MODEL?

THE REGRESSION IS GIVEN BY THE FORMULA Y = A + B*X, WHERE Y IS THE DEPENDENT VARIABLE (CEMEX RETURNS), X IS THE INDEPENDENT VARIABLE (MXX MARKET RETURNS), B IS THE SLOPE OF THE LINE AND A IS THE Y INTERCEPT.

b- WHAT CAN YOU SAY ABOUT THE MARKET RISK OF THE STOCK?

I THINK THAT THE STOCK IS SIGNIFICANTLY MORE RISKY THAN THE MARKET. MARKET RISK IS MEASURED BY THE SLOPE B (BETA): A BETA ABOVE 1 MEANS THE STOCK MOVES MORE THAN THE MARKET. THE HISTORICAL PRICE DATA ALSO SHOW THAT CEMEX HAS SEEN BETTER DAYS.

c- CEMEX OUTPERFORMS THE MARKET. THERE IS A PROBABILITY OF 4% OF BEING WRONG WHEN REJECTING THE NULL HYPOTHESIS. WITH THE T VALUE OF 2.388 I CAN SAY THAT CEMEX OFFERS RETURNS OVER THE MARKET. I CAN ALSO VERIFY THIS BY LOOKING AT THE CONFIDENCE INTERVAL.

d- HOW MUCH OF THE VARIANCE OF THE STOCK RETURNS CANNOT BE EXPLAINED BY THE VARIANCE OF MARKET RETURNS?

98.04%, SINCE THE R SQUARED IS EQUAL TO 0.01956 (1 - 0.01956 = 0.98044).
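
The unexplained share of variance and a formal check of market risk can both be read from a fitted model. The sketch below uses simulated returns with an assumed beta of 1.3 (not CEMEX or MXX data) and tests H0: beta = 1 against Ha: beta > 1, which is one common way to ask whether a stock is riskier than the market.

```r
set.seed(2)
# Simulated daily returns, for illustration only (not CEMEX or MXX data)
n <- 500
r_mkt <- rnorm(n, sd = 0.01)
r_stock <- 1.3 * r_mkt + rnorm(n, sd = 0.015)
fit <- lm(r_stock ~ r_mkt)
s <- summary(fit)

# Share of the variance of stock returns explained / not explained by the market
r2 <- s$r.squared
unexplained <- 1 - r2
cat("R-squared =", r2, " unexplained =", unexplained, "\n")

# Same quantity from the ANOVA table: residual SS over total SS
ss <- anova(fit)[["Sum Sq"]]
unexplained_anova <- ss[2] / sum(ss)

# Test H0: beta = 1 against Ha: beta > 1
b1 <- s$coefficients["r_mkt", "Estimate"]
se_b1 <- s$coefficients["r_mkt", "Std. Error"]
t_beta <- (b1 - 1) / se_b1
p_beta <- pt(t_beta, df = fit$df.residual, lower.tail = FALSE)
cat("t for beta = 1:", t_beta, " one-sided p-value:", p_beta, "\n")

# Arithmetic used in answer d: 1 - 0.01956
1 - 0.01956
```

Note that the t value printed by `summary()` for the slope tests beta = 0, so the comparison against 1 has to be computed separately as above.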

