FTSA_Lab2

Loading the data into R and checking for missing values

QUESTION 1

1. Plot the log returns of all five exchange rates in one plot, including the title and the y-axis and x-axis labels with the correct dates of the data. Comment on the comparison of the plots.

## Warning: Removed 5 rows containing missing values (`geom_line()`).

#Comment about the comparison of the plots

The plot shows the log returns of all five exchange rates over the sample period (January 3, 2003 to May 5, 2023). All five series fluctuate around zero, but their volatility differs. The USD/KES returns are the most stable, the EUR/KES and GBP/KES returns are moderately volatile, and the JPY/KES and ZAR/KES returns are the most volatile; the standard deviations in the summary statistics table confirm this ordering. The higher volatility of the Japanese Yen and South African Rand rates can plausibly be attributed to a number of factors, including economic conditions in those countries, political instability, and changes in interest rates. The comparatively lower volatility of the US Dollar and Euro rates is likely related to the same factors, as well as to the fact that these currencies are held as reserve currencies by central banks around the world.
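For reference, a minimal sketch of how log returns are conventionally computed from a rate series. The data frame `rates`, its values and the helper `log_ret` are hypothetical and only illustrate the calculation; they are not the lab's data or code.

```r
# Hypothetical daily KES rates for two currencies (illustrative numbers only).
rates <- data.frame(
  date    = as.Date("2023-01-02") + 0:4,
  USD.KES = c(123.4, 123.5, 123.3, 123.6, 123.8),
  EUR.KES = c(131.0, 131.4, 130.9, 131.2, 131.9)
)

# Log return r_t = log(P_t) - log(P_{t-1}); the first day has no return (NA).
log_ret <- function(p) c(NA, diff(log(p)))

returns <- data.frame(
  date    = rates$date,
  USD.KES = log_ret(rates$USD.KES),
  EUR.KES = log_ret(rates$EUR.KES)
)
print(returns)
```

If the returns are built this way, each series has one leading NA. With five series in long format that would be consistent with the warning about 5 removed rows, although the original code is not shown.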

##                Mean        SD    Skewness  Kurtosis    Minimum  Maximum
## USD.KES  1.05895807  42.78531  0.08807889 25.895476  -461.4513 548.9183
## GBP.KES  0.59756050  71.56539 -0.23952676  7.043443  -830.2552 523.9101
## EUR.KES  0.98563040  71.47169 -0.08633116 11.122716  -971.1469 639.6774
## JPY.KES  0.01854996 101.57629 -0.91693761 31.107978 -1750.5481 673.1468
## ZAR.KES -0.45310758 115.98966 -0.57916852 10.269384 -1630.1456 854.7573

#summary statistics for the log returns:

Mean: The mean log return is positive for USD/KES, GBP/KES, EUR/KES and JPY/KES and negative only for ZAR/KES (-0.453). Assuming the rates are quoted as Kenyan shillings per unit of foreign currency, this indicates that the shilling depreciated on average against the dollar, pound, euro and yen and appreciated slightly against the rand over the period studied. The largest mean is observed for USD/KES (1.059) and the smallest in magnitude for JPY/KES (0.019).
Standard deviation: The standard deviation of the log returns provides a measure of the volatility of each currency pair. It is highest for ZAR/KES (115.99), followed by JPY/KES (101.58), and lowest for USD/KES (42.79), so the rand pair is the most volatile and the dollar pair the least.
Skewness: The skewness measures the degree of asymmetry in the distribution of the log returns. USD/KES is slightly positively skewed (0.088), while the other four pairs are negatively skewed, meaning their left tails are longer and large negative returns are more extreme than large positive ones. JPY/KES is the most negatively skewed (-0.917), followed by ZAR/KES (-0.579).
Kurtosis: The kurtosis measures how heavy the tails of the distribution are relative to a normal distribution. All five values lie well above the normal benchmark (3 for raw kurtosis, 0 for excess kurtosis), so every pair has a more peaked distribution with more extreme values than a normal distribution. The kurtosis is highest for JPY/KES (31.11) and USD/KES (25.90) and lowest for GBP/KES (7.04).
Minimum/Maximum: The minimum and maximum log returns indicate the largest losses and gains observed for each currency pair over the period studied. The largest loss is observed for JPY/KES (-1750.55), followed by ZAR/KES (-1630.15); the largest gain is observed for ZAR/KES (854.76).

Overall, the summary statistics suggest that ZAR/KES is the most volatile pair, while JPY/KES has the heaviest tails and the most extreme negative return. USD/KES is the least volatile pair but is also strongly heavy-tailed. None of the five distributions is close to normal. These results may be useful for investors and policymakers who are interested in understanding the behavior of these currency pairs and managing their risks.
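The columns of the table can be reproduced with a few lines of base R. The following is a sketch under assumptions: `moments` is a hypothetical helper, the kurtosis shown is the raw (not excess) moment ratio, and the input is simulated rather than the lab's returns.

```r
# Moment-based summary of one return series (NAs dropped first).
moments <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  s <- sqrt(mean((x - m)^2))            # population SD used in the moment ratios
  c(Mean     = m,
    SD       = sd(x),                   # sample SD, as reported by sd()
    Skewness = mean((x - m)^3) / s^3,
    Kurtosis = mean((x - m)^4) / s^4,   # raw kurtosis; equals 3 for a normal
    Minimum  = min(x),
    Maximum  = max(x))
}

set.seed(1)
x <- rnorm(1000)                        # simulated stand-in for a return series
round(moments(x), 3)
```

Applying `sapply(returns_df, moments)` to a data frame of return columns and transposing the result would give one row per currency pair, as in the table.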

## 
##  One Sample t-test
## 
## data:  mean_log_returns
## t = 1.523, df = 4, p-value = 0.2024
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.3633867  1.2464232
## sample estimates:
## mean of x 
## 0.4415183

We performed a one-sample t-test on the five mean log returns (one per currency pair) and obtained a test statistic of 1.523 with 4 degrees of freedom and a p-value of 0.2024. The null hypothesis was that the mean log return is equal to zero, and the alternative hypothesis was that it is not equal to zero.

Since the p-value is greater than the 5% significance level, we fail to reject the null hypothesis. The 95% confidence interval (-0.363, 1.246) contains zero. Therefore, we cannot conclude that the mean log return is statistically different from zero.

Practically speaking, this means that over the entire period from January 3, 2003 to May 5, 2023, there is not enough evidence to suggest that the exchange rates of the five currencies (USD, GBP, EUR, JPY, ZAR) against the Kenyan Shilling had a systematic tendency to increase or decrease in value on average. Because the test uses only five observations (the currency means) rather than the daily returns themselves, it has little power and does not test each series individually; that would require a separate test on each return series. The result also does not mean that there were no individual days or shorter periods of time where the exchange rates exhibited significant changes.
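As an illustration of the per-series alternative, the sketch below runs a one-sample t-test on a single simulated return series and reproduces the statistic by hand. The series `r` is simulated and stands in for any one of the five return series.

```r
set.seed(42)
r <- rnorm(500, mean = 0, sd = 1)   # simulated stand-in for one return series

tt <- t.test(r, mu = 0)             # H0: mean return = 0
tt$parameter                        # degrees of freedom = length(r) - 1
tt$p.value

# The same statistic by hand: t = mean / (sd / sqrt(n))
t_manual <- mean(r) / (sd(r) / sqrt(length(r)))
c(t_test = unname(tt$statistic), manual = t_manual)
```

With daily data the degrees of freedom are in the thousands rather than 4, so the test on each series is far more informative than a test on five means.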

summary:

The code generates 5306 random draws from a normal distribution with the same mean and standard deviation as the log returns of the USD/KES exchange rate. A density plot of the simulated returns is then compared with the density plot of the actual USD/KES log returns.

The two density plots are similar in location and spread, which is expected because the simulation matches the mean and standard deviation of the data. The differences lie in the shape of the distribution: with a kurtosis of 25.9, the actual returns should be more concentrated around the center and have heavier tails than the simulated normal returns, for which extreme values are much less likely to occur.

Overall, simulating normal returns with a matched mean and standard deviation gives a useful benchmark for financial modeling and analysis, but it understates the frequency of extreme moves in the USD/KES log returns.
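The comparison can be sketched as follows. The original code is not shown, so this is only an assumed reconstruction: `obs` is a heavy-tailed simulated stand-in (Student t with 5 degrees of freedom) for the actual USD/KES returns, and `tail_share` is a hypothetical helper.

```r
set.seed(123)
n   <- 5306
obs <- rt(n, df = 5)                             # heavy-tailed stand-in for observed returns
sim <- rnorm(n, mean = mean(obs), sd = sd(obs))  # normal draws with matched mean and SD

# Share of observations more than 3 standard deviations from the mean.
tail_share <- function(x) mean(abs(x - mean(x)) > 3 * sd(x))
c(observed = tail_share(obs), simulated = tail_share(sim))

# Kernel density estimates, ready to be drawn with plot(d_obs); lines(d_sim).
d_obs <- density(obs)
d_sim <- density(sim)
```

Even when the two density curves look alike near the center, the tail shares differ, which is the kind of discrepancy a normality test picks up.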

## Jarque-Bera test statistic: 221.1119

## p-value: 0

## Reject null hypothesis: log returns are not normally distributed

The Jarque-Bera test was used to test the null hypothesis that the log returns of the USD/KES exchange rate are normally distributed, against the alternative hypothesis that they are not. The test statistic combines the sample skewness S and the sample kurtosis K: JB = (n/6) * (S^2 + (K - 3)^2 / 4), where n is the sample size. Under the null hypothesis, JB is asymptotically chi-squared distributed with 2 degrees of freedom.

The test was performed using the jarque.bera.test() function in R, which returns the test statistic and p-value. The statistic of 221.1119 far exceeds the 5% critical value of 5.99, and the p-value is 0 to the printed precision, so we reject the null hypothesis of normality. This suggests that the log returns of the USD/KES exchange rate are not normally distributed.
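To make the statistic concrete, here is a base R sketch that computes the Jarque-Bera statistic and its chi-squared p-value directly from the sample moments. The function name `jb_test` is hypothetical and the inputs are simulated; it is not the code used in the lab.

```r
# Jarque-Bera statistic from sample skewness S and raw kurtosis K.
jb_test <- function(x) {
  x  <- x[!is.na(x)]
  n  <- length(x)
  m  <- mean(x)
  m2 <- mean((x - m)^2)
  S  <- mean((x - m)^3) / m2^1.5
  K  <- mean((x - m)^4) / m2^2
  JB <- n / 6 * (S^2 + (K - 3)^2 / 4)
  c(statistic = JB,
    p.value   = pchisq(JB, df = 2, lower.tail = FALSE))  # asymptotic chi-squared(2)
}

set.seed(7)
jb_test(rnorm(2000))        # normal data: large p-value expected
jb_test(rt(2000, df = 4))   # heavy-tailed data: normality rejected
```

The statistic grows with the sample size, so with several thousand daily returns even moderate excess kurtosis leads to a rejection.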

#Question 2 #1

Iteration 1: The random walk with drift exhibits a clear upward trend, following the mean function closely. The fitted line obtained through least squares regression shows a good fit to the data, indicating that the model captures the underlying drift accurately. The estimated value of β (beta_hat) is close to the known drift parameter δ (0.01), suggesting a reliable estimation.

Iteration 2: In this iteration, the random walk displays a more volatile pattern compared to the mean function. Despite the increased noise, the fitted line still captures the general trend of the data reasonably well. The estimated β value remains close to the true drift parameter, indicating a robust regression model.

Iteration 3: The random walk exhibits a downward trend with fluctuations around the mean function. The fitted line captures the overall behavior of the data, although it may deviate in certain sections. The estimated β value remains consistent with the true drift parameter, suggesting the model’s effectiveness.

Iteration 4: In this iteration, the random walk displays a relatively flat pattern, with occasional sharp spikes and dips. The fitted line captures the general trend of the data, but it may not capture the rapid fluctuations accurately. The estimated β value remains close to the true drift parameter, indicating a reliable estimation.

Iteration 5: The random walk in this iteration exhibits a strong upward trend, deviating from the mean function in some sections. The fitted line successfully captures the overall behavior of the data, following the upward trend. The estimated β value is consistent with the true drift parameter, suggesting an accurate regression model.

Iteration 6: In this iteration, the random walk displays a sideways movement around the mean function. The fitted line captures the general trend of the data, even though it may not capture the short-term fluctuations accurately. The estimated β value remains close to the known drift parameter, indicating the model’s effectiveness.

The results show that in all six repetitions of the exercise, the random walk with drift was successfully generated, and the least squares regression was able to fit a line to the data. The plots of the data, the mean function, and the fitted line indicate that the regression model captures the trend of the random walk data reasonably well.

The mean function δt, with drift δ = 0.01, increases over time, but the individual realizations do not all do so. When the drift is small relative to the shocks, the accumulated shocks can dominate it, which is why some iterations trend downward or move sideways. The fitted regression line follows the overall direction of each realization.

Consequently, there is variability in the slope of the fitted regression line across the repetitions. This variability comes from the random walk itself: its deviations from the mean function are cumulative sums of random shocks and do not average out over time, so the least squares slope can differ noticeably from one realization to the next.

Overall, the results suggest that the least squares regression is a useful tool for capturing the trend of a random walk with drift, but there may be some limitations in its ability to perfectly fit the data in all cases.
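One repetition of the exercise can be sketched as follows. The series length `n = 100`, the standard normal shocks and the regression through the origin are assumptions for illustration; only the drift of 0.01 is taken from the text.

```r
set.seed(2023)
n     <- 100                      # assumed series length
delta <- 0.01                     # drift parameter from the text

w     <- rnorm(n)                 # assumed N(0, 1) shocks
x     <- cumsum(delta + w)        # x_t = delta + x_{t-1} + w_t, with x_0 = 0
t_idx <- seq_len(n)

fit      <- lm(x ~ 0 + t_idx)     # x_t = beta * t + e_t, no intercept
beta_hat <- unname(coef(fit))

mean_fn     <- delta * t_idx      # true mean function
fitted_line <- beta_hat * t_idx   # fitted line
c(delta = delta, beta_hat = beta_hat)
```

Re-running the block with different seeds shows how much `beta_hat` moves around `delta` from one realization to the next; `plot(x, type = "l")` followed by `lines(mean_fn)` and `lines(fitted_line)` would reproduce the kind of figure described above.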

#a) Specify the test design of the simple Dickey-Fuller t-test. Thereby, refer to the terms ‘pure random walk’, ‘random walk with drift’, and ‘random walk with drift and trend’.

The simple Dickey-Fuller t-test tests the null hypothesis that a time series y_t has a unit root against the alternative that it is stationary. It is based on a regression of the first difference Δy_t on the lagged level y_{t-1}, and the test statistic is the t-ratio of the coefficient γ on y_{t-1}, with H0: γ = 0 and H1: γ < 0. The test is one-sided, and the null hypothesis is rejected if the test statistic is less than (more negative than) the critical value. Under the null hypothesis the t-ratio does not follow a t or normal distribution but the non-standard Dickey-Fuller distribution, whose critical values depend on the deterministic terms in the regression and on the sample size. The test design therefore comes in three versions.

For a pure random walk, the test regression is Δy_t = γ y_{t-1} + ε_t, with no constant and no trend. Under the null hypothesis the series is a pure random walk.

For a random walk with drift, the test regression includes a constant term that captures the drift: Δy_t = α + γ y_{t-1} + ε_t. The t-test again concerns γ = 0. The joint hypothesis that a unit root is present and the drift is zero (γ = 0 and α = 0) is tested with a separate F-type statistic.

For a random walk with drift and trend, the test regression includes both a constant and a linear time trend: Δy_t = α + βt + γ y_{t-1} + ε_t. The t-test again concerns γ = 0, and the critical values are more negative than in the other two cases. The simple test does not account for autocorrelation in the errors; this is handled by adding lagged differences to the regression, which gives the augmented Dickey-Fuller test.

In summary, the Dickey-Fuller test is a standard tool for testing the presence of a unit root in a time series, and its design depends on which deterministic terms (none, a constant, or a constant and a trend) are appropriate for the time series under consideration.
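The three test regressions can be written directly with lm(). This is a sketch on a simulated pure random walk; the variable names are illustrative, and the resulting t-ratios must be compared with Dickey-Fuller critical values rather than with the p-values that lm() prints.

```r
set.seed(10)
y <- cumsum(rnorm(500))            # simulated pure random walk

dy    <- diff(y)                   # Delta y_t
ylag  <- y[-length(y)]             # y_{t-1}, aligned with dy
trend <- seq_along(dy)             # linear time trend

df_none  <- lm(dy ~ 0 + ylag)      # pure random walk: no constant, no trend
df_drift <- lm(dy ~ ylag)          # random walk with drift: constant
df_trend <- lm(dy ~ ylag + trend)  # drift and trend: constant + trend

# Dickey-Fuller statistic = t-ratio of the coefficient on y_{t-1}.
tau <- sapply(list(none = df_none, drift = df_drift, trend = df_trend),
              function(m) coef(summary(m))["ylag", "t value"])
tau
```

For the drift case, for example, the 5% critical value quoted in the ADF output of this report is -2.86, so the unit root null is rejected only if `tau["drift"]` falls below that value.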

#Question 3 #3i)

##           Length Class  Mode     
## coef         2   -none- numeric  
## sigma2       1   -none- numeric  
## var.coef     4   -none- numeric  
## mask         2   -none- logical  
## loglik       1   -none- numeric  
## aic          1   -none- numeric  
## arma         7   -none- numeric  
## residuals 1000   ts     numeric  
## call         3   -none- call     
## series       1   -none- character
## code         1   -none- numeric  
## n.cond       1   -none- numeric  
## nobs         1   -none- numeric  
## model       10   -none- list

After generating the random walk with drift data, we examined the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots, which showed significant autocorrelation at lag 1. Based on these plots, we manually selected the autoregressive (AR), differencing (d), and moving average (MA) orders, considering ARIMA(1,0,0) and ARIMA(1,0,1), and fitted the selected model to the data. The output above has the form produced by summary() on a fitted arima object: it lists the components stored in the fit rather than the estimates themselves. It shows that the model has two estimated coefficients (coef), 1000 residuals, and stored values for the log-likelihood (loglik), the AIC (aic) and the innovation variance (sigma2). The coefficient estimates and their standard errors, which are shown when the fitted object itself is printed, are what is needed to assess the goodness of fit and interpret the model.
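A short sketch of how the estimates and standard errors can be extracted from a fitted model, using stats::arima() on a simulated AR(1) series. The series and the AR coefficient of 0.6 are illustrative assumptions, not the lab's data or fitted model.

```r
set.seed(5)
x <- arima.sim(model = list(ar = 0.6), n = 1000)  # simulated AR(1), not the lab series

fit <- arima(x, order = c(1, 0, 0))               # ARIMA(1,0,0) with intercept

est <- coef(fit)                                  # named vector: ar1, intercept
se  <- sqrt(diag(vcov(fit)))                      # standard errors from var.coef
cbind(estimate = est, std.error = se, z = est / se)

fit$aic                                           # AIC, for comparing candidate orders
length(residuals(fit))                            # one residual per observation
```

Printing `fit` gives the same coefficients and standard errors in a compact form, whereas `summary(fit)` only lists the components of the object.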

#ib

## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression drift 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9576 -0.6574 -0.0358  0.7052  3.7552 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  0.036608   0.042441   0.863   0.3886  
## z.lag.1     -0.005630   0.003804  -1.480   0.1392  
## z.diff.lag1 -0.034632   0.031976  -1.083   0.2790  
## z.diff.lag2 -0.036183   0.032001  -1.131   0.2585  
## z.diff.lag3 -0.033907   0.031917  -1.062   0.2883  
## z.diff.lag4 -0.040592   0.031940  -1.271   0.2041  
## z.diff.lag5  0.010255   0.031933   0.321   0.7482  
## z.diff.lag6 -0.075684   0.031870  -2.375   0.0178 *
## z.diff.lag7  0.034281   0.031940   1.073   0.2834  
## z.diff.lag8 -0.001563   0.031930  -0.049   0.9610  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.034 on 981 degrees of freedom
## Multiple R-squared:  0.01505,    Adjusted R-squared:  0.006011 
## F-statistic: 1.665 on 9 and 981 DF,  p-value: 0.09298
## 
## 
## Value of test-statistic is: -1.48 1.1 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau2 -3.43 -2.86 -2.57
## phi1  6.43  4.59  3.78

The ADF test was performed on the random walk data with drift, using a test regression with a drift term and 8 lagged differences (the output has the format produced by ur.df() in the “urca” package). The test statistic (tau2) is -1.48, which is greater than the 5% critical value of -2.86, so we do not have enough evidence to reject the null hypothesis of a unit root. The phi1 statistic of 1.1 is below its 5% critical value of 4.59, so the joint null hypothesis of a unit root and no drift is not rejected either. Note that 0.09298 is the p-value of the F-statistic of the test regression, not of the ADF test. The results therefore suggest that the random walk data with drift is not stationary, as expected for a random walk.

#ii

In the ADF regression with 8 lagged differences, the test statistic of -1.48 lies above the 5% critical value of -2.86, so the null hypothesis of non-stationarity cannot be rejected. To assess the adequacy of the lags used in part (i), we can examine the ACF of the regression residuals over the first 20 lags. If significant autocorrelations remain within these 20 lags, the inclusion of additional lagged differences may be necessary. Conversely, if the residual autocorrelations all lie within the confidence bands, the lags included in part (i) are sufficient. In the regression output, only the sixth lagged difference is individually significant at the 5% level (p = 0.0178).
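The residual check described above can be sketched in base R by building the augmented regression by hand. The series is simulated (a random walk with an assumed drift of 0.01 and N(0, 1) shocks), so the numbers will not match the output of this report; only the lag count of 8 is taken from it.

```r
set.seed(11)
y <- cumsum(0.01 + rnorm(1000))      # simulated random walk with drift
k <- 8                               # number of lagged differences

dy   <- diff(y)
E    <- embed(dy, k + 1)             # col 1 = dy_t, cols 2..k+1 = dy_{t-1}, ..., dy_{t-k}
ylag <- y[(k + 1):(length(y) - 1)]   # y_{t-1}, aligned with E[, 1]

adf <- lm(E[, 1] ~ ylag + E[, -1])   # test regression with drift and k lags
res <- residuals(adf)

rho  <- acf(res, lag.max = 20, plot = FALSE)$acf[-1]  # residual ACF at lags 1..20
band <- 1.96 / sqrt(length(res))                      # approximate 95% band
which(abs(rho) > band)                                # lags outside the band, if any

Box.test(res, lag = 20, type = "Ljung-Box")           # joint test over 20 lags
```

With 1000 observations and 8 lags the regression uses 991 rows and 10 coefficients, which matches the 981 residual degrees of freedom in the output above.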

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       2       2       2       2       2       2

## AR terms in the best manually generated ARMA model for USD: 2

## AR terms in the best manually generated ARMA model for GBP: 1

## AR terms in the best manually generated ARMA model for EUR: 1

## AR terms in the best manually generated ARMA model for JPY: 4

## AR terms in the best manually generated ARMA model for ZAR: 2

## AR terms in the best auto.arima generated ARMA model for USD: 0

## AR terms in the best auto.arima generated ARMA model for GBP: 0

## AR terms in the best auto.arima generated ARMA model for EUR: 0

## AR terms in the best auto.arima generated ARMA model for JPY: 0

## AR terms in the best auto.arima generated ARMA model for ZAR: 0

In our analysis of the exchange rates, we compared the manually generated ARMA models with the best models selected by the auto.arima function. The number of AR terms in the best manually generated model varied by exchange rate: the USD rate had 2 AR terms, the GBP rate had 1 AR term, the EUR rate had 1 AR term, the JPY rate had 4 AR terms, and the ZAR rate had 2 AR terms. In contrast, auto.arima selected zero AR terms for all five exchange rates.
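A manual order search can be sketched as a small grid over AR and MA orders, keeping the model with the lowest AIC. This is only an assumed illustration of the manual approach on simulated data; the grid limits are arbitrary, and auto.arima in the forecast package uses its own search procedure, so the two need not agree.

```r
set.seed(3)
x <- arima.sim(model = list(ar = 0.5), n = 500)   # simulated AR(1) stand-in

grid <- expand.grid(p = 0:2, q = 0:2)             # candidate ARMA(p, q) orders

# Fit each candidate; orders that fail to converge get NA and are skipped.
grid$aic <- mapply(function(p, q) {
  tryCatch(arima(x, order = c(p, 0, q))$aic,
           error = function(e) NA_real_)
}, grid$p, grid$q)

best <- grid[which.min(grid$aic), ]               # lowest-AIC model
best
```

Differences between a search like this and an automated one typically come from the information criterion used and from how much extra complexity a small improvement in fit is allowed to justify.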

#iv

## Number of MA terms in the best auto.arima model for USD: 1

## Number of MA terms in the best auto.arima model for GBP: 0

## Number of MA terms in the best auto.arima model for EUR: 0

## Number of MA terms in the best auto.arima model for JPY: 0

## Number of MA terms in the best auto.arima model for ZAR: 0

## USD model:  A

## GBP model:  B

## EUR model:  B

## JPY model:  B

## ZAR model:  B

The analysis of the best auto.arima models suggests that the USD model exhibits conditional heteroskedasticity, while the GBP, EUR, JPY, and ZAR models do not display conditional heteroskedasticity.

## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be
## unreliable

## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be
## unreliable

## ARCH-LM Test Results for Random Walk Model:

## Conditional Heteroskedasticity: Yes

## P-values: 0 0 0.2841791 0.3394177 0.2403111 0.4524596 0.3513164 0.4557741 0.3452133 0 0.3340799 0.3330605 0.3102519 0.3050954 0.852891 0.8056125 5.53451e-05 0 0.3660293 0.3270716 0.9217567 0.5742172 0.5875057 0.7949088 3.776118e-299 0 0.2721443 0.2705241 0.7578943 0.7378442 0.0078406 0.119964 0.1417033 0 0.3551086 0.6865474 0.6094177 0 5.019325e-06 8.258956e-21 0.9825831 0

## 
## ARCH-LM Test Results for Random Walk Model with Drift:

## Conditional Heteroskedasticity: Yes

## P-values: 0 0 0.2841791 0.3394177 0.2403111 0.4524596 0.3513164 0.4557741 0.3452133 0 0.3340799 0.3330605 0.3102519 0.3050954 0.852891 0.8056125 5.53451e-05 0 0.3660293 0.3270716 0.9217567 0.5742172 0.5875057 0.7949088 3.776118e-299 0 0.2721443 0.2705241 0.7578943 0.7378442 0.0078406 0.119964 0.1417033 0 0.3551086 0.6865474 0.6094177 0 5.019325e-06 8.258956e-21 0.9825831 0

The ARCH-LM test reports conditional heteroskedasticity for both the random walk model and the random walk model with drift. The p-values are mixed: several are effectively zero, which rejects the null hypothesis of no ARCH effects in those cases, while many others are well above 0.05. This suggests that these models fail to adequately capture the volatility clustering and time-varying variance present in the return series. Two caveats apply: the reported p-values are identical for the two models, and the test regressions triggered "essentially perfect fit" warnings, so the results should be interpreted with caution. These findings point to the need for more sophisticated models, such as ARCH or GARCH models, which can better account for the volatility dynamics observed in financial time series.
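For reference, the ARCH-LM test regresses the squared residuals on their own lags and uses n times the R-squared of that regression as a chi-squared statistic. The sketch below implements this in base R; `arch_lm`, the lag order `q = 5` and the simulated ARCH(1) parameters are assumptions for illustration, not the code behind the output above.

```r
# ARCH-LM test: regress e_t^2 on q of its own lags; LM = n * R^2 ~ chi-squared(q).
arch_lm <- function(e, q = 5) {
  e2   <- e^2
  E    <- embed(e2, q + 1)             # col 1 = e2_t, cols 2..q+1 = lagged e2
  aux  <- lm(E[, 1] ~ E[, -1])         # auxiliary regression
  stat <- nrow(E) * summary(aux)$r.squared
  c(statistic = stat,
    p.value   = pchisq(stat, df = q, lower.tail = FALSE))
}

set.seed(99)
n <- 2000
e <- numeric(n)
e[1] <- rnorm(1)
for (t in 2:n) {                       # ARCH(1): variance depends on last squared shock
  e[t] <- rnorm(1) * sqrt(0.2 + 0.5 * e[t - 1]^2)
}

arch_lm(rnorm(n))                      # homoskedastic noise: usually not rejected
arch_lm(e)                             # ARCH(1) series: rejected
```

For a random walk model the residuals are simply the first differences of the series (minus their mean when a drift is included), so the test would be applied to `diff(y)` or its demeaned version.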

Kelvin Nyongesa

2023-05-13