Q1. This problem uses the data set ford.csv on the book’s web site. The data were taken from the ford.s data set in R’s fEcofin package. This package is no longer on CRAN. This data set contains 2000 daily Ford returns from January 2, 1984, to December 31, 1991.

rm(list=ls())
install.packages("fEcofin", repos="http://R-Forge.R-project.org")
## Installing package into '/usr/local/lib/R/site-library'
## (as 'lib' is unspecified)
library(fEcofin)
data(ford.s , package = "fEcofin")
ford.s[,2] -> ford

(a) Find the sample mean, sample median, and standard deviation of the Ford returns.

mean(ford) -> mu
median(ford)
## [1] 0
var(ford)
## [1] 0.0003354601
sqrt(var(ford))
## [1] 0.01831557
sd(ford) -> sig
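
As a cross-check on part (a), the sketch below computes the same three summaries from their definitions and compares them with the built-in functions. It runs on simulated returns (`sim_returns` is a hypothetical stand-in, not the Ford data), so the numbers it prints are not the Ford results.

```r
# Simulated stand-in for the Ford returns (an assumption, not the real data)
set.seed(1)
sim_returns <- 0.01 * rt(2000, df = 6)

n <- length(sim_returns)
m <- sum(sim_returns) / n                            # sample mean
s <- sqrt(sum((sim_returns - m)^2) / (n - 1))        # sample sd, n - 1 denominator
med <- mean(sort(sim_returns)[c(n / 2, n / 2 + 1)])  # median when n is even

# Hand-computed values next to the built-ins
rbind(by_hand  = c(mean = m, median = med, sd = s),
      built_in = c(mean(sim_returns), median(sim_returns), sd(sim_returns)))
```

Note that `sd()` and `var()` both divide by n - 1, which is why `sqrt(var(ford))` and `sd(ford)` agree.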

(b) Create a normal plot of the Ford returns. Do the returns look normally distributed? If not, how do they differ from being normally distributed?

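ford.return = ford
n = length(ford.return) 
# returns run from January 2, 1984, to December 31, 1991, so the time axis ends at 1992
year_ford = 1984 + (1:n) * (1992 - 1984) / n 
plot(year_ford, ford.return, main = "Ford daily returns", xlab = "year", type = "l", ylab = "log return")
# normal plot (sample quantiles on the x-axis), as asked in (b)
qqnorm(ford.return, datax = TRUE, main = "Ford daily returns: normal plot")
qqline(ford.return, datax = TRUE)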
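
One way to put numbers on how a sample departs from normality is to compute its sample skewness and excess kurtosis, both of which are 0 for a normal distribution. This is a minimal sketch on simulated heavy-tailed data (`sim_returns` is a hypothetical stand-in, not the Ford returns), and the moment-based formulas used here are one common choice among several.

```r
# Simulated heavy-tailed stand-in for the returns (an assumption, not the real data)
set.seed(2)
sim_returns <- 0.01 * rt(2000, df = 6)

# Standardize, then take the third and fourth sample moments
z <- (sim_returns - mean(sim_returns)) / sd(sim_returns)
skew    <- mean(z^3)       # 0 for a symmetric distribution
ex_kurt <- mean(z^4) - 3   # 0 for a normal distribution, positive for heavy tails

c(skewness = skew, excess_kurtosis = ex_kurt)
```

Heavy tails show up as positive excess kurtosis and as points that bend away from the reference line at both ends of a normal plot.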

(c) Test for normality using the Shapiro–Wilk test. What is the p-value? Can you reject the null hypothesis of a normal distribution at 0.01?

sh = shapiro.test(ford) 
sh
## 
##  Shapiro-Wilk normality test
## 
## data:  ford
## W = 0.96388, p-value < 2.2e-16
# Optional visual check with a normal plot:
# qqnorm(ford, datax = TRUE, main = "Ford returns")
# qqline(ford, datax = TRUE)

# The Shapiro–Wilk test gives p-value < 2.2e-16, which is far below 0.01, so the data are not consistent with a normal distribution. We reject the null hypothesis of a normal distribution at the 0.01 level.
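
The decision rule can also be written out in code by extracting the p-value from the test object and comparing it with the significance level. The sketch below uses simulated heavy-tailed data (`sim_returns`, `alpha`, and `reject_normality` are illustrative names, not part of the original solution).

```r
# Simulated heavy-tailed stand-in for the returns (an assumption, not the real data)
set.seed(3)
sim_returns <- 0.01 * rt(2000, df = 3)

alpha  <- 0.01
sh_sim <- shapiro.test(sim_returns)   # shapiro.test() requires 3 <= n <= 5000

# Reject the null hypothesis of normality when the p-value is below alpha
reject_normality <- sh_sim$p.value < alpha
cat("W =", round(sh_sim$statistic, 4),
    " reject normality at", alpha, ":", reject_normality, "\n")
```

The printed `p-value < 2.2e-16` in the output above is a display floor; `sh$p.value` holds the numeric value used in the comparison.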

(d) Create several t-plots of the Ford returns using a number of choices of the degrees of freedom parameter (df). What value of df gives a plot that is as linear as possible? The returns include the return on Black Monday, October 19, 1987. Discuss whether or not to ignore that return when looking for the best choices of df.

matrix(ford) -> ford.matrix
n = dim(ford.matrix)[1]
q_grid = (1:n) / (n + 1) 
df_grid = c(1, 4, 6, 10, 20, 30) 
# matrix(ford) has no column names, so the plot title is set by hand
index.name = "Ford" 
#dev.new() 
par(mfrow = c(3, 2))
for(df in df_grid) 
{
  # sample quantiles on the x-axis, t quantiles on the y-axis
  qqplot(ford.matrix[,1], qt(q_grid, df),
         main = paste(index.name, ", df = ", df),
         xlab = "Ford returns", ylab = "t quantiles") 
  # reference line through the first and third quartiles
  abline(lm(qt(c(0.25, 0.75), df = df) ~ 
              quantile(ford.matrix[,1], c(0.25, 0.75))))
}

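# With 6 degrees of freedom the points follow the reference line most closely, so df = 6 gives the most linear t-plot.
# The Black Monday return is a single extreme point in the left tail. One observation should not drive the choice of df, so it is reasonable to judge linearity from the remaining points.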
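
Judging linearity by eye can be supplemented with a number: the correlation between the sorted sample and the matching t quantiles, which is closer to 1 when the t-plot is more linear. The sketch below is one possible approach, not the method used in the solution above. It runs on simulated data with one planted crash-like outlier (`sim_returns`, `qq_cor`, and the value -0.2 are all hypothetical), and it compares the correlations with and without the smallest observation.

```r
# Simulated stand-in for the returns (an assumption, not the real data)
set.seed(4)
sim_returns <- 0.01 * rt(2000, df = 6)
sim_returns[which.min(sim_returns)] <- -0.2   # planted outlier, hypothetical value

df_grid <- c(1, 4, 6, 10, 20, 30)

# Correlation between sorted data and t quantiles; closer to 1 = more linear plot
qq_cor <- function(x, df) {
  n <- length(x)
  cor(sort(x), qt((1:n) / (n + 1), df))
}

trimmed <- sim_returns[-which.min(sim_returns)]   # drop the most negative return
with_outlier    <- sapply(df_grid, function(df) qq_cor(sim_returns, df))
without_outlier <- sapply(df_grid, function(df) qq_cor(trimmed, df))

round(rbind(df = df_grid, with_outlier, without_outlier), 4)
```

Comparing the two rows shows how much a single extreme return moves the linearity measure for each df.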