学号:1940802 姓名:刘睿奇
首先导入基本的模块。
library(fBasics)
Loading required package: timeDate
Loading required package: timeSeries
library(TTR)
Attaching package: 㤼㸱TTR㤼㸲
The following object is masked from 㤼㸱package:fBasics㤼㸲:
volatility
library(xts)
Loading required package: zoo
Attaching package: 㤼㸱zoo㤼㸲
The following object is masked from 㤼㸱package:timeSeries㤼㸲:
time<-
The following objects are masked from 㤼㸱package:base㤼㸲:
as.Date, as.Date.numeric
library(zoo)
library(quantmod)
Registered S3 method overwritten by 'quantmod':
method from
as.zoo.data.frame zoo
Version 0.4-0 included new data defaults. See ?getSymbols.
1、 1)试下载IBM公司股票每日的开盘价、收盘价、最高价、最低价;
getSymbols('IBM')
㤼㸱getSymbols㤼㸲 currently uses auto.assign=TRUE by default, but will
use auto.assign=FALSE in 0.5-0. You will still be able to use
㤼㸱loadSymbols㤼㸲 to automatically load data. getOption("getSymbols.env")
and getOption("getSymbols.auto.assign") will still be checked for
alternate defaults.
This message is shown once per session and may be disabled by setting
options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
[1] "IBM"
head(IBM)
IBM.Open IBM.High IBM.Low IBM.Close IBM.Volume IBM.Adjusted
2007-01-03 97.18 98.40 96.26 97.27 9196800 66.10009
2007-01-04 97.25 98.79 96.88 98.31 10524500 66.80686
2007-01-05 97.60 97.95 96.91 97.42 7221300 66.20204
2007-01-08 98.50 99.50 98.35 98.90 10340000 67.20779
2007-01-09 99.08 100.33 99.07 100.07 11108200 68.00288
2007-01-10 98.50 99.05 97.93 98.89 8744800 67.20101
chartSeries(IBM) #作图,非题目要求
IBM_ohlc <- IBM[,1:4] #将开盘价、收盘价、最高价、最低价存入变量IBM_ohlc
head(IBM_ohlc)
IBM.Open IBM.High IBM.Low IBM.Close
2007-01-03 97.18 98.40 96.26 97.27
2007-01-04 97.25 98.79 96.88 98.31
2007-01-05 97.60 97.95 96.91 97.42
2007-01-08 98.50 99.50 98.35 98.90
2007-01-09 99.08 100.33 99.07 100.07
2007-01-10 98.50 99.05 97.93 98.89
2)提取其中的收盘价,并画出其时间序列图;
IBM_c <- IBM[,4] #将收盘价存入变量IBM_c
head(IBM_c)
IBM.Close
2007-01-03 97.27
2007-01-04 98.31
2007-01-05 97.42
2007-01-08 98.90
2007-01-09 100.07
2007-01-10 98.89
chartSeries(IBM_c)
3)计算收盘价的每日简单收益率、对数收益率;
IBM.srr <- 1 / (IBM_c$IBM.Close / diff(IBM_c$IBM.Close) - 1) #简单收益率
IBM.lrr <- diff(log(IBM_c$IBM.Close)) #对数收益率
head(IBM.srr)
IBM.Close
2007-01-03 NA
2007-01-04 0.010691899
2007-01-05 -0.009052996
2007-01-08 0.015191994
2007-01-09 0.011830111
2007-01-10 -0.011791756
head(IBM.lrr)
IBM.Close
2007-01-03 NA
2007-01-04 0.010635145
2007-01-05 -0.009094223
2007-01-08 0.015077751
2007-01-09 0.011760682
2007-01-10 -0.011861830
4)画出收盘价对数收益率的时间序列图。
chartSeries(IBM.srr) #简单收益率时间序列图,非题目要求
chartSeries(IBM.lrr) #对数收益率时间序列图
读取数据:
foo <- read.table(file = 'C:\\Users\\Ensom\\Documents\\Rscript\\d-axp3dx-0111.txt', header = T, row.names = 1)
head(foo)
MEAN <- apply(foo, MARGIN = 2, FUN = mean)
SD <- apply(foo, MARGIN = 2, FUN = sd)
SKEW <- apply(foo, MARGIN = 2, FUN = skewness)
KURT <- apply(foo, MARGIN = 2, FUN = kurtosis) #函数本身就是超额峰度的意思
MAX <- apply(foo, MARGIN = 2, FUN = max)
MIN <- apply(foo, MARGIN = 2, FUN = min)
rbind(MEAN, SD, SKEW, KURT, MAX, MIN)
axp vw ew sp
MEAN 0.0005339487 0.0002240615 0.0006258233 0.0000942284
SD 0.0263684068 0.0136518353 0.0120803692 0.0137791170
SKEW 0.4597733639 -0.0983183709 -0.2474101701 0.0081519428
KURT 9.5920528473 7.9821338456 8.1084277351 8.5326673290
MAX 0.2064850000 0.1148890000 0.1074220000 0.1158000000
MIN -0.1759490000 -0.0897620000 -0.0782400000 -0.0903500000
foo_l <- log(foo + 1) #将简单收益率转换为对数收益率,并存入foo_l
MEAN <- apply(foo_l, MARGIN = 2, FUN = mean)
SD <- apply(foo_l, MARGIN = 2, FUN = sd)
SKEW <- apply(foo_l, MARGIN = 2, FUN = skewness)
KURT <- apply(foo_l, MARGIN = 2, FUN = kurtosis) #函数本身就是超额峰度的意思
MAX <- apply(foo_l, MARGIN = 2, FUN = max)
MIN <- apply(foo_l, MARGIN = 2, FUN = min)
rbind(MEAN, SD, SKEW, KURT, MAX, MIN)
axp vw ew sp
MEAN 0.0001880014 0.0001307504 0.0005525758 -7.486380e-07
SD 0.0262943876 0.0136703590 0.0120995505 1.379041e-02
SKEW 0.0209917876 -0.3003522828 -0.4273150537 -2.063569e-01
KURT 9.0204992951 7.8800820200 8.0177118109 8.322826e+00
MAX 0.1877111733 0.1087548484 0.1020347916 1.095716e-01
MIN -0.1935228578 -0.0940491752 -0.0814703930 -9.469537e-02
3、几何(X)和代数(Y)考试分数如下表。 X 79 75 77 73 78 81 76 72 70 Y 80 82 76 77 84 81 72 70 75
先在Excel中将原表的行列转置如下: X Y 79 80 75 82 77 76 73 77 78 84 81 81 76 72 72 70 70 75
读取数据:
da = read.table("clipboard",header = T) #需要先复制上面的表格
print(da)
(1)请建立由X估计Y的回归方程;
plot(da$X, da$Y) #散点图,非题目要求
z <- lm(da$Y~da$X)
print(z)
Call:
lm(formula = da$Y ~ da$X)
Coefficients:
(Intercept) da$X
18.1722 0.7833
故,回归方程为 Y == 18.1722 + 0.7833 * X
(2)检验回归方程的显著性;
summary(z)
Call:
lm(formula = da$Y ~ da$X)
Residuals:
Min 1Q Median 3Q Max
-5.7056 -2.4889 -0.0556 1.9944 5.0778
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.1722 30.6805 0.592 0.5723
da$X 0.7833 0.4051 1.934 0.0944 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 4.051 on 7 degrees of freedom
Multiple R-squared: 0.3482, Adjusted R-squared: 0.2551
F-statistic: 3.74 on 1 and 7 DF, p-value: 0.09439
X后有“.”标记,可知在0.90的置信度下是显著的,但在0.95的置信度下是不显著的
(3)几何得分为71分的学生,其代数分数为几何?
x1 = 71
y1 = 18.1722 + 0.7833 * x1
print(y1)
[1] 73.7865
故,估计其代数分数为74分
4、 1)试画出正态分布N(5,100)的密度函数图和分布函数图;
curve(dnorm(x, 5, sqrt(100)), xlim = c(5-30, 5+30), col = 'darkblue', lwd = 2) #密度函数图
curve(pnorm(x, 5, sqrt(100)), xlim = c(5-40, 5+40), col = 'darkred', lwd = 2) #分布函数图
2)试求出服从正态分布N(5,100)的随机数1000个,并画出直方图,上面套上正态分布拟合图和非参数核函数拟合图(用不同颜色);
N <- 1000 #随机数容量
rand_set <- rnorm(N, 5, 10) #产生随机数
hist(rand_set, xlim = c(min(rand_set)-5, max(rand_set))+5, probability = T, nclass = max(rand_set) - min(rand_set) + 1, col = 'lightblue', main = '1000 random number of N(5,100)') #直方图
se <- seq(-40,40,length.out = 1000)
ye <- dnorm(se, 5, 10)
lines(se, ye, col="darkblue", lwd = 2) #正态分布拟合图
lines(density(rand_set), col = 'red', lwd = 2) #核函数拟合图
legend('topright',lwd = 2, legend = c('N(5,100)','density'), col = c('darkblue','red')) #绘制图例
3)对随机数进行正态性检验,并给出结论。
normalTest(rand_set) #fBasics包进行Kolmogorov-Smirnov正态性检验
Title:
Shapiro - Wilk Normality Test
Test Results:
STATISTIC:
W: 0.999
P VALUE:
0.8579
Description:
Fri Jun 19 13:32:55 2020 by user: Ensom
W = 0.999 接近1,P VALUE=0.8579 > 0.05,不能拒绝随机数服从正态分布的假设。
此外还可以使用QQ图检验:
qqnorm(rand_set, main="Normality Check via QQ Plot") #QQ图
qqline(rand_set, col='red')
可见,随机数点基本都在qq线附近,比较符合正态分布。