課本第四章習題:第7題、第8題、第9題、第12題、第13題
4.7
Suppose that we wish to predict whether a given stock will issue a dividend this year (“Yes” or “No”) based on \(X\), last year’s percent profit. We examine a large number of companies and discover that the mean value of \(X\) for companies that issued a dividend was \(\overline{X}\) = 10, while the mean for those that didn’t was \(\overline{X}\) = 0. In addition, the variance of X for these two sets of companies was \(\sigma^2\) = 36. Finally, 80 % of companies issued dividends. Assuming that X follows a normal distribution, predict the probability that a company will issue a dividend this year given that its percentage profit was \(X\) = 4 last year.
Hint: Recall that the density function for a normal random variable is \[f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\] You will need to use Bayes’ theorem.
(4.12)
x = 4
mean1 = 10
mean2 = 0
variance = 36
prior1 = 0.8
prior2 = 0.2
#fk1 = 1/sqrt(2*prior*variance)*exp(-(x-mean)^2 / 2*variance);fk1
fk1 = (prior1*exp(-(x-mean1)^2 / 2*variance))/(prior1*exp(-(x-mean1)^2 / 2*variance) + prior2*exp(-(x-mean2)^2 / 2*variance)) ; fk1
[1] 1.803211e-156
4.8
Suppose that we take a data set, divide it into equally-sized training and test sets, and then try out two different classification procedures. First we use logistic regression and get an error rate of 20 % on the training data and 30 % on the test data. Next we use 1-nearest neigh- bors (i.e. K = 1) and get an average error rate (averaged over both test and training data sets) of 18 %. Based on these results, which method should we prefer to use for classification of new observations? Why?
logistic是線性的、paramatric的,knn是非線性、non-paramatric的,knn不論在test或train中error rate都比logistic小,代表這份資料是適合使用knn這類型的方式。
4.9
This problem has to do with odds.
- On average, what fraction of people with an odds of 0.37 of defaulting on their credit card payment will in fact default?
\[odds = \frac {P(X)}{1-P(X)}\]
p = 0.37/(1+0.37);p
[1] 0.270073
由odds算出約有27%的人會違約
- Suppose that an individual has a 16 % chance of defaulting on her credit card payment. What are the odds that she will de- fault?
odds = 0.16/(1-0.16);odds
[1] 0.1904762
4.12
This problem involves writing functions. (a) Write a function, Power(), that prints out the result of raising 2 to the 3rd power. In other words, your function should compute \(2^3\) and print out the results. Hint: Recall that x^a raises x to the power a. Use the print() function to output the result.
power <- function(x){
print (x^3)
}
Create a new function, Power2(), that allows you to pass any two numbers, x and a, and prints out the value of x^a. You can do this by beginning your function with the line
Power2=function(x,a){
You should be able to call your function by entering, for instance,
Power2(3,8)
on the command line. This should output the value of 38, namely, 6, 561.
power2 <- function(x,a){
print(x^a)
}
- Using the Power2() function that you just wrote, compute \(10^3\), \(8^17\), and \(131^3\).
power2(10,3)
[1] 1000
power2(8,17)
[1] 2.2518e+15
power2(131,3)
[1] 2248091
Now create a new function, Power3(), that actually returns the result x^a as an R object, rather than simply printing it to the screen. That is, if you store the value x^a in an object called result within your function, then you can simply return() this
return()
result, using the following line:
return(result)
The line above should be the last line in your function, before the } symbol.
power3 <- function(x,a){
result = x^a
return(result)
}
- Now using the Power3() function, create a plot of f(x) = \(x^2\). The x-axis should display a range of integers from 1 to 10, and the y-axis should display \(x^2\). Label the axes appropriately, and use an appropriate title for the figure. Consider displaying either the x-axis, the y-axis, or both on the log-scale. You can do this by using log=‘‘x’’, log=‘‘y’’, or log=‘‘xy’’ as arguments to the plot() function.
x = c(1:10)
plot(x,power3(x, 2),xlab="Log of x",ylab="Log of x^2", main = "Log of x^2 vs Log of x",log = "xy")

Create a function, PlotPower(), that allows you to create a plot of x against x^a for a fixed a and for a range of values of x. For instance, if you call
PlotPower (1:10,3)
then a plot should be created with an x-axis taking on values 1, 2, . . . , 10, and a y-axis taking on values \(1^3\), \(2^3\), . . . , \(10^3\).
PlotPower<- function(x,a){
yt = paste("Log of x^",a)
maint = paste(yt,"vs Log of x")
plot(x,power3(x, a),xlab="Log of x",ylab=yt, main = maint,log = "xy")
}
PlotPower(c(1:10),3)

4.13
Using the Boston data set, fit classification models in order to predict whether a given suburb has a crime rate above or below the median. Explore logistic regression, LDA, and KNN models using various sub- sets of the predictors. Describe your findings.
library(ISLR)
data(Boston)
summary(Boston$crim)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00632 0.08204 0.25651 3.61352 3.67708 88.97620
#median = 0.25651
Boston$below <- ifelse(Boston$crim > 0.25651,1,0)
Boston$below <- as.factor(Boston$below)
# below = 1 :高於平均 / 0 : 低於平均
library(corrplot)
Boston$below <- as.numeric(Boston$below)
cor <- cor(Boston)
corrplot(cor)

set.seed(144)
library(caTools)
split = sample.split(Boston$below,SplitRatio = 0.7)
TR = subset(Boston,split == TRUE)
TS = subset(Boston,split == FALSE)
glm1 <- glm(below~zn+indus+nox+age+dis+rad+tax+black+lstat,TR,family = "binomial")#correlation高的
pred = predict(glm1, newdata=TS, type="response")
x = table(actual = TS$below,predict = pred > 0.5);x
predict
actual FALSE TRUE
0 67 9
1 9 67
acc = sum(diag(x))/sum(x);acc
[1] 0.8815789
library(MASS)
lda1 <- lda(below~zn+indus+nox+age+dis+rad+tax+black+lstat,TR)
pred <- predict(lda1, TS)
x <- table(actual = TS$below , pred = pred$class);x
pred
actual 0 1
0 68 8
1 16 60
acc = sum(diag(x))/sum(x) ; acc
[1] 0.8421053
library(class)
TR.X <- as.matrix(TR$below)
TS.X <- as.matrix(TS$below)
set.seed(1)
pred <- knn(TR.X,TS.X,TR$below, k = 1)
x <- table(pred = pred,actual = TS$below);x
actual
pred 0 1
0 76 0
1 0 76
acc = sum(diag(x))/sum(x) ; acc
[1] 1
在同樣的predictor組合下,準確率是 knn(k=1) > logistic > lda,代表這份資料適合使用non-parametric,非線性的方法
---
title: "Statistical learning HW8"
author: "M074020010 陳韻卉"
output:
  html_notebook: default
  word_document: default
---

課本第四章習題：第7題、第8題、第9題、第12題、第13題


## 4.7
Suppose that we wish to predict whether a given stock will issue a
dividend this year (“Yes” or “No”) based on $X$, last year’s percent
proﬁt. We examine a large number of companies and discover that the
mean value of $X$ for companies that issued a dividend was $\overline{X}$ = 10,
while the mean for those that didn’t was $\overline{X}$ = 0. In addition, the variance of X for these two sets of companies was $\sigma^2$ = 36. Finally, 80 % of companies issued dividends. Assuming that X follows a normal distribution, predict the probability that a company will issue
a dividend this year given that its percentage proﬁt was $X$ = 4 last
year.

Hint: Recall that the density function for a normal random variable
is 
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$ 
You will need to use Bayes’ theorem.<br>

![](./1.PNG)
(4.12)

```{r}
x = 4
mean1 = 10
mean2 = 0
variance = 36
prior1 = 0.8
prior2 = 0.2
#fk1 = 1/sqrt(2*prior*variance)*exp(-(x-mean)^2 / 2*variance);fk1
fk1 = (prior1*exp(-(x-mean1)^2 / 2*variance))/(prior1*exp(-(x-mean1)^2 / 2*variance) + prior2*exp(-(x-mean2)^2 / 2*variance)) ; fk1
```


## 4.8

Suppose that we take a data set, divide it into equally-sized training
and test sets, and then try out two diﬀerent classiﬁcation procedures.
First we use logistic regression and get an error rate of 20 % on the
training data and 30 % on the test data. Next we use 1-nearest neigh-
bors (i.e. K = 1) and get an average error rate (averaged over both
test and training data sets) of 18 %. Based on these results, which
method should we prefer to use for classiﬁcation of new observations?
Why?

logistic是線性的、paramatric的，knn是非線性、non-paramatric的，knn不論在test或train中error rate都比logistic小，代表這份資料是適合使用knn這類型的方式。

## 4.9
This problem has to do with odds.

(a) On average, what fraction of people with an odds of 0.37 of
defaulting on their credit card payment will in fact default?

$$odds = \frac {P(X)}{1-P(X)}$$

```{r}
p = 0.37/(1+0.37);p
```
由odds算出約有27%的人會違約

(b) Suppose that an individual has a 16 % chance of defaulting on
her credit card payment. What are the odds that she will de-
fault?

```{r}
odds = 0.16/(1-0.16);odds
```


## 4.12

 This problem involves writing functions.
(a) Write a function, Power(), that prints out the result of raising 2
to the 3rd power. In other words, your function should compute
$2^3$ and print out the results.
Hint: Recall that x^a raises x to the power a. Use the print()
function to output the result.

```{r}
power <- function(x){
  print (x^3)
}
```


(b) Create a new function, Power2(), that allows you to pass any
two numbers, x and a, and prints out the value of x^a. You can
do this by beginning your function with the line

    Power2=function(x,a){
    
You should be able to call your function by entering, for instance,

    Power2(3,8)

on the command line. This should output the value of 38, namely,
6, 561.

```{r}
power2 <- function(x,a){
  print(x^a)
}
```

(c) Using the Power2() function that you just wrote, compute $10^3$,
$8^17$, and $131^3$.

```{r}
power2(10,3)
power2(8,17)
power2(131,3)
```

(d) Now create a new function, Power3(), that actually returns the
result x^a as an R object, rather than simply printing it to the
screen. That is, if you store the value x^a in an object called
result within your function, then you can simply return() this

    return()

result, using the following line:

    return(result)

The line above should be the last line in your function, before
the } symbol.

```{r}
power3 <- function(x,a){
  result = x^a
  return(result)
}
```

(e) Now using the Power3() function, create a plot of f(x) = $x^2$.
The x-axis should display a range of integers from 1 to 10, and
the y-axis should display $x^2$. Label the axes appropriately, and
use an appropriate title for the ﬁgure. Consider displaying either
the x-axis, the y-axis, or both on the log-scale. You can do this
by using log=‘‘x’’, log=‘‘y’’, or log=‘‘xy’’ as arguments to
the plot() function.

```{r}
x = c(1:10)
plot(x,power3(x, 2),xlab="Log of x",ylab="Log of x^2", main = "Log of x^2 vs Log of x",log = "xy")

```


(f) Create a function, PlotPower(), that allows you to create a plot
of x against x^a for a ﬁxed a and for a range of values of x. For
instance, if you call

    PlotPower (1:10,3)

then a plot should be created with an x-axis taking on values
1, 2, . . . , 10, and a y-axis taking on values $1^3$, $2^3$, . . . , $10^3$.

```{r}
PlotPower<- function(x,a){
  yt = paste("Log of x^",a)
  maint = paste(yt,"vs Log of x")
  plot(x,power3(x, a),xlab="Log of x",ylab=yt, main = maint,log = "xy")
}
```


```{r}
PlotPower(c(1:10),3)
```

## 4.13

Using the Boston data set, ﬁt classiﬁcation models in order to predict
whether a given suburb has a crime rate above or below the median.
Explore logistic regression, LDA, and KNN models using various sub-
sets of the predictors. Describe your ﬁndings.

```{r}
library(ISLR)
data(Boston)

summary(Boston$crim)

```

```{r}
#median = 0.25651
Boston$below <- ifelse(Boston$crim > 0.25651,1,0)
# below = 1 :高於平均 / 0 : 低於平均
```

```{r}
library(corrplot)
Boston$below <- as.numeric(Boston$below)
cor <- cor(Boston)
corrplot(cor)
```


```{r}
set.seed(144)
library(caTools)
Boston$below <-  as.factor(Boston$below)
split = sample.split(Boston$below,SplitRatio = 0.7)
TR = subset(Boston,split == TRUE)
TS = subset(Boston,split == FALSE)
```

```{r}

glm1 <- glm(below~zn+indus+nox+age+dis+rad+tax+black+lstat,TR,family = "binomial")#correlation高的
```


```{r}
pred = predict(glm1, newdata=TS, type="response")
x = table(actual = TS$below,predict = pred > 0.5);x
acc = sum(diag(x))/sum(x);acc
```


```{r}
library(MASS)
lda1 <- lda(below~zn+indus+nox+age+dis+rad+tax+black+lstat,TR)
```


```{r}
pred <- predict(lda1, TS)
x <- table(actual = TS$below , pred = pred$class);x
acc = sum(diag(x))/sum(x) ; acc
```

```{r}
library(class)
TR.X <- as.matrix(TR$below)
TS.X <- as.matrix(TS$below)
set.seed(1)
pred <- knn(TR.X,TS.X,TR$below, k = 1)
x <- table(pred = pred,actual = TS$below);x
acc = sum(diag(x))/sum(x) ; acc
```

在同樣的predictor組合下，準確率是 knn(k=1) > logistic > lda，代表這份資料適合使用non-parametric,非線性的方法

