1. Consider the dataset given by x = c(0.725, 0.429, -0.372, 0.863).
What value of mu minimizes sum((x - mu)^2)?
Answer: The least squares criterion sum((x - mu)^2) is minimized by
the empirical mean of x. Hence, we have
x <- c(0.725, 0.429, -0.372, 0.863)
mean(x)
## [1] 0.41125
Thus mu = 0.41125 minimizes sum((x - mu)^2) for the dataset
x = c(0.725, 0.429, -0.372, 0.863).
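As a quick numerical check of this answer, the sketch below minimizes the sum of squares directly with optimize() and compares the result with mean(x). The helper name sse and the search interval range(x) are choices made for this illustration, not part of the original solution.

```r
x <- c(0.725, 0.429, -0.372, 0.863)

# Sum of squared deviations of x from a candidate value mu
sse <- function(mu) sum((x - mu)^2)

# One-dimensional numerical minimization over the range of the data
opt <- optimize(sse, interval = range(x))

opt$minimum  # numerical minimizer, close to the empirical mean
mean(x)      # closed-form minimizer
```

The two values should agree up to the tolerance of optimize(), which supports the claim that the empirical mean is the least squares solution.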
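2. Reconsider the previous question. Suppose that weights were
given, w = c(2, 2, 1, 1), so that we wanted to minimize
sum(w * (x - mu)^2) for mu. What value would we obtain?
Answer: The weighted least squares criterion sum(w * (x - mu)^2) is
minimized by the weighted mean sum(w * x) / sum(w). Hence, we have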
x <- c(0.725, 0.429, -0.372, 0.863)
w <- c(2, 2, 1, 1)
sum(x * w) / sum(w)
## [1] 0.4665
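The formula used here follows from setting the derivative of sum(w * (x - mu)^2) with respect to mu to zero: this gives sum(w * (x - mu)) = 0, so mu = sum(w * x) / sum(w). The sketch below cross-checks the value in two other ways, with weighted.mean() and with an intercept-only weighted lm() fit. The variable names are chosen for illustration only.

```r
x <- c(0.725, 0.429, -0.372, 0.863)
w <- c(2, 2, 1, 1)

# Closed-form solution from the derivative condition
mu_formula <- sum(w * x) / sum(w)

# Built-in weighted mean
mu_wmean <- weighted.mean(x, w)

# Intercept-only weighted least squares: the fitted intercept is the minimizer
mu_lm <- unname(coef(lm(x ~ 1, weights = w)))

c(mu_formula, mu_wmean, mu_lm)  # all three should equal 0.4665
```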
3. Take the Galton dataset and obtain the regression through the origin
slope estimate where the centered parental height is the outcome and the
child’s height is the predictor.
library(UsingR)
## Warning: package 'UsingR' was built under R version 4.2.3
## Loading required package: MASS
## Loading required package: HistData
## Warning: package 'HistData' was built under R version 4.2.3
## Loading required package: Hmisc
## Warning: package 'Hmisc' was built under R version 4.2.3
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
data(Galton)
head(Galton)
## parent child
## 1 70.5 61.7
## 2 68.5 61.7
## 3 65.5 61.7
## 4 64.5 61.7
## 5 64.0 61.7
## 6 67.5 62.2
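Answer: Setting the parental height as the outcome y and the child’s
height as the predictor x, and centering both, we have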
y <- Galton$parent
x <- Galton$child
yc <- y - mean(y)  # centered outcome
xc <- x - mean(x)  # centered predictor
# slope of the regression through the origin, with yc as the outcome and xc as the predictor
sum(yc * xc) / sum(xc^2)
## [1] 0.3256475
Alternatively, we can fit the same model with lm(), dropping the intercept with - 1:
lm(formula = yc ~ xc -1)
##
## Call:
## lm(formula = yc ~ xc - 1)
##
## Coefficients:
## xc
## 0.3256
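The agreement between the hand-computed slope and lm() is not specific to Galton: once both variables are centered, the regression through the origin has the same slope as the ordinary regression with an intercept. The sketch below illustrates this on simulated data so that it runs without the UsingR package. The simulated heights, the seed, and the coefficients used to generate them are assumptions made for this example and are not the Galton values.

```r
set.seed(1)

# Simulated stand-in data (hypothetical, not the Galton measurements)
x <- rnorm(100, mean = 68, sd = 2.5)         # predictor
y <- 46 + 0.33 * x + rnorm(100, sd = 1.5)    # outcome

# Center both variables
xc <- x - mean(x)
yc <- y - mean(y)

# Slope of the regression through the origin, computed by hand
slope_origin <- sum(yc * xc) / sum(xc^2)

# Same slope from lm() without an intercept on the centered data
slope_lm_origin <- unname(coef(lm(yc ~ xc - 1)))

# Slope from the usual regression with an intercept on the raw data
slope_lm_full <- unname(coef(lm(y ~ x))[2])

all.equal(slope_origin, slope_lm_origin)
all.equal(slope_origin, slope_lm_full)
```

Both comparisons should return TRUE, since centering removes the intercept from the fitted line without changing its slope.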