1. Consider the dataset given by x = c(0.725, 0.429, -0.372, 0.863). What value of mu minimizes sum((x - mu)^2)?

Answer: Note that sum((x - mu)^2), the least squares criterion, is minimized by the empirical mean. Hence, we have

x<-c(0.725,0.429,-0.372 ,0.863)
mean(x)
## [1] 0.41125

Therefore mu = 0.41125 minimizes sum((x - mu)^2) for the data set x = c(0.725, 0.429, -0.372, 0.863).
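
The claim that the empirical mean minimizes the least squares criterion can be checked with a short derivation. This is a sketch in standard notation, where x_i are the data points and n = 4 is the sample size (symbols introduced here for illustration):

```latex
\begin{align*}
\frac{d}{d\mu}\sum_{i=1}^{n}(x_i-\mu)^2
  &= -2\sum_{i=1}^{n}(x_i-\mu) = 0 \\
\Rightarrow\quad \hat{\mu}
  &= \frac{1}{n}\sum_{i=1}^{n}x_i = \bar{x}
   = \frac{0.725+0.429-0.372+0.863}{4} = \frac{1.645}{4} = 0.41125
\end{align*}
```

The second derivative is 2n > 0, so this stationary point is a minimum.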

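2. Reconsider the previous question. Suppose that weights were given, w = c(2, 2, 1, 1), so that we wanted to minimize sum(w * (x - mu)^2) for mu. What value would we obtain?

Answer: The weighted least squares criterion is minimized by the weighted mean, sum(x * w) / sum(w). Hence, we have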

x<-c(0.725,0.429,-0.372 ,0.863)
w<-c(2, 2, 1, 1)
sum(x*w)/sum(w)
## [1] 0.4665
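
As a numerical sanity check, the weighted criterion can also be minimized directly and compared with the closed-form weighted mean. The following is a minimal sketch assuming only base R; the helper name wss and the choice of range(x) as the search interval are illustrative and not part of the original exercise.

```r
x <- c(0.725, 0.429, -0.372, 0.863)
w <- c(2, 2, 1, 1)

# Weighted sum of squares as a function of mu (hypothetical helper name)
wss <- function(mu) sum(w * (x - mu)^2)

# One-dimensional search; the minimizer lies within the range of the data
fit <- optimize(wss, interval = range(x))

mu_numeric <- fit$minimum           # numerical minimizer
mu_closed  <- sum(w * x) / sum(w)   # closed-form weighted mean

# Compare the two up to the optimizer's tolerance
all.equal(mu_numeric, mu_closed, tolerance = 1e-3)
```

Setting w to c(1, 1, 1, 1) in the same sketch reduces it to the unweighted problem from question 1.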

3. Take the Galton data set and obtain the regression through the origin slope estimate where the centered parental height is the outcome and the centered child’s height is the predictor.

library(UsingR)
## Warning: package 'UsingR' was built under R version 4.2.3
## Loading required package: MASS
## Loading required package: HistData
## Warning: package 'HistData' was built under R version 4.2.3
## Loading required package: Hmisc
## Warning: package 'Hmisc' was built under R version 4.2.3
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
data(Galton)
head(Galton)
##   parent child
## 1   70.5  61.7
## 2   68.5  61.7
## 3   65.5  61.7
## 4   64.5  61.7
## 5   64.0  61.7
## 6   67.5  62.2

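Setting the parental height as the outcome y and the child’s height as the predictor x, and centering both, we have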

y <- Galton$parent
x <- Galton$child
yc <- y - mean(y)  # centered outcome
xc <- x - mean(x)  # centered predictor
sum(yc * xc) / sum(xc^2)  # slope of the regression through the origin of yc on xc
## [1] 0.3256475

Alternatively, we can fit the same model with lm(), using -1 in the formula to remove the intercept:

lm(formula = yc ~ xc -1)
## 
## Call:
## lm(formula = yc ~ xc - 1)
## 
## Coefficients:
##     xc  
## 0.3256
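
The agreement between the hand computation and lm() follows from a short derivation of the regression-through-the-origin slope. This is a sketch in which \tilde{y}_i and \tilde{x}_i stand for the centered values yc and xc (notation introduced here for illustration):

```latex
\begin{align*}
\frac{d}{d\beta}\sum_{i=1}^{n}(\tilde{y}_i-\beta\,\tilde{x}_i)^2
  &= -2\sum_{i=1}^{n}\tilde{x}_i(\tilde{y}_i-\beta\,\tilde{x}_i) = 0 \\
\Rightarrow\quad \hat{\beta}
  &= \frac{\sum_{i=1}^{n}\tilde{x}_i\tilde{y}_i}{\sum_{i=1}^{n}\tilde{x}_i^{2}}
   = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}
\end{align*}
```

The last expression is the ordinary least squares slope for the uncentered data with an intercept, so centering both variables and regressing through the origin gives the same slope as the usual regression of parent on child.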