1. Problem set 1

In this problem set we’ll work out some properties of the least squares solution that we reviewed in the weekly readings. Consider the unsolvable system \(\mathbf{Ax} = \mathbf{b}\) as given below:

\[\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 8 \\ 8 \\ 20 \end{bmatrix}\]

  1. Write R markdown script to compute \(\mathbf{A^TA}\) and \(\mathbf{A^Tb}\).
leastsquares <- function(A, b) {
  r <- dim(A)[1]
  c_A <- dim(A)[2]
  c_b <- dim(b)[2]
  # Create transpose of A
  At <- matrix(A, nrow=c_A, ncol=r, byrow = T)
  # Create AtA = t(A) %*% A
  AtA <- matrix(NA, nrow=c_A, ncol=c_A)
  for (i in 1:c_A) {
    for (j in 1:c_A) {
      AtA[i,j] <- sum(At[i,]*A[,j])
    }
  }
  # Create Atb = t(A) %*% b
  Atb <- matrix(NA, nrow=c_A, ncol=c_b)
  for (i in 1:c_A) {
    for (j in 1:c_b) {
      Atb[i,j] <- sum(At[i,]*b[,j])
    }
  }
  # Convert to list for output
  leastsquares <- list("At" = At, "AtA" = AtA, "Atb" = Atb)
return(leastsquares)
}
a_i <- c(1,1,1,1,0,1,3,4)
b_i <- c(0,8,8,20)
A <- matrix(a_i, nrow=4); A
##      [,1] [,2]
## [1,]    1    0
## [2,]    1    1
## [3,]    1    3
## [4,]    1    4
b <- matrix(b_i, nrow=4); b
##      [,1]
## [1,]    0
## [2,]    8
## [3,]    8
## [4,]   20
matrices <- leastsquares(A,b)

\[\mathbf{A^T} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \end{bmatrix}, \quad \mathbf{A^TA} = \begin{bmatrix} 4 & 8 \\ 8 & 26 \end{bmatrix}, \quad \mathbf{A^Tb} = \begin{bmatrix} 36 \\ 112 \end{bmatrix}\]

matrices$At
##      [,1] [,2] [,3] [,4]
## [1,]    1    1    1    1
## [2,]    0    1    3    4
matrices$AtA
##      [,1] [,2]
## [1,]    4    8
## [2,]    8   26
matrices$Atb
##      [,1]
## [1,]   36
## [2,]  112
  1. Solve for \(\mathbf{\hat{x}}\) in R using the above two computed matrices. \[\mathbf{\hat{x}} = \left(\mathbf{A^TA}\right)^{-1}\mathbf{A^Tb} = \begin{bmatrix} 1 \\ 4 \end{bmatrix}\]
x_hat <- solve(matrices$AtA) %*% matrices$Atb; x_hat
##      [,1]
## [1,]    1
## [2,]    4
\begin{center}\textcolor{red}{ORDER CHANGED TO FACILITATE CODING}\end{center}
  1. {5th Problem} Show that the error \(\mathbf{e} = \mathbf{b} - \mathbf{p} = [-1; 3;-5; 3]\). \[\mathbf{p} = \mathbf{A\hat{x}}\]
p <- A %*% x_hat; p
##      [,1]
## [1,]    1
## [2,]    5
## [3,]   13
## [4,]   17
e <- b - p; e
##      [,1]
## [1,]   -1
## [2,]    3
## [3,]   -5
## [4,]    3
  1. {3rd Problem} What is the squared error of this solution? \[\mathbf{E} = \sum\mathbf{e_i^{2}}\]
e.sq <- e^2; e.sq
##      [,1]
## [1,]    1
## [2,]    9
## [3,]   25
## [4,]    9
E <- sum(e.sq); E
## [1] 44
  1. {6th Problem} Show that the error \(\mathbf{e}\) is orthogonal to \(\mathbf{p}\) and to each of the columns of \(\mathbf{A}\). \[\mathbf{e \perp p} \Leftarrow \mathbf{e \cdot p} = 0, \quad \mathbf{e \perp a_i} \Leftarrow \mathbf{e \cdot a_i} = 0\]
r <- dim(A)[1]
dot <- numeric(r + 1)
for (i in 1:(r + 1)) {
    if (i==1) {
      dot[i] <- sum(e %*% p[ ,1])
    } else {
      dot[i] <- sum(e %*% A[(i - 1), 1])
    }
}
dot <- round(dot, 10); dot
## [1] 0 0 0 0 0
all.equal(dot, rep(0, (r + 1)))
## [1] TRUE

Verifying Conformity with Projection Matrix Properties \[\mathbf{P} = \mathbf{P^2} = \mathbf{P^T}, \quad \mathbf{p} = \mathbf{Pb}, \quad \mathbf{PA} = \mathbf{IA}\]

P = A %*% solve(matrices$AtA) %*% matrices$At; P
##       [,1] [,2] [,3]  [,4]
## [1,]  0.65 0.45 0.05 -0.15
## [2,]  0.45 0.35 0.15  0.05
## [3,]  0.05 0.15 0.35  0.45
## [4,] -0.15 0.05 0.45  0.65
all.equal(P, P %*% P, t(P))
## [1] TRUE
all.equal(p, P %*% b)
## [1] TRUE
all.equal(P%*%A, diag(nrow = dim(A)[1]) %*% A)
## [1] TRUE

  1. {4th Problem} Instead of \(\mathbf{b} = [0; 8; 8; 20]\), start with \(\mathbf{p} = [1; 5; 13; 17]\) and find the exact solution (i.e. show that this system is solvable as all equations are consistent with each other. This should result in an error vector \(\mathbf{e} = 0\)).
p <- c(1,5,13,17)
b <- matrix(p, nrow=4)
matrices <- leastsquares(A,b)
x_hat <- solve(matrices$AtA) %*% matrices$Atb; x_hat
##      [,1]
## [1,]    1
## [2,]    4
p <- A %*% x_hat
e <- round(b - p, 10); e
##      [,1]
## [1,]    0
## [2,]    0
## [3,]    0
## [4,]    0

2. Problem set 2

Consider the modified auto-mpg data (obtained from the UC Irvine Machine Learning dataset). This dataset contains 5 columns: displacement, horsepower, weight, acceleration, mpg. We are going to model mpg as a function of the other four variables.

Write an R markdown script that takes in the auto-mpg data, extracts an \(\mathbf{A}\) matrix from the first 4 columns and \(\mathbf{b}\) vector from the fifth (mpg) column. Using the least squares approach, your code should compute the best fitting solution. That is, find the best fitting equation that expresses mpg in terms of the other 4 variables. Finally, calculate the fitting error between the predicted mpg of your model and the actual mpg. Your script should be able to load in the 5 column data set, extract \(\mathbf{A}\) and \(\mathbf{b}\), and perform the rest of the calculations.

Adding test cases to demonstrate that your code is working will be very helpful.

  1. Download auto-mpg data.
url <- "https://raw.githubusercontent.com/jzuniga123/SPS/master/DATA%20605/auto-mpg.data"
auto.mpg <- as.matrix(read.table(url))
  1. Extract matrix \(\mathbf{A}\) from the first 4 columns and vector \(\mathbf{b}\) from the fifth (mpg) column
r <- nrow(auto.mpg)
c <- ncol(auto.mpg)
A <- auto.mpg[ ,1:(c-1)]; head(A)
##       V1  V2   V3   V4
## [1,] 307 130 3504 12.0
## [2,] 350 165 3693 11.5
## [3,] 318 150 3436 11.0
## [4,] 304 150 3433 12.0
## [5,] 302 140 3449 10.5
## [6,] 429 198 4341 10.0
b <- as.matrix(auto.mpg[ ,c]); head(b)
##      [,1]
## [1,]   18
## [2,]   15
## [3,]   18
## [4,]   16
## [5,]   17
## [6,]   15
  1. Using the least squares approach, compute the best fitting solution (the best fitting equation that expresses mpg in terms of the other 4 variables). \[mpg_i = \hat{x}_1displacement_i + \hat{x}_2horsepower_i + \hat{x}_3weight_i + \hat{x}_4acceleration_i + \varepsilon_i\]
# Using the function created in Problem set 1.
matrices <- leastsquares(A,b)
x_hat <- solve(matrices$AtA) %*% matrices$Atb; round(x_hat, 10)
##              [,1]
## [1,] -0.030037938
## [2,]  0.157115685
## [3,] -0.006217883
## [4,]  1.997320955

\[mpg_i = -0.03\left(displacement_i\right) + 0.157\left(horsepower_i\right) + -0.006\left(weight_i\right) + 1.997\left(acceleration_i\right) + \varepsilon_i\]

  1. Calculate the fitting error between the predicted mpg of your model and the actual mpg. \[\mathbf{e} = \mathbf{b} - \mathbf{A\hat{x}}\]
e <- round(b - A %*% x_hat, 10); e
##                 [,1]
##   [1,]   4.616218240
##   [2,]  -0.417359037
##   [3,]   3.378826784
##   [4,]  -1.057678955
##   [5,]   4.548869576
##   [6,]   3.795990039
##   [7,]   2.168546863
##   [8,]   3.271103548
##   [9,]  -0.142844892
##  [10,]   3.824436855
##  [11,]   1.975971155
##  [12,]   5.536161185
##  [13,]   7.858731092
##  [14,]  -8.468590101
##  [15,]  -2.742699116
##  [16,]  -0.321690829
##  [17,]  -4.972739352
##  [18,]  -2.218717778
##  [19,]   0.370436481
##  [20,]  -7.848905950
##  [21,]  -3.703824969
##  [22,]  -0.778050635
##  [23,]  -6.987689299
##  [24,]   0.804756613
##  [25,]  -0.657722327
##  [26,] -12.233178241
##  [27,] -14.951848663
##  [28,] -12.159299460
##  [29,] -19.719209715
##  [30,]   0.370436481
##  [31,]   1.183711807
##  [32,]  -0.640753298
##  [33,]   0.669964336
##  [34,]  -3.313786305
##  [35,]  -1.461226547
##  [36,]   2.256278832
##  [37,]  -1.256842633
##  [38,]   0.792408065
##  [39,]   3.307368745
##  [40,]  -0.630130773
##  [41,]  -0.512012410
##  [42,]   3.064125878
##  [43,]   3.847729670
##  [44,]   5.511997111
##  [45,]  -0.079401003
##  [46,]  -8.083454073
##  [47,]   1.245193435
##  [48,]   2.240084874
##  [49,]  -1.006113756
##  [50,]   2.582061235
##  [51,]  -4.676970301
##  [52,]   4.581320866
##  [53,]  -4.004617659
##  [54,]   0.399417231
##  [55,]  -6.058761986
##  [56,] -11.053764063
##  [57,]  -4.325840586
##  [58,]  -5.375792992
##  [59,] -15.492501357
##  [60,] -13.910196876
##  [61,]  -7.962108847
##  [62,]   0.196570453
##  [63,]   1.817495518
##  [64,]  -0.268175454
##  [65,]   0.213082632
##  [66,]   2.427055537
##  [67,]  -1.956865955
##  [68,]   0.189423170
##  [69,]  -1.882177867
##  [70,]  -2.307838563
##  [71,]  -6.613731476
##  [72,]  -0.202331179
##  [73,]  -0.685001225
##  [74,]  -5.182284625
##  [75,]  -2.627473140
##  [76,]  -6.686469463
##  [77,]  -6.644874743
##  [78,]  -9.489197447
##  [79,]  -4.298171700
##  [80,]  -4.912626158
##  [81,]  -3.268903147
##  [82,]  -2.014808131
##  [83,]   1.870147438
##  [84,]  -3.810741917
##  [85,]  -4.453818975
##  [86,]  -0.572944463
##  [87,]  -0.436751318
##  [88,]  -2.281862629
##  [89,]  -0.496856579
##  [90,]   1.599135061
##  [91,]   5.236600385
##  [92,]  -0.117511192
##  [93,]  -2.631272353
##  [94,]  -0.092034372
##  [95,]  -0.869559439
##  [96,]  -1.893587012
##  [97,]  -5.288394024
##  [98,]  -7.771640963
##  [99,]  -4.388236947
## [100,]  -2.488267220
## [101,]   0.121118380
## [102,]  -8.132509893
## [103,]   2.556090063
## [104,]   3.315277511
## [105,]   0.076845960
## [106,]  -2.759801642
## [107,]  -3.360905725
## [108,] -14.691043265
## [109,] -10.125639731
## [110,]  -7.688229374
## [111,]  -7.794805561
## [112,] -13.277332909
## [113,]  -3.747384683
## [114,]  -2.071663713
## [115,]   2.147729675
## [116,]  -0.495878224
## [117,]  -3.995060003
## [118,]  -1.839559026
## [119,]  -2.781122073
## [120,]  -8.087952807
## [121,]   0.148765116
## [122,]  -1.071059649
## [123,]  -3.992430765
## [124,]  -5.955752434
## [125,]  -2.646401283
## [126,]  -3.661823795
## [127,]  -6.413682799
## [128,]  -2.977980249
## [129,]  -0.620391085
## [130,]  -6.607532947
## [131,]  -0.726962907
## [132,]  -2.646724907
## [133,]  -6.901363757
## [134,]  -2.229195635
## [135,]   0.861021202
## [136,]   0.202603570
## [137,]   0.733982841
## [138,]  -2.043332905
## [139,]  -4.924766785
## [140,]  -0.255197489
## [141,]  -0.906524427
## [142,]   1.998633422
## [143,]   2.410360842
## [144,]  -0.585586594
## [145,]   3.171585401
## [146,]  -2.931439564
## [147,]   3.703596103
## [148,]  -2.119172618
## [149,]  -1.459178140
## [150,]   3.324876763
## [151,]  -0.829419444
## [152,]  -1.437140670
## [153,]  -9.406810690
## [154,]  -8.114529173
## [155,]   7.361395281
## [156,]   2.376410799
## [157,]   0.991595085
## [158,]   3.283042751
## [159,] -10.994433155
## [160,]  -5.707010346
## [161,]  -9.289332187
## [162,]  -4.581865317
## [163,]  -0.407629784
## [164,]   3.651182422
## [165,]  -2.459846539
## [166,]   1.671892149
## [167,]  -3.380753743
## [168,]  -2.580991317
## [169,]  -5.883397249
## [170,]   2.778864693
## [171,]  -2.585241246
## [172,]  -5.795651024
## [173,]  -2.510725255
## [174,]   4.786862316
## [175,]  -2.160444202
## [176,]  -1.680464944
## [177,]  -2.789804167
## [178,]  -0.412235282
## [179,]   0.210419117
## [180,]   3.614304178
## [181,]   2.064499167
## [182,]  -4.192993732
## [183,]   0.982980946
## [184,]  -4.799676112
## [185,]   0.125997462
## [186,]   4.908579307
## [187,]   2.072468583
## [188,]   2.650141828
## [189,]   1.804400437
## [190,]   2.390640336
## [191,]   4.899745175
## [192,]   0.856631667
## [193,]  -0.642290022
## [194,]  -8.303924359
## [195,] -12.668517651
## [196,]   4.387398125
## [197,]   3.814036274
## [198,]  -1.604122808
## [199,]  -6.466565426
## [200,]  -0.965657065
## [201,]  -5.374814890
## [202,]   8.238787033
## [203,]   1.974257553
## [204,]  -0.226522360
## [205,]   8.178286702
## [206,]  -3.892475729
## [207,]  -2.881466334
## [208,] -14.630479499
## [209,]  -6.022653492
## [210,]  -6.910455812
## [211,]   1.799198675
## [212,]   1.977167792
## [213,]  -4.250189174
## [214,]  -5.629631436
## [215,]  -0.475015755
## [216,]   4.604143880
## [217,]   3.457753933
## [218,]  -2.575821403
## [219,]   3.593917013
## [220,]   3.038670588
## [221,]  -5.177354952
## [222,]   0.649028193
## [223,]   0.592143156
## [224,]  -3.142356619
## [225,]  -1.516858374
## [226,]  -2.734698349
## [227,]  -5.418913472
## [228,]   3.803555275
## [229,]   2.431635367
## [230,]   0.188223082
## [231,]   1.126447816
## [232,]   2.760195577
## [233,]   0.289412282
## [234,]  -5.137732960
## [235,]   1.294611712
## [236,]   2.343851400
## [237,]   4.547819946
## [238,]   1.973363024
## [239,]   6.613594685
## [240,]  -0.312495921
## [241,]  -1.547347477
## [242,]  -3.430881662
## [243,]   7.661958594
## [244,]  11.104850220
## [245,]   0.567414603
## [246,]   6.675974657
## [247,]   7.842636851
## [248,]   0.391839773
## [249,]   3.815024518
## [250,]   4.064510894
## [251,]  -6.726729452
## [252,]  -5.152223237
## [253,]  -0.268893849
## [254,]   1.633029877
## [255,]  -1.479614452
## [256,]  -2.166126276
## [257,]   0.500389946
## [258,]  -0.813605003
## [259,]  -6.765354980
## [260,]  -1.960659838
## [261,]   0.511409197
## [262,]  -6.628818456
## [263,]   2.890697163
## [264,]   3.061533642
## [265,]   2.703593276
## [266,]   4.154916337
## [267,]   0.474805862
## [268,]   7.175032179
## [269,]  -3.723280967
## [270,]  -4.898399922
## [271,]  -2.419897659
## [272,]  -2.571780625
## [273,]  -6.108740196
## [274,]  -5.382689340
## [275,]  -6.812669505
## [276,] -10.154892535
## [277,]   5.631667627
## [278,]   1.879503522
## [279,]  -0.211252711
## [280,]  -5.307017069
## [281,]  -3.904839845
## [282,]  -3.021463672
## [283,]  -2.187630529
## [284,]  -1.145540228
## [285,]   2.801047019
## [286,]   3.588442086
## [287,]  -0.003340094
## [288,]   0.410234460
## [289,]   0.378496704
## [290,]   0.036322444
## [291,]   4.279591261
## [292,]   7.425094098
## [293,]   8.391783391
## [294,]   9.220287168
## [295,]   5.107268766
## [296,]   0.602010413
## [297,]   3.370176477
## [298,]  -9.418377560
## [299,]  -5.505913374
## [300,]  13.670591357
## [301,]  10.264251588
## [302,]  -1.647733638
## [303,]  13.075942671
## [304,]   3.439929109
## [305,]   9.493938904
## [306,]   4.951103080
## [307,]  13.423589133
## [308,]  16.473448806
## [309,]   6.033595009
## [310,]   6.269056960
## [311,]   9.368585092
## [312,]   2.091011695
## [313,]  -1.527054267
## [314,]  -6.778531801
## [315,]  -4.609115300
## [316,]   7.005713295
## [317,]   5.582877837
## [318,]   3.973617851
## [319,]  11.294384308
## [320,]   7.384647226
## [321,]  16.338431003
## [322,]   4.737421835
## [323,]   7.911875823
## [324,]   9.084282693
## [325,]   5.744111509
## [326,]   8.103907200
## [327,]   0.525310710
## [328,]  20.746755687
## [329,]   3.572510760
## [330,]   3.645187386
## [331,]  12.331683592
## [332,]   0.171851887
## [333,]  10.223609037
## [334,]   4.586225695
## [335,]   2.181993570
## [336,]   1.566068933
## [337,]   3.560706814
## [338,]   3.191324863
## [339,]   9.921614541
## [340,]   9.517947753
## [341,]  10.430325639
## [342,]   6.892738439
## [343,]   1.974544319
## [344,]   2.872997440
## [345,]   8.825211471
## [346,]   6.534948092
## [347,]  11.968223770
## [348,]   7.490169497
## [349,]  -3.914784029
## [350,]   9.782628834
## [351,]  10.110482492
## [352,]   4.917453744
## [353,]   7.462359827
## [354,]   3.411139882
## [355,]  -0.895491203
## [356,]   3.615728250
## [357,]   5.086570572
## [358,]   0.387024534
## [359,]   1.732437424
## [360,]   5.828647134
## [361,]  -2.746059269
## [362,]  -0.606860706
## [363,]  -5.411836939
## [364,]  -4.196890082
## [365,]   2.478121180
## [366,]   4.663864877
## [367,]   3.600423185
## [368,]  -1.550550413
## [369,]  -1.191160796
## [370,]   9.279820373
## [371,]   5.289557307
## [372,]   0.145966320
## [373,]  15.108078499
## [374,]   6.604968655
## [375,]  10.247845547
## [376,]  12.179660982
## [377,]   6.450422367
## [378,]  12.465027075
## [379,]   5.066902406
## [380,]  10.254778415
## [381,]  -1.290257023
## [382,]  17.307567317
## [383,]   3.343348817
## [384,]  -0.361075041
## [385,]  10.050254025
## [386,]  15.628614201
## [387,]   1.184419080
## [388,]   3.883048859
## [389,]   2.853659482
## [390,]  13.958522321
## [391,]  -1.635813657
## [392,]  -0.144356538