In this problem set we’ll work out some properties of the least squares solution that we reviewed in the weekly readings. Consider the unsolvable system \(\mathbf{Ax} = \mathbf{b}\) as given below:
\[\begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 8 \\ 8 \\ 20 \end{bmatrix}\]
leastsquares <- function(A, b) {
r <- dim(A)[1]
c_A <- dim(A)[2]
c_b <- dim(b)[2]
# Create transpose of A
At <- matrix(A, nrow=c_A, ncol=r, byrow = T)
# Create AtA = t(A) %*% A
AtA <- matrix(NA, nrow=c_A, ncol=c_A)
for (i in 1:c_A) {
for (j in 1:c_A) {
AtA[i,j] <- sum(At[i,]*A[,j])
}
}
# Create Atb = t(A) %*% b
Atb <- matrix(NA, nrow=c_A, ncol=c_b)
for (i in 1:c_A) {
for (j in 1:c_b) {
Atb[i,j] <- sum(At[i,]*b[,j])
}
}
# Convert to list for output
leastsquares <- list("At" = At, "AtA" = AtA, "Atb" = Atb)
return(leastsquares)
}
a_i <- c(1,1,1,1,0,1,3,4)
b_i <- c(0,8,8,20)
A <- matrix(a_i, nrow=4); A
## [,1] [,2]
## [1,] 1 0
## [2,] 1 1
## [3,] 1 3
## [4,] 1 4
b <- matrix(b_i, nrow=4); b
## [,1]
## [1,] 0
## [2,] 8
## [3,] 8
## [4,] 20
matrices <- leastsquares(A,b)
\[\mathbf{A^T} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \end{bmatrix}, \quad \mathbf{A^TA} = \begin{bmatrix} 4 & 8 \\ 8 & 26 \end{bmatrix}, \quad \mathbf{A^Tb} = \begin{bmatrix} 36 \\ 112 \end{bmatrix}\]
matrices$At
## [,1] [,2] [,3] [,4]
## [1,] 1 1 1 1
## [2,] 0 1 3 4
matrices$AtA
## [,1] [,2]
## [1,] 4 8
## [2,] 8 26
matrices$Atb
## [,1]
## [1,] 36
## [2,] 112
x_hat <- solve(matrices$AtA) %*% matrices$Atb; x_hat
## [,1]
## [1,] 1
## [2,] 4
\begin{center}\textcolor{red}{ORDER CHANGED TO FACILITATE CODING}\end{center}
p <- A %*% x_hat; p
## [,1]
## [1,] 1
## [2,] 5
## [3,] 13
## [4,] 17
e <- b - p; e
## [,1]
## [1,] -1
## [2,] 3
## [3,] -5
## [4,] 3
e.sq <- e^2; e.sq
## [,1]
## [1,] 1
## [2,] 9
## [3,] 25
## [4,] 9
E <- sum(e.sq); E
## [1] 44
r <- dim(A)[1]
dot <- numeric(r + 1)
for (i in 1:(r + 1)) {
if (i==1) {
dot[i] <- sum(e %*% p[ ,1])
} else {
dot[i] <- sum(e %*% A[(i - 1), 1])
}
}
dot <- round(dot, 10); dot
## [1] 0 0 0 0 0
all.equal(dot, rep(0, (r + 1)))
## [1] TRUE
Verifying Conformity with Projection Matrix Properties \[\mathbf{P} = \mathbf{P^2} = \mathbf{P^T}, \quad \mathbf{p} = \mathbf{Pb}, \quad \mathbf{PA} = \mathbf{IA}\]
P = A %*% solve(matrices$AtA) %*% matrices$At; P
## [,1] [,2] [,3] [,4]
## [1,] 0.65 0.45 0.05 -0.15
## [2,] 0.45 0.35 0.15 0.05
## [3,] 0.05 0.15 0.35 0.45
## [4,] -0.15 0.05 0.45 0.65
all.equal(P, P %*% P, t(P))
## [1] TRUE
all.equal(p, P %*% b)
## [1] TRUE
all.equal(P%*%A, diag(nrow = dim(A)[1]) %*% A)
## [1] TRUE
p <- c(1,5,13,17)
b <- matrix(p, nrow=4)
matrices <- leastsquares(A,b)
x_hat <- solve(matrices$AtA) %*% matrices$Atb; x_hat
## [,1]
## [1,] 1
## [2,] 4
p <- A %*% x_hat
e <- round(b - p, 10); e
## [,1]
## [1,] 0
## [2,] 0
## [3,] 0
## [4,] 0
Consider the modified auto-mpg data (obtained from the UC Irvine Machine Learning dataset). This dataset contains 5 columns: displacement, horsepower, weight, acceleration, mpg. We are going to model mpg as a function of the other four variables.
Write an R markdown script that takes in the auto-mpg data, extracts an \(\mathbf{A}\) matrix from the first 4 columns and \(\mathbf{b}\) vector from the fifth (mpg) column. Using the least squares approach, your code should compute the best fitting solution. That is, find the best fitting equation that expresses mpg in terms of the other 4 variables. Finally, calculate the fitting error between the predicted mpg of your model and the actual mpg. Your script should be able to load in the 5 column data set, extract \(\mathbf{A}\) and \(\mathbf{b}\), and perform the rest of the calculations.
Adding test cases to demonstrate that your code is working will be very helpful.
url <- "https://raw.githubusercontent.com/jzuniga123/SPS/master/DATA%20605/auto-mpg.data"
auto.mpg <- as.matrix(read.table(url))
r <- nrow(auto.mpg)
c <- ncol(auto.mpg)
A <- auto.mpg[ ,1:(c-1)]; head(A)
## V1 V2 V3 V4
## [1,] 307 130 3504 12.0
## [2,] 350 165 3693 11.5
## [3,] 318 150 3436 11.0
## [4,] 304 150 3433 12.0
## [5,] 302 140 3449 10.5
## [6,] 429 198 4341 10.0
b <- as.matrix(auto.mpg[ ,c]); head(b)
## [,1]
## [1,] 18
## [2,] 15
## [3,] 18
## [4,] 16
## [5,] 17
## [6,] 15
# Using the function created in Problem set 1.
matrices <- leastsquares(A,b)
x_hat <- solve(matrices$AtA) %*% matrices$Atb; round(x_hat, 10)
## [,1]
## [1,] -0.030037938
## [2,] 0.157115685
## [3,] -0.006217883
## [4,] 1.997320955
\[mpg_i = -0.03\left(displacement_i\right) + 0.157\left(horsepower_i\right) + -0.006\left(weight_i\right) + 1.997\left(acceleration_i\right) + \varepsilon_i\]
e <- round(b - A %*% x_hat, 10); e
## [,1]
## [1,] 4.616218240
## [2,] -0.417359037
## [3,] 3.378826784
## [4,] -1.057678955
## [5,] 4.548869576
## [6,] 3.795990039
## [7,] 2.168546863
## [8,] 3.271103548
## [9,] -0.142844892
## [10,] 3.824436855
## [11,] 1.975971155
## [12,] 5.536161185
## [13,] 7.858731092
## [14,] -8.468590101
## [15,] -2.742699116
## [16,] -0.321690829
## [17,] -4.972739352
## [18,] -2.218717778
## [19,] 0.370436481
## [20,] -7.848905950
## [21,] -3.703824969
## [22,] -0.778050635
## [23,] -6.987689299
## [24,] 0.804756613
## [25,] -0.657722327
## [26,] -12.233178241
## [27,] -14.951848663
## [28,] -12.159299460
## [29,] -19.719209715
## [30,] 0.370436481
## [31,] 1.183711807
## [32,] -0.640753298
## [33,] 0.669964336
## [34,] -3.313786305
## [35,] -1.461226547
## [36,] 2.256278832
## [37,] -1.256842633
## [38,] 0.792408065
## [39,] 3.307368745
## [40,] -0.630130773
## [41,] -0.512012410
## [42,] 3.064125878
## [43,] 3.847729670
## [44,] 5.511997111
## [45,] -0.079401003
## [46,] -8.083454073
## [47,] 1.245193435
## [48,] 2.240084874
## [49,] -1.006113756
## [50,] 2.582061235
## [51,] -4.676970301
## [52,] 4.581320866
## [53,] -4.004617659
## [54,] 0.399417231
## [55,] -6.058761986
## [56,] -11.053764063
## [57,] -4.325840586
## [58,] -5.375792992
## [59,] -15.492501357
## [60,] -13.910196876
## [61,] -7.962108847
## [62,] 0.196570453
## [63,] 1.817495518
## [64,] -0.268175454
## [65,] 0.213082632
## [66,] 2.427055537
## [67,] -1.956865955
## [68,] 0.189423170
## [69,] -1.882177867
## [70,] -2.307838563
## [71,] -6.613731476
## [72,] -0.202331179
## [73,] -0.685001225
## [74,] -5.182284625
## [75,] -2.627473140
## [76,] -6.686469463
## [77,] -6.644874743
## [78,] -9.489197447
## [79,] -4.298171700
## [80,] -4.912626158
## [81,] -3.268903147
## [82,] -2.014808131
## [83,] 1.870147438
## [84,] -3.810741917
## [85,] -4.453818975
## [86,] -0.572944463
## [87,] -0.436751318
## [88,] -2.281862629
## [89,] -0.496856579
## [90,] 1.599135061
## [91,] 5.236600385
## [92,] -0.117511192
## [93,] -2.631272353
## [94,] -0.092034372
## [95,] -0.869559439
## [96,] -1.893587012
## [97,] -5.288394024
## [98,] -7.771640963
## [99,] -4.388236947
## [100,] -2.488267220
## [101,] 0.121118380
## [102,] -8.132509893
## [103,] 2.556090063
## [104,] 3.315277511
## [105,] 0.076845960
## [106,] -2.759801642
## [107,] -3.360905725
## [108,] -14.691043265
## [109,] -10.125639731
## [110,] -7.688229374
## [111,] -7.794805561
## [112,] -13.277332909
## [113,] -3.747384683
## [114,] -2.071663713
## [115,] 2.147729675
## [116,] -0.495878224
## [117,] -3.995060003
## [118,] -1.839559026
## [119,] -2.781122073
## [120,] -8.087952807
## [121,] 0.148765116
## [122,] -1.071059649
## [123,] -3.992430765
## [124,] -5.955752434
## [125,] -2.646401283
## [126,] -3.661823795
## [127,] -6.413682799
## [128,] -2.977980249
## [129,] -0.620391085
## [130,] -6.607532947
## [131,] -0.726962907
## [132,] -2.646724907
## [133,] -6.901363757
## [134,] -2.229195635
## [135,] 0.861021202
## [136,] 0.202603570
## [137,] 0.733982841
## [138,] -2.043332905
## [139,] -4.924766785
## [140,] -0.255197489
## [141,] -0.906524427
## [142,] 1.998633422
## [143,] 2.410360842
## [144,] -0.585586594
## [145,] 3.171585401
## [146,] -2.931439564
## [147,] 3.703596103
## [148,] -2.119172618
## [149,] -1.459178140
## [150,] 3.324876763
## [151,] -0.829419444
## [152,] -1.437140670
## [153,] -9.406810690
## [154,] -8.114529173
## [155,] 7.361395281
## [156,] 2.376410799
## [157,] 0.991595085
## [158,] 3.283042751
## [159,] -10.994433155
## [160,] -5.707010346
## [161,] -9.289332187
## [162,] -4.581865317
## [163,] -0.407629784
## [164,] 3.651182422
## [165,] -2.459846539
## [166,] 1.671892149
## [167,] -3.380753743
## [168,] -2.580991317
## [169,] -5.883397249
## [170,] 2.778864693
## [171,] -2.585241246
## [172,] -5.795651024
## [173,] -2.510725255
## [174,] 4.786862316
## [175,] -2.160444202
## [176,] -1.680464944
## [177,] -2.789804167
## [178,] -0.412235282
## [179,] 0.210419117
## [180,] 3.614304178
## [181,] 2.064499167
## [182,] -4.192993732
## [183,] 0.982980946
## [184,] -4.799676112
## [185,] 0.125997462
## [186,] 4.908579307
## [187,] 2.072468583
## [188,] 2.650141828
## [189,] 1.804400437
## [190,] 2.390640336
## [191,] 4.899745175
## [192,] 0.856631667
## [193,] -0.642290022
## [194,] -8.303924359
## [195,] -12.668517651
## [196,] 4.387398125
## [197,] 3.814036274
## [198,] -1.604122808
## [199,] -6.466565426
## [200,] -0.965657065
## [201,] -5.374814890
## [202,] 8.238787033
## [203,] 1.974257553
## [204,] -0.226522360
## [205,] 8.178286702
## [206,] -3.892475729
## [207,] -2.881466334
## [208,] -14.630479499
## [209,] -6.022653492
## [210,] -6.910455812
## [211,] 1.799198675
## [212,] 1.977167792
## [213,] -4.250189174
## [214,] -5.629631436
## [215,] -0.475015755
## [216,] 4.604143880
## [217,] 3.457753933
## [218,] -2.575821403
## [219,] 3.593917013
## [220,] 3.038670588
## [221,] -5.177354952
## [222,] 0.649028193
## [223,] 0.592143156
## [224,] -3.142356619
## [225,] -1.516858374
## [226,] -2.734698349
## [227,] -5.418913472
## [228,] 3.803555275
## [229,] 2.431635367
## [230,] 0.188223082
## [231,] 1.126447816
## [232,] 2.760195577
## [233,] 0.289412282
## [234,] -5.137732960
## [235,] 1.294611712
## [236,] 2.343851400
## [237,] 4.547819946
## [238,] 1.973363024
## [239,] 6.613594685
## [240,] -0.312495921
## [241,] -1.547347477
## [242,] -3.430881662
## [243,] 7.661958594
## [244,] 11.104850220
## [245,] 0.567414603
## [246,] 6.675974657
## [247,] 7.842636851
## [248,] 0.391839773
## [249,] 3.815024518
## [250,] 4.064510894
## [251,] -6.726729452
## [252,] -5.152223237
## [253,] -0.268893849
## [254,] 1.633029877
## [255,] -1.479614452
## [256,] -2.166126276
## [257,] 0.500389946
## [258,] -0.813605003
## [259,] -6.765354980
## [260,] -1.960659838
## [261,] 0.511409197
## [262,] -6.628818456
## [263,] 2.890697163
## [264,] 3.061533642
## [265,] 2.703593276
## [266,] 4.154916337
## [267,] 0.474805862
## [268,] 7.175032179
## [269,] -3.723280967
## [270,] -4.898399922
## [271,] -2.419897659
## [272,] -2.571780625
## [273,] -6.108740196
## [274,] -5.382689340
## [275,] -6.812669505
## [276,] -10.154892535
## [277,] 5.631667627
## [278,] 1.879503522
## [279,] -0.211252711
## [280,] -5.307017069
## [281,] -3.904839845
## [282,] -3.021463672
## [283,] -2.187630529
## [284,] -1.145540228
## [285,] 2.801047019
## [286,] 3.588442086
## [287,] -0.003340094
## [288,] 0.410234460
## [289,] 0.378496704
## [290,] 0.036322444
## [291,] 4.279591261
## [292,] 7.425094098
## [293,] 8.391783391
## [294,] 9.220287168
## [295,] 5.107268766
## [296,] 0.602010413
## [297,] 3.370176477
## [298,] -9.418377560
## [299,] -5.505913374
## [300,] 13.670591357
## [301,] 10.264251588
## [302,] -1.647733638
## [303,] 13.075942671
## [304,] 3.439929109
## [305,] 9.493938904
## [306,] 4.951103080
## [307,] 13.423589133
## [308,] 16.473448806
## [309,] 6.033595009
## [310,] 6.269056960
## [311,] 9.368585092
## [312,] 2.091011695
## [313,] -1.527054267
## [314,] -6.778531801
## [315,] -4.609115300
## [316,] 7.005713295
## [317,] 5.582877837
## [318,] 3.973617851
## [319,] 11.294384308
## [320,] 7.384647226
## [321,] 16.338431003
## [322,] 4.737421835
## [323,] 7.911875823
## [324,] 9.084282693
## [325,] 5.744111509
## [326,] 8.103907200
## [327,] 0.525310710
## [328,] 20.746755687
## [329,] 3.572510760
## [330,] 3.645187386
## [331,] 12.331683592
## [332,] 0.171851887
## [333,] 10.223609037
## [334,] 4.586225695
## [335,] 2.181993570
## [336,] 1.566068933
## [337,] 3.560706814
## [338,] 3.191324863
## [339,] 9.921614541
## [340,] 9.517947753
## [341,] 10.430325639
## [342,] 6.892738439
## [343,] 1.974544319
## [344,] 2.872997440
## [345,] 8.825211471
## [346,] 6.534948092
## [347,] 11.968223770
## [348,] 7.490169497
## [349,] -3.914784029
## [350,] 9.782628834
## [351,] 10.110482492
## [352,] 4.917453744
## [353,] 7.462359827
## [354,] 3.411139882
## [355,] -0.895491203
## [356,] 3.615728250
## [357,] 5.086570572
## [358,] 0.387024534
## [359,] 1.732437424
## [360,] 5.828647134
## [361,] -2.746059269
## [362,] -0.606860706
## [363,] -5.411836939
## [364,] -4.196890082
## [365,] 2.478121180
## [366,] 4.663864877
## [367,] 3.600423185
## [368,] -1.550550413
## [369,] -1.191160796
## [370,] 9.279820373
## [371,] 5.289557307
## [372,] 0.145966320
## [373,] 15.108078499
## [374,] 6.604968655
## [375,] 10.247845547
## [376,] 12.179660982
## [377,] 6.450422367
## [378,] 12.465027075
## [379,] 5.066902406
## [380,] 10.254778415
## [381,] -1.290257023
## [382,] 17.307567317
## [383,] 3.343348817
## [384,] -0.361075041
## [385,] 10.050254025
## [386,] 15.628614201
## [387,] 1.184419080
## [388,] 3.883048859
## [389,] 2.853659482
## [390,] 13.958522321
## [391,] -1.635813657
## [392,] -0.144356538