1. REVIEW OF ESSENTIAL CONCEPTS - 15 POINTS
(1) What is the rank of the following matrix?
\[\left[\begin{array}
{rrrr}
1 & -1 & 3 & -5\\
2 & 1 & 5 & -9 \\
6 & -1 & -2 & 4
\end{array}\right]
\]
library(Matrix)
A1 <- matrix(c(1,-1, 3, -5, 2, 1, 5, -9, 6, -1, -2, 4), nrow = 3, byrow = T)
print(A1)
## [,1] [,2] [,3] [,4]
## [1,] 1 -1 3 -5
## [2,] 2 1 5 -9
## [3,] 6 -1 -2 4
cat("The rank of the matrix is", qr(A1)$rank)
## The rank of the matrix is 3
(2) What is the transpose of the above matrix?
cat("The transpose of the matrix is:")
## The transpose of the matrix is:
print(t(A1))
## [,1] [,2] [,3]
## [1,] 1 2 6
## [2,] -1 1 -1
## [3,] 3 5 -2
## [4,] -5 -9 4
(3) Define orthonormal basis vectors. Please write down at least one orthonormal basis for the 3-dimensional vector space \(R^3\).
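Orthonormal basis vectors: 1) all have length = 1 (they have all been normalized, or turned into unit vectors); 2) are pairwise orthogonal (the dot product of any two distinct vectors = 0); and 3) span the space. Conditions 1 and 2 already imply linear independence, so any \(n\) orthonormal vectors in \(R^n\) form a basis.
An example of a set of orthonormal basis vectors for \(R^3\), where \(\frac{1}{\sqrt{2}} \approx 0.7071068\): \[\left\{(1,0,0),\ \left(0,\tfrac{1}{\sqrt{2}},\tfrac{1}{\sqrt{2}}\right),\ \left(0,\tfrac{1}{\sqrt{2}},-\tfrac{1}{\sqrt{2}}\right)\right\}\]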
library(far)
#Define a matrix containing 3 vectors
A3 <- matrix(c(2,1,1,0,1,2,0,1,0), nrow = 3, byrow = T)
#Orthonormalize the columns of the matrix
A3o <- orthonormalization(A3, basis = T, norm = T)
print(A3o)
## [,1] [,2] [,3]
## [1,] 1 0.0000000 0.0000000
## [2,] 0 0.7071068 0.7071068
## [3,] 0 0.7071068 -0.7071068
#Check for linear independence (determinant is non-zero)
det(A3o)
## [1] -1
#Separate into 3 vectors
A3o1 <- A3o[1,]
A3o2 <- A3o[2,]
A3o3 <- A3o[3,]
#Check that they are orthonormal: the matrix of pairwise dot products must be the identity
isTRUE(all.equal(A3o %*% t(A3o), diag(3)))
## [1] TRUE
(4) Given the following matrix, what is its characteristic polynomial?
\[\mathbf{A} = \left[\begin{array}
{rrr}
5 & 0 & 3\\
0 & 1 & -2 \\
1 & 2 & 0
\end{array}\right]
\]
Solution: \[\lambda^3 - 6\lambda^2 + 6\lambda - 17\]
library(pracma)
##
## Attaching package: 'pracma'
##
## The following objects are masked from 'package:Matrix':
##
## expm, lu, tril, triu
A4 <- matrix(c(5, 0, 3, 0, 1, -2, 1, 2, 0), nrow = 3, byrow = T)
print(A4)
## [,1] [,2] [,3]
## [1,] 5 0 3
## [2,] 0 1 -2
## [3,] 1 2 0
cp <- charpoly(A4, info = T)
## Error term: 0
cp$cp
## [1] 1 -6 6 -17
(5) What are its eigenvalues and eigenvectors?
eigen(A4)$values
## [1] 5.471264+0.000000i 0.264368+1.742772i 0.264368-1.742772i
eigen(A4)$vectors
## [,1] [,2] [,3]
## [1,] 0.98551404+0i 0.2583505-0.2761849i 0.2583505+0.2761849i
## [2,] -0.06924766+0i -0.6725516+0.0000000i -0.6725516+0.0000000i
## [3,] 0.15481227+0i -0.2473752+0.5860519i -0.2473752-0.5860519i
(6) Given a column stochastic matrix of links between URLs, what can you say about the PageRank of this set of URLs?
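Because the matrix is column stochastic, it has an eigenvalue of 1, and the (Google) PageRank of the web pages is a corresponding eigenvector with non-negative entries that sum to 1. On its own this vector need not be unique, for example when the link graph is disconnected. Applying decay (a damping factor alpha, commonly 0.85) makes every entry of the matrix positive, so the ranking becomes unique and can be found by repeatedly multiplying a starting vector by the decayed matrix until convergence. Decay is key because it simulates the randomness of web browsing in the actual population: with probability 1 - alpha the surfer jumps to a random page instead of following a link.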
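As a minimal sketch of the iteration described above, the snippet below applies decay to a small, made-up 3-page link matrix and multiplies until the rank vector stops changing. The matrix `L`, the value `alpha = 0.85`, and the helper name `pagerank_sketch` are illustrative assumptions, not part of the exam.

```r
# Hypothetical 3-page web: column j holds the out-link probabilities of page j,
# so every column sums to 1 (column stochastic).
L <- matrix(c(0,   0.5, 1,
              0.5, 0,   0,
              0.5, 0.5, 0), nrow = 3, byrow = TRUE)

pagerank_sketch <- function(L, alpha = 0.85, tol = 1e-10, maxit = 1000) {
  n <- nrow(L)
  # Decayed matrix: follow a link with probability alpha, jump uniformly otherwise
  G <- alpha * L + (1 - alpha) / n
  r <- rep(1 / n, n)                 # start from the uniform distribution
  for (i in seq_len(maxit)) {
    r_new <- as.vector(G %*% r)
    if (sum(abs(r_new - r)) < tol) break
    r <- r_new
  }
  r_new
}

r <- pagerank_sketch(L)
print(r)        # rank of each page
print(sum(r))   # the ranks remain a probability distribution
```

Because every column of the decayed matrix still sums to 1, the iterates stay a probability vector, and the positive entries are what make the limit independent of the starting vector.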
(7) Assume that we are repeatedly sampling sets of numbers (each set is of size n) from an unknown probability density function. What can we say about the average value of each set?
In accordance with the Central Limit Theorem, if the samples are independent and identically distributed with finite mean \(\mu\) and variance \(\sigma^2\), then as \(n\) grows large the average value of each set is approximately normally distributed around the mean of the original distribution, with variance \(\sigma^2/n\), regardless of the shape of the original distribution.
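A small simulation can illustrate this. The sketch below assumes an exponential distribution as the "unknown" density (any skewed distribution with finite variance would do), and the values of `n`, `reps`, and `rate` are arbitrary choices.

```r
set.seed(42)
n    <- 50      # size of each sampled set
reps <- 5000    # number of sets
rate <- 2       # exponential rate, so mean = 1/rate and sd = 1/rate

# Average of each set of n draws from a skewed (exponential) distribution
means <- replicate(reps, mean(rexp(n, rate = rate)))

# The averages centre on the true mean with spread close to sigma / sqrt(n)
print(c(mean_of_means = mean(means), true_mean = 1 / rate))
print(c(sd_of_means = sd(means), sigma_over_sqrt_n = (1 / rate) / sqrt(n)))
hist(means, breaks = 40, main = "Averages of sets of size 50")
```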
(8) What is the derivative of \(e^x\cos^2(x)\)?
Solution: \[e^x\cos(x)(\cos(x)-2\sin(x))\]
library(Deriv)
f <- function(x) (exp(x)*cos(x)^2)
g <- function(x) {}
body(g) <- Deriv(body(f), 'x')
print(body(g))
## {
## .e1 <- cos(x)
## .e1 * (.e1 - 2 * sin(x)) * exp(x)
## }
(9) What is the derivative of \(e^{x^3}\)?
Solution: \[3x^2e^{x^3}\]
f <- function(x) (exp(x^3))
g <- function(x) {}
body(g) <- Deriv(body(f), 'x')
print(body(g))
## 3 * (x^2 * exp(x^3))
(10) What is \(\int \left(e^x\cos(x) + \sin(x)\right)dx\)?
\[ \int e^x\cos(x)\,dx + \int \sin(x)\,dx\]
\[= \int e^x\cos(x)\,dx + (-\cos(x))\]
\[= \frac{1}{2}e^x(\sin(x)+\cos(x)) - \cos(x) + C\]
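The value of the first integral follows from integrating by parts twice. A short worked derivation, writing \(I = \int e^x\cos(x)\,dx\) (the symbol \(I\) is introduced here only for convenience):
\[I = e^x\sin(x) - \int e^x\sin(x)\,dx\]
\[= e^x\sin(x) - \left(-e^x\cos(x) + \int e^x\cos(x)\,dx\right)\]
\[= e^x\sin(x) + e^x\cos(x) - I\]
\[\Rightarrow I = \frac{1}{2}e^x(\sin(x)+\cos(x))\]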
2. MINI-CODING ASSIGNMENTS - 15 POINTS
2.1. Sampling from function. Assume that you have a function that generates integers between 0 and 20 with the following probability distribution: \(P(x = k) = \binom{20}{k} p^k q^{20-k}\), where \(p = 0.25\), \(q = 1 - p = 0.75\), and \(x \in [0, 20]\). This is also known as a Binomial Distribution. Write a function to sample from this distribution. After that, generate 1000 samples from this distribution and plot the histogram of the sample.
# p = 0.25, q = 0.75
# define the probability mass function P(x = k)
binomiald <- function(x, size, p) {
  dbinom(x, size, p)
}
# define the support: the integers 0..20
x <- seq(0, 20, by = 1)
# draw 1000 samples, weighting each value by its probability
mysample <- sample(x, 1000, replace = TRUE, prob = binomiald(x, 20, .25))
# plot
hist(mysample)
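An alternative way to sample from the same distribution is inverse-transform sampling, which needs only the probability formula and uniform random numbers. The following is a sketch under that assumption; the helper name `rbinom_sketch` is hypothetical.

```r
set.seed(1)
rbinom_sketch <- function(nsamples, size = 20, p = 0.25) {
  k   <- 0:size
  pmf <- choose(size, k) * p^k * (1 - p)^(size - k)   # P(x = k)
  cdf <- cumsum(pmf)                                  # P(x <= k)
  u   <- runif(nsamples)
  # findInterval(u, cdf) counts the cdf entries <= u, which equals the
  # smallest k with P(x <= k) > u; pmin guards against rounding in the last entry
  pmin(findInterval(u, cdf), size)
}

draws <- rbinom_sketch(1000)
print(table(draws))
hist(draws, breaks = seq(-0.5, 20.5, by = 1))
```

The sample mean of `draws` should land near the theoretical mean of 20 × 0.25 = 5.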

2.2 Principal Components Analysis. For the auto data set attached with the final exam, please perform a Principal Components Analysis by performing an SVD on the 4 independent variables (with mpg as the dependent variable) and select the top 2 directions. Please scatter plot the data set after it has been projected to these two dimensions. Your code should print out the two orthogonal vectors and also perform the scatter plot of the data after it has been projected to these two dimensions.
auto <- read.csv("auto-mpg.data", header = F, sep = "")
names(auto) <- c("displacement", "horsepower", "weight", "acceleration", "mpg")
head(auto)
## displacement horsepower weight acceleration mpg
## 1 307 130 3504 12.0 18
## 2 350 165 3693 11.5 15
## 3 318 150 3436 11.0 18
## 4 304 150 3433 12.0 16
## 5 302 140 3449 10.5 17
## 6 429 198 4341 10.0 15
A <- as.matrix(auto[,1:4])
head(A)
## displacement horsepower weight acceleration
## [1,] 307 130 3504 12.0
## [2,] 350 165 3693 11.5
## [3,] 318 150 3436 11.0
## [4,] 304 150 3433 12.0
## [5,] 302 140 3449 10.5
## [6,] 429 198 4341 10.0
mpg <- as.matrix(auto[,5])
head(mpg)
## [,1]
## [1,] 18
## [2,] 15
## [3,] 18
## [4,] 16
## [5,] 17
## [6,] 15
# Use sweep to subtract column means
cx <- sweep(A, 2, colMeans(A), "-")
s <- svd(cx)
head(s$u) # left singular vectors; the principal component scores are s$u %*% diag(s$d)
## [,1] [,2] [,3] [,4]
## [1,] -0.03170503 -0.06593017 -0.034068834 0.0562736981
## [2,] -0.04316499 -0.10275329 0.027934606 -0.0115557207
## [3,] -0.02783591 -0.09793801 0.015798400 0.0265439894
## [4,] -0.02756521 -0.08114003 0.028847024 0.0007645433
## [5,] -0.02846751 -0.07235841 0.001223887 0.0721217532
## [6,] -0.08179348 -0.11107760 0.046454703 -0.0134520011
print(s$d) # singular values
## [1] 16919.50904 769.10176 319.28976 33.69084
plot(s$d,type='b',pch=10,xlab='Singular value',ylab='magnitude')

head(s$v) # right singular vectors (principal directions)
## [,1] [,2] [,3] [,4]
## [1,] -0.114341470 -0.94619620 -0.302560557 -0.009791875
## [2,] -0.038967092 -0.29819647 0.949995562 -0.084076546
## [3,] -0.992676062 0.12074073 -0.002546427 0.003070351
## [4,] 0.001352834 0.03483225 -0.077194932 -0.996406457
# check using PCA function
pca <- prcomp(cx, center = FALSE, scale. = FALSE)
print(pca)
## Standard deviations:
## [1] 855.656351 38.895148 16.147177 1.703819
##
## Rotation:
## PC1 PC2 PC3 PC4
## displacement -0.114341470 -0.94619620 -0.302560557 -0.009791875
## horsepower -0.038967092 -0.29819647 0.949995562 -0.084076546
## weight -0.992676062 0.12074073 -0.002546427 0.003070351
## acceleration 0.001352834 0.03483225 -0.077194932 -0.996406457
# top 2 dimensions: rank-2 approximation of the centered data, expressed in the original 4 coordinates
u2dim <- (s$u[, 1:2])
v2dim <- (s$v[, 1:2])
d2dim <- (diag(s$d)[1:2, 1:2])
autoidim2 <- u2dim %*% d2dim %*% t(v2dim)
newauto <- (autoidim2)
colnames(newauto) <- c("displacement", "horsepower", "weight", "acceleration")
head(newauto)
## displacement horsepower weight acceleration
## [1,] 109.3154 36.02390 526.3823 -2.491945
## [2,] 158.2828 52.02465 715.4397 -3.740730
## [3,] 125.1230 40.81377 458.4259 -3.260859
## [4,] 112.3750 36.78279 455.4392 -2.804652
## [5,] 107.7300 35.36367 471.4094 -2.590050
## [6,] 239.0713 79.40169 1363.4550 -4.847912
pairs(~., data=A, main = "Auto Data Pre-PCA")

pairs(~., data=newauto, main = "Auto Data Post-PCA")

plot(A)

plot(newauto)

orthov1 <- s$v[,1]
orthov2 <- s$v[,2]
cat("orthogonal vector 1:", orthov1)
## orthogonal vector 1: -0.1143415 -0.03896709 -0.9926761 0.001352834
cat("orthogonal vector 2:", orthov2)
## orthogonal vector 2: -0.9461962 -0.2981965 0.1207407 0.03483225
cat("check:", round((orthov1 %*% orthov2), 3) == 0)
## check: TRUE
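The product `u2dim %*% d2dim %*% t(v2dim)` used above stays in the original four coordinates. The two-dimensional projection itself is the n × 2 matrix obtained by multiplying the centered data by the top two right singular vectors. A minimal sketch on synthetic data follows; the matrix `X` is made up, since the exam's data file is not reproduced here.

```r
set.seed(7)
# Synthetic stand-in for the 4 predictor columns (assumption: not the auto data)
X  <- matrix(rnorm(200 * 4), ncol = 4) %*% diag(c(10, 5, 1, 0.5))
Xc <- sweep(X, 2, colMeans(X), "-")     # center each column
s  <- svd(Xc)

V2     <- s$v[, 1:2]                    # top two principal directions (4 x 2)
scores <- Xc %*% V2                     # projected data (200 x 2)

# The same scores can be read off the SVD directly as U_2 %*% D_2
stopifnot(isTRUE(all.equal(scores, s$u[, 1:2] %*% diag(s$d[1:2]))))

print(V2)
plot(scores, xlab = "PC1", ylab = "PC2",
     main = "Data projected onto the top 2 directions")
```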
2.3. Sampling in Bootstrapping. As we discussed in class, in bootstrapping we start with n data points and repeatedly sample many times with replacement. Each time, we generate a candidate data set of size n from the original data set. All parameter estimations are performed on these candidate data sets. It can be easily shown that any particular data set generated by sampling n points from an original set of size n covers roughly 63.2% of the original data set. Using probability theory and limits, please prove that this is true. After that, write a program to perform this sampling and show that the empirical observation also agrees with this.
When sampling with replacement, the probability that a particular data point is not picked in a single draw is: \(1-\frac{1}{n}\)
The \(n\) draws are independent, so the probability that the point is never picked in a candidate data set of size \(n\) is: \(P = (1-\frac{1}{n})^n\)
As \(n\) grows large, \(\lim_{n\to\infty}(1-\frac{1}{n})^n = e^{-1} \approx 0.368\)
Therefore, the probability that a particular point is picked at least once is \(1-(1-\frac{1}{n})^n \approx 1 - e^{-1} \approx 0.632\).
Which means that, by linearity of expectation, a candidate data set is expected to contain \(\approx 63.2\%\) of the distinct points in the original data set.
bootstrapme <- function(n) {
  data <- 1:n
  # draw n points with replacement and report the fraction of distinct originals
  mysample <- sample(data, n, replace = TRUE)
  return(length(unique(mysample)) / n)
}
n <- 100000
bootstrapme(n)
## [1] 0.63254
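To see how quickly the finite-n value approaches the limit, the sketch below compares the average empirical coverage with \(1-(1-\frac{1}{n})^n\) for a few sizes. The sizes in `ns`, the 200 repetitions, and the helper name `coverage` are arbitrary choices made for illustration.

```r
set.seed(123)
# Fraction of distinct original points in one bootstrap sample of size n
coverage <- function(n) length(unique(sample.int(n, n, replace = TRUE))) / n

ns <- c(10, 100, 1000, 10000)
result <- data.frame(
  n         = ns,
  empirical = sapply(ns, function(n) mean(replicate(200, coverage(n)))),
  exact     = 1 - (1 - 1 / ns)^ns,   # expected coverage for finite n
  limit     = 1 - exp(-1)            # value as n grows without bound
)
print(result)
```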
3. MINI-PROJECT - 20 POINTS
# read in data
rawx <- read.csv("ex3x.dat", header = F, sep = "")
colnames(rawx) <- c("sqft", "bdrm")
head(rawx)
## sqft bdrm
## 1 2104 3
## 2 1600 3
## 3 2400 3
## 4 1416 2
## 5 3000 4
## 6 1985 4
rawy <- read.csv("ex3y.dat", header = F, sep = "")
colnames(rawy) <- c("price")
head(rawy)
## price
## 1 399900
## 2 329900
## 3 369000
## 4 232000
## 5 539900
## 6 299900
# X <- rawx
y <- as.matrix(rawy)
# standardize data
x <- scale(rawx, center = T, scale = T)
head(x)
## sqft bdrm
## [1,] 0.13000987 -0.2236752
## [2,] -0.50418984 -0.2236752
## [3,] 0.50247636 -0.2236752
## [4,] -0.73572306 -1.5377669
## [5,] 1.25747602 1.0904165
## [6,] -0.01973173 1.0904165
# # define variables
# sqft <- as.matrix(x[,1])
# bdrm <- as.matrix(x[,2])
# price <- as.matrix(y)
# combine datasets
data <- cbind(x, y)
head(data)
## sqft bdrm price
## [1,] 0.13000987 -0.2236752 399900
## [2,] -0.50418984 -0.2236752 329900
## [3,] 0.50247636 -0.2236752 369000
## [4,] -0.73572306 -1.5377669 232000
## [5,] 1.25747602 1.0904165 539900
## [6,] -0.01973173 1.0904165 299900
# ** GRADIENT DESCENT **
# number of observations
m <- nrow(x)
print(m)
## [1] 47
# add dummy variable to x data
x0 <- rep(1, m)
x <- as.matrix(cbind(x0, x))
# check variables
head(x)
## x0 sqft bdrm
## [1,] 1 0.13000987 -0.2236752
## [2,] 1 -0.50418984 -0.2236752
## [3,] 1 0.50247636 -0.2236752
## [4,] 1 -0.73572306 -1.5377669
## [5,] 1 1.25747602 1.0904165
## [6,] 1 -0.01973173 1.0904165
head(y)
## price
## [1,] 399900
## [2,] 329900
## [3,] 369000
## [4,] 232000
## [5,] 539900
## [6,] 299900
# define the gradient function dJ/dtheta = (1/m) * t(x) %*% (h(x) - y), where h(x) = x %*% t(theta)
grad <- function(x, y, theta) {
  m <- nrow(x)
  gradient <- (1/m) * (t(x) %*% ((x %*% t(theta)) - y))
  return(t(gradient))
}
# define gradient descent update algorithm; returns one row of theta per iteration
grad.descent <- function(x, y, maxit, alpha) {
  theta <- matrix(c(0, 0, 0), nrow = 1) # initialize the parameters
  theta_ret <- matrix(NA_real_, nrow = maxit, ncol = ncol(x), dimnames = list(NULL, colnames(x)))
  for (i in 1:maxit) {
    theta <- theta - alpha * grad(x, y, theta)
    theta_ret[i, ] <- theta
  }
  return(theta_ret)
}
alphaval <- c(.001, .01, .1, 1.)
par(mfrow = c(1, 1))
for (i in seq_along(alphaval)) {
  tab <- grad.descent(x, y, 100, alphaval[i])
  # one line per coefficient, plotted against the iteration number
  plot(tab[,1], type="b", ylim=c(min(tab), max(tab)), col="red", lty=1, ylab="Value", lwd=1.5, main = paste("Alpha =", alphaval[i]))
  lines(tab[,2], type="b", col="black", lty=1, lwd=1.5)
  lines(tab[,3], type="b", col="blue", lty=1, lwd=1.5)
  legend("topleft", c("Intercept", "Square Feet", "# Bedrooms"), lty=1, lwd=2, col=c("red", "black", "blue"))
  print(tab[100,])
}

## x0 sqft bdrm
## 32409.958 9839.144 4894.396

## x0 sqft bdrm
## 215810.62 61384.03 20273.55

## x0 sqft bdrm
## 340403.618 109912.678 -5931.109

## x0 sqft bdrm
## 340412.660 110631.050 -6649.474
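One way to judge a learning rate is to track the cost \(J(\theta) = \frac{1}{2m}\sum (h(x) - y)^2\) at every iteration and compare the final coefficients with the closed-form least-squares solution. The sketch below does this on synthetic data; the data, the true coefficients `c(3, 2, -1)`, and the choice `alpha = 0.1` are assumptions made for illustration.

```r
set.seed(2)
# Synthetic standardized predictors and response (assumption: not the housing data)
m <- 100
X <- cbind(x0 = 1, scale(matrix(rnorm(m * 2), ncol = 2)))
y <- X %*% c(3, 2, -1) + rnorm(m, sd = 0.1)

cost <- function(X, y, theta) sum((X %*% theta - y)^2) / (2 * nrow(X))

theta <- c(0, 0, 0)
alpha <- 0.1
J     <- numeric(200)
for (i in 1:200) {
  # same update rule as above: theta <- theta - alpha * gradient
  theta <- theta - alpha * as.vector(t(X) %*% (X %*% theta - y)) / m
  J[i]  <- cost(X, y, theta)
}

# Closed-form solution of the normal equations for comparison
theta_exact <- as.vector(solve(crossprod(X), crossprod(X, y)))
print(rbind(gradient_descent = theta, normal_equations = theta_exact))
plot(J, type = "l", xlab = "iteration", ylab = "J(theta)")
```

A cost curve that flattens out suggests the rate is adequate; one that is still falling steeply at the last iteration suggests more iterations or a larger rate.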
** LM FUNCTION **
# read in data
rawx <- read.csv("ex3x.dat", header = F, sep = "")
colnames(rawx) <- c("sqft", "bdrm")
head(rawx)
## sqft bdrm
## 1 2104 3
## 2 1600 3
## 3 2400 3
## 4 1416 2
## 5 3000 4
## 6 1985 4
rawy <- read.csv("ex3y.dat", header = F, sep = "")
colnames(rawy) <- c("price")
head(rawy)
## price
## 1 399900
## 2 329900
## 3 369000
## 4 232000
## 5 539900
## 6 299900
# X <- rawx
y <- rawy
# standardize data
x <- scale(rawx, center = T, scale = T)
head(x)
## sqft bdrm
## [1,] 0.13000987 -0.2236752
## [2,] -0.50418984 -0.2236752
## [3,] 0.50247636 -0.2236752
## [4,] -0.73572306 -1.5377669
## [5,] 1.25747602 1.0904165
## [6,] -0.01973173 1.0904165
# define variables
sqft <- as.matrix(x[,1])
bdrm <- as.matrix(x[,2])
price <- as.matrix(y)
# fit linear regression model
fit <- lm(price ~ sqft + bdrm)
summary_fit <- summary(fit)
print(summary_fit)
##
## Call:
## lm(formula = price ~ sqft + bdrm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -130582 -43636 -10829 43698 198147
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 340413 9637 35.323 < 2e-16 ***
## sqft 110631 11758 9.409 4.22e-12 ***
## bdrm -6650 11758 -0.566 0.575
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 66070 on 44 degrees of freedom
## Multiple R-squared: 0.7329, Adjusted R-squared: 0.7208
## F-statistic: 60.38 on 2 and 44 DF, p-value: 2.428e-13
library(scatterplot3d)
s3d <-scatterplot3d(sqft, bdrm, price, pch=16, highlight.3d=TRUE, type="h", main="3D Scatterplot")
s3d$plane3d(fit)

** OLS FUNCTION **
# read in data
rawx <- read.csv("ex3x.dat", header = F, sep = "")
colnames(rawx) <- c("sqft", "bdrm")
head(rawx)
## sqft bdrm
## 1 2104 3
## 2 1600 3
## 3 2400 3
## 4 1416 2
## 5 3000 4
## 6 1985 4
rawy <- read.csv("ex3y.dat", header = F, sep = "")
colnames(rawy) <- c("price")
head(rawy)
## price
## 1 399900
## 2 329900
## 3 369000
## 4 232000
## 5 539900
## 6 299900
y <- rawy
# standardize data
x <- scale(rawx, center = T, scale = T)
A <- as.matrix(cbind(x, 1))
colnames(A) <- c("sqft", "bdrm", "intercept")
head(A)
## sqft bdrm intercept
## [1,] 0.13000987 -0.2236752 1
## [2,] -0.50418984 -0.2236752 1
## [3,] 0.50247636 -0.2236752 1
## [4,] -0.73572306 -1.5377669 1
## [5,] 1.25747602 1.0904165 1
## [6,] -0.01973173 1.0904165 1
b <- as.matrix(y)
colnames(b) <- c("price")
head(b)
## price
## [1,] 399900
## [2,] 329900
## [3,] 369000
## [4,] 232000
## [5,] 539900
## [6,] 299900
# calculate t(A)*A
ATA <- t(A) %*% A
print(ATA)
## sqft bdrm intercept
## sqft 4.600000e+01 2.575849e+01 8.881784e-16
## bdrm 2.575849e+01 4.600000e+01 1.021405e-14
## intercept 8.881784e-16 1.021405e-14 4.700000e+01
# calculate t(A)*b
ATb <- t(A) %*% b
print(ATb)
## price
## sqft 4917748
## bdrm 2543813
## intercept 15999395
# solve the normal equations (t(A) %*% A) %*% xhat = t(A) %*% b for x-hat
xhat <- solve(ATA, ATb)
print(xhat)
## price
## sqft 110631.050
## bdrm -6649.474
## intercept 340412.660
# check using lsfit function
lsr <- lsfit(A, b, intercept = F)
coeffs <- lsr$coefficients
print(coeffs)
## sqft bdrm intercept
## 110631.050 -6649.474 340412.660
** Cross-validation **
library(stats)
library(boot)
# combine datasets
data <- cbind(x, y)
head(data)
## sqft bdrm price
## 1 0.13000987 -0.2236752 399900
## 2 -0.50418984 -0.2236752 329900
## 3 0.50247636 -0.2236752 369000
## 4 -0.73572306 -1.5377669 232000
## 5 1.25747602 1.0904165 539900
## 6 -0.01973173 1.0904165 299900
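# run 5-fold cross-validation 8 times
# (the formula does not use the loop index i, so each pass fits the same degree-1 model in sqft + bdrm)
set.seed(1)
cv.err <- c()
for(i in 1:8) {
  glm.fit <- glm(price ~ poly(sqft+bdrm), data=data)
  cv.err[i] <- cv.glm(data, glm.fit, K=5)$delta[1]
}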
# summary statistics
summary(glm.fit)
##
## Call:
## glm(formula = price ~ poly(sqft + bdrm), data = data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -153135 -65604 1909 63727 176776
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 340413 12515 27.200 < 2e-16 ***
## poly(sqft + bdrm) 622842 85800 7.259 4.21e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 7361699576)
##
## Null deviance: 7.1921e+11 on 46 degrees of freedom
## Residual deviance: 3.3128e+11 on 45 degrees of freedom
## AIC: 1205.2
##
## Number of Fisher Scoring iterations: 2
# plot cross-validation
degree <- 1:8
plot(degree, cv.err, type = "b")

par(mfrow = c(2, 2))
plot(glm.fit)
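For a cross-validation curve to vary with polynomial degree, the degree has to appear inside `poly()`. The sketch below shows one way this might look, using a hand-written 5-fold loop in base R on synthetic single-predictor data; the data, the quadratic signal, and the fold assignment are all assumptions made for illustration.

```r
set.seed(1)
# Synthetic data (assumption): a quadratic signal plus noise
n  <- 120
xs <- runif(n, -2, 2)
d  <- data.frame(x = xs, y = 1 + 2 * xs - 1.5 * xs^2 + rnorm(n, sd = 0.5))

K     <- 5
folds <- sample(rep(1:K, length.out = n))   # random fold label for each row

cv_mse <- sapply(1:8, function(deg) {
  fold_mse <- sapply(1:K, function(k) {
    train <- d[folds != k, ]
    test  <- d[folds == k, ]
    fit   <- lm(y ~ poly(x, deg), data = train)   # the degree changes with deg
    mean((test$y - predict(fit, newdata = test))^2)
  })
  mean(fold_mse)   # average held-out error over the K folds
})

print(round(cv_mse, 3))
plot(1:8, cv_mse, type = "b", xlab = "degree", ylab = "5-fold CV MSE")
```

With a quadratic signal, the held-out error would be expected to drop sharply from degree 1 to degree 2 and change little afterwards.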

** SGD Attempt **
library(sgd)
rawx <- read.csv("ex3x.dat", header = F, sep = "")
colnames(rawx) <- c("sqft", "bdrm")
head(rawx)
## sqft bdrm
## 1 2104 3
## 2 1600 3
## 3 2400 3
## 4 1416 2
## 5 3000 4
## 6 1985 4
rawy <- read.csv("ex3y.dat", header = F, sep = "")
colnames(rawy) <- c("price")
head(rawy)
## price
## 1 399900
## 2 329900
## 3 369000
## 4 232000
## 5 539900
## 6 299900
# X <- rawx
y <- rawy
# standardize data
x <- scale(rawx, center = T, scale = T)
head(x)
## sqft bdrm
## [1,] 0.13000987 -0.2236752
## [2,] -0.50418984 -0.2236752
## [3,] 0.50247636 -0.2236752
## [4,] -0.73572306 -1.5377669
## [5,] 1.25747602 1.0904165
## [6,] -0.01973173 1.0904165
# combine datasets
data <- cbind(x, y)
head(data)
## sqft bdrm price
## 1 0.13000987 -0.2236752 399900
## 2 -0.50418984 -0.2236752 329900
## 3 0.50247636 -0.2236752 369000
## 4 -0.73572306 -1.5377669 232000
## 5 1.25747602 1.0904165 539900
## 6 -0.01973173 1.0904165 299900
sgd.data <- sgd(x, y, model="lm")
sgd.data
## $coefficients
## sqft bdrm
## 11926.8714 -599.0125
##
## $residuals
## price <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 398215.4 335779.4 362873.0 239853.7 525555.4 300788.5 321769.9 207474.8
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 221181.2 249970.3 241562.9 346876.2 331526.1 664160.8 270762.1 446061.0
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 309194.5 211242.3 491521.5 584190.2 256273.1 255770.0 248719.4 261133.7
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 545411.2 263283.4 472510.6 460982.0 471874.6 290216.2 351405.4 183209.9
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 314963.1 562712.2 289399.9 258225.7 241197.3 343682.3 516420.7 285232.1
## <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 372631.8 326204.3 306153.9 310882.6 196218.2 302784.6 251337.6
##
## $fitted.values
## price <NA> <NA> <NA> <NA>
## 1684.59523 -5879.42311 6126.95521 -7853.73270 14344.58152
## <NA> <NA> <NA> <NA> <NA>
## -888.51097 -6869.94932 -8475.80242 -9181.17715 -7470.26824
## <NA> <NA> <NA> <NA> <NA>
## -1563.86975 123.76605 -1527.11097 35739.20803 -10862.07011
## <NA> <NA> <NA> <NA> <NA>
## 3839.00049 -9294.49809 -11342.32524 8476.46411 14809.82868
## <NA> <NA> <NA> <NA> <NA>
## -3373.09164 -769.96949 -5819.39122 -1233.69435 28488.83482
## <NA> <NA> <NA> <NA> <NA>
## -13383.40956 -8010.55526 8017.95979 3125.36063 9683.84478
## <NA> <NA> <NA> <NA> <NA>
## -1505.36016 -13309.89199 -63.07246 17187.83123 -3499.89825
## <NA> <NA> <NA> <NA> <NA>
## -8325.72269 -11297.30132 1317.66104 32579.26858 1767.90023
## <NA> <NA> <NA> <NA> <NA>
## -4131.75542 3695.66360 7846.12925 -11882.61227 -16318.22941
## <NA> <NA>
## -2884.57137 -11837.58835
##
## $rank
## [1] 2
##
## $family
##
## Family: gaussian
## Link function: identity
##
##
## $linear.predictors
## price <NA> <NA> <NA> <NA>
## 1684.59523 -5879.42311 6126.95521 -7853.73270 14344.58152
## <NA> <NA> <NA> <NA> <NA>
## -888.51097 -6869.94932 -8475.80242 -9181.17715 -7470.26824
## <NA> <NA> <NA> <NA> <NA>
## -1563.86975 123.76605 -1527.11097 35739.20803 -10862.07011
## <NA> <NA> <NA> <NA> <NA>
## 3839.00049 -9294.49809 -11342.32524 8476.46411 14809.82868
## <NA> <NA> <NA> <NA> <NA>
## -3373.09164 -769.96949 -5819.39122 -1233.69435 28488.83482
## <NA> <NA> <NA> <NA> <NA>
## -13383.40956 -8010.55526 8017.95979 3125.36063 9683.84478
## <NA> <NA> <NA> <NA> <NA>
## -1505.36016 -13309.89199 -63.07246 17187.83123 -3499.89825
## <NA> <NA> <NA> <NA> <NA>
## -8325.72269 -11297.30132 1317.66104 32579.26858 1767.90023
## <NA> <NA> <NA> <NA> <NA>
## -4131.75542 3695.66360 7846.12925 -11882.61227 -16318.22941
## <NA> <NA>
## -2884.57137 -11837.58835
##
## $deviance
## [1] 6.057538e+12
##
## $null.deviance
## [1] 719208918475
##
## $iter
## [1] 2
##
## $weights
## price <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 1 1 1 1 1 1 1 1 1 1 1 1
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 1 1 1 1 1 1 1 1 1 1 1 1
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 1 1 1 1 1 1 1 1 1 1 1 1
## <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
## 1 1 1 1 1 1 1 1 1 1 1
##
## $df.residual
## [1] 45
##
## $df.null
## [1] 46
##
## $converged
## NULL
##
## attr(,"class")
## [1] "sgd"
sgd.data$coefficients
## sqft bdrm
## 11926.8714 -599.0125
print(x)
## sqft bdrm
## [1,] 0.1300098691 -0.2236752
## [2,] -0.5041898382 -0.2236752
## [3,] 0.5024763638 -0.2236752
## [4,] -0.7357230647 -1.5377669
## [5,] 1.2574760154 1.0904165
## [6,] -0.0197317285 1.0904165
## [7,] -0.5872397999 -0.2236752
## [8,] -0.7218814044 -0.2236752
## [9,] -0.7810230438 -0.2236752
## [10,] -0.6375731100 -0.2236752
## [11,] -0.0763567023 1.0904165
## [12,] -0.0008567372 -0.2236752
## [13,] -0.1392733400 -0.2236752
## [14,] 3.1172918237 2.4045083
## [15,] -0.9219563121 -0.2236752
## [16,] 0.3766430886 1.0904165
## [17,] -0.8565230089 -1.5377669
## [18,] -0.9622229602 -0.2236752
## [19,] 0.7654679091 1.0904165
## [20,] 1.2964843307 1.0904165
## [21,] -0.2940482685 -0.2236752
## [22,] -0.1417900055 -1.5377669
## [23,] -0.4991565072 -0.2236752
## [24,] -0.0486733818 1.0904165
## [25,] 2.3773921652 -0.2236752
## [26,] -1.1333562145 -0.2236752
## [27,] -0.6828730891 -0.2236752
## [28,] 0.6610262907 -0.2236752
## [29,] 0.2508098133 -0.2236752
## [30,] 0.8007012262 -0.2236752
## [31,] -0.2034483104 -1.5377669
## [32,] -1.2591894898 -2.8518586
## [33,] 0.0494765729 1.0904165
## [34,] 1.4298676025 -0.2236752
## [35,] -0.2386816274 1.0904165
## [36,] -0.7092980769 -0.2236752
## [37,] -0.9584479619 -0.2236752
## [38,] 0.1652431861 1.0904165
## [39,] 2.7863503098 1.0904165
## [40,] 0.2029931687 1.0904165
## [41,] -0.4236565421 -1.5377669
## [42,] 0.2986264579 -0.2236752
## [43,] 0.7126179335 1.0904165
## [44,] -1.0075229393 -0.2236752
## [45,] -1.4454227371 -1.5377669
## [46,] -0.1870899846 1.0904165
## [47,] -1.0037479410 -0.2236752
## attr(,"scaled:center")
## sqft bdrm
## 2000.680851 3.170213
## attr(,"scaled:scale")
## sqft bdrm
## 794.7023535 0.7609819