title: "Homework #1"
author: "Dakota McKenzie"
date: "Friday, April 03, 2015"
output: html_document
Question #1
a)
i) Are houses in Beverly Hills larger than houses in other cities?
ii) Does the number of bathrooms depend on whether the house type is SFH or condo?
iii) Using the number of bedrooms, predict the number of bathrooms in a given house.
iv) Using the number of bedrooms, predict the price of a house.
b) I will answer question iv. I expect the price of a house to increase as the number of bedrooms increases. Thus, more bedrooms should predict a higher price, and fewer bedrooms a lower one.
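As a minimal sketch of how question iv could be checked, the snippet below fits a simple linear regression of price on bedrooms. The data frame `houses`, its column names, and every value in it are hypothetical and made up purely for illustration; they are not the homework data.

```r
# Hypothetical toy data: these values are invented for illustration
# and are NOT taken from the homework data set.
houses <- data.frame(
  bedrooms = c(1, 2, 2, 3, 3, 4, 4, 5),
  price    = c(150, 210, 230, 300, 280, 390, 410, 480)  # thousands of dollars
)

# Simple linear regression of price on number of bedrooms
fit <- lm(price ~ bedrooms, data = houses)

# A positive slope is consistent with "more bedrooms, higher predicted price"
slope <- unname(coef(fit)["bedrooms"])

# Predicted price for a hypothetical 3-bedroom house
pred_3bed <- unname(predict(fit, newdata = data.frame(bedrooms = 3)))

print(slope)
print(pred_3bed)
```

With real data, the sign and size of the slope, together with its standard error from `summary(fit)`, would indicate how well the expectation in part b) holds.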
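# Load the homework data; the models below use its columns x and y
hw1=read.csv("C:/Users/Dakota McKenzie/Downloads/hw1.csv")
attach(hw1)
x=hw1$x
# Fit polynomial models of increasing degree (1 to 5) and inspect each sequential ANOVA table
model1=lm(y~x)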
anova(model1)
## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## x          1 479453  479453  38.488 0.0004436 ***
## Residuals  7  87201   12457
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model2=lm(y~x+I(x^2))
anova(model2)
## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## x          1 479453  479453 42.0736 0.0006383 ***
## I(x^2)     1  18827   18827  1.6521 0.2460502
## Residuals  6  68374   11396
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model3=lm(y~x+I(x^2)+I(x^3))
anova(model3)
## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value   Pr(>F)
## x          1 479453  479453 39.0022 0.001542 **
## I(x^2)     1  18827   18827  1.5315 0.270827
## I(x^3)     1   6909    6909  0.5620 0.487209
## Residuals  5  61465   12293
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model4=lm(y~x+I(x^2)+I(x^3)+I(x^4))
anova(model4)
## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq  F value    Pr(>F)
## x          1 479453  479453 104.7432 0.0005137 ***
## I(x^2)     1  18827   18827   4.1130 0.1124611
## I(x^3)     1   6909    6909   1.5093 0.2865864
## I(x^4)     1  43155   43155   9.4278 0.0372756 *
## Residuals  4  18310    4577
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model5=lm(y~x+I(x^2)+I(x^3)+I(x^4)+I(x^5))
anova(model5)
## Analysis of Variance Table
##
## Response: y
##           Df Sum Sq Mean Sq F value   Pr(>F)
## x          1 479453  479453 78.6485 0.003023 **
## I(x^2)     1  18827   18827  3.0883 0.177105
## I(x^3)     1   6909    6909  1.1333 0.365161
## I(x^4)     1  43155   43155  7.0791 0.076296 .
## I(x^5)     1     21      21  0.0035 0.956670
## Residuals  3  18288    6096
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
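The five tables above can also be collected into one nested-model comparison. The sketch below is only illustrative: it does not read hw1.csv but simulates stand-in data from the same generating process as the `set.seed(456)` code in this homework, and the names `sim`, `m1` to `m5`, and `comparison` are hypothetical.

```r
# Stand-in data (assumption): simulated with the generating process
# used elsewhere in this homework, not read from hw1.csv.
set.seed(456)
x <- seq(0, 4, by = 0.5)
y <- 500 + 200 * x + rnorm(length(x), 0, 100)
sim <- data.frame(x = x, y = y)

# Nested polynomial models of degree 1 to 5
m1 <- lm(y ~ x, data = sim)
m2 <- lm(y ~ x + I(x^2), data = sim)
m3 <- lm(y ~ x + I(x^2) + I(x^3), data = sim)
m4 <- lm(y ~ x + I(x^2) + I(x^3) + I(x^4), data = sim)
m5 <- lm(y ~ x + I(x^2) + I(x^3) + I(x^4) + I(x^5), data = sim)

# One table: each row compares a model with the one before it
comparison <- anova(m1, m2, m3, m4, m5)
print(comparison)

# The residual sum of squares cannot increase as terms are added
rss <- comparison$RSS
```

Because the models are nested, the residual sum of squares shrinks (or stays flat) with every added power of x, so a smaller RSS alone does not show that a higher-degree model is better.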
set.seed(456)
x=seq(0,4,by=.5)
y=500+200*x + rnorm(length(x),0,100)
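# MSE is the mean squared difference between the observed and the predicted y.
# Assumption: the simulated x and y above are the data to evaluate against.
newdata=data.frame(x=x)
pred1=predict(model1,newdata)
mse_x=mean((y-pred1)^2)
pred2=predict(model2,newdata)
mse_x2=mean((y-pred2)^2)
pred3=predict(model3,newdata)
mse_x3=mean((y-pred3)^2)
pred4=predict(model4,newdata)
mse_x4=mean((y-pred4)^2)
pred5=predict(model5,newdata)
mse_x5=mean((y-pred5)^2)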
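Mean squared error is usually defined as the average squared difference between observed and predicted responses. The following is a minimal sketch comparing training and test MSE across polynomial degrees. It assumes both data sets come from the generating process used above; the helper names `simulate_data` and `mse` and the training seed 123 are hypothetical choices, not part of the assignment.

```r
# Assumption: training and test sets share the generating process
# y = 500 + 200 * x + noise; the seeds are illustrative choices.
simulate_data <- function(seed) {
  set.seed(seed)
  x <- seq(0, 4, by = 0.5)
  data.frame(x = x, y = 500 + 200 * x + rnorm(length(x), 0, 100))
}
train <- simulate_data(123)
test  <- simulate_data(456)

# Mean squared error: average squared gap between observed and predicted y
mse <- function(model, data) {
  mean((data$y - predict(model, newdata = data))^2)
}

# Fit degrees 1 to 5 on the training set, then score on both sets;
# raw = TRUE makes poly() use plain powers of x, like the I(x^k) terms
results <- t(sapply(1:5, function(d) {
  fit <- lm(y ~ poly(x, d, raw = TRUE), data = train)
  c(degree = d, train_mse = mse(fit, train), test_mse = mse(fit, test))
}))
print(results)
```

Training MSE can only go down as the degree grows, while test MSE need not, which is why the two are worth reporting side by side.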
Question #3 (2.4.2)
a) This is a regression problem, and the goal is inference. N = 500 firms in the US. P = 3: profit, number of employees, industry.
b) This is a classification problem, and the goal is prediction. N = 20 similar products previously launched. P = 13: price charged, marketing budget, competitor price, and ten other variables.
c) This is a regression problem, and the goal is prediction. N = 52 weeks of 2012 weekly data. P = 3: % change in the US market, % change in the British market, % change in the German market.