Assignment
Should we add a dummy variable indicating whether the movie turned a profit or not?
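A minimal sketch of what such a dummy could look like. The data frame `movies` and its values are made up for illustration (the real data is not shown here), and treating a break-even movie as "not profitable" is my assumption.

```r
# Hypothetical toy data standing in for the real movie data
movies <- data.frame(
  budget = c(8, 32, 50, 80, 300),
  profit = c(-3, 12, 0, -20, 150)
)

# 1 if the movie made money, 0 otherwise (break-even counts as 0 here)
movies$profitable <- as.integer(movies$profit > 0)

# How many movies fall in each group
table(movies$profitable)
```

The dummy could then be used as a response in a logistic regression, or simply to colour the points in a budget vs. profit plot.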
Distribution of Budgets
| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| 8.00 | 32.00 | 50.00 | 65.96 | 80.00 | 300.00 |
For some help with poly in R I used this link
For help on the cubic splines I used: link
plot(budget, profit, col = "grey", xlab = "Budget", ylab = "Profit", main = "We want to predict profit by only using budget")
# Break-even line at zero profit
abline(h = 0, col = "green")
library(splines)  # provides bs()
b.3 <- lm(profit ~ bs(budget, knots = 2), data = train)
b.5 <- lm(profit ~ bs(budget, knots = 5), data = train)
b.8 <- lm(profit ~ bs(budget, knots = 8), data = train)
# Knots at the 1st quartile, median, mean and 3rd quartile of budget
b.donaldcuts <- lm(profit ~ bs(budget, knots = c(32, 50, 65.96, 80)), data = train)
AIC(poly.3,b.5,b.8,b.donaldcuts)
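All of these models have the same AIC except the one where I used the distribution of budget to decide where to cut, and even that value is quite close. This looks strange. A likely explanation is that the knots argument of bs() takes the positions of the interior knots, not how many knots to use: 2 and 5 lie below the minimum budget of 8.00 and 8 sits right at it, so none of those fits gets a breakpoint inside the data and they probably all reduce to the same cubic fit.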
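To get splines that really differ in flexibility, one option is to either pass quantiles of budget as knot positions or let bs() choose the knots through its df argument. The sketch below uses simulated data (the data frame `sim` and the names `fit_q`, `fit_df5` and `fit_df8` are hypothetical, not the models above), so the AIC values it prints say nothing about the movie data.

```r
library(splines)

set.seed(1)
# Simulated stand-in for the training data (assumption: not the real movies)
n <- 200
sim <- data.frame(budget = runif(n, 8, 300))
sim$profit <- 0.5 * sim$budget - 0.002 * sim$budget^2 + rnorm(n, sd = 20)

# Option 1: interior knots placed at the quartiles of budget
q_knots <- quantile(sim$budget, probs = c(0.25, 0.5, 0.75))
fit_q <- lm(profit ~ bs(budget, knots = q_knots), data = sim)

# Option 2: ask for a number of basis functions via df;
# for a cubic spline bs() then places df - 3 interior knots at quantiles
fit_df5 <- lm(profit ~ bs(budget, df = 5), data = sim)
fit_df8 <- lm(profit ~ bs(budget, df = 8), data = sim)

# The df column now differs between models, so the AICs are comparable
# measures of genuinely different fits
aic_table <- AIC(fit_q, fit_df5, fit_df8)
aic_table
```

Checking the df column of the AIC table is a quick sanity test: if every spline model reports the same df, the knots are probably not landing inside the range of the data.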
Let’s look at some residuals