Exercise #1
2.)
a.) This scenario is a regression problem because the CEO’s salary is a
continuous variable. The goal of the study is inference, since we want
to understand which factors affect salary rather than predict it.
(n=500 firms) (p=# of factors affecting CEO salary).
b.) This scenario is a classification problem because success/failure
is a categorical variable. The goal is prediction, since we want to use
the factors to determine whether a product will be successful. (n=20)
(p=# of predictors)
c.) This is a regression problem because the percentage change is a
continuous variable. The goal is prediction, since we are interested in
future exchange rates. (n=52, the number of weeks in a year)
(p=# of predictors; the percentage change in the exchange rate is the
response, not a predictor)
3.)
a.) 
b.) The (squared) bias curve decreases as flexibility increases. Bias
arises when a model is too simple to capture the complexity of the true
relationship, so lowering bias requires a more complex, flexible model.
The variance curve increases with flexibility. Variance measures how
much a model’s predictions change when it is fit to different training
sets; more flexible models are more sensitive to the particular
training data, so they have higher variance. The training error curve
decreases steadily as flexibility increases, because a more flexible
model can fit the training data more closely. The test error curve is
U-shaped. Initially, added flexibility lets the model capture real
patterns, and the drop in bias outweighs the rise in variance; beyond
some point the returns diminish, variance dominates, and the model
starts overfitting. The Bayes (irreducible) error is a horizontal line.
It represents randomness in the response that no model can explain, so
it stays constant regardless of flexibility and lies below the test
error curve.
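The relationship between the curves described above can be summarized
by the standard decomposition of expected test error at a point. The
symbols here (x_0 for a test input, y_0 for its response, \hat{f} for
the fitted model, \varepsilon for the noise term) are notation assumed
for this note, not taken from the answer itself.

```latex
E\left[\left(y_0 - \hat{f}(x_0)\right)^2\right]
  = \operatorname{Var}\left(\hat{f}(x_0)\right)
  + \left[\operatorname{Bias}\left(\hat{f}(x_0)\right)\right]^2
  + \operatorname{Var}(\varepsilon)
```

The last term is the irreducible error, which is why the test error
curve can never drop below the horizontal line.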
5.) More flexible models do a better job of capturing complex,
nonlinear relationships, so they have lower bias and can achieve lower
error when the true relationship is complex. However, they have higher
variance, are more prone to overfitting, and need more data and more
effort to fit. Less flexible models do the opposite: they require less
data and training and have lower variance, but they cannot accurately
capture complex, nonlinear relationships.
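The trade-off described above can be checked with a small simulation.
This is only a sketch under stated assumptions: the sine-shaped signal,
the noise level, the sample size, and the use of k-nearest-neighbor
regression (k=1 as the "flexible" model, k=25 as the "inflexible" one)
are all hypothetical choices made for illustration, not part of the
exercise.

```python
import math
import random
import statistics


def true_f(x):
    """Hypothetical nonlinear signal, used only for this illustration."""
    return math.sin(2 * math.pi * x)


def knn_predict(xs, ys, x0, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def bias_variance(k, x0=0.25, n=50, sigma=0.3, reps=300, seed=0):
    """Estimate squared bias and variance of the fit at x0 by refitting
    on many independently drawn training sets (all settings assumed)."""
    rng = random.Random(seed)
    preds = []
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0, sigma) for x in xs]
        preds.append(knn_predict(xs, ys, x0, k))
    bias_sq = (statistics.mean(preds) - true_f(x0)) ** 2
    variance = statistics.pvariance(preds)
    return bias_sq, variance


flexible = bias_variance(k=1)   # follows individual training points
rigid = bias_variance(k=25)     # averages over half of the training set
print("k=1   bias^2=%.4f  variance=%.4f" % flexible)
print("k=25  bias^2=%.4f  variance=%.4f" % rigid)
```

Under these assumptions the flexible fit should show the smaller
squared bias and the larger variance, and the inflexible fit the
reverse.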
7.)
a.)
1: 3
2: 2
3: sqrt(10)
4: sqrt(5)
5: sqrt(2)
6: sqrt(3)
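b.) Green, because with k=1 the single nearest neighbor is observation
5 (distance sqrt(2)), and observation 5 is green.
c.) Red. With k=3 the three nearest neighbors are observations 5, 6,
and 2; observation 5 is green, but the other two are red, so red wins
the majority vote.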
d.) Smaller k values let the model adapt to local patterns, whereas
larger k values smooth out the decision boundary, which works well when
the true boundary is close to linear but not when it is complex and
nonlinear. Thus, smaller k values are better for capturing complex
relationships.
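The distances in (a) and the votes in (b) and (c) can be reproduced
with a short script. The coordinates below are an assumption: they are
not given in this answer and were chosen because they reproduce the
listed distances from a test point at the origin. The colors of
observations 1, 3, and 4 are likewise assumed; only the colors of
observations 2, 5, and 6 follow from the answers above.

```python
import math
from collections import Counter

# ASSUMED training data: coordinates chosen to match the distances in
# (a); colors of observations 1, 3, and 4 are not stated in the answer.
train = [
    ((0, 3, 0), "Red"),
    ((2, 0, 0), "Red"),
    ((0, 1, 3), "Red"),
    ((0, 1, 2), "Green"),
    ((-1, 0, 1), "Green"),
    ((1, 1, 1), "Red"),
]
test_point = (0, 0, 0)  # assumed test point


def knn_classify(train, point, k):
    """Return the majority label among the k nearest training points."""
    ranked = sorted(train, key=lambda row: math.dist(row[0], point))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Euclidean distance from each observation to the test point.
distances = [math.dist(x, test_point) for x, _ in train]
for i, d in enumerate(distances, start=1):
    print(f"observation {i}: distance {d:.3f}")

print("k=1:", knn_classify(train, test_point, 1))
print("k=3:", knn_classify(train, test_point, 3))
```

With k=1 only observation 5 votes; with k=3 observations 5, 6, and 2
vote, which matches the reasoning in (b) and (c).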