Exercise #1

2.)

a.) This scenario is a regression problem, since CEO salary is a continuous variable. The goal of this study is inference, since we want to understand which factors affect salary rather than predict it. n = 500 (the top 500 US firms); p = 3 (profit, number of employees, industry).

b.) This scenario is a classification problem, since success/failure is a categorical outcome. The goal is prediction, since we want to use the recorded factors to determine whether a new product will be a success. n = 20 (previously launched products); p = 13 (price, marketing budget, competition price, and ten other variables).

c.) This is a regression problem, since the % change in the exchange rate is a continuous variable. The goal is prediction, since we are interested in forecasting future exchange-rate movements. n = 52 (weekly data for all of 2012); p = 3 (the % changes in the US, British, and German markets).

3.)

a.) (Sketch of the five curves against flexibility; each curve is described in part b below.)

b.) The bias curve decreases as flexibility increases. Bias arises when a model is too simple to capture the complexity of the underlying relationship, so lowering bias requires a more complex, flexible model.

The variance curve increases with flexibility. Variance measures how much a model's predictions change across different training sets; more flexible models are more sensitive to the particular training data, so they have higher variance.

The training error curve decreases as flexibility increases. Training error reflects how closely the model fits the training data, and added flexibility always reduces it.

The test error curve is U-shaped. Initially, added flexibility lets the model capture real patterns and the error falls; as flexibility increases further, the returns diminish and the model starts overfitting, so test error rises again.

The Bayes error curve is a horizontal line. It represents the irreducible randomness in the data, so it remains constant regardless of flexibility.
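A quick simulation makes these curves concrete. The sketch below is my own illustration, not part of the exercise: it fits polynomials of increasing degree to noisy data drawn from a known nonlinear function, and the printed training MSE falls monotonically while the test MSE traces the U-shape described above.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x)          # true nonlinear signal

# Training and test sets share the same irreducible (Bayes) noise level.
x_tr = rng.uniform(-1, 1, 40)
y_tr = f(x_tr) + rng.normal(0, 0.3, 40)
x_te = rng.uniform(-1, 1, 1000)
y_te = f(x_te) + rng.normal(0, 0.3, 1000)

for degree in (1, 3, 5, 9, 13):              # increasing flexibility
    coefs = np.polyfit(x_tr, y_tr, degree)
    mse_tr = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
    mse_te = np.mean((np.polyval(coefs, x_te) - y_te) ** 2)
    print(f"degree {degree:2d}: train MSE {mse_tr:.3f}, test MSE {mse_te:.3f}")
```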

5.) More flexible models do a better job of capturing complex, nonlinear relationships, and they typically achieve lower bias and lower training error. However, this comes at the cost of higher variance, a greater risk of overfitting, and more data and tuning effort. Less flexible models are the opposite: they require less data and training and have lower variance, but they cannot accurately capture complex, nonlinear relationships.
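To see the variance half of this trade-off directly, the following sketch (again a hypothetical illustration, not from the exercise) refits an inflexible model (a straight line) and a flexible one (a degree-10 polynomial) on many freshly drawn training sets, then compares how much each model's prediction at one fixed point moves around.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(2 * np.pi * x)      # true nonlinear signal
x0 = 0.5                                 # fixed query point

preds = {1: [], 10: []}                  # degree 1 = inflexible, 10 = flexible
for _ in range(500):                     # 500 independent training sets
    x = rng.uniform(-1, 1, 30)
    y = f(x) + rng.normal(0, 0.3, 30)
    for degree in preds:
        preds[degree].append(np.polyval(np.polyfit(x, y, degree), x0))

for degree, p in preds.items():
    print(f"degree {degree:2d}: bias {np.mean(p) - f(x0):+.3f}, "
          f"variance {np.var(p):.4f}")
```

The straight line shows a large, stable bias; the degree-10 fit is nearly unbiased but its predictions swing much more from one training set to the next.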

7.)

a.)

1: 3

2: 2

3: sqrt(10)

4: sqrt(5)

5: sqrt(2)

6: sqrt(3)

b.) Green, because with K = 1 the single closest neighbor is observation 5 (distance sqrt(2)), and observation 5 is green.

c.) Red. With K = 3 the three nearest neighbors are observations 5 (green), 6 (red), and 2 (red); since two of the three are red, the prediction is red.
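The answers to (a)-(c) can be verified in a few lines. The sketch below hard-codes the six observations from the exercise's table (their coordinates are consistent with the distances listed in (a)) and runs plain Euclidean-distance KNN against the test point (0, 0, 0).

```python
import numpy as np

# Observations from the exercise: (X1, X2, X3) and class label.
X = np.array([[0, 3, 0],
              [2, 0, 0],
              [0, 1, 3],
              [0, 1, 2],
              [-1, 0, 1],
              [1, 1, 1]], dtype=float)
y = np.array(["Red", "Red", "Red", "Green", "Green", "Red"])
test_point = np.zeros(3)

dists = np.linalg.norm(X - test_point, axis=1)   # Euclidean distances
for i, d in enumerate(dists, start=1):
    print(f"Obs {i}: {d:.3f}")

for k in (1, 3):
    nearest = np.argsort(dists)[:k]              # indices of the K closest points
    labels, counts = np.unique(y[nearest], return_counts=True)
    print(f"K={k}: neighbors {nearest + 1} -> {labels[np.argmax(counts)]}")
```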

d.) Smaller K values allow the model to adapt to local patterns, whereas larger K values smooth out the decision boundary; that smoothing works well when the true boundary is nearly linear but poorly when it is highly nonlinear. Thus, when the Bayes decision boundary is highly nonlinear, we would expect the best value of K to be small.
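As a rough illustration of why a highly nonlinear boundary favors small K, the sketch below compares test accuracy across K values on scikit-learn's make_moons data; the dataset choice is my own stand-in, not part of the exercise.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Two interleaving half-moons: a highly nonlinear decision boundary.
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for k in (1, 5, 50, 150):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
    print(f"K={k:3d}: test accuracy {knn.score(X_te, y_te):.3f}")
```

Small-to-moderate K tracks the curved boundary; very large K averages over both moons and the accuracy drops.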