Machine Learning I
Assignment 3 – Evaluation Metrics
Directions: Complete the following exercises.
1. What are the standard measures we use to summarize how good a set of predictions is?
• classification accuracy, confusion matrix, mean absolute error, root mean squared error
2. Fill-in-the-blank: Knowing how good a set of predictions is allows one to ________ ________ about the skill of a given machine learning model of your problem.
• make estimates
3. What performance metrics give you an objective idea of how good a set of predictions is?
• classification accuracy, and
• root mean squared error
4. Fill-in-the-blank: ________ ________ is a ratio of the number of correct predictions out of all predictions that were made.
• Classification accuracy
5. Classification accuracy is found in which percentage range?
• 0% to 100%
6. The symbol, “==” is used to compare which data types?
• It compares actual values to predicted values for equality.
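The answers to questions 4 through 6 can be illustrated with a minimal sketch. The function name accuracy_metric( ) is the one listed in question 22; the body and the sample lists below are assumptions made for illustration, not necessarily the course's own code.

```python
def accuracy_metric(actual, predicted):
    """Return the percentage of predictions that equal the actual values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # "==" compares each actual value with the prediction at the same position.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)


# Hypothetical sample data: 8 of the 10 predictions match.
actual = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
predicted = [0, 1, 0, 0, 0, 1, 0, 1, 1, 1]
print(accuracy_metric(actual, predicted))  # 80.0
```

Because the result is a ratio scaled by 100, it always falls between 0% and 100%.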
7. What does a confusion matrix do?
• It provides a summary of all of the predictions made compared to the expected actual values
8. In the confusion matrix, where do you find the perfect set of predictions?
• On a diagonal line from the top left to the bottom right of the matrix.
9. What is the function, confusion_matrix( ), used for?
• It makes a list of all of the unique class values and assigns each class value a unique integer or index into the confusion matrix.
10. What must confusion_matrix( ) be given before it produces a result?
• It must be given two arguments. The first is the list of actual labels (classes) of the data points in the test dataset. The second is the list of labels (classes) that the classification model predicted for those same data points.
11. Fill-in-the-blank: The _______ ________ into the confusion matrix is the row for actual values.
• First index
12. What two objects does the confusion_matrix( ) function return?
• The first is the set of unique class values, so that they can be displayed when the confusion matrix is drawn.
• The second is the confusion matrix itself with the counts in each cell.
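Questions 7 through 12 can be pulled together in one short sketch. This is an illustrative version written under stated assumptions, not necessarily the function used in the course: it returns the unique class values as a sorted list (so the order is repeatable), it treats the first index as the row for actual values, and it assumes every predicted class also appears among the actual classes.

```python
def confusion_matrix(actual, predicted):
    """Count predictions per (actual, predicted) class pair."""
    # Assumption: classes are taken from the actual values only.
    unique = sorted(set(actual))
    # Assign each class value an integer index into the matrix.
    lookup = {value: index for index, value in enumerate(unique)}
    matrix = [[0] * len(unique) for _ in unique]
    for a, p in zip(actual, predicted):
        # First index = row for the actual class, second = column for the predicted class.
        matrix[lookup[a]][lookup[p]] += 1
    return unique, matrix


# Hypothetical sample data.
actual = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
predicted = [0, 1, 0, 0, 0, 1, 0, 1, 1, 1]
unique, matrix = confusion_matrix(actual, predicted)
print(unique)  # [0, 1]
print(matrix)  # [[4, 1], [1, 4]]
```

The correct predictions (4 and 4) sit on the diagonal from the top left to the bottom right, and the off-diagonal cells count the mistakes.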
13. What does the function, print_confusion_matrix( ), do?
• Displays the results for a confusion matrix
14. Fill-in-the-blank: A confusion matrix is always a good idea to use in addition to ________ ________ to help interpret the predictions.
• classification accuracy
15. TRUE or FALSE: An easy metric to consider is the error in the predicted values as compared to the actual values.
• True
16. What is the Mean Absolute Error (MAE) used for?
• To calculate the error in a set of regression predictions
17. TRUE or FALSE: The MAE is calculated as the average of the absolute error values, where absolute means the values are made positive so that they can be added together. It expects a list of actual outcome values and a list of predictions.
• True
18. Fill-in-the-blank: The ________ ________ is also the average positive error.
• mean absolute error
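The calculation described in questions 16 through 18 can be sketched as follows. The name mae_metric( ) comes from question 22; the body and the sample values are assumptions made for illustration.

```python
def mae_metric(actual, predicted):
    """Return the mean absolute error of a set of regression predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # abs( ) makes each error positive so errors do not cancel when summed.
    total_error = sum(abs(p - a) for a, p in zip(actual, predicted))
    return total_error / len(actual)


# Hypothetical sample data: absolute errors are 1, 0, 2 and 1.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
print(mae_metric(actual, predicted))  # 1.0
```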
19. What does the Root Mean Squared Error (RMSE) do?
• It calculates the error in a set of regression predictions
20. How is the RMSE calculated?
• the square root of the mean of the squared differences between actual outcomes and predictions
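The RMSE calculation can be sketched in a few lines. The function name rmse_metric( ) and the sample values are assumptions for illustration; they do not appear in the assignment itself.

```python
from math import sqrt


def rmse_metric(actual, predicted):
    """Return the root mean squared error of a set of regression predictions."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Square each difference, average the squares, then take the square root.
    squared_error = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return sqrt(squared_error / len(actual))


# Hypothetical sample data: squared errors are 1, 0, 4 and 1, so the mean is 1.5.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
print(rmse_metric(actual, predicted))  # roughly 1.22
```

Taking the square root returns the error to the original units of the predicted value, and squaring first means large errors count for more than they do in the MAE.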
21. TRUE or FALSE: RMSE values are always lower than MSE values.
• False
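A quick check shows why the answer is False: the square root of a number below 1 is larger than the number itself. The helper name rmse_from_mse( ) and the three MSE values are hypothetical, chosen only to illustrate the point.

```python
from math import sqrt


def rmse_from_mse(mse):
    """Return the RMSE that corresponds to a given MSE value."""
    return sqrt(mse)


# Hypothetical MSE values below, equal to, and above 1.
for mse in (0.25, 1.0, 4.0):
    rmse = rmse_from_mse(mse)
    print(mse, rmse, rmse < mse)
# 0.25 0.5 False
# 1.0 1.0 False
# 4.0 2.0 True
```

RMSE is lower than MSE only when the MSE is greater than 1.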
22. List three functions in this section and explain what they do. [Note: You cannot use confusion_matrix( ) or print_confusion_matrix( ).]
• abs( ) Python function: calculates the absolute error values that are summed together
• accuracy_metric( ) function: returns classification accuracy as a percentage
• mae_metric( ) function: calculates the mean absolute error