This is Quiz 1 from the Machine Learning course within the Data Science Specialization. This publication is intended as a learning resource, all answers are documented and explained.
1. Which of the following are components in building a machine learning algorithm?
Self-Explanatory
2. Suppose we build a prediction algorithm on a data set and it is 100% accurate on that data set. Why might the algorithm not work well if we collect a new data set?
If your algorithm is too specific to the peculiarities of an individual set, it may lead to false classification and bad predictions.
3. What are typical sizes for the training and test sets?
Self-explanatory
4. What are some common error rates for predicting binary variables (i.e. variables with two possible values like yes/no, disease/normal, clicked/didn’t click)? Check the correct answer(s).
Self-explanatory
5. Suppose that we have created a machine learning algorithm that predicts whether a link will be clicked with 99% sensitivity and 99% specificity. The rate the link is clicked is 1/1000 of visits to a website. If we predict the link will be clicked on a specific visit, what is the probability it will actually be clicked?
n <- 1000000
prevalence = 1/1000
sens <- .99
spec <- .99
clicked <- prevalence * n
TP <- sens * clicked
Notclicked <- n - clicked
TN <- spec * Notclicked
FP <- Notclicked * sens
FP<- Notclicked-TN
PPV <- TP/ (TP+FP)
PPV
## [1] 0.09016393
Check out my website at: http://www.ryantillis.com/