## Assignment 7 – Random Forest Model – Part 1
## 1. What kind of algorithms are random forests?
## • They are ensemble algorithms.
## 2. Random forests are robust to noise. TRUE or FALSE
## • True
## 3. By pruning trees in the random forest model, the model reaches its maximal depth. TRUE or FALSE
## • False. Trees in a random forest are not pruned; each tree is grown to its maximal depth.
## 4. Preprocessing and variable selection are very important stages for the random forest model. TRUE or FALSE
## • False
## 5. Random tree models overfit training data. TRUE or FALSE
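## • True for a single decision tree, which tends to overfit the training data. A random forest as a whole tends not to overfit, because it combines many trees that were each built with randomness.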
## 6. What does the argument, method = “anova”, mean?
## • In rpart(), method = “anova” requests a regression tree, which is used when the target variable is continuous. Splits are chosen to give the largest reduction in the within-node sum of squares, which is where the analysis-of-variance name comes from.
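For a regression tree, an “anova”-style split is one that minimizes the sum of squared deviations within the two resulting groups. The sketch below illustrates that idea in plain Python rather than R, and it is not rpart's actual implementation. The function names `sse` and `best_split` and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the split x < threshold that
    minimizes the summed SSE of the two resulting groups."""
    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))  # baseline: no split at all
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs[:i]]
        right = [y for x, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Toy data (made up): y jumps from about 1 to about 5 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, total = best_split(xs, ys)
```

On this toy data the chosen threshold falls between x = 3 and x = 4, where the jump in y occurs.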
## 7. Explain the ensemble approach used by the random forest model.
## • The random forest builds a series of decision trees that act as “weak” classifiers: individually they are poor predictors, but in aggregate they form a robust prediction.
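A short calculation can show why aggregating weak classifiers helps. The sketch below, in plain Python, computes the chance that a majority vote is correct. It assumes the voters are fully independent, which real trees in a forest only approximate. The function name `majority_vote_accuracy` and the 0.6 accuracy figure are hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_voters, p_correct):
    """Probability that more than half of n_voters independent voters,
    each correct with probability p_correct, are right (n_voters odd)."""
    needed = n_voters // 2 + 1
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


single = majority_vote_accuracy(1, 0.6)      # one weak classifier
ensemble = majority_vote_accuracy(101, 0.6)  # majority of 101 such classifiers
```

Under this idealized independence assumption, `ensemble` is far higher than `single`, even though every individual voter is only slightly better than chance.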
## 8. Explain bagging.
## • Bagging (an abbreviation of bootstrap aggregation) is the idea of collecting a random sample of observations into a bag. Multiple bags are made up of observations randomly selected, with replacement, from the original observations in the training dataset.
## 9. The selection in bagging is made without replacement. TRUE or FALSE
## • False
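Sampling with replacement means a bag can contain the same observation more than once, while other observations are left out entirely. The following plain-Python sketch draws one bag of row indices. The function name `bootstrap_bag`, the seed, and the dataset size of 1000 are hypothetical choices for illustration.

```python
import random


def bootstrap_bag(n_rows, rng):
    """Draw n_rows row indices with replacement; return the bag and the
    out-of-bag indices that were never drawn."""
    bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(bag))
    return bag, out_of_bag


rng = random.Random(42)  # fixed seed so the sketch is repeatable
bag, out_of_bag = bootstrap_bag(1000, rng)
unique_fraction = len(set(bag)) / 1000
```

For a bag the same size as the dataset, roughly 63% of the observations are expected to appear at least once; the rest are "out of bag" for that tree.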
## 10. Explain the phrase “the choice of variables for partitioning the dataset.”
## • It refers to an element of randomness. When choosing a split point at each node of a decision tree, only a small random subset of the available variables is considered, and a different random subset is drawn for each split.
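The random choice of variables at each split can be sketched in a few lines of plain Python. This only illustrates the sampling step, not tree building itself. The function name `candidate_variables`, the variable names, and the subset size of 2 are hypothetical.

```python
import random


def candidate_variables(variables, mtry, rng):
    """Pick mtry distinct variables at random to consider for one split."""
    return rng.sample(variables, mtry)


variables = ["age", "income", "tenure", "region", "score", "visits"]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
# A fresh subset is drawn for every split, here five splits in a row.
subsets = [candidate_variables(variables, 2, rng) for _ in range(5)]
```

Because each split sees a different subset, no single strong variable can dominate every tree, which helps keep the trees from all looking alike.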
## 11. Why does the random forest model deliver a better answer?
## • It combines the predictions of many decision trees rather than relying on a single tree, so the errors of individual trees tend to cancel out.
## 12. Fill-in-the-blank: ___________ ______________ tend not to perform well on new data.
## • Decision trees
## 13. When comparing a single decision tree with a random forest, what is one of the problems with the random forest?
## • One of the problems with a random forest, compared with a single decision tree, is that it becomes quite a bit more difficult to readily understand the discovered knowledge.
## 14. Explain the random forest model.
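## • The random forest model is an ensemble learning method for classification, regression, and other tasks that operates by constructing a multitude of decision trees at training time and combining their predictions (the majority vote for classification, the average for regression).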
## 15. How does tuning parameters help the random forest algorithms?
## • The tuning parameters control the complexity of the random forest algorithm and thus also affect the bias-variance trade-off.
## 16. What is a tuning parameter for the random forest model?
## • A tuning parameter is a model option, such as mtry or ntree, that is set before training rather than learned from the data. Adjusting these options to identify the best fit allows us to increase the model’s accuracy.
## 17. What is the modelLookup() function used for?
## • This function is used to find the tuning parameters for a particular model.
## 18. What is parameter tuning?
## • Parameter tuning is the process of searching for the parameter values that give the best model. Parameters that define the model architecture are referred to as hyperparameters, so this search is also called hyperparameter tuning.
## 19. Fill-in-the-blank: We _________ ____________ to increase the random forest model’s accuracy.
## • Tune the parameters
## 20. Fill-in-the-blank: The “mtry =” tuning parameter sets the number of variables that will be _____________ _____________ from all of those available each time we look to partition a dataset in the process of building the decision tree.
## • Randomly selected
## 21. Explain the argument, “ntree=”
## • It specifies how many trees are to be built to populate the random forest.
## 22. Explain the argument, “importance = ”
## • The importance argument allows us to review the importance of each variable in determining the outcome. Two importance measures are calculated (the mean decrease in accuracy and the mean decrease in the Gini index), in addition to the importance of the variable in relation to each outcome in a classification task.
## 23. Explain the evalq() function.
## • It evaluates an R expression (the quoted form of its first argument) in a specified environment.
## 24. Fill-in-the-blank: The randomForest() function is contained in the ____________ ____________ for accurate tuning.
## • randomForest package
## 25. Explain the argument, “mtry =”
## • It is an optional integer specifying the number of features to randomly select at each split.
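When mtry is left unset, R's randomForest() falls back to a default based on the number of predictors: the floor of the square root for classification, and one third (at least 1) for regression. The plain-Python sketch below mirrors that rule for illustration only. The function name `default_mtry` is hypothetical and is not part of any R package.

```python
from math import floor, sqrt


def default_mtry(n_predictors, classification=True):
    """Mirror the default mtry rule: floor(sqrt(p)) for classification,
    max(floor(p / 3), 1) for regression."""
    if classification:
        return floor(sqrt(n_predictors))
    return max(floor(n_predictors / 3), 1)


mtry_classification = default_mtry(16)                    # 16 predictors
mtry_regression = default_mtry(10, classification=False)  # 10 predictors
```

These defaults are a reasonable starting point; tuning then tries values around them to see whether accuracy improves.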