Lecture 13 – Assignment 13: Random Forest – Chapter 17 Part 1
- What can we do to reduce high variance in decision trees?
• By creating multiple trees from different samples of the training dataset and combining their predictions.
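The idea in this answer can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the helper names `subsample` and `majority_vote` are hypothetical, and the dataset is assumed to be a plain list of rows.

```python
import random


def subsample(dataset, ratio=1.0, rng=random):
    """Draw a bootstrap sample: rows are picked with replacement,
    so the same row may appear more than once."""
    n_sample = round(len(dataset) * ratio)
    return [rng.choice(dataset) for _ in range(n_sample)]


def majority_vote(predictions):
    """Combine the class predictions of several trees by taking
    the most common prediction."""
    return max(set(predictions), key=predictions.count)


# Hypothetical usage: three trees vote on one row.
tree_predictions = [0, 1, 1]
combined = majority_vote(tree_predictions)
```

Each tree would be trained on its own `subsample(...)` of the training rows, and the per-tree predictions for a new row would then be passed to `majority_vote`. For regression, the vote would typically be replaced by an average.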
- What kind of problems is the random forest algorithm applied to?
• Classification and Regression
- Explain the greedy selection of the best split point.
• It is the process of identifying the attribute and threshold value that best divide the dataset into two subsets at each step of tree construction. It is greedy because each split is chosen as the best one available at that moment, without looking ahead to later splits.
- Fill-in-the-blank: The model selects the best split with __________ ___________ until you get homogeneous nodes.
• lowest cost
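The greedy search for the lowest-cost split can be sketched as follows. This is a simplified illustration under stated assumptions: each row is a list whose last element is the class label, the cost is the weighted Gini index, and the function names `gini_index` and `best_split` are chosen for this example only.

```python
def gini_index(groups, classes):
    """Weighted Gini index of a split; 0.0 means every group is pure."""
    n_total = sum(len(group) for group in groups)
    gini = 0.0
    for group in groups:
        if not group:
            continue  # an empty group contributes nothing to the cost
        labels = [row[-1] for row in group]
        score = sum((labels.count(c) / len(group)) ** 2 for c in classes)
        gini += (1.0 - score) * (len(group) / n_total)
    return gini


def best_split(dataset):
    """Try every attribute and every observed value as a threshold,
    and keep the candidate with the lowest cost."""
    classes = sorted({row[-1] for row in dataset})
    best = None
    for index in range(len(dataset[0]) - 1):  # last column is the label
        for row in dataset:
            left = [r for r in dataset if r[index] < row[index]]
            right = [r for r in dataset if r[index] >= row[index]]
            cost = gini_index([left, right], classes)
            if best is None or cost < best["cost"]:
                best = {"index": index, "value": row[index], "cost": cost}
    return best


# Toy dataset: one attribute, labels 0 and 1.
toy = [[1.0, 0], [2.0, 0], [3.0, 1], [4.0, 1]]
split = best_split(toy)
```

On this toy data the threshold 3.0 separates the two classes completely, so the search settles on it with a cost of 0.0. A full tree builder would apply `best_split` recursively to the left and right groups until the nodes are homogeneous or a depth or size limit is reached.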
- When working with decision trees what can we do to make sure that trees will be uncorrelated?
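• Use bootstrap sampling (randomly sampling the training data with replacement) to give each tree a different subset of rows, and limit the features considered at each split to a random subset so that the trees do not all choose the same split points.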
- Fill-in-the-blank: For classification problems, the number of attributes to be considered for the split is limited to the ______________ .
• square root of the number of input features (columns).
- When working with the Random Forest algorithm, what happens to predictions when trees are made more uncorrelated?
• The individual predictions become more diverse, and the combined prediction often performs better.
- What does a Gini Index of 0 indicate?
• The class values are perfectly separated into two groups (each group is pure).
- For classification problems, what cost function do you use to calculate the purity of the group of data created by the split point?
• Use the Gini index (also called Gini impurity).
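As a worked illustration of the two Gini answers above, one common way to write the Gini index of a split is shown below. The notation is an assumption made for this sketch: N is the number of rows reaching the node, n_g is the number of rows in group g, and p_{g,k} is the proportion of rows in group g that belong to class k.

```latex
G = \sum_{g \in \{\text{left},\,\text{right}\}} \frac{n_g}{N} \left( 1 - \sum_{k} p_{g,k}^{2} \right)

% Perfect split: left = {0, 0}, right = {1, 1}
G = \tfrac{2}{4}\,\bigl(1 - (1^2 + 0^2)\bigr) + \tfrac{2}{4}\,\bigl(1 - (0^2 + 1^2)\bigr) = 0

% Worst two-class split: each group is half class 0, half class 1
G = \tfrac{2}{4}\,\bigl(1 - (0.5^2 + 0.5^2)\bigr) + \tfrac{2}{4}\,\bigl(1 - (0.5^2 + 0.5^2)\bigr) = 0.5
```

The first case matches a Gini index of 0 (perfect separation), and the second shows the largest value a two-class split can take under this definition.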
- Fill-in-the-blank: Bagging uses sampling __________________ replacement.
• With
- Fill-in-the-blank: Random Forest uses sampling _______________ replacement.
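• without (this refers to the features sampled at each split point; the rows used to train each tree are still drawn with replacement, as in bagging)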
- True or False: When working with the Random Forest algorithm, we are working with rows.
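• False (the step that distinguishes Random Forest is the random selection of columns, i.e. features, at each split)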
- True or False: When working with bagging, we are usually working with columns.
• False (bagging samples rows of the training data)
- How is the number of features considered at each split point chosen?
• For classification, the number of features considered at each split point is typically set to the square root of the total number of features.
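The square-root rule and the sampling of features without replacement can be sketched together. This is an illustrative helper, not a library function: the name `choose_split_features` and the example feature count of 60 are assumptions made for this sketch.

```python
import random
from math import sqrt


def choose_split_features(n_features_total, rng=random):
    """Pick the column indices to evaluate at one split point.

    The count is the square root of the total number of features
    (rounded down), and the indices are drawn without replacement,
    so no feature is listed twice.
    """
    n_features = int(sqrt(n_features_total))
    return rng.sample(range(n_features_total), n_features)


# Hypothetical dataset with 60 input columns: int(sqrt(60)) == 7,
# so 7 distinct column indices are considered at this split.
candidate_features = choose_split_features(60)
```

A fresh set of candidate features would be drawn at every split point, which is what keeps the trees from repeatedly choosing the same strong attribute.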