```
Lecture 9 – Assignment 9 – CART Part 1
1. Which two parameters do we use when deciding when to stop growing a tree?
• Maximum tree depth
• Minimum node records (the minimum number of training dataset records a node is responsible for)
• Deeper trees are more complex and are more likely to overfit the training data.
• overfit
• When a split places all rows in one group, leaving no rows on the other side to split
• It is the node where you stop growing the tree at a given point.
• By taking the group of rows assigned to that node and selecting the most common class value in the group
• to_terminal( )
• It extracts the last element from each row in the group and stores them in the list called “outcomes”.
• Occurs when a new node is added to an existing node
• Create the root node
• Call the split( ) function
• split( ) then calls itself recursively to build out the whole tree
• Recursive Binary Splitting algorithm
• Child node
• Use the isinstance( ) function
• Checks whether the object is an instance of the specified class or of one of its subclasses.
• The second argument passed to the isinstance( ) function must be a class, a type, or a tuple of classes and types.
• It checks whether the object or variable is an instance of the specified class type or data type.
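# Illustrative sketch (not part of the assignment): one way to_terminal( ) can
# pick the most common class value, followed by a few isinstance( ) checks.
# The sample rows, the subtree dictionary, and the name "checks" are assumptions.
def to_terminal(group):
    # The class label is assumed to be the last element of each row.
    outcomes = [row[-1] for row in group]
    # Return the label that occurs most often in the group.
    return max(set(outcomes), key=outcomes.count)

group = [[2.7, 0], [1.3, 0], [7.5, 1]]  # hypothetical rows: [feature, class]
leaf = to_terminal(group)

subtree = {'index': 0, 'value': 6.64, 'left': 0, 'right': 1}  # hypothetical split node
checks = [
    isinstance(subtree, dict),       # an instance of the given class
    isinstance(leaf, dict),          # a terminal class value is not a dict
    isinstance(leaf, (int, float)),  # the second argument may be a tuple of types
    isinstance(True, int),           # bool is a subclass of int
]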
Create a question that interprets the following line of code: “if isinstance(node['right'], dict):”
What kind of listing do we have in the following line of code? “stump = {'index': 0, 'right': 1, 'value': 6.642287351, 'left': 0}”
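# Illustrative sketch (an assumption, not the assignment's own code): how the
# isinstance( ) check above can drive a prediction with the stump shown.
# The function name predict and the sample rows are hypothetical.
def predict(node, row):
    # Go left when the row's value at the split index is below the split value.
    branch = 'left' if row[node['index']] < node['value'] else 'right'
    child = node[branch]
    # A dict child is another split node; otherwise it is a terminal class value.
    if isinstance(child, dict):
        return predict(child, row)
    return child

stump = {'index': 0, 'right': 1, 'value': 6.642287351, 'left': 0}
predictions = [predict(stump, row) for row in [[2.77, 0], [7.49, 1]]]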
Fill-in-the-blank: A __________ is a classification algorithm. It is a binary classification algorithm. It classifies two categories, such as 0 and 1.
• Decision stump
• str_column_to_float( )
• K fold cross validation and Bootstrap Sampling
• accuracy_metric( )
If a dataset has 134,622 observations or records and you are using a 3-fold cross-validation resampling method to evaluate the performance of a model or algorithm, how many records are contained in each fold?
134,622 / 3 = 44,874 records per fold
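# Illustrative check of the fold-size arithmetic (a sketch; the variable names
# are assumptions, not taken from the assignment).
n_records = 134622
n_folds = 3
# Integer division gives the number of records placed in each fold.
fold_size = n_records // n_folds
print(fold_size)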
Which Python function was developed to manage the application of the CART algorithm?
• decision_tree( )