```

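Lecture 9 – Assignment 9 – CART Part 1

  1. Which two parameters do we use when deciding when to stop growing a tree?

• Maximum tree depth: the maximum number of levels allowed below the root node.

• Minimum node records: the minimum number of training rows that a node must be responsible for.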

  1. Why do we tend to restrict the number of nodes so that deeper trees are not built?

• Deeper trees are more complex and are more likely to overfit the training data.

  1. Fill-in-the-blank: Nodes that account for too few training patterns (rows, etc.) are expected to be too specific and are likely to ____________ the training data.

• overfit
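
# Sketch only: one way the two stopping rules above could be checked before a
# node is split. should_stop, max_depth and min_size are hypothetical names
# chosen for this illustration; they do not come from the assignment.
def should_stop(rows, depth, max_depth, min_size):
    # Stop when the tree is already as deep as we allow ...
    if depth >= max_depth:
        return True
    # ... or when the node holds too few training rows to split reliably.
    return len(rows) <= min_size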

  1. When building a tree, when will we be unable to continue splitting and adding child nodes?

• When a split sends all of the rows to one group and leaves the other group empty

  1. What is the terminal node?

• It is the node at which we stop growing the tree. It has no child nodes and is used to make a final prediction.

  1. When building a tree, how do we make a final prediction?

• By taking the group of rows assigned to that node and selecting the most common class value in the group

  1. Which Python function selects a class value and returns the most common output value in a list of rows?

• to_terminal( )

  1. Explain the statement “outcomes = [row[-1] for row in group]”.

• It extracts the last element from each row in the group and stores them in the list called “outcomes”.
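
# A minimal sketch of how a function such as to_terminal( ) might look. It
# assumes the class value is the last element of every row; the sample rows
# below are made up for illustration.
def to_terminal(group):
    # Collect the class value (last column) of every row in the group.
    outcomes = [row[-1] for row in group]
    # Return the class value that occurs most often.
    return max(set(outcomes), key=outcomes.count)

group = [[2.7, 1.1, 0], [1.3, 3.0, 1], [3.6, 2.2, 1]]
prediction = to_terminal(group)  # class 1 appears twice, class 0 once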

  1. What are child nodes?

• They are the new nodes added below an existing node when that node is split

  1. When building a tree in Python, which algorithm is used to implement the following procedure:

• Create the root node,
• call the split( ) function, that then
• calls itself recursively to build out the whole tree

• Recursive Binary Splitting algorithm

  1. Fill-in-the-blank: We must check if a _____________ is either a terminal value to be returned as the prediction, or if it is a dictionary node containing another level of the tree to be considered.

• Child node

  1. How do you check whether an object is an instance of a given class or type in Python?

• Use the isinstance( ) function

  1. What are the arguments for the isinstance( ) function in Python?

• object: the object being checked, to see whether it is an instance of the specified class or of one of its subclasses.

• classinfo: the class, type, or tuple of classes or types to check against.

  1. Explain the isinstance( ) function.

• It checks whether the object or variable is an instance of the specified class or data type and returns True or False.
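
# A short illustration of isinstance(object, classinfo). The node below and
# the variable names are made up for this example.
node = {'index': 0, 'value': 6.64, 'left': 0, 'right': 1}

is_dict = isinstance(node, dict)                    # node itself is a dict
is_number = isinstance(node['left'], (int, float))  # a tuple of types is allowed
is_subtree = isinstance(node['right'], dict)        # 1 is a terminal value, not a dict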

  1. Create a question that interprets the following line of code: “if isinstance(node['right'], dict):”

  2. What kind of listing do we have in the following line of code? “stump = {'index': 0, 'right': 1, 'value': 6.642287351, 'left': 0}”

  3. Fill-in-the-blank: A __________ is a classification algorithm. It is a binary classification algorithm. It classifies two categories, such as 0 and 1.

• Decision stump
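
# Sketch of a prediction with a decision stump. Assumptions for illustration:
# each node is a dict with 'index', 'value', 'left' and 'right' keys, and rows
# whose feature value is below node['value'] go to the left branch.
def predict(node, row):
    branch = 'left' if row[node['index']] < node['value'] else 'right'
    child = node[branch]
    # A dict child is another level of the tree; anything else is a
    # terminal value that is returned as the prediction.
    if isinstance(child, dict):
        return predict(child, row)
    return child

stump = {'index': 0, 'right': 1, 'value': 6.642287351, 'left': 0}
low = predict(stump, [2.771244718, 0])   # 2.77 is below 6.64, so left
high = predict(stump, [7.497545867, 1])  # 7.50 is not below 6.64, so right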

  1. Which function do we use to convert string numbers to float?

• str_column_to_float( )
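
# Sketch of what a helper like str_column_to_float( ) could do. It assumes the
# dataset is a list of rows whose values were read in as strings; the sample
# data is made up.
def str_column_to_float(dataset, column):
    for row in dataset:
        # Strip surrounding whitespace, then convert the string in place.
        row[column] = float(row[column].strip())

dataset = [['3.62', '0'], [' 4.55 ', '1']]
str_column_to_float(dataset, 0)  # only column 0 is converted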

  1. Which two resampling methods can we use to evaluate the performance of an algorithm?

• k-fold cross-validation and bootstrap sampling

  1. Which metric do we use to calculate the accuracy of predictions in a classification, predictive problem?

• Classification accuracy, calculated by the accuracy_metric( ) function
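
# A minimal sketch of classification accuracy as a percentage. It assumes two
# equal-length lists of actual and predicted class values; the sample values
# are made up.
def accuracy_metric(actual, predicted):
    # Count the positions where the prediction matches the actual class.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / float(len(actual)) * 100.0

accuracy = accuracy_metric([0, 1, 1, 0], [0, 1, 0, 0])  # 3 of 4 correct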

  1. If a dataset has 134,622 observations or records and you are using a 3-fold cross-validation resampling method to evaluate the performance of a model or algorithm, how many records are contained in each fold?

• 134,622 / 3 = 44,874 records in each fold
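
# Sketch of a k-fold split that reproduces the arithmetic above: the fold size
# is the integer division of the number of records by the number of folds.
# The seed parameter is a hypothetical addition so the shuffle is repeatable.
from random import Random

def cross_validation_split(dataset, n_folds, seed=1):
    remaining = list(dataset)
    Random(seed).shuffle(remaining)
    fold_size = len(remaining) // n_folds
    # Slice the shuffled records into n_folds equally sized folds.
    return [remaining[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]

records = list(range(134622))
folds = cross_validation_split(records, 3)
fold_sizes = [len(fold) for fold in folds]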

  2. Which Python function was developed to manage the application of the CART algorithm?

• decision_tree( )