# Carseats {ISLR}
Description: A simulated data set containing sales of child car seats at 400 different stores.

- Usage: Carseats
- Format: A data frame with 400 observations on the following 11 variables.
  - Sales: Unit sales (in thousands) at each location
  - CompPrice: Price charged by competitor at each location
  - Income: Community income level (in thousands of dollars)
  - Advertising: Local advertising budget for company at each location (in thousands of dollars)
  - Population: Population size in region (in thousands)
  - Price: Price company charges for car seats at each site
  - ShelveLoc: A factor with levels Bad, Good and Medium indicating the quality of the shelving location for the car seats at each site
  - Age: Average age of the local population
  - Education: Education level at each location
  - Urban: A factor with levels No and Yes to indicate whether the store is in an urban or rural location
  - US: A factor with levels No and Yes to indicate whether the store is in the US or not
- Source: Simulated data
References: James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013) An Introduction to Statistical Learning with Applications in R, Springer-Verlag, New York. www.StatLearning.com
library("tree")
library("ISLR")
# Load the dataset called Carseats
# attach(Carseats) is unnecessary here: everything below works on the df copy
# Copy the Carseats data into a new data frame, df
df <- Carseats
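As a quick sanity check against the data dictionary above, a minimal sketch (not in the original):
dim(df) # expect 400 observations of 11 variables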
View(df) # opens the interactive data viewer (interactive sessions only)
# Print only the first few rows (the full 400-row data frame is in Carseats)
head(df)
## Sales CompPrice Income Advertising Population Price ShelveLoc Age Education
## 1 9.50 138 73 11 276 120 Bad 42 17
## 2 11.22 111 48 16 260 83 Good 65 10
## 3 10.06 113 35 10 269 80 Medium 59 12
## 4 7.40 117 100 4 466 97 Medium 55 14
## 5 4.15 141 64 3 340 128 Bad 38 13
## 6 10.81 124 113 13 501 72 Bad 78 16
## Urban US
## 1 Yes Yes
## 2 Yes Yes
## 3 Yes Yes
## 4 Yes Yes
## 5 Yes No
## 6 No Yes
# Discretize Sales into categories and add the result as a new variable
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Three-level version (Low, Medium, High), kept for reference:
# df <- df %>%
#   mutate(Sales_cat = as.factor(ifelse(Sales < 6, "Low",
#                                ifelse(Sales < 8, "Medium", "High"))))
# Two-level version (Low, High), used below:
# Using "=" (not "<-") inside mutate() names the new column directly,
# so no colnames() fix-up is needed afterwards
df <- df %>%
  mutate(Sales_cat = as.factor(ifelse(Sales < 8, "Low", "High")))
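As a quick check on the new variable (the counts in the comment follow from the confusion matrix later in this document), plus an equivalent construction with cut() for reference; note that cut() would order the levels Low, High rather than alphabetically:
table(df$Sales_cat) # High: 164, Low: 236
# Equivalent, assuming the same threshold of 8:
# df$Sales_cat <- cut(df$Sales, breaks = c(-Inf, 8, Inf),
#                     labels = c("Low", "High"), right = FALSE)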
# Inspect the first rows again; Sales_cat now appears as column 12
head(df)
## Sales CompPrice Income Advertising Population Price ShelveLoc Age Education
## 1 9.50 138 73 11 276 120 Bad 42 17
## 2 11.22 111 48 16 260 83 Good 65 10
## 3 10.06 113 35 10 269 80 Medium 59 12
## 4 7.40 117 100 4 466 97 Medium 55 14
## 5 4.15 141 64 3 340 128 Bad 38 13
## 6 10.81 124 113 13 501 72 Bad 78 16
## Urban US Sales_cat
## 1 Yes Yes High
## 2 Yes Yes High
## 3 Yes Yes High
## 4 Yes Yes Low
## 5 Yes No Low
## 6 No Yes High
# Fit a classification tree for Sales_cat on all predictors except Sales itself
tree.carseats <- tree(Sales_cat ~ . - Sales,
                      data = df,
                      split = "deviance",
                      model = TRUE)
If we just type the name of the tree object, R prints output corresponding to each branch of the tree. R displays the split criterion (e.g. Price < 92.5), the number of observations in that branch, the deviance, the overall prediction for the branch (High or Low), and the fraction of observations in that branch that take on each of those values. Branches that lead to terminal nodes are indicated with asterisks.
tree.carseats
## node), split, n, deviance, yval, (yprob)
## * denotes terminal node
##
## 1) root 400 541.500 Low ( 0.41000 0.59000 )
## 2) ShelveLoc: Good 85 90.330 High ( 0.77647 0.22353 )
## 4) Price < 135 68 49.260 High ( 0.88235 0.11765 )
## 8) US: No 17 22.070 High ( 0.64706 0.35294 )
## 16) Price < 109 8 0.000 High ( 1.00000 0.00000 ) *
## 17) Price > 109 9 11.460 Low ( 0.33333 0.66667 ) *
## 9) US: Yes 51 16.880 High ( 0.96078 0.03922 ) *
## 5) Price > 135 17 22.070 Low ( 0.35294 0.64706 )
## 10) Income < 46 6 0.000 Low ( 0.00000 1.00000 ) *
## 11) Income > 46 11 15.160 High ( 0.54545 0.45455 ) *
## 3) ShelveLoc: Bad,Medium 315 390.600 Low ( 0.31111 0.68889 )
## 6) Price < 92.5 46 56.530 High ( 0.69565 0.30435 )
## 12) Income < 57 10 12.220 Low ( 0.30000 0.70000 )
## 24) CompPrice < 110.5 5 0.000 Low ( 0.00000 1.00000 ) *
## 25) CompPrice > 110.5 5 6.730 High ( 0.60000 0.40000 ) *
## 13) Income > 57 36 35.470 High ( 0.80556 0.19444 )
## 26) Population < 207.5 16 21.170 High ( 0.62500 0.37500 ) *
## 27) Population > 207.5 20 7.941 High ( 0.95000 0.05000 ) *
## 7) Price > 92.5 269 299.800 Low ( 0.24535 0.75465 )
## 14) Advertising < 13.5 224 213.200 Low ( 0.18304 0.81696 )
## 28) CompPrice < 124.5 96 44.890 Low ( 0.06250 0.93750 )
## 56) Price < 106.5 38 33.150 Low ( 0.15789 0.84211 )
## 112) Population < 177 12 16.300 Low ( 0.41667 0.58333 )
## 224) Income < 60.5 6 0.000 Low ( 0.00000 1.00000 ) *
## 225) Income > 60.5 6 5.407 High ( 0.83333 0.16667 ) *
## 113) Population > 177 26 8.477 Low ( 0.03846 0.96154 ) *
## 57) Price > 106.5 58 0.000 Low ( 0.00000 1.00000 ) *
## 29) CompPrice > 124.5 128 150.200 Low ( 0.27344 0.72656 )
## 58) Price < 122.5 51 70.680 High ( 0.50980 0.49020 )
## 116) ShelveLoc: Bad 11 6.702 Low ( 0.09091 0.90909 ) *
## 117) ShelveLoc: Medium 40 52.930 High ( 0.62500 0.37500 )
## 234) Price < 109.5 16 7.481 High ( 0.93750 0.06250 ) *
## 235) Price > 109.5 24 32.600 Low ( 0.41667 0.58333 )
## 470) Age < 49.5 13 16.050 High ( 0.69231 0.30769 ) *
## 471) Age > 49.5 11 6.702 Low ( 0.09091 0.90909 ) *
## 59) Price > 122.5 77 55.540 Low ( 0.11688 0.88312 )
## 118) CompPrice < 147.5 58 17.400 Low ( 0.03448 0.96552 ) *
## 119) CompPrice > 147.5 19 25.010 Low ( 0.36842 0.63158 )
## 238) Price < 147 12 16.300 High ( 0.58333 0.41667 )
## 476) CompPrice < 152.5 7 5.742 High ( 0.85714 0.14286 ) *
## 477) CompPrice > 152.5 5 5.004 Low ( 0.20000 0.80000 ) *
## 239) Price > 147 7 0.000 Low ( 0.00000 1.00000 ) *
## 15) Advertising > 13.5 45 61.830 High ( 0.55556 0.44444 )
## 30) Age < 54.5 25 25.020 High ( 0.80000 0.20000 )
## 60) CompPrice < 130.5 14 18.250 High ( 0.64286 0.35714 )
## 120) Income < 100 9 12.370 Low ( 0.44444 0.55556 ) *
## 121) Income > 100 5 0.000 High ( 1.00000 0.00000 ) *
## 61) CompPrice > 130.5 11 0.000 High ( 1.00000 0.00000 ) *
## 31) Age > 54.5 20 22.490 Low ( 0.25000 0.75000 )
## 62) CompPrice < 122.5 10 0.000 Low ( 0.00000 1.00000 ) *
## 63) CompPrice > 122.5 10 13.860 Low ( 0.50000 0.50000 )
## 126) Price < 125 5 0.000 High ( 1.00000 0.00000 ) *
## 127) Price > 125 5 0.000 Low ( 0.00000 1.00000 ) *
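The same per-node information is also available programmatically: a tree object stores its nodes in a component called frame. A minimal sketch, not part of the original analysis:
# Splitting variable, node size, deviance, and fitted class for the first nodes
head(tree.carseats$frame[, c("var", "n", "dev", "yval")])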
The summary() function lists the variables that are used as internal nodes in the tree, the number of terminal nodes, and the (training) error rate.
summary(tree.carseats)
##
## Classification tree:
## tree(formula = Sales_cat ~ . - Sales, data = df, split = "deviance",
## model = TRUE)
## Variables actually used in tree construction:
## [1] "ShelveLoc" "Price" "US" "Income" "CompPrice"
## [6] "Population" "Advertising" "Age"
## Number of terminal nodes: 27
## Residual mean deviance: 0.4575 = 170.7 / 373
## Misclassification error rate: 0.09 = 36 / 400
We see that the training error rate is 9% (36/400). For classification trees, the deviance reported in the output of summary() is given by $-2 \sum_m \sum_k n_{mk} \log \hat{p}_{mk}$, where $n_{mk}$ is the number of observations in the $m$th terminal node that belong to the $k$th class. A small deviance indicates a tree that provides a good fit to the (training) data. The residual mean deviance reported is simply the deviance divided by $n - |T_0|$, which in this case is 400 - 27 = 373.
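A quick arithmetic check of the two statistics reported by summary(); this sketch assumes deviance() dispatches to the tree method, which the tree package provides:
deviance(tree.carseats) / (400 - 27) # residual mean deviance: 170.7 / 373 ~ 0.4575
36 / 400                             # misclassification error rate: 0.09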
# Plot the tree; "proportional" makes branch lengths reflect the decrease in impurity
plot(tree.carseats, type = "proportional")
text(tree.carseats,
     col = 1,
     cex = 1/2)
# Same plot, labelling nodes with the fitted class and abbreviating factor levels
plot(tree.carseats, type = "proportional")
text(tree.carseats,
     label = "yval",
     pretty = 1)
The most important indicator of Sales appears to be shelving location, since the first branch differentiates Good locations from Bad and Medium locations.
Predict the class of each observation in the training data:
set.seed(123)
# Classify every observation with the fitted tree
tree.pred <- predict(tree.carseats,
                     df,
                     type = "class")
# Compare predicted and actual classes; show only the first rows
head(data.frame(tree.pred, df$Sales_cat))
## tree.pred df.Sales_cat
## 1 Low High
## 2 High High
## 3 High High
## 4 Low Low
## 5 Low Low
## 6 High High
Generate the confusion matrix showing counts for the training data.
print("confusion matrix showing train counts")
## [1] "confusion matrix showing train counts"
per <- rattle::errorMatrix(as.numeric(df$Sales_cat), as.numeric(tree.pred), count=TRUE)
per
## Predicted
## Actual 1 2 Error
## 1 151 13 7.9
## 2 23 213 9.7
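As a cross-check without rattle, the same per-class error rates can be computed from a base-R confusion table (a minimal sketch, assuming tree.pred and df$Sales_cat from the chunks above):
# Base-R confusion table: rows = actual class, columns = predicted class.
cm <- table(Actual = df$Sales_cat, Predicted = tree.pred)
# Per-class error (%): the share of each actual class that is misclassified.
round(100 * (1 - diag(cm) / rowSums(cm)), 1) # ~7.9 (High), ~9.7 (Low)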
Plot the confusion table as a mosaic (map) plot.
table(tree.pred,
      df$Sales_cat)
##
## tree.pred High Low
## High 151 23
## Low 13 213
plot(table(tree.pred,
           df$Sales_cat),
     main = 'Map plot of actual vs prediction',
     ylab = 'Actual',
     xlab = 'Prediction')
In order to properly evaluate the performance of a classification tree on these data, we must estimate the test error rather than simply computing the training error. We split the observations into a training set and a test set, build the tree using the training set, and evaluate its performance on the test data.
#Random train/test split
set.seed(2)
train <- sample(1:nrow(df), 250) # Row indices for the training set
df[train, ] # Training rows from Carseats
## Sales CompPrice Income Advertising Population Price ShelveLoc Age Education
## 341 7.50 140 29 0 105 91 Bad 43 16
## 198 2.52 124 61 0 333 138 Medium 76 16
## 262 5.71 121 42 4 188 118 Medium 54 15
## 392 6.10 153 63 0 49 124 Bad 56 16
## 273 12.98 113 33 0 14 63 Good 38 12
## 349 12.57 132 102 20 459 107 Good 49 11
## 204 2.05 131 82 0 132 157 Bad 25 14
## 381 9.64 106 64 10 17 89 Medium 68 17
## 297 8.21 127 44 13 160 123 Good 63 18
## 178 10.48 138 72 0 148 94 Medium 27 17
## 75 6.20 150 68 5 125 136 Medium 64 13
## 131 8.41 94 84 13 497 77 Medium 51 12
## 306 8.03 115 29 26 394 132 Medium 33 13
## 371 7.68 126 41 22 403 119 Bad 42 12
## 311 9.53 175 65 29 419 166 Medium 53 12
## 63 1.82 139 45 0 146 133 Bad 77 17
## 136 6.44 96 94 14 384 120 Medium 36 18
## 231 5.16 115 60 0 119 114 Bad 38 14
## 289 6.98 116 40 0 74 97 Medium 76 15
## 54 6.92 109 64 13 39 119 Medium 61 17
## 361 8.77 118 86 7 265 114 Good 52 15
## 112 6.62 132 118 12 272 151 Medium 43 14
## 171 8.01 128 39 12 356 118 Medium 71 10
## 38 4.95 121 41 5 412 110 Medium 54 10
## 380 5.81 125 111 0 404 107 Bad 54 15
## 110 8.98 115 65 0 217 90 Medium 60 17
## 144 0.53 122 88 7 36 159 Bad 28 17
## 45 4.16 85 79 6 325 95 Medium 69 13
## 238 9.62 151 28 8 499 135 Medium 48 10
## 208 8.19 111 105 0 466 97 Bad 61 10
## 134 7.62 132 98 2 265 97 Bad 62 12
## 339 5.97 112 24 0 164 101 Medium 45 11
## 9 6.54 132 110 0 108 124 Medium 76 10
## 350 9.32 134 27 18 467 96 Medium 49 14
## 130 4.47 143 120 7 279 147 Bad 40 10
## 244 7.82 124 25 13 87 110 Medium 57 10
## 3 10.06 113 35 10 269 80 Medium 59 12
## 129 4.96 133 100 3 350 126 Bad 55 13
## 304 10.01 133 52 16 290 99 Medium 43 11
## 397 6.14 139 23 3 37 120 Medium 55 11
## 301 8.57 116 78 1 158 99 Medium 45 11
## 382 3.90 124 65 21 496 151 Bad 77 13
## 274 10.04 116 106 8 244 86 Medium 58 12
## 8 11.85 136 81 15 425 120 Good 67 10
## 164 5.68 130 64 0 40 106 Bad 39 17
## 367 5.98 124 56 11 447 134 Medium 53 12
## 37 8.89 122 76 0 270 100 Good 60 18
## 226 6.68 107 25 0 412 82 Bad 36 14
## 149 7.56 110 119 0 384 97 Medium 72 14
## 205 8.74 155 80 0 237 124 Medium 37 14
## 327 4.69 133 30 0 152 122 Medium 53 17
## 242 12.01 136 63 0 160 94 Medium 38 12
## 44 4.12 123 42 11 16 134 Medium 59 13
## 276 6.67 107 119 11 210 132 Medium 53 11
## 156 7.71 98 72 0 59 69 Medium 65 16
## 368 14.37 95 106 0 256 53 Good 52 17
## 106 5.55 104 100 8 398 97 Medium 61 11
## 175 0.00 139 24 0 358 185 Medium 79 15
## 388 8.67 142 73 14 238 115 Medium 73 14
## 326 11.70 144 69 11 131 104 Medium 47 11
## 182 7.43 121 83 0 79 91 Medium 68 11
## 224 3.45 110 45 9 276 125 Medium 62 14
## 271 11.99 119 26 0 284 89 Good 26 10
## 13 3.98 122 35 2 393 136 Medium 62 18
## 328 6.23 112 38 17 316 104 Medium 80 16
## 189 8.07 116 37 0 426 90 Medium 76 15
## 96 5.58 134 25 10 237 148 Medium 59 13
## 333 5.74 106 33 20 354 104 Medium 61 12
## 166 0.37 147 58 7 100 191 Bad 27 15
## 265 6.95 128 29 5 324 159 Good 31 15
## 53 7.91 153 40 3 112 129 Bad 39 18
## 143 7.44 124 84 0 300 104 Medium 77 15
## 36 11.07 131 84 11 29 96 Medium 44 17
## 17 7.58 118 32 0 284 110 Good 63 13
## 241 10.31 159 80 0 362 121 Medium 26 18
## 80 9.14 134 67 0 286 90 Bad 41 13
## 127 11.27 153 68 2 60 133 Good 59 16
## 267 9.10 128 93 12 343 112 Good 73 17
## 79 4.43 134 48 1 139 145 Medium 65 12
## 78 7.70 118 71 12 44 89 Medium 67 18
## 394 5.57 109 51 10 26 120 Medium 30 17
## 153 7.64 128 78 0 341 128 Good 45 13
## 192 6.67 156 42 13 170 173 Good 74 14
## 23 5.08 128 46 6 497 138 Medium 42 13
## 28 5.27 98 118 0 19 107 Medium 64 17
## 1 9.50 138 73 11 276 120 Bad 42 17
## 196 4.19 117 93 4 420 112 Bad 66 11
## 185 9.95 132 33 7 35 97 Medium 60 11
## 221 10.59 131 120 15 262 124 Medium 30 10
## 102 6.20 128 93 0 89 118 Medium 34 18
## 151 10.49 122 84 8 176 114 Good 57 10
## 71 9.46 89 81 15 237 99 Good 74 12
## 275 7.22 135 93 2 67 119 Medium 34 11
## 293 11.82 113 66 16 322 74 Good 76 15
## 212 9.39 117 118 14 445 120 Medium 32 15
## 239 7.36 121 24 0 200 133 Good 73 13
## 269 6.53 123 57 0 66 105 Medium 39 11
## 133 9.54 125 87 9 232 136 Good 72 10
## 154 5.93 150 36 7 488 150 Medium 25 17
## 163 3.63 122 74 0 424 149 Medium 51 13
## 91 5.33 115 22 0 491 103 Medium 64 11
## 225 4.10 134 82 0 464 141 Medium 48 13
## 191 8.79 130 37 13 297 101 Medium 37 13
## 142 6.53 140 42 0 331 131 Bad 28 15
## 25 10.14 145 119 16 294 113 Bad 42 12
## 390 8.44 128 42 8 328 107 Medium 35 12
## 197 4.10 130 28 6 410 133 Bad 72 16
## 291 9.49 107 111 14 400 103 Medium 41 11
## 377 16.27 141 60 19 319 92 Good 44 11
## 315 7.72 133 33 10 333 129 Good 71 14
## 172 12.49 93 106 12 416 55 Medium 75 15
## 384 9.35 98 117 0 76 68 Medium 63 10
## 94 8.86 145 30 0 67 104 Medium 55 17
## 34 8.77 114 38 13 317 128 Good 50 16
## 72 6.50 148 51 16 148 150 Medium 58 17
## 77 10.64 102 87 10 346 70 Medium 64 15
## 81 8.01 113 100 16 353 79 Bad 68 11
## 176 7.54 115 89 0 38 122 Medium 25 12
## 383 4.95 121 28 19 315 121 Medium 66 14
## 29 2.99 103 74 0 359 97 Bad 55 11
## 51 1.42 99 32 18 341 108 Bad 80 16
## 200 6.42 122 88 5 335 126 Medium 64 14
## 281 2.86 121 86 10 496 145 Bad 51 10
## 82 7.52 116 72 0 237 128 Good 70 13
## 342 7.38 98 120 0 268 93 Medium 72 10
## 109 3.47 107 79 2 488 103 Bad 65 16
## 12 11.96 117 94 4 503 94 Good 50 13
## 222 6.43 124 44 0 125 107 Medium 80 11
## 387 5.32 152 116 0 170 160 Medium 39 16
## 393 4.53 129 42 13 315 130 Bad 34 13
## 76 8.55 88 111 23 480 92 Bad 36 16
## 195 7.23 112 98 18 481 128 Medium 45 11
## 330 11.27 100 54 9 433 89 Good 45 12
## 229 5.40 149 73 13 381 163 Bad 26 11
## 395 5.35 130 58 19 366 139 Bad 33 16
## 343 7.81 137 102 13 422 118 Medium 71 10
## 334 5.87 136 60 7 303 147 Medium 41 10
## 118 8.80 145 53 0 507 119 Medium 41 12
## 360 3.13 130 62 11 396 130 Bad 66 14
## 180 7.78 144 25 3 70 116 Medium 77 18
## 107 0.16 102 33 0 217 139 Medium 70 18
## 258 8.67 125 62 14 477 112 Medium 80 13
## 247 6.90 120 56 20 266 90 Bad 78 18
## 14 10.96 115 28 11 29 86 Good 53 18
## 98 7.45 161 82 5 287 129 Bad 33 16
## 213 12.04 145 69 19 501 105 Medium 45 11
## 56 6.85 143 81 5 60 154 Medium 61 18
## 396 12.57 138 108 17 203 128 Good 33 14
## 270 5.01 159 69 0 438 166 Medium 46 17
## 124 8.19 127 103 0 125 155 Good 29 15
## 300 9.40 135 40 17 497 96 Medium 54 17
## 139 10.27 125 103 12 371 109 Medium 44 10
## 62 7.32 105 32 0 358 107 Medium 26 13
## 35 2.67 115 54 0 406 128 Medium 42 17
## 86 8.47 125 103 0 304 112 Medium 49 13
## 217 5.73 141 33 0 243 144 Medium 34 17
## 321 5.86 136 70 12 171 152 Medium 44 18
## 88 11.70 131 67 7 272 126 Good 54 16
## 70 7.99 127 59 0 339 99 Medium 65 12
## 240 3.89 123 105 0 149 118 Bad 62 16
## 89 6.56 117 42 7 144 111 Medium 62 10
## 95 8.39 115 97 5 134 84 Bad 55 11
## 150 11.48 121 120 13 140 87 Medium 56 11
## 103 5.30 113 22 0 57 97 Medium 65 16
## 43 10.43 77 69 0 25 24 Medium 50 18
## 16 8.71 149 95 5 400 144 Medium 76 18
## 370 10.26 135 100 22 463 122 Medium 36 14
## 216 2.34 116 83 15 170 144 Bad 71 11
## 292 6.64 118 70 0 106 89 Bad 39 17
## 308 5.90 138 92 0 13 120 Bad 61 12
## 253 8.31 133 97 0 70 117 Medium 32 16
## 248 5.04 123 114 0 298 151 Bad 34 16
## 11 9.01 121 78 9 150 100 Bad 26 10
## 50 10.61 157 93 0 51 149 Good 32 17
## 108 8.55 134 107 0 104 108 Medium 60 12
## 155 6.89 129 69 10 289 110 Medium 50 16
## 234 8.65 123 76 18 218 120 Medium 29 14
## 42 7.96 157 53 0 403 124 Bad 58 16
## 135 3.67 132 31 0 327 131 Medium 76 16
## 146 8.77 144 63 11 27 117 Medium 47 17
## 357 3.58 142 109 0 111 164 Good 72 12
## 87 8.70 150 84 9 432 134 Medium 64 15
## 336 6.18 120 70 15 464 110 Medium 72 15
## 193 7.56 108 26 0 408 93 Medium 56 14
## 100 4.88 121 47 3 220 107 Bad 56 16
## 138 6.52 128 42 0 436 118 Medium 80 11
## 67 8.85 127 92 0 508 91 Medium 56 18
## 177 5.61 138 107 9 480 154 Medium 47 11
## 365 10.50 122 21 16 488 131 Good 30 14
## 169 7.30 129 89 0 425 117 Medium 45 10
## 145 9.09 132 68 0 264 123 Good 34 11
## 261 7.67 129 117 8 400 101 Bad 36 10
## 141 6.03 133 60 10 277 129 Medium 45 18
## 184 5.32 118 74 6 426 102 Medium 80 18
## 302 7.41 99 93 0 198 87 Medium 57 16
## 186 10.07 130 100 11 449 107 Medium 64 10
## 228 8.69 113 64 10 68 101 Medium 57 16
## 391 5.47 108 75 9 61 111 Medium 67 12
## 260 5.12 123 36 10 467 100 Bad 74 11
## 181 4.94 137 112 15 434 149 Bad 66 13
## 364 10.26 111 75 1 377 108 Good 25 12
## 148 10.51 140 54 9 402 119 Good 41 16
## 117 5.08 135 75 0 202 128 Medium 80 10
## 55 4.90 134 103 13 25 144 Medium 76 17
## 335 7.63 93 117 9 489 83 Bad 42 13
## 209 7.78 86 54 0 497 64 Bad 33 12
## 337 5.17 138 35 6 60 143 Bad 28 18
## 263 6.37 120 77 15 86 132 Medium 48 18
## 358 13.36 103 73 3 276 72 Medium 34 15
## 338 8.61 130 38 0 283 102 Medium 80 15
## 340 11.54 134 44 4 219 126 Good 44 15
## 41 2.07 119 98 0 18 126 Bad 73 17
## 48 4.38 126 98 0 173 108 Bad 55 16
## 318 6.41 142 30 0 472 136 Good 80 15
## 61 8.32 122 102 19 469 123 Bad 29 13
## 111 9.00 128 62 7 125 116 Medium 43 14
## 90 7.95 128 66 3 493 119 Medium 45 16
## 165 8.22 148 64 0 58 141 Medium 27 13
## 4 7.40 117 100 4 466 97 Medium 55 14
## 85 2.23 111 25 0 52 121 Bad 43 18
## 372 9.08 152 81 0 191 126 Medium 54 16
## 57 11.91 133 82 0 54 84 Medium 50 17
## 6 10.81 124 113 13 501 72 Bad 78 16
## 179 10.66 104 71 14 89 81 Medium 25 14
## 230 11.19 98 104 0 404 72 Medium 27 18
## 223 7.49 136 119 6 178 145 Medium 35 13
## 20 8.73 129 76 16 58 121 Medium 69 12
## 352 10.44 124 115 16 458 105 Medium 62 16
## 104 5.07 123 91 0 334 96 Bad 78 17
## 310 11.18 131 111 13 33 80 Bad 68 18
## 332 10.10 135 63 15 213 134 Medium 32 10
## 398 7.41 162 26 12 368 159 Medium 40 18
## 344 5.99 117 42 10 371 121 Bad 26 14
## 257 4.20 147 40 0 277 144 Medium 73 10
## 235 9.43 115 62 11 289 129 Good 56 16
## 389 8.14 135 89 11 245 78 Bad 79 16
## 122 11.67 125 89 10 380 87 Bad 28 10
## 296 4.21 118 35 14 502 137 Medium 79 10
## 363 5.25 131 55 0 26 110 Bad 79 12
## 351 8.64 111 101 17 266 91 Medium 63 17
## 83 11.62 151 83 4 325 139 Good 28 17
## 250 5.05 125 67 0 86 117 Bad 65 11
## 400 9.71 134 37 0 27 120 Good 49 16
## 324 10.36 107 105 18 428 103 Medium 34 12
## 323 9.16 140 50 10 300 139 Good 60 15
## 194 13.28 139 70 7 71 96 Good 61 10
## 2 11.22 111 48 16 260 83 Good 65 10
## 93 4.53 114 113 0 97 125 Medium 29 12
## 373 7.80 121 50 0 508 98 Medium 65 11
## 199 3.62 112 80 5 500 128 Medium 69 10
## Urban US Sales_cat
## 341 Yes No Low
## 198 Yes No Low
## 262 Yes Yes Low
## 392 Yes No Low
## 273 Yes No High
## 349 Yes Yes High
## 204 Yes No Low
## 381 Yes Yes High
## 297 Yes Yes High
## 178 Yes Yes High
## 75 No Yes Low
## 131 Yes Yes High
## 306 Yes Yes High
## 371 Yes Yes Low
## 311 Yes Yes High
## 63 Yes Yes Low
## 136 No Yes Low
## 231 No No Low
## 289 No No Low
## 54 Yes Yes Low
## 361 No Yes High
## 112 Yes Yes Low
## 171 Yes Yes High
## 38 Yes Yes Low
## 380 Yes No Low
## 110 No No High
## 144 Yes Yes Low
## 45 Yes Yes Low
## 238 Yes Yes High
## 208 No No High
## 134 Yes Yes Low
## 339 Yes No Low
## 9 No No Low
## 350 No Yes High
## 130 No Yes Low
## 244 Yes Yes Low
## 3 Yes Yes High
## 129 Yes Yes Low
## 304 Yes Yes High
## 397 No Yes Low
## 301 Yes Yes High
## 382 Yes Yes Low
## 274 Yes Yes High
## 8 Yes Yes High
## 164 No No Low
## 367 No Yes Low
## 37 No No High
## 226 Yes No Low
## 149 No Yes Low
## 205 Yes No High
## 327 Yes No Low
## 242 Yes No High
## 44 Yes Yes Low
## 276 Yes Yes Low
## 156 Yes No Low
## 368 Yes No High
## 106 Yes Yes Low
## 175 No No Low
## 388 No Yes High
## 326 Yes Yes High
## 182 Yes No Low
## 224 Yes Yes Low
## 271 Yes No High
## 13 Yes No Low
## 328 Yes Yes Low
## 189 Yes No High
## 96 Yes Yes Low
## 333 Yes Yes Low
## 166 Yes Yes Low
## 265 Yes Yes Low
## 53 Yes Yes Low
## 143 Yes No Low
## 36 No Yes High
## 17 Yes No Low
## 241 Yes No High
## 80 Yes No High
## 127 Yes Yes High
## 267 No Yes High
## 79 Yes Yes Low
## 78 No Yes Low
## 394 No Yes Low
## 153 No No Low
## 192 Yes Yes Low
## 23 Yes No Low
## 28 Yes No Low
## 1 Yes Yes High
## 196 Yes Yes Low
## 185 No Yes High
## 221 Yes Yes High
## 102 Yes No Low
## 151 No Yes High
## 71 Yes Yes High
## 275 Yes Yes Low
## 293 Yes Yes High
## 212 Yes Yes High
## 239 Yes No Low
## 269 Yes No Low
## 133 Yes Yes High
## 154 No Yes Low
## 163 Yes No Low
## 91 No No Low
## 225 No No Low
## 191 No Yes High
## 142 Yes No Low
## 25 Yes Yes High
## 390 Yes Yes High
## 197 Yes Yes Low
## 291 No Yes High
## 377 Yes Yes High
## 315 Yes Yes Low
## 172 Yes Yes High
## 384 Yes No High
## 94 Yes No High
## 34 Yes Yes High
## 72 No Yes Low
## 77 Yes Yes High
## 81 Yes Yes High
## 176 Yes No Low
## 383 Yes Yes Low
## 29 Yes Yes Low
## 51 Yes Yes Low
## 200 Yes Yes Low
## 281 Yes Yes Low
## 82 Yes No Low
## 342 No No Low
## 109 Yes No Low
## 12 Yes Yes High
## 222 Yes No Low
## 387 Yes No Low
## 393 Yes Yes Low
## 76 No Yes High
## 195 Yes Yes Low
## 330 Yes Yes High
## 229 No Yes Low
## 395 Yes Yes Low
## 343 No Yes Low
## 334 Yes Yes Low
## 118 Yes No High
## 360 Yes Yes Low
## 180 Yes Yes Low
## 107 No No Low
## 258 Yes Yes High
## 247 Yes Yes Low
## 14 Yes Yes High
## 98 Yes Yes Low
## 213 Yes Yes High
## 56 Yes Yes Low
## 396 Yes Yes High
## 270 Yes No Low
## 124 No Yes High
## 300 No Yes High
## 139 Yes Yes High
## 62 No No Low
## 35 Yes Yes Low
## 86 No No High
## 217 Yes No Low
## 321 Yes Yes Low
## 88 No Yes High
## 70 Yes No Low
## 240 Yes Yes Low
## 89 Yes Yes Low
## 95 Yes Yes High
## 150 Yes Yes High
## 103 No No Low
## 43 Yes No High
## 16 No No High
## 370 Yes Yes High
## 216 Yes Yes Low
## 292 Yes No Low
## 308 Yes No Low
## 253 Yes No High
## 248 Yes No Low
## 11 No Yes High
## 50 Yes No High
## 108 Yes No High
## 155 No Yes Low
## 234 No Yes High
## 42 Yes No Low
## 135 Yes No Low
## 146 Yes Yes High
## 357 Yes No Low
## 87 Yes No High
## 336 Yes Yes Low
## 193 No No Low
## 100 No Yes Low
## 138 Yes No Low
## 67 Yes No High
## 177 No Yes Low
## 365 Yes Yes High
## 169 Yes No Low
## 145 No No High
## 261 Yes Yes Low
## 141 Yes Yes Low
## 184 Yes Yes Low
## 302 Yes Yes Low
## 186 Yes Yes High
## 228 Yes Yes High
## 391 Yes Yes Low
## 260 No Yes Low
## 181 Yes Yes Low
## 364 Yes No High
## 148 No Yes High
## 117 No No Low
## 55 No Yes Low
## 335 Yes Yes Low
## 209 Yes No Low
## 337 Yes No Low
## 263 Yes Yes Low
## 358 Yes Yes High
## 338 Yes No High
## 340 Yes Yes High
## 41 No No Low
## 48 Yes No Low
## 318 No No Low
## 61 Yes Yes High
## 111 Yes Yes High
## 90 No No Low
## 165 No Yes High
## 4 Yes Yes Low
## 85 No No Low
## 372 Yes No High
## 57 Yes No High
## 6 No Yes High
## 179 No Yes High
## 230 No No High
## 223 Yes Yes Low
## 20 Yes Yes High
## 352 No Yes High
## 104 Yes Yes Low
## 310 Yes Yes High
## 332 Yes Yes High
## 398 Yes Yes Low
## 344 Yes Yes Low
## 257 Yes No Low
## 235 No Yes High
## 389 Yes Yes High
## 122 Yes Yes High
## 296 No Yes Low
## 363 Yes Yes Low
## 351 No Yes High
## 83 Yes Yes High
## 250 Yes No Low
## 400 Yes Yes High
## 324 Yes Yes High
## 323 Yes Yes High
## 194 Yes Yes High
## 2 Yes Yes High
## 93 Yes No Low
## 373 No No Low
## 199 Yes Yes Low
Create the test dataset from the remaining observations.
Carseats.test <- df[-train, ]
Carseats.test
## Sales CompPrice Income Advertising Population Price ShelveLoc Age Education
## 5 4.15 141 64 3 340 128 Bad 38 13
## 7 6.63 115 105 0 45 108 Medium 71 15
## 10 4.69 132 113 0 131 124 Medium 76 17
## 15 11.17 107 117 11 148 118 Good 52 18
## 18 12.29 147 74 13 251 131 Good 52 10
## 19 13.91 110 110 0 408 68 Good 46 17
## 21 6.41 125 90 2 367 131 Medium 35 18
## 22 12.13 134 29 12 239 109 Good 62 18
## 24 5.87 121 31 0 292 109 Medium 79 10
## 26 14.90 139 32 0 176 82 Good 54 11
## 27 8.33 107 115 11 496 131 Good 50 11
## 30 7.81 104 99 15 226 102 Bad 58 17
## 31 13.55 125 94 0 447 89 Good 30 12
## 32 8.25 136 58 16 241 131 Medium 44 18
## 33 6.20 107 32 12 236 137 Good 64 10
## 39 6.59 109 73 0 454 102 Medium 65 15
## 40 3.24 130 60 0 144 138 Bad 38 10
## 46 4.56 141 63 0 168 135 Bad 44 12
## 47 12.44 127 90 14 16 70 Medium 48 15
## 49 3.91 116 52 0 349 98 Bad 69 18
## 52 4.42 121 90 0 150 108 Bad 75 16
## 58 0.91 93 91 0 22 117 Bad 75 11
## 59 5.42 103 93 15 188 103 Bad 74 16
## 60 5.21 118 71 4 148 114 Medium 80 13
## 64 8.47 119 88 10 170 101 Medium 61 13
## 65 7.80 100 67 12 184 104 Medium 32 16
## 66 4.90 122 26 0 197 128 Medium 55 13
## 68 9.01 126 61 14 152 115 Medium 47 16
## 69 13.39 149 69 20 366 134 Good 60 13
## 73 5.52 115 45 0 432 116 Medium 25 15
## 74 12.61 118 90 10 54 104 Good 31 11
## 84 4.42 109 36 7 468 94 Bad 56 11
## 92 4.81 97 46 11 267 107 Medium 80 15
## 97 9.48 147 42 10 407 132 Good 73 16
## 99 12.49 122 77 24 382 127 Good 36 16
## 101 4.11 113 69 11 94 106 Medium 76 12
## 105 4.62 121 96 0 472 138 Medium 51 12
## 113 6.67 116 99 5 298 125 Good 62 12
## 114 6.01 131 29 11 335 127 Bad 33 12
## 115 9.31 122 87 9 17 106 Medium 65 13
## 116 8.54 139 35 0 95 129 Medium 42 13
## 119 7.57 112 88 2 243 99 Medium 62 11
## 120 7.37 130 94 8 137 128 Medium 64 12
## 121 6.87 128 105 11 249 131 Medium 63 13
## 123 6.88 119 100 5 45 108 Medium 75 10
## 125 8.87 131 113 0 181 120 Good 63 14
## 126 9.34 89 78 0 181 49 Medium 43 15
## 128 6.52 125 48 3 192 116 Medium 51 14
## 132 6.50 108 69 3 208 94 Medium 77 16
## 137 5.17 131 75 0 10 120 Bad 31 18
## 140 12.30 146 62 10 310 94 Medium 30 13
## 147 3.90 114 83 0 412 131 Bad 39 14
## 152 10.77 111 58 17 407 103 Good 75 17
## 157 7.49 146 34 0 220 157 Good 51 16
## 158 10.21 121 58 8 249 90 Medium 48 13
## 159 12.53 142 90 1 189 112 Good 39 10
## 160 9.32 119 60 0 372 70 Bad 30 18
## 161 4.67 111 28 0 486 111 Medium 29 12
## 162 2.93 143 21 5 81 160 Medium 67 12
## 167 6.71 119 67 17 151 137 Medium 55 11
## 168 6.71 106 73 0 216 93 Medium 60 13
## 170 11.48 104 41 15 492 77 Good 73 18
## 173 9.03 104 102 13 123 110 Good 35 16
## 174 6.38 135 91 5 207 128 Medium 66 18
## 183 4.74 137 60 4 230 140 Bad 25 13
## 187 8.68 120 51 0 93 86 Medium 46 17
## 188 6.03 117 32 0 142 96 Bad 62 17
## 190 12.11 118 117 18 509 104 Medium 26 15
## 201 5.56 144 92 0 349 146 Medium 62 12
## 202 5.94 138 83 0 139 134 Medium 54 18
## 203 4.10 121 78 4 413 130 Bad 46 10
## 206 5.68 113 22 1 317 132 Medium 28 12
## 207 4.97 162 67 0 27 160 Medium 77 17
## 210 3.02 98 21 11 326 90 Bad 76 11
## 211 4.36 125 41 2 357 123 Bad 47 14
## 214 8.23 149 84 5 220 139 Medium 33 10
## 215 4.83 115 115 3 48 107 Medium 73 18
## 218 4.34 106 44 0 481 111 Medium 70 14
## 219 9.70 138 61 12 156 120 Medium 25 14
## 220 10.62 116 79 19 359 116 Good 58 17
## 227 7.80 119 33 0 245 122 Good 56 14
## 232 8.09 132 69 0 123 122 Medium 27 11
## 233 13.14 137 80 10 24 105 Good 61 15
## 236 5.53 126 32 8 95 132 Medium 50 17
## 237 9.32 141 34 16 361 108 Medium 69 10
## 243 4.68 124 46 0 199 135 Medium 52 14
## 245 8.78 130 30 0 391 100 Medium 26 18
## 246 10.00 114 43 0 199 88 Good 57 10
## 249 5.36 111 52 0 12 101 Medium 61 11
## 251 9.16 137 105 10 435 156 Good 72 14
## 252 3.72 139 111 5 310 132 Bad 62 13
## 254 5.64 124 24 5 288 122 Medium 57 12
## 255 9.58 108 104 23 353 129 Good 37 17
## 256 7.71 123 81 8 198 81 Bad 80 15
## 259 3.47 108 38 0 251 81 Bad 72 14
## 264 7.77 116 26 6 434 115 Medium 25 17
## 266 5.31 130 35 10 402 129 Bad 39 17
## 268 5.83 134 82 7 473 112 Bad 51 12
## 272 4.55 111 56 0 504 110 Medium 62 16
## 277 6.93 135 69 14 296 130 Medium 73 15
## 278 7.80 136 48 12 326 125 Medium 36 16
## 279 7.22 114 113 2 129 151 Good 40 15
## 280 3.42 141 57 13 376 158 Medium 64 18
## 282 11.19 122 69 7 303 105 Good 45 16
## 283 7.74 150 96 0 80 154 Good 61 11
## 284 5.36 135 110 0 112 117 Medium 80 16
## 285 6.97 106 46 11 414 96 Bad 79 17
## 286 7.60 146 26 11 261 131 Medium 39 10
## 287 7.53 117 118 11 429 113 Medium 67 18
## 288 6.88 95 44 4 208 72 Bad 44 17
## 290 8.75 143 77 25 448 156 Medium 43 17
## 294 11.28 123 84 0 74 89 Good 59 10
## 295 12.66 148 76 3 126 99 Good 60 11
## 298 3.07 118 83 13 276 104 Bad 75 10
## 299 10.98 148 63 0 312 130 Good 63 15
## 303 5.28 108 77 13 388 110 Bad 74 14
## 305 11.93 123 98 12 408 134 Good 29 10
## 307 4.78 131 32 1 85 133 Medium 48 12
## 309 9.24 126 80 19 436 126 Medium 52 10
## 312 6.15 146 68 12 328 132 Bad 51 14
## 313 6.80 137 117 5 337 135 Bad 38 10
## 314 9.33 103 81 3 491 54 Medium 66 13
## 316 6.39 131 21 8 220 171 Good 29 14
## 317 15.63 122 36 5 369 72 Good 35 10
## 319 10.08 116 72 10 456 130 Good 41 14
## 320 6.97 127 45 19 459 129 Medium 57 11
## 322 7.52 123 39 5 499 98 Medium 34 15
## 325 2.66 136 65 4 133 150 Bad 53 13
## 329 3.15 117 66 1 65 111 Bad 55 11
## 331 4.99 122 59 0 501 112 Bad 32 14
## 345 8.43 138 80 0 108 126 Good 70 13
## 346 4.81 121 68 0 279 149 Good 79 12
## 347 8.97 132 107 0 144 125 Medium 33 13
## 348 6.88 96 39 0 161 112 Good 27 14
## 353 13.44 133 103 14 288 122 Good 61 17
## 354 9.45 107 67 12 430 92 Medium 35 12
## 355 5.30 133 31 1 80 145 Medium 42 18
## 356 7.02 130 100 0 306 146 Good 42 11
## 359 4.17 123 96 10 71 118 Bad 69 11
## 362 8.68 131 25 10 183 104 Medium 56 15
## 366 6.53 154 30 0 122 162 Medium 57 17
## 369 10.71 109 22 10 348 79 Good 74 14
## 374 5.58 137 71 0 402 116 Medium 78 17
## 375 9.44 131 47 7 90 118 Medium 47 12
## 376 7.90 132 46 4 206 124 Medium 73 11
## 378 6.81 132 61 0 263 125 Medium 41 12
## 379 6.11 133 88 3 105 119 Medium 79 12
## 385 12.85 123 37 15 348 112 Good 28 12
## 386 5.87 131 73 13 455 132 Medium 62 17
## 399 5.94 100 79 7 284 95 Bad 50 12
## Urban US Sales_cat
## 5 Yes No Low
## 7 Yes No Low
## 10 No Yes Low
## 15 Yes Yes High
## 18 Yes Yes High
## 19 No Yes High
## 21 Yes Yes Low
## 22 No Yes High
## 24 Yes No Low
## 26 No No High
## 27 No Yes High
## 30 Yes Yes Low
## 31 Yes No High
## 32 Yes Yes High
## 33 No Yes Low
## 39 Yes No Low
## 40 No No Low
## 46 Yes Yes Low
## 47 No Yes High
## 49 Yes No Low
## 52 Yes No Low
## 58 Yes No Low
## 59 Yes Yes Low
## 60 Yes No Low
## 64 Yes Yes High
## 65 No Yes Low
## 66 No No Low
## 68 Yes Yes High
## 69 Yes Yes High
## 73 Yes No Low
## 74 No Yes High
## 84 Yes Yes Low
## 92 Yes Yes Low
## 97 No Yes High
## 99 No Yes High
## 101 No Yes Low
## 105 Yes No Low
## 113 Yes Yes Low
## 114 Yes Yes Low
## 115 Yes Yes High
## 116 Yes No High
## 119 Yes Yes Low
## 120 Yes Yes Low
## 121 Yes Yes Low
## 123 Yes Yes Low
## 125 Yes No High
## 126 No No High
## 128 Yes Yes Low
## 132 Yes No Low
## 137 No No Low
## 140 No Yes High
## 147 Yes No Low
## 152 No Yes High
## 157 Yes No Low
## 158 No Yes High
## 159 No Yes High
## 160 No No High
## 161 No No Low
## 162 No Yes Low
## 167 Yes Yes Low
## 168 Yes No Low
## 170 Yes Yes High
## 173 Yes Yes High
## 174 Yes Yes Low
## 183 Yes No Low
## 187 No No High
## 188 Yes No Low
## 190 No Yes High
## 201 No No Low
## 202 Yes No Low
## 203 No Yes Low
## 206 Yes No Low
## 207 Yes Yes Low
## 210 No Yes Low
## 211 No Yes Low
## 214 Yes Yes High
## 215 Yes Yes Low
## 218 No No Low
## 219 Yes Yes High
## 220 Yes Yes High
## 227 Yes No Low
## 232 No No High
## 233 Yes Yes High
## 236 Yes Yes Low
## 237 Yes Yes High
## 243 No No Low
## 245 Yes No High
## 246 No Yes High
## 249 Yes Yes Low
## 251 Yes Yes High
## 252 Yes Yes Low
## 254 No Yes Low
## 255 Yes Yes High
## 256 Yes Yes Low
## 259 No No Low
## 264 Yes Yes Low
## 266 Yes Yes Low
## 268 No Yes Low
## 272 Yes No Low
## 277 Yes Yes Low
## 278 Yes Yes Low
## 279 No Yes Low
## 280 Yes Yes Low
## 282 No Yes High
## 283 Yes No Low
## 284 No No Low
## 285 No No Low
## 286 Yes Yes Low
## 287 No Yes Low
## 288 Yes Yes Low
## 290 Yes Yes High
## 294 Yes No High
## 295 Yes Yes High
## 298 Yes Yes Low
## 299 Yes No High
## 303 Yes Yes Low
## 305 Yes Yes High
## 307 Yes Yes Low
## 309 Yes Yes High
## 312 Yes Yes Low
## 313 Yes Yes Low
## 314 Yes No High
## 316 Yes Yes Low
## 317 Yes Yes High
## 319 No Yes High
## 320 No Yes Low
## 322 Yes No Low
## 325 Yes Yes Low
## 329 Yes Yes Low
## 331 No No Low
## 345 No Yes High
## 346 Yes No Low
## 347 No No High
## 348 No No Low
## 353 Yes Yes High
## 354 No Yes High
## 355 Yes Yes Low
## 356 Yes No Low
## 359 Yes Yes Low
## 362 No Yes High
## 366 No No Low
## 369 No Yes High
## 374 Yes No Low
## 375 Yes Yes High
## 376 Yes No Low
## 378 No No Low
## 379 Yes Yes Low
## 385 Yes Yes High
## 386 Yes Yes Low
## 399 Yes Yes Low
df$Sales_cat[train]
## [1] Low Low Low Low High High Low High High High Low High High Low High
## [16] Low Low Low Low Low High Low High Low Low High Low Low High High
## [31] Low Low Low High Low Low High Low High Low High Low High High Low
## [46] Low High Low Low High Low High Low Low Low High Low Low High High
## [61] Low Low High Low Low High Low Low Low Low Low Low High Low High
## [76] High High High Low Low Low Low Low Low Low High Low High High Low
## [91] High High Low High High Low Low High Low Low Low Low High Low High
## [106] High Low High High Low High High High High Low High High Low Low Low
## [121] Low Low Low Low Low Low High Low Low Low High Low High Low Low
## [136] Low Low High Low Low Low High Low High Low High Low High Low High
## [151] High High Low Low High Low Low High Low Low Low High High Low High
## [166] High High Low Low Low High Low High High High Low High Low Low High
## [181] Low High Low Low Low Low High Low High Low High Low Low Low Low
## [196] High High Low Low Low High High Low Low Low Low Low Low High High
## [211] High Low Low Low High High Low High Low Low High High High High High
## [226] Low High High Low High High Low Low Low High High High Low Low High
## [241] High Low High High High High High Low Low Low
## Levels: High Low
Fit the classification tree for Carseats using the training data.
#Training
set.seed(123)
tree.carseats <- tree(Sales_cat ~ . - Sales,
                      df,
                      subset = train)
tree:::print.tree(tree.carseats)
## node), split, n, deviance, yval, (yprob)
## * denotes terminal node
##
## 1) root 250 341.900 Low ( 0.43200 0.56800 )
## 2) ShelveLoc: Good 43 44.120 High ( 0.79070 0.20930 )
## 4) Price < 127 25 8.397 High ( 0.96000 0.04000 ) *
## 5) Price > 127 18 24.730 High ( 0.55556 0.44444 )
## 10) Age < 65 11 10.430 High ( 0.81818 0.18182 )
## 20) Population < 308.5 6 0.000 High ( 1.00000 0.00000 ) *
## 21) Population > 308.5 5 6.730 High ( 0.60000 0.40000 ) *
## 11) Age > 65 7 5.742 Low ( 0.14286 0.85714 ) *
## 3) ShelveLoc: Bad,Medium 207 269.900 Low ( 0.35749 0.64251 )
## 6) Price < 124.5 140 193.600 Low ( 0.47143 0.52857 )
## 12) Advertising < 6.5 75 90.770 Low ( 0.29333 0.70667 )
## 24) Price < 94.5 21 28.680 High ( 0.57143 0.42857 )
## 48) Education < 16.5 14 19.120 Low ( 0.42857 0.57143 ) *
## 49) Education > 16.5 7 5.742 High ( 0.85714 0.14286 ) *
## 25) Price > 94.5 54 51.750 Low ( 0.18519 0.81481 )
## 50) CompPrice < 144.5 48 36.170 Low ( 0.12500 0.87500 )
## 100) Income < 95 36 15.450 Low ( 0.05556 0.94444 )
## 200) Price < 102.5 11 10.430 Low ( 0.18182 0.81818 )
## 400) Population < 304 5 6.730 Low ( 0.40000 0.60000 ) *
## 401) Population > 304 6 0.000 Low ( 0.00000 1.00000 ) *
## 201) Price > 102.5 25 0.000 Low ( 0.00000 1.00000 ) *
## 101) Income > 95 12 15.280 Low ( 0.33333 0.66667 )
## 202) Age < 61.5 7 9.561 High ( 0.57143 0.42857 ) *
## 203) Age > 61.5 5 0.000 Low ( 0.00000 1.00000 ) *
## 51) CompPrice > 144.5 6 7.638 High ( 0.66667 0.33333 ) *
## 13) Advertising > 6.5 65 81.790 High ( 0.67692 0.32308 )
## 26) CompPrice < 129.5 47 64.110 High ( 0.57447 0.42553 )
## 52) Income < 75.5 24 30.550 Low ( 0.33333 0.66667 )
## 104) Advertising < 14.5 17 23.510 Low ( 0.47059 0.52941 ) *
## 105) Advertising > 14.5 7 0.000 Low ( 0.00000 1.00000 ) *
## 53) Income > 75.5 23 21.250 High ( 0.82609 0.17391 )
## 106) Advertising < 9.5 5 6.730 Low ( 0.40000 0.60000 ) *
## 107) Advertising > 9.5 18 7.724 High ( 0.94444 0.05556 ) *
## 27) CompPrice > 129.5 18 7.724 High ( 0.94444 0.05556 ) *
## 7) Price > 124.5 67 49.010 Low ( 0.11940 0.88060 )
## 14) CompPrice < 147.5 52 16.950 Low ( 0.03846 0.96154 )
## 28) Advertising < 14.5 44 0.000 Low ( 0.00000 1.00000 ) *
## 29) Advertising > 14.5 8 8.997 Low ( 0.25000 0.75000 ) *
## 15) CompPrice > 147.5 15 20.190 Low ( 0.40000 0.60000 )
## 30) Age < 47 8 6.028 Low ( 0.12500 0.87500 ) *
## 31) Age > 47 7 8.376 High ( 0.71429 0.28571 ) *
The summary() function lists the variables that are used as internal nodes in the tree, the number of terminal nodes, and the (training) error rate.
summary(tree.carseats )
##
## Classification tree:
## tree(formula = Sales_cat ~ . - Sales, data = df, subset = train)
## Variables actually used in tree construction:
## [1] "ShelveLoc" "Price" "Age" "Population" "Advertising"
## [6] "Education" "CompPrice" "Income"
## Number of terminal nodes: 21
## Residual mean deviance: 0.6059 = 138.7 / 229
## Misclassification error rate: 0.14 = 35 / 250
We see that the training error rate is 14% (35/250).
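As a quick sanity check of that figure, the training error can also be computed directly from the training predictions (a minimal sketch, assuming tree.carseats and the train index from the chunks above):
# Recompute the training misclassification rate by hand.
train.pred <- predict(tree.carseats, df[train, ], type = "class")
mean(train.pred != df$Sales_cat[train]) # should equal 35/250 = 0.14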
plot(tree.carseats)
text(tree.carseats,
     cex = 1/2, pretty = 1)
Next we evaluate the tree's performance on the test data. The predict() function can be used for this purpose. In the case of a classification tree, the argument type="class" instructs R to return the actual class prediction. [function (object, newdata = list(), type = c("vector", "tree", "class", "where"), split = FALSE, nwts, eps = 0.001, ...)]
set.seed(123)
tree.pred <- predict(tree.carseats,
                     Carseats.test,
                     type = "class")
data.frame(tree.pred, Carseats.test$Sales_cat)
## tree.pred Carseats.test.Sales_cat
## 1 Low Low
## 2 Low Low
## 3 Low Low
## 4 High High
## 5 High High
## 6 High High
## 7 Low Low
## 8 High High
## 9 Low Low
## 10 High High
## 11 High High
## 12 High Low
## 13 High High
## 14 Low High
## 15 High Low
## 16 Low Low
## 17 Low Low
## 18 Low Low
## 19 High High
## 20 Low Low
## 21 Low Low
## 22 Low Low
## 23 High Low
## 24 Low Low
## 25 High High
## 26 Low Low
## 27 Low Low
## 28 Low High
## 29 High High
## 30 Low Low
## 31 High High
## 32 Low Low
## 33 Low Low
## 34 Low High
## 35 High High
## 36 Low Low
## 37 Low Low
## 38 High Low
## 39 Low Low
## 40 Low High
## 41 Low High
## 42 Low Low
## 43 Low Low
## 44 Low Low
## 45 Low Low
## 46 High High
## 47 Low High
## 48 Low Low
## 49 Low Low
## 50 Low Low
## 51 High High
## 52 Low Low
## 53 High High
## 54 High Low
## 55 Low High
## 56 High High
## 57 High High
## 58 Low Low
## 59 Low Low
## 60 Low Low
## 61 Low Low
## 62 High High
## 63 High High
## 64 Low Low
## 65 Low Low
## 66 High High
## 67 Low Low
## 68 High High
## 69 Low Low
## 70 Low Low
## 71 Low Low
## 72 Low Low
## 73 High Low
## 74 Low Low
## 75 Low Low
## 76 Low High
## 77 Low Low
## 78 Low Low
## 79 High High
## 80 High High
## 81 High Low
## 82 Low High
## 83 High High
## 84 Low Low
## 85 High High
## 86 Low Low
## 87 Low High
## 88 High High
## 89 Low Low
## 90 Low High
## 91 Low Low
## 92 Low Low
## 93 High High
## 94 Low Low
## 95 Low Low
## 96 Low Low
## 97 Low Low
## 98 High Low
## 99 Low Low
## 100 Low Low
## 101 Low Low
## 102 High Low
## 103 Low Low
## 104 High High
## 105 High Low
## 106 Low Low
## 107 Low Low
## 108 Low Low
## 109 High Low
## 110 High Low
## 111 Low High
## 112 High High
## 113 High High
## 114 High Low
## 115 High High
## 116 High Low
## 117 High High
## 118 Low Low
## 119 Low High
## 120 Low Low
## 121 Low Low
## 122 Low High
## 123 High Low
## 124 High High
## 125 High High
## 126 Low Low
## 127 Low Low
## 128 Low Low
## 129 Low Low
## 130 Low Low
## 131 High High
## 132 Low Low
## 133 Low High
## 134 High Low
## 135 High High
## 136 Low High
## 137 Low Low
## 138 High Low
## 139 High Low
## 140 High High
## 141 High Low
## 142 High High
## 143 Low Low
## 144 High High
## 145 Low Low
## 146 Low Low
## 147 Low Low
## 148 High High
## 149 Low Low
## 150 Low Low
plot(tree.pred)
summary(tree.pred)
## High Low
## 59 91
table(Carseats.test$Sales_cat)
##
## High Low
## 56 94
table(tree.pred,
      Carseats.test$Sales_cat)
##
## tree.pred High Low
## High 40 19
## Low 16 75
accuracy <- (40+75)/150
accuracy
## [1] 0.7666667
This approach leads to correct predictions for around 76.7% of the locations in the test data set.
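Reading accuracy directly off the confusion table avoids transcribing cell counts by hand (a minimal sketch using the objects above):
# Overall accuracy = correctly classified / total test observations.
cm <- table(tree.pred, Carseats.test$Sales_cat)
sum(diag(cm)) / sum(cm) # (40 + 75) / 150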
plot(table(tree.pred,
           Carseats.test$Sales_cat),
     main = 'Map plot of test actual vs prediction',
     ylab = 'Test Actual',
     xlab = 'Prediction')
Generate the confusion matrix showing counts for the test data.
print("confusion matrix showing test counts")
## [1] "confusion matrix showing test counts"
rattle::errorMatrix(as.numeric(Carseats.test$Sales_cat), as.numeric(tree.pred), count=TRUE)
## Predicted
## Actual 1 2 Error
## 1 40 16 28.6
## 2 19 75 20.2
Generate the confusion matrix showing proportions.
print("confusion matrix showing test proportion")
## [1] "confusion matrix showing test proportion"
per <- rattle::errorMatrix(as.numeric(Carseats.test$Sales_cat),
                           as.numeric(tree.pred),
                           count = FALSE)
per
## Predicted
## Actual 1 2 Error
## 1 26.7 10.7 28.6
## 2 12.7 50.0 20.2
Calculate the test overall error percentage.
paste0("overall error percentage ", collapse = ", ")
## [1] "overall error percentage "
cat(100-sum(diag(per), na.rm=TRUE))
## 23.3
Calculate the averaged class error percentage.
print("averaged class error percentage")
## [1] "averaged class error percentage"
cat(mean(per[,"Error"], na.rm=TRUE))
## 24.4
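The same two summaries can also be derived from a plain count table in base R (a minimal sketch, assuming tree.pred and Carseats.test from the chunks above):
# Overall and per-class error from a base-R confusion table;
# rows are predictions, columns are the actual classes.
cm <- table(tree.pred, Carseats.test$Sales_cat)
100 * (1 - sum(diag(cm)) / sum(cm)) # overall error, ~23.3
class.err <- 100 * (1 - diag(cm) / colSums(cm))
mean(class.err) # averaged class error, ~24.4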
Evaluate model performance on the testing dataset. ROC Curve: requires the ROCR package.
library(ROCR)
# ROC Curve: requires the ggplot2 package.
library(ggplot2, quietly=TRUE)
# Generate an ROC curve for the decision tree model on the Carseats test data.
tree.pred = predict(tree.carseats, Carseats.test, type ="vector")[,2]
# Remove observations with missing target.
no.miss <- na.omit(Carseats.test$Sales_cat)
miss.list <- attr(no.miss, "na.action")
attributes(no.miss) <- NULL
if (length(miss.list))
{
pred <- prediction(tree.pred[-miss.list], no.miss)
} else
{
pred <- prediction(tree.pred, no.miss)
}
pe <- performance(pred, "tpr", "fpr")
au <- performance(pred, "auc")@y.values[[1]]
pd <- data.frame(fpr=unlist(pe@x.values), tpr=unlist(pe@y.values))
p <- ggplot(pd, aes(x=fpr, y=tpr))
p <- p + geom_line(colour="red")
p <- p + xlab("False Positive Rate") + ylab("True Positive Rate")
p <- p + ggtitle("ROC Curve Decision Tree Carseats [test] High")
p <- p + theme(plot.title=element_text(size=10))
p <- p + geom_line(data=data.frame(), aes(x=c(0,1), y=c(0,1)), colour="grey")
p <- p + annotate("text", x=0.50, y=0.00, hjust=0, vjust=0, size=5,
label=paste("AUC =", round(au, 2)))
print(p)
performance(pred, "auc")
## A performance instance
## 'Area under the ROC curve'
Evaluate model performance on the testing dataset. Precision/Recall Plot: requires the ROCR package.
ROCR::plot(performance(pred, "prec", "rec"), col="#CC0000FF", lty=1, add=FALSE)
# Add decorations to the plot.
title(main="Precision/Recall Plot Carseats test",
      sub=paste("rocr", format(Sys.time(), "%Y-%b-%d %H:%M:%S"), Sys.info()["user"]))
grid()
Evaluate model performance on the testing dataset by inspecting the ROCR prediction object with glimpse() from the tibble package.
library(ROCR)
pred<-ROCR::prediction(as.numeric(tree.pred),
as.numeric(Carseats.test$Sales_cat))
library(tibble)
glimpse(pred)
## Formal class 'prediction' [package "ROCR"] with 11 slots
## ..@ predictions:List of 1
## .. ..$ : num [1:150] 1 1 1 0.04 0 0.04 1 0.04 1 0.04 ...
## ..@ labels :List of 1
## .. ..$ : Ord.factor w/ 2 levels "1"<"2": 2 2 2 1 1 1 2 1 2 1 ...
## ..@ cutoffs :List of 1
## .. ..$ : num [1:14] Inf 1 0.875 0.857 0.75 ...
## ..@ fp :List of 1
## .. ..$ : num [1:14] 0 4 5 7 10 11 13 16 23 23 ...
## ..@ tp :List of 1
## .. ..$ : num [1:14] 0 58 58 59 61 66 69 75 75 77 ...
## ..@ tn :List of 1
## .. ..$ : num [1:14] 56 52 51 49 46 45 43 40 33 33 ...
## ..@ fn :List of 1
## .. ..$ : num [1:14] 94 36 36 35 33 28 25 19 19 17 ...
## ..@ n.pos :List of 1
## .. ..$ : int 94
## ..@ n.neg :List of 1
## .. ..$ : int 56
## ..@ n.pos.pred :List of 1
## .. ..$ : num [1:14] 0 62 63 66 71 77 82 91 98 100 ...
## ..@ n.neg.pred :List of 1
## .. ..$ : num [1:14] 150 88 87 84 79 73 68 59 52 50 ...
#Pruning the tree
Next, we consider whether pruning the tree might lead to improved results. The function cv.tree() performs cross-validation in order to determine the optimal level of tree complexity; cost complexity pruning is used in order to select a sequence of trees for consideration. We use the argument FUN = prune.misclass in order to indicate that we want the classification error rate to guide the cross-validation and pruning process, rather than the default for the cv.tree() function, which is deviance. The cv.tree() function reports the number of terminal nodes of each tree considered (size) as well as the corresponding error rate and the value of the cost-complexity parameter used (k).
set.seed(123)
cv.carseats <- tree::cv.tree(tree(Sales_cat ~ . - Sales,
                                  data = df[train, ],
                                  split = "deviance",
                                  model = TRUE),
                             FUN = prune.misclass,
                             K = 10)
cv.carseats
## $size
## [1] 21 16 14 13 11 9 7 6 4 2 1
##
## $dev
## [1] 79 80 86 85 85 87 90 92 91 99 110
##
## $k
## [1] -Inf 0.0 0.5 1.0 1.5 2.0 2.5 3.0 4.0 11.5 25.0
##
## $method
## [1] "misclass"
##
## attr(,"class")
## [1] "prune" "tree.sequence"
names(cv.carseats )
## [1] "size" "dev" "k" "method"
Note that, despite the name, dev corresponds to the cross-validation error rate in this instance. Here the lowest cross-validation error (79 errors) belongs to the full tree with 21 terminal nodes, with the 16-node tree a close second (80 errors); the much smaller 7-node tree incurs only a modest increase in CV error (90 errors) in exchange for far greater interpretability, which motivates the pruning below.
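Rather than reading the optimum off the printout, the best size can be extracted programmatically (a minimal sketch using cv.carseats from above):
# Size with the lowest cross-validated misclassification count.
best.size <- cv.carseats$size[which.min(cv.carseats$dev)]
best.size # 21 here; a smaller near-optimal size may still be preferred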
summary(cv.carseats)
## Length Class Mode
## size 11 -none- numeric
## dev 11 -none- numeric
## k 11 -none- numeric
## method 1 -none- character
We plot the error rate as a function of both size and k.
par(mfrow = c(1, 2))
plot(cv.carseats$size,
     cv.carseats$dev,
     type = "b")
plot(cv.carseats$k,
     cv.carseats$dev,
     type = "b")
set.seed(123)
prune.carseats <- prune.misclass(tree.carseats,
                                 best = 7)
plot(prune.carseats)
text(prune.carseats,
     col = 1,
     cex = 1/2, pretty = 1)
tree:::print.tree(prune.carseats)
## node), split, n, deviance, yval, (yprob)
## * denotes terminal node
##
## 1) root 250 341.900 Low ( 0.43200 0.56800 )
## 2) ShelveLoc: Good 43 44.120 High ( 0.79070 0.20930 ) *
## 3) ShelveLoc: Bad,Medium 207 269.900 Low ( 0.35749 0.64251 )
## 6) Price < 124.5 140 193.600 Low ( 0.47143 0.52857 )
## 12) Advertising < 6.5 75 90.770 Low ( 0.29333 0.70667 )
## 24) Price < 94.5 21 28.680 High ( 0.57143 0.42857 ) *
## 25) Price > 94.5 54 51.750 Low ( 0.18519 0.81481 ) *
## 13) Advertising > 6.5 65 81.790 High ( 0.67692 0.32308 )
## 26) CompPrice < 129.5 47 64.110 High ( 0.57447 0.42553 )
## 52) Income < 75.5 24 30.550 Low ( 0.33333 0.66667 ) *
## 53) Income > 75.5 23 21.250 High ( 0.82609 0.17391 ) *
## 27) CompPrice > 129.5 18 7.724 High ( 0.94444 0.05556 ) *
## 7) Price > 124.5 67 49.010 Low ( 0.11940 0.88060 ) *
summary(prune.carseats)
##
## Classification tree:
## snip.tree(tree = tree.carseats, nodes = c(52L, 53L, 7L, 24L,
## 25L, 2L))
## Variables actually used in tree construction:
## [1] "ShelveLoc" "Price" "Advertising" "CompPrice" "Income"
## Number of terminal nodes: 7
## Residual mean deviance: 0.9592 = 233.1 / 243
## Misclassification error rate: 0.196 = 49 / 250
The misclassification training error after pruning is 19.6% (49/250), which is worse than the 14% before pruning; this is expected, since pruning trades training fit for a simpler, more interpretable tree.
set.seed(123)
tree.pred <- predict(prune.carseats,
                     Carseats.test,
                     type = "class")
tree.pred
## [1] Low Low Low High High High Low High Low High High High High Low High
## [16] Low Low Low High Low Low Low High Low High Low Low Low High Low
## [31] High Low Low High High Low Low High Low High Low Low Low Low Low
## [46] High High Low High Low High Low High High Low High High Low Low Low
## [61] High High High Low Low High Low High Low Low Low Low Low Low Low
## [76] Low Low Low High High High Low High Low High Low Low High Low High
## [91] Low Low High High High Low Low High Low Low Low High Low High High
## [106] Low Low Low High High Low High High High High High High Low Low Low
## [121] Low High High High High Low Low Low Low Low High High Low High High
## [136] Low Low High High High Low High Low High Low Low Low High Low High
## Levels: High Low
table(tree.pred,
      Carseats.test$Sales_cat)
##
## tree.pred High Low
## High 45 23
## Low 11 71
accuracy <- (45+71)/150
accuracy
## [1] 0.7733333
Now 77.3% of the test observations are correctly classified, so the pruning process has not only produced a more interpretable tree, it has also slightly improved the classification accuracy, from 76.7% up to 77.3%.
Generate the confusion matrix showing proportions for the pruned tree.
print("confusion matrix showing test proportions")
## [1] "confusion matrix showing test proportions"
per <- rattle::errorMatrix(as.numeric(Carseats.test$Sales_cat),
                           as.numeric(tree.pred),
                           count = FALSE)
per
## Predicted
## Actual 1 2 Error
## 1 30.0 7.3 19.6
## 2 15.3 47.3 24.5
Calculate the overall error percentage for the pruned tree.
print("overall error percentage")
## [1] "overall error percentage"
cat(100-sum(diag(per), na.rm=TRUE))
## 22.7
For comparison, we also prune to a larger tree with best = 15.
set.seed(123)
prune.carseats <- prune.misclass(tree.carseats, best = 15)
plot(prune.carseats)
text(prune.carseats,
     cex = 1/2,
     col = 1)
set.seed(123)
tree.pred <- predict(prune.carseats,
                     Carseats.test,
                     type = "class")
table(tree.pred,
      Carseats.test$Sales_cat)
##
## tree.pred High Low
## High 40 19
## Low 16 75
print("accuracy=")
## [1] "accuracy="
(40+75)/150
## [1] 0.7666667
With best = 15 the confusion matrix matches the unpruned tree, so accuracy returns to 76.7%; growing the tree beyond the cross-validated optimum does not improve classification accuracy.
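To make that point systematically, one can loop over all candidate sizes reported by cv.tree() and compute the test accuracy of each pruned subtree (a sketch, assuming tree.carseats, cv.carseats, and Carseats.test from above):
# Test accuracy for each candidate subtree size (excluding the root).
sizes <- setdiff(cv.carseats$size, 1)
test.acc <- sapply(sizes, function(s) {
  pr <- prune.misclass(tree.carseats, best = s)
  mean(predict(pr, Carseats.test, type = "class") == Carseats.test$Sales_cat)
})
data.frame(size = sizes, accuracy = round(test.acc, 3))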
The rpart programs build classification or regression models of a very general structure using a two-stage procedure; the resulting models can be represented as binary trees. The rpart routines implement many of the ideas found in the CART (Classification and Regression Trees) book and programs of Breiman, Friedman, Olshen and Stone. rpart has now become more common than the original, more descriptive name CART, a testament to the influence of freely available software.
library(rpart)
tr<-rpart(Sales_cat ~ .-Sales ,
df,
subset = train)
library(rpart.plot)
rpart.plot(tr)
tr
## n= 250
##
## node), split, n, loss, yval, (yprob)
## * denotes terminal node
##
## 1) root 250 108 Low (0.43200000 0.56800000)
## 2) ShelveLoc=Good 43 9 High (0.79069767 0.20930233)
## 4) Age< 68.5 33 3 High (0.90909091 0.09090909) *
## 5) Age>=68.5 10 4 Low (0.40000000 0.60000000) *
## 3) ShelveLoc=Bad,Medium 207 74 Low (0.35748792 0.64251208)
## 6) Price< 96.5 44 14 High (0.68181818 0.31818182)
## 12) Advertising>=9.5 19 2 High (0.89473684 0.10526316) *
## 13) Advertising< 9.5 25 12 High (0.52000000 0.48000000)
## 26) ShelveLoc=Medium 17 6 High (0.64705882 0.35294118) *
## 27) ShelveLoc=Bad 8 2 Low (0.25000000 0.75000000) *
## 7) Price>=96.5 163 44 Low (0.26993865 0.73006135)
## 14) Advertising>=10.5 52 24 Low (0.46153846 0.53846154)
## 28) Price< 126 31 10 High (0.67741935 0.32258065)
## 56) CompPrice>=121.5 21 3 High (0.85714286 0.14285714) *
## 57) CompPrice< 121.5 10 3 Low (0.30000000 0.70000000) *
## 29) Price>=126 21 3 Low (0.14285714 0.85714286) *
## 15) Advertising< 10.5 111 20 Low (0.18018018 0.81981982) *
summary(tr)
## Call:
## rpart(formula = Sales_cat ~ . - Sales, data = df, subset = train)
## n= 250
##
## CP nsplit rel error xerror xstd
## 1 0.23148148 0 1.0000000 1.0000000 0.07252075
## 2 0.14814815 1 0.7685185 0.8981481 0.07134077
## 3 0.05092593 2 0.6203704 0.6944444 0.06708971
## 4 0.03703704 4 0.5185185 0.7500000 0.06851602
## 5 0.01851852 5 0.4814815 0.7407407 0.06829292
## 6 0.01000000 8 0.4259259 0.7962963 0.06954684
##
## Variable importance
## Price ShelveLoc Advertising CompPrice Age Education
## 31 24 14 12 7 3
## US Population Income Urban
## 3 3 2 1
##
## Node number 1: 250 observations, complexity param=0.2314815
## predicted class=Low expected loss=0.432 P(node) =1
## class counts: 108 142
## probabilities: 0.432 0.568
## left son=2 (43 obs) right son=3 (207 obs)
## Primary splits:
## ShelveLoc splits as RLR, improve=13.363650, (0 missing)
## Price < 96.5 to the left, improve=13.130930, (0 missing)
## Advertising < 6.5 to the right, improve=10.067330, (0 missing)
## Income < 61.5 to the right, improve= 7.083292, (0 missing)
## Age < 60.5 to the left, improve= 5.773908, (0 missing)
##
## Node number 2: 43 observations, complexity param=0.01851852
## predicted class=High expected loss=0.2093023 P(node) =0.172
## class counts: 34 9
## probabilities: 0.791 0.209
## left son=4 (33 obs) right son=5 (10 obs)
## Primary splits:
## Age < 68.5 to the left, improve=3.978013, (0 missing)
## Price < 127 to the left, improve=3.423669, (0 missing)
## Income < 43 to the right, improve=2.813203, (0 missing)
## US splits as RL, improve=2.371020, (0 missing)
## Advertising < 0.5 to the right, improve=1.996105, (0 missing)
## Surrogate splits:
## Price < 161.5 to the left, agree=0.814, adj=0.2, (0 split)
##
## Node number 3: 207 observations, complexity param=0.1481481
## predicted class=Low expected loss=0.3574879 P(node) =0.828
## class counts: 74 133
## probabilities: 0.357 0.643
## left son=6 (44 obs) right son=7 (163 obs)
## Primary splits:
## Price < 96.5 to the left, improve=11.755480, (0 missing)
## Advertising < 7.5 to the right, improve= 8.642215, (0 missing)
## Income < 61.5 to the right, improve= 7.567312, (0 missing)
## ShelveLoc splits as R-L, improve= 3.867538, (0 missing)
## Age < 64.5 to the left, improve= 3.104385, (0 missing)
## Surrogate splits:
## CompPrice < 98.5 to the left, agree=0.831, adj=0.205, (0 split)
##
## Node number 4: 33 observations
## predicted class=High expected loss=0.09090909 P(node) =0.132
## class counts: 30 3
## probabilities: 0.909 0.091
##
## Node number 5: 10 observations
## predicted class=Low expected loss=0.4 P(node) =0.04
## class counts: 4 6
## probabilities: 0.400 0.600
##
## Node number 6: 44 observations, complexity param=0.01851852
## predicted class=High expected loss=0.3181818 P(node) =0.176
## class counts: 30 14
## probabilities: 0.682 0.318
## left son=12 (19 obs) right son=13 (25 obs)
## Primary splits:
## Advertising < 9.5 to the right, improve=3.031962, (0 missing)
## CompPrice < 123.5 to the right, improve=2.147981, (0 missing)
## Price < 81.5 to the left, improve=1.555277, (0 missing)
## Age < 64.5 to the left, improve=1.357576, (0 missing)
## Income < 83.5 to the right, improve=1.310371, (0 missing)
## Surrogate splits:
## US splits as RL, agree=0.818, adj=0.579, (0 split)
## Urban splits as LR, agree=0.659, adj=0.211, (0 split)
## Income < 83.5 to the right, agree=0.636, adj=0.158, (0 split)
## Population < 49 to the left, agree=0.636, adj=0.158, (0 split)
## Price < 81.5 to the left, agree=0.636, adj=0.158, (0 split)
##
## Node number 7: 163 observations, complexity param=0.05092593
## predicted class=Low expected loss=0.2699387 P(node) =0.652
## class counts: 44 119
## probabilities: 0.270 0.730
## left son=14 (52 obs) right son=15 (111 obs)
## Primary splits:
## Advertising < 10.5 to the right, improve=5.606452, (0 missing)
## Price < 124.5 to the left, improve=5.155847, (0 missing)
## CompPrice < 143.5 to the right, improve=3.578732, (0 missing)
## ShelveLoc splits as R-L, improve=3.533366, (0 missing)
## Income < 61.5 to the right, improve=3.523417, (0 missing)
## Surrogate splits:
## CompPrice < 161.5 to the right, agree=0.693, adj=0.038, (0 split)
## Income < 117.5 to the right, agree=0.687, adj=0.019, (0 split)
##
## Node number 12: 19 observations
## predicted class=High expected loss=0.1052632 P(node) =0.076
## class counts: 17 2
## probabilities: 0.895 0.105
##
## Node number 13: 25 observations, complexity param=0.01851852
## predicted class=High expected loss=0.48 P(node) =0.1
## class counts: 13 12
## probabilities: 0.520 0.480
## left son=26 (17 obs) right son=27 (8 obs)
## Primary splits:
## ShelveLoc splits as R-L, improve=1.715294, (0 missing)
## CompPrice < 111.5 to the right, improve=1.608205, (0 missing)
## Education < 16.5 to the right, improve=1.244706, (0 missing)
## Age < 61.5 to the left, improve=1.067302, (0 missing)
## Population < 305.5 to the left, improve=0.980000, (0 missing)
## Surrogate splits:
## Age < 46.5 to the right, agree=0.76, adj=0.250, (0 split)
## Income < 33 to the right, agree=0.72, adj=0.125, (0 split)
## Population < 410 to the left, agree=0.72, adj=0.125, (0 split)
##
## Node number 14: 52 observations, complexity param=0.05092593
## predicted class=Low expected loss=0.4615385 P(node) =0.208
## class counts: 24 28
## probabilities: 0.462 0.538
## left son=28 (31 obs) right son=29 (21 obs)
## Primary splits:
## Price < 126 to the left, improve=7.154910, (0 missing)
## Age < 47.5 to the left, improve=3.846154, (0 missing)
## Income < 60 to the right, improve=2.585650, (0 missing)
## CompPrice < 121.5 to the right, improve=2.068376, (0 missing)
## Education < 15.5 to the left, improve=1.846154, (0 missing)
## Surrogate splits:
## CompPrice < 146.5 to the left, agree=0.673, adj=0.190, (0 split)
## Population < 215.5 to the right, agree=0.654, adj=0.143, (0 split)
## Advertising < 24 to the left, agree=0.635, adj=0.095, (0 split)
## ShelveLoc splits as R-L, agree=0.635, adj=0.095, (0 split)
## Education < 17.5 to the left, agree=0.635, adj=0.095, (0 split)
##
## Node number 15: 111 observations
## predicted class=Low expected loss=0.1801802 P(node) =0.444
## class counts: 20 91
## probabilities: 0.180 0.820
##
## Node number 26: 17 observations
## predicted class=High expected loss=0.3529412 P(node) =0.068
## class counts: 11 6
## probabilities: 0.647 0.353
##
## Node number 27: 8 observations
## predicted class=Low expected loss=0.25 P(node) =0.032
## class counts: 2 6
## probabilities: 0.250 0.750
##
## Node number 28: 31 observations, complexity param=0.03703704
## predicted class=High expected loss=0.3225806 P(node) =0.124
## class counts: 21 10
## probabilities: 0.677 0.323
## left son=56 (21 obs) right son=57 (10 obs)
## Primary splits:
## CompPrice < 121.5 to the right, improve=4.205530, (0 missing)
## Income < 46.5 to the right, improve=3.939691, (0 missing)
## Age < 52 to the left, improve=3.161832, (0 missing)
## Education < 14.5 to the left, improve=1.376670, (0 missing)
## Population < 425 to the right, improve=1.134246, (0 missing)
## Surrogate splits:
## Education < 14.5 to the left, agree=0.774, adj=0.3, (0 split)
## Income < 35 to the right, agree=0.742, adj=0.2, (0 split)
## Price < 104.5 to the right, agree=0.710, adj=0.1, (0 split)
## Age < 71.5 to the left, agree=0.710, adj=0.1, (0 split)
##
## Node number 29: 21 observations
## predicted class=Low expected loss=0.1428571 P(node) =0.084
## class counts: 3 18
## probabilities: 0.143 0.857
##
## Node number 56: 21 observations
## predicted class=High expected loss=0.1428571 P(node) =0.084
## class counts: 18 3
## probabilities: 0.857 0.143
##
## Node number 57: 10 observations
## predicted class=Low expected loss=0.3 P(node) =0.04
## class counts: 3 7
## probabilities: 0.300 0.700
tr$splits
## count ncat improve index adj
## ShelveLoc 250 3 13.3636544 1.0 0.00000000
## Price 250 -1 13.1309327 96.5 0.00000000
## Advertising 250 1 10.0673324 6.5 0.00000000
## Income 250 1 7.0832922 61.5 0.00000000
## Age 250 -1 5.7739083 60.5 0.00000000
## Age 43 -1 3.9780127 68.5 0.00000000
## Price 43 -1 3.4236693 127.0 0.00000000
## Income 43 1 2.8132033 43.0 0.00000000
## US 43 2 2.3710197 2.0 0.00000000
## Advertising 43 1 1.9961049 0.5 0.00000000
## Price 0 -1 0.8139535 161.5 0.20000000
## Price 207 -1 11.7554796 96.5 0.00000000
## Advertising 207 1 8.6422148 7.5 0.00000000
## Income 207 1 7.5673119 61.5 0.00000000
## ShelveLoc 207 3 3.8675383 3.0 0.00000000
## Age 207 -1 3.1043848 64.5 0.00000000
## CompPrice 0 -1 0.8309179 98.5 0.20454545
## Advertising 44 1 3.0319617 9.5 0.00000000
## CompPrice 44 1 2.1479811 123.5 0.00000000
## Price 44 -1 1.5552769 81.5 0.00000000
## Age 44 -1 1.3575758 64.5 0.00000000
## Income 44 1 1.3103708 83.5 0.00000000
## US 0 2 0.8181818 4.0 0.57894737
## Urban 0 2 0.6590909 5.0 0.21052632
## Income 0 1 0.6363636 83.5 0.15789474
## Population 0 -1 0.6363636 49.0 0.15789474
## Price 0 -1 0.6363636 81.5 0.15789474
## ShelveLoc 25 3 1.7152941 6.0 0.00000000
## CompPrice 25 1 1.6082051 111.5 0.00000000
## Education 25 1 1.2447059 16.5 0.00000000
## Age 25 -1 1.0673016 61.5 0.00000000
## Population 25 -1 0.9800000 305.5 0.00000000
## Age 0 1 0.7600000 46.5 0.25000000
## Income 0 1 0.7200000 33.0 0.12500000
## Population 0 -1 0.7200000 410.0 0.12500000
## Advertising 163 1 5.6064521 10.5 0.00000000
## Price 163 -1 5.1558465 124.5 0.00000000
## CompPrice 163 1 3.5787321 143.5 0.00000000
## ShelveLoc 163 3 3.5333665 7.0 0.00000000
## Income 163 1 3.5234173 61.5 0.00000000
## CompPrice 0 1 0.6932515 161.5 0.03846154
## Income 0 1 0.6871166 117.5 0.01923077
## Price 52 -1 7.1549096 126.0 0.00000000
## Age 52 -1 3.8461538 47.5 0.00000000
## Income 52 1 2.5856496 60.0 0.00000000
## CompPrice 52 1 2.0683761 121.5 0.00000000
## Education 52 -1 1.8461538 15.5 0.00000000
## CompPrice 0 -1 0.6730769 146.5 0.19047619
## Population 0 1 0.6538462 215.5 0.14285714
## Advertising 0 -1 0.6346154 24.0 0.09523810
## ShelveLoc 0 3 0.6346154 8.0 0.09523810
## Education 0 -1 0.6346154 17.5 0.09523810
## CompPrice 31 1 4.2055300 121.5 0.00000000
## Income 31 1 3.9396914 46.5 0.00000000
## Age 31 -1 3.1618325 52.0 0.00000000
## Education 31 -1 1.3766699 14.5 0.00000000
## Population 31 1 1.1342457 425.0 0.00000000
## Education 0 -1 0.7741935 14.5 0.30000000
## Income 0 1 0.7419355 35.0 0.20000000
## Price 0 1 0.7096774 104.5 0.10000000
## Age 0 -1 0.7096774 71.5 0.10000000
print("confusion matrix showing test counts")
## [1] "confusion matrix showing test counts"
rpart.pred <- predict(tr, Carseats.test, type="class")
perPart <- rattle::errorMatrix(as.numeric(Carseats.test$Sales_cat),
                               as.numeric(rpart.pred),
                               count = TRUE)
perPart
## Predicted
## Actual 1 2 Error
## 1 36 20 35.7
## 2 14 80 14.9
print("confusion matrix showing test proportion")
## [1] "confusion matrix showing test proportion"
rpart.pred <- predict(tr, Carseats.test, type="class")
perPart <- rattle::errorMatrix(as.numeric(Carseats.test$Sales_cat),
                               as.numeric(rpart.pred),
                               count = FALSE)
perPart
## Predicted
## Actual 1 2 Error
## 1 24.0 13.3 35.7
## 2 9.3 53.3 14.9
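For a direct comparison with the tree() models above, the rpart tree's overall test accuracy follows from the predictions already computed (a minimal sketch):
# Overall test accuracy; the complement of the overall error implied
# by the proportion matrix above.
mean(rpart.pred == Carseats.test$Sales_cat)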
#Conclusion
Based on the description, we have a simulated data set containing sales of child car seats at 400 different stores, with observations on 11 variables (Sales, CompPrice, Income, Advertising, Population, Price, ShelveLoc, Age, Education, Urban, and US), and the goal is to identify which combination of factors predicts high sales. Setting aside the Sales variable itself, only 8 of the 10 remaining predictors turn out to influence the sales of child car seats: ShelveLoc, Price, US, Income, CompPrice, Population, Advertising, and Age, with an error rate of 9%. In the decision tree visualization, ShelveLoc sits at the top of the diagram, which means that it is the most influential variable. Comparing the tree's predictions with the Sales categories on the full data, 151 of the High and 213 of the Low observations out of 400 were classified correctly, while 13 and 23 were misclassified, giving class error rates of 7.9% and 9.7%. Beyond the training error, the test error must also be estimated to further evaluate the performance of the classification tree; 150 observations were used as the test set, comprising 56 High and 94 Low sales.