Regression in Decision Tree — A Step by Step CART (Classification And Regression Tree)

1. Introduction

  • Least squares. This method is similar to minimizing least squares in a linear model. Splits are chosen to minimize the residual sum of squares between the observations and the mean in each node.
  • Least absolute deviations. This method minimizes the mean absolute deviation from the median within a node. Its advantage over least squares is that it is less sensitive to outliers and so gives a more robust model. Its disadvantage is insensitivity when dealing with data sets containing a large proportion of zeros [1].
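
To make the difference between the two criteria concrete, here is a minimal sketch of both node measures. The function names `node_rss` and `node_mad` and the toy values are assumptions made for illustration; they do not come from any particular library.

```python
import numpy as np

def node_rss(y):
    """Least squares: sum of squared deviations from the node mean."""
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - y.mean()) ** 2))

def node_mad(y):
    """Least absolute deviations: mean absolute deviation from the node median."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.abs(y - np.median(y))))

# Hypothetical responses in one node; the value 100.0 is an artificial outlier.
clean = [10.0, 12.0, 11.0, 13.0]
with_outlier = clean + [100.0]

# Node prediction under each criterion: mean (least squares) vs. median (LAD).
print("mean:  ", np.mean(clean), "->", np.mean(with_outlier))
print("median:", np.median(clean), "->", np.median(with_outlier))

# Node error under each criterion, before and after adding the outlier.
print("RSS:", node_rss(clean), "->", node_rss(with_outlier))
print("MAD:", node_mad(clean), "->", node_mad(with_outlier))
```

In this toy example a single outlier drags the node mean far from the bulk of the data, while the median barely moves, which is the robustness the second bullet describes.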

2. How Does CART Work in Regression with One Predictor?

To find the “best” split, we must minimize the residual sum of squares (RSS) over the candidate split points.
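
For a single predictor, the quantity being minimized can be written as follows. The notation is an assumption for illustration: s is a candidate threshold on the predictor x, and \bar{y}_L and \bar{y}_R are the mean responses of the observations falling to the left and right of s.

$$
\mathrm{RSS}(s) = \sum_{i:\, x_i < s} \left(y_i - \bar{y}_L\right)^2 + \sum_{i:\, x_i \ge s} \left(y_i - \bar{y}_R\right)^2
$$

The best split is the threshold s with the smallest RSS(s).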

2.1 Intuition

2.2 How does CART split the dataset (one predictor)?

Start with index 1

Start with index 2
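
The index-by-index scan can be sketched as below. This is a minimal illustration, not the article's own code: it assumes that "index" means the position of the candidate split in the data sorted by the predictor, and that each candidate threshold is the midpoint between two consecutive predictor values. The function name `best_split` and the toy data are hypothetical.

```python
import numpy as np

def best_split(x, y):
    """Try a threshold between each pair of consecutive sorted x values
    and return the (threshold, rss) pair with the lowest total RSS."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]

    best_threshold, best_rss = None, np.inf
    for i in range(1, len(x)):  # index 1, index 2, ...
        if x[i] == x[i - 1]:
            continue  # no threshold can separate equal x values
        threshold = (x[i - 1] + x[i]) / 2.0
        left, right = y[:i], y[i:]
        # RSS of each side around its own mean, then summed.
        rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if rss < best_rss:
            best_threshold, best_rss = threshold, float(rss)
    return best_threshold, best_rss

# Hypothetical toy data with an obvious jump between x = 3 and x = 4.
x = [1, 2, 3, 4, 5, 6]
y = [5.0, 6.0, 5.5, 20.0, 21.0, 19.5]
threshold, rss = best_split(x, y)
print(threshold, rss)
```

A full regression tree would apply the same search recursively to each resulting node until a stopping rule is met, such as a minimum number of observations per node.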

  1. Introduction to Statistical Learning
  2. Ecological Informatics — Classification and Regression Trees
  3. Adapted from the YouTube channel “StatQuest with Josh Starmer”




