Regression in Decision Tree — A Step by Step CART (Classification And Regression Tree)

1. Introduction

  • Least squares. This method is analogous to minimizing least squares in a linear model. Splits are chosen to minimize the residual sum of squares between the observations and the mean response in each node.
  • Least absolute deviations. This method minimizes the mean absolute deviation from the median within a node. Its advantage over least squares is that it is less sensitive to outliers and therefore yields a more robust model. Its disadvantage is insensitivity when dealing with data sets containing a large proportion of zeros [1].

2. How Does CART Work in Regression with One Predictor?

To find the “best” split, we must choose the threshold that minimizes the residual sum of squares (RSS) of the two resulting nodes.
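A minimal sketch of that search, assuming a single numeric predictor: every midpoint between consecutive sorted x values is a candidate threshold, and we keep the one with the lowest total RSS across the left and right children. The function name and data are illustrative, not from the article.

```python
import numpy as np

def best_split(x, y):
    """Exhaustively search split thresholds on one predictor, returning the
    threshold that minimizes the summed RSS of the two child nodes."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best_t, best_rss = None, np.inf
    # Candidate thresholds: midpoints between consecutive x values.
    for i in range(1, len(x)):
        t = (x[i - 1] + x[i]) / 2
        left, right = y[:i], y[i:]
        rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if rss < best_rss:
            best_t, best_rss = t, rss
    return best_t, best_rss

# A clean step in y: the best threshold falls between x=2 and x=3.
print(best_split(np.array([1.0, 2.0, 3.0, 4.0]),
                 np.array([0.0, 0.0, 10.0, 10.0])))
```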

2.1 Intuition

2.2 How does CART process the splitting of the dataset (one predictor)?

Start with index 1

Start with index 2
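The index-by-index walk above can be sketched as follows. Assuming the data are sorted by the predictor, CART steps through each index, forms the candidate threshold at the midpoint of the adjacent x values, and records the RSS of the resulting left/right partition. The sample data here are hypothetical; the article's actual numbers are not shown.

```python
import numpy as np

# Hypothetical sorted data (x, y) for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 7.8, 8.1])

def trace_splits(x, y):
    """Walk the sorted data index by index, reporting each candidate
    threshold (midpoint of adjacent x values) and its total RSS."""
    rows = []
    for i in range(1, len(x)):
        t = (x[i - 1] + x[i]) / 2
        left, right = y[:i], y[i:]
        rss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        rows.append((i, t, rss))
    return rows

for i, t, rss in trace_splits(x, y):
    print(f"index {i}: threshold {t:.1f}, RSS {rss:.3f}")
```

Scanning the printed rows by hand shows the same answer an exhaustive search would return: the threshold whose row has the smallest RSS becomes the split.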

About Me

Reference

  1. Introduction to Statistical Learning
  2. Ecological Informatics — Classification and Regression Trees
  3. Adapted from the YouTube channel “StatQuest with Josh Starmer”


Data Scientist and Artificial Intelligence Enthusiast
