Decision Tree in Machine Learning
In this post we look at how decision trees are used in machine learning models. Decision trees can be used for both classification and regression. They are a supervised learning method in which the data is repeatedly split according to the value of a feature. A tree is made up of two kinds of elements: decision nodes, where the data is split, and leaves, which hold the final outcomes.
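To make decision nodes and leaves concrete, here is a minimal sketch that prints a small tree as text. It assumes scikit-learn's bundled copy of the iris data and an illustrative `max_depth=2`; neither choice comes from the original code.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Assumption: max_depth=2 is an illustrative setting that keeps the printout short.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Lines with "<=" or ">" are decision nodes; lines with "class:" are leaves.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

Each indented level in the printout is one more split on the way from the root to a leaf.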
There are two main types of Decision Trees:
1. Classification trees (categorical targets, such as Yes/No)
2. Regression trees (continuous targets)
Steps to create Machine Learning models
- Import all the packages
- Import data set
- Data preprocessing
- Train the model
- Measure Accuracy
Decision Tree Classifier
The decision tree classifier is used for classification data sets. The iris data set is used for this machine learning model. It has four features and one target variable.
Import packages and data set
The pandas and scikit-learn packages are imported. The data set is read with pandas.
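A minimal sketch of this step might look like the following. It assumes scikit-learn's bundled copy of iris in place of the file read in the post, because the file name is not given here; the commented `read_csv` line uses a hypothetical name.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: the bundled iris data stands in for the author's file.
# With a local file, something like this would be used instead:
# df = pd.read_csv("iris.csv")  # hypothetical file name
df = load_iris(as_frame=True).frame  # four feature columns plus "target"

print(isinstance(df, pd.DataFrame))
print(df.shape)
print(df.head())
```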
Data Preprocessing
The data set is split into the independent variables (the four features) and the dependent variable (the target).
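As a sketch of that split, assuming the bundled iris frame again, where the target column happens to be named `target`:

```python
from sklearn.datasets import load_iris

# Assumption: bundled iris data; the column name "target" comes from scikit-learn.
df = load_iris(as_frame=True).frame

X = df.drop(columns="target")  # independent variables: the four features
y = df["target"]               # dependent variable: the species label

print(X.shape, y.shape)
```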
Train the Model
A decision tree classifier object is created and fitted to the data. This trains the machine learning model.
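The training step could be sketched as below. The hold-out split, `test_size=0.25` and `random_state=0` are assumptions added for illustration; the post does not say whether the data was split before training.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True, as_frame=True)

# Assumption: hold out a quarter of the rows so the model can be checked later.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Create the classifier object and fit it to the training data.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
print(predictions[:10])
```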
Measure Accuracy
The accuracy of the machine learning model is checked with a confusion matrix.
Confusion Matrix
A confusion matrix is a table that can be used to measure the performance of a machine learning algorithm, usually a supervised learning one. Each row of the confusion matrix represents the instances of an actual class and each column represents the instances of a predicted class, so correct predictions lie on the diagonal.
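Here is a rough sketch of that check with scikit-learn's `confusion_matrix`, which uses the same layout (rows are actual classes, columns are predicted classes). The train/test split and its parameters are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Assumption: an illustrative hold-out split, not taken from the original code.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

cm = confusion_matrix(y_test, y_pred)      # rows: actual, columns: predicted
accuracy = accuracy_score(y_test, y_pred)  # share of correct predictions

print(cm)
print(accuracy)
```

The accuracy equals the sum of the diagonal divided by the total number of test samples.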
Decision Tree Regression
A decision tree builds regression models in the form of a tree structure. It breaks down a data set into smaller and smaller subsets while at the same time an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes, where each leaf predicts a numeric value.
The position and salary data set is used for this machine learning model.
Import packages and data set
The pandas, Matplotlib and scikit-learn packages are imported. The data set is read with pandas.
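The position and salary file itself is not included here, so the sketch below builds a small stand-in table instead. The column names and every value in it are made up for illustration; they are not the author's data.

```python
import pandas as pd

# Assumption: hypothetical stand-in rows, NOT the original data set.
df = pd.DataFrame(
    {
        "Position": ["Analyst", "Consultant", "Manager",
                     "Director", "Vice President", "CEO"],
        "Level": [1, 2, 3, 4, 5, 6],
        "Salary": [40000, 55000, 80000, 120000, 200000, 400000],
    }
)

print(df)
```

With the real file, `pd.read_csv` would take the place of the hand-built frame.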
Data Preprocessing
The data set is split into the independent variable (the position level) and the dependent variable (the salary).
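A sketch of the split, using the same hypothetical stand-in table (the column names `Level` and `Salary` are assumptions):

```python
import pandas as pd

# Assumption: hypothetical stand-in rows, NOT the original data set.
df = pd.DataFrame(
    {
        "Level": [1, 2, 3, 4, 5, 6],
        "Salary": [40000, 55000, 80000, 120000, 200000, 400000],
    }
)

X = df[["Level"]]  # independent variable, kept 2-D as scikit-learn expects
y = df["Salary"]   # dependent variable

print(X.shape, y.shape)
```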
Train the Model
A decision tree regressor object is created and fitted to the data. This trains the machine learning model.
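Training the regressor could look roughly like this. The data is the same made-up stand-in, and `random_state=0` and the query level 4.2 are illustrative choices.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Assumption: hypothetical stand-in rows, NOT the original data set.
df = pd.DataFrame(
    {
        "Level": [1, 2, 3, 4, 5, 6],
        "Salary": [40000, 55000, 80000, 120000, 200000, 400000],
    }
)
X = df[["Level"]]
y = df["Salary"]

# Create the regressor object and fit it to the data.
regressor = DecisionTreeRegressor(random_state=0)
regressor.fit(X, y)

# A tree predicts the value stored in the leaf the query falls into,
# so the result is always one of the salaries seen during training.
new_level = pd.DataFrame({"Level": [4.2]})
predicted = regressor.predict(new_level)
print(predicted)
```

Because predictions come from leaves, the fitted curve is a step function rather than a smooth line.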
Measure Accuracy
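The accuracy of the machine learning model is checked with the score() function, which for a scikit-learn regressor returns the coefficient of determination (R²) rather than a classification accuracy.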
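As a sketch of this check on the made-up stand-in data, `score()` can be compared with `r2_score`, which computes the same quantity from the predictions:

```python
import pandas as pd
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor

# Assumption: hypothetical stand-in rows, NOT the original data set.
df = pd.DataFrame(
    {
        "Level": [1, 2, 3, 4, 5, 6],
        "Salary": [40000, 55000, 80000, 120000, 200000, 400000],
    }
)
X = df[["Level"]]
y = df["Salary"]

regressor = DecisionTreeRegressor(random_state=0).fit(X, y)

# score() on a regressor returns R^2 for the given data.
r2 = regressor.score(X, y)
r2_manual = r2_score(y, regressor.predict(X))
print(r2, r2_manual)
```

Scoring on the training data, as done here, tends to flatter an unrestricted tree; a held-out set gives a more honest estimate.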
If you want the full code, visit my GitHub page.
Conclusion
Decision Trees are a non-parametric supervised learning method used for both classification and regression tasks. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.