Ensemble Learning Explained! Part 1
Ensemble modelling is a great way to improve the performance of a model: instead of relying on a single model, it combines the predictions of multiple models. This is a two-part post; Part 2 will be uploaded soon.
What is Ensemble Learning?
Basically, an ensemble is a supervised learning technique for combining multiple weak learners or models to produce a strong learner.
Why use Ensemble Learning?
Using ensemble learning, we get the combined power of several models, which generally gives better and more robust predictions than any single model on its own.
Basic Ensemble Techniques
Basic ensemble techniques combine the predictions of the individual models in one of three ways:
- Max Voting
- Averaging
- Weighted Averaging
Max Voting
In Max Voting, multiple models make a prediction for each input. Each model's prediction is counted as a vote, and the prediction with the majority of the votes becomes the final prediction.
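For instance, a hard-voting ensemble can be put together with scikit-learn's VotingClassifier. This is a minimal sketch; the choice of base models (logistic regression, decision tree, KNN) and the toy dataset are just illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 'hard' voting: each model casts one vote and the majority class wins
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier()),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
print("Max-voting accuracy:", ensemble.score(X_test, y_test))
```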
Averaging
In this method, we take the average of all the predictions and use it to make the final prediction.
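As a rough sketch (the prediction values here are hypothetical outputs from three already-fitted models), averaging can be as simple as:

```python
import numpy as np

# Hypothetical predictions from three already-fitted models for the same inputs
pred1 = np.array([0.2, 0.7, 0.9])
pred2 = np.array([0.3, 0.6, 0.8])
pred3 = np.array([0.4, 0.8, 0.7])

# Final prediction is the plain average of the individual predictions
final_pred = np.mean([pred1, pred2, pred3], axis=0)
print(final_pred)  # [0.3 0.7 0.8]
```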
Weighted Averaging
This is an extension of the averaging method. Here each model is assigned a weight reflecting its importance, and the final prediction is the weighted average of the individual predictions. The weights are assigned as per our business case or the models' relative performance.
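Continuing the same hypothetical example, a weighted average only changes how the predictions are combined (the weights below are arbitrary illustrative values):

```python
import numpy as np

# Same hypothetical predictions as above
pred1 = np.array([0.2, 0.7, 0.9])
pred2 = np.array([0.3, 0.6, 0.8])
pred3 = np.array([0.4, 0.8, 0.7])

# Give the (supposedly) stronger model a larger weight; weights sum to 1
weights = [0.5, 0.3, 0.2]
final_pred = np.average([pred1, pred2, pred3], axis=0, weights=weights)
print(final_pred)  # [0.27 0.69 0.83]
```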
Advanced Ensemble Learning
Under advanced ensemble learning, we have the following algorithms:
Stacking
Stacking uses predictions from multiple models to build a new (meta) model. A minimal code sketch follows the steps below.
Steps:
- The training dataset is split into 10 parts.
- A base model (e.g. a decision tree) is fitted on 9 parts and predictions are made on the 10th part. This is repeated for each of the 10 parts.
- In this way, base-model predictions are obtained for the whole training dataset.
- The base model is then fitted on the full training dataset and predictions are made on the test dataset.
- Steps 2 to 4 are repeated for a different prediction algorithm (e.g. KNN).
- The predictions on the train set are used as features to build a new (meta) model.
- This meta-model is used to make the final predictions on the test set.
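A minimal sketch of these steps, assuming scikit-learn and a toy dataset (the decision tree and KNN base models and the logistic-regression meta-model are illustrative choices, not prescribed by the method):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

train_meta, test_meta = [], []
for model in base_models:
    # Out-of-fold predictions on the train set (steps 1-3, with 10 folds)
    train_meta.append(cross_val_predict(model, X_train, y_train, cv=10))
    # Fit on the full train set and predict the test set (step 4)
    test_meta.append(model.fit(X_train, y_train).predict(X_test))

# Base-model predictions become the features of the meta-model (steps 6-7)
meta_model = LogisticRegression()
meta_model.fit(np.column_stack(train_meta), y_train)
print("Stacking accuracy:", meta_model.score(np.column_stack(test_meta), y_test))
```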
Blending
Blending follows a similar approach to stacking, but uses only a validation (holdout) set carved out of the train set to build the meta-model. A sketch follows the steps below.
Steps:
- The train set is split into a training set and a validation (holdout) set.
- The base models are fitted on the training set.
- Predictions are made on the validation set.
- The validation set and its predictions are used as features to build a new (meta) model.
- This meta-model makes the final predictions on the test set, using the base models' test-set predictions as meta-features.
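A rough sketch, again assuming scikit-learn and illustrative base and meta models (here only the base-model predictions are used as meta-features, which is a common simplification):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Carve a holdout (validation) set out of the train set
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0
)

base_models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

# Fit on the training part only, then predict on the validation and test sets
val_meta = np.column_stack([m.fit(X_tr, y_tr).predict(X_val) for m in base_models])
test_meta = np.column_stack([m.predict(X_test) for m in base_models])

# The meta-model learns from the validation predictions, then scores the test set
meta_model = LogisticRegression()
meta_model.fit(val_meta, y_val)
print("Blending accuracy:", meta_model.score(test_meta, y_test))
```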
Bagging
Bagging combines the results of multiple models (usually decision trees) to give a more generalized model.
Bagging, or Bootstrap Aggregation, trains each model on a different bootstrapped subset (bag) of the data, sampled with replacement, so that the ensemble gets a fair idea of the distribution of the data.
Why bootstrapping? Because if we fed the same data to every model in the ensemble, the results would be the same for all the models, since the inputs are identical. A minimal code sketch follows the steps below.
Steps:
- Multiple subsets are created using the original dataset.
- A base model is created on each of these subsets.
- These models run in parallel and are independent of each other.
- The final prediction is determined by a combination of predictions of all the models.
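For illustration, scikit-learn's BaggingClassifier implements exactly this recipe (the decision-tree base model and the parameter values are just example choices):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 decision trees, each trained on a bootstrap sample (a "bag") of the train set
bagging = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=50,
    bootstrap=True,
    random_state=0,
)
bagging.fit(X_train, y_train)
print("Bagging accuracy:", bagging.score(X_test, y_test))
```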
Boosting
Boosting is a sequential method, where each subsequent model attempts to correct the errors of the previous model. A quick sketch follows the steps below.
Steps:
- A subset is created from the original dataset.
- All data points are given equal weights at first.
- A base model is created on this subset.
- This model is used to make predictions on the whole dataset.
- Errors are calculated by comparing the actual and predicted values.
- Data points that are incorrectly predicted are given higher weights.
- Another model is created and predictions are made on the dataset. This model tries to correct the errors of the previous model.
- Further models are created in the same way, each trying to correct the errors of the previous one.
- The final model i.e. the strong learner, will be the weighted mean of all the weak learners.
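AdaBoost (covered in detail in Part 2) is a classic realisation of this reweighting loop. A minimal sketch using scikit-learn, with illustrative parameter choices:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 shallow trees trained sequentially; misclassified points get higher
# weights, so each new tree focuses on the previous trees' mistakes
boosting = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    random_state=0,
)
boosting.fit(X_train, y_train)
print("Boosting accuracy:", boosting.score(X_test, y_test))
```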
Part 2 of this Ensemble Learning series will cover detailed ensemble algorithms like AdaBoost, XGBoost, Random Forest, etc.
Found this article useful? Follow me, Vignesh, for more.