# Ensemble Learning Explained! Part 1

Ensemble modelling is a great way to improve the performance of a model: it combines the predictions of multiple models into one. This is a two-part post; Part 2 will be uploaded soon.

# What is Ensemble Learning?

Basically, an **ensemble** is a supervised learning technique for combining multiple **weak learners**, or models, to produce a **strong learner**.

# Why use Ensemble Learning?

Using ensemble learning, we combine the strengths of several models, which generally yields better predictions than any single model on its own.

# Basic Ensemble Techniques

Basic ensemble techniques predict the final output using one of three approaches:

- Max Voting
- Averaging
- Weighted Averaging

# Max Voting

In Max Voting, multiple models each make a prediction for every input. Each model's prediction counts as a *vote*, and the prediction with the majority of votes becomes the final prediction.
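The voting step can be sketched in a few lines of Python; the votes below are hypothetical model outputs, used purely for illustration:

```python
from collections import Counter

def max_vote(predictions):
    """Return the label that received the most votes."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical class predictions from three models for one input:
votes = ["cat", "dog", "cat"]
print(max_vote(votes))  # "cat" wins with two of the three votes
```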

# Averaging

In this method, we take the average of all the predictions and use it to make the final prediction.
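For example, averaging three hypothetical predicted probabilities (values chosen for illustration):

```python
import numpy as np

# Hypothetical predicted probabilities from three models for one sample
preds = np.array([0.70, 0.60, 0.80])
final = preds.mean()  # (0.70 + 0.60 + 0.80) / 3
print(round(final, 2))  # 0.7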

# Weighted Averaging

This is an extension of the Averaging method. Here each model is assigned a weight, and the final prediction is the weighted average of the individual predictions. The weights are assigned as per our business case.
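A minimal sketch, with illustrative weights that sum to 1 (in practice the weights would come from the business case or validation performance):

```python
import numpy as np

# Hypothetical predictions and assumed weights (chosen for illustration)
preds = np.array([0.70, 0.60, 0.80])
weights = np.array([0.5, 0.2, 0.3])
final = np.average(preds, weights=weights)  # 0.5*0.70 + 0.2*0.60 + 0.3*0.80
print(round(final, 2))  # 0.71
```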

# Advanced Ensemble Learning

Under advanced ensemble learning, we have the following algorithms:

# Stacking

Stacking uses prediction from multiple models to build a new model.

*Steps:*

1. The training dataset is split into 10 parts.
2. The *Base Model* (e.g. a Decision Tree) is fitted on **9 parts**; a prediction is made on the **10th part**.
3. Repeating this for each part, *Base Model* predictions are made on the **whole training dataset**.
4. Predictions are made on the **test dataset** using the same model (the *Base Model*).
5. Steps 2 to 4 are repeated for a different prediction algorithm *(e.g. KNN)*.
6. The predictions from the train set are used as features to **build a new model**.
7. This model is used to make the final prediction.
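The steps above can be sketched with scikit-learn on toy data. The model choices (a decision tree and KNN as base models, logistic regression as the new model) are illustrative, not prescriptive:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

# Toy data standing in for a real train/test split
X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

# Out-of-fold predictions over the whole train set (10 folds per base model)
train_meta = np.column_stack(
    [cross_val_predict(m, X_train, y_train, cv=10) for m in base_models]
)
# Each base model, refit on the full train set, predicts the test set
test_meta = np.column_stack(
    [m.fit(X_train, y_train).predict(X_test) for m in base_models]
)

# Train-set predictions become features for the new (meta) model
meta_model = LogisticRegression().fit(train_meta, y_train)
final_pred = meta_model.predict(test_meta)
```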

# Blending

Blending follows a similar approach to stacking, but uses only a **validation** (holdout) set, split from the train set, to make the predictions that feed the new model.

Steps:

- The **train set** is split into **training** and **validation** sets.
- Models are fitted on the **training** set.
- Predictions are made on the **validation** set.
- The validation set and its predictions are used as features to build a new model.
- The final model is used to make predictions using the test set's meta-features.
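A sketch of these steps with scikit-learn on toy data; the split sizes and model choices are assumptions for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
# Split the train set into a smaller training set and a validation (holdout) set
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0
)

models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]
for m in models:
    m.fit(X_tr, y_tr)  # base models see only the training split

# Validation-set predictions become the meta-features for the new model
val_meta = np.column_stack([m.predict(X_val) for m in models])
test_meta = np.column_stack([m.predict(X_test) for m in models])

meta_model = LogisticRegression().fit(val_meta, y_val)
final_pred = meta_model.predict(test_meta)
```

Unlike stacking, the base models here are fitted only once, and the meta-model never sees out-of-fold train-set predictions, only the holdout set's.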

# Bagging

Bagging combines the results of multiple models (usually decision trees) to produce a generalized model.

**Bootstrapping**, the sampling step behind **Bootstrap Aggregation** (Bagging), draws subsets (bags) of the data to get a fair idea of its distribution.

*Why Bootstrapping?* Because if we feed the same data to the different models in an ensemble, the results will be the same for all the models, since the inputs are the same.

**Steps**:

- Multiple subsets are created using the original dataset.
- A base model is created on each of these subsets.
- These models run in parallel and are independent of each other.
- The final prediction is determined by a combination of predictions of all the models.
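These steps can be sketched with scikit-learn's `BaggingClassifier` on toy data (by default its base learner is a decision tree, and each of the `n_estimators` learners is fitted on a bootstrap sample):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 10 base learners, each fitted on a bootstrap subset of the train set;
# their predictions are combined into the final prediction
bag = BaggingClassifier(n_estimators=10, random_state=0)
bag.fit(X_train, y_train)
accuracy = bag.score(X_test, y_test)
```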

# Boosting

Boosting is a sequential method, where the subsequent model attempts to correct the error of the previous model.

**Steps**:

- Subset is created from the original dataset.
- All data points are given equal weights at first.
- A *base model* is created on this subset.
- This model is used to make predictions on the whole dataset.
- Errors are found by comparing the actual and predicted values.
- Data points that are incorrectly predicted are given higher weights.
- Another model is created and predictions are made on the dataset. This model tries to correct the error of the previous model.
- Similar models are subsequently created, each trying to correct the error of the previous one.
- The final model, i.e. the strong learner, is the weighted mean of all the weak learners.
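A sketch of this sequential reweighting idea using scikit-learn's `AdaBoostClassifier` on toy data (AdaBoost is one concrete boosting algorithm, covered in more depth in Part 2):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 weak learners fitted one after another; each round gives higher
# weight to the points the previous learner misclassified
boost = AdaBoostClassifier(n_estimators=50, random_state=0)
boost.fit(X_train, y_train)
accuracy = boost.score(X_test, y_test)
```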

The topic for Part 2 of Ensemble Learning will be on detailed ensemble algorithms like AdaBoost, XGBoost, Random Forest etc.

Found this article useful? Follow me, Vignesh, for more.