Machine learning study notes covering the math behind the mainstream tree-based machine learning models: basic decision tree models (ID3, C4.5, CART), boosted models (GBM, AdaBoost, XGBoost, LightGBM), and bagging models (Bagging Tree, Random Forest, ExtraTrees).
Bagging Tree
One Sentence Summary:
Train multiple strong base learners in parallel on different subsets of the dataset, then take the average (regression) or the majority vote (classification) of their outputs as the final prediction.
a. Difference between Bagging and Boosting
The logic behind boosting is to add weak base learners step by step so that each new learner corrects the mistakes of the previous ones, gradually forming a strong learner. The core idea behind bagging, in contrast, is to train multiple strong base learners in parallel on different bootstrap samples and aggregate them, which mainly helps to prevent overfitting.

| Aspects | Boosting | Bagging |
| --- | --- | --- |
| Ensemble category | Sequential ensembling: weak base learners are generated sequentially | Parallel ensembling: strong base learners are generated in parallel |
| Overall target | Reduce bias | Reduce variance |
| Target of individual base learners | Correct the errors of the previous weak learners | Reduce the overfitting of each strong base learner |
| Parallel computing | Parallel computing within a single tree (XGBoost) | Parallel computing within a single tree & across different trees |
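To make the contrast concrete, here is a minimal sketch comparing the two ensembling styles with scikit-learn. The dataset choice, hyperparameters, and variable names are illustrative assumptions, not part of the original notes.

```python
# Rough sketch contrasting bagging vs. boosting with scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Bagging: strong (deep) trees trained independently on bootstrap samples, combined by majority vote.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100, random_state=0)

# Boosting: weak (shallow) trees added sequentially, each focusing on earlier mistakes.
boosting = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=100, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    model.fit(X_train, y_train)
    print(name, "test accuracy:", model.score(X_test, y_test))
```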
b. The Bagging Tree Classification Algorithm
Model Input: training set $D = \{(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)\}$, number of base learners $M$
Model Output: final classifier $G(x)$

Steps:

1. For $m = 1, 2, ..., M$:
   - Draw a bootstrap sample $D_m$ of size $N$ from $D$ by sampling with replacement.
   - Train a base classifier $G_m(x)$ on $D_m$ (typically a fully grown decision tree).
2. Output the final classifier by majority vote over the base classifiers:
   $G(x) = \arg\max_{c} \sum_{m=1}^{M} I(G_m(x) = c)$
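As a companion to the steps above, the following is a minimal from-scratch sketch of bagged classification trees. The helper names (`bagging_fit`, `bagging_predict`) and the toy data are my own illustration, not a library API; scikit-learn is used only for the base decision tree.

```python
# Minimal from-scratch sketch of the bagging classification steps above.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, M=25, random_state=0):
    """Train M base classifiers, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(random_state)
    N = len(X)
    learners = []
    for _ in range(M):
        idx = rng.integers(0, N, size=N)          # sample N indices with replacement
        learners.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return learners

def bagging_predict(learners, X):
    """Majority vote: G(x) = argmax_c sum_m I(G_m(x) = c)."""
    votes = np.stack([tree.predict(X) for tree in learners])   # shape (M, n_samples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)

# Toy usage on random data with integer labels in {0, 1}.
X = np.random.RandomState(0).randn(200, 5)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
learners = bagging_fit(X, y, M=25)
print("training accuracy:", (bagging_predict(learners, X) == y).mean())
```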
c. The Bagging Tree Regression Algorithm
Model Input: training set $D = \{(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)\}$, number of base learners $M$
Model Output: final regressor $G(x)$

Steps:

1. For $m = 1, 2, ..., M$:
   - Draw a bootstrap sample $D_m$ of size $N$ from $D$ by sampling with replacement.
   - Train a base regressor $G_m(x)$ on $D_m$.
2. Output the final regressor as the average of the base learners:
   $G(x) = \frac{1}{M} \sum_{m=1}^{M} G_m(x)$
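The regression case follows the same recipe, with averaging in place of voting. Below is a minimal sketch under the same assumptions as the classification example (helper names and toy data are illustrative only).

```python
# Minimal from-scratch sketch of the bagging regression steps above.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagging_regression_fit(X, y, M=25, random_state=0):
    """Train M base regressors, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(random_state)
    N = len(X)
    learners = []
    for _ in range(M):
        idx = rng.integers(0, N, size=N)          # sample N indices with replacement
        learners.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return learners

def bagging_regression_predict(learners, X):
    """Average: G(x) = (1/M) * sum_m G_m(x)."""
    return np.mean([tree.predict(X) for tree in learners], axis=0)

# Toy usage on a noisy sine curve.
X = np.linspace(0, 6, 300).reshape(-1, 1)
y = np.sin(X).ravel() + np.random.RandomState(0).normal(scale=0.2, size=300)
learners = bagging_regression_fit(X, y)
pred = bagging_regression_predict(learners, X)
print("training MSE:", np.mean((pred - y) ** 2))
```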
Reference
- Breiman, Leo. "Bagging predictors." Machine learning 24.2 (1996): 123-140.
- Zhihua Zhou. Machine Learning[M]. Tsinghua University Press, 2018. [Chinese]
- https://towardsdatascience.com/decision-tree-ensembles-bagging-and-boosting-266a8ba60fd9
- https://machinelearningmastery.com/bagging-and-random-forest-ensemble-algorithms-for-machine-learning/
- https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingClassifier.html