The pipeline combines several libraries with techniques implemented by hand in Python to handle multi-class text classification problems. It aims for higher accuracy than a plain TF-IDF feature extraction pipeline, and it can be used to get baseline scores for text categorisation.
Zipf's law and chi-square filters
TF-IDF, bag of words, GloVe vectors, novel document vector aggregation using GloVe vectors
LightGBM, XGBoost, Random Forest
Logistic regression, SGD classifier
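To illustrate the frequency-based filtering listed above, here is a minimal sketch of a Zipf-style filter. It assumes the filter drops the most frequent tokens (the head of the Zipf distribution) and tokens that occur too rarely (the tail). The function name `zipf_filter` and its parameters `drop_top` and `min_count` are hypothetical and are not taken from the pipeline's code.

```python
from collections import Counter


def zipf_filter(tokenized_docs, drop_top=1, min_count=2):
    """Hypothetical helper: trim the head and tail of the token frequency ranking.

    Removes the `drop_top` most frequent tokens and every token that
    occurs fewer than `min_count` times across the whole corpus.
    """
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    too_common = {tok for tok, _ in counts.most_common(drop_top)}
    keep = {
        tok for tok, count in counts.items()
        if count >= min_count and tok not in too_common
    }
    filtered = [[tok for tok in doc if tok in keep] for doc in tokenized_docs]
    return filtered, keep


# Toy corpus, made up for illustration only.
docs = [
    "the match was won by the home team".split(),
    "the team lost the match".split(),
    "the election was won by the party".split(),
]
filtered, vocab = zipf_filter(docs, drop_top=1, min_count=2)
print(sorted(vocab))
print(filtered)
```

In practice the two thresholds would likely be tuned per dataset, since a cut-off that is too aggressive can remove class-specific terms that a later chi-square filter would otherwise rank highly.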
The pipeline outputs a classification report with scores for each individual class, together with the corresponding weighted precision and recall.
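The sketch below shows how per-class and weighted precision and recall can be computed. It assumes that "weighted" means each class score is weighted by its support (the number of true examples of that class). The function name `per_class_report` and the labels are hypothetical. scikit-learn's `classification_report` reports the same quantities, but whether the pipeline uses it is not stated here.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Hypothetical helper: per-class precision/recall plus a support-weighted average."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    report = {}
    for label in labels:
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        recall = correct[label] / support[label] if support[label] else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "support": support[label],
        }

    total = len(y_true)
    # Assumption: "weighted" means weighting each class score by its support.
    report["weighted avg"] = {
        "precision": sum(report[l]["precision"] * support[l] for l in labels) / total,
        "recall": sum(report[l]["recall"] * support[l] for l in labels) / total,
        "support": total,
    }
    return report


# Toy labels, made up for illustration only.
y_true = ["sport", "sport", "politics", "tech", "tech", "tech"]
y_pred = ["sport", "politics", "politics", "tech", "tech", "sport"]
report = per_class_report(y_true, y_pred)
for name, scores in report.items():
    print(name, scores)
```

With support weighting, the weighted recall equals the overall accuracy, so the weighted precision is usually the more informative of the two summary numbers when classes are imbalanced.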