13 Must-Know Machine Learning Algorithms for Data Scientists. The idea comes from https://medium.com/@johnvastola/10-must-know-machine-learning-algorithms-for-data-scientists-adbf3272398a
There are three types of ML algorithms:
- supervised (Regression, Random Forest, Decision Tree, Naive Bayes, and Linear/kernel SVM) (Dataset source: https://archive.ics.uci.edu/ml/datasets/Adult)
- unsupervised (KNN, Clustering, DBSCAN, SVD, and Latent Dirichlet Allocation) (Dataset source: https://www.kaggle.com/datasets/rohan0301/unsupervised-learning-on-country-data)
- and reinforcement learning (Monte Carlo, Markov Decision Processes, and Q-learning).
This repository introduces the 13 algorithms listed above.
- Supervised ML steps:
- Step 1: Load data
- Step 2: View the data info to check for cells with missing values, impossible values, or human errors
- Step 3: Clean the dataset based on the findings from step 2
- Step 4: Visualize each column if needed
- Step 5: Create dummy variables for any columns that are not numeric (categorical columns)
- Step 6: Split the data into training and testing sets
- Step 7: Import the necessary modules from scikit-learn (sklearn)
- Step 8: Fit the model on the training set and create the prediction variable
- Step 9: Visualize results (confusion matrix)
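
The supervised steps above can be sketched roughly as follows. This is only a minimal illustration, not the code used in this repository: the small synthetic table, its column names (`age`, `hours_per_week`, `workclass`, `income_high`), and the choice of a random forest are all assumptions made so the example runs without downloading anything. The confusion matrix is printed rather than plotted, and Step 4 is left as a comment, to avoid a plotting dependency.

```python
import numpy as np
import pandas as pd
# Step 7 (done up front, as is usual in Python): scikit-learn imports
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Step 1: load data. A synthetic table stands in for a real dataset here;
# every column name and value below is made up for illustration.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n).astype(float),
    "hours_per_week": rng.integers(10, 60, size=n),
    "workclass": rng.choice(["Private", "Self-emp", "?"], size=n),
})
# Hypothetical binary target derived from the two numeric columns
df["income_high"] = ((df["age"] + df["hours_per_week"]) > 85).astype(int)
df.loc[::25, "age"] = np.nan  # inject a few missing cells on purpose

# Step 2: inspect the data for missing or suspicious cells
df.info()
print(df.isna().sum())
print(df["workclass"].value_counts())

# Step 3: clean based on Step 2 (drop "?" placeholders and missing values)
df = df[df["workclass"] != "?"].dropna()

# Step 4 (optional): visualize columns, e.g. df.hist() if matplotlib is installed

# Step 5: turn the non-numeric column into dummy (one-hot) columns
df = pd.get_dummies(df, columns=["workclass"])

# Step 6: split into training and testing sets
X = df.drop(columns="income_high")
y = df["income_high"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 8: fit on the training set and create the prediction variable
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Step 9: confusion matrix (rows are true classes, columns are predictions)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
```

Any other scikit-learn classifier (for example a decision tree or naive Bayes model) could be swapped in at Step 8 without changing the rest of the flow.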