You will compare the performance of the following optimization algorithms:
- Stochastic gradient descent
- Gradient descent with momentum
- Gradient descent with an adaptive learning rate. You need to implement a line search method that tunes the learning rate in each iteration (a sketch appears after this list).
- Adagrad
- Adam
- Build each optimizer so it can be easily integrated with TensorFlow models built on top of `tf.keras.models` and can use the built-in loss functions inherited from the `tf.keras.losses` base classes.
- Build an SGD trainer that can utilize any of the implemented optimizers (a trainer sketch follows the list below).
- All implemented optimizers inherit from the base class `Optimizer`, which lives in `/optimization/optimizers.py` (a sketch of this base class follows the list below).
- Build unit tests (using pytest) for each part of the code to ensure correctness and so that each component can be used independently without issues (an example test is shown below).
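
A minimal sketch of what the `Optimizer` base class in `/optimization/optimizers.py` and one concrete optimizer (gradient descent with momentum) might look like. The class and method names (`apply_gradients`) and the constructor parameters are illustrative assumptions, not the actual project API.

```python
# Hypothetical sketch of /optimization/optimizers.py; names and signatures are assumptions.
import tensorflow as tf


class Optimizer:
    """Base class: an optimizer applies gradient updates to tf.Variable parameters."""

    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate

    def apply_gradients(self, grads_and_vars):
        """Update each variable from its gradient; concrete optimizers override this."""
        raise NotImplementedError


class MomentumSGD(Optimizer):
    """Gradient descent with momentum: v <- beta * v + g, then w <- w - lr * v."""

    def __init__(self, learning_rate=0.01, momentum=0.9):
        super().__init__(learning_rate)
        self.momentum = momentum
        self._velocity = {}  # one velocity slot per variable, keyed by variable ref

    def apply_gradients(self, grads_and_vars):
        for grad, var in grads_and_vars:
            key = var.ref()
            if key not in self._velocity:
                self._velocity[key] = tf.Variable(tf.zeros_like(var))
            v = self._velocity[key]
            v.assign(self.momentum * v + grad)           # update the velocity
            var.assign_sub(self.learning_rate * v)       # step the parameter
```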
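
For the adaptive-learning-rate optimizer, one simple interpretation of the required line search is a backtracking scheme that shrinks the trial step until the loss decreases. The helper below is only an illustration under that interpretation; the function name and parameters are hypothetical.

```python
import tensorflow as tf


def backtracking_step(loss_fn, variables, grads, init_lr=1.0, shrink=0.5, max_tries=20):
    """Backtracking line search: start from init_lr and halve it until the loss decreases.

    loss_fn: zero-argument callable returning the current scalar loss.
    variables / grads: matching lists of tf.Variables and their gradients.
    Returns the learning rate that was finally used.
    """
    base_loss = loss_fn()
    lr = init_lr
    for _ in range(max_tries):
        # Take a trial step along the negative gradient.
        for var, grad in zip(variables, grads):
            var.assign_sub(lr * grad)
        if loss_fn() < base_loss:
            return lr
        # Undo the step and try a smaller learning rate.
        for var, grad in zip(variables, grads):
            var.assign_add(lr * grad)
        lr *= shrink
    return lr
```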
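
The trainer requirement could look roughly like the following sketch, which wires a `tf.keras` model, a built-in loss from `tf.keras.losses`, and any implemented optimizer into a training loop. The class name `SGDTrainer` and its methods are assumptions made for illustration.

```python
import tensorflow as tf


class SGDTrainer:
    """Hypothetical trainer: runs mini-batch training with any Optimizer implementation."""

    def __init__(self, model, loss_fn, optimizer):
        self.model = model            # a tf.keras.Model
        self.loss_fn = loss_fn        # e.g. tf.keras.losses.MeanSquaredError()
        self.optimizer = optimizer    # any Optimizer from /optimization/optimizers.py

    def train_step(self, x_batch, y_batch):
        with tf.GradientTape() as tape:
            preds = self.model(x_batch, training=True)
            loss = self.loss_fn(y_batch, preds)
        grads = tape.gradient(loss, self.model.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.model.trainable_variables))
        return loss

    def train(self, dataset, epochs=1):
        for _ in range(epochs):
            for x_batch, y_batch in dataset:
                loss = self.train_step(x_batch, y_batch)
        return loss
```

With these assumed names, usage might look like `SGDTrainer(model, tf.keras.losses.MeanSquaredError(), MomentumSGD(0.01)).train(dataset, epochs=5)`.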
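
For the pytest requirement, a unit test per optimizer could check that a single update reduces the loss of a simple quadratic. This example reuses the hypothetical `MomentumSGD` sketch and import path from above.

```python
import tensorflow as tf
# Hypothetical import path, matching the sketch of /optimization/optimizers.py above.
from optimization.optimizers import MomentumSGD


def test_momentum_sgd_reduces_quadratic_loss():
    """One update on f(w) = w^2 should move w toward 0 and lower the loss."""
    w = tf.Variable(3.0)
    optimizer = MomentumSGD(learning_rate=0.1, momentum=0.9)

    with tf.GradientTape() as tape:
        loss = w * w
    grad = tape.gradient(loss, w)
    optimizer.apply_gradients([(grad, w)])

    # The loss after the update must be strictly smaller than before.
    assert float(w * w) < float(loss)
```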