The running project code is in the file operationalizing_ml.ipynb, which is based on the starter code file starter_files/aml-pipelines-with-automated-machine-learning-step.ipynb
This project operationalizes Machine Learning in Microsoft Azure. A classification model is trained and deployed, and the deployed model's endpoint URI can then be used to make predictions. A pipeline is also created, published, and consumed; the published pipeline's endpoint URI can later be used to initiate a new run.
- Consider implementing one or more of the stand-out suggestions from the project rubric
- Research and use balanced data to improve the quality of the predictions
- Version the pipelines. This allows new models to be developed while customers continue to use the existing version
This step trains models across different algorithms and hyperparameters and surfaces the best model
- An Azure ML Dataset is created and registered if it does not already exist. The data is retrieved from the URL provided.
- A Compute cluster is created if it does not already exist
- An AutoML configuration is specified with key information such as the primary metric, the label column, etc.
- A run is then submitted to train the model. Once the experiment/run is completed, a number of models including the best model may be inspected.
- The best model is associated with the run; for this run it is the VotingEnsemble
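The steps above can be sketched with the Azure ML Python SDK (v1 assumed). The workspace config, cluster name, dataset name/URL, and label column below are illustrative, not taken verbatim from the notebook:

```python
"""Sketch of the AutoML training steps (azureml-sdk v1 assumed).

All names (dataset key, cluster name, URL, label column) are illustrative.
"""

AUTOML_SETTINGS = {
    "experiment_timeout_minutes": 20,
    "primary_metric": "AUC_weighted",   # metric used to rank candidate models
    "n_cross_validations": 5,
    "max_concurrent_iterations": 4,
}


def train_best_model():
    # Imports are local so the sketch can be read without azureml installed.
    from azureml.core import Workspace, Dataset, Experiment
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.train.automl import AutoMLConfig

    ws = Workspace.from_config()

    # Register the dataset only if it does not already exist in the workspace.
    key = "bankmarketing-ds"
    if key in ws.datasets:
        dataset = ws.datasets[key]
    else:
        url = "https://<storage-account>/bankmarketing_train.csv"  # placeholder
        dataset = Dataset.Tabular.from_delimited_files(url)
        dataset = dataset.register(ws, name=key)

    # Reuse the compute cluster if it already exists, otherwise create it.
    cluster_name = "automl-cluster"
    try:
        compute = ComputeTarget(workspace=ws, name=cluster_name)
    except Exception:
        config = AmlCompute.provisioning_configuration(
            vm_size="STANDARD_DS3_V2", max_nodes=4
        )
        compute = ComputeTarget.create(ws, cluster_name, config)
        compute.wait_for_completion(show_output=True)

    automl_config = AutoMLConfig(
        task="classification",
        training_data=dataset,
        label_column_name="y",      # assumed target column of the dataset
        compute_target=compute,
        **AUTOML_SETTINGS,
    )

    run = Experiment(ws, "automl-experiment").submit(automl_config)
    run.wait_for_completion(show_output=True)
    best_run, fitted_model = run.get_output()  # VotingEnsemble in this project
    return best_run, fitted_model
```

`run.get_output()` returns both the best child run and the fitted model, which is how the VotingEnsemble is identified and later registered.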
In this step, we use the Azure ML Studio user interface to deploy the best model. We use ACI (Azure Container Instance) with authentication enabled.
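Although the project performs this deployment through the Studio UI, the equivalent SDK calls look roughly like the sketch below (azureml-sdk v1 assumed; model and service names are illustrative):

```python
"""Sketch of the ACI deployment done in the Studio UI (azureml-sdk v1 assumed)."""

# Mirrors the UI settings: small ACI container with key auth enabled.
DEPLOY_PARAMS = {"cpu_cores": 1, "memory_gb": 1, "auth_enabled": True}


def deploy_best_model():
    # Imports are local so the sketch can be read without azureml installed.
    from azureml.core import Workspace
    from azureml.core.model import Model
    from azureml.core.webservice import AciWebservice

    ws = Workspace.from_config()
    model = Model(ws, name="best-automl-model")  # the registered VotingEnsemble

    aci_config = AciWebservice.deploy_configuration(**DEPLOY_PARAMS)
    service = Model.deploy(
        ws, "best-model-deploy", [model], deployment_config=aci_config
    )
    service.wait_for_deployment(show_output=True)
    return service.scoring_uri  # REST URI used later by endpoint.py
```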
Logging (Application Insights) helps troubleshoot and understand the workflow. It also helps with quantifying performance at various stages of the execution. The Webservice object is used to enable/disable Application Insights.
Display output from running logs.py. logs.py retrieves the logs from the WebService.
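A minimal sketch of what logs.py does (azureml-sdk v1 assumed; the service name is illustrative):

```python
"""Sketch of logs.py: enable Application Insights and print the service logs."""

SERVICE_NAME = "best-model-deploy"  # illustrative deployed-service name


def show_logs():
    # Imports are local so the sketch can be read without azureml installed.
    from azureml.core import Workspace
    from azureml.core.webservice import Webservice

    ws = Workspace.from_config()
    service = Webservice(workspace=ws, name=SERVICE_NAME)

    # Turn on Application Insights for the deployed ACI service.
    service.update(enable_app_insights=True)

    # Retrieve and print the container logs for troubleshooting.
    print(service.get_logs())
```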
Once a model is published, Azure ML exposes a swagger.json file. This can be consumed by Swagger and helps with the documentation of the methods that are exposed together with the JSON payloads for input/output. This makes it much easier to start consuming the endpoint.
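To illustrate the kind of information swagger.json carries, the fragment below is a hand-written example in the same shape (the real file for a deployment will differ); it can be inspected with the standard library alone:

```python
import json

# Hand-written fragment shaped like the swagger.json Azure ML serves;
# the actual spec for a given deployment will differ.
SWAGGER_FRAGMENT = """
{
  "swagger": "2.0",
  "info": {"title": "best-model-deploy", "version": "1.0"},
  "paths": {
    "/score": {
      "post": {
        "summary": "Run model inference",
        "consumes": ["application/json"],
        "produces": ["application/json"]
      }
    }
  }
}
"""

spec = json.loads(SWAGGER_FRAGMENT)
# List the documented endpoints and their HTTP methods.
methods = {path: sorted(ops) for path, ops in spec["paths"].items()}
print(methods)  # {'/score': ['post']}
```

Swagger UI renders exactly this structure: each path, its methods, and the input/output payload schemas, which is what makes the endpoint easy to consume.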
The model endpoint is consumed by making a REST API call (endpoint.py). Since authentication is enabled, the key must also be provided. The output of such a model invocation is displayed below.
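A sketch in the spirit of endpoint.py, using only the standard library; the URI, key, and feature names are placeholders, not the real deployment values:

```python
"""Sketch of endpoint.py: call the deployed model's REST endpoint.

The scoring URI, key, and input features below are placeholders.
"""
import json
import urllib.request

SCORING_URI = "http://<aci-service>.azurecontainer.io/score"  # placeholder
KEY = "<primary-key>"                                         # placeholder

# One example row in the rough shape the model expects (illustrative features).
payload = json.dumps(
    {"data": [{"age": 40, "job": "technician", "marital": "married"}]}
)

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {KEY}",  # required because auth is enabled
}


def score(uri=SCORING_URI, body=payload):
    req = urllib.request.Request(
        uri, data=body.encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # model prediction(s) as JSON


if __name__ == "__main__":
    print(score())
```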
- A pipeline is created when it is "run" in the context of an experiment. Pipelines are visualized in the Pipelines section of Azure ML Studio
- Once a pipeline is published, a pipeline endpoint is generated and may be reviewed from Azure ML Studio under the 'Pipeline Endpoint' tab
- Review of the pipeline from Azure ML Studio showing the bank marketing dataset and the AutoML module
- Once published, the Pipeline Details tab will show the published pipeline status (Active in this case) and also the pipeline REST endpoint that can be called to "run" the pipeline.
- The RunDetails widget asynchronously displays the run details in the Notebook as the pipeline run progresses.
- A pipeline run status (Scheduled, Completed, etc.) can be reviewed in Azure ML Studio in the Pipelines section
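The publish-and-consume flow described above can be sketched as follows (azureml-sdk v1 assumed; experiment and pipeline names are illustrative):

```python
"""Sketch: publish a pipeline and trigger a new run via its REST endpoint.

Experiment and pipeline names are illustrative, not the project's actual values.
"""

PIPELINE_NAME = "bankmarketing-train-pipeline"


def publish_and_run(pipeline_run):
    # Imports are local so the sketch can be read without azureml installed.
    from azureml.core.authentication import InteractiveLoginAuthentication
    import requests

    # `pipeline_run` is the completed run submitted from the notebook.
    published = pipeline_run.publish_pipeline(
        name=PIPELINE_NAME,
        description="Trains the bank-marketing AutoML model",
        version="1.0",  # versioning lets new models ship while v1 stays in use
    )

    # Calling the REST endpoint schedules a brand-new pipeline run,
    # which then appears as Scheduled/Running in the Pipelines section.
    auth_header = InteractiveLoginAuthentication().get_authentication_header()
    resp = requests.post(
        published.endpoint,
        headers=auth_header,
        json={"ExperimentName": "pipeline-rest-run"},
    )
    return resp.json()["Id"]  # id of the newly scheduled run
```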