# QoS Solution

The aim of this solution is to provide the core for any streaming video platform that wants to improve its QoS system.
The architecture includes standard AWS components for the video-streaming side of an OTT platform, and Databricks as a Unified Data Analytics Platform for both the real-time insights and the advanced analytics (machine learning) capabilities.
The provided notebooks showcase an end-to-end project using Delta Lake and a Delta Architecture pattern:

- data ingestion, including a "make your data available to everyone" pipeline with real-time data enrichment and anonymisation
- real-time notifications based on a complex rules engine or machine-learning-based scoring
- real-time aggregations to update the web application
- quick, shareable dashboards built directly on top of the datasets stored in your Delta Lake (e.g. the Network Operations Center dashboard)
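The enrichment-and-anonymisation step above can be sketched in plain Python. This is an illustrative sketch, not code from the provided notebooks: the salt, field names, and error-tagging rule are all assumptions made for the example.

```python
import hashlib

# Hypothetical salt for the example only; in a real deployment, store the
# salt in a secrets manager rather than in the notebook itself.
SALT = "replace-with-a-secret-salt"

def anonymise_user_id(user_id: str) -> str:
    """Return a deterministic, non-reversible token for a user ID."""
    return hashlib.sha256((SALT + user_id).encode("utf-8")).hexdigest()

def enrich_event(event: dict) -> dict:
    """Anonymise the user ID and tag the event with a derived field."""
    enriched = dict(event)
    enriched["user_id"] = anonymise_user_id(event["user_id"])
    # Example enrichment: flag playback errors so the rules engine
    # downstream can decide whether to trigger a notification.
    enriched["is_error"] = event.get("status", "").startswith("ERROR")
    return enriched

event = {"user_id": "alice", "status": "ERROR_TIMEOUT"}
print(enrich_event(event)["is_error"])  # True
```

In the actual notebooks this logic would run inside a streaming pipeline writing to Delta tables; the salted one-way hash keeps the ID joinable across events while preventing recovery of the original value.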
For an easy import into Databricks, an archive (.dbc) containing all the notebooks is provided in the Notebooks folder.
As a minimum set of requirements to deploy the platform, you must have access to an AWS account with a Databricks workspace, and Docker installed on your local environment to build the code.
## Deployment

1. Clone the project and configure the Makefile and the CloudFormation template:
   - set the `bucket` variable to the S3 bucket name prefix which will be created within a deployment region
   - set the `regions` variable to one or more AWS regions you want the code artifacts to be copied to for CloudFormation deployment
   - set `stack_name` to the stack name to use for the deployment
   - set `profile` to the AWS CLI profile which has the permissions necessary to deploy and create all the required resources
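As an illustration, the variables described above might look like this; all values are placeholders for your own account, and the exact variable syntax in the project's Makefile may differ:

```makefile
# Placeholder values - replace with your own settings
bucket     = my-qos-artifacts        # S3 bucket name prefix created per region
regions    = us-east-1 us-west-2     # regions to copy the code artifacts to
stack_name = qos-solution            # CloudFormation stack name
profile    = default                 # AWS CLI profile with deployment permissions
```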
2. Build and upload the code to the source bucket in S3 with `make all`. Once the build is completed, you can use the URL of your CloudFormation script in the next step.
3. Deploy the CloudFormation script using either the `make deploy` command or the CloudFormation UI. At the end of the deployment you can find all the created resources in the Resources tab.
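If you prefer the AWS CLI over the `make deploy` target, an equivalent invocation could look like the following; the template path, stack name, and profile are illustrative placeholders, not values from this project:

```shell
aws cloudformation deploy \
  --template-file cloudformation/template.yaml \
  --stack-name qos-solution \
  --capabilities CAPABILITY_NAMED_IAM \
  --profile default
```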
4. As a best practice, launch Databricks clusters with instance profiles, which allow you to access your data from Databricks clusters without embedding your AWS keys in notebooks:
   - update the Databricks cross-account role with the newly created IAM role, which you can find in the resource list mentioned above (the IAM role name includes "Databricks")
   - add the instance profile to the Databricks workspace
   - start the cluster with the instance profile attached
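Assuming you use the Databricks CLI, registering the instance profile with the workspace can be done along these lines; the ARN below is a placeholder, not a resource created by this project:

```shell
databricks instance-profiles add \
  --instance-profile-arn arn:aws:iam::123456789012:instance-profile/qos-databricks-role
```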
5. Import the Databricks archive (QoS Notebooks) into your environment and update the config notebook with the resources created by the CloudFormation deployment.
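Updating the config notebook typically amounts to pointing a few variables at the deployed resources. The snippet below is a hypothetical sketch; every key and value is a placeholder, and the real setting names are defined in the provided config notebook:

```python
# Hypothetical configuration cell - replace every value with the resource
# names listed in the CloudFormation Resources tab after deployment.
config = {
    "s3_bucket": "my-qos-artifacts-us-east-1",  # source bucket created by the stack
    "kinesis_stream": "qos-playback-events",    # ingestion stream (placeholder name)
    "region": "us-east-1",                      # deployment region
}

# Downstream notebooks can then read their settings from this dict.
print(config["region"])  # us-east-1
```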
6. You are ready to go! Enjoy the QoS Solution!
This sample code is made available under the MIT-0 license. See the LICENSE file.