mlops-multi-account-cdk template: batch transform option #61
base: main
Conversation
id=f"{prj_name}-source_scripts", | ||
destination_bucket=i_bucket, | ||
destination_key_prefix=f"{self.pipeline_name}/{self.timestamp}/source-scripts", | ||
sources=[s3_deployment.Source.asset(path=f"{BASE_DIR}/source_scripts")], |
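For context, these arguments belong to an `aws_s3_deployment.BucketDeployment` construct; filled out, the call looks roughly like the sketch below (the surrounding stack class and variables such as `prj_name`, `i_bucket`, and `BASE_DIR` come from the repo and are assumed here):

```python
# Sketch of the full construct this hunk belongs to; the surrounding
# stack code is assumed, not shown in the diff.
from aws_cdk import aws_s3_deployment as s3_deployment

s3_deployment.BucketDeployment(
    self,
    id=f"{prj_name}-source_scripts",
    destination_bucket=i_bucket,
    destination_key_prefix=f"{self.pipeline_name}/{self.timestamp}/source-scripts",
    sources=[s3_deployment.Source.asset(path=f"{BASE_DIR}/source_scripts")],
)
```

This packages the local `source_scripts` directory as an asset and copies it into the destination bucket under a timestamped prefix on each deploy.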
As per our discussion, the source scripts seem to be missing in this deploy repository. I'll test as soon as you add them :)
Source scripts added
A couple of issues in the inference pipeline:
- the `get_approved_package_desc` function is not implemented in `get_approved_package.py`, so the import in `batch_inference_pipeline.py` fails
- the data quality step is added to the inference pipeline, but the baseline calculation is missing
'''
from deploy_endpoint.get_approved_package import get_approved_package_desc
spec = get_approved_package_desc()
The `get_approved_package_desc()` function is not implemented in the `get_approved_package.py` file, so this import fails.
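For reference, a minimal sketch of what such a helper could look like, assuming it receives the model package group name and uses boto3 (the function name matches the import above; everything else is illustrative):

```python
# Hypothetical sketch of the missing helper; not the PR author's code.
import boto3


def get_approved_package_desc(model_package_group_name: str) -> dict:
    """Describe the latest approved model package in the given group."""
    sm_client = boto3.client("sagemaker")

    # Fetch the most recently created approved package in the group.
    response = sm_client.list_model_packages(
        ModelPackageGroupName=model_package_group_name,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    packages = response["ModelPackageSummaryList"]
    if not packages:
        raise ValueError(
            f"No approved model package in group {model_package_group_name}"
        )

    # The full description carries the inference spec and model data URL.
    return sm_client.describe_model_package(
        ModelPackageName=packages[0]["ModelPackageArn"]
    )
```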
'''

step_process = self.get_process_step()
step_data_quality = self.get_data_quality_step(step_process)
The data quality baseline calculation step is not in the training pipeline, so the data quality check in the inference pipeline fails. Please add the data quality baseline calculation to the repo
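As an illustration only, the training pipeline could register a baseline with a SageMaker Pipelines `QualityCheckStep` roughly like this (`role`, `pipeline_session`, `train_data_uri`, `baseline_output_uri`, and `model_package_group_name` are placeholders, not identifiers from this repo):

```python
# Hypothetical sketch of a data quality baseline step for the training
# pipeline; all variable names are placeholders.
from sagemaker.model_monitor.dataset_format import DatasetFormat
from sagemaker.workflow.check_job_config import CheckJobConfig
from sagemaker.workflow.quality_check_step import (
    DataQualityCheckConfig,
    QualityCheckStep,
)

check_job_config = CheckJobConfig(
    role=role,  # execution role for the check processing job
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=pipeline_session,
)

data_quality_check_config = DataQualityCheckConfig(
    baseline_dataset=train_data_uri,    # training data to compute the baseline on
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_output_uri,  # where statistics/constraints are written
)

# skip_check=True with register_new_baseline=True computes and registers a
# fresh baseline instead of validating against an existing one, so the
# inference pipeline's data quality check has something to compare against.
step_data_quality_baseline = QualityCheckStep(
    name="DataQualityBaseline",
    skip_check=True,
    register_new_baseline=True,
    quality_check_config=data_quality_check_config,
    check_job_config=check_job_config,
    model_package_group_name=model_package_group_name,
)
```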
logger = logging.getLogger(__file__.split('/')[-1])
logger.setLevel(getenv("LOGLEVEL", "INFO"))

def upload_assets_to_s3(account_id):
Hey Maria, do we actually need that part?
Can we just keep the pipeline definition and code in the dev account and point to that file in `deploy_endpoint_stack`?
Copying the pipeline definitions and scripts could be avoided as long as the role for preprod and prod has access to the S3 bucket in the dev account.
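If we went that route, the cross-account read could be a small bucket-policy addition on the dev bucket; a CDK sketch along these lines (the account ID variables are placeholders):

```python
# Hypothetical CDK sketch: let the preprod/prod accounts read the dev
# bucket directly instead of copying assets; account IDs are placeholders.
from aws_cdk import aws_iam as iam

for account_id in [PREPROD_ACCOUNT_ID, PROD_ACCOUNT_ID]:
    i_bucket.add_to_resource_policy(
        iam.PolicyStatement(
            actions=["s3:GetObject", "s3:ListBucket"],
            resources=[i_bucket.bucket_arn, i_bucket.arn_for_objects("*")],
            principals=[iam.AccountPrincipal(account_id)],
        )
    )
```

If the bucket is KMS-encrypted, the key policy would need a matching grant for those accounts as well.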
Issue #, if available:
Description of changes:
This PR adds the capability to create a Batch Transform SageMaker Pipeline as an alternative to deployment with a SageMaker real-time Endpoint.
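At its core, the batch option replaces the endpoint deployment with a transform step in the pipeline, roughly like the sketch below (variable names are placeholders, not code from this PR):

```python
# Hypothetical sketch of the batch transform step at the heart of such a
# pipeline; model_name, the URIs, and the session are placeholders.
from sagemaker.inputs import TransformInput
from sagemaker.transformer import Transformer
from sagemaker.workflow.steps import TransformStep

transformer = Transformer(
    model_name=model_name,          # model created from the approved package
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path=batch_output_uri,   # S3 prefix for the predictions
    sagemaker_session=pipeline_session,
)

step_transform = TransformStep(
    name="BatchTransform",
    transformer=transformer,
    inputs=TransformInput(data=batch_input_uri, content_type="text/csv"),
)
```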
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.