Conversation
35c2c4a to
c87b8ef
Compare
3d49bd0 to
5a6bedc
Compare
2a9b3c2 to
4729bdc
Compare
3a365b2 to
e26288b
Compare
370b602 to
baff263
Compare
* Reformatted advice section to remove brackets * Removed content improvement and design best practice advice - can be reenabled by using system_prompt_extended.txt * Switched to using British English spellings
There was a problem hiding this comment.
Pull request overview
This PR updates the NotifAI document-classification pipeline to improve performance and accuracy by switching to a multimodal Bedrock flow (sending documents directly to the LLM), updating prompts/models, refreshing the evaluation/alerting pipeline, and adding local Docker tooling.
Changes:
- Replace text extraction with multimodal document input (frontend + Bedrock prompt service).
- Rework automated evaluation runner to run direct
conversecalls and adjust alert scoring to match the new output format. - Add/adjust local Docker + dependency hardening (nginx health endpoint, npm overrides, etc.), and update infra to Nova-Lite + scheduling changes.
Reviewed changes
Copilot reviewed 41 out of 45 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/prompt-input-generator/prompt-input-generator.py | Generates JSONL prompts from S3 documents for the new multimodal evaluation format. |
| tools/prompt-input-generator/env.example | Updates generator env vars to S3-based inputs and Nova-Lite defaults. |
| src/frontend/notifai-uploader/src/components/FileUpload.js | Switches upload flow to DataURL (base64) for multimodal input instead of /convert. |
| src/frontend/notifai-uploader/src/components/AIfeedback.js | Tweaks markdown/text cleanup for LLM feedback rendering. |
| src/frontend/notifai-uploader/src/api/BackendAPIClient.js | Adjusts baseURL construction for backend API config. |
| src/frontend/notifai-uploader/src/Pages/FileUploadPage.js | Removes page-count handling and wires new upload payload through to the LLM call. |
| src/frontend/notifai-uploader/package.json | Adds dependency overrides for security/vulnerability mitigation. |
| src/frontend/notifai-uploader/package-lock.json | Locks updated dependency graph consistent with overrides. |
| src/frontend/notifai-uploader/nginx.conf | Adds a /health endpoint for container health checks. |
| src/frontend/notifai-uploader/Dockerfile | Updates nginx base and config placement; adds apk upgrade. |
| src/frontend/notifai-uploader/.dockerignore | Introduces docker ignore rules for frontend build context. |
| src/backend/bedrock_evaluations_runner/tests/test_bedrock_evaluation_service.py | Updates evaluation runner tests for the new programmatic evaluation path. |
| src/backend/bedrock_evaluations_runner/tests/evaluation_example.json | Adds a sample JSONL evaluation record using multimodal document input. |
| src/backend/bedrock_evaluations_runner/constants.py | Centralizes tool schema constants and common error messages. |
| src/backend/bedrock_evaluations_runner/bedrock_evaluation_service.py | Replaces Bedrock “evaluation job” API usage with direct converse execution over JSONL prompts. |
| src/backend/bedrock_evaluations_runner/bedrock_evaluation_lambda.py | Updates Lambda handler to use new evaluation runner signature and flow. |
| src/backend/bedrock_evaluations_runner/.env-example | Updates default model identifiers and region example values. |
| src/backend/bedrock_alerts/tests/test_evaluations_alert_service.py | Updates alert scoring tests to match new result record format. |
| src/backend/bedrock_alerts/evaluations_alert_service.py | Changes scoring logic from Bedrock eval-job metric output to rating vs actualClass. |
| src/backend/bedrock_alerts/evaluations_alert_lambda.py | Adds logging and adjusts control flow for alert triggering. |
| src/backend/bedrock-prompt-messager/tests/services/test_bedrock_service.py | Extends tests to cover multimodal document inputs and mime handling. |
| src/backend/bedrock-prompt-messager/system_prompt.txt | Updates the system prompt for improved NHS-specific classification and effectiveness advice. |
| src/backend/bedrock-prompt-messager/services/bedrock_service.py | Implements Data URL parsing and sends document bytes via Bedrock multimodal document content. |
| src/backend/bedrock-prompt-messager/api_wrapper.py | Adds a Flask wrapper for local HTTP testing of the Lambda handler. |
| src/backend/bedrock-prompt-messager/Dockerfile | Adds a containerized local runner exposing the Flask wrapper. |
| src/backend/app/tests/services/test_convert_service.py | Removes tests for the deprecated /convert text-extraction pipeline. |
| src/backend/app/tests/routers/test_convert_router.py | Removes /convert router tests. |
| src/backend/app/services/convert_service.py | Removes the convert service (pandoc/pdf extraction) implementation. |
| src/backend/app/routers/convert.py | Removes the /convert endpoint router. |
| src/backend/app/requirements.txt | Drops pandoc/pypdf and updates boto3 dependency flavor. |
| src/backend/app/main.py | Expands allowed CORS origins for local dev and adds /health. |
| src/backend/app/core/auth.py | Exempts /health from authentication middleware. |
| src/backend/app/Dockerfile | Updates FastAPI CLI invocation for container startup. |
| scripts/config/sonar-scanner.properties | Adjusts sonar exclusions/coverage exclusions for new local wrapper artifacts. |
| infrastructure/terraform/etc/group_nhs-notify-poc001.tfvars | Updates log retention setting. |
| infrastructure/terraform/etc/env_eu-west-2_test.tfvars | Switches prompt/eval models to Nova-Lite. |
| infrastructure/terraform/etc/env_eu-west-2_dev2.tfvars | Switches prompt/eval models to Nova-Lite. |
| infrastructure/terraform/etc/env_eu-west-2_dev1.tfvars | Switches prompt/eval models to Nova-Lite. |
| infrastructure/terraform/components/notifyai/resources/prompt-data/letter-responses.json | Removes legacy ground-truth response mapping used by the old pipeline. |
| infrastructure/terraform/components/notifyai/lambda.tf | Updates Bedrock model permissions toward Nova-Lite. |
| infrastructure/terraform/components/notifyai/eventbridge.tf | Alters scheduled evaluation behavior via an explicit scheduler end date. |
| infrastructure/terraform/components/notifyai/bedrock.tf | Updates Bedrock access policy resources to Nova-Lite inference profile. |
| docker-compose.yaml | Adds a local dev stack for frontend, backend, and bedrock-prompt-messager. |
| .gitignore | Ignores additional build artifacts/caches. |
Files not reviewed (1)
- src/frontend/notifai-uploader/package-lock.json: Language not supported
Comments suppressed due to low confidence (1)
src/backend/bedrock_evaluations_runner/tests/test_bedrock_evaluation_service.py:78
run_evaluation_jobno longer accepts anevaluator_modelargument, but this test still passesevaluator_model="eval-model", which will raiseTypeErrorand won’t exercise the intended failure path. Update the test to call the new signature and to trigger failure via the new code path (e.g., mockget_object/converse/put_objectto raise).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const formData = new FormData(); | ||
| formData.append("file", file); | ||
|
|
There was a problem hiding this comment.
formData is created and file is appended, but the data is never sent anywhere now that the /convert request was removed. This is dead code and (together with the now-unused API client) can trigger lint failures; please remove it or reintroduce the upload call.
| def __init__(self, region: str, resource_prefix: str): | ||
| if not all([region]): | ||
| raise ValueError("Region, Role ARN and resource prefix must be provided.") | ||
|
|
There was a problem hiding this comment.
The constructor now only validates region, but the error message still mentions “Role ARN and resource prefix” and resource_prefix can be None without failing fast. Please update the validation to check resource_prefix too and correct the error message to match the required args.
| output = io.BytesIO() | ||
| jsonlines.Writer(output).write_all(responses) | ||
| s3_out_uri = up.urlparse(output_s3_uri) | ||
|
|
||
| return self.s3_client.put_object( | ||
| Bucket=s3_out_uri.hostname, | ||
| Key=s3_out_uri.path[1:], | ||
| Body=output.getvalue(), | ||
| ) |
There was a problem hiding this comment.
put_object uses Key=s3_out_uri.path[1:], so if output_s3_uri is a prefix like s3://bucket/results/ (as in the env example), this will write to the key results/ (no filename/extension) and will overwrite the same object every run. It also won’t match the alerts job which searches for keys ending with _output.jsonl. Consider appending a generated filename (e.g., ${job_name}_output.jsonl) under the provided prefix.
| output = io.BytesIO() | ||
| jsonlines.Writer(output).write_all(responses) | ||
| s3_out_uri = up.urlparse(output_s3_uri) |
There was a problem hiding this comment.
jsonlines.Writer(output) is given an io.BytesIO(). jsonlines.Writer typically expects a text stream (e.g., io.StringIO() or a TextIOWrapper), which can raise TypeError or write incorrect data. Consider using io.StringIO() (then .getvalue().encode('utf-8')) or wrapping BytesIO with io.TextIOWrapper.
| with patch.object(bedrock_service, "log_prompt_details_to_s3") as mock_log: | ||
| result = bedrock_service.call_admail_bedrock_prompt( | ||
| f"data:fake/type;base64,{base64.b64encode(b"test letter")}", "test_file.json" | ||
| ) | ||
| assert result["statusCode"] == 400 |
There was a problem hiding this comment.
Same issue here: the f-string has nested double-quotes and will not parse, and the base64 payload should be decoded to a text string when constructing a data URL. Please fix this test input construction (e.g., use single quotes and .decode('utf-8')).
| @@ -12,4 +10,4 @@ COPY . . | |||
|
|
|||
| EXPOSE 8080 | |||
There was a problem hiding this comment.
The container runs fastapi run /app/main.py, but the code imports using from app... (expecting app to be a package on PYTHONPATH, as configured in pytest.ini). In this image, /app is the working directory, so app typically won’t be importable unless PYTHONPATH includes the parent directory. Consider switching to module syntax (fastapi run app.main:app) and/or setting ENV PYTHONPATH=/ (or restructuring the container paths) so imports resolve consistently.
| EXPOSE 8080 | |
| EXPOSE 8080 | |
| ENV PYTHONPATH=/ |
| @@ -67,7 +67,7 @@ data "aws_iam_policy_document" "bedrock_access" { | |||
| "${aws_s3_bucket.lambda_prompt_logging_s3_bucket.arn}/${local.s3_lambda_logging_key}*", | |||
| "arn:aws:bedrock:${var.region}:${var.aws_account_id}:guardrail/*", | |||
| "arn:aws:bedrock:${var.region}:${var.aws_account_id}:inference-profile/eu.amazon.nova-pro-v1:*", | |||
There was a problem hiding this comment.
IAM policy still grants access to the Nova Pro inference profile (...inference-profile/eu.amazon.nova-pro-v1:*) while the PR switches the deployed model to Nova Lite. If the runtime uses inference profiles, this mismatch can cause AccessDenied when invoking Nova Lite. Consider updating/removing the Nova Pro inference-profile ARN and adding the corresponding Nova Lite inference-profile ARN.
| "arn:aws:bedrock:${var.region}:${var.aws_account_id}:inference-profile/eu.amazon.nova-pro-v1:*", | |
| "arn:aws:bedrock:${var.region}:${var.aws_account_id}:inference-profile/eu.amazon.nova-lite-v1:*", |
| role_arn = aws_iam_role.eventbridge_scheduler_role.arn | ||
|
|
||
| end_date = "2025-12-16T00:00:00Z" | ||
| } |
There was a problem hiding this comment.
Hard-coding end_date = "2025-12-16T00:00:00Z" permanently stops the scheduled evaluation after that date (and it’s already in the past as of now), regardless of evaluation-schedule-days. If the intent is to disable scheduling, it’s clearer/safer to disable the schedule explicitly (e.g., via a variable/flag) or remove the schedule block, rather than relying on a fixed timestamp.
| documents = [ | ||
| obj['Key'] for obj in S3.list_objects_v2(Bucket=INPUT_BUCKET)['Contents'] | ||
| if obj['Key'].endswith(("docx","csv","html","txt","pdf","md"," oc","xlsx","xls")) |
There was a problem hiding this comment.
list_objects_v2(...)["Contents"] will raise KeyError when the bucket/prefix is empty (AWS omits Contents). Consider using .get("Contents", []) (and optionally handling pagination via IsTruncated/ContinuationToken) so the script behaves predictably for empty/large buckets.
| entrypoint: | ||
| [ | ||
| "fastapi", | ||
| "dev", | ||
| "/app/main.py", | ||
| "--port", | ||
| "8080", | ||
| "--host", | ||
| "0.0.0.0", | ||
| "--reload", | ||
| ] |
There was a problem hiding this comment.
The backend is started with fastapi dev /app/main.py, but main.py imports modules as from app.... In this container layout, that typically requires PYTHONPATH to include the parent directory of the app package (mirroring pytest.ini: pythonpath = src/backend). Without setting PYTHONPATH (or switching to module-style startup like fastapi dev app.main:app), the backend may fail to start with ModuleNotFoundError: No module named 'app'.
Description
Improve performance of the model pipeline to more accurately classify documents as admail/business.
Context
The current model and prompt are generating false-negatives for letters that should be classified as admail. Misclassifying letters that are appropriate for admail as business incurs unnecessary postage costs.
Type of changes
Checklist
Sensitive Information Declaration
To ensure the utmost confidentiality and protect your and others privacy, we kindly ask you to NOT including PII (Personal Identifiable Information) / PID (Personal Identifiable Data) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PR that do contain any sensitive information. We really appreciate your cooperation in this matter.