Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion docs/pattern-2.md
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@ Each step includes comprehensive retry logic for handling transient errors:
```

#### Extraction Function
- **Purpose**: Extracts fields using Claude via Bedrock
- **Purpose**: Extracts fields using Claude via Bedrock.
- **Key Features**:
- Document class-specific attribute extraction
- Configurable extraction attributes
Expand Down Expand Up @@ -225,6 +225,9 @@ The pattern exports these outputs to the parent stack:

**Stack Deployment Parameters:**
- `ClassificationMethod`: Classification methodology to use (options: 'multimodalPageLevelClassification' or 'textbasedHolisticClassification')
- **OCR**: Control the ability to tune ocr via configuration file `ocr.tuning` property.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I am confused now by the scope of this PR, including the introduction of this 'tuning' concept. I had thought this would be a simple PR, a quick win, to introduce only an 'enable' property for classification and extraction, with the simple logic in the associated lambdas to skip the associated processing if enabled is False. (Like we have already for Summarization).
And so I am confused by why this would require introduction of new stack deployment parameters and these .tuning properties.

- **Classification**: Control the ability to tune classification via configuration file `classification.tuning` property.
- **Extraction**: Control the ability to tune extraction via configuration file `extraction.tuning` property.
- **Summarization**: Control summarization via configuration file `summarization.enabled` property (replaces `IsSummarizationEnabled` parameter)
- `ConfigurationDefaultS3Uri`: Optional S3 URI to custom configuration (uses default configuration if not specified)
- `MaxConcurrentWorkflows`: Workflow concurrency limit
Expand Down
5 changes: 4 additions & 1 deletion docs/pattern-3.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,7 @@ Each step includes comprehensive retry logic for handling transient errors:
```

#### Classification Function
- **Purpose**: Classifies pages using UDOP model on SageMaker, and segments into sections using class boundaries
- **Purpose**: Classifies pages using UDOP model on SageMaker, and segments into sections using class boundaries.
- **Input**: Output from OCR function plus output bucket
- **Output**:
```json
Expand Down Expand Up @@ -209,6 +209,9 @@ The pattern exports these outputs to the parent stack:

**Stack Deployment Parameters:**
- `UDOPModelArtifactPath`: S3 path to UDOP model artifacts (see [Fine tuning a UDOP model](#fine-tuning-a-udop-model-for-classification))
- **OCR**: Control the ability to tune ocr via configuration file `ocr.tuning` property.
- **Classification**: Control the ability to tune classification via configuration file `classification.tuning` property.
- **Extraction**: Control the ability to tune extraction via configuration file `extraction.tuning` property.
- **Summarization**: Control summarization via configuration file `summarization.enabled` property (replaces `IsSummarizationEnabled` parameter)
- `ConfigurationDefaultS3Uri`: Optional S3 URI to custom configuration (uses default configuration if not specified)
- `MaxConcurrentWorkflows`: Workflow concurrency limit
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2018,4 +2018,4 @@ def _convert_error_list_to_string(self, errors) -> str:
return f"{first_errors} and {len(errors) - 1} more errors"
else:
# Few errors - show all
return "; ".join(errors)
return "; ".join(errors)
Loading