{Cognitive Services} az cognitiveservices agent: ABAC-enabled ACR registry support#32863
{Cognitive Services} az cognitiveservices agent: ABAC-enabled ACR registry support#32863eamonoreilly wants to merge 1 commit intoAzure:devfrom
Conversation
Fix two bugs when using ABAC-enabled (repository-scoped permissions) ACR registries with hosted agent commands. Bug 1: ACR access check ignores ABAC conditions _check_project_acr_access() treated conditioned role assignments the same as unconditional ones, causing false negatives when the managed identity had a valid ABAC-scoped role. Now detects ABAC mode on the registry: - If a role assignment has a condition and ABAC is enabled, log a warning (cannot evaluate condition strings client-side) and allow the operation to proceed (warn-but-don't-block). - Return abac_enabled as a 4th tuple element so the caller can provide ABAC-specific remediation guidance without a redundant registry fetch. - agent_create() error messages now include the correct repo-scoped az role assignment create command with --condition when ABAC is active. Bug 2: Remote build fails on ABAC-enabled registries _build_image_remotely() called schedule_run with no credentials, which causes ACR to reject push on ABAC registries with "at least one credential is required". For ABAC-enabled registries, one-off build runs require SourceRegistryCredentials(identity='[caller]') so the task authenticates as the signed-in CLI user, matching az acr build --source-acr-auth-id [caller]. Non-ABAC registries continue to pass credentials=None. Additional improvements - Extract _ACR_PULL_ROLES as a module-level set constant (previously rebuilt as a local list on every call). - Remove unused image_repo parameter from _check_project_acr_access.
️✔️AzureCLI-FullTest
|
️✔️AzureCLI-BreakingChangeTest
|
|
Thank you for your contribution! We will review the pull request and get back to you soon. |
|
The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR. Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions). pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>
|
There was a problem hiding this comment.
Pull request overview
This PR fixes two bugs related to ABAC-enabled (Attribute-Based Access Control with repository-scoped permissions) Azure Container Registry support in the cognitive services agent commands. The first bug incorrectly blocked users when role assignments had ABAC conditions, treating them as invalid. The fix adopts a warn-but-don't-block approach since ABAC condition strings cannot be reliably evaluated client-side. The second bug caused remote image builds to fail on ABAC registries due to missing credentials; the fix passes SourceRegistryCredentials(identity='[caller]') to authenticate as the CLI user, matching az acr build --source-acr-auth-id [caller] behavior.
Changes:
- Extract
_ACR_PULL_ROLESas a module-level constant for reusability - Add ABAC detection and credential passing in
_build_image_remotely()for remote builds - Update
_check_project_acr_access()to detect ABAC mode and warn (not block) on conditioned assignments - Add
_extract_repository_name_for_acr()helper to generate ABAC-specific guidance - Enhance error messages in
agent_create()with ABAC-specific remediation commands
Comments suppressed due to low confidence (1)
src/azure-cli/azure/cli/command_modules/cognitiveservices/custom.py:886
- The class
PlatformPropertiesis imported twice from the same module: once at line 861 (inside the has_dockerfile block) and once at line 885 (inside the else block). Consider moving this import to the top-level function imports at line 797-801 to avoid duplication, or accept it as a minor trade-off for keeping the conditional imports isolated.
from azure.mgmt.containerregistry.models import (
DockerBuildRequest, PlatformProperties
)
docker_build_request = DockerBuildRequest(
image_names=[image_without_registry],
is_push_enabled=True,
source_location=source_location,
platform=PlatformProperties(os='Linux', architecture='amd64'),
docker_file_path=dockerfile_name,
timeout=3600,
credentials=build_credentials,
)
queued = client_registries.schedule_run(
resource_group_name=resource_group_name,
registry_name=registry_name,
run_request=docker_build_request)
else:
# Buildpacks for auto-detection
logger.warning("No Dockerfile - using Cloud Native Buildpacks (auto-detect)")
logger.warning("Buildpacks will detect Python, Node.js, .NET, etc.")
logger.warning("Queueing build task...")
from azure.mgmt.containerregistry.models import (
EncodedTaskRunRequest, PlatformProperties
)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def _extract_repository_name_for_acr(image, source, agent_name, registry): | ||
| """Extract ACR repository path (without tag/digest) for ABAC guidance.""" | ||
| if source: | ||
| return agent_name | ||
|
|
||
| if not image: | ||
| return None | ||
|
|
||
| if '.azurecr.io/' in image: | ||
| repository = image.split('.azurecr.io/', 1)[1] | ||
| elif registry: | ||
| repository = image | ||
| else: | ||
| return None | ||
|
|
||
| repository = repository.split('@', 1)[0] | ||
| repository = repository.split(':', 1)[0] | ||
| return repository or None | ||
|
|
There was a problem hiding this comment.
The new helper function _extract_repository_name_for_acr lacks unit tests. The test file shows a pattern of unit testing for similar helper functions (e.g., test_has_dockerfile_*, test_is_fully_qualified_image). Consider adding unit tests covering edge cases: source with agent_name, image with .azurecr.io/, image with registry param, digest format (@sha256:...), tag format (:v1), and combinations thereof.
| @@ -1796,45 +1863,45 @@ def _check_project_acr_access(cmd, client, account_name, project_name, registry_ | |||
| # However, this is significantly more complex and slower. The current approach | |||
| # follows the pattern used by AKS (see acs/_roleassignments.py) and covers | |||
| # the most common scenarios. Users with custom roles can use --skip-acr-check. | |||
| # | |||
| # Acceptable roles include: | |||
| # Standard ACR roles: | |||
| # - AcrPull: Can pull images | |||
| # - AcrPush: Can pull and push images | |||
| # Repository-scoped roles: | |||
| # - Container Registry Repository Reader: Read access (includes pull) | |||
| # - Container Registry Repository Writer: Read/write access (includes pull) | |||
| # - Container Registry Repository Contributor: Full repository access (includes pull) | |||
| # General Azure roles: | |||
| # - Reader: Can view resources (includes pull) | |||
| # - Contributor, Owner: Full access | |||
| acceptable_roles = [ | |||
| 'AcrPull', | |||
| 'AcrPush', | |||
| 'Container Registry Repository Reader', | |||
| 'Container Registry Repository Writer', | |||
| 'Container Registry Repository Contributor', | |||
| 'Reader', | |||
| 'Contributor', | |||
| 'Owner' | |||
| ] | |||
|
|
|||
| # Get role assignments for the principal on the ACR | |||
| assignments = list_role_assignments(cmd, assignee=principal_id, scope=acr_resource_id) | |||
|
|
|||
| # Check if any assignment has acceptable role | |||
| # Check if any assignment has an acceptable role (see _ACR_PULL_ROLES), | |||
| # accounting for ABAC conditions | |||
| for assignment in assignments: | |||
| role_name = assignment.get('roleDefinitionName', '') | |||
| if role_name in acceptable_roles: | |||
| if role_name in _ACR_PULL_ROLES: | |||
| condition = assignment.get('condition', None) | |||
| if condition and abac_enabled: | |||
| # ABAC is enabled and this assignment has a condition. | |||
| # We cannot reliably evaluate ABAC condition strings client-side, | |||
| # so we treat the assignment as valid and warn the user. | |||
| logger.warning( | |||
| "Found '%s' role on ACR '%s' with an ABAC condition. " | |||
| "Cannot verify whether the condition grants access to the " | |||
| "target repository. If the deployment fails with a permission " | |||
| "error, verify the ABAC condition covers the required repository.", | |||
| role_name, registry_name) | |||
| return (True, principal_id, None, abac_enabled) | |||
| # No condition = full scope access (or ABAC not enabled) | |||
| logger.info( | |||
| "Found %s role for project identity on ACR %s", | |||
| role_name, registry_name) | |||
| return (True, principal_id, None) | |||
| return (True, principal_id, None, abac_enabled) | |||
|
|
|||
| # No suitable role found | |||
| if abac_enabled: | |||
| return ( | |||
| False, principal_id, | |||
| f"Project managed identity does not have any recognized pull role on " | |||
| f"ABAC-enabled registry '{registry_name}'", | |||
| abac_enabled | |||
| ) | |||
| return ( | |||
| False, principal_id, | |||
| f"Project managed identity does not have AcrPull access to '{registry_name}'" | |||
| f"Project managed identity does not have AcrPull access to '{registry_name}'", | |||
| abac_enabled | |||
| ) | |||
There was a problem hiding this comment.
The ABAC handling logic in _check_project_acr_access lacks test coverage. Given the complexity of the warn-but-don't-block pattern for conditioned assignments (lines 1876-1886) and the ABAC-specific error messages (lines 1894-1900), consider adding unit tests or integration tests to verify: (1) correct handling when ABAC is enabled with conditioned assignments, (2) correct handling when ABAC is disabled but conditions exist, and (3) correct error messages for ABAC vs non-ABAC registries. Mock the ACR registry fetch and role assignments to test different scenarios.
| condition = assignment.get('condition', None) | ||
| if condition and abac_enabled: | ||
| # ABAC is enabled and this assignment has a condition. | ||
| # We cannot reliably evaluate ABAC condition strings client-side, | ||
| # so we treat the assignment as valid and warn the user. | ||
| logger.warning( | ||
| "Found '%s' role on ACR '%s' with an ABAC condition. " | ||
| "Cannot verify whether the condition grants access to the " | ||
| "target repository. If the deployment fails with a permission " | ||
| "error, verify the ABAC condition covers the required repository.", | ||
| role_name, registry_name) | ||
| return (True, principal_id, None, abac_enabled) | ||
| # No condition = full scope access (or ABAC not enabled) | ||
| logger.info( | ||
| "Found %s role for project identity on ACR %s", | ||
| role_name, registry_name) | ||
| return (True, principal_id, None) | ||
| return (True, principal_id, None, abac_enabled) |
There was a problem hiding this comment.
The logic for handling conditioned role assignments needs refinement. If a role assignment has a condition but abac_enabled is False (either because ABAC detection failed or the registry doesn't have ABAC enabled), the code falls through to line 1888-1891 and returns True, treating the conditioned assignment as fully valid. This could incorrectly accept conditioned assignments that may not grant the required access. Consider adding an else-if branch: if condition exists but abac_enabled is False, either log a warning or treat it as a failure, since conditioned assignments typically indicate repository-scoped permissions that should be validated.
| if acr_abac_enabled and image_repo: | ||
| error_message = ( | ||
| f"{error_msg}\n\n" | ||
| f"This registry has ABAC (repository-level permissions) enabled.\n" | ||
| f"Grant repository-scoped access to the project's managed identity:\n\n" | ||
| f" az role assignment create --assignee {principal_id} " | ||
| f"--role \"Container Registry Repository Reader\" " | ||
| f"--scope {acr_scope} " | ||
| f"--condition \"@Resource[Microsoft.ContainerRegistry/registries/" | ||
| f"repositories] StringEquals '{image_repo}'\" " | ||
| f"--condition-version \"2.0\"\n\n" | ||
| f"Or grant broad access (bypasses ABAC scoping):\n\n" | ||
| f" az role assignment create --assignee {principal_id} " | ||
| f"--role AcrPull " | ||
| f"--scope {acr_scope}\n\n" | ||
| f"To skip this check (not recommended), use: --skip-acr-check" | ||
| ) | ||
| else: | ||
| error_message = ( | ||
| f"{error_msg}\n\n" | ||
| f"AI Foundry needs permission to pull the container image from ACR.\n" | ||
| f"Grant AcrPull role to the project's managed identity:\n\n" | ||
| f" az role assignment create --assignee {principal_id} " | ||
| f"--role AcrPull " | ||
| f"--scope {acr_scope}\n\n" | ||
| f"Or use Azure Portal:\n" | ||
| f" 1. Open ACR '{registry_name}' → Access Control (IAM)\n" | ||
| f" 2. Add role assignment → AcrPull\n" | ||
| f" 3. Assign access to: Managed Identity\n" | ||
| f" 4. Select the project's managed identity\n\n" | ||
| f"To skip this check (not recommended), use: --skip-acr-check" | ||
| ) |
There was a problem hiding this comment.
When ABAC is enabled but image_repo is None (line 2070 condition fails), the error message falls through to the else block (line 2087-2101) and provides generic AcrPull guidance that doesn't mention ABAC. This could mislead users with ABAC-enabled registries. Consider adding a third case: if acr_abac_enabled is True but image_repo is None, provide an ABAC-aware message without the repository-specific condition command, or explain why the repository name couldn't be determined.
Fix two bugs when using ABAC-enabled (repository-scoped permissions) ACR registries with hosted agent commands:
_check_project_acr_access()treating conditioned role assignments as invalid; now warns instead of blocking when ABAC conditions cannot be evaluated client-side_build_image_remotely()failing to push images to ABAC-enabled registries due to missing credentials inschedule_run_ACR_PULL_ROLESas a module-level constant and eliminate redundant ACR registry fetchRelated command
az cognitiveservices agent createDescription
Two bugs affect users with ABAC-enabled (repository-scoped permissions) ACR registries:
ACR access validation incorrectly blocks ABAC users: When a project's managed identity has a role assignment with an ABAC condition (e.g. scoped to a specific repository),
_check_project_acr_access()treated the condition as evidence of insufficient access and blocked the operation. Since ABAC condition strings cannot be reliably evaluated client-side, the fix adopts a warn-but-don't-block approach and surfaces ABAC-specific remediation guidance when access is denied.Remote image build fails on ABAC registries:
_build_image_remotely()calledschedule_runwith no credentials, causing ACR to reject push with "at least one credential is required". On ABAC-enabled registries, one-off build runs requireSourceRegistryCredentials(identity='[caller]')so the ACR task authenticates as the signed-in CLI user — matching the behaviour ofaz acr build --source-acr-auth-id [caller].Testing Guide