Skip to content

{Cognitive Services} az cognitiveservices agent: ABAC-enabled ACR registry support#32863

Open
eamonoreilly wants to merge 1 commit intoAzure:devfrom
eamonoreilly:agent-acr-abac-support
Open

{Cognitive Services} az cognitiveservices agent: ABAC-enabled ACR registry support#32863
eamonoreilly wants to merge 1 commit intoAzure:devfrom
eamonoreilly:agent-acr-abac-support

Conversation

@eamonoreilly
Copy link
Contributor

Fix two bugs when using ABAC-enabled (repository-scoped permissions) ACR registries with hosted agent commands:

  • Fix _check_project_acr_access() treating conditioned role assignments as invalid; now warns instead of blocking when ABAC conditions cannot be evaluated client-side
  • Fix _build_image_remotely() failing to push images to ABAC-enabled registries due to missing credentials in schedule_run
  • Extract _ACR_PULL_ROLES as a module-level constant and eliminate redundant ACR registry fetch

Related command
az cognitiveservices agent create

Description
Two bugs affect users with ABAC-enabled (repository-scoped permissions) ACR registries:

  1. ACR access validation incorrectly blocks ABAC users: When a project's managed identity has a role assignment with an ABAC condition (e.g. scoped to a specific repository), _check_project_acr_access() treated the condition as evidence of insufficient access and blocked the operation. Since ABAC condition strings cannot be reliably evaluated client-side, the fix adopts a warn-but-don't-block approach and surfaces ABAC-specific remediation guidance when access is denied.

  2. Remote image build fails on ABAC registries: _build_image_remotely() called schedule_run with no credentials, causing ACR to reject push with "at least one credential is required". On ABAC-enabled registries, one-off build runs require SourceRegistryCredentials(identity='[caller]') so the ACR task authenticates as the signed-in CLI user — matching the behaviour of az acr build --source-acr-auth-id [caller].

Testing Guide

# Create agent from source using an ABAC-enabled ACR registry
az cognitiveservices agent create \
  --account-name myAccount \
  --project-name myProject \
  --name myAgent \
  --source . \
  --registry myAbacRegistry \
  --build-remote

# Create agent with pre-built image on an ABAC-enabled registry (should warn, not fail, if role has condition)
az cognitiveservices agent create \
  --account-name myAccount \
  --project-name myProject \
  --name myAgent \
  --image myAbacRegistry.azurecr.io/myagent:v1

**History Notes**
[Cognitive Services] `az cognitiveservices agent create`: Fix ABAC-enabled ACR registry support for remote build and access validation

This checklist is used to make sure that common guidelines for a pull request are followed.

- [x] The PR title and description has followed the guideline in [Submitting Pull Requests](https://github.com/Azure/azure-cli/tree/dev/doc/authoring_command_modules#submitting-pull-requests).

- [x] I adhere to the [Command Guidelines](https://github.com/Azure/azure-cli/blob/dev/doc/command_guidelines.md).

- [x] I adhere to the [Error Handling Guidelines](https://github.com/Azure/azure-cli/blob/dev/doc/error_handling_guidelines.md).

Fix two bugs when using ABAC-enabled (repository-scoped permissions) ACR
registries with hosted agent commands.

Bug 1: ACR access check ignores ABAC conditions
_check_project_acr_access() treated conditioned role assignments the same
as unconditional ones, causing false negatives when the managed identity
had a valid ABAC-scoped role. Now detects ABAC mode on the registry:
- If a role assignment has a condition and ABAC is enabled, log a warning
  (cannot evaluate condition strings client-side) and allow the operation
  to proceed (warn-but-don't-block).
- Return abac_enabled as a 4th tuple element so the caller can provide
  ABAC-specific remediation guidance without a redundant registry fetch.
- agent_create() error messages now include the correct repo-scoped
  az role assignment create command with --condition when ABAC is active.

Bug 2: Remote build fails on ABAC-enabled registries
_build_image_remotely() called schedule_run with no credentials, which
causes ACR to reject push on ABAC registries with "at least one credential
is required". For ABAC-enabled registries, one-off build runs require
SourceRegistryCredentials(identity='[caller]') so the task authenticates
as the signed-in CLI user, matching az acr build --source-acr-auth-id
[caller]. Non-ABAC registries continue to pass credentials=None.

Additional improvements
- Extract _ACR_PULL_ROLES as a module-level set constant (previously
  rebuilt as a local list on every call).
- Remove unused image_repo parameter from _check_project_acr_access.
Copilot AI review requested due to automatic review settings February 27, 2026 02:00
@azure-client-tools-bot-prd
Copy link

azure-client-tools-bot-prd bot commented Feb 27, 2026

️✔️AzureCLI-FullTest
️✔️acr
️✔️latest
️✔️3.12
️✔️3.13
️✔️acs
️✔️latest
️✔️3.12
️✔️3.13
️✔️advisor
️✔️latest
️✔️3.12
️✔️3.13
️✔️ams
️✔️latest
️✔️3.12
️✔️3.13
️✔️apim
️✔️latest
️✔️3.12
️✔️3.13
️✔️appconfig
️✔️latest
️✔️3.12
️✔️3.13
️✔️appservice
️✔️latest
️✔️3.12
️✔️3.13
️✔️aro
️✔️latest
️✔️3.12
️✔️3.13
️✔️backup
️✔️latest
️✔️3.12
️✔️3.13
️✔️batch
️✔️latest
️✔️3.12
️✔️3.13
️✔️batchai
️✔️latest
️✔️3.12
️✔️3.13
️✔️billing
️✔️latest
️✔️3.12
️✔️3.13
️✔️botservice
️✔️latest
️✔️3.12
️✔️3.13
️✔️cdn
️✔️latest
️✔️3.12
️✔️3.13
️✔️cloud
️✔️latest
️✔️3.12
️✔️3.13
️✔️cognitiveservices
️✔️latest
️✔️3.12
️✔️3.13
️✔️compute_recommender
️✔️latest
️✔️3.12
️✔️3.13
️✔️computefleet
️✔️latest
️✔️3.12
️✔️3.13
️✔️config
️✔️latest
️✔️3.12
️✔️3.13
️✔️configure
️✔️latest
️✔️3.12
️✔️3.13
️✔️consumption
️✔️latest
️✔️3.12
️✔️3.13
️✔️container
️✔️latest
️✔️3.12
️✔️3.13
️✔️containerapp
️✔️latest
️✔️3.12
️✔️3.13
️✔️core
️✔️latest
️✔️3.12
️✔️3.13
️✔️cosmosdb
️✔️latest
️✔️3.12
️✔️3.13
️✔️databoxedge
️✔️latest
️✔️3.12
️✔️3.13
️✔️dls
️✔️latest
️✔️3.12
️✔️3.13
️✔️dms
️✔️latest
️✔️3.12
️✔️3.13
️✔️eventgrid
️✔️latest
️✔️3.12
️✔️3.13
️✔️eventhubs
️✔️latest
️✔️3.12
️✔️3.13
️✔️feedback
️✔️latest
️✔️3.12
️✔️3.13
️✔️find
️✔️latest
️✔️3.12
️✔️3.13
️✔️hdinsight
️✔️latest
️✔️3.12
️✔️3.13
️✔️identity
️✔️latest
️✔️3.12
️✔️3.13
️✔️iot
️✔️latest
️✔️3.12
️✔️3.13
️✔️keyvault
️✔️latest
️✔️3.12
️✔️3.13
️✔️lab
️✔️latest
️✔️3.12
️✔️3.13
️✔️managedservices
️✔️latest
️✔️3.12
️✔️3.13
️✔️maps
️✔️latest
️✔️3.12
️✔️3.13
️✔️marketplaceordering
️✔️latest
️✔️3.12
️✔️3.13
️✔️monitor
️✔️latest
️✔️3.12
️✔️3.13
️✔️mysql
️✔️latest
️✔️3.12
️✔️3.13
️✔️netappfiles
️✔️latest
️✔️3.12
️✔️3.13
️✔️network
️✔️latest
️✔️3.12
️✔️3.13
️✔️policyinsights
️✔️latest
️✔️3.12
️✔️3.13
️✔️postgresql
️✔️latest
️✔️3.12
️✔️3.13
️✔️privatedns
️✔️latest
️✔️3.12
️✔️3.13
️✔️profile
️✔️latest
️✔️3.12
️✔️3.13
️✔️rdbms
️✔️latest
️✔️3.12
️✔️3.13
️✔️redis
️✔️latest
️✔️3.12
️✔️3.13
️✔️relay
️✔️latest
️✔️3.12
️✔️3.13
️✔️resource
️✔️latest
️✔️3.12
️✔️3.13
️✔️role
️✔️latest
️✔️3.12
️✔️3.13
️✔️search
️✔️latest
️✔️3.12
️✔️3.13
️✔️security
️✔️latest
️✔️3.12
️✔️3.13
️✔️servicebus
️✔️latest
️✔️3.12
️✔️3.13
️✔️serviceconnector
️✔️latest
️✔️3.12
️✔️3.13
️✔️servicefabric
️✔️latest
️✔️3.12
️✔️3.13
️✔️signalr
️✔️latest
️✔️3.12
️✔️3.13
️✔️sql
️✔️latest
️✔️3.12
️✔️3.13
️✔️sqlvm
️✔️latest
️✔️3.12
️✔️3.13
️✔️storage
️✔️latest
️✔️3.12
️✔️3.13
️✔️synapse
️✔️latest
️✔️3.12
️✔️3.13
️✔️telemetry
️✔️latest
️✔️3.12
️✔️3.13
️✔️util
️✔️latest
️✔️3.12
️✔️3.13
️✔️vm
️✔️latest
️✔️3.12
️✔️3.13

@azure-client-tools-bot-prd
Copy link

azure-client-tools-bot-prd bot commented Feb 27, 2026

️✔️AzureCLI-BreakingChangeTest
️✔️Non Breaking Changes

@yonzhan
Copy link
Collaborator

yonzhan commented Feb 27, 2026

Thank you for your contribution! We will review the pull request and get back to you soon.

@github-actions
Copy link

The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR.

Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions).
After that please run the following commands to enable git hooks:

pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes two bugs related to ABAC-enabled (Attribute-Based Access Control with repository-scoped permissions) Azure Container Registry support in the cognitive services agent commands. The first bug incorrectly blocked users when role assignments had ABAC conditions, treating them as invalid. The fix adopts a warn-but-don't-block approach since ABAC condition strings cannot be reliably evaluated client-side. The second bug caused remote image builds to fail on ABAC registries due to missing credentials; the fix passes SourceRegistryCredentials(identity='[caller]') to authenticate as the CLI user, matching az acr build --source-acr-auth-id [caller] behavior.

Changes:

  • Extract _ACR_PULL_ROLES as a module-level constant for reusability
  • Add ABAC detection and credential passing in _build_image_remotely() for remote builds
  • Update _check_project_acr_access() to detect ABAC mode and warn (not block) on conditioned assignments
  • Add _extract_repository_name_for_acr() helper to generate ABAC-specific guidance
  • Enhance error messages in agent_create() with ABAC-specific remediation commands
Comments suppressed due to low confidence (1)

src/azure-cli/azure/cli/command_modules/cognitiveservices/custom.py:886

  • The class PlatformProperties is imported twice from the same module: once at line 861 (inside the has_dockerfile block) and once at line 885 (inside the else block). Consider moving this import to the top-level function imports at line 797-801 to avoid duplication, or accept it as a minor trade-off for keeping the conditional imports isolated.
            from azure.mgmt.containerregistry.models import (
                DockerBuildRequest, PlatformProperties
            )

            docker_build_request = DockerBuildRequest(
                image_names=[image_without_registry],
                is_push_enabled=True,
                source_location=source_location,
                platform=PlatformProperties(os='Linux', architecture='amd64'),
                docker_file_path=dockerfile_name,
                timeout=3600,
                credentials=build_credentials,
            )

            queued = client_registries.schedule_run(
                resource_group_name=resource_group_name,
                registry_name=registry_name,
                run_request=docker_build_request)
        else:
            # Buildpacks for auto-detection
            logger.warning("No Dockerfile - using Cloud Native Buildpacks (auto-detect)")
            logger.warning("Buildpacks will detect Python, Node.js, .NET, etc.")
            logger.warning("Queueing build task...")

            from azure.mgmt.containerregistry.models import (
                EncodedTaskRunRequest, PlatformProperties
            )

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1958 to +1976
def _extract_repository_name_for_acr(image, source, agent_name, registry):
"""Extract ACR repository path (without tag/digest) for ABAC guidance."""
if source:
return agent_name

if not image:
return None

if '.azurecr.io/' in image:
repository = image.split('.azurecr.io/', 1)[1]
elif registry:
repository = image
else:
return None

repository = repository.split('@', 1)[0]
repository = repository.split(':', 1)[0]
return repository or None

Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new helper function _extract_repository_name_for_acr lacks unit tests. The test file shows a pattern of unit testing for similar helper functions (e.g., test_has_dockerfile_*, test_is_fully_qualified_image). Consider adding unit tests covering edge cases: source with agent_name, image with .azurecr.io/, image with registry param, digest format (@sha256:...), tag format (:v1), and combinations thereof.

Copilot uses AI. Check for mistakes.
Comment on lines 1840 to 1905
@@ -1796,45 +1863,45 @@ def _check_project_acr_access(cmd, client, account_name, project_name, registry_
# However, this is significantly more complex and slower. The current approach
# follows the pattern used by AKS (see acs/_roleassignments.py) and covers
# the most common scenarios. Users with custom roles can use --skip-acr-check.
#
# Acceptable roles include:
# Standard ACR roles:
# - AcrPull: Can pull images
# - AcrPush: Can pull and push images
# Repository-scoped roles:
# - Container Registry Repository Reader: Read access (includes pull)
# - Container Registry Repository Writer: Read/write access (includes pull)
# - Container Registry Repository Contributor: Full repository access (includes pull)
# General Azure roles:
# - Reader: Can view resources (includes pull)
# - Contributor, Owner: Full access
acceptable_roles = [
'AcrPull',
'AcrPush',
'Container Registry Repository Reader',
'Container Registry Repository Writer',
'Container Registry Repository Contributor',
'Reader',
'Contributor',
'Owner'
]

# Get role assignments for the principal on the ACR
assignments = list_role_assignments(cmd, assignee=principal_id, scope=acr_resource_id)

# Check if any assignment has acceptable role
# Check if any assignment has an acceptable role (see _ACR_PULL_ROLES),
# accounting for ABAC conditions
for assignment in assignments:
role_name = assignment.get('roleDefinitionName', '')
if role_name in acceptable_roles:
if role_name in _ACR_PULL_ROLES:
condition = assignment.get('condition', None)
if condition and abac_enabled:
# ABAC is enabled and this assignment has a condition.
# We cannot reliably evaluate ABAC condition strings client-side,
# so we treat the assignment as valid and warn the user.
logger.warning(
"Found '%s' role on ACR '%s' with an ABAC condition. "
"Cannot verify whether the condition grants access to the "
"target repository. If the deployment fails with a permission "
"error, verify the ABAC condition covers the required repository.",
role_name, registry_name)
return (True, principal_id, None, abac_enabled)
# No condition = full scope access (or ABAC not enabled)
logger.info(
"Found %s role for project identity on ACR %s",
role_name, registry_name)
return (True, principal_id, None)
return (True, principal_id, None, abac_enabled)

# No suitable role found
if abac_enabled:
return (
False, principal_id,
f"Project managed identity does not have any recognized pull role on "
f"ABAC-enabled registry '{registry_name}'",
abac_enabled
)
return (
False, principal_id,
f"Project managed identity does not have AcrPull access to '{registry_name}'"
f"Project managed identity does not have AcrPull access to '{registry_name}'",
abac_enabled
)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABAC handling logic in _check_project_acr_access lacks test coverage. Given the complexity of the warn-but-don't-block pattern for conditioned assignments (lines 1876-1886) and the ABAC-specific error messages (lines 1894-1900), consider adding unit tests or integration tests to verify: (1) correct handling when ABAC is enabled with conditioned assignments, (2) correct handling when ABAC is disabled but conditions exist, and (3) correct error messages for ABAC vs non-ABAC registries. Mock the ACR registry fetch and role assignments to test different scenarios.

Copilot uses AI. Check for mistakes.
Comment on lines +1875 to +1891
condition = assignment.get('condition', None)
if condition and abac_enabled:
# ABAC is enabled and this assignment has a condition.
# We cannot reliably evaluate ABAC condition strings client-side,
# so we treat the assignment as valid and warn the user.
logger.warning(
"Found '%s' role on ACR '%s' with an ABAC condition. "
"Cannot verify whether the condition grants access to the "
"target repository. If the deployment fails with a permission "
"error, verify the ABAC condition covers the required repository.",
role_name, registry_name)
return (True, principal_id, None, abac_enabled)
# No condition = full scope access (or ABAC not enabled)
logger.info(
"Found %s role for project identity on ACR %s",
role_name, registry_name)
return (True, principal_id, None)
return (True, principal_id, None, abac_enabled)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for handling conditioned role assignments needs refinement. If a role assignment has a condition but abac_enabled is False (either because ABAC detection failed or the registry doesn't have ABAC enabled), the code falls through to line 1888-1891 and returns True, treating the conditioned assignment as fully valid. This could incorrectly accept conditioned assignments that may not grant the required access. Consider adding an else-if branch: if condition exists but abac_enabled is False, either log a warning or treat it as a failure, since conditioned assignments typically indicate repository-scoped permissions that should be validated.

Copilot uses AI. Check for mistakes.
Comment on lines +2070 to +2101
if acr_abac_enabled and image_repo:
error_message = (
f"{error_msg}\n\n"
f"This registry has ABAC (repository-level permissions) enabled.\n"
f"Grant repository-scoped access to the project's managed identity:\n\n"
f" az role assignment create --assignee {principal_id} "
f"--role \"Container Registry Repository Reader\" "
f"--scope {acr_scope} "
f"--condition \"@Resource[Microsoft.ContainerRegistry/registries/"
f"repositories] StringEquals '{image_repo}'\" "
f"--condition-version \"2.0\"\n\n"
f"Or grant broad access (bypasses ABAC scoping):\n\n"
f" az role assignment create --assignee {principal_id} "
f"--role AcrPull "
f"--scope {acr_scope}\n\n"
f"To skip this check (not recommended), use: --skip-acr-check"
)
else:
error_message = (
f"{error_msg}\n\n"
f"AI Foundry needs permission to pull the container image from ACR.\n"
f"Grant AcrPull role to the project's managed identity:\n\n"
f" az role assignment create --assignee {principal_id} "
f"--role AcrPull "
f"--scope {acr_scope}\n\n"
f"Or use Azure Portal:\n"
f" 1. Open ACR '{registry_name}' → Access Control (IAM)\n"
f" 2. Add role assignment → AcrPull\n"
f" 3. Assign access to: Managed Identity\n"
f" 4. Select the project's managed identity\n\n"
f"To skip this check (not recommended), use: --skip-acr-check"
)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When ABAC is enabled but image_repo is None (line 2070 condition fails), the error message falls through to the else block (line 2087-2101) and provides generic AcrPull guidance that doesn't mention ABAC. This could mislead users with ABAC-enabled registries. Consider adding a third case: if acr_abac_enabled is True but image_repo is None, provide an ABAC-aware message without the repository-specific condition command, or explain why the repository name couldn't be determined.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Auto-Assign Auto assign by bot Cognitive Services az cognitiveservices

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants