feat(hooks): SHACKLE — pre-execution circuit breaker for tool calls by Fame510 · Pull Request #6298 · crewAIInc/crewAI

Fame510 · 2026-06-23T07:52:13Z

Summary

Adds shackle_guard.py — a lightweight, self-contained pre-execution circuit breaker that integrates with crewAI's existing tool hook system.

One line to activate:

from crewai.hooks.shackle_guard import register_shackle_guard
register_shackle_guard(budget=0.25, max_repeat_calls=3)

What it does

Sits between the agent and tool execution as a before_tool_call hook. Returns False to block execution, None to allow, or triggers request_human_input() for HITL approval.

Five guard layers:

Circuit breaker — once tripped, all calls blocked for the session
Wall-clock timeout — caps total session duration (default 300s)
Budget enforcement — tracks cumulative tool cost, opens circuit on exhaustion
Loop detection — blocks identical tool+params calls after limit reached
HITL approval — uses crewAI's built-in request_human_input() for high-risk tools

Error amplification: When 401/403/500/timeout signals are detected in tool input, the repeat limit tightens automatically — catching the "loop of death" pattern where an agent retries a failing API call indefinitely.
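To make the loop-detection and error-amplification behaviour concrete, here is a minimal sketch under stated assumptions. It is not the code from shackle_guard.py: the `RepeatLimiter` class, the `amplified_limit` parameter, the substring matching in `ERROR_SIGNALS`, and the assumption that tool input arrives as a dict are all hypothetical choices made for illustration.

```python
import hashlib
import json

# Assumed error markers. The PR names 401/403/500/timeout but not how they are matched.
ERROR_SIGNALS = ("401", "403", "500", "timeout")


class RepeatLimiter:
    """Blocks identical tool+params calls once a repeat limit is reached."""

    def __init__(self, max_repeat_calls: int = 3, amplified_limit: int = 1) -> None:
        self.max_repeat_calls = max_repeat_calls
        self.amplified_limit = amplified_limit  # hypothetical tightened limit
        self._counts: dict[str, int] = {}

    def allow(self, tool_name: str, tool_input: dict) -> bool:
        payload = json.dumps(tool_input, sort_keys=True, default=str)
        key = hashlib.sha256(f"{tool_name}:{payload}".encode("utf-8")).hexdigest()
        # Tighten the limit when the input carries an error signal.
        looks_failed = any(sig in payload.lower() for sig in ERROR_SIGNALS)
        limit = self.max_repeat_calls
        if looks_failed:
            limit = min(limit, self.amplified_limit)
        if self._counts.get(key, 0) >= limit:
            return False  # repeat limit reached: block this call
        self._counts[key] = self._counts.get(key, 0) + 1
        return True
```

With these defaults, a healthy call is allowed three times before being blocked, while a call whose input mentions a 401 is allowed only once. Plain substring matching is crude (any input containing "500" counts as an error), so a real implementation would likely want a stricter signal.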

Motivation

Related to #6025 — the community has been discussing the need for a runtime release-control mediation layer. This PR implements the tri-state PROCEED / NEEDS_REVIEW / BLOCK pattern as a before_tool_call hook.

200+ lines of self-contained Python
Zero new dependencies
Uses crewAI's existing hook infrastructure
Tested with the property-based testing framework from SHACKLE SP/1.0

Example

from crewai import Agent, Task, Crew
from crewai.hooks.shackle_guard import register_shackle_guard

# Activate — one line
shackle = register_shackle_guard(
    budget=0.25,
    max_repeat_calls=3,
    hitl_tools=["execute_code", "deploy"],
)

# Your crew runs as normal — SHACKLE guards every tool call
crew = Crew(agents=[...], tasks=[...])
crew.kickoff()

print(f"Budget spent: ${shackle._budget_spent:.4f}")

Summary by CodeRabbit

New Features

Introduced ShackleGuard to improve tool execution safety, adding circuit-breaker protection for timeouts and budget exhaustion, per-tool-type cost tracking, repeat-call detection with optional error amplification, and optional human-in-the-loop approval for configured high-risk tools.

Bug Fixes

Improved fail-safe behavior: if the guard encounters an internal error while evaluating a tool call, it now denies the execution to prevent unsafe runs.

Adds shackle_guard.py, a lightweight, self-contained circuit breaker that integrates with crewAI's existing tool hook system.

One-line activation:

from crewai.hooks.shackle_guard import register_shackle_guard
register_shackle_guard(budget=0.25, max_repeat_calls=3)

Features:
- Budget enforcement: tracks cumulative tool cost, opens circuit on exhaustion
- Loop detection: blocks identical tool+params calls after limit reached
- Error amplification: tightens repeat limits when 401/403/500 signals detected
- HITL: uses crewAI's built-in request_human_input() for high-risk tool approval
- Wall-clock timeout: caps total session duration
- Zero dependencies beyond crewAI's existing hook infrastructure

Related: crewAIInc#6025 (Runtime release-control mediation)

corridor-security

Summary: This PR adds an optional pre-execution tool-call guard that enforces budget, timeout, repeat-call, and human-approval checks before tool execution; no exploitable security vulnerabilities were identified in the added code.

Risk: Low risk. The change introduces an opt-in local hook rather than a public endpoint or authorization boundary, and it does not add unsafe file, SQL, subprocess, or network handling paths.

coderabbitai · 2026-06-23T07:52:56Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

📥 Commits

Reviewing files that changed from the base of the PR and between 77e8efc and eabb34a.

📒 Files selected for processing (1)

lib/crewai/src/crewai/hooks/shackle_guard.py

🚧 Files skipped from review as they are similar to previous changes (1)

lib/crewai/src/crewai/hooks/shackle_guard.py

📝 Walkthrough

A new file lib/crewai/src/crewai/hooks/shackle_guard.py is added, implementing a ShackleGuard class and a register_shackle_guard factory function. The guard registers as a before_tool_call hook and enforces budget limits, session timeouts, repeat-call caps, error-amplified thresholds, and optional HITL approval gates. A fail-closed wrapper ensures that exceptions during hook execution trip the circuit and block the call.

Changes

ShackleGuard tool hook

Layer / File(s)	Summary
ShackleGuard class: setup and helper methods `lib/crewai/src/crewai/hooks/shackle_guard.py`	Module documentation describes SHACKLE Guard as a pre-execution tool hook integration. Defines the `ShackleGuard` constructor accepting `budget`, `max_repeat_calls`, `error_amplification`, `timeout_seconds`, and `hitl_tools`, initializing runtime state (`_budget_spent`, `_call_counts`, `_call_hashes`, `_tripped`, `_session_start`, `_error_signals`). Adds internal helpers `_hash_input` for repeat detection, `_detect_error` for error-amplification signal detection, and `_cost_estimate` for per-tool cost lookups.
`__call__` gatekeeping and fail-closed wrapper `lib/crewai/src/crewai/hooks/shackle_guard.py`	Implements five ordered checks in `ShackleGuard.__call__`: immediate block if the circuit is already tripped, timeout-triggered trip, budget-exhaustion trip, repeated-call blocking with optional error-amplified threshold tightening, and per-tool HITL approval gating. Returns `None` to allow or `False` to block; updates budget and call-count tracking on allowed calls. Wraps the hook with a fail-closed exception handler that trips the circuit, records a failure reason, prints a message, and returns `False` if any exception is raised.
`register_shackle_guard` public API `lib/crewai/src/crewai/hooks/shackle_guard.py`	Adds the `register_shackle_guard(...)` function that constructs a `ShackleGuard` instance with the provided configuration parameters, registers it globally via `register_before_tool_call_hook`, and returns the guard instance for optional further reference.
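The ordered-checks-plus-latch structure described in the walkthrough can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: `LatchingGuard`, the `Check` alias, and the `tool_name` attribute on the context object are assumed names, and unlike the real guard (where repeat-call and HITL denials appear to block a single call), every failing check here opens the circuit.

```python
from types import SimpleNamespace
from typing import Callable, Optional

# A check returns a reason string to open the circuit, or None to pass.
Check = Callable[[object], Optional[str]]


class LatchingGuard:
    """Runs checks in order; the first failure opens the circuit for good."""

    def __init__(self, checks: list[Check]) -> None:
        self.checks = checks
        self.tripped = False
        self.reason: Optional[str] = None

    def __call__(self, context: object) -> Optional[bool]:
        try:
            if self.tripped:
                return False  # layer 1: circuit already open
            for check in self.checks:
                reason = check(context)
                if reason is not None:
                    self.tripped, self.reason = True, reason
                    return False
            return None  # None means "allow" in the hook contract
        except Exception as exc:
            # Fail closed: a broken check must never become an approval.
            self.tripped, self.reason = True, f"guard error: {exc}"
            return False


# Hypothetical check: refuse a tool named "deploy".
guard = LatchingGuard([lambda ctx: "deploy denied" if ctx.tool_name == "deploy" else None])
first = guard(SimpleNamespace(tool_name="search"))   # allowed
second = guard(SimpleNamespace(tool_name="deploy"))  # opens the circuit
third = guard(SimpleNamespace(tool_name="search"))   # blocked: circuit stays open
```

The point of the latch is that `third` is blocked even though the same call was allowed as `first`: once the circuit opens, nothing gets through for the rest of the session.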

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the main change: introducing SHACKLE, a pre-execution circuit breaker for tool calls, which aligns with the primary purpose of the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/hooks/shackle_guard.py`:
- Around line 70-71: The timeout countdown is currently initialized in the
__init__ method when the guard is registered, but it should only start when the
guard is actually invoked. Remove the self._start_time initialization from
__init__ and instead initialize it lazily in the __call__ method on the first
invocation by checking if self._start_time is None or not set, then setting it
to time.time() at that point. This ensures the timeout period only begins
counting from the first actual tool call, not from registration time.
- Around line 132-145: The budget exhaustion check in the shackle_guard.py file
currently only blocks when remaining budget is less than or equal to zero, which
allows one additional call to execute and cause overspending when 0 < remaining
< cost. Update the condition in the if statement from `remaining <= 0` to
`remaining < cost` so that it properly prevents execution when the remaining
budget cannot cover the estimated cost of the current call being evaluated.
- Around line 173-184: The response variable from request_human_input() may be
None or request_human_input() may raise an exception, causing response.lower()
to fail and allowing tool execution to proceed unblocked. Wrap the
request_human_input call in a try-except block to catch any exceptions, and add
defensive checks: if response is None or if any exception occurs, treat it as a
denial by returning False immediately. This ensures the HITL guard fails closed
on errors and prevents bypassing the security check through exceptions or null
returns.
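The third finding (the HITL path failing closed) has no inline diff in this thread, so the following is a minimal sketch of what such a gate might look like. The `hitl_gate` function and its `ask` parameter are hypothetical; `ask` stands in for crewAI's request_human_input(), whose exact signature is not shown in this PR.

```python
from typing import Callable, Optional


def hitl_gate(
    tool_name: str,
    hitl_tools: set[str],
    ask: Callable[[str], Optional[str]],
) -> Optional[bool]:
    """Return None to allow, False to block. Errors and empty answers block."""
    if tool_name not in hitl_tools:
        return None  # not a high-risk tool: no approval needed
    try:
        response = ask(f"Approve call to '{tool_name}'? [y/N] ")
    except Exception:
        return False  # e.g. no terminal attached: deny
    if not isinstance(response, str):
        return False  # covers a None response
    return None if response.strip().lower() in {"y", "yes"} else False
```

Only an explicit "y" or "yes" allows the call. A raised exception, a None response, or any other answer is treated as a denial.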

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 2eb4e3a and 77e8efc.

📒 Files selected for processing (1)

lib/crewai/src/crewai/hooks/shackle_guard.py

coderabbitai · 2026-06-23T07:57:44Z

+        self._start_time: float = time.time()
+


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Timeout starts at registration time instead of first tool call.

Line 70 initializes session timing in __init__. If the guard is registered long before tool execution, Line 122 can trip timeout before the first call. Initialize start time lazily on first __call__.

Proposed fix

-        self._start_time: float = time.time()
+        self._start_time: float | None = None
 ...
         # Layer 2: Timeout
-        elapsed = time.time() - self._start_time
+        if self._start_time is None:
+            self._start_time = time.time()
+        elapsed = time.time() - self._start_time
         if elapsed > self.timeout_seconds:

Also applies to: 121-123
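The lazy-start idea can be exercised in isolation with a small standalone timer. This is a sketch, not the PR's code: `SessionTimer` and its injectable `clock` parameter are hypothetical, and it uses `time.monotonic` where the PR uses `time.time()`, since a monotonic clock is unaffected by system clock adjustments.

```python
import time
from typing import Callable, Optional


class SessionTimer:
    """Wall-clock timeout whose countdown starts on first use, not on construction."""

    def __init__(
        self,
        timeout_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_seconds = timeout_seconds
        self._clock = clock  # injectable for tests (an assumption, not in the PR)
        self._start_time: Optional[float] = None

    def expired(self) -> bool:
        now = self._clock()
        if self._start_time is None:
            self._start_time = now  # the first call starts the session
        return now - self._start_time > self.timeout_seconds
```

Because the start time is recorded on the first `expired()` call, constructing the timer long before the first tool call does not consume any of the timeout.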


coderabbitai · 2026-06-23T07:57:44Z

+        cost = self._cost_estimate(context)
+        remaining = self.budget - self._budget_spent
+        if remaining <= 0:
+            self._circuit_tripped = True
+            self._circuit_reason = (
+                f"Budget exhausted: ${self._budget_spent:.4f} / ${self.budget:.2f}"
+            )
+            print(
+                f"\n💰 SHACKLE BUDGET EXHAUSTED: "
+                f"${self._budget_spent:.4f} / ${self.budget:.2f}\n"
+                f"   Circuit opened. All further calls blocked."
+            )
+            return False
+


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Budget guard currently allows overspending by one call.

At Line 134, blocking only when remaining <= 0 means a call still executes when 0 < remaining < cost, which exceeds the configured pre-execution budget cap.

Proposed fix

         # Layer 3: Budget
         cost = self._cost_estimate(context)
         remaining = self.budget - self._budget_spent
-        if remaining <= 0:
+        if remaining < cost:
             self._circuit_tripped = True
             self._circuit_reason = (
-                f"Budget exhausted: ${self._budget_spent:.4f} / ${self.budget:.2f}"
+                f"Budget exhausted: projected ${self._budget_spent + cost:.4f} > ${self.budget:.2f}"
             )
             print(
                 f"\n💰 SHACKLE BUDGET EXHAUSTED: "
-                f"${self._budget_spent:.4f} / ${self.budget:.2f}\n"
+                f"projected ${self._budget_spent + cost:.4f} > ${self.budget:.2f}\n"
                 f"   Circuit opened. All further calls blocked."
             )
             return False
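A small worked example may help show why the comparison matters. The `Budget` class below is a hypothetical stand-in for the guard's budget layer, using the PR's example budget of 0.25 and an assumed per-call cost of 0.10.

```python
class Budget:
    """Pre-execution budget check that refuses calls it cannot afford."""

    def __init__(self, budget: float) -> None:
        self.budget = budget
        self.spent = 0.0

    def try_spend(self, cost: float) -> bool:
        remaining = self.budget - self.spent
        if remaining < cost:  # with `remaining <= 0`, this call would slip through
            return False
        self.spent += cost
        return True


b = Budget(0.25)
results = [b.try_spend(0.10) for _ in range(3)]
```

After two calls roughly 0.05 remains, which cannot cover a third 0.10 call, so the third call is refused and spending stays at 0.20. Under the original `remaining <= 0` check the third call would run and push spending to 0.30, above the configured cap.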


LOLA0786 · 2026-06-23T08:44:56Z

Really like this. The tri-state proceed/review/block as a before_tool_call
hook is the right shape, and tightening the repeat limit on 401/403/500 to catch
the retry "loop of death" is a genuinely nice touch.

Three things from working on this exact pattern, in case they're useful:

1. The fail-closed fix CodeRabbit flagged on the HITL path is the load-bearing
one, worth treating as the component's core invariant rather than a bug fix. A guard
whose failure mode is "allow" isn't a guard. I'd extend it past that one
try/except and make every layer fail closed: if _cost_estimate, the input
hash, or any context access throws, block. The entire value of the thing is that
an exception can never become an implicit approval.

2. In-process hooks share a trust boundary with what they're guarding.
_budget_spent, _tripped, and _call_hashes live in the same process as the
agent. For runaway cost and loops — the threat model this nails — that's totally
fine. But if the threat model ever includes a prompt-injected agent or a
compromised tool, the guard's own state is reachable by the thing it's policing,
so the enforcement point has to move outside the agent's process. Might be worth
stating in the README which threat model SHACKLE targets, since people will grab
it for both.

3. Loop detection by exact tool+params hash is evadable by decomposition. An
agent can slip max_repeat_calls by perturbing params (whitespace, key
reordering, a no-op arg) or by splitting one blocked action into N benign-looking
calls with the same aggregate effect. If you ever harden it: canonicalize inputs
before hashing, and budget on aggregate effect rather than per-call identity.
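As one possible reading of "canonicalize inputs before hashing", the sketch below normalizes key order and whitespace before computing the fingerprint. The `_canonical` and `call_fingerprint` names are hypothetical, and the sketch assumes JSON-like dict input. It addresses only the cosmetic perturbations from point 3; no-op arguments and decomposition into several calls would still get through.

```python
import hashlib
import json
from typing import Any


def _canonical(value: Any) -> Any:
    """Normalize a value so cosmetic differences hash identically."""
    if isinstance(value, dict):
        return {str(k): _canonical(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [_canonical(v) for v in value]
    if isinstance(value, str):
        return " ".join(value.split())  # collapse and trim whitespace
    return value


def call_fingerprint(tool_name: str, tool_input: dict) -> str:
    """Hash a tool call so reordered keys and extra whitespace collide."""
    payload = json.dumps(
        _canonical(tool_input),
        sort_keys=True,         # key order no longer matters
        separators=(",", ":"),  # fixed, compact formatting
        default=str,            # fall back to str() for non-JSON values
    )
    return hashlib.sha256(f"{tool_name}\n{payload}".encode("utf-8")).hexdigest()


a = call_fingerprint("search", {"q": "foo  bar", "n": 1})
b = call_fingerprint("search", {"n": 1, "q": " foo bar "})
```

Here `a` and `b` are equal even though the two inputs differ in key order and spacing, so both calls would count against the same repeat limit.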

None of this detracts from the PR, which is clean, dependency-free, and soundly layered. I've
spent a lot of time on pre-execution enforcement for agent actions specifically;
happy to compare notes anytime.

Fame510 · 2026-06-23T09:14:14Z

@LOLA0786 excellent review — you nailed the two critical issues.

On fail-closed as invariant (point 1): you're absolutely right. I just pushed a fix that wraps the entire __call__ method in a fail-closed guard:

def __call__(self, context):
    try:
        # ... existing guard logic
        return None  # allow
    except Exception as e:
        # Fail-closed: any guard error = circuit tripped
        self._circuit_tripped = True
        self._circuit_reason = f"Guard error (fail-closed): {e}"
        print(f"\n⛓️ SHACKLE FAIL-CLOSED: {e}\n   Circuit opened for safety.")
        return False  # block execution

On making every layer fail-closed — agreed. Each guard layer now fails to DENY, not ALLOW. The guard's survival mode is "block everything" because the alternative ("allow everything") means the guard isn't a guard at all.

On the HITL path specifically (also raised under point 1): I added an except around request_human_input() too. If the terminal is disconnected, the guard blocks execution and logs the reason rather than silently passing through.

Thanks for the review — would you be open to a more detailed async discussion on how you've implemented this pattern? Happy to share the standalone SHACKLE repo for reference: https://github.com/Fame510/SHACKLE-PRO-

fix: fail-closed invariant — every guard layer fails to DENY

eabb34a

Per @LOLA0786 review: a guard whose failure mode is 'allow' isn't a guard. Wraps entire __call__ in try/except that trips circuit on any error. HITL path also fails closed — if terminal is disconnected, block execution.

Fame510 wants to merge 2 commits into crewAIInc:main from Fame510:feat/shackle-guard-integration
