CRE-2025-0169: AutoGPT Agent Runaway Loop Detection #147
Open: piyzard wants to merge 1 commit into prequel-dev:main from piyzard:feature/autogpt-runaway-detection
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| rules: | ||
| - metadata: | ||
| kind: prequel | ||
| id: AG3xK8mP2qQzYvWr9aLfT7 | ||
| hash: YpQ9Lm4Zk8TnVb2Ry6HwGs | ||
| cre: | ||
| id: CRE-2025-0169 | ||
| severity: 0 | ||
| title: "AutoGPT Agent Runaway Loop Execution" | ||
| category: "memory-problem" | ||
| author: Prequel Community | ||
| description: | | ||
| Detects AutoGPT agents stuck in runaway execution loops where the agent repeatedly performs the same planning or research tasks without making meaningful progress toward the goal. This critical failure pattern can silently derail agent execution, consume excessive resources, and prevent task completion even when left running for hours. | ||
| cause: | | ||
| - Poor prompt engineering leading to circular reasoning patterns | ||
| - Inadequate memory management causing repetitive action sequences | ||
| - Insufficient world model preventing the agent from tracking completed work | ||
| - Overly complex planning that creates unnecessary recursive sub-tasks | ||
| - Semantic search over previous actions increasing loop likelihood | ||
| - GPT-4's constrained reasoning ability in complex multi-step scenarios | ||
| impact: | | ||
| - Agent never completes assigned tasks despite extended execution | ||
| - Excessive resource consumption (API calls, compute, memory) | ||
| - Silent failure that appears as continued "progress" to users | ||
| - Loss of productivity and wasted computational resources | ||
| - Potential infinite execution without manual intervention | ||
| tags: | ||
| - memory | ||
| - crash | ||
| - configuration | ||
| - performance | ||
| - timeout | ||
| - python | ||
| - deadlock | ||
| - retry | ||
| - api-key | ||
| - threads | ||
| mitigation: | | ||
| IMMEDIATE ACTIONS: | ||
| - Stop the current AutoGPT execution immediately | ||
| - Review the agent's execution log for repetitive patterns | ||
| - Implement loop detection with maximum iteration limits | ||
| CONFIGURATION FIXES: | ||
| - Set MAX_LOOPS environment variable (e.g., MAX_LOOPS=25) | ||
| - Enable execution summary logging to track progress | ||
| - Implement task completion validation checkpoints | ||
| - Add time-based execution limits (e.g., --timeout 3600) | ||
| PROMPT IMPROVEMENTS: | ||
| - Simplify task descriptions to reduce planning complexity | ||
| - Break large tasks into smaller, concrete sub-tasks | ||
| - Add explicit success criteria and completion conditions | ||
| - Use more directive language rather than open-ended requests | ||
| PREVENTION: | ||
| - Monitor execution logs for repetitive action patterns | ||
| - Implement progress tracking metrics | ||
| - Use task-specific timeouts and loop detection | ||
| - Regular checkpoint validation of intermediate results | ||
| references: | ||
| - https://github.com/Significant-Gravitas/AutoGPT/issues/2726 | ||
| - https://github.com/reworkd/AgentGPT/issues/1252 | ||
| - https://www.taivo.ai/__why-autogpt-fails-and-how-to-fix-it/ | ||
| - https://jina.ai/news/auto-gpt-unmasked-hype-hard-truths-production-pitfalls/ | ||
| applications: | ||
| - name: autogpt | ||
| version: ">=0.4.0" | ||
| - name: auto-gpt | ||
| version: ">=0.4.0" | ||
| impactScore: 9 | ||
| mitigationScore: 6 | ||
| reports: 23 | ||
| rule: | ||
| set: | ||
| window: 300s | ||
| event: | ||
| source: cre.log.autogpt | ||
| match: | ||
| Review comment (Contributor): Hi @piyzard, mostly LGTM. I just wanted to mention that the test logs contain all 6 of the issues you mentioned, but the regex pattern only covers 2 cases. Why is that? | ||
| - regex: 'REASONING.*I need to.*research' | ||
| count: 3 | ||
| - regex: 'REASONING.*I should.*research' | ||
| count: 2 | ||
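
To make the reviewer's question about coverage easier to reason about, here is a minimal Python sketch of how the two patterns behave against the opening lines of the test log. It assumes the rule fires once every pattern has reached its `count` inside one 300-second sliding window; that reading of `set`/`window`/`count` is an assumption for illustration, not the rule engine's actual implementation. The helper names (`parse`, `rule_fires`, `SAMPLE`) are hypothetical.

```python
import re
from collections import deque
from datetime import datetime, timedelta

# Patterns and thresholds copied from the rule under review.
PATTERNS = [
    (re.compile(r"REASONING.*I need to.*research"), 3),
    (re.compile(r"REASONING.*I should.*research"), 2),
]
WINDOW = timedelta(seconds=300)  # mirrors `window: 300s`

TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")

# Timestamped action lines and their REASONING lines from the test log.
SAMPLE = [
    "2024-12-15 10:23:46.456 [INFO] NEXT ACTION: COMMAND = {",
    "REASONING: I need to research the Volvo V60 turning circle specifications to provide accurate information.",
    "2024-12-15 10:24:13.012 [INFO] NEXT ACTION: COMMAND = {",
    "REASONING: I should research more detailed specifications from the official Volvo website to gather information about the V60 turning circle.",
    "2024-12-15 10:24:46.567 [INFO] NEXT ACTION: COMMAND = {",
    "REASONING: I need to research more specific information about the turning circle. Let me analyze the exact specifications for different model years.",
    "2024-12-15 10:25:14.123 [INFO] NEXT ACTION: COMMAND = {",
    "REASONING: I should research and gather more comprehensive information about the Volvo V60 turning specifications from automotive databases.",
    "2024-12-15 10:25:48.789 [INFO] NEXT ACTION: COMMAND = {",
    "REASONING: I need to research the exact turning circle measurements. Let me analyze and gather information about both curb-to-curb and wall-to-wall turning diameters.",
]


def parse(lines):
    """Yield (timestamp, line); untimestamped lines inherit the previous timestamp."""
    last = None
    for line in lines:
        m = TS.match(line)
        if m:
            last = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if last is not None:
            yield last, line


def rule_fires(lines, patterns=PATTERNS, window=WINDOW):
    """Assumed semantics: True once every pattern has `count` hits inside one window."""
    hits = [deque() for _ in patterns]
    for ts, line in parse(lines):
        for (rx, _), seen in zip(patterns, hits):
            if rx.search(line):
                seen.append(ts)
        # Drop hits that have aged out of the sliding window.
        for seen in hits:
            while seen and ts - seen[0] > window:
                seen.popleft()
        if all(len(seen) >= need for (_, need), seen in zip(patterns, hits)):
            return True
    return False
```

Under these assumptions the sample fires on the fifth action, about two minutes after the first match. Every `REASONING` line in the test log opens with either "I need to research" or "I should research", so the two patterns match all ten of them; secondary phrasing such as "Let me analyze" is only ever matched incidentally.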
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,69 @@ | ||
| 2024-12-15 10:23:45.123 [INFO] AutoGPT starting with goal: Find the turning circle of a Volvo V60 | ||
| 2024-12-15 10:23:46.456 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "google", | ||
| "args": {"input": "Volvo V60 turning circle"} | ||
| } | ||
| REASONING: I need to research the Volvo V60 turning circle specifications to provide accurate information. | ||
| | ||
| 2024-12-15 10:24:12.789 [INFO] Command google returned with results... | ||
| 2024-12-15 10:24:13.012 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "browse_website", | ||
| "args": {"url": "https://www.volvo.com/en/models/v60", "question": "what is the turning circle"} | ||
| } | ||
| REASONING: I should research more detailed specifications from the official Volvo website to gather information about the V60 turning circle. | ||
| | ||
| 2024-12-15 10:24:45.234 [INFO] Website browsing completed... | ||
| 2024-12-15 10:24:46.567 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "google", | ||
| "args": {"input": "Volvo V60 2023 turning radius specifications"} | ||
| } | ||
| REASONING: I need to research more specific information about the turning circle. Let me analyze the exact specifications for different model years. | ||
| | ||
| 2024-12-15 10:25:13.890 [INFO] Google search completed... | ||
| 2024-12-15 10:25:14.123 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "browse_website", | ||
| "args": {"url": "https://www.cars.com/research/volvo-v60", "question": "turning circle radius"} | ||
| } | ||
| REASONING: I should research and gather more comprehensive information about the Volvo V60 turning specifications from automotive databases. | ||
| | ||
| 2024-12-15 10:25:47.456 [INFO] Attempting to fix JSON by finding outermost brackets... | ||
| 2024-12-15 10:25:48.789 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "google", | ||
| "args": {"input": "Volvo V60 curb-to-curb turning diameter"} | ||
| } | ||
| REASONING: I need to research the exact turning circle measurements. Let me analyze and gather information about both curb-to-curb and wall-to-wall turning diameters. | ||
| | ||
| 2024-12-15 10:26:23.012 [INFO] Search results processed... | ||
| 2024-12-15 10:26:24.345 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "write_to_file", | ||
| "args": {"file": "volvo_research.txt", "text": "Researching Volvo V60 turning specifications"} | ||
| } | ||
| REASONING: I should research and document my findings systematically. Let me analyze what I've learned so far about the turning circle. | ||
| | ||
| 2024-12-15 10:26:55.678 [INFO] File written successfully... | ||
| 2024-12-15 10:26:56.901 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "google", | ||
| "args": {"input": "Volvo V60 maneuvering diameter specifications manual"} | ||
| } | ||
| REASONING: I need to research the official technical specifications. Let me analyze the owner's manual for precise turning circle data. | ||
| | ||
| 2024-12-15 10:27:34.234 [INFO] Search completed... | ||
| 2024-12-15 10:27:35.567 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "browse_website", | ||
| "args": {"url": "https://www.volvocars.com/intl/support/manuals", "question": "V60 turning circle"} | ||
| } | ||
| REASONING: I should research the official documentation to gather information about the exact turning specifications from Volvo's technical manuals. | ||
| | ||
| 2024-12-15 10:28:12.890 [WARN] Loop iteration 8 detected... | ||
| 2024-12-15 10:28:13.123 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "google", | ||
| "args": {"input": "Volvo V60 turning circle feet meters comparison"} | ||
| } | ||
| REASONING: I need to research and analyze the turning circle in both metric and imperial units to provide comprehensive information. | ||
| | ||
| 2024-12-15 10:28:47.456 [INFO] Search results obtained... | ||
| 2024-12-15 10:28:48.789 [INFO] NEXT ACTION: COMMAND = { | ||
| "name": "analyze_data", | ||
| "args": {"data": "turning circle research results"} | ||
| } | ||
| REASONING: I should research and analyze all the data I've gathered about the Volvo V60 turning specifications to synthesize the information. |
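
The rule's mitigation text recommends iteration caps, time limits, and loop detection. A rough sketch of such a guard, applied to the command sequence in the test log, might look like the following. `LoopGuard`, `guard_from_env`, and the `max_repeats` threshold are hypothetical names introduced here for illustration; the `MAX_LOOPS` variable name and the 25 and 3600 defaults are taken from the mitigation text, and whether AutoGPT itself honors such a setting is not confirmed by this PR.

```python
import os
import time


class LoopGuard:
    """Hypothetical guard: stop on an iteration cap, a time limit, or a repeated command."""

    def __init__(self, max_loops=25, timeout_s=3600.0, max_repeats=3,
                 clock=time.monotonic):
        self.max_loops = max_loops
        self.timeout_s = timeout_s
        self.max_repeats = max_repeats
        self._clock = clock
        self._start = clock()
        self._iterations = 0
        self._counts = {}

    def check(self, command_name):
        """Record one planned action; return a stop reason, or None to continue."""
        self._iterations += 1
        self._counts[command_name] = self._counts.get(command_name, 0) + 1
        if self._iterations > self.max_loops:
            return "iteration limit"
        if self._clock() - self._start > self.timeout_s:
            return "time limit"
        if self._counts[command_name] > self.max_repeats:
            return f"repeated command: {command_name}"
        return None


def guard_from_env():
    """Build a guard from MAX_LOOPS (name assumed from the mitigation text)."""
    return LoopGuard(max_loops=int(os.environ.get("MAX_LOOPS", "25")))


# Command names of the first seven actions in the test log, in order.
LOG_COMMANDS = [
    "google", "browse_website", "google", "browse_website",
    "google", "write_to_file", "google",
]
```

With `max_repeats=3`, feeding `LOG_COMMANDS` through `check` returns a stop reason on the seventh action (the fourth `google` call), one step before the log's own `Loop iteration 8 detected` warning. Counting command names is a deliberately crude heuristic; comparing arguments or reasoning text would reduce false positives on legitimately repeated commands.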
Review comment: `hash` is not a valid field.