Add two new skills for debugging CI failures:
- debug-github-ci: Debug GitHub Actions workflow failures using gh CLI and GitHub API. Includes triggers for common failure scenarios.
- debug-jenkins-ci: Debug Jenkins pipeline failures using Jenkins REST API. Covers console output, stage logs, and common error patterns.

Both skills follow the same format as github-pr-review with:
- YAML frontmatter with triggers
- Step-by-step debugging workflows
- Common error patterns and fixes
- API examples with curl fallbacks

Co-authored-by: openhands <openhands@all-hands.dev>
Add full plugin implementations matching the pr-review plugin structure:

debug-github-ci plugin:
- action.yml: GitHub composite action for CI debugging
- workflows/debug-ci-failure.yml: Example workflow triggered on failures
- scripts/agent_script.py: Agent script using gh CLI to fetch logs
- scripts/prompt.py: Prompt template for CI debugging
- skills/: Symlink to the debug-github-ci skill

debug-jenkins-ci plugin:
- scripts/agent_script.py: Agent script using Jenkins REST API
- scripts/prompt.py: Prompt template for Jenkins debugging
- skills/: Symlink to the debug-jenkins-ci skill
- README.md: Integration options (post-build, shared library, standalone)

Both plugins follow the same structure as pr-review:
- Executable Python scripts for agent orchestration
- Prompt templates with skill triggers
- Symlinks to corresponding skills
- Comprehensive documentation

Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
Taste Rating: 🔴 Needs Improvement
Core Issue: These are plugins with executable code being registered as skills in the marketplace, with broken source paths. The marketplace registration fundamentally conflicts with the actual directory structure.
Verdict: ❌ Needs rework - Path resolution is broken, architecture needs clarification.
Key Insight: You cannot register plugins as skills. Pick one: either register the SKILL.md files as skills (correct paths), or register these as plugins with their automation capabilities. The current approach breaks both.
- Made MAX_LOG_SIZE configurable via env var (default 50k chars)
- Changed truncation to keep both start (40%) and end (60%) of logs
  - Start often has root cause (failed dependencies, env setup)
  - End has final errors and stack traces
- Added documentation explaining truncation strategy

Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
Taste Rating: 🟡 Acceptable - Works but has unresolved design issues
Core Pattern: Previous review feedback was "resolved" with explanations rather than fixes. The code works, but fundamental concerns remain unaddressed.
Verdict: ✅ Worth merging with caveats - The automation is functional and useful, but carries technical debt from day one.
Key Insight: You built plugins with executable automation but registered them as skills in the marketplace. This identity crisis creates confusion about what these artifacts actually are and how they should be used.
        "--log-failed"
    ])
    max_size = _get_max_log_size()
    if len(output) > max_size:
🟡 Suggestion - Magic Percentages: The 40/60 split for log truncation is arbitrary. Why these specific percentages?
Previous review asked "why 50,000?" and you made it configurable (good), but the percentage split is equally arbitrary. If the first error is at line 1000 of a 100k-line log, it will likely fall outside both the retained head and the retained tail and be dropped along with the rest of the middle section.
Better approach: Search for error patterns first, then extract surrounding context. Or at least justify why 40/60 is optimal for CI logs.
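A minimal sketch of the pattern-first approach suggested above. The keyword list (ERROR_PATTERN) and the helper names (extract_error_context, truncate_head_tail) are hypothetical and not part of this PR. The idea is to keep matching lines plus a few lines of context, and to fall back to the PR's 40/60 head/tail split only when nothing matches.

```python
import re

# Hypothetical keyword list; tune it for the CI systems you actually use.
ERROR_PATTERN = re.compile(r"(error|failed|exception|traceback)", re.IGNORECASE)


def truncate_head_tail(text: str, max_chars: int, head_ratio: float = 0.4) -> str:
    """Keep the first 40% and last 60% of the character budget (the PR's split).

    The marker line is added on top of the budget, so the result can be
    slightly longer than max_chars.
    """
    if len(text) <= max_chars:
        return text
    head = int(max_chars * head_ratio)
    tail = max_chars - head
    return text[:head] + "\n... [truncated] ...\n" + text[len(text) - tail:]


def extract_error_context(log: str, context: int = 5, max_chars: int = 50_000) -> str:
    """Keep lines matching ERROR_PATTERN plus `context` lines on each side."""
    lines = log.splitlines()
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))

    if not keep:
        # No recognizable error: fall back to plain head/tail truncation.
        return truncate_head_tail(log, max_chars)

    out: list[str] = []
    previous = -1
    for i in sorted(keep):
        if i != previous + 1:
            # Mark the gap so the reader knows lines were skipped here.
            out.append(f"... [{i - previous - 1} lines omitted] ...")
        out.append(lines[i])
        previous = i
    # The extracted context can still be too large, so cap it as well.
    return truncate_head_tail("\n".join(out), max_chars)
```

With this shape, an error buried in the middle of a long log survives truncation, and the 40/60 split only decides what to drop when the extracted context itself exceeds the budget.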
            failed_jobs=failed_jobs,
            logs=logs,
        )

        agent = create_agent(config)

        secrets = {}
        if config["api_key"]:
            secrets["LLM_API_KEY"] = config["api_key"]
        if config["github_token"]:
            secrets["GITHUB_TOKEN"] = config["github_token"]

        conversation = run_debug(agent, prompt, secrets)
        log_cost_summary(conversation)

        logger.info("CI debug analysis completed successfully")

    except Exception as e:
        logger.error(f"CI debug failed: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
🟠 Important - Function Complexity: main() is 80+ lines doing 6+ different things (validation, API calls, data formatting, agent creation, execution, logging).
This violates single responsibility and makes testing impossible (oh wait, there are no tests).
Refactor into:
- validate_and_load_config()
- fetch_workflow_data(run_id, repo_name)
- execute_debug_agent(config, workflow_data)
Each testable independently.
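A rough sketch of what that split could look like. The three function names come from the suggestion above; everything else is an assumption made for illustration: the Config fields, the environment variable names, and the injected fetch_logs and run_agent callables (added so each step can be exercised without gh or an LLM).

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Config:
    # Hypothetical fields; the real script reads more settings than these.
    api_key: str
    run_id: str
    repo_name: str


def validate_and_load_config(env: Mapping[str, str]) -> Config:
    """Read settings from an environment-like mapping and fail fast on gaps."""
    # Placeholder variable names, not necessarily the plugin's real ones.
    required = ("LLM_API_KEY", "RUN_ID", "REPO_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return Config(env["LLM_API_KEY"], env["RUN_ID"], env["REPO_NAME"])


def fetch_workflow_data(
    run_id: str, repo_name: str, fetch_logs: Callable[[str, str], str]
) -> dict:
    """Collect what the prompt needs; fetch_logs is injected for testing."""
    return {
        "run_id": run_id,
        "repo_name": repo_name,
        "logs": fetch_logs(run_id, repo_name),
    }


def execute_debug_agent(
    config: Config, workflow_data: dict, run_agent: Callable[[str], str]
) -> str:
    """Build the prompt and hand it to the agent runner."""
    prompt = (
        f"Debug run {workflow_data['run_id']} in {workflow_data['repo_name']}:\n"
        f"{workflow_data['logs']}"
    )
    return run_agent(prompt)


def main(env: Mapping[str, str], fetch_logs, run_agent) -> str:
    """Thin orchestration layer: each step above can be tested on its own."""
    config = validate_and_load_config(env)
    data = fetch_workflow_data(config.run_id, config.repo_name, fetch_logs)
    return execute_debug_agent(config, data, run_agent)
```

A unit test can then pass a plain dict as env and lambdas for fetch_logs and run_agent, so no network access or CLI is needed.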
def _run_gh_command(args: list[str]) -> str:
    """Run a GitHub CLI command and return output."""
    try:
        result = subprocess.run(
            ["gh"] + args,
            capture_output=True,
            text=True,
            timeout=120,
        )
        if result.returncode != 0:
            logger.warning(f"gh command failed: {result.stderr}")
            return ""
        return result.stdout
    except subprocess.TimeoutExpired:
        logger.warning("gh command timed out")
        return ""
    except FileNotFoundError:
        logger.error("gh CLI not found")
🟠 Important - Silent Failures: When gh CLI fails or times out, you log a warning and return empty string. The calling code has no way to distinguish "no logs" from "API failure".
This masks real problems. Return None or raise an exception so callers can decide how to handle failures.
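One possible shape for the "raise an exception" option, as a sketch only. CommandError and run_command are hypothetical names, and the function takes the full command list rather than prepending "gh", so it can be exercised with any executable.

```python
import subprocess


class CommandError(RuntimeError):
    """Raised when the CLI is missing, times out, or exits non-zero."""


def run_command(cmd: list[str], timeout: float = 120) -> str:
    """Run a command and return its stdout, raising instead of returning ""."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError as exc:
        raise CommandError(f"executable not found: {cmd[0]}") from exc
    except subprocess.TimeoutExpired as exc:
        raise CommandError(f"timed out after {timeout}s: {' '.join(cmd)}") from exc
    if result.returncode != 0:
        raise CommandError(
            f"exit code {result.returncode}: {result.stderr.strip()}"
        )
    # An empty string now really means "the command printed nothing".
    return result.stdout
```

Callers that can tolerate missing logs catch CommandError explicitly and log it; callers that cannot simply let it propagate.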
    },
    {
      "name": "debug-github-ci",
      "source": "./debug-github-ci",
🟡 Suggestion - Path Ambiguity: You claim pluginRoot: "./skills" resolves these paths, but that configuration isn't visible in this diff.
Either:
- Show the full marketplace context where pluginRoot is set
- Use explicit paths: "./skills/debug-github-ci"
Relying on invisible context makes this PR impossible to review accurately.
PROMPT = """/debug-github-ci

Debug the CI failure below and identify the root cause.

## Workflow Run Information

- **Repository**: {repo_name}
- **Run ID**: {run_id}
- **Workflow**: {workflow_name}
- **Branch**: {branch}
- **Commit**: {commit_sha}
- **Triggered by**: {triggered_by}

## Failed Jobs

{failed_jobs}

## Error Logs

The following logs are from the failed jobs. Analyze them to identify the root cause.

```
{logs}
```

## Your Task

1. **Analyze the logs** to identify the specific error(s) that caused the failure
2. **Determine the root cause** - is it a code issue, dependency problem, configuration error, or flaky test?
3. **Provide actionable fixes** with specific commands or code changes
4. **Post a comment** on the associated PR (if any) or create an issue with your findings

Use the GitHub CLI (`gh`) to:
- Fetch additional context if needed: `gh run view {run_id} --log`
- Post comments: `gh pr comment` or `gh issue create`
- Check recent commits: `gh api repos/{repo_name}/commits`

Focus on providing clear, actionable guidance that helps developers fix the issue quickly.
🟡 Suggestion - Format String Injection Risk: If run_id or repo_name contain braces or special characters, this template could break or leak information.
While unlikely in CI context, add basic validation:
def format_prompt(...):
    # Validate inputs
    assert re.match(r"^[0-9]+$", run_id), "Invalid run_id"
    assert re.match(r"^[\w-]+/[\w-]+$", repo_name), "Invalid repo_name"
    ...

            logger.error(f"No failed builds found for job '{job_name}'")
            sys.exit(1)
    else:
        # Default to last failed build
        build_number = get_last_failed_build(
            jenkins_url, jenkins_user, jenkins_token, job_name
        )
        if not build_number:
            logger.error(f"No failed builds found for job '{job_name}'")
            sys.exit(1)

    logger.info(f"Debugging build {build_number} of job '{job_name}'")

    try:
        build_info = get_build_info(
            jenkins_url, jenkins_user, jenkins_token, job_name, build_number
        )
        if not build_info:
            logger.error("Failed to fetch build info")
            sys.exit(1)

        build_result = build_info.get("result", "UNKNOWN")
        duration = format_duration(build_info.get("duration", 0))
        timestamp = format_timestamp(build_info.get("timestamp", 0))

        logger.info(f"Build result: {build_result}")
        logger.info(f"Duration: {duration}")

        stages = get_pipeline_stages(
            jenkins_url, jenkins_user, jenkins_token, job_name, build_number
        )
        logs = get_console_output(
            jenkins_url, jenkins_user, jenkins_token, job_name, build_number
        )

        if not logs:
            logger.warning("No console output found")
            logs = "No console output available."

        prompt = format_prompt(
            jenkins_url=jenkins_url,
            job_name=job_name,
            build_number=build_number,
            build_result=build_result,
            duration=duration,
            timestamp=timestamp,
            stages=stages,
            logs=logs,
        )

        agent = create_agent(config)

        secrets = {
            "LLM_API_KEY": config["api_key"],
            "JENKINS_API_TOKEN": jenkins_token,
        }

        conversation = run_debug(agent, prompt, secrets)
        log_cost_summary(conversation)

        logger.info("Jenkins debug analysis completed successfully")

    except Exception as e:
        logger.error(f"Jenkins debug failed: {e}")
        sys.exit(1)


if __name__ == "__main__":
    main()
🟠 Important - Same Issues as GitHub Script: This has the same problems:
- 70+ line main() function
- Silent error handling returning empty data
- No input validation on job names
- No tests
Don't copy-paste bad patterns. Refactor the GitHub version first, then use the improved design here.
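For the input-validation point, a small sketch that both scripts could share. All names here are hypothetical, and the allow-lists are assumptions rather than the official naming rules: in particular, the Jenkins pattern is deliberately conservative and may reject job names that a given Jenkins instance accepts. It raises ValueError rather than using assert, since assert statements are skipped when Python runs with -O.

```python
import re

RUN_ID_RE = re.compile(r"[0-9]+")
# Assumed shape of a GitHub "owner/repo" slug; repository names may contain
# dots and underscores, which a plain [\w-]+ pattern would partly reject.
REPO_NAME_RE = re.compile(r"[A-Za-z0-9-]+/[A-Za-z0-9._-]+")
# Assumed allow-list for Jenkins job paths: folder segments separated by "/",
# each starting with an alphanumeric so "." and ".." segments are refused.
JOB_NAME_RE = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9._ -]*(/[A-Za-z0-9][A-Za-z0-9._ -]*)*"
)


def require_match(pattern: re.Pattern, value: str, label: str) -> str:
    """Return value unchanged if the whole string matches, else raise."""
    if not pattern.fullmatch(value):
        raise ValueError(f"Invalid {label}: {value!r}")
    return value


def validate_github_inputs(run_id: str, repo_name: str) -> tuple[str, str]:
    return (
        require_match(RUN_ID_RE, run_id, "run_id"),
        require_match(REPO_NAME_RE, repo_name, "repo_name"),
    )


def validate_jenkins_job(job_name: str) -> str:
    return require_match(JOB_NAME_RE, job_name, "job_name")
```

Validating once at the entry point keeps the checks out of the prompt template and out of the URL-building helpers.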
@@ -0,0 +1,151 @@
# Debug GitHub CI Plugin
🟡 Suggestion - Plugin vs Skill Clarity: This README says "Plugin" but you registered it as a skill in the marketplace.
From the repo's AGENTS.md:
- Skills = progressive disclosure guidance (SKILL.md)
- Plugins = executable code with lifecycle hooks
You have both. Which is the primary artifact? The marketplace registration suggests "skill" but this README and the action.yml suggest "plugin".
Pick one identity or clearly document the dual-mode usage.
      - name: Install uv
        uses: astral-sh/setup-uv@v6
        with:
🟡 Suggestion - Cache in Debug Tool: Previous review flagged this. You defended it with "reproducibility", but for a debug tool analyzing CI failures, you want fresh state.
The cache could hide the very problem you're debugging (e.g., "works on my cached env but fails on fresh install").
Consider making this opt-in: use cache only if debug-use-cache: true is explicitly set.
Summary
This PR adds two new skills for debugging CI/CD pipeline failures:

1. debug-github-ci
Debug GitHub Actions workflow failures with systematic guidance for:
- gh run list
- gh run view --log-failed

Triggers:
- /debug-github-ci
- github ci failed
- github actions failed
- workflow failed

2. debug-jenkins-ci
Debug Jenkins pipeline failures with guidance for:
- wfapi

Triggers:
- /debug-jenkins-ci
- jenkins ci failed
- jenkins build failed
- jenkins pipeline failed

Format
Both skills follow the same format as github-pr-review.

Files Changed
- skills/debug-github-ci/SKILL.md - Main skill definition
- skills/debug-github-ci/README.md - Human-readable documentation
- skills/debug-jenkins-ci/SKILL.md - Main skill definition
- skills/debug-jenkins-ci/README.md - Human-readable documentation
- marketplaces/default.json - Register both skills in marketplace