Add debug-github-ci and debug-jenkins-ci skills by neubig · Pull Request #76 · OpenHands/extensions

neubig · 2026-02-26T20:14:41Z

Summary

This PR adds two new skills for debugging CI/CD pipeline failures:

1. `debug-github-ci`

Debug GitHub Actions workflow failures with systematic guidance for:

Finding failed runs using `gh run list`
Fetching logs with `gh run view --log-failed`
Common error patterns and their fixes
Rerunning failed jobs
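
As a rough illustration of the commands listed above, the helpers below only build `gh` argument lists and do not run anything. The function names and the chosen `--json` fields are assumptions made for this sketch, not taken from the skill itself.

```python
def list_failed_runs_cmd(limit: int = 5) -> list[str]:
    """Arguments for listing recent failed workflow runs as JSON."""
    return [
        "gh", "run", "list",
        "--status", "failure",
        "--limit", str(limit),
        "--json", "databaseId,workflowName,headBranch,conclusion",
    ]


def failed_logs_cmd(run_id: int) -> list[str]:
    """Arguments for fetching only the logs of failed steps."""
    return ["gh", "run", "view", str(run_id), "--log-failed"]


def rerun_failed_cmd(run_id: int) -> list[str]:
    """Arguments for rerunning only the failed jobs of a run."""
    return ["gh", "run", "rerun", str(run_id), "--failed"]
```

Each list can be passed directly to `subprocess.run`, which avoids shell quoting issues.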

Triggers:

/debug-github-ci
github ci failed
github actions failed
workflow failed

2. `debug-jenkins-ci`

Debug Jenkins pipeline failures with guidance for:

Fetching build status and console output via Jenkins REST API
Pipeline stage debugging with wfapi
Common Jenkins error patterns
Jenkinsfile validation and debugging
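
A minimal sketch of the endpoints this guidance relies on, assuming a standard Jenkins URL layout in which folders are nested as `job/<folder>/job/<name>`. The helper names are hypothetical, and the `wfapi/describe` endpoint is only available when the Pipeline REST API (Stage View) plugin is installed.

```python
from urllib.parse import quote


def job_path(job_name: str) -> str:
    """Turn 'folder/job' into 'job/folder/job/job', URL-quoting each part."""
    parts = [p for p in job_name.split("/") if p]
    return "/".join(f"job/{quote(p, safe='')}" for p in parts)


def build_urls(base_url: str, job_name: str, build_number: int) -> dict[str, str]:
    """Return REST endpoints commonly used when inspecting a single build."""
    root = f"{base_url.rstrip('/')}/{job_path(job_name)}/{build_number}"
    return {
        "status": f"{root}/api/json",        # build result, duration, timestamp
        "console": f"{root}/consoleText",    # plain-text console output
        "stages": f"{root}/wfapi/describe",  # per-stage status (plugin-dependent)
    }
```

The resulting URLs can be fetched with `curl --user "$USER:$TOKEN" <url>` or any HTTP client that supports basic authentication.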

Triggers:

/debug-jenkins-ci
jenkins ci failed
jenkins build failed
jenkins pipeline failed

Format

Both skills follow the same format as github-pr-review:

YAML frontmatter with name, description, and triggers
Step-by-step debugging workflows
Common error pattern tables
API examples with curl fallbacks
Debugging checklists

Files Changed

skills/debug-github-ci/SKILL.md - Main skill definition
skills/debug-github-ci/README.md - Human-readable documentation
skills/debug-jenkins-ci/SKILL.md - Main skill definition
skills/debug-jenkins-ci/README.md - Human-readable documentation
marketplaces/default.json - Register both skills in marketplace

Add two new skills for debugging CI failures:

- debug-github-ci: Debug GitHub Actions workflow failures using gh CLI and GitHub API. Includes triggers for common failure scenarios.
- debug-jenkins-ci: Debug Jenkins pipeline failures using Jenkins REST API. Covers console output, stage logs, and common error patterns.

Both skills follow the same format as github-pr-review with:

- YAML frontmatter with triggers
- Step-by-step debugging workflows
- Common error patterns and fixes
- API examples with curl fallbacks

Co-authored-by: openhands <openhands@all-hands.dev>

Add full plugin implementations matching the pr-review plugin structure:

debug-github-ci plugin:

- action.yml: GitHub composite action for CI debugging
- workflows/debug-ci-failure.yml: Example workflow triggered on failures
- scripts/agent_script.py: Agent script using gh CLI to fetch logs
- scripts/prompt.py: Prompt template for CI debugging
- skills/: Symlink to the debug-github-ci skill

debug-jenkins-ci plugin:

- scripts/agent_script.py: Agent script using Jenkins REST API
- scripts/prompt.py: Prompt template for Jenkins debugging
- skills/: Symlink to the debug-jenkins-ci skill
- README.md: Integration options (post-build, shared library, standalone)

Both plugins follow the same structure as pr-review:

- Executable Python scripts for agent orchestration
- Prompt templates with skill triggers
- Symlinks to corresponding skills
- Comprehensive documentation

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

Taste Rating: 🔴 Needs Improvement

Core Issue: These are plugins with executable code being registered as skills in the marketplace, with broken source paths. The marketplace registration fundamentally conflicts with the actual directory structure.

Verdict: ❌ Needs rework - Path resolution is broken, architecture needs clarification.

Key Insight: You cannot register plugins as skills. Pick one: either register the SKILL.md files as skills (correct paths), or register these as plugins with their automation capabilities. The current approach breaks both.

marketplaces/default.json

plugins/debug-github-ci/scripts/agent_script.py

plugins/debug-jenkins-ci/scripts/agent_script.py

plugins/debug-github-ci/action.yml

plugins/debug-github-ci/skills/debug-github-ci

skills/debug-github-ci/SKILL.md

plugins/debug-github-ci/scripts/agent_script.py

- Made MAX_LOG_SIZE configurable via env var (default 50k chars)
- Changed truncation to keep both start (40%) and end (60%) of logs
  - Start often has root cause (failed dependencies, env setup)
  - End has final errors and stack traces
- Added documentation explaining truncation strategy

Co-authored-by: openhands <openhands@all-hands.dev>
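
The head-and-tail truncation described in this commit could look roughly like the sketch below. It is not the code from the PR: the function name, the `head_ratio` parameter, and the marker text are assumptions chosen to mirror the 40/60 split and the 50k-character default.

```python
def truncate_log(text: str, max_size: int = 50_000, head_ratio: float = 0.4) -> str:
    """Keep the start and end of an oversized log, dropping the middle."""
    if len(text) <= max_size:
        return text
    head_len = int(max_size * head_ratio)
    tail_len = max_size - head_len
    omitted = len(text) - max_size
    marker = f"\n... [{omitted} characters truncated] ...\n"
    # Slice from an explicit index so that tail_len == 0 yields an empty tail.
    tail = text[len(text) - tail_len:]
    # The result is slightly longer than max_size because of the marker.
    return text[:head_len] + marker + tail
```

Keeping an explicit marker in the output lets a reader, or the agent, see that content was dropped and how much.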

all-hands-bot

Taste Rating: 🟡 Acceptable - Works but has unresolved design issues

Core Pattern: Previous review feedback was "resolved" with explanations rather than fixes. The code works, but fundamental concerns remain unaddressed.

Verdict: ✅ Worth merging with caveats - The automation is functional and useful, but carries technical debt from day one.

Key Insight: You built plugins with executable automation but registered them as skills in the marketplace. This identity crisis creates confusion about what these artifacts actually are and how they should be used.

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-github-ci/scripts/agent_script.py

+        "--log-failed"
+    ])
+    max_size = _get_max_log_size()
+    if len(output) > max_size:


🟡 Suggestion - Magic Percentages: The 40/60 split for log truncation is arbitrary. Why these specific percentages?

Previous review asked "why 50,000?" and you made it configurable (good), but the percentage split is equally arbitrary. If the first error is at line 1000 in a 100k line log, you lose 50% of the critical middle section.

Better approach: Search for error patterns first, then extract surrounding context. Or at least justify why 40/60 is optimal for CI logs.
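
One possible shape for the "search for error patterns first" idea is sketched below. The pattern list and the window sizes are hypothetical and would need tuning against real CI logs.

```python
import re

# Hypothetical patterns; real CI logs may need a different set.
ERROR_PATTERN = re.compile(r"error|failed|exception|traceback", re.IGNORECASE)


def extract_error_context(log: str, before: int = 5, after: int = 10) -> str:
    """Return the lines around each pattern match, merging overlapping windows."""
    lines = log.splitlines()
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            keep.update(range(max(0, i - before), min(len(lines), i + after + 1)))

    out: list[str] = []
    prev = None
    for i in sorted(keep):
        if prev is not None and i != prev + 1:
            out.append("...")  # marks a gap between non-adjacent windows
        out.append(lines[i])
        prev = i
    return "\n".join(out)
```

When nothing matches, the function returns an empty string, so a caller would still need a fallback such as plain head-and-tail truncation.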

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-github-ci/scripts/agent_script.py

+            failed_jobs=failed_jobs,
+            logs=logs,
+        )
+
+        agent = create_agent(config)
+
+        secrets = {}
+        if config["api_key"]:
+            secrets["LLM_API_KEY"] = config["api_key"]
+        if config["github_token"]:
+            secrets["GITHUB_TOKEN"] = config["github_token"]
+
+        conversation = run_debug(agent, prompt, secrets)
+        log_cost_summary(conversation)
+
+        logger.info("CI debug analysis completed successfully")
+
+    except Exception as e:
+        logger.error(f"CI debug failed: {e}")
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()


🟠 Important - Function Complexity: main() is 80+ lines doing 6+ different things (validation, API calls, data formatting, agent creation, execution, logging).

This violates single responsibility and makes testing impossible (oh wait, there are no tests).

Refactor into:

validate_and_load_config()

fetch_workflow_data(run_id, repo_name)

execute_debug_agent(config, workflow_data)

Each testable independently.
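
A hedged sketch of how that split might look follows. The three function names come from the comment above; everything else (the `WorkflowData` type, the `RUN_ID` and `REPO_NAME` variable names, and the injected `fetch_logs` and `run_agent` callables) is an assumption made so the pieces can be exercised without the gh CLI or an LLM.

```python
from dataclasses import dataclass


@dataclass
class WorkflowData:
    run_id: str
    repo_name: str
    logs: str


def validate_and_load_config(env: dict[str, str]) -> dict[str, str]:
    """Read required settings, failing early with a clear message."""
    required = ("LLM_API_KEY", "GITHUB_TOKEN", "RUN_ID", "REPO_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}


def fetch_workflow_data(run_id: str, repo_name: str, fetch_logs) -> WorkflowData:
    """Collect run data; fetch_logs is injected so tests can stub the gh CLI."""
    return WorkflowData(run_id, repo_name, logs=fetch_logs(run_id, repo_name))


def execute_debug_agent(config: dict[str, str], data: WorkflowData, run_agent) -> str:
    """Build the prompt and hand it to the agent; run_agent is injected too."""
    prompt = f"Debug run {data.run_id} in {data.repo_name}:\n{data.logs}"
    return run_agent(prompt, config)


def main(env: dict[str, str], fetch_logs, run_agent) -> str:
    """Thin orchestration layer: each step above can be tested on its own."""
    config = validate_and_load_config(env)
    data = fetch_workflow_data(config["RUN_ID"], config["REPO_NAME"], fetch_logs)
    return execute_debug_agent(config, data, run_agent)
```

In production the entry point would pass `dict(os.environ)` and the real fetch and agent functions; a test can pass plain lambdas instead.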

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-github-ci/scripts/agent_script.py

+
+def _run_gh_command(args: list[str]) -> str:
+    """Run a GitHub CLI command and return output."""
+    try:
+        result = subprocess.run(
+            ["gh"] + args,
+            capture_output=True,
+            text=True,
+            timeout=120,
+        )
+        if result.returncode != 0:
+            logger.warning(f"gh command failed: {result.stderr}")
+            return ""
+        return result.stdout
+    except subprocess.TimeoutExpired:
+        logger.warning("gh command timed out")
+        return ""
+    except FileNotFoundError:
+        logger.error("gh CLI not found")


🟠 Important - Silent Failures: When gh CLI fails or times out, you log a warning and return empty string. The calling code has no way to distinguish "no logs" from "API failure".

This masks real problems. Return None or raise an exception so callers can decide how to handle failures.
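
One way to surface these failures is sketched below, under the assumption that callers prefer an exception over a sentinel value. `CommandError` and `run_command` are hypothetical names, and the command is passed in full so the helper is not tied to `gh`.

```python
import subprocess


class CommandError(RuntimeError):
    """Raised when an external command cannot be run or exits non-zero."""


def run_command(cmd: list[str], timeout: int = 120) -> str:
    """Run a command and return its stdout, raising CommandError on any failure."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise CommandError(f"{cmd[0]} timed out after {timeout}s") from exc
    except FileNotFoundError as exc:
        raise CommandError(f"{cmd[0]} not found on PATH") from exc
    if result.returncode != 0:
        raise CommandError(
            f"{cmd[0]} exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout
```

With this shape, an empty string returned from `run_command(["gh", "run", "view", run_id, "--log-failed"])` really does mean "no logs", and the caller decides whether a `CommandError` is fatal.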

all-hands-bot · 2026-03-02T03:43:11Z

marketplaces/default.json

+        },
+        {
+            "name": "debug-github-ci",
+            "source": "./debug-github-ci",


🟡 Suggestion - Path Ambiguity: You claim pluginRoot: "./skills" resolves these paths, but that configuration isn't visible in this diff.

Either:

Show the full marketplace context where pluginRoot is set

Use explicit paths: "./skills/debug-github-ci"

Relying on invisible context makes this PR impossible to review accurately.

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-github-ci/scripts/prompt.py

+PROMPT = """/debug-github-ci
+
+Debug the CI failure below and identify the root cause.
+
+## Workflow Run Information
+
+- **Repository**: {repo_name}
+- **Run ID**: {run_id}
+- **Workflow**: {workflow_name}
+- **Branch**: {branch}
+- **Commit**: {commit_sha}
+- **Triggered by**: {triggered_by}
+
+## Failed Jobs
+
+{failed_jobs}
+
+## Error Logs
+
+The following logs are from the failed jobs. Analyze them to identify the root cause.
+
+```
+{logs}
+```
+
+## Your Task
+
+1. **Analyze the logs** to identify the specific error(s) that caused the failure
+2. **Determine the root cause** - is it a code issue, dependency problem, configuration error, or flaky test?
+3. **Provide actionable fixes** with specific commands or code changes
+4. **Post a comment** on the associated PR (if any) or create an issue with your findings
+
+Use the GitHub CLI (`gh`) to:
+- Fetch additional context if needed: `gh run view {run_id} --log`
+- Post comments: `gh pr comment` or `gh issue create`
+- Check recent commits: `gh api repos/{repo_name}/commits`
+
+Focus on providing clear, actionable guidance that helps developers fix the issue quickly.


🟡 Suggestion - Format String Injection Risk: If run_id or repo_name contain braces or special characters, this template could break or leak information.

While unlikely in CI context, add basic validation:

    def format_prompt(...):
        # Validate inputs
        assert re.match(r"^[0-9]+$", run_id), "Invalid run_id"
        assert re.match(r"^[\w-]+/[\w-]+$", repo_name), "Invalid repo_name"
        ...
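
A slightly more defensive variant of that validation is sketched here, with hypothetical names. It differs from the suggestion in three ways. It raises `ValueError` because `assert` statements are skipped when Python runs with `-O`. It uses `fullmatch` because `$` also matches just before a trailing newline. It allows dots in the repository part, which is an assumption about acceptable names. Note also that `str.format` does not re-scan substituted values for braces, so this check guards mainly against unexpected values rather than template breakage.

```python
import re

RUN_ID_RE = re.compile(r"[0-9]+")
# Assumption: owner/repo, where either part may also contain dots.
REPO_NAME_RE = re.compile(r"[\w.-]+/[\w.-]+")


def validate_inputs(run_id: str, repo_name: str) -> None:
    """Raise ValueError for values that do not look like a run ID or owner/repo."""
    if not RUN_ID_RE.fullmatch(run_id):
        raise ValueError(f"Invalid run_id: {run_id!r}")
    if not REPO_NAME_RE.fullmatch(repo_name):
        raise ValueError(f"Invalid repo_name: {repo_name!r}")
```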

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-jenkins-ci/scripts/agent_script.py

+            logger.error(f"No failed builds found for job '{job_name}'")
+            sys.exit(1)
+    else:
+        # Default to last failed build
+        build_number = get_last_failed_build(
+            jenkins_url, jenkins_user, jenkins_token, job_name
+        )
+        if not build_number:
+            logger.error(f"No failed builds found for job '{job_name}'")
+            sys.exit(1)
+
+    logger.info(f"Debugging build {build_number} of job '{job_name}'")
+
+    try:
+        build_info = get_build_info(
+            jenkins_url, jenkins_user, jenkins_token, job_name, build_number
+        )
+        if not build_info:
+            logger.error("Failed to fetch build info")
+            sys.exit(1)
+
+        build_result = build_info.get("result", "UNKNOWN")
+        duration = format_duration(build_info.get("duration", 0))
+        timestamp = format_timestamp(build_info.get("timestamp", 0))
+
+        logger.info(f"Build result: {build_result}")
+        logger.info(f"Duration: {duration}")
+
+        stages = get_pipeline_stages(
+            jenkins_url, jenkins_user, jenkins_token, job_name, build_number
+        )
+        logs = get_console_output(
+            jenkins_url, jenkins_user, jenkins_token, job_name, build_number
+        )
+
+        if not logs:
+            logger.warning("No console output found")
+            logs = "No console output available."
+
+        prompt = format_prompt(
+            jenkins_url=jenkins_url,
+            job_name=job_name,
+            build_number=build_number,
+            build_result=build_result,
+            duration=duration,
+            timestamp=timestamp,
+            stages=stages,
+            logs=logs,
+        )
+
+        agent = create_agent(config)
+
+        secrets = {
+            "LLM_API_KEY": config["api_key"],
+            "JENKINS_API_TOKEN": jenkins_token,
+        }
+
+        conversation = run_debug(agent, prompt, secrets)
+        log_cost_summary(conversation)
+
+        logger.info("Jenkins debug analysis completed successfully")
+
+    except Exception as e:
+        logger.error(f"Jenkins debug failed: {e}")
+        sys.exit(1)
+
+
+if __name__ == "__main__":
+    main()


🟠 Important - Same Issues as GitHub Script: This has the same problems:

70+ line main() function

Silent error handling returning empty data

No input validation on job names

No tests

Don't copy-paste bad patterns. Refactor the GitHub version first, then use the improved design here.
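
For the missing job-name validation, a minimal sketch might look like the following. The allowed character set is a deliberately conservative assumption, since Jenkins itself permits more characters, and `validate_job_name` is a hypothetical helper.

```python
import re

# Assumption: each folder/job segment is limited to a conservative character set.
_SEGMENT_RE = re.compile(r"[A-Za-z0-9_.\- ]+")


def validate_job_name(job_name: str) -> list[str]:
    """Split 'folder/job' into segments, rejecting empty or suspicious ones."""
    segments = job_name.split("/")
    for segment in segments:
        if segment in ("", ".", "..") or not _SEGMENT_RE.fullmatch(segment):
            raise ValueError(f"Invalid Jenkins job name: {job_name!r}")
    return segments
```

Rejecting `..`, empty segments, and characters such as `?` or `#` before building a URL keeps a malformed job name from silently pointing the request at a different endpoint.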

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-github-ci/README.md

@@ -0,0 +1,151 @@
+# Debug GitHub CI Plugin


🟡 Suggestion - Plugin vs Skill Clarity: This README says "Plugin" but you registered it as a skill in the marketplace.

From the repo's AGENTS.md:

Skills = progressive disclosure guidance (SKILL.md)

Plugins = executable code with lifecycle hooks

You have both. Which is the primary artifact? The marketplace registration suggests "skill" but this README and the action.yml suggest "plugin".

Pick one identity or clearly document the dual-mode usage.

all-hands-bot · 2026-03-02T03:43:11Z

plugins/debug-github-ci/action.yml

+
+        - name: Install uv
+          uses: astral-sh/setup-uv@v6
+          with:


🟡 Suggestion - Cache in Debug Tool: Previous review flagged this. You defended it with "reproducibility", but for a debug tool analyzing CI failures, you want fresh state.

The cache could hide the very problem you're debugging (e.g., "works on my cached env but fails on fresh install").

Consider making this opt-in: use cache only if debug-use-cache: true is explicitly set.

openhands-agent added 2 commits February 26, 2026 20:14

neubig marked this pull request as ready for review March 1, 2026 13:17

all-hands-bot reviewed Mar 1, 2026

neubig marked this pull request as draft March 1, 2026 13:28

neubig marked this pull request as ready for review March 2, 2026 03:40

all-hands-bot reviewed Mar 2, 2026

3 participants
