@Logicmn Logicmn commented Jul 11, 2025

Description

Hello! The current parser implementation for GitHub code scanning results is baked into the "Github Vulnerability Scan" scan type, a parser originally meant for GitHub SCA (Dependabot) vulnerabilities. Since these two scan types are fundamentally different, issues can arise, especially around the fields used for deduplication in the hash code. This PR splits GitHub code scanning out into its own GithubSASTParser, with a scan-type string of "Github SAST Scan". I have included documentation, unit tests, and a new list of fields for hash code deduplication.

I also included several improvements for the original Github Vulnerability Scan parser. These improvements include:

  • Add support for the cvssSeverities field, which will replace the cvss field in GitHub's GraphQL response in October 2025.
  • Add the permalink from the dependabotUpdate field to the finding description.
  • Map GitHub's newly supported EPSS percentage and percentile to the finding.epss_score and finding.epss_percentile fields.
  • Set finding.url to the GitHub Dependabot alert hyperlink for convenience.
  • Improve vulnerability ID handling (now explicitly sets the finding.cve and finding.vuln_id_from_tool fields before falling back to unsaved_vulnerability_ids).
  • Fix a bug where finding.component_version was only set when the vulnerableRequirements string started with "=".
  • Improve defensive coding where applicable, e.g. using .get() to access fields.
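The component_version fix in the list above can be sketched roughly as follows. This is an illustrative reconstruction, not the PR's exact code; the vulnerableRequirements field name comes from GitHub's GraphQL response, but the helper name and stripping logic here are assumptions:

```python
def parse_component_version(vulnerable_requirements):
    """Extract a version string from a requirement like "= 1.2.3" or ">= 2.0.1".

    The old behavior only handled strings starting with "="; stripping the
    full set of comparison operators handles the other requirement forms too.
    """
    if not vulnerable_requirements:
        return None
    # Strip any leading comparison operator instead of requiring "= ".
    return vulnerable_requirements.lstrip("=<>~^ ").strip() or None
```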

Backward compatibility: existing users of the “Github Vulnerability Scan” scan type (driven by GithubVulnerabilityParser) for SCA imports will see no change. If you’d been using it to ingest SAST/code-scanning JSON, you’ll need to switch your import to the new “Github SAST Scan” scan type (driven by GithubSASTParser).
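For context on the deduplication change mentioned above: per-scanner hash-code fields live in DefectDojo's settings.dist.py under HASHCODE_FIELDS_PER_SCANNER. A hypothetical entry for the new scan type might look like the following; the actual field list is whatever the PR's settings change defines, and the fields shown here are assumptions:

```python
# Hypothetical hash-code configuration for the new scan type; the real
# field list is defined in this PR's settings.dist.py changes.
HASHCODE_FIELDS_PER_SCANNER = {
    "Github SAST Scan": ["title", "file_path", "line"],
}
```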

Ref links:

@github-actions github-actions bot added the settings_changes ("Needs changes to settings.py based on changes in settings.dist.py included in this PR"), docs, unittests, and parser labels Jul 11, 2025

dryrunsecurity bot commented Jul 11, 2025

DryRun Security

This pull request contains an open redirect vulnerability in the GithubSASTParser where a maliciously crafted SAST report could generate a file link pointing to an attacker-controlled domain, potentially leading to an open redirect if the DefectDojo UI renders the description as HTML.

Open Redirect in dojo/tools/github_sast/parser.py
Vulnerability: Open Redirect
Description: The GithubSASTParser constructs a file_link URL using the scheme and network location taken directly from the html_url field in the input SAST report. If a malicious SAST report is uploaded with a crafted html_url (e.g., https://attacker.com/path), the generated file_link will point to the attacker's domain. This file_link is then embedded into the description field of the Finding object as a Markdown-formatted link. If the DefectDojo UI renders this description as HTML, it would create a clickable link to the attacker-controlled domain, leading to an open redirect.
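One possible mitigation (not part of this PR) is to allowlist the host before reusing it to build links. This is a minimal sketch; the ALLOWED_HOSTS set is an assumption and would need to include any GitHub Enterprise hostnames a deployment uses:

```python
from urllib.parse import urlparse

# Assumption: only github.com links are trusted; extend for GitHub Enterprise.
ALLOWED_HOSTS = {"github.com"}

def safe_base_url(html_url):
    """Return "scheme://netloc" only when the host is on the allowlist,
    so a crafted html_url cannot yield an attacker-controlled file_link."""
    parsed = urlparse(html_url or "")
    if parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS:
        return f"{parsed.scheme}://{parsed.netloc}"
    return None
```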

import json

from dojo.models import Finding


class GithubSASTParser:
    def get_scan_types(self):
        return ["Github SAST Scan"]

    def get_label_for_scan_types(self, scan_type):
        return scan_type

    def get_description_for_scan_types(self, scan_type):
        return "GitHub SAST report file can be imported in JSON format."

    def get_findings(self, filename, test):
        data = json.load(filename)
        if not isinstance(data, list):
            error_msg = "Invalid SAST report format, expected a JSON list of alerts."
            raise TypeError(error_msg)
        findings = []
        for vuln in data:
            rule = vuln.get("rule", {})
            inst = vuln.get("most_recent_instance", {})
            loc = inst.get("location", {})
            html_url = vuln.get("html_url")
            rule_id = rule.get("id")
            title = f"{rule.get('description')} ({rule_id})"
            severity = rule.get("security_severity_level", "Info").title()
            active = vuln.get("state") == "open"

            # Build description with context
            desc_lines = []
            if html_url:
                desc_lines.append(f"GitHub Alert: [{html_url}]({html_url})")

            owner = repo = None
            commit_sha = inst.get("commit_sha")
            if html_url:
                from urllib.parse import urlparse
                parsed = urlparse(html_url)
                parts = parsed.path.strip("/").split("/")
                # URL is /<owner>/<repo>/security/... so parts[0]=owner, parts[1]=repo
                if len(parts) >= 2:
                    owner, repo = parts[0], parts[1]

            if owner and repo and commit_sha and loc.get("path") and loc.get("start_line"):
                file_link = (
                    f"{parsed.scheme}://{parsed.netloc}/"
                    f"{owner}/{repo}/blob/{commit_sha}/"
                    f"{loc['path']}#L{loc['start_line']}"
                )
                desc_lines.append(f"Location: [{loc['path']}:{loc['start_line']}]({file_link})")
            elif loc.get("path") and loc.get("start_line"):
                # Fallback if something is missing
                desc_lines.append(f"Location: {loc['path']}:{loc['start_line']}")

            msg = inst.get("message", {}).get("text")
            if msg:
                desc_lines.append(f"Message: {msg}")
            if severity:
                desc_lines.append(f"Rule Severity: {severity}")
            if rule.get("full_description"):
                desc_lines.append(f"Description: {rule.get('full_description')}")
            description = "\n".join(desc_lines)

            finding = Finding(
                title=title,
                test=test,
                description=description,
                severity=severity,
                active=active,
                static_finding=True,
                dynamic_finding=False,
                vuln_id_from_tool=rule_id,
            )
            # File path & line
            finding.file_path = loc.get("path")
            finding.line = loc.get("start_line")
            if html_url:
                finding.url = html_url
            findings.append(finding)
        return findings
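For reference, the input the parser expects is a JSON list of code scanning alerts, each with rule, most_recent_instance, state, and html_url keys, mirroring GitHub's code scanning alerts API. The sample below is illustrative (values are made up) and only demonstrates the shape and the title construction, without importing DefectDojo models:

```python
import io
import json

# Illustrative alert in the shape the parser above expects (a JSON list).
sample = [{
    "rule": {
        "id": "py/sql-injection",
        "description": "SQL injection",
        "security_severity_level": "high",
    },
    "most_recent_instance": {
        "commit_sha": "deadbeef",
        "location": {"path": "app/db.py", "start_line": 42},
        "message": {"text": "User input flows into a SQL query."},
    },
    "state": "open",
    "html_url": "https://github.com/org/repo/security/code-scanning/1",
}]

# The parser receives a file-like object, as in get_findings(filename, test).
data = json.load(io.StringIO(json.dumps(sample)))
assert isinstance(data, list)  # non-list input raises TypeError in the parser
rule = data[0]["rule"]
title = f"{rule['description']} ({rule['id']})"  # → "SQL injection (py/sql-injection)"
```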


All finding details can be found in the DryRun Security Dashboard.

Logicmn commented Jul 14, 2025

@Maffooch All linting errors should be fixed now, thanks for bearing with. :)

@valentijnscholten valentijnscholten added this to the 2.49.0 milestone Jul 15, 2025

@valentijnscholten valentijnscholten left a comment

comment posted above

@valentijnscholten valentijnscholten modified the milestones: 2.49.0, 2.50.0 Aug 4, 2025

@dogboat dogboat left a comment


Just two nits about import placement, but otherwise looks great; approving because they're not blockers imho.

            owner = repo = None
            commit_sha = inst.get("commit_sha")
            if html_url:
                from urllib.parse import urlparse

Any reason to have this here rather than at the top?


    def test_parse_file_invalid_format_raises(self):
        """Non-list JSON should raise"""
        import io

Same nit about imports.

@valentijnscholten valentijnscholten modified the milestones: 2.50.0, 2.51.0 Sep 2, 2025