Split Github Vulnerability Scan into separate SCA & SAST parsers #12773
base: dev
Conversation
This pull request contains an open redirect vulnerability in the new `GithubSASTParser`.

Open Redirect in `dojo/tools/github_sast/parser.py`

| Vulnerability | Open Redirect |
|---|---|
| Description | The GithubSASTParser constructs a file_link URL using the scheme and network location taken directly from the html_url field found in the input SAST report. If a malicious SAST report is uploaded with a crafted html_url (e.g., https://attacker.com/path), the generated file_link will point to the attacker's domain. This file_link is then embedded into the description field of the Finding object as a Markdown-formatted link. If the DefectDojo UI renders this description as HTML, it creates a clickable link to the attacker-controlled domain, leading to an open redirect. |
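To make the report concrete, here is a minimal standalone sketch (all alert contents are hypothetical) of how an attacker-supplied `html_url` flows into the generated `file_link`, mirroring the parser logic quoted below:

```python
from urllib.parse import urlparse

# Hypothetical crafted alert: every value here is made up for illustration.
alert = {
    "rule": {"id": "py/sql-injection", "description": "SQL injection"},
    "most_recent_instance": {
        "commit_sha": "deadbeef",
        "location": {"path": "app/views.py", "start_line": 10},
    },
    "html_url": "https://attacker.example/org/repo/security/code-scanning/1",
}

# Mirror of the parser's construction: the scheme and netloc come straight
# from the attacker-controlled html_url.
parsed = urlparse(alert["html_url"])
owner, repo = parsed.path.strip("/").split("/")[:2]
loc = alert["most_recent_instance"]["location"]
file_link = (
    f"{parsed.scheme}://{parsed.netloc}/"
    f"{owner}/{repo}/blob/{alert['most_recent_instance']['commit_sha']}/"
    f"{loc['path']}#L{loc['start_line']}"
)
print(file_link)
# -> https://attacker.example/org/repo/blob/deadbeef/app/views.py#L10
```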
django-DefectDojo/dojo/tools/github_sast/parser.py
Lines 1 to 85 in 2ffb18d
```python
import json

from dojo.models import Finding


class GithubSASTParser:

    def get_scan_types(self):
        return ["Github SAST Scan"]

    def get_label_for_scan_types(self, scan_type):
        return scan_type

    def get_description_for_scan_types(self, scan_type):
        return "GitHub SAST report file can be imported in JSON format."

    def get_findings(self, filename, test):
        data = json.load(filename)
        if not isinstance(data, list):
            error_msg = "Invalid SAST report format, expected a JSON list of alerts."
            raise TypeError(error_msg)

        findings = []
        for vuln in data:
            rule = vuln.get("rule", {})
            inst = vuln.get("most_recent_instance", {})
            loc = inst.get("location", {})
            html_url = vuln.get("html_url")

            rule_id = rule.get("id")
            title = f"{rule.get('description')} ({rule_id})"
            severity = rule.get("security_severity_level", "Info").title()
            active = vuln.get("state") == "open"

            # Build description with context
            desc_lines = []
            if html_url:
                desc_lines.append(f"GitHub Alert: [{html_url}]({html_url})")

            owner = repo = None
            commit_sha = inst.get("commit_sha")
            if html_url:
                from urllib.parse import urlparse
                parsed = urlparse(html_url)
                parts = parsed.path.strip("/").split("/")
                # URL is /<owner>/<repo>/security/... so parts[0]=owner, parts[1]=repo
                if len(parts) >= 2:
                    owner, repo = parts[0], parts[1]

            if owner and repo and commit_sha and loc.get("path") and loc.get("start_line"):
                file_link = (
                    f"{parsed.scheme}://{parsed.netloc}/"
                    f"{owner}/{repo}/blob/{commit_sha}/"
                    f"{loc['path']}#L{loc['start_line']}"
                )
                desc_lines.append(f"Location: [{loc['path']}:{loc['start_line']}]({file_link})")
            elif loc.get("path") and loc.get("start_line"):
                # fallback if something is missing
                desc_lines.append(f"Location: {loc['path']}:{loc['start_line']}")

            msg = inst.get("message", {}).get("text")
            if msg:
                desc_lines.append(f"Message: {msg}")
            if severity:
                desc_lines.append(f"Rule Severity: {severity}")
            if rule.get("full_description"):
                desc_lines.append(f"Description: {rule.get('full_description')}")
            description = "\n".join(desc_lines)

            finding = Finding(
                title=title,
                test=test,
                description=description,
                severity=severity,
                active=active,
                static_finding=True,
                dynamic_finding=False,
                vuln_id_from_tool=rule_id,
            )
            # File path & line
            finding.file_path = loc.get("path")
            finding.line = loc.get("start_line")
            if html_url:
                finding.url = html_url
            findings.append(finding)
        return findings
```
All finding details can be found in the DryRun Security Dashboard.
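One way to close the hole described above, sketched under the assumption that alert links should only ever point back to GitHub-hosted pages (this is illustrative hardening, not code from this PR; GitHub Enterprise hosts would need their own allowlist entries):

```python
from urllib.parse import urlparse

# Hypothetical hardening: accept the html_url host only if allowlisted;
# otherwise return None so the caller falls back to the plain-text
# "Location: path:line" line the parser already emits.
TRUSTED_HOSTS = {"github.com"}

def trusted_base_url(html_url):
    parsed = urlparse(html_url)
    if parsed.scheme == "https" and parsed.netloc in TRUSTED_HOSTS:
        return f"https://{parsed.netloc}"
    return None
```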
@Maffooch All linting errors should be fixed now, thanks for bearing with. :)
comment posted above
Just two nits about import placement, but otherwise looks great; approving because they're not blockers imho.
```python
            owner = repo = None
            commit_sha = inst.get("commit_sha")
            if html_url:
                from urllib.parse import urlparse
```
Any reason to have this here rather than at the top?
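(For illustration, the placement being suggested is presumably just hoisting the import to module scope, assuming no circular-import concern forces it to stay local:)

```python
# Top of dojo/tools/github_sast/parser.py, per the nit above (sketch).
import json
from urllib.parse import urlparse

from dojo.models import Finding
```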
```python
    def test_parse_file_invalid_format_raises(self):
        """Non-list JSON should raise"""
        import io
```
Same nit about imports.
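For reference, a self-contained reconstruction of what the truncated test plausibly looks like (the actual test file may differ, e.g. it likely uses DefectDojo's own base test class):

```python
import io
import json
import unittest

from dojo.tools.github_sast.parser import GithubSASTParser


class TestGithubSASTParser(unittest.TestCase):
    def test_parse_file_invalid_format_raises(self):
        """Non-list JSON should raise."""
        parser = GithubSASTParser()
        report = io.StringIO(json.dumps({"alerts": "not-a-list"}))
        # test=None is fine here: the TypeError fires before any Test is used.
        with self.assertRaises(TypeError):
            parser.get_findings(report, None)
```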
Description
Hello! The current parser implementation for GitHub code scanning results is baked into the "Github Vulnerability Scan" scan type, a parser originally meant for GitHub SCA (Dependabot) vulnerabilities. Since these two scan types are exceptionally different, issues can arise, especially around the fields used for deduplication in the hash code. This PR splits GitHub code scanning out into its own `GithubSASTParser`, with a scan-type string called "Github SAST Scan". I have included documentation, unit tests, and a new list of fields for hash code deduplication (a sketch of how such a registration might look follows below).

I also included several improvements for the original Github Vulnerability Scan parser:

- Support for `cvssSeverities`, which will replace the `cvss` field in GitHub's GraphQL response in October 2025.
- Added the `dependabotUpdate` field to the finding description.
- Added the `epss` percentage and percentile to the `finding.epss_score` and `finding.epss_percentile` finding fields.
- Set `finding.url` to the GitHub Dependabot alert hyperlink for convenience.
- Set the `finding.cve` and `finding.vuln_id_from_tool` fields (before falling back to `unsaved_vulnerability_ids`).
- Fixed a bug where `finding.component_version` was only being set when the `vulnerableRequirements` str started with `=`.
- Used `.get()` to access fields.

Backward compatibility: existing users of the "Github Vulnerability Scan" scan type (driven by GithubVulnerabilityParser) for SCA imports will see no change. If you'd been using it to ingest SAST/code-scanning JSON, you'll need to switch your import to the new "Github SAST Scan" scan type (driven by GithubSASTParser).
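As a rough sketch of that deduplication registration: DefectDojo keys hash-code fields by scan-type name in `dojo/settings/settings.dist.py`. The field list below is an assumption about what a SAST-oriented key might use, not a quote of this PR's change:

```python
# dojo/settings/settings.dist.py (sketch; exact field list is an assumption)
HASHCODE_FIELDS_PER_SCANNER = {
    # ... existing scanners ...
    "Github SAST Scan": ["title", "severity", "file_path", "line", "vuln_id_from_tool"],
}
```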
Ref links: