
Add upstream source identification and GENERATED_FROM relationship to SBOM #1031

@mprpic


Summary

Add upstream source package identification to the SBOM generated by Fromager. Each SBOM should contain two package entries (downstream wheel and upstream source) linked by a GENERATED_FROM relationship, allowing consumers to trace a built wheel back to its original source.

Motivation

The downstream wheel purl (e.g. pkg:pypi/numpy@1.26.0?repository_url=https://packages.redhat.com) identifies what was built, but not where the source came from. For standard PyPI packages, the upstream is implicit (same name, no qualifier). For packages sourced from GitHub/GitLab forks (e.g. vllm-bart-plugin), the upstream identity is completely different and must be declared explicitly.

Design

Purl construction with packageurl-python

Use the packageurl-python library instead of manual string formatting. Two functions handle purl construction:

  • _build_downstream_purl — cascades through: per-package purl (full string override) > per-package field overrides (purl_type, purl_namespace, purl_name, purl_version) > global defaults from SbomSettings. Adds repository_url qualifier from per-package or global settings.

  • _build_upstream_purl — same cascade but never adds repository_url. If upstream_purl is set in per-package settings, it is used as-is.
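The precedence described above can be sketched with plain dataclasses. This is only an illustration under stated assumptions: `GlobalPurlDefaults`, `PackagePurlOverrides`, and the two `*_purl_input` helpers are hypothetical names, the global defaults are assumed to carry `purl_type`, `purl_namespace`, and `repository_url`, and a full-string `purl` override is assumed to be used verbatim without adding a qualifier. For brevity the upstream helper ignores a full-string `purl` override. The sketch stops at the purl components and does not build the final string.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class GlobalPurlDefaults:
    """Hypothetical stand-in for the purl-related part of SbomSettings."""
    purl_type: str = "pypi"
    purl_namespace: Optional[str] = None
    repository_url: Optional[str] = None


@dataclass
class PackagePurlOverrides:
    """Hypothetical stand-in for the purl-related part of PackageSettings."""
    purl: Optional[str] = None
    upstream_purl: Optional[str] = None
    purl_type: Optional[str] = None
    purl_namespace: Optional[str] = None
    purl_name: Optional[str] = None
    purl_version: Optional[str] = None
    repository_url: Optional[str] = None


def _cascade_fields(name: str, version: str,
                    pkg: PackagePurlOverrides,
                    defaults: GlobalPurlDefaults) -> dict:
    """Per-package field overrides win over global defaults."""
    return {
        "type": pkg.purl_type or defaults.purl_type,
        "namespace": pkg.purl_namespace or defaults.purl_namespace,
        "name": pkg.purl_name or name,
        "version": pkg.purl_version or version,
        "qualifiers": {},
    }


def downstream_purl_input(name: str, version: str,
                          pkg: PackagePurlOverrides,
                          defaults: GlobalPurlDefaults) -> Union[str, dict]:
    """Return a full purl string (override) or a dict of purl components."""
    if pkg.purl:  # assumption: a full string override is used verbatim
        return pkg.purl
    fields = _cascade_fields(name, version, pkg, defaults)
    repo = pkg.repository_url or defaults.repository_url
    if repo:
        fields["qualifiers"] = {"repository_url": repo}
    return fields


def upstream_purl_input(name: str, version: str,
                        pkg: PackagePurlOverrides,
                        defaults: GlobalPurlDefaults) -> Union[str, dict]:
    """Same field cascade, but repository_url is never added."""
    if pkg.upstream_purl:  # used as-is
        return pkg.upstream_purl
    return _cascade_fields(name, version, pkg, defaults)
```

With packageurl-python, a returned dict could then be passed to `PackageURL(**fields)` and a returned string to `PackageURL.from_string(...)`, so the library handles normalization and encoding.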

Per-package upstream_purl field

A new upstream_purl field on PackageSettings for packages sourced from GitHub/GitLab:

# overrides/settings/vllm_bart_plugin.yaml
upstream_purl: "pkg:github/vllm-project/bart-plugin@v0.2.0"

When absent, the upstream purl is auto-derived from the downstream purl without the repository_url qualifier.

Examples

Standard PyPI package (numpy) — no per-package settings needed:

{
  "spdxVersion": "SPDX-2.3",
  "dataLicense": "CC0-1.0",
  "SPDXID": "SPDXRef-DOCUMENT",
  "name": "numpy-1.26.0",
  "documentNamespace": "https://www.redhat.com/numpy-1.26.0.spdx.json",
  "creationInfo": {
    "created": "2026-04-06T12:00:00Z",
    "creators": ["Organization: Red Hat", "Tool: fromager-0.40.0"]
  },
  "packages": [
    {
      "SPDXID": "SPDXRef-wheel",
      "name": "numpy",
      "versionInfo": "1.26.0",
      "downloadLocation": "NOASSERTION",
      "supplier": "Organization: Red Hat",
      "externalRefs": [{
        "referenceCategory": "PACKAGE-MANAGER",
        "referenceType": "purl",
        "referenceLocator": "pkg:pypi/numpy@1.26.0?repository_url=https://packages.redhat.com"
      }]
    },
    {
      "SPDXID": "SPDXRef-upstream",
      "name": "numpy",
      "versionInfo": "1.26.0",
      "downloadLocation": "NOASSERTION",
      "supplier": "NOASSERTION",
      "externalRefs": [{
        "referenceCategory": "PACKAGE-MANAGER",
        "referenceType": "purl",
        "referenceLocator": "pkg:pypi/numpy@1.26.0"
      }]
    }
  ],
  "relationships": [
    { "spdxElementId": "SPDXRef-DOCUMENT", "relationshipType": "DESCRIBES", "relatedSpdxElement": "SPDXRef-wheel" },
    { "spdxElementId": "SPDXRef-wheel", "relationshipType": "GENERATED_FROM", "relatedSpdxElement": "SPDXRef-upstream" }
  ]
}

GitHub-sourced package (vllm-bart-plugin), with upstream_purl set in per-package settings:

{
  "spdxVersion": "SPDX-2.3",
  "dataLicense": "CC0-1.0",
  "SPDXID": "SPDXRef-DOCUMENT",
  "name": "vllm-bart-plugin-0.2.0",
  "documentNamespace": "https://www.redhat.com/vllm-bart-plugin-0.2.0.spdx.json",
  "creationInfo": {
    "created": "2026-04-06T12:00:00Z",
    "creators": ["Organization: Red Hat", "Tool: fromager-0.40.0"]
  },
  "packages": [
    {
      "SPDXID": "SPDXRef-wheel",
      "name": "vllm-bart-plugin",
      "versionInfo": "0.2.0",
      "downloadLocation": "NOASSERTION",
      "supplier": "Organization: Red Hat",
      "externalRefs": [{
        "referenceCategory": "PACKAGE-MANAGER",
        "referenceType": "purl",
        "referenceLocator": "pkg:pypi/vllm-bart-plugin@0.2.0?repository_url=https://packages.redhat.com"
      }]
    },
    {
      "SPDXID": "SPDXRef-upstream",
      "name": "vllm-bart-plugin",
      "versionInfo": "0.2.0",
      "downloadLocation": "NOASSERTION",
      "supplier": "NOASSERTION",
      "externalRefs": [{
        "referenceCategory": "PACKAGE-MANAGER",
        "referenceType": "purl",
        "referenceLocator": "pkg:github/vllm-project/bart-plugin@v0.2.0"
      }]
    }
  ],
  "relationships": [
    { "spdxElementId": "SPDXRef-DOCUMENT", "relationshipType": "DESCRIBES", "relatedSpdxElement": "SPDXRef-wheel" },
    { "spdxElementId": "SPDXRef-wheel", "relationshipType": "GENERATED_FROM", "relatedSpdxElement": "SPDXRef-upstream" }
  ]
}
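From the consumer side, tracing a wheel back to its source means following the GENERATED_FROM relationship from the described package and reading the purl of the related package. The sketch below is one possible way to do that with the standard library only; `trace_upstream` and the trimmed `sample` document are hypothetical and exist only for illustration.

```python
def trace_upstream(sbom: dict) -> list:
    """Return the upstream purls for the package(s) the document describes."""
    packages = {p["SPDXID"]: p for p in sbom.get("packages", [])}
    relationships = sbom.get("relationships", [])

    # Packages the document DESCRIBES are the built (downstream) artifacts.
    described = {
        r["relatedSpdxElement"]
        for r in relationships
        if r["relationshipType"] == "DESCRIBES"
    }

    purls = []
    for rel in relationships:
        if rel["relationshipType"] != "GENERATED_FROM":
            continue
        if rel["spdxElementId"] not in described:
            continue
        upstream = packages.get(rel["relatedSpdxElement"], {})
        for ref in upstream.get("externalRefs", []):
            if ref.get("referenceType") == "purl":
                purls.append(ref["referenceLocator"])
    return purls


# Trimmed version of the vllm-bart-plugin example above.
sample = {
    "packages": [
        {
            "SPDXID": "SPDXRef-wheel",
            "externalRefs": [{
                "referenceType": "purl",
                "referenceLocator": "pkg:pypi/vllm-bart-plugin@0.2.0?repository_url=https://packages.redhat.com",
            }],
        },
        {
            "SPDXID": "SPDXRef-upstream",
            "externalRefs": [{
                "referenceType": "purl",
                "referenceLocator": "pkg:github/vllm-project/bart-plugin@v0.2.0",
            }],
        },
    ],
    "relationships": [
        {"spdxElementId": "SPDXRef-DOCUMENT", "relationshipType": "DESCRIBES",
         "relatedSpdxElement": "SPDXRef-wheel"},
        {"spdxElementId": "SPDXRef-wheel", "relationshipType": "GENERATED_FROM",
         "relatedSpdxElement": "SPDXRef-upstream"},
    ],
}

print(trace_upstream(sample))
```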

SBOM structure

SPDXRef-DOCUMENT ──DESCRIBES──> SPDXRef-wheel (downstream, supplier from settings)
                                    │
                                    └──GENERATED_FROM──> SPDXRef-upstream (supplier: NOASSERTION)

The upstream package entry always has supplier: NOASSERTION and downloadLocation: NOASSERTION regardless of the configured supplier, since the upstream source is not produced by the downstream builder.
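A minimal sketch of assembling this structure as a plain dictionary is shown here. It is not Fromager's implementation: `build_sbom`, its parameters, and the reuse of the supplier string as the first creator entry are assumptions made to reproduce the shape of the examples above.

```python
import json
from datetime import datetime, timezone


def _package(spdx_id: str, name: str, version: str, supplier: str, purl: str) -> dict:
    """Build one SPDX 2.3 package entry with a purl external reference."""
    return {
        "SPDXID": spdx_id,
        "name": name,
        "versionInfo": version,
        "downloadLocation": "NOASSERTION",
        "supplier": supplier,
        "externalRefs": [{
            "referenceCategory": "PACKAGE-MANAGER",
            "referenceType": "purl",
            "referenceLocator": purl,
        }],
    }


def build_sbom(name, version, downstream_purl, upstream_purl, *,
               supplier, namespace_base, tool, created=None):
    """Return an SPDX document dict with wheel and upstream packages linked."""
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    doc_name = f"{name}-{version}"
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": doc_name,
        "documentNamespace": f"{namespace_base}/{doc_name}.spdx.json",
        # Assumption: the configured supplier doubles as the creator entry.
        "creationInfo": {"created": created, "creators": [supplier, f"Tool: {tool}"]},
        "packages": [
            # Downstream wheel: supplier comes from settings.
            _package("SPDXRef-wheel", name, version, supplier, downstream_purl),
            # Upstream source: supplier is always NOASSERTION.
            _package("SPDXRef-upstream", name, version, "NOASSERTION", upstream_purl),
        ],
        "relationships": [
            {"spdxElementId": "SPDXRef-DOCUMENT", "relationshipType": "DESCRIBES",
             "relatedSpdxElement": "SPDXRef-wheel"},
            {"spdxElementId": "SPDXRef-wheel", "relationshipType": "GENERATED_FROM",
             "relatedSpdxElement": "SPDXRef-upstream"},
        ],
    }


sbom = build_sbom(
    "numpy", "1.26.0",
    "pkg:pypi/numpy@1.26.0?repository_url=https://packages.redhat.com",
    "pkg:pypi/numpy@1.26.0",
    supplier="Organization: Red Hat",
    namespace_base="https://www.redhat.com",
    tool="fromager-0.40.0",
    created="2026-04-06T12:00:00Z",
)
print(json.dumps(sbom, indent=2))
```

Hard-coding NOASSERTION for the upstream entry inside the builder, rather than accepting it as a parameter, keeps a configured supplier from leaking onto the upstream package.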

Labels: enhancement
