[feature] Unifying SBOM Representation and Trust Across Multiple Equivalent Image References #2387

Open
robert-cronin opened this issue Dec 19, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@robert-cronin
Collaborator

In container ecosystems, a single image (identified by a digest) can appear in multiple registries under different tags or repositories. For instance, an image at registryA.com/base:1.0 can be mirrored at registryB.com/cache/base:1.0. Although these references differ, the underlying image content is identical. A single, signed SBOM should logically apply to all such references without needing to be regenerated (which would break its original signature and trust).
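To make the idea concrete, here is a minimal sketch of keying an SBOM by image digest rather than by reference. The digests, the SBOM file name, and the helper names are all hypothetical placeholders for illustration; this is not GUAC code, and in practice the reference-to-digest mapping would come from the registries.

```python
from collections import defaultdict

# Hypothetical reference -> digest resolutions (placeholder digests).
RESOLVED = {
    "registryA.com/base:1.0": "sha256:aaaa",
    "registryB.com/cache/base:1.0": "sha256:aaaa",
    "registryA.com/base:2.0": "sha256:bbbb",
}

# One signed SBOM per digest; the document itself is never regenerated.
SBOM_BY_DIGEST = {"sha256:aaaa": "base-1.0.spdx.json"}


def group_by_digest(resolved):
    """Group image references that resolve to the same digest."""
    groups = defaultdict(list)
    for ref, digest in resolved.items():
        groups[digest].append(ref)
    return dict(groups)


def sbom_for(ref):
    """Look up the SBOM through the digest, not through the reference."""
    return SBOM_BY_DIGEST.get(RESOLVED[ref])
```

Because the lookup goes through the digest, both registry references return the same untouched SBOM, and its original signature stays valid.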

Currently, GUAC typically associates one SBOM per subject (e.g., one image reference), making it hard to unify multiple references of the same image or multiple SBOM formats (e.g., SPDX and CycloneDX) for the same underlying content. Additionally, SBOMs aren’t first-class entities with their own trust or provenance data, limiting the ability to differentiate a trusted SBOM from a suspicious or forged one.

Use Case Example:

1. Multiple Registry Locations, One Image:

  • Consider an official base image at registryA.com/base:1.0 described by a trusted SBOM.
  • Another registry provides a pull-through cache of this same digest at registryB.com/cache/base:1.0.
  • In GUAC today, there’s no simple way to say: These two references represent the same underlying image, so we should link them to the same trusted SBOM.

2. Different SBOM Formats, Same Content:

  • If the same image from above also has a CycloneDX SBOM and an SPDX SBOM, both describing identical content, GUAC cannot directly represent these two documents as equivalent descriptions.
  • Users can’t query: “Show me all SBOM representations of this image” and see them unified.

3. Trust and Attestation at the SBOM Level:

  • Suppose a malicious actor introduces a forged SBOM claiming to describe the same package.
  • Without modeling SBOMs as distinct nodes, GUAC can’t easily attach attestations, cryptographic proofs, or trust metadata to that SBOM and differentiate it from the known-trusted one.

This issue aims to explore ways to unify image references, preserve trust in original SBOMs, and enhance the expressiveness of SBOM modeling in GUAC.

@robert-cronin robert-cronin added the enhancement New feature or request label Dec 19, 2024
@robert-cronin
Collaborator Author

Potential Directions (not prescriptive, just brainstorming):

Enhanced HasSBOM Edges:

  • Ingest the canonical SBOM once for the original image.
  • For each equivalent image reference (e.g. same content but different registry/tag), create only a new HasSBOM edge linking that new image node to the existing SBOM nodes.
  • Possibly introduce a “subject override” mechanism for HasSBOM ingestion so the backend knows not to duplicate nodes/edges like IsDependency/IsOccurrence (e.g. via a --subject flag on collection/ingestion).
  • Minimal or no GraphQL change; this depends on the idempotency of SBOM ingestion at the backend level.
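A rough sketch of what this direction could look like, using a toy in-memory graph. The `Graph` class, `ingest_sbom`, and `link_subject` are hypothetical names invented for illustration and do not correspond to GUAC's backend or CLI; the edge shapes are simplified.

```python
class Graph:
    """Toy graph: nodes and edges are sets, so re-ingestion is idempotent."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()

    def ingest_sbom(self, sbom_id, subject, dependencies):
        """Full ingestion for the canonical image: SBOM node plus dependency edges."""
        self.nodes.update({sbom_id, subject, *dependencies})
        self.edges.add(("HasSBOM", subject, sbom_id))
        for dep in dependencies:
            self.edges.add(("IsDependency", subject, dep))

    def link_subject(self, sbom_id, subject):
        """'Subject override': add only a HasSBOM edge to an already ingested SBOM."""
        if sbom_id not in self.nodes:
            raise KeyError(f"SBOM {sbom_id!r} has not been ingested")
        self.nodes.add(subject)
        self.edges.add(("HasSBOM", subject, sbom_id))


g = Graph()
g.ingest_sbom("sbom-1", "registryA.com/base:1.0", ["pkg:a", "pkg:b"])
g.link_subject("sbom-1", "registryB.com/cache/base:1.0")
```

The mirrored reference gains a single HasSBOM edge, and no IsDependency edges are duplicated for it.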

Attestation Linking Approach:

  • Ingest the canonical SBOM once for the original image.
  • For secondary image references, ingest an attestation stating that they share the same SBOM as the canonical image.
  • This attestation could be a new predicate (like CertifySameSBOM) that GUAC’s backend understands as a linkage.
  • Queries against the secondary image can resolve through this attestation to the already ingested SBOM.
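The resolution step could be sketched as follows. `CertifySameSBOM` is only the predicate name floated above; the dictionaries and the `resolve_sboms` helper are hypothetical and stand in for whatever the backend would actually store.

```python
# Canonical image -> SBOM identifiers ingested for it (hypothetical data).
HAS_SBOM = {"registryA.com/base:1.0": ["sbom-1"]}

# CertifySameSBOM-style attestations: secondary reference -> canonical reference.
SAME_SBOM = {"registryB.com/cache/base:1.0": "registryA.com/base:1.0"}


def resolve_sboms(ref, has_sbom, same_sbom):
    """Follow same-SBOM attestations until a reference with an SBOM is found."""
    seen = set()
    while ref not in has_sbom and ref in same_sbom and ref not in seen:
        seen.add(ref)  # guards against attestation cycles
        ref = same_sbom[ref]
    return has_sbom.get(ref, [])
```

A query against the secondary reference then lands on the SBOM that was ingested once for the canonical image.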

SBOM as First-Class Nodes:

  • Model SBOMs as distinct nodes, allowing equivalence (DocEqual?) across different formats.
  • Attach trust, signatures, and provenance directly to SBOM nodes.
  • Integrate with PkgEqual and other relationships for richer semantic links.
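As a sketch only, first-class SBOM nodes with a DocEqual-style relation might be modeled like this. `SBOMNode`, `SBOMGraph`, and the `signer` field are hypothetical; a plain signer name stands in for real signature verification, which is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SBOMNode:
    doc_id: str
    fmt: str                      # e.g. "SPDX" or "CycloneDX"
    signer: Optional[str] = None  # stand-in for attached trust metadata


class SBOMGraph:
    def __init__(self):
        self.nodes = {}
        self.doc_equal = {}  # doc_id -> set of equivalent doc_ids

    def add(self, node):
        self.nodes[node.doc_id] = node
        self.doc_equal.setdefault(node.doc_id, set())

    def mark_equal(self, a, b):
        """Record a symmetric DocEqual-style link between two documents."""
        self.doc_equal[a].add(b)
        self.doc_equal[b].add(a)

    def equivalents(self, doc_id):
        """All documents reachable through equivalence links, including doc_id."""
        seen, stack = set(), [doc_id]
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(self.doc_equal[cur] - seen)
        return sorted(seen)

    def trusted(self, doc_id, trusted_signers):
        return self.nodes[doc_id].signer in trusted_signers
```

With trust attached to the document node, a forged SBOM for the same subject is just another node that lacks a trusted signer and has no equivalence link to the trusted documents.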

@pxp928
Collaborator

pxp928 commented Jan 10, 2025

The subject of a hasSBOM node should be an artifact (algorithm and digest); if that is not provided, it can be a package (https://github.com/guacsec/guac/blob/main/pkg/assembler/graphql/schema/hasSBOM.graphql#L23). We made this change based on issue #1736. From there, an IsOccurrence node can be used to link the artifact to multiple packages (the registry locations of the image).

So this should handle the use cases you specified above:

  1. Multiple Registry Locations, One Image:

Use the artifact (algorithm and digest) to create the hasSBOM node and IsOccurrence to link to multiple registry locations.

  2. Different SBOM Formats, Same Content:

Use the artifact (algorithm and digest) to create the hasSBOM node for the different SBOM formats. Query the subject (artifact) to find all associated hasSBOM nodes (CDX or SPDX).

  3. Trust and Attestation at the SBOM Level:

Add attestations that are linked to the artifact (algorithm and digest).
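The artifact-centric layout described here can be illustrated with a small in-memory model. This is a sketch under assumptions: `ArtifactGraph` and its methods are hypothetical names, not GUAC's GraphQL API, and the digest and URIs are placeholders.

```python
from collections import defaultdict


class ArtifactGraph:
    """Toy model: everything hangs off the artifact (algorithm, digest)."""

    def __init__(self):
        self.has_sbom = defaultdict(list)    # artifact -> [(format, uri)]
        self.occurrences = defaultdict(set)  # artifact -> {registry location}

    def add_has_sbom(self, artifact, fmt, uri):
        self.has_sbom[artifact].append((fmt, uri))

    def add_occurrence(self, artifact, location):
        """IsOccurrence-style link from the artifact to a registry location."""
        self.occurrences[artifact].add(location)

    def sboms_for_location(self, location):
        """Go from a registry location to its artifact, then to every SBOM."""
        found = []
        for artifact, locations in self.occurrences.items():
            if location in locations:
                found.extend(self.has_sbom.get(artifact, []))
        return found


image = ("sha256", "aaaa")  # placeholder digest
graph = ArtifactGraph()
graph.add_has_sbom(image, "SPDX", "base.spdx.json")
graph.add_has_sbom(image, "CDX", "base.cdx.json")
graph.add_occurrence(image, "registryA.com/base:1.0")
graph.add_occurrence(image, "registryB.com/cache/base:1.0")
```

Both registry locations resolve to the same artifact, so a query from either one returns the SPDX and the CDX documents together.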

Let me know if that answers all your questions or if I missed anything. Thanks!
