feat: Add Docker Support and GHCR Publish Workflow #537

Open
FNGarvin wants to merge 4 commits into Breakthrough:main from FNGarvin:fng-infra-docker-ci

@FNGarvin FNGarvin commented Mar 5, 2026

Thanks for your great project. It has become the gold standard for a reason.

Description:
This PR adds official containerization support to PySceneDetect. It provides a reproducible environment via a multi-stage Dockerfile and an automated CI pipeline for publishing images to the GitHub Container Registry (GHCR).

Technical Highlights:

  • Headless Runtime: Uses python:3.11-slim with opencv-headless to avoid unnecessary X11/GUI dependencies.
  • Optimized I/O: Includes pyav for fast video decoding and mkvtoolnix for robust MKV splitting.
  • Automated CI: The added GitHub Action (docker-publish.yml) handles image tagging via semantic versioning and Git SHAs.
  • Build Provenance: Implements SLSA attestations to provide a secure, verifiable chain of custody for the published artifacts.
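
As a rough illustration of the tagging scheme described above, the sketch below approximates how a Git ref and commit SHA could map to image tags. It mirrors the commonly used defaults of docker/metadata-action (version without the leading `v`, branch name, `sha-` plus a 7-character short SHA), but the exact tag rules live in docker-publish.yml and are not reproduced here. The function name `image_tags` is hypothetical.

```python
import re


def image_tags(git_ref: str, sha: str) -> list[str]:
    """Approximate the tags a semver + branch + SHA scheme would yield.

    Illustrative only: the real rules are whatever docker-publish.yml
    passes to docker/metadata-action.
    """
    tags = []
    version = re.fullmatch(r"refs/tags/v?(\d+\.\d+\.\d+)", git_ref)
    if version:
        # Release tag such as refs/tags/v1.2.3 becomes "1.2.3".
        tags.append(version.group(1))
    elif git_ref.startswith("refs/heads/"):
        # Branch pushes are tagged with a sanitized branch name.
        branch = git_ref[len("refs/heads/"):]
        tags.append(re.sub(r"[^A-Za-z0-9._-]", "-", branch))
    # Every build also gets a tag derived from the short commit SHA.
    tags.append("sha-" + sha[:7])
    return tags


print(image_tags("refs/heads/fng-infra-docker-ci", "abcdef0123456789"))
```

Tagging by SHA in addition to the version or branch makes it possible to pull the exact image built from a given commit, even after the branch tag has moved on.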

Implementation Notes:

  • The workflow uses the standard GITHUB_TOKEN for authentication. Once merged, the project will automatically begin hosting images at ghcr.io/breakthrough/pyscenedetect with no additional setup required.
  • I have elected not to pin dependencies because of the ongoing maintenance burden it would create. If you'd prefer, I can pin everything and configure Dependabot to flag updates.
  • There are a couple of "bonus" configuration changes that tighten up existing security issues. Each is a separate commit so it can be reviewed on its own.

Note on platform.py:
During development, several shell=True subprocess calls were identified as potential injection vectors. I have elected to leave these as-is for this PR to avoid cross-platform regressions, but I recommend a dedicated audit/refactor as a follow-up.
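
For reference, a minimal sketch of the kind of change such a follow-up might make: passing arguments as a list so no shell ever parses them. The helper `run_tool` is hypothetical and is not taken from platform.py. The child process here is the Python interpreter itself, used only so the example runs anywhere.

```python
import subprocess
import sys


def run_tool(executable: str, *args: str) -> str:
    """Run a program with an argument list and return its stdout.

    With a list and the default shell=False, each argument reaches the
    child process verbatim; characters such as ';' or '&&' are never
    interpreted by a shell.
    """
    result = subprocess.run(
        [executable, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# A file name that would trigger command injection under shell=True.
hostile_name = "clip.mkv; echo INJECTED"

# The child simply echoes back the single argument it received.
echoed = run_tool(sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name)
print(echoed.strip())
```

The trade-off is that shell features (globbing, pipes, environment expansion) are no longer available, which is part of why such a refactor needs testing on each supported platform.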

Once merged, docker run --rm ghcr.io/breakthrough/pyscenedetect should allow full use of the tool as an appliance, without concern over installing dependencies.

Here's the output of the package produced by my feature branch, as an example:

PySceneDetect on  fng-infra-docker-ci via 🐍 v3.12.3
							~/projects/PySceneDetect
> docker run --rm ghcr.io/fngarvin/pyscenedetect:fng-infra-docker-ci version

Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
[PySceneDetect] PySceneDetect 0.7-dev0
System Info
------------------------------------------------------------
OS               Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.36
Python           CPython 3.11.11
Architecture     64bit +
Packages
------------------------------------------------------------
av               16.1.0
click            8.2.1
cv2              4.13.0
imageio          2.37.2
imageio_ffmpeg   0.6.0
moviepy          2.1.2
numpy            2.4.2
platformdirs     4.9.2
scenedetect      0.7-dev0
tqdm             4.67.3

Tools
------------------------------------------------------------
ffmpeg           5.1.8-0+deb12u1
mkvmerge         v74.0.0 ('You Oughta Know') 64-bit

And a more useful example, mounting a local directory into the container for I/O:

chmod -R 777 test_clips
docker run --rm -v "$(pwd):/files:z" ghcr.io/fngarvin/pyscenedetect:fng-infra-docker-ci -i /files/test_clips/matrix-1999.mkv detect-adaptive list-scenes -f /files/stats.csv
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
[PySceneDetect] PySceneDetect 0.7-dev0
[PySceneDetect] Detecting scenes...
  Detected: 100 | Progress:   8%|▊         | 15734/196092 [05:31<1:04:31, 46.59frames/s
[...]
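
When driving the container from a script, the same invocation can be assembled as an argument list. The sketch below assumes a `docker`-compatible CLI on the PATH and reuses the image name, `/files` mount point, and `:z` volume option from the command above. The helper `docker_run_args` is hypothetical and not part of PySceneDetect.

```python
from pathlib import Path


def docker_run_args(image, host_dir, scenedetect_args, mount_point="/files"):
    """Build the argv for running scenedetect in a container.

    host_dir is mounted at mount_point, so any paths passed to
    scenedetect must be expressed relative to mount_point.
    """
    # ':z' asks the runtime to relabel the volume for SELinux hosts.
    volume = f"{Path(host_dir).resolve()}:{mount_point}:z"
    return ["docker", "run", "--rm", "-v", volume, image, *scenedetect_args]


args = docker_run_args(
    "ghcr.io/fngarvin/pyscenedetect:fng-infra-docker-ci",
    ".",
    ["-i", "/files/test_clips/matrix-1999.mkv",
     "detect-adaptive", "list-scenes", "-f", "/files/stats.csv"],
)
print(args)

# To actually run it (requires Docker or Podman and the image):
# import subprocess
# subprocess.run(args, check=True)
```

Because the image runs as a non-root user, the mounted directory still has to be writable by that user, which is what the chmod in the example above works around.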

Summary by Sourcery:

Reviewer's Guide

Adds a production-ready Dockerfile and .dockerignore for PySceneDetect and replaces the placeholder Docker workflow with a full GHCR build/publish pipeline including provenance attestations.

Sequence diagram for Docker image build, publish, and attestation

sequenceDiagram
  actor Developer
  participant GitHub_Repo
  participant GitHub_Actions
  participant Docker_Buildx
  participant GHCR
  participant Attestation_Service

  Developer->>GitHub_Repo: Push commit or create release
  GitHub_Repo->>GitHub_Actions: Trigger docker-publish workflow

  GitHub_Actions->>GitHub_Repo: actions/checkout@v4
  GitHub_Actions->>Docker_Buildx: docker/setup-buildx-action@v3

  GitHub_Actions->>GHCR: docker/login-action@v3 (GITHUB_TOKEN)

  GitHub_Actions->>Docker_Buildx: docker/metadata-action@v5 (compute tags, labels)
  Docker_Buildx-->>GitHub_Actions: tags and labels

  GitHub_Actions->>Docker_Buildx: docker/build-push-action@v5 (context ., tags, labels)
  Docker_Buildx->>GHCR: Push image layers and manifests
  Docker_Buildx-->>GitHub_Actions: Image digest

  GitHub_Actions->>Attestation_Service: actions/attest-build-provenance@v1 (subject-name, subject-digest)
  Attestation_Service->>GHCR: Upload SLSA build provenance

Flow diagram for Dockerfile build stages and runtime behavior

graph TD
  A["FROM python:3.11-slim base image"] --> B["Set WORKDIR /app"]
  B --> C["apt-get update && install ffmpeg and mkvtoolnix"]
  C --> D["Clean apt cache (rm -rf /var/lib/apt/lists/*)"]
  D --> E["COPY . . (project into /app)"]
  E --> F["pip install .[opencv-headless,pyav,moviepy] with cache mount"]
  F --> G["useradd -m scenedetect"]
  G --> H["chown -R scenedetect:scenedetect /app"]
  H --> I["Switch to USER scenedetect"]
  I --> J["ENTRYPOINT [scenedetect] (run CLI by default)"]

File-Level Changes

Change: Introduce an official Docker image for running PySceneDetect in a headless, production-ready environment.
Details:
  • Create a Dockerfile based on python:3.11-slim with ffmpeg and mkvtoolnix system dependencies installed.
  • Install PySceneDetect using extras for opencv-headless, pyav, and moviepy with pip cache optimization.
  • Add a non-root scenedetect user, set /app as the working directory, and define scenedetect CLI as the container entrypoint.
Files: Dockerfile

Change: Optimize Docker build context for smaller, faster image builds.
Details:
  • Add a .dockerignore file to exclude unnecessary files from the Docker build context (contents not shown in diff).
Files: .dockerignore

Change: Replace the placeholder Docker GitHub Actions workflow with an automated GHCR build-and-push pipeline including SLSA attestations.
Details:
  • Extend workflow triggers to run on workflow_dispatch, pushes to main and fng-infra-docker-ci branches, and published releases.
  • Define registry/image env vars and grant permissions for contents, packages, attestations, and OIDC tokens.
  • Add steps to set up Docker buildx, log in to GHCR using GITHUB_TOKEN, and extract image tags/labels via docker/metadata-action with semver, ref, schedule, and SHA tagging.
  • Build and push images using docker/build-push-action, tagging them with metadata-derived tags and labels.
  • Generate and push SLSA build-provenance attestations for the published container images.
Files: .github/workflows/docker-publish.yml
