ComfyUI in Post-Production

01 The thesis

ComfyUI is usually shown off as an image-generation playground, but underneath the demos it’s something more useful to a post house: a node-based execution engine for chaining arbitrary model operations into deterministic, reproducible graphs.

That architecture maps almost directly onto how post-production already works — compositors think in node graphs (Nuke, Fusion), colorists in stacked, parameterized operations. ComfyUI speaks the same language, which makes it a credible place to slot generative steps into an existing pipeline rather than bolting on a black-box cloud service.

02 Toy to contender

Local-first

Everything runs on owned hardware. No footage leaves the facility, no per-call API costs, none of the NDA or security friction of uploading unreleased material to someone else’s cloud.

Composable, model-agnostic

One graph chains heterogeneous models — restoration, segmentation, face recovery, colorization, upscaling — and any node swaps as the state of the art moves. No vendor lock-in.

Reproducible

Workflows are JSON, so they can be version-controlled, diffed, snapshotted, and handed to another artist or machine. Given the same models, node versions, and seeds, the graph should produce matching output, which is the determinism a pipeline demands.
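As a minimal sketch of what "diffable" can mean in practice, assume the workflow has been exported in ComfyUI's API format: a JSON object mapping node IDs to a class_type and its inputs. The snippet below reports which node inputs changed between two versions. The helper name diff_workflows and the "FaceRestore" node with its "fidelity" input are hypothetical, chosen for illustration only.

```python
import json

def diff_workflows(old: dict, new: dict) -> list[str]:
    """Return human-readable changes between two API-format workflows."""
    changes = []
    for node_id in sorted(set(old) | set(new)):
        if node_id not in new:
            changes.append(f"node {node_id}: removed")
        elif node_id not in old:
            changes.append(f"node {node_id}: added")
        else:
            a = old[node_id].get("inputs", {})
            b = new[node_id].get("inputs", {})
            for key in sorted(set(a) | set(b)):
                if a.get(key) != b.get(key):
                    changes.append(f"node {node_id}.{key}: {a.get(key)!r} -> {b.get(key)!r}")
    return changes

# Hypothetical two-node graph; class and input names are illustrative.
v1 = json.loads(
    '{"1": {"class_type": "LoadImage", "inputs": {"image": "scan.png"}},'
    ' "2": {"class_type": "FaceRestore", "inputs": {"fidelity": 0.1}}}'
)
v2 = json.loads(json.dumps(v1))  # deep copy via a JSON round trip
v2["2"]["inputs"]["fidelity"] = 0.5

for line in diff_workflows(v1, v2):
    print(line)  # node 2.fidelity: 0.1 -> 0.5
```

A plain git diff on the JSON file works too; a node-aware diff like this mainly helps when key order or formatting changes between exports.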

Headless, API-driven

Runs without its UI, so a graph can be triggered from a watch folder, a render queue, or a larger automation layer.
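A rough sketch of the headless trigger, assuming a ComfyUI server on its default local address (127.0.0.1:8188) and its /prompt endpoint, which accepts an API-format workflow under the "prompt" key. The helper names and the one-node placeholder graph are illustrative. The example builds the request without sending it, so it runs with no server present.

```python
import json
import urllib.request
import uuid

DEFAULT_SERVER = "http://127.0.0.1:8188"  # assumed default; adjust per facility

def build_prompt_request(workflow: dict, server: str = DEFAULT_SERVER) -> urllib.request.Request:
    """Build (but do not send) a POST that queues a workflow on the server."""
    payload = {"prompt": workflow, "client_id": uuid.uuid4().hex}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def queue_workflow(workflow: dict, server: str = DEFAULT_SERVER) -> dict:
    """Send the request. Needs a running server, so it is not called here."""
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())

# Placeholder graph; a real one would be exported from the UI in API format.
workflow = {"1": {"class_type": "LoadImage", "inputs": {"image": "scan.png"}}}
req = build_prompt_request(workflow)
print(req.get_method(), req.full_url)  # POST http://127.0.0.1:8188/prompt
```

A watch-folder script or render-queue job would call queue_workflow once per incoming file.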

Frame-native by extension

Motion work is just frame sequences. An image graph batched across frames is the on-ramp to video, alongside an expanding set of video-native nodes.

03 Case study — a local restoration pipeline

A single, fully working proof of the thesis: a restoration graph that takes a degraded source image and runs it through a chain of specialized models, each handling one failure mode of an old photograph.

Global restoration + scratch detection

BOPBTL (Bringing Old Photos Back to Life), run with tiled processing so full-resolution frames don’t exceed memory — the “HR” toggle is, in practice, a tiling switch rather than a quality setting.
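To illustrate why tiling bounds memory, here is a generic sketch of splitting a frame into overlapping tiles. It is not the BOPBTL node's actual implementation; the 1024-pixel tile size and 64-pixel overlap are assumed values chosen for the example.

```python
def tile_boxes(width: int, height: int, tile: int = 1024, overlap: int = 64):
    """Yield (left, top, right, bottom) boxes that cover the image with overlap."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap

    def starts(size: int) -> list[int]:
        # An image smaller than one tile needs a single box.
        if size <= tile:
            return [0]
        # Regular steps, then a final tile shifted back to end at the image edge.
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)
        return positions

    for top in starts(height):
        for left in starts(width):
            yield (left, top, min(left + tile, width), min(top + tile, height))

# The case-study source resolution, split into tiles of at most 1024x1024.
boxes = list(tile_boxes(2920, 1748))
print(len(boxes))  # 6
print(boxes[-1])   # (1896, 724, 2920, 1748)
```

Each tile is processed independently, so peak memory scales with the tile size rather than the frame size; the overlap leaves room to blend seams when the tiles are stitched back together.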

Colorization / color correction

A manual-mode pass (DDColor / EasyColorCorrector) rather than auto — trading convenience for direct control over the result.

Selective face restoration

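CodeFormer applied surgically, not globally: YOLO detection (Impact Pack) yields SEGS; each face is restored and composited back through a Gaussian-blurred mask. The fidelity weight is run low (~0.1), with the aim of repairing features rather than inventing them. In CodeFormer's own convention the weight runs from 0 (favoring output quality, more generative) to 1 (favoring fidelity to the input), so it is worth confirming which direction the node in use applies.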
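The composite step can be sketched in plain Python on a toy grayscale image. This illustrates feathered-mask blending in general, not the Impact Pack's code; the 7x7 size, blur radius, sigma, and pixel values are all assumptions made for the example.

```python
import math

def gaussian_kernel(radius: int, sigma: float) -> list[float]:
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Blur each row, clamping sample positions at the edges."""
    r = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + j - r, 0), n - 1)] for j, k in enumerate(kernel))
            for x in range(n)
        ])
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def feather(mask, radius: int = 2, sigma: float = 1.0):
    """Separable Gaussian blur: rows first, then columns."""
    k = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(mask, k)), k))

def composite(original, restored, mask):
    """Per-pixel blend: mask 1 takes the restored pixel, mask 0 keeps the original."""
    return [
        [o * (1 - m) + r * m for o, r, m in zip(orow, rrow, mrow)]
        for orow, rrow, mrow in zip(original, restored, mask)
    ]

# Toy example: the detected "face" is the central 3x3 block of a 7x7 image.
size = 7
original = [[0.2] * size for _ in range(size)]
restored = [[0.8] * size for _ in range(size)]
hard_mask = [[1.0 if 2 <= x <= 4 and 2 <= y <= 4 else 0.0 for x in range(size)]
             for y in range(size)]

result = composite(original, restored, feather(hard_mask))
print([round(v, 3) for v in result[3]])  # middle row: ramps up toward the center
```

Blurring the mask turns the hard edge of the detection box into a gradual ramp, so the restored face fades into the untouched pixels around it instead of showing a visible seam.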

Upscale

Real-ESRGAN ×2 (ComfyUI_essentials).

04 Results & the point

On an M5 Max with 128 GB of unified memory, a 2920×1748 source resolves to 5840×3496 (≈6K) in roughly 80 seconds cold and about 11 seconds on cached re-runs. The whole setup — node graph, model placement, environment, and a documented restore procedure — lives in a git-backed repo, so any restoration is consistent and reproducible.

Nothing in this graph is video-specific yet — but every operation is a per-frame transform. Swap “load image” for a frame sequence and the same pipeline becomes the spine of a restoration or look-development pass for motion footage. That’s the whole point: the node graph that restores a single magazine scan scales, conceptually unchanged, into a film and video post-production workflow.
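One way to picture the swap, as a sketch only: take the API-format workflow and emit one copy per frame with the loader node pointed at that frame. The LoadImage class and its "image" input follow ComfyUI's built-in loader; the frame naming pattern, the "UpscaleStub" node, and the helper name are hypothetical, and the frames are assumed to sit somewhere the loader can read.

```python
import copy

def per_frame_prompts(workflow: dict, frame_names: list[str]) -> list[dict]:
    """Return one workflow copy per frame, with every LoadImage node retargeted."""
    loaders = [nid for nid, node in workflow.items()
               if node.get("class_type") == "LoadImage"]
    if not loaders:
        raise ValueError("workflow has no LoadImage node")
    prompts = []
    for name in frame_names:
        wf = copy.deepcopy(workflow)  # leave the source graph untouched
        for nid in loaders:
            wf[nid]["inputs"]["image"] = name
        prompts.append(wf)
    return prompts

# Hypothetical still-image graph and a three-frame sequence.
still_graph = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "scan.png"}},
    "2": {"class_type": "UpscaleStub", "inputs": {"image": ["1", 0], "scale": 2}},
}
frames = [f"shot010.{n:04d}.png" for n in range(1001, 1004)]

prompts = per_frame_prompts(still_graph, frames)
print([p["1"]["inputs"]["image"] for p in prompts])
# ['shot010.1001.png', 'shot010.1002.png', 'shot010.1003.png']
```

Every downstream node stays identical across frames, which is what keeps the look consistent through a shot. Temporal consistency between frames is a separate concern that this per-frame approach does not address.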

local-first · JSON · git-tracked · per-frame → motion