01 The thesis
ComfyUI is usually shown off as an image-generation playground, but underneath the demos it’s something more useful to a post house: a node-based execution engine for chaining arbitrary model operations into deterministic, reproducible graphs.
That architecture maps almost directly onto how post-production already works — compositors think in node graphs (Nuke, Fusion), colorists in stacked, parameterized operations. ComfyUI speaks the same language, which makes it a credible place to slot generative steps into an existing pipeline rather than bolting on a black-box cloud service.
02 Toy to contender
Local-first
Everything runs on owned hardware. No footage leaves the facility, no per-call API costs, none of the NDA or security friction of uploading unreleased material to someone else’s cloud.
Composable, model-agnostic
One graph chains heterogeneous models — restoration, segmentation, face recovery, colorization, upscaling — and any node swaps as the state of the art moves. No vendor lock-in.
Reproducible
Workflows are JSON, so they can be version-controlled, diffed, snapshotted, and handed to another artist or machine. Given the same models, seeds, and software versions, the same graph should produce matching output, which is the determinism a pipeline demands.
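As a rough illustration of why JSON workflows diff and snapshot well, here is a minimal sketch that fingerprints a graph independently of key order. The two-node graph and the helper names (`canonical_workflow_bytes`, `workflow_fingerprint`) are hypothetical, and the node layout is assumed to follow ComfyUI's API-style export (node id mapped to `class_type` and `inputs`).

```python
import hashlib
import json


def canonical_workflow_bytes(workflow: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so key order is irrelevant."""
    text = json.dumps(workflow, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def workflow_fingerprint(workflow: dict) -> str:
    """SHA-256 hex digest of the canonical form; usable as a snapshot id."""
    return hashlib.sha256(canonical_workflow_bytes(workflow)).hexdigest()


# Hypothetical two-node graph (illustrative values only).
wf_a = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "scan.png"}},
    "2": {"class_type": "SaveImage",
          "inputs": {"images": ["1", 0], "filename_prefix": "restored"}},
}
# The same graph written with its keys in a different order.
wf_b = {
    "2": {"inputs": {"filename_prefix": "restored", "images": ["1", 0]},
          "class_type": "SaveImage"},
    "1": {"inputs": {"image": "scan.png"}, "class_type": "LoadImage"},
}

same_graph = workflow_fingerprint(wf_a) == workflow_fingerprint(wf_b)  # True
```

The fingerprint covers only the graph description. Model files, seeds stored outside the graph, and software versions would need to be pinned separately for a run to be repeatable.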
Headless, API-driven
Runs without its UI, so a graph can be triggered from a watch folder, a render queue, or a larger automation layer.
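A headless trigger might look like the sketch below, which posts an API-format workflow to a running ComfyUI server's `/prompt` endpoint. It assumes the server is listening on the default local address `http://127.0.0.1:8188` and that the workflow was exported in API format; the function names are hypothetical, and the response is returned as-is rather than interpreted.

```python
import json
import urllib.request
import uuid
from typing import Optional


def build_prompt_request(workflow: dict,
                         server: str = "http://127.0.0.1:8188",
                         client_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request that queues a workflow."""
    payload = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, server: str = "http://127.0.0.1:8188") -> dict:
    """Send the request and return the server's JSON reply unmodified."""
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as resp:
        return json.loads(resp.read())
```

A watch-folder script or render-queue job would load the exported JSON with `json.load` and call `queue_workflow` on it. Splitting request construction from sending keeps the payload inspectable without a live server.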
Frame-native by extension
Motion work is just frame sequences. An image graph batched across frames is the on-ramp to video, alongside an expanding set of video-native nodes.
03 Case study — a local restoration pipeline
A single, fully working proof of the thesis: a restoration graph that takes a degraded source image and runs it through a chain of specialized models, each handling one failure mode of an old photograph.
01
Global restoration + scratch detection
BOPBTL (Bringing Old Photos Back to Life), run with tiled processing so full-resolution frames don’t exceed memory — the “HR” toggle is, in practice, a tiling switch rather than a quality setting.
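To make the tiling idea concrete, here is a minimal sketch of how a full-resolution frame can be cut into overlapping tiles so that each model call sees a bounded region. This is not the BOPBTL node's actual implementation; the tile size (1024) and overlap (64) are assumed values chosen only for illustration.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def tile_boxes(width: int, height: int,
               tile: int = 1024, overlap: int = 64) -> Iterator[Box]:
    """Yield overlapping tile rectangles that together cover the whole image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than the tile size")
    stride = tile - overlap
    # Stop once the previous tile already reaches the image edge.
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            yield (left, top, min(left + tile, width), min(top + tile, height))


boxes = list(tile_boxes(2920, 1748))
print(len(boxes))  # 6 (three columns by two rows)
```

Peak memory then scales with the tile size rather than the frame size, and the overlap gives a margin for blending seams between neighbouring tiles.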
02
Colorization / color correction
A manual-mode pass (DDColor / EasyColorCorrector) rather than auto — trading convenience for direct control over the result.
03
Selective face restoration
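CodeFormer applied surgically, not globally: YOLO detection (Impact Pack) yields SEGS; each face is restored and composited back through a Gaussian-blurred mask. The fidelity weight is set low (~0.1), with the aim of repairing features rather than inventing them. In upstream CodeFormer a lower weight favors perceptual quality and a higher weight favors fidelity to the input, so the direction of this setting is worth confirming against the node in use.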
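The compositing step can be sketched as a feathered alpha blend: blur a hard face mask, then mix restored and original pixels by the blurred weight. The code below is a pure-Python illustration on small grayscale grids, not the Impact Pack implementation; the function names and the `sigma` value are assumptions, and a real graph would do this on image tensors.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale rows, values in [0, 1]


def gaussian_kernel(sigma: float) -> List[float]:
    """Normalized 1-D Gaussian weights out to about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img: Image, kernel: List[float]) -> Image:
    """Convolve each row with the kernel, clamping samples at the edges."""
    radius = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + i - radius, 0), n - 1)] for i, k in enumerate(kernel))
            for x in range(n)
        ])
    return out


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


def feather_mask(mask: Image, sigma: float) -> Image:
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(mask, kernel)), kernel))


def composite(original: Image, restored: Image, mask: Image) -> Image:
    """Per pixel: mask * restored + (1 - mask) * original."""
    return [[m * r + (1.0 - m) * o for o, r, m in zip(orow, rrow, mrow)]
            for orow, rrow, mrow in zip(original, restored, mask)]


# Toy 9x9 example: original is black, restored is white, face mask is the centre 3x3.
size = 9
original = [[0.0] * size for _ in range(size)]
restored = [[1.0] * size for _ in range(size)]
hard_mask = [[1.0 if 3 <= x <= 5 and 3 <= y <= 5 else 0.0 for x in range(size)]
             for y in range(size)]
blended = composite(original, restored, feather_mask(hard_mask, sigma=1.0))
```

In the toy result the centre is dominated by the restored pixels and the corners stay essentially original, with a smooth ramp in between instead of a visible seam.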
04
Upscale
Real-ESRGAN ×2 (ComfyUI_essentials).
04 Results & the point
On an M5 Max with 128 GB of unified memory, a 2920×1748 source resolves to 5840×3496 (≈6K) in roughly 80 seconds cold and about 11 seconds on cached re-runs. The whole setup — node graph, model placement, environment, and a documented restore procedure — lives in a git-backed repo, so any restoration is consistent and reproducible.
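For reference, the arithmetic behind those figures can be checked in a few lines. The helper name is hypothetical; the inputs are the numbers quoted above.

```python
def upscaled_size(width: int, height: int, factor: int) -> tuple:
    """Output dimensions after an integer upscale."""
    return (width * factor, height * factor)


out = upscaled_size(2920, 1748, 2)   # (5840, 3496)
megapixels = out[0] * out[1] / 1e6   # about 20.4 MP per output frame
speedup = 80 / 11                    # cached re-run is roughly 7.3x faster than cold
```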
Nothing in this graph is video-specific yet — but every operation is a per-frame transform. Swap “load image” for a frame sequence and the same pipeline becomes the spine of a restoration or look-development pass for motion footage. That’s the whole point: the node graph that restores a single magazine scan scales, conceptually unchanged, into a film and video post-production workflow.
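One way to picture the swap from a single image to a frame sequence is to stamp out one workflow per frame from a template. The sketch below assumes an API-format workflow whose loader is a `LoadImage` node with an `image` input, and that the frames are already in a location the server can read; the node id `"1"`, the file names, and `frame_workflows` are hypothetical.

```python
import copy
from typing import Iterator, List, Tuple


def frame_workflows(template: dict, frames: List[str],
                    load_node_id: str) -> Iterator[Tuple[str, dict]]:
    """Yield (frame name, workflow) pairs, one patched copy of the template per frame."""
    for name in sorted(frames):  # zero-padded names sort into frame order
        workflow = copy.deepcopy(template)  # never mutate the shared template
        workflow[load_node_id]["inputs"]["image"] = name
        yield name, workflow


# Hypothetical template: the single-image graph, reduced to its loader and saver.
template = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "scan.png"}},
    "2": {"class_type": "SaveImage",
          "inputs": {"images": ["1", 0], "filename_prefix": "restored"}},
}
jobs = list(frame_workflows(template, ["shot_0002.png", "shot_0001.png"], "1"))
```

Each generated workflow is an ordinary graph, so it can be queued, logged, and versioned exactly like the single-image case. Temporal consistency between frames is a separate concern that this sketch does not address.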
local-first · JSON · git-tracked · per-frame → motion