The dailies pipeline.

Camera originals in. Proxies, sync, transcripts and search out.

Status - v0.1 · in development
What - Dailies and search pipeline
Stack - FastAPI · Prefect · Postgres
License - AGPL-3.0-or-later

Why

After a shoot, the useful details end up scattered across camera cards, sound rolls, notes, reports, bins and exports. By the time the grade starts, someone is often repeating a search that has already been done once: for the take, the line, the setup, the look.

This is our attempt to keep that work attached to the media. The first version takes camera originals and sound files, makes proxies, syncs audio, transcribes dialogue and indexes the result. Slate OCR, focus notes, visual tags and report reconciliation come after the core run is dependable.

01 - the pipeline

How it works.

A shoot moves through a chain of repeatable stages. Re-running a job only fills in the work that is missing.
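A minimal sketch of that re-run rule, assuming each clip records which stages have already finished. The stage names follow the chain described on this page; `Clip`, `done`, `missing_stages` and `run_job` are hypothetical names for illustration, not the project's actual schema or API.

```python
from dataclasses import dataclass, field

# Default chain, in the order the stages are described on this page.
STAGES = ("ingest", "probe", "proxy", "sync", "transcribe")


@dataclass
class Clip:
    source_file: str
    done: set[str] = field(default_factory=set)  # stages already completed


def missing_stages(clip: Clip) -> list[str]:
    """Return the stages still to run for this clip, in pipeline order."""
    return [stage for stage in STAGES if stage not in clip.done]


def run_job(clips: list[Clip]) -> dict[str, list[str]]:
    """Run only the missing stages and report what was run per clip."""
    ran: dict[str, list[str]] = {}
    for clip in clips:
        todo = missing_stages(clip)
        for stage in todo:
            # A real stage would do its work here before being recorded.
            clip.done.add(stage)
        ran[clip.source_file] = todo
    return ran
```

Calling `run_job` a second time on the same clips returns an empty list for each of them, which is the behaviour the paragraph above describes.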

Ingest, probe, proxy, sync and transcribe run end to end today, with face detection available per job. For now the chain runs on stand-in backends, which keep local development and CI simple; production is meant to point the same code at the real render and transcription services.

01 Ingest - Read camera cards and recorder files into a shoot record. One clip per source file, with the basic technical details attached.

02 Probe - Run ffprobe on each clip for resolution, codec, framerate and embedded timecode. Local, cheap and quick to repeat.
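A sketch of what a probe step could look like, assuming ffprobe is on the PATH and asked for JSON output. The `run_ffprobe` and `summarise` helpers are hypothetical, and where the timecode tag sits varies by container, so the lookup below is a guess that checks two common places.

```python
import json
import subprocess
from fractions import Fraction


def run_ffprobe(path: str) -> dict:
    """Call ffprobe and return its JSON report for one file."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarise(report: dict) -> dict:
    """Pick resolution, codec, frame rate and timecode from a report."""
    video = next(s for s in report["streams"] if s.get("codec_type") == "video")
    # Timecode may sit on the stream tags or the container tags, or be absent.
    timecode = (video.get("tags", {}).get("timecode")
                or report.get("format", {}).get("tags", {}).get("timecode"))
    return {
        "width": video["width"],
        "height": video["height"],
        "codec": video["codec_name"],
        "fps": Fraction(video["r_frame_rate"]),  # ffprobe reports e.g. "24000/1001"
        "timecode": timecode,
    }
```

Keeping the frame rate as a `Fraction` avoids rounding 24000/1001 to a float before any timecode arithmetic happens.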

03 Proxy - Make viewing proxies from the camera originals. FFmpeg handles common codecs; a Resolve seat can take over for RAW formats like X-OCN, ARRIRAW, BRAW and R3D. Unsupported clips are flagged instead of stopping the job.
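The routing rule can be sketched roughly as follows. The codec labels, the two capability sets and the `route` and `plan_proxies` functions are illustrative assumptions; a real deployment would derive capabilities from config and from what each backend reports it can decode.

```python
from enum import Enum


class Renderer(Enum):
    FFMPEG = "ffmpeg"
    RESOLVE = "resolve"
    UNSUPPORTED = "unsupported"


# Illustrative capability sets, not an authoritative list.
FFMPEG_CODECS = {"h264", "hevc", "prores", "dnxhd"}
RAW_FORMATS = {"x-ocn", "arriraw", "braw", "r3d"}


def route(codec: str, resolve_available: bool) -> Renderer:
    """Choose a renderer for one clip from its codec label."""
    codec = codec.lower()
    if codec in FFMPEG_CODECS:
        return Renderer.FFMPEG
    if codec in RAW_FORMATS and resolve_available:
        return Renderer.RESOLVE
    return Renderer.UNSUPPORTED


def plan_proxies(clips: dict[str, str], resolve_available: bool):
    """Split clips (source file -> codec) into render queues and a flagged list."""
    queues = {Renderer.FFMPEG: [], Renderer.RESOLVE: []}
    flagged = []
    for source_file, codec in clips.items():
        renderer = route(codec, resolve_available)
        if renderer is Renderer.UNSUPPORTED:
            flagged.append(source_file)  # flag and carry on, never raise
        else:
            queues[renderer].append(source_file)
    return queues, flagged
```

Returning the flagged clips alongside the queues, rather than raising, is what lets one unreadable file sit in a report while the rest of the card renders.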

04 Sync - Pull the embedded audio from the picture files, then match standalone recorder files by timecode, including the offset of the clip's first frame inside the take.
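The offset arithmetic is simple enough to show. This sketch assumes non-drop-frame timecode and an integer frame rate, and that camera and recorder share the same timecode source; both function names are hypothetical. Fractional rates such as 23.976 and drop-frame counting need extra handling that is left out here.

```python
def timecode_to_frames(timecode: str, fps: int) -> int:
    """Convert non-drop-frame HH:MM:SS:FF timecode to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is out of range at {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def offset_in_take(clip_start: str, take_start: str, fps: int) -> float:
    """Seconds from the start of the sound take to the clip's first frame.

    A negative result means the camera rolled before the recorder.
    """
    delta = timecode_to_frames(clip_start, fps) - timecode_to_frames(take_start, fps)
    return delta / fps
```

For example, a clip starting at 10:15:32:12 against a sound take starting at 10:15:30:00 at 24 fps is 60 frames in, so `offset_in_take` returns 2.5 seconds.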

05 Transcribe - Run WhisperX so dialogue is timed and searchable. Search for the line, not just the filename.
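To make "search for the line" concrete, here is a small sketch over timed segments. It assumes each segment is a dict with `start`, `end` and `text` keys, which is the general shape timed transcribers return; `Hit` and `search_lines` are hypothetical names, and a production index would more likely sit in the database than in a Python loop.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source_file: str
    start: float  # seconds from the start of the clip
    end: float
    text: str


def search_lines(transcripts: dict[str, list[dict]], query: str) -> list[Hit]:
    """Case-insensitive phrase search over timed transcript segments."""
    # Collapse whitespace so spacing differences do not hide a match.
    needle = " ".join(query.lower().split())
    hits = []
    for source_file, segments in transcripts.items():
        for seg in segments:
            haystack = " ".join(seg["text"].lower().split())
            if needle in haystack:
                hits.append(Hit(source_file, seg["start"], seg["end"],
                                seg["text"].strip()))
    return sorted(hits, key=lambda hit: (hit.source_file, hit.start))
```

Each hit carries the source file and the in and out times, which is enough to drop a marker on the right clip.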

06 Faces - Mark who appears in a clip and when, using a person list for that shoot. Today it runs as an opt-in job, not as part of the default run.

Tables for slate, visual tags, focus and reconciliation are already in the schema, but no flow drives them yet. They are the next layer, not part of the default run.

Hand-off

The point is not to make editors live in another tool. The metadata should come back into the edit and the grade as bin columns, markers and sidecars.

Avid Media Composer - ALE, the standard dailies log: scene, shot and take straight into the bin.
DaVinci Resolve - metadata CSV today; transcript, face and focus markers pushed to the render seat next.
Premiere Pro - XMP sidecars, on the way.

Clips are keyed by source file, so the hand-off stays close to the way assistants already relink and rebuild bins.
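As an illustration of the Avid hand-off, an ALE is a tab-delimited text file with Heading, Column and Data sections. The sketch below writes a minimal one; the heading values and the column set are placeholders chosen for the example, `build_ale` is a hypothetical helper, and a real export would carry whatever columns the assistant's bins expect.

```python
def build_ale(clips: list[dict], fps: int = 24) -> str:
    """Render clip metadata as the text of a minimal ALE file."""
    columns = ["Name", "Tape", "Start", "End", "Scene", "Take"]
    lines = [
        "Heading",
        "FIELD_DELIM\tTABS",
        "VIDEO_FORMAT\t1080",  # placeholder heading value
        f"FPS\t{fps}",
        "",
        "Column",
        "\t".join(columns),
        "",
        "Data",
    ]
    for clip in clips:
        # Missing fields become empty cells so the columns stay aligned.
        lines.append("\t".join(str(clip.get(col, "")) for col in columns))
    return "\n".join(lines) + "\n"
```

One row per clip keeps the file aligned with the one-clip-per-source-file rule used at ingest.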

The take is usually there. The hard part is finding it after the notes, exports and bins have drifted apart.

02 - the shape of it

How it's built.

A small orchestration service talks over HTTP to the heavier render and transcription services.

Transcription and proxy rendering run outside the core app. Rendering is routed per clip: FFmpeg for common codecs, Resolve for RAW formats FFmpeg cannot decode cleanly. The core asks for a connector by type; config picks the implementation.
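One way to picture "ask for a connector by type, let config pick the implementation" is a small registry. Everything named here (`Transcriber`, `StandInTranscriber`, `HttpTranscriber`, `REGISTRY`, `get_connector`, the `impl` and `base_url` config keys) is an assumption made for the sketch, not the project's actual code.

```python
from typing import Callable, Protocol


class Transcriber(Protocol):
    def transcribe(self, audio_path: str) -> list[dict]: ...


class StandInTranscriber:
    """Returns a fixed segment so local runs and CI need no GPU."""

    def transcribe(self, audio_path: str) -> list[dict]:
        return [{"start": 0.0, "end": 1.0,
                 "text": f"stand-in transcript for {audio_path}"}]


class HttpTranscriber:
    """Placeholder for a client that calls a remote transcription service."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def transcribe(self, audio_path: str) -> list[dict]:
        raise NotImplementedError("HTTP call omitted from this sketch")


# (connector type, implementation name) -> factory taking that type's settings.
REGISTRY: dict[tuple[str, str], Callable[[dict], object]] = {
    ("transcriber", "stand-in"): lambda settings: StandInTranscriber(),
    ("transcriber", "http"): lambda settings: HttpTranscriber(settings["base_url"]),
}


def get_connector(kind: str, config: dict):
    """Return the implementation that config selects for this connector type."""
    settings = config[kind]
    factory = REGISTRY[(kind, settings["impl"])]
    return factory(settings)
```

The calling code only ever names the connector type, so swapping the stand-in for a real service is a config change rather than a code change.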

01 Works from a browser or script - The same job can be started from a small web UI, a REST call, or an automated trigger. No workstation has to be the one machine that knows how it works.

02 Backends stay replaceable - WhisperX can be swapped for a managed transcriber. FFmpeg can render most proxies, while Resolve handles the RAW it supports. That choice belongs in config and capability checks, not in a rewrite.

03 Same build everywhere - The same image runs in docker-compose on a laptop and on the cluster. Different config, same code.

03 - what's next

Roadmap.

The order is deliberately boring: make the core run reliable, wire in the real backends, then add more metadata.

This is still in development, so the order can move. The dependency chain is what matters.

Now

Working end to end

v0.1

Ingest, probe, proxy, sync and transcribe run as one chain. Faces can be requested per job.
Screenplay parsing creates scene records for the shoot.
Filename and transcript search work on stand-in backends, so local development and CI do not need a GPU.

Real backends, durable runs

Phase 1

Wire in the FFmpeg renderer and Resolve render seat for production proxy generation, including RAW.
Move to a Prefect server with per-backend work pools, so runs survive restarts and clips fan out in parallel.
Add Alembic migrations and scheduled database backups.

Later

Enrichment + search

Phase 2

Add slate OCR, visual tags, focus timelines and a reconcile pass against department reports.
Add semantic and face search on pgvector, with a Qdrant offload path if the embedding volume outgrows Postgres.
Open the source under AGPL-3.0-or-later once the tool is past v0.1.

Built by

Built at The Studio in Dunkeld by a director of photography who also works as a platform engineer. The tool comes from doing the dailies work, not from trying to invent a new category.

Licensed AGPL-3.0-or-later. The source opens once the tool is past v0.1. The FFmpeg render path lands before that, so a self-hoster can cover everyday codecs without a Resolve licence; Resolve stays optional for RAW.

Talk to us - [email protected]