Compile and Render
Score pacing, variety, flow, and style adherence on a 0–100 scale, compile the manifest into render-ready motion output, and get structured critique with specific fixes.
Overview
After the sequence planner produces a manifest, the evaluation engine answers: "How good is this sequence?" It scores pacing, variety, flow, and style adherence — then returns actionable findings you can use to improve the result.
The evaluation engine independently re-derives the expected transitions and camera overrides from the raw style pack rules. It does not import planner functions, which prevents circular validation and lets it catch both planner bugs and hand-edited manifests.
Sequence evaluation
evaluate_sequence takes a manifest, analyzed scenes, and a style pack name. It returns a 0–100 score per dimension and a weighted overall score.
Overall score formula
Each dimension is weighted equally at 25%:
overall = round(pacing × 0.25 + variety × 0.25 + flow × 0.25 + adherence × 0.25)
Equal weights are intentional. Pacing without variety is monotonous. Variety without flow is chaotic. Flow without adherence drifts from the style. Adherence without pacing feels mechanical. All four matter equally.
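As a minimal sketch of the formula above, the weighted sum can be written in a few lines of Python. The function name `overall_score` is hypothetical, and the tie-breaking behavior of `round` is an assumption, since the engine's rounding rule is not stated here.

```python
def overall_score(pacing: float, variety: float, flow: float, adherence: float) -> int:
    """Combine the four dimension scores using equal 25% weights."""
    weighted = pacing * 0.25 + variety * 0.25 + flow * 0.25 + adherence * 0.25
    # Assumption: Python's round() sends exact halves to the nearest even
    # integer; the engine may break ties differently.
    return round(weighted)


# Dimension scores taken from the sample evaluation output in this document.
print(overall_score(pacing=85, variety=72, flow=80, adherence=75))  # 78
```

With the sample dimension scores of 85, 72, 80, and 75, the result is 78, which agrees with the top-level `score` in the example output.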
Scoring dimensions
Pacing (25%)
Compares each scene's duration_s against the style pack's expected hold_durations[motion_energy].
| Check | Scoring |
|---|---|
| Per-scene deviation | Within ±0.5s = full marks; >1s = warning |
| Total duration vs loop_time range | Within range = +5 bonus |
| max_hold_duration violations | Each = warning + penalty |
| Confidence weighting | Low-confidence motion_energy = reduced penalty |
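The per-scene deviation check might look roughly like the sketch below. The `hold_durations` values, the energy level names, and the function name `classify_pacing` are all hypothetical. The table does not say how deviations between 0.5s and 1s are scored, so the sketch reports that band as `unspecified` rather than guessing.

```python
# Hypothetical style pack values, for illustration only.
HOLD_DURATIONS = {"low": 4.0, "medium": 3.0, "high": 2.0}


def classify_pacing(scene: dict, hold_durations: dict) -> str:
    """Classify one scene by how far duration_s is from the expected hold."""
    expected = hold_durations[scene["motion_energy"]]
    deviation = abs(scene["duration_s"] - expected)
    if deviation <= 0.5:
        return "full_marks"   # within ±0.5s of the expected hold
    if deviation > 1.0:
        return "warning"      # more than 1s off
    return "unspecified"      # 0.5s to 1s: not defined by the table


scenes = [
    {"duration_s": 3.5, "motion_energy": "medium"},
    {"duration_s": 4.5, "motion_energy": "medium"},
]
for index, scene in enumerate(scenes):
    print(index, classify_pacing(scene, HOLD_DURATIONS))
```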
Variety (25%)
Measures style-agnostic cinematography quality through four equally weighted sub-scores:
| Sub-score | Penalty |
|---|---|
| Shot size runs | 2-run = -10, 3+ = -25 |
| Adjacent same content_type | Each pair = -20 |
| Visual weight dominance >80% | -30 |
| All-same motion energy | -40; 3+ unique levels = +10 |
Short sequences (1–2 scenes) score 100 for variety — not enough data to penalize meaningfully.
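To make one of these penalties concrete, here is a sketch of the adjacent `content_type` sub-score together with the short-sequence rule. It assumes the sub-score starts at 100 and is clamped at 0, neither of which is stated above. The function name and the content type values are hypothetical.

```python
def content_type_subscore(content_types: list) -> int:
    """Penalize each adjacent pair of scenes sharing a content_type by 20."""
    # Sequences of 1-2 scenes are too short to penalize meaningfully.
    if len(content_types) <= 2:
        return 100
    adjacent_pairs = sum(
        1 for current, following in zip(content_types, content_types[1:])
        if current == following
    )
    # Assumption: start from 100 and never drop below 0.
    return max(0, 100 - 20 * adjacent_pairs)


print(content_type_subscore(["video", "video", "image", "image", "html"]))  # 60
```

The example sequence contains two adjacent same-type pairs (video/video and image/image), so the sub-score under these assumptions is 100 - 40 = 60.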
Flow (25%)
Three weighted sub-scores assessing narrative structure:
| Sub-score | Weight | What scores well |
|---|---|---|
| Energy arc | 40% | Peak energy in the middle 30–70% of the sequence |
| Intent progression | 30% | Opening at start, closing at end, hero in first half |
| Transition coherence | 30% | Transitions match the style pack's expected rules |
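The weighting in the table can be sketched as follows. The peak-position check assumes scene position is normalized as index / (scene count - 1) and that the first maximum counts as the peak. Both are illustrative choices, not documented behavior, and the function names are hypothetical.

```python
def peak_in_middle(energies: list) -> bool:
    """Return True if peak energy falls in the middle 30-70% of the sequence."""
    if len(energies) < 2:
        return False  # a single scene has no arc to assess
    peak_index = energies.index(max(energies))  # assumption: first maximum wins
    position = peak_index / (len(energies) - 1)
    return 0.3 <= position <= 0.7


def flow_score(energy_arc: float, intent_progression: float,
               transition_coherence: float) -> int:
    """Combine the three flow sub-scores with 40/30/30 weights."""
    return round(energy_arc * 0.4
                 + intent_progression * 0.3
                 + transition_coherence * 0.3)


print(peak_in_middle([1, 2, 3, 2, 1]))  # True
print(flow_score(100, 80, 60))          # 82
```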
Style adherence (25%)
Four equally weighted sub-scores measuring how well the manifest follows the style pack:
| Sub-score | Method |
|---|---|
| Camera override match | Re-derive expected camera from style pack rules, compare to manifest |
| Transition type match | Re-derive expected transitions from style pack rules, compare to manifest |
| Shot grammar compliance | Check camera axes against personality_restrictions |
| Duration match | Compare scene durations against hold_durations[energy] |
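One plausible way to turn "re-derive, then compare" into a number is a simple match percentage per sub-score, averaged across the four. This is only a sketch under that assumption; the real scoring method is not specified here, and `match_score` and `adherence_score` are hypothetical names.

```python
def match_score(expected: list, actual: list) -> float:
    """Percentage of scenes where the manifest value equals the re-derived one."""
    if not expected:
        return 100.0  # assumption: nothing to check counts as a full match
    matches = sum(1 for want, got in zip(expected, actual) if want == got)
    return 100.0 * matches / len(expected)


def adherence_score(camera: float, transition: float,
                    grammar: float, duration: float) -> int:
    """Average the four equally weighted adherence sub-scores."""
    return round((camera + transition + grammar + duration) / 4)


# Expected values would be re-derived from the style pack rules; here they
# are hard-coded for illustration.
expected_camera = ["push_in", "drift", "pan", "pull_out"]
manifest_camera = ["push_in", "drift", "drift", "pull_out"]
camera = match_score(expected_camera, manifest_camera)
print(camera)                                      # 75.0
print(adherence_score(camera, 100.0, 100.0, 50.0))  # 81
```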
Compilation
The compile_motion tool takes a validated sequence manifest and produces the final motion output — CSS keyframes, timing functions, and animation declarations ready for the Remotion renderer.
The compilation step transforms the abstract manifest (scene references, durations, transition types, camera overrides) into concrete render instructions:
- Scene sequencing — ordered scenes with precise frame-accurate timing
- Transition rendering — crossfades, whip-wipes, and hard cuts as interpolated values
- Camera interpolation — push_in, pull_out, pan, and drift as transform keyframes with easing curves
- Asset compositing — video, image, and HTML layers composed per the scene's layer stack
The output feeds directly into Remotion's <Series> component for frame-by-frame rendering via headless Chromium and ffmpeg. Output formats include MP4, WebM, ProRes, and GIF.
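Frame-accurate timing comes down to converting each scene's duration in seconds into a whole number of frames and accumulating start offsets. The sketch below assumes 30 fps and ignores any overlap that transitions might introduce. The function name `frame_ranges` is hypothetical and is not part of compile_motion.

```python
def frame_ranges(durations_s: list, fps: int = 30) -> list:
    """Return (start_frame, duration_in_frames) for each scene, in order."""
    ranges = []
    start = 0
    for duration in durations_s:
        # Round to the nearest whole frame so scenes never land between frames.
        length = round(duration * fps)
        ranges.append((start, length))
        start += length
    return ranges


print(frame_ranges([2.0, 3.5, 1.5]))  # [(0, 60), (60, 105), (165, 45)]
```

Each per-scene frame count corresponds to the kind of value a Remotion `<Series.Sequence>` takes as `durationInFrames`.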
Motion critique
The critique_motion tool provides a qualitative assessment of compiled motion output. While evaluate_sequence scores the plan numerically, critique_motion reviews the actual compiled result for:
- Timing quality — do animations feel natural or mechanical?
- Transition smoothness — are cuts jarring or intentional?
- Camera movement — is intensity appropriate for the content?
- Overall coherence — does the sequence tell a visual story?
- Performance concerns — are there heavy compositions that may render slowly?
Critique returns structured feedback with specific, actionable suggestions rather than generic quality scores.
Use evaluate_sequence during planning to catch structural issues early. Use critique_motion after compilation to assess the final output quality.
Findings and severities
Both evaluate_sequence and critique_motion return findings with three severity levels:
| Severity | Meaning | Example |
|---|---|---|
| warning | Significant issue that likely impacts quality | "Scene 3 duration 5.2s exceeds max_hold_duration 4.0s" |
| info | Minor observation worth noting | "All scenes use the same visual weight" |
| suggestion | Improvement opportunity, not a problem | "Consider adding variety — 3 consecutive hard cuts" |
Evaluation output structure
{
  "score": 78,
  "dimensions": {
    "pacing": { "score": 85, "findings": [...] },
    "variety": { "score": 72, "findings": [...] },
    "flow": { "score": 80, "findings": [...] },
    "adherence": { "score": 75, "findings": [...] }
  },
  "findings": [
    { "severity": "warning", "dimension": "pacing", "message": "...", "scene_index": 2 },
    { "severity": "info", "dimension": "variety", "message": "...", "scene_index": 4 }
  ]
}
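A consumer of this output will often want to triage findings by severity. The sketch below groups the top-level `findings` array and prints warnings first. The sample result and its messages are invented for illustration, and `findings_by_severity` is a hypothetical helper, not part of the tool's API.

```python
from collections import defaultdict

# Hypothetical result shaped like the structure above; messages are invented.
result = {
    "score": 78,
    "findings": [
        {"severity": "warning", "dimension": "pacing",
         "message": "Scene 3 holds too long", "scene_index": 2},
        {"severity": "info", "dimension": "variety",
         "message": "Same visual weight throughout", "scene_index": 4},
        {"severity": "warning", "dimension": "flow",
         "message": "Peak energy arrives late", "scene_index": 5},
    ],
}

SEVERITY_ORDER = ["warning", "info", "suggestion"]


def findings_by_severity(evaluation: dict) -> dict:
    """Group findings by severity, ordered from most to least serious."""
    grouped = defaultdict(list)
    for finding in evaluation.get("findings", []):
        grouped[finding["severity"]].append(finding)
    # Keep only the severities that actually occur, in a fixed order.
    return {level: grouped[level] for level in SEVERITY_ORDER if level in grouped}


for level, items in findings_by_severity(result).items():
    for item in items:
        print(f"[{level}] scene_index {item['scene_index']} "
              f"({item['dimension']}): {item['message']}")
```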
Try it
Try asking your AI:
- Evaluate this sequence manifest against the prestige style pack and show me the scores.
- My sequence scored 62 on variety. What changes would improve it?
- Compile this planned sequence to motion output and then critique the result.
- Run benchmarks on this compiled animation to check render performance.