Mental model
A workflow run is a directory on disk containing a run.json file (plus logs and deliverables). The worker executes nodes in order and persists each node’s output; downstream nodes reference upstream outputs via {{ }} template variables.
Template variables
At runtime, ClawRecipes replaces {{vars}} in strings inside node configs.
There are two broad categories:
Global vars (always available):
- {{date}}
- {{run.id}}
- {{workflow.id}}
- {{workflow.name}}
- {{node.id}}
Node output vars (available for each upstream node):
- {{someNode.output}}: the full stored output envelope
- {{someNode.text}}: the most common “payload string” field
- {{someNode.someField}}: a field extracted from JSON inside the node’s text (when applicable)
Notes:
- If an upstream node stores JSON inside its text field, the template system can extract nested fields (e.g. {{draft_assets.video_brief}}).
- If you see “garbled JSON” showing up in a media prompt, you’re usually templating the envelope (e.g. {{draft_assets.output}}) instead of the intended field (e.g. {{draft_assets.text}} or {{draft_assets.video_brief}}).
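The lookup rules above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the worker's actual code: the TemplateContext and NodeOutput shapes, the function names, and the choice to leave unresolved variables untouched are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real ClawRecipes types may differ.
type NodeOutput = { text: string; [key: string]: unknown };
type TemplateContext = {
  globals: Record<string, string>;   // e.g. "date", "run.id"
  nodes: Record<string, NodeOutput>; // stored output envelopes, keyed by node id
};

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function resolveVar(path: string, ctx: TemplateContext): string | undefined {
  if (hasOwn(ctx.globals, path)) return ctx.globals[path];

  const dot = path.indexOf(".");
  if (dot < 0) return undefined;
  const nodeId = path.slice(0, dot);
  const field = path.slice(dot + 1);
  if (!hasOwn(ctx.nodes, nodeId)) return undefined;
  const envelope = ctx.nodes[nodeId];

  if (field === "output") return JSON.stringify(envelope); // whole envelope
  if (field === "text") return envelope.text;              // payload string

  // Any other field: try to read it from JSON stored inside `text`.
  try {
    const parsed: unknown = JSON.parse(envelope.text);
    if (typeof parsed === "object" && parsed !== null && hasOwn(parsed, field)) {
      const value = (parsed as Record<string, unknown>)[field];
      return typeof value === "string" ? value : JSON.stringify(value);
    }
  } catch {
    // `text` is not JSON, so there is nothing to extract.
  }
  return undefined;
}

function renderTemplate(template: string, ctx: TemplateContext): string {
  // Assumption: unknown variables are left as-is rather than blanked out.
  return template.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (match: string, path: string) => {
    const value = resolveVar(path, ctx);
    return value === undefined ? match : value;
  });
}

const exampleCtx: TemplateContext = {
  globals: { "run.id": "run-001" },
  nodes: { draft_assets: { text: '{"video_brief":"A cat on a skateboard"}' } },
};

// Templating the extracted field yields a clean prompt:
console.log(renderTemplate("Brief: {{draft_assets.video_brief}}", exampleCtx));
// prints "Brief: A cat on a skateboard"

// Templating the envelope yields escaped JSON, the "garbled JSON" symptom:
console.log(renderTemplate("Brief: {{draft_assets.output}}", exampleCtx));
```

The second call shows why a media prompt built from {{draft_assets.output}} looks garbled: the whole envelope is serialized, including the JSON string nested inside text.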
Node types
This section describes the important node types and the config fields the worker actually reads.
LLM nodes
LLM nodes execute via the llm-task tool.
Common config fields (stored under node.config):
- promptTemplate (string): the prompt content, after template-var substitution
- timeoutMs (number, optional)
- model (string, optional): provider/model selection (node → workflow → global precedence)
outputFields (structured output)
If node.config.outputFields is present, ClawRecipes will:
- Append an explicit OUTPUT FORMAT section to the prompt
- Derive a JSON Schema and pass it to llm-task
Field types:
- text → JSON string
- list → array of strings
- json → JSON object
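To make the type mapping concrete, here is a rough sketch of how a JSON Schema might be derived from outputFields. The OutputField shape (a name plus a type), the function names, and the decision to mark every field as required are assumptions for illustration; the actual logic lives in the worker and may differ.

```typescript
// Hypothetical shape of one outputFields entry; the real config may differ.
type OutputFieldType = "text" | "list" | "json";
type OutputField = { name: string; type: OutputFieldType };

type JsonSchema = Record<string, unknown>;

// Map each field type to the JSON Schema fragment described above.
function schemaForField(type: OutputFieldType): JsonSchema {
  switch (type) {
    case "text":
      return { type: "string" };
    case "list":
      return { type: "array", items: { type: "string" } };
    case "json":
      return { type: "object" };
  }
}

function deriveSchema(fields: OutputField[]): JsonSchema {
  const properties: Record<string, JsonSchema> = {};
  for (const field of fields) {
    properties[field.name] = schemaForField(field.type);
  }
  return {
    type: "object",
    properties,
    // Assumption: every declared field is required.
    required: fields.map((field) => field.name),
  };
}

const exampleSchema = deriveSchema([
  { name: "hook", type: "text" },
  { name: "hashtags", type: "list" },
  { name: "video_brief", type: "json" },
]);
console.log(JSON.stringify(exampleSchema, null, 2));
```

Passing a schema alongside the prompt lets the tool reject malformed responses instead of relying on the OUTPUT FORMAT text alone.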
Media nodes (image/video/audio)
Media nodes ultimately produce a file deliverable. ClawRecipes invokes a MediaDriver selected by provider.
Provider selection:
- In workflow JSON, providers are referenced as skill-<slug>
- At runtime the worker strips the skill- prefix and looks up a driver by slug
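The prefix-stripping lookup can be sketched as follows. The MediaDriverLike interface, the Map-based registry, and the example-image slug are hypothetical stand-ins; the real MediaDriver interface and registry live under src/lib/workflows/media-drivers/.

```typescript
// Hypothetical minimal driver shape; the real MediaDriver has more members.
interface MediaDriverLike {
  slug: string;
}

const PROVIDER_PREFIX = "skill-";

// "skill-example-image" -> "example-image"; other values pass through unchanged.
function providerToSlug(provider: string): string {
  return provider.startsWith(PROVIDER_PREFIX)
    ? provider.slice(PROVIDER_PREFIX.length)
    : provider;
}

function findDriver(
  provider: string,
  registry: Map<string, MediaDriverLike>,
): MediaDriverLike | undefined {
  return registry.get(providerToSlug(provider));
}

// Usage with a made-up driver slug:
const exampleRegistry = new Map<string, MediaDriverLike>([
  ["example-image", { slug: "example-image" }],
]);
console.log(findDriver("skill-example-image", exampleRegistry)?.slug);
// prints "example-image"
```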
Common media config fields
These are passed through to drivers as opts.config:
- size (string):
  - For images, typically pixel size like 1024x1024, 1792x1024, 1024x1792
  - Some providers interpret this differently (drivers may map pixel sizes to tiers)
- duration (string or number): duration in seconds, e.g. "5s", "10s", or 10
- aspect_ratio (string): e.g. "16:9", "9:16" (provider-specific)
- quality / style: image-provider-specific fields (ex: OpenAI)
- addRefinement (boolean): opt-in prompt refinement pass for video/audio.
  - Default is OFF. When enabled, the worker runs an extra llm-task call to refine the prompt before invoking the driver.
Driver constraints
Drivers can optionally declare duration constraints (minSeconds/maxSeconds/defaultSeconds).
Kitchen surfaces these constraints as UX hints, but the runtime is the source of truth.
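As an illustration of how a driver might combine the duration config with its declared constraints, the sketch below normalizes "5s", "10s", or 10 to seconds and then applies minSeconds/maxSeconds/defaultSeconds. The function names are hypothetical, and clamping out-of-range values is an assumption; a real driver could just as well reject them.

```typescript
type DurationConstraints = {
  minSeconds?: number;
  maxSeconds?: number;
  defaultSeconds?: number;
};

// Accepts 10, "10", or "10s"; returns undefined for anything unparseable.
function parseDurationSeconds(value: string | number | undefined): number | undefined {
  if (typeof value === "number") {
    return Number.isFinite(value) && value > 0 ? value : undefined;
  }
  if (typeof value !== "string") return undefined;
  const match = /^\s*(\d+(?:\.\d+)?)\s*s?\s*$/i.exec(value);
  if (!match) return undefined;
  const seconds = Number(match[1]);
  return seconds > 0 ? seconds : undefined;
}

function resolveDuration(
  value: string | number | undefined,
  constraints: DurationConstraints,
): number | undefined {
  // Fall back to the driver's default when the config value is missing or invalid.
  let seconds = parseDurationSeconds(value) ?? constraints.defaultSeconds;
  if (seconds === undefined) return undefined;
  // Assumption: out-of-range values are clamped rather than rejected.
  if (constraints.minSeconds !== undefined) seconds = Math.max(seconds, constraints.minSeconds);
  if (constraints.maxSeconds !== undefined) seconds = Math.min(seconds, constraints.maxSeconds);
  return seconds;
}

console.log(resolveDuration("30s", { minSeconds: 5, maxSeconds: 10 })); // prints 10
console.log(resolveDuration(undefined, { defaultSeconds: 8 }));         // prints 8
```

Doing this resolution at runtime matches the note above: UI hints can be stale or bypassed, so the value actually sent to the provider is decided when the node runs.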
Tool nodes
Tool nodes call a tool by name with JSON args. Examples:
- fs.append
- outbound.post
- message.send
Human approval nodes
Human approval nodes pause execution until an approval is granted in the UI. Common pattern:
- Upstream LLM generates a draft
- Approval node shows a messageTemplate (with template vars)
- Downstream nodes continue only after approval
Adding new capabilities (where to change code)
- LLM output shaping/validation: src/lib/workflows/workflow-worker.ts
- Media providers: src/lib/workflows/media-drivers/
- CLI for Kitchen dropdown: openclaw recipes workflows media-drivers (handler: src/handlers/media-drivers.ts)
Troubleshooting quick hits
- Provider missing in dropdown: run openclaw recipes workflows media-drivers and check available / missingEnvVars.
- Media node ignores config: confirm the driver reads opts.config (not all scripts support every knob).
- Bad video quality: ensure your prompt is a single clear scene brief; avoid multi-scene paragraphs when the provider needs a seed frame.
