langclaw 0.4.0 — dynamic workflow patterns

0.4.0 ships the workflow pattern cookbook: a running implementation of every orchestration pattern from Anthropic's dynamic workflows guide, plus two new primitives that power them.

Six patterns, all running

Anthropic published six patterns for serious multi-agent work. We built a running example of each — not a tutorial, a real workflow that drives a real job through the full gateway, validated end-to-end with the probe harness.

LANGCLAW__WORKFLOWS__ENABLED=true uv run python -m examples.workflow_patterns
uv run langclaw probe '/workflows run prioritize {"items": ["SSO","dark mode","audit log"], "criterion": "impact per eng-week"}'

Classify-and-act — triage

One model call routes the ticket; the branch does the work. The classifier never acts, the handler never re-classifies.

@app.workflow("triage", input=Ticket, max_concurrency=4)
async def triage(ctx, inp: Ticket) -> str:
    ctx.phase("classify")
    routing = await ctx.llm(
        f"Classify this ticket.\n\n{inp.text}",
        schema=Routing,   # Literal["security", "bug", "feature_request", "question"]
        system=_CLASSIFIER_SYS,
    )
    ctx.log(f"classified as {routing.category}")

    ctx.phase("act")
    if routing.category == "security":
        hits = await ctx.tool("web_search", query=f"{inp.text} CVE advisory")
        assessment = await ctx.llm(f"security ticket:\n{inp.text}", system=_ASSESS_SYS)
        ...
    elif routing.category == "bug":
        ...
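The split between routing and acting can be shown without a model at all. The sketch below is plain Python, not langclaw code: `classify` is a hypothetical keyword stub standing in for the ctx.llm call, and the handlers and their return strings are invented for illustration.

```python
from typing import Callable, Literal, get_args

Category = Literal["security", "bug", "feature_request", "question"]


def classify(text: str) -> Category:
    """Hypothetical stand-in for the ctx.llm classifier: keyword matching only."""
    lowered = text.lower()
    if "cve" in lowered or "vulnerability" in lowered:
        return "security"
    if "crash" in lowered or "error" in lowered:
        return "bug"
    if lowered.rstrip().endswith("?"):
        return "question"
    return "feature_request"


# One handler per category. A handler receives only the ticket text, so it
# has no way to second-guess the routing decision.
HANDLERS: dict[str, Callable[[str], str]] = {
    "security": lambda text: f"escalated to security: {text}",
    "bug": lambda text: f"filed bug: {text}",
    "feature_request": lambda text: f"added to backlog: {text}",
    "question": lambda text: f"answered: {text}",
}

# Every category the classifier can emit must have a handler.
assert set(HANDLERS) == set(get_args(Category))


def triage(text: str) -> str:
    category = classify(text)        # step 1: route, do nothing else
    return HANDLERS[category](text)  # step 2: act, never re-classify


print(triage("App crash on login"))  # filed bug: App crash on login
```

Constraining the classifier's output to a closed set of literals is what makes the dispatch safe: an unknown category cannot reach the handlers.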

Fan-out-and-synthesize — landscape

Each competitor researched by its own subagent — isolated context, its own web_search — so no contender's findings colour another's. Synthesis is a single ctx.llm call over the collected notes.

@app.workflow("landscape", input=Landscape, max_concurrency=5)
async def landscape(ctx, inp: Landscape) -> str:
    ctx.phase("research")
    findings = await ctx.parallel([
        lambda c, name=name: c.subagent(
            "scout",
            f"Research '{name}' as a {inp.subject}.",
        )
        for name in inp.contenders
    ], return_exceptions=True)  # one failing scout doesn't sink the brief

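    ctx.phase("synthesize")
    # Assumed glue, elided in this excerpt: drop failed scouts, join the rest.
    notes = [f for f in findings if not isinstance(f, BaseException)]
    prompt = "\n\n".join(str(n) for n in notes)
    table = await ctx.llm(prompt, system=_COMPARE_SYS)
    return table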
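To see why one failing scout doesn't sink the brief, here is a minimal stdlib sketch of the same idea using asyncio.gather. It assumes ctx.parallel's return_exceptions flag behaves like asyncio's, which the excerpt suggests but does not confirm. `scout` and `research` are hypothetical stand-ins, not langclaw functions.

```python
import asyncio


async def scout(name: str) -> str:
    """Hypothetical stand-in for one research subagent."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if name == "flaky":
        raise RuntimeError(f"search failed for {name}")
    return f"notes on {name}"


async def research(contenders: list[str]) -> tuple[list[str], list[str]]:
    # return_exceptions=True places a raised exception in the result list
    # instead of propagating the first failure to the caller.
    results = await asyncio.gather(
        *(scout(name) for name in contenders), return_exceptions=True
    )
    notes, failed = [], []
    for name, result in zip(contenders, results):
        if isinstance(result, BaseException):
            failed.append(name)
        else:
            notes.append(result)
    return notes, failed


notes, failed = asyncio.run(research(["alpha", "flaky", "beta"]))
print(notes)   # ['notes on alpha', 'notes on beta']
print(failed)  # ['flaky']
```

Keeping the failed names around lets the synthesis step say which contenders are missing rather than silently omitting them.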

Adversarial verification — fact_check

Decompose a draft into atomic claims, then for each claim spawn N skeptic subagents that search independently and try to refute. Majority vote decides. Independence is the whole game — each skeptic searches on its own and never sees the others' findings.

async def verify_one(c, claim: str):
    replies = await c.parallel([
        lambda cc: cc.subagent("skeptic", f"CLAIM: {claim}")
        for _ in range(inp.votes)
    ])
    verdicts = [_verdict(r) for r in replies]
    survived = verdicts.count("supported") > verdicts.count("refuted")
    return {"claim": claim, "survived": survived, "verdicts": verdicts}

results = await ctx.parallel([
    lambda c, cl=cl: verify_one(c, cl) for cl in claims
])
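The excerpt leaves _verdict undefined. One way it could work, sketched under the assumption that each skeptic is prompted to put "VERDICT: SUPPORTED" or "VERDICT: REFUTED" on the first line of its reply (that format and the names `parse_verdict` and `survives` are hypothetical):

```python
def parse_verdict(reply: str) -> str:
    """Hypothetical parser: trust only an exact verdict on the first line."""
    lines = reply.strip().splitlines()
    first = lines[0].strip().lower() if lines else ""
    if first == "verdict: supported":
        return "supported"
    if first == "verdict: refuted":
        return "refuted"
    return "unclear"  # anything off-format counts for neither side


def survives(replies: list[str]) -> bool:
    verdicts = [parse_verdict(r) for r in replies]
    # "supported" must strictly outnumber "refuted"; a tie, or no decided
    # votes at all, fails the claim.
    return verdicts.count("supported") > verdicts.count("refuted")


print(survives(["VERDICT: SUPPORTED", "VERDICT: SUPPORTED", "VERDICT: REFUTED"]))  # True
print(survives(["VERDICT: SUPPORTED", "VERDICT: REFUTED", "no idea"]))              # False
```

Matching the whole first line rather than searching for a substring avoids misreading replies such as "unsupported" or "not supported".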

Generate-and-filter — tagline_studio

Generate candidates from deliberately different angles so the pool is diverse, not N variations of one idea. Score each in parallel with ctx.llm + schema, then keep the inp.keep highest-scoring lines.

ctx.phase("generate")
raw = await ctx.parallel([
    lambda c, angle=angle: c.llm(
        f"PRODUCT: {inp.product}\nANGLE: {angle}",
        system=_WRITE_SYS,
    )
    for angle in ["bold", "playful", "technical", "benefit-led", "contrarian", "minimalist"]
])

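ctx.phase("filter")
# Assumed glue, elided in this excerpt: one tagline per generation call.
candidates = [str(r).strip() for r in raw]
verdicts = await ctx.parallel([
    lambda c, line=line: c.llm(
        f"AUDIENCE: {inp.audience}\nTAGLINE: {line}",
        schema=Score,   # score: int, why: str
        system=_JUDGE_SYS,
    )
    for line in candidates
])
# Assumed glue: pair each tagline with its verdict before ranking.
scored = [{"line": l, "score": v.score, "why": v.why} for l, v in zip(candidates, verdicts)]
kept = sorted(scored, key=lambda s: s["score"], reverse=True)[:inp.keep]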

Tournament — prioritize

Asking a model to score 10 backlog items 1–10 is noisy. Asking "which of these two ships more value" is far more stable. Single-elimination bracket: all duels in a round run in parallel, winners advance.

async def duel(c, a: str, b: str | None) -> str:
    if b is None:
        return a  # bye
    verdict = await c.llm(
        f"CRITERION: {inp.criterion}\nA: {a}\nB: {b}",
        schema=Duel,    # winner: Literal["A", "B"], why: str
        system=_REFEREE_SYS,
    )
    return b if verdict.winner == "B" else a

current, rnd = list(inp.items), 0  # assumed setup, elided in this excerpt
while len(current) > 1:
    rnd += 1
    ctx.phase(f"round {rnd}")
    pairs = [(current[i], current[i+1] if i+1 < len(current) else None)
             for i in range(0, len(current), 2)]
    current = await ctx.parallel([lambda c, a=a, b=b: duel(c, a, b) for a, b in pairs])
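The bracket logic is independent of the model, so it can be exercised on its own. In this stdlib sketch the referee is a hypothetical stub (the longer string wins) and asyncio.gather stands in for ctx.parallel. `run_bracket` is an illustrative name, not part of langclaw.

```python
import asyncio
from typing import Optional


async def duel(a: str, b: Optional[str]) -> str:
    """Hypothetical referee: the longer string wins, standing in for a model call."""
    if b is None:
        return a  # bye: the odd one out advances unopposed
    await asyncio.sleep(0)
    return b if len(b) > len(a) else a


async def run_bracket(items: list[str]) -> tuple[str, int]:
    if not items:
        raise ValueError("need at least one item")
    current, rounds = list(items), 0
    while len(current) > 1:
        rounds += 1
        # Pair neighbours; an odd-length round leaves the last item with a bye.
        pairs = [
            (current[i], current[i + 1] if i + 1 < len(current) else None)
            for i in range(0, len(current), 2)
        ]
        # All duels in a round are independent, so they run concurrently.
        current = list(await asyncio.gather(*(duel(a, b) for a, b in pairs)))
    return current[0], rounds


winner, rounds = asyncio.run(
    run_bracket(["SSO", "dark mode", "audit log", "API", "billing"])
)
print(winner, rounds)  # dark mode 3
```

A single-elimination bracket over n items needs n - 1 real duels spread over ceil(log2 n) rounds, so the number of model calls grows linearly with the backlog while the wall-clock time grows only logarithmically.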

Loop-until-done — edge_hunt

Fixed iteration counts either pad with junk or stop short. Loop instead: each round proposes new items, deduped against everything accumulated, stopping when you hit the target or when consecutive rounds turn up nothing new.

found, seen, dry, rnd = [], set(), 0, 0

while rnd < inp.max_rounds:
    rnd += 1
    ctx.phase(f"round {rnd}")
    proposed = await ctx.llm(
        f"TARGET: {inp.target}\n\nALREADY FOUND:\n{already}",
        schema=Cases,   # cases: list[str]
        system=_HUNTER_SYS,
    )
    fresh = [c for c in proposed.cases if norm(c) not in seen]

    if not fresh:
        dry += 1
        if dry >= inp.patience:
            reason = f"dried up after {dry} empty rounds"
            break
        continue

    dry = 0
    seen.update(norm(c) for c in fresh)
    found.extend(fresh)
    if len(found) >= inp.target_count:
        break
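The stopping rules are easier to check with the model taken out. The sketch below replays scripted proposals through the same loop. `norm` is a hypothetical normaliser, since the excerpt does not show the real one, and `hunt` is an illustrative name. Unlike the excerpt's list comprehension, it also dedups within a single round.

```python
def norm(case: str) -> str:
    """Hypothetical normaliser: case- and whitespace-insensitive dedup key."""
    return " ".join(case.lower().split())


def hunt(rounds_of_proposals, target_count: int, patience: int, max_rounds: int):
    found, seen, dry, rnd = [], set(), 0, 0
    reason = "hit max_rounds"
    proposals = iter(rounds_of_proposals)
    while rnd < max_rounds:
        rnd += 1
        proposed = next(proposals, [])  # stand-in for the model call
        fresh = []
        for case in proposed:
            key = norm(case)
            if key not in seen:  # updating seen here dedups within a round too
                seen.add(key)
                fresh.append(case)
        if not fresh:
            dry += 1
            if dry >= patience:
                reason = f"dried up after {dry} empty rounds"
                break
            continue
        dry = 0  # a productive round resets the patience counter
        found.extend(fresh)
        if len(found) >= target_count:
            reason = "hit target"
            break
    return found, reason, rnd


scripted = [
    ["empty input", "Empty  Input", "unicode"],  # second item is a duplicate
    ["unicode"],                                  # nothing new
    [" EMPTY input "],                            # nothing new again
    ["huge file"],                                # never reached
]
found, reason, rnd = hunt(scripted, target_count=10, patience=2, max_rounds=10)
print(found)   # ['empty input', 'unicode']
print(reason)  # dried up after 2 empty rounds
print(rnd)     # 3
```

Resetting dry after any productive round means patience counts consecutive empty rounds, which matches the stopping rule described above.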

Two primitives power all of it

ctx.llm — one-shot structured judgment

A model call that returns a validated Pydantic object. No tools, no agent loop, no string parsing. Used by every pattern above for classification, scoring, pairwise comparison, and extraction.

class Routing(BaseModel):
    category: Literal["security", "bug", "feature_request", "question"]

routing = await ctx.llm(inp.text, schema=Routing, system=_CLASSIFIER_SYS)
# routing.category is already one of the four literals

This is a langclaw primitive: neither LangChain nor deepagents exposes a bare model call as a first-class durable workflow step with crash-resume. ctx.llm fills that gap.

ctx.subagent — isolated delegation

When a leaf needs multi-step work with its own tools, give it an isolated context. Used by landscape (one scout per competitor) and fact_check (independent skeptics per claim).

The rule: does this step need tools? If yes, ctx.subagent. If no, ctx.llm.


What's also in 0.4

  • tools.llm in the JS sandbox — a one-shot model call for agent-authored (saved) workflows (plain text only — no schema= structured output like Python's ctx.llm; use JSON.parse in the body for structure)
  • Workflow subagent delegation fix: ctx.subagent was previously inert in registered workflows; it now invokes subagent runnables directly
  • Probe harness: langclaw probe for driving the full gateway pipeline in tests without real channel credentials

Source: examples/workflow_patterns/