Probe Harness
The probe harness lets you drive features end-to-end through the real gateway → bus → agent → channel pipeline — without a human tapping on Telegram and without mocking anything out.
Why this matters for agentic development
Testing an agent is not like testing a REST endpoint.
A single user turn passes through: the message bus, session lookup, agent-name resolution, every middleware layer (RBAC, rate limit, content filter, PII), the LangGraph agent loop, tool calls, and finally the outbound channel formatter. The interesting bugs live in that pipeline, not in the agent in isolation.
The standard alternatives miss most of it:
| Approach | What it covers | What it misses |
|---|---|---|
| langclaw agent -m "…" (CLI) | Agent + tools | Bus, session, middleware, command router, workflow→channel progress, active-agent switching |
| Unit tests on individual functions | Function correctness | Integration between components |
| Manual Telegram / Discord | Everything | Can't run in CI, no assertions, no repeatability |
The probe closes that gap: it injects a turn as a real WebSocket message, flows it through the full pipeline, and returns a structured, assertable list of events. No Telegram credentials. No arbitrary sleeps. Deterministic turn completion.
This is how langclaw's own workflow patterns cookbook was validated — every example in examples/workflow_patterns/ was driven end-to-end through the probe before being documented.
Three ways to use it
1. CLI: interactive or scripted
Start an isolated gateway (WebSocket only; every other channel is forced off so probe traffic never reaches a real bot):
langclaw gateway --probe
Then in another terminal:
langclaw probe "summarize today's HN"
langclaw probe "/reset" # command path
langclaw probe "what is X?" --agent researcher --reset # named agent + fresh thread
langclaw probe "/workflows run landscape {\"subject\": \"agent frameworks\", \"contenders\": [\"LangGraph\", \"CrewAI\"]}"
2. Script / notebook
import asyncio
from langclaw.testing import probe, WebSocketProbeTransport, final_text

async def main():
    t = WebSocketProbeTransport("ws://127.0.0.1:18789")
    events = await probe("Say exactly: PONG", transport=t, reset=True)
    print(final_text(events))  # "PONG"

asyncio.run(main())
3. pytest
For CI, run langclaw gateway --probe as a fixture and probe it:
# Assumes an async-capable pytest plugin, e.g. pytest-asyncio in auto mode.
import pytest
from langclaw.testing import probe, WebSocketProbeTransport, final_text

@pytest.fixture(scope="session")
async def gateway(langclaw_app):
    # start your app with gateway --probe, yield, then stop
    ...

async def test_greet_tool(gateway):
    events = await probe(
        "greet Alice",
        transport=WebSocketProbeTransport(),
        reset=True,
    )
    assert "Alice" in final_text(events)

async def test_reset_command(gateway):
    events = await probe("/reset", transport=WebSocketProbeTransport())
    assert any(e.type == "command" and e.is_final for e in events)
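
The fixture body above is left open. One possible shape for it, sketched with the standard library only, is to launch the gateway as a subprocess and poll its TCP port until it accepts connections. The helper names `wait_for_port` and `running_gateway` are hypothetical and not part of langclaw. The port 18789 is taken from the script example, and the assumption that the gateway listens there by default is unverified.

```python
import contextlib
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            # Not listening yet; back off briefly and retry until the deadline.
            time.sleep(0.1)
    return False


@contextlib.contextmanager
def running_gateway(cmd=("langclaw", "gateway", "--probe"),
                    host="127.0.0.1", port=18789):
    """Hypothetical helper: run the gateway for the duration of a with-block."""
    proc = subprocess.Popen(list(cmd))
    try:
        if not wait_for_port(host, port):
            raise RuntimeError("gateway did not start listening in time")
        yield proc
    finally:
        # Always stop the child process, even if startup or the tests failed.
        proc.terminate()
        proc.wait(timeout=10)
```

A session-scoped fixture could wrap `running_gateway()` in a `with` block and `yield` inside it, so the gateway is stopped when the test session ends.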
You can also inject a fake transport to test the probe's normalization logic without a live server:
import asyncio
from langclaw.testing import probe

class FakeTransport:
    def __init__(self, frames):
        self.frames = frames

    async def open(self): ...
    async def close(self): ...
    async def send_turn(self, content, *, agent=None): ...

    async def receive(self):
        for f in self.frames:
            yield f

async def main():
    events = await probe("hi", transport=FakeTransport([
        {"type": "ai_chunk", "content": "Hel"},
        {"type": "ai_chunk", "content": "lo"},
        {"type": "ai_stream_end"},
    ]))
    assert events[-1].type == "ai"
    assert events[-1].content == "Hello"
    assert events[-1].is_final

asyncio.run(main())
ProbeEvent
Every turn returns a list[ProbeEvent]:
| Field | Values | Notes |
|---|---|---|
| type | ai · ai_chunk · tool_progress · tool_result · command · error | |
| content | str | |
| is_final | bool | True on the terminal event of the turn |
| metadata | dict | tool, tool_call_id, args, … for tool events |
| raw | dict | The original frame, unmodified |
ai_chunk deltas stream individually and are assembled into a single final ai event on ai_stream_end. Assert on the stream or the assembled answer — both are there:
chunks = [e for e in events if e.type == "ai_chunk"]
answer = final_text(events) # assembled final text
final_text(events) — assembled answer string.
format_events(events) — labeled lines, same format the CLI prints.
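
To make the chunk-assembly rule concrete, here is a minimal sketch of how raw frames could be normalized into events. It is not langclaw's implementation: `Event`, `normalize`, and `assembled_text` are hypothetical stand-ins that only reuse the documented field names and the `ai_chunk` / `ai_stream_end` frame types.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Event:
    """Hypothetical stand-in for ProbeEvent, using the documented field names."""
    type: str
    content: str = ""
    is_final: bool = False
    metadata: dict[str, Any] = field(default_factory=dict)
    raw: dict[str, Any] = field(default_factory=dict)


def normalize(frames: list[dict[str, Any]]) -> list[Event]:
    """Keep each chunk as its own event and emit one final 'ai' event at stream end."""
    events: list[Event] = []
    buffer: list[str] = []
    for frame in frames:
        kind = frame.get("type", "")
        text = frame.get("content", "")
        if kind == "ai_chunk":
            buffer.append(text)
            events.append(Event("ai_chunk", text, raw=frame))
        elif kind == "ai_stream_end":
            # Join the buffered deltas into the assembled answer.
            events.append(Event("ai", "".join(buffer), is_final=True, raw=frame))
            buffer.clear()
        else:
            events.append(Event(kind, text, raw=frame))
    return events


def assembled_text(events: list[Event]) -> str:
    """Return the content of the last final 'ai' event, or '' if there is none."""
    for event in reversed(events):
        if event.type == "ai" and event.is_final:
            return event.content
    return ""
```

Keeping the individual chunks alongside the assembled event is what allows a test to assert on either the stream or the final answer.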
What --probe does differently from a normal gateway
langclaw gateway --probe forces every channel except WebSocket off at the assembly seam — your .env and config files are never touched. This means:
- No Telegram bot polling, no Discord connection, no Slack socket
- Probe traffic is isolated from live users
- Safe to run in CI alongside a production bot
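
The idea of forcing channels off without touching configuration can be illustrated as an override applied to an in-memory copy. This is only a sketch: the `channels` / `enabled` dictionary layout and the function name `probe_channels` are assumptions made for illustration, not langclaw's actual config schema.

```python
import copy


def probe_channels(config: dict) -> dict:
    """Return a copy of config in which only the 'websocket' channel is enabled.

    The {"channels": {name: {"enabled": bool}}} layout is hypothetical.
    """
    isolated = copy.deepcopy(config)  # never mutate the caller's config
    channels = isolated.setdefault("channels", {})
    for name, settings in channels.items():
        settings["enabled"] = (name == "websocket")
    # Make sure WebSocket is on even if the original config never mentioned it.
    channels.setdefault("websocket", {})["enabled"] = True
    return isolated
```

Because the override works on a deep copy, whatever was loaded from `.env` or config files stays exactly as it was.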
Transports
| Transport | Install | When to use |
|---|---|---|
| WebSocketProbeTransport | langclaw[websocket] (included by default) | Feature testing, CI (~95% of use) |
| TelegramProbeTransport | langclaw[telegram-e2e] | Smoke-test Telegram-specific rendering (Markdown, chunking, tool-progress formatting) |
The Telegram transport needs a user account (not a bot) — a bot can't DM another bot. A one-time human login mints a TELETHON_SESSION string stored in env. Because there's no stream-end signal over MTProto, turn completion is idle-timeout based — use it for occasional smoke tests, not per-feature CI.
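
Idle-timeout completion can be sketched in a few lines: keep reading messages, and treat the turn as finished once nothing new arrives within a quiet window. The function name `collect_until_idle` and the use of an `asyncio.Queue` as the message source are assumptions for illustration. The real transport reads from Telegram rather than a queue.

```python
import asyncio


async def collect_until_idle(queue: asyncio.Queue, idle_timeout: float) -> list:
    """Collect messages until none arrives for idle_timeout seconds."""
    messages = []
    while True:
        try:
            # The timer restarts on every message, so only silence ends the turn.
            msg = await asyncio.wait_for(queue.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            return messages
        messages.append(msg)
```

Every turn pays at least `idle_timeout` seconds of waiting, and a slow reply can be cut off early, which is why this mode suits occasional smoke tests better than per-feature CI.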
Honest limits
- The probe does not start or manage the gateway lifecycle — you start it explicitly.
- The WebSocket transport needs langclaw[websocket] (included in the base install).
- Telegram transport: langclaw[telegram-e2e] + TELEGRAM_API_ID, TELEGRAM_API_HASH, TELETHON_SESSION in env.
- Per-turn timeout defaults to 60 s. On expiry the probe appends a terminal error event and returns; it never hangs.
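
The timeout behaviour in the last item can be sketched as follows. This is an illustration under assumptions, not the probe's source: `run_turn` is a hypothetical name, frames are plain dictionaries, and a frame carrying `is_final` is taken to end the turn.

```python
import asyncio
from typing import AsyncIterator


async def run_turn(frames: AsyncIterator[dict], timeout: float = 60.0) -> list[dict]:
    """Collect frames until a final one arrives; on timeout, append a terminal error."""
    events: list[dict] = []

    async def consume() -> None:
        async for frame in frames:
            events.append(frame)
            if frame.get("is_final"):
                return

    try:
        await asyncio.wait_for(consume(), timeout=timeout)
    except asyncio.TimeoutError:
        # Whatever was received is kept; the caller still gets a terminal event.
        events.append({
            "type": "error",
            "content": f"turn timed out after {timeout} s",
            "is_final": True,
        })
    return events
```

Either way the returned list ends in a final event, so an assertion on the last element works in both the success and the timeout case.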