Probe Harness
Drive features end-to-end through the real channel pipeline — without a human tapping on Telegram.
langclaw agent -m "…" builds the agent directly and bypasses the gateway, bus,
command router, channel-context middleware, session/active-agent resolution, and
the workflow→channel progress projection. The probe injects a user turn
through the actual gateway → bus → agent → channel path and returns the
response as a structured, assertable event list.
```text
┌── probe(content, transport) ───────────────────────────────────────────┐
│  ProbeTransport (injected seam)        Probe core (deep module)        │
│  ─────────────────────────────         ──────────────────────────      │
│  WebSocketProbeTransport ──frames──▶   collect → detect terminal →     │
│  TelegramProbeTransport                normalize → list[ProbeEvent]    │
│  <fake in tests>                                                       │
└────────────────────────────────────────────────────────────────────────┘
```
Two transports, one interface
- WebSocket (the workhorse). Connects to a running gateway as an ordinary WS
  client. Deterministic: the channel emits `{"type":"ai_stream_end"}` on the
  final chunk, so the probe knows exactly when a turn is done, with no
  arbitrary sleeps. Reuses the existing WS JSON contract verbatim; no change to
  `WebSocketChannel`. Used for ~95% of feature testing.
- Telegram (narrow smoke). A Telegram user account (Telethon) sends to the
  bot and reads its replies, only to catch Telegram-specific rendering
  (Markdown, chunking, tool-progress formatting). No stream-end signal, so a turn
  completes by idle-timeout; the transport synthesises an `ai_stream_end` frame
  at that point so the core's completion logic stays identical. Occasional use.
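The idle-timeout completion described for the Telegram transport can be pictured with a small sketch. This is not langclaw's implementation: `drain_until_idle`, the queue-based delivery, and the frame shapes fed into it are assumptions made for illustration. It only shows how a period of silence can be turned into a synthetic `ai_stream_end` frame.

```python
import asyncio


async def drain_until_idle(queue: asyncio.Queue, idle_timeout: float) -> list:
    """Collect frames until the queue stays empty for idle_timeout seconds.

    Hypothetical helper: the real Telegram transport may be structured differently.
    """
    frames = []
    while True:
        try:
            frame = await asyncio.wait_for(queue.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            # Silence for idle_timeout seconds is treated as the end of the turn.
            # The synthetic frame lets the core reuse its WebSocket completion logic.
            frames.append({"type": "ai_stream_end"})
            return frames
        frames.append(frame)


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    # Stand-ins for two bot replies that arrived before the bot went quiet.
    queue.put_nowait({"type": "ai_chunk", "content": "part 1"})
    queue.put_nowait({"type": "ai_chunk", "content": "part 2"})
    return await drain_until_idle(queue, idle_timeout=0.05)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The trade-off is visible in the sketch: every turn costs at least one full idle window, and a slow reply can be cut off, which is why this path suits occasional smoke checks rather than per-feature testing.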
Both implement the ProbeTransport protocol (open / close / send_turn /
receive). The transport is the injected, testable seam — mirroring the
injected executor / script-runner of the workflow runtime.
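As a rough illustration of that seam, the four methods can be written down as a `typing.Protocol`. The method names come from the text above; the type annotations, the class name `ProbeTransportSketch`, and the `CannedTransport` stand-in are assumptions, not the library's actual definitions.

```python
from typing import Any, AsyncIterator, Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class ProbeTransportSketch(Protocol):
    """Hypothetical shape of the transport seam; annotations are guesses."""

    async def open(self) -> None: ...

    async def close(self) -> None: ...

    async def send_turn(self, content: str, *, agent: Optional[str] = None) -> None: ...

    def receive(self) -> AsyncIterator[Dict[str, Any]]: ...


class CannedTransport:
    """In-memory stand-in that replays a fixed frame sequence."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.sent = []

    async def open(self) -> None:
        pass

    async def close(self) -> None:
        pass

    async def send_turn(self, content, *, agent=None) -> None:
        # Record what the probe sent so a test can assert on it.
        self.sent.append((content, agent))

    async def receive(self):
        for frame in self._frames:
            yield frame


# Structural typing: no inheritance is needed to satisfy the protocol.
# Note that runtime isinstance checks only verify that the methods exist.
print(isinstance(CannedTransport([]), ProbeTransportSketch))
```

Because the check is structural, any object with these four methods can be injected, which is what makes a fake transport in tests possible.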
Three ways in
1. Run an isolated gateway, then probe it. --probe runs a WebSocket-only
gateway with every other channel forced off (applied at the channel-assembly
seam — your config file is never mutated), so test traffic never reaches a real
Telegram/Discord chat:
```bash
langclaw gateway --probe                   # WS-only on the configured port
langclaw probe "summarize today's news"    # in another terminal
langclaw probe "/reset"                    # command path (bypasses bus+LLM)
langclaw probe "what's X?" --agent researcher --reset
langclaw probe "render this" --transport telegram --bot @my_test_bot
```
2. Importable function — for uv run python -c or any script:
```python
import asyncio

from langclaw.testing import probe, WebSocketProbeTransport, final_text


async def main():
    events = await probe(
        "Say exactly: HELLO",
        transport=WebSocketProbeTransport("ws://127.0.0.1:18789"),
        reset=True,
    )
    print(final_text(events))  # -> "HELLO"


asyncio.run(main())
```
3. pytest helper — inject a fake transport that yields a canned frame sequence and assert on the normalized events (no live server):
```python
import asyncio

from langclaw.testing import probe


class FakeTransport:
    """Replays canned frames; satisfies the ProbeTransport protocol."""

    def __init__(self, frames):
        self.frames = frames

    async def open(self): ...
    async def close(self): ...
    async def send_turn(self, content, *, agent=None): ...

    async def receive(self):
        for f in self.frames:
            yield f


def test_chunks_assemble_into_final_answer():
    # asyncio.run keeps the test independent of any async pytest plugin.
    events = asyncio.run(probe("hi", transport=FakeTransport([
        {"type": "ai_chunk", "content": "Hel"},
        {"type": "ai_chunk", "content": "lo"},
        {"type": "ai_stream_end"},
    ])))
    assert events[-1].type == "ai" and events[-1].content == "Hello"
```
ProbeEvent
A normalized record, isolating callers from per-transport frame differences:
| field | meaning |
|---|---|
| `type` | one of `ai`, `ai_chunk`, `tool_progress`, `tool_result`, `command`, `error` |
| `content` | text payload |
| `is_final` | terminal event of the turn (assembled `ai`, `command`, or `error`) |
| `metadata` | passthrough (`tool`, `tool_call_id`, `args`, …) |
| `raw` | the original frame dict |
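One way to picture the record is as a small frozen dataclass. The field names are taken from the table; the defaults, the class name `ProbeEventSketch`, the helper `event_from_frame`, and the rule that every frame key other than `type` and `content` becomes metadata are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Event types the table marks as terminal for a turn.
TERMINAL_EVENT_TYPES = {"ai", "command", "error"}


@dataclass(frozen=True)
class ProbeEventSketch:
    """Hypothetical stand-in for ProbeEvent; defaults are guesses."""

    type: str
    content: str = ""
    is_final: bool = False
    metadata: Dict[str, Any] = field(default_factory=dict)
    raw: Dict[str, Any] = field(default_factory=dict)


def event_from_frame(frame: Dict[str, Any]) -> ProbeEventSketch:
    # Assumption: every key other than type/content is passthrough metadata.
    metadata = {k: v for k, v in frame.items() if k not in ("type", "content")}
    return ProbeEventSketch(
        type=frame["type"],
        content=frame.get("content", ""),
        is_final=frame["type"] in TERMINAL_EVENT_TYPES,
        metadata=metadata,
        raw=frame,
    )


if __name__ == "__main__":
    frame = {"type": "tool_progress", "content": "searching", "tool": "web_search"}
    print(event_from_frame(frame))
```

Keeping `raw` alongside the normalized fields means a test can still reach transport-specific details when the normalized view is not enough.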
ai_chunk deltas are emitted individually and assembled into a final ai
event on ai_stream_end, so you can assert on either the stream or the answer.
final_text(events) returns the assembled answer; format_events(events)
renders the labeled lines the CLI prints.
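The assembly step described above can be sketched with plain dicts. `assemble` and `final_text_sketch` are hypothetical names, and the real `final_text` may handle more cases (for example command or error turns); the sketch only shows chunk deltas being kept individually and then joined into one final `ai` event.

```python
from typing import Any, Dict, List


def assemble(frames: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep each ai_chunk as its own event and add a joined ai event at stream end."""
    events: List[Dict[str, Any]] = []
    parts: List[str] = []
    for frame in frames:
        kind = frame["type"]
        if kind == "ai_chunk":
            parts.append(frame.get("content", ""))
            events.append({"type": "ai_chunk", "content": frame.get("content", ""), "is_final": False})
        elif kind == "ai_stream_end":
            # The end marker carries no text; it triggers the assembled answer.
            events.append({"type": "ai", "content": "".join(parts), "is_final": True})
            parts.clear()
        else:
            # Assumption: other frame types pass through with their own content.
            events.append({"type": kind, "content": frame.get("content", ""),
                           "is_final": kind in {"ai", "command", "error"}})
    return events


def final_text_sketch(events: List[Dict[str, Any]]) -> str:
    """Return the content of the last final ai event, or an empty string."""
    for event in reversed(events):
        if event["type"] == "ai" and event["is_final"]:
            return event["content"]
    return ""


if __name__ == "__main__":
    frames = [
        {"type": "ai_chunk", "content": "Hel"},
        {"type": "ai_chunk", "content": "lo"},
        {"type": "ai_stream_end"},
    ]
    print(final_text_sketch(assemble(frames)))
```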
A frame ends the turn when its type is ai, ai_stream_end, command, or
error. A per-turn timeout (default 60s) is a safety net: on expiry the probe
appends a terminal error event and returns whatever arrived — it never hangs
and never silently truncates.
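A minimal sketch of that collection loop, assuming frames arrive as an async iterator of dicts: `collect_turn` and the wording of the error message are hypothetical, while the terminal types and the append-an-error-on-timeout behaviour follow the description above.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List

# Frame types that end a turn, as listed above.
TERMINAL_TYPES = {"ai", "ai_stream_end", "command", "error"}


async def collect_turn(
    frames: AsyncIterator[Dict[str, Any]], timeout: float = 60.0
) -> List[Dict[str, Any]]:
    """Gather frames until a terminal one arrives or the per-turn timeout expires."""
    collected: List[Dict[str, Any]] = []

    async def drain() -> None:
        async for frame in frames:
            collected.append(frame)
            if frame.get("type") in TERMINAL_TYPES:
                return

    try:
        await asyncio.wait_for(drain(), timeout=timeout)
    except asyncio.TimeoutError:
        # Keep whatever arrived and make the truncation explicit.
        collected.append(
            {"type": "error", "content": f"timed out after {timeout}s", "is_final": True}
        )
    return collected


async def stalled_source():
    """Stand-in for a turn that emits one chunk and then never finishes."""
    yield {"type": "ai_chunk", "content": "Hel"}
    await asyncio.sleep(3600)


if __name__ == "__main__":
    print(asyncio.run(collect_turn(stalled_source(), timeout=0.05)))
```

Appending an explicit error event, rather than returning a shorter list, lets an assertion on the last event distinguish a timed-out turn from a completed one.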
Honest limits
- The WS transport needs the `websocket` extra; the Telegram transport needs
  the `telegram-e2e` extra (telethon) plus a one-time human login that mints a
  `TELETHON_SESSION` string (a bot cannot DM another bot; simulating a user
  needs a user account via MTProto). Secrets via gitignored env:
  `TELEGRAM_API_ID`, `TELEGRAM_API_HASH`, `TELETHON_SESSION`.
- The Telegram path is bounded by idle-timeout, not a stream-end signal, so it
  is occasional smoke, not per-feature testing.
- The probe does not start or lifecycle-manage the gateway; you start it
  explicitly (e.g. `langclaw gateway --probe`).