Examples
The repo’s examples/ directory ships nineteen runnable scripts.
Each exercises one feature end-to-end against the framework, with
no external dependencies beyond the model API. They use Loom’s own
loader / vector store / agent / workflow constructs; nothing is
pulled in from outside the framework.
# .env should contain OPENAI_API_KEY=sk-...
# Agent + retrieval + memory:
python examples/01_rag_pdf.py # default backend (unstructured)
python examples/01_rag_pdf.py --backend docling # alt PDF backend
python examples/02_specialist_debate.py
python examples/03_multi_user_sessions.py
python examples/04_structured_outputs.py
python examples/05_memory_showcase.py
# Workflow primitives:
python examples/06_workflow_chain.py # no API key required
python examples/07_workflow_route.py
python examples/08_workflow_loop.py
python examples/09_workflow_as_tool.py
python examples/10_workflow_architecture.py
python examples/11_workflow_custom_step.py
# Production observability — no API key required:
python examples/12_audit_log.py
python examples/13_telemetry.py
# Reasoning effort across providers (ANTHROPIC_API_KEY):
python examples/14_effort_dial.py
# Declarative config (no API key required):
python examples/15_config_file.py
# Shared workspace + prompt caching + living plan:
python examples/16_shared_workspace.py # OPENAI_API_KEY
python examples/17_prompt_caching.py # ANTHROPIC_API_KEY or OPENAI_API_KEY
python examples/18_living_plan.py # no API key required
python examples/19_workspace_lifecycle.py # no API key required
Each example ships a graceful skip when OPENAI_API_KEY isn’t set:
it prints a hint and exits 0 so a make examples-style runner
doesn’t fail.
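The skip behaviour can be sketched roughly as below. This is an illustration, not the code the examples ship: the helper name `require_env` and the hint wording are assumptions.

```python
import os


def require_env(var: str, env=None) -> bool:
    """Return True if `var` is set; otherwise print a hint and return False.

    Hypothetical helper: the name and the message are illustrative only.
    """
    env = os.environ if env is None else env
    if env.get(var):
        return True
    print(f"[skip] {var} is not set; add it to .env to run this example.")
    return False


def main(env=None) -> int:
    if not require_env("OPENAI_API_KEY", env):
        return 0  # exit code 0, so a batch runner counts the skip as success
    # ... the real example body would run here ...
    return 0

# In a script, finish with: raise SystemExit(main())
```

Returning 0 rather than a non-zero code is what lets a runner that loops over every script keep going when a key is missing.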
01. RAG over PDFs
Single-agent RAG over a folder of PDFs. Loader → chunker → vector store → retriever-as-tool → agent. About 100 lines.
examples/data/general/
company_handbook.pdf
engineering_guide.pdf
security_policy.pdf
support_runbook.pdf
│
▼ load_pdf(pdf, backend="unstructured" | "docling")
Document(content=<markdown>)
│
▼ RecursiveChunker(chunk_size=600).split(...)
list[Chunk]
│
▼ ChromaVectorStore.add(chunks)
indexed collection 'general_docs_<backend>'
│
▼ @tool search_docs(query): wraps store.search(query, k=4)
Agent(model="gpt-4.1-mini", tools=[search_docs])
The backend is picked at the CLI: --backend unstructured (default,
Apache 2.0, what LangChain wraps) or --backend docling (MIT, IBM
Research, 2026 best-in-class on native PDFs). Each backend lands in
its own Chroma collection / persist directory (general_docs_<backend>)
so swapping backends doesn’t require manual cache busting.
pip install 'loomflow[loader-pdf,vectorstore-chroma,openai]' # default
pip install 'loomflow[loader-pdf-docling,vectorstore-chroma,openai]' # for --backend docling
The example imports load_pdf directly to surface the backend
choice. The auto-dispatch load(pdf) also works (uses the
unstructured default). Pick whichever fits your code style.
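To make the chunking and per-backend collection steps concrete, here is a minimal stand-in. It is not Loom’s RecursiveChunker; `recursive_split` and `collection_name` are hypothetical names, and the separator order is an assumption. It shows only the general idea: split on the coarsest separator first and fall back to finer ones when a piece is still too long.

```python
def recursive_split(text: str, chunk_size: int = 600,
                    separators: tuple = ("\n\n", "\n", " ")) -> list:
    """Split `text` into pieces of at most `chunk_size` characters."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing parts into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is too long: retry with finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


def collection_name(backend: str) -> str:
    """One collection per backend, so indices never mix."""
    return f"general_docs_{backend}"
```

Keying the collection name on the backend is what makes swapping backends safe: each one reads and writes its own index.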
See: End-to-end RAG tutorial for a guided walkthrough · PDF loader for the backend / strategy reference.
Read: examples/01_rag_pdf.py
02. Specialist debate
Five domain specialists (IT / physics / medicine / finance / law),
each with their own folder of PDFs and their own Chroma collection,
composed via Team.debate(...) with a synthesising judge agent.
examples/data/it/ examples/data/physics/ ...
it_runbook.pdf physics_notes.pdf ...
│ │
▼ ▼
Chroma 'it_docs' Chroma 'physics_docs' ...
│ │
▼ ▼
search_it_docs search_physics_docs ...
│ │
▼ ▼
Agent (IT tech) Agent (Physicist) ...
Team.debate(
debaters=[it, phys, med, fin, law],
judge=Agent("...synthesis judge..."),
rounds=1,
)
See: Multi-Agent Debate.
Read: examples/02_specialist_debate.py
03. Multi-user sessions
Multi-user namespacing + conversation continuity on one shared
Agent + InMemoryMemory. Demonstrates that user_id is a hard
partition (Alice’s history never surfaces in Bob’s recall) and that
reusing session_id rehydrates prior turns as real chat history.
Also shows tools reading scope via get_run_context().
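The partitioning idea can be illustrated with a toy store. This is not InMemoryMemory; the class and method names are hypothetical. It shows only why keying on user_id makes cross-user leakage impossible by construction, and why reusing a session_id brings earlier turns back.

```python
from collections import defaultdict


class PartitionedHistory:
    """Toy history store keyed by (user_id, session_id). Illustrative only."""

    def __init__(self) -> None:
        self._turns = defaultdict(list)

    def append(self, user_id: str, session_id: str, message: str) -> None:
        self._turns[(user_id, session_id)].append(message)

    def history(self, user_id: str, session_id: str) -> list:
        # The lookup needs the exact user_id, so one user's turns are never
        # returned for another, even if both share a session_id.
        return list(self._turns.get((user_id, session_id), []))


store = PartitionedHistory()
store.append("alice", "s1", "My order number is 1234.")
store.append("alice", "s1", "Where is it now?")
```

Here `store.history("alice", "s1")` returns both turns in order, while `store.history("bob", "s1")` is empty.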
Read: examples/03_multi_user_sessions.py
04. Structured outputs
Type-safe structured outputs. Define a Pydantic BaseModel, pass it
as output_schema=, get a validated typed instance back on
result.parsed. Demonstrates schema-driven extraction (a
MeetingSummary with nested ActionItems, ISO dates, sentiment
enum) from a raw meeting transcript.
from typing import Literal

from pydantic import BaseModel
from loomflow import Agent

class ActionItem(BaseModel):
    owner: str
    description: str
    due_date: str | None  # ISO date string, or None

class MeetingSummary(BaseModel):
    title: str
    attendees: list[str]
    decisions: list[str]
    actions: list[ActionItem]
    sentiment: Literal["positive", "neutral", "negative"]

agent = Agent("Extract a structured summary.", model="gpt-4.1-mini")
# `await` must run inside an async function; `transcript` is the raw meeting text.
result = await agent.run(transcript, output_schema=MeetingSummary)
summary: MeetingSummary = result.parsed # validated, typed
Read: examples/04_structured_outputs.py
05. Memory showcase
Every memory backend behind one parameter. Walks through
inmemory / sqlite / chroma / postgres / redis
(Postgres/Redis skip gracefully without a DSN), demonstrates
profile(user_id=) / forget(user_id=) / export(user_id=) GDPR
ops, and shows the Consolidator extracting structured facts from
raw chat episodes. The memory= parameter is the only thing that
changes between backends.
Read: examples/05_memory_showcase.py
Workflow primitives
Each file is small (50–200 lines) and demonstrates one workflow pattern in isolation. Read them in order. Each builds on the previous one’s vocabulary.
06. Linear chain (no LLM)
Linear Workflow.chain([...]) of plain async functions. The
simplest possible workflow shape. No LLM involved, no API key
required. Touches RunContext propagation,
WorkflowResult.visited, per_step introspection.
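As a rough picture of what a linear chain does, the sketch below runs plain async functions in order and records which steps ran and what each returned. It is a stand-in, not Workflow.chain itself: `run_chain` and the two step functions are hypothetical, and the returned tuple only mimics the visited / per_step idea.

```python
import asyncio


async def run_chain(steps, value):
    """Feed `value` through each async step in order."""
    visited, per_step = [], {}
    for step in steps:
        value = await step(value)
        visited.append(step.__name__)     # order in which steps ran
        per_step[step.__name__] = value   # output of each step
    return value, visited, per_step


async def strip_text(text: str) -> str:
    return text.strip()


async def shout(text: str) -> str:
    return text.upper()


result, visited, per_step = asyncio.run(run_chain([strip_text, shout], "  hello "))
```

No model is involved anywhere, which is why this shape runs without an API key.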
See: Workflow.chain.
Read: examples/06_workflow_chain.py
07. Classify + dispatch
Workflow.route(classifier, {"a": agent_a, ...}). Classify the
question with a tiny model, dispatch to a specialist Agent.
Demonstrates “Agent as a workflow node” composition with
developer-controlled branching.
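The classify-then-dispatch shape reduces to a lookup table. The sketch below is a stand-in with hypothetical names: a keyword rule plays the part of the tiny classifier model, and two plain async functions play the specialist Agents.

```python
import asyncio


async def classify(question: str) -> str:
    # Stand-in for the tiny classifier model: a keyword rule.
    return "billing" if "invoice" in question.lower() else "technical"


async def billing_agent(question: str) -> str:
    return f"[billing] {question}"


async def technical_agent(question: str) -> str:
    return f"[technical] {question}"


async def route(classifier, routes, question):
    """Classify the question, then hand it to the matching specialist."""
    label = await classifier(question)
    if label not in routes:
        raise KeyError(f"classifier returned unknown label {label!r}")
    return label, await routes[label](question)


ROUTES = {"billing": billing_agent, "technical": technical_agent}
label, answer = asyncio.run(route(classify, ROUTES, "Where is my invoice?"))
```

The branching is developer-controlled: the model only picks a label, and the table decides what each label means.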
See: Workflow.route · Composition.
Read: examples/07_workflow_route.py
08. Refinement loop (cycles)
Refinement loop with cycles: draft → review → judge → (revise → review → ... → END). Shows add_router with END sentinels,
max_visits_per_node safety cap, and graceful cap-exceeded
handling via try/except RuntimeError + the in-place state dict.
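A minimal sketch of the loop-with-cap mechanics, assuming nodes mutate a shared state dict and routers return the next node name or an END sentinel. `run_graph` and the node functions are hypothetical stand-ins, not the framework’s graph builder.

```python
END = "END"


def run_graph(nodes, routers, start, state, max_visits_per_node=3):
    """Run nodes until a router returns END; nodes mutate `state` in place."""
    visits = {}
    current = start
    while current != END:
        visits[current] = visits.get(current, 0) + 1
        if visits[current] > max_visits_per_node:
            raise RuntimeError(f"{current!r} exceeded {max_visits_per_node} visits")
        nodes[current](state)
        current = routers[current](state)
    return state


def draft(state):
    state["revisions"] = 0


def review(state):
    state["score"] = state["revisions"]  # toy score: one point per revision


def judge(state):
    state["approved"] = state["score"] >= state["target"]


def revise(state):
    state["revisions"] += 1


NODES = {"draft": draft, "review": review, "judge": judge, "revise": revise}
ROUTERS = {
    "draft": lambda s: "review",
    "review": lambda s: "judge",
    "judge": lambda s: END if s["approved"] else "revise",
    "revise": lambda s: "review",
}

state = {"target": 10}  # unreachable target, so the cap will trip
try:
    run_graph(NODES, ROUTERS, "draft", state, max_visits_per_node=3)
except RuntimeError as exc:
    # The dict was mutated in place, so partial progress survives the error.
    print(f"stopped early: {exc}")
```

Because the state dict is shared rather than copied, the caller can still read the latest draft after the cap trips.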
See: Explicit graph builder, the cycles section.
Read: examples/08_workflow_loop.py
09. Workflow as tool
wf.as_tool(): the opposite composition direction. An open-ended
customer-support Agent has a deterministic refund workflow
available as a tool. The unified audit log shows the agent’s tool_call
and the workflow’s per-step entries under one user_id.
See: Composition, Direction 2.
Read: examples/09_workflow_as_tool.py
10. Architecture inside a workflow
Agent with architecture="self-refine" inside a workflow chain.
Demonstrates that workflow shape and agent architecture are
orthogonal axes. The architecture is encapsulated inside the
agent step; the workflow doesn’t see the internal draft → critique
→ refine iteration.
See: Architectures · Composition.
Read: examples/10_workflow_architecture.py
11. Custom step wrapping an Agent
An Agent wrapped in a custom async def step, for when “just call
agent.run(prev_output)” isn’t enough: multi-field prompt
formatting, capturing RunResult metadata (tokens, turns) into
workflow state, and post-processing the agent’s output.
See: The @step decorator.
Read: examples/11_workflow_custom_step.py
Production observability
The next two examples exercise the framework’s observability spine. Neither needs an API key, and both run with in-memory backends so you can inspect the captured data directly.
12. Audit log (HMAC-signed, JSONL on disk)
Builds an Agent + a Workflow with a shared FileAuditLog, runs
both, and inspects what was written. Five things this example
covers:
- Two backends behind one protocol. InMemoryAuditLog for tests and notebooks, FileAuditLog for production.
- Per-user_id filtering. audit.query(user_id="alice") is a partitioned read, not a payload scan.
- HMAC tamper detection. verify_signature(entry, secret=) returns False for any mutation of the canonical payload, and for the wrong secret. Catches both tampering and secret-rotation mistakes.
- Restart recovery. A fresh FileAuditLog against the same path scans the existing JSONL and resumes the seq counter, so new entries don’t collide.
- Dict-config form for verbatim capture (0.9.36+). Hand Agent or Workflow an audit_log={"scope_full": True, ...} dict and the resolver builds the right backend, wrapped in FullTranscriptAuditLog. Prompts, outputs, and full tool-result bodies all land in the log. The default is compliance-friendly: truncated prompts, no outputs recorded.
No API keys required. Uses EchoModel.
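The tamper-detection idea can be sketched with the standard library alone. This is not the framework’s verify_signature: the entry layout, the canonical JSON form, and the function names `canonical`, `sign`, and `verify` are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def canonical(payload: dict) -> bytes:
    """Serialise with sorted keys so the same payload always yields the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict, secret: bytes) -> str:
    return hmac.new(secret, canonical(payload), hashlib.sha256).hexdigest()


def verify(entry: dict, secret: bytes) -> bool:
    """True only if the payload is unmodified and the secret matches."""
    expected = sign(entry["payload"], secret)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, entry["signature"])


secret = b"demo-secret"
payload = {"seq": 1, "user_id": "alice", "event": "tool_call"}
entry = {"payload": payload, "signature": sign(payload, secret)}

tampered = {"payload": {**payload, "user_id": "mallory"},
            "signature": entry["signature"]}
```

With these values `verify(entry, secret)` is True, while both `verify(tampered, secret)` and `verify(entry, b"rotated-secret")` are False, which is why one check catches tampering and secret-rotation mistakes alike.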
See: Audit log attribution.
Read: examples/12_audit_log.py
13. Telemetry (four sinks, no collector required)
Runs the same scripted agent against four different telemetry sinks that ship with Loom. No OpenTelemetry SDK, no collector deploy.
- InMemoryTelemetry. Accumulates CapturedSpan / CapturedMetric records in lists. Assert on them in tests directly.
- ConsoleTelemetry. Prints span lines (with nested-trace indentation) and metric lines to a stream as they happen. “Tail my agent in dev.”
- FileTelemetry. Append-only JSONL on disk. Each line is a structured record with parent_span_id linkage. Queryable offline with jq.
- MultiTelemetry. Fan-out. Watch live in stderr AND assert on the in-memory side after.
Also demonstrates histogram-vs-counter auto-dispatch. Metric
names ending in _ms / _seconds / _bytes become histograms;
everything else becomes a counter. One emit_metric() API, the
right instrument under the hood.
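The suffix rule is simple enough to sketch. The class below is a toy, not one of Loom’s sinks; `ToyTelemetry`, `instrument_for`, and the metric names used with it are hypothetical. Only the three suffixes come from the text above.

```python
HISTOGRAM_SUFFIXES = ("_ms", "_seconds", "_bytes")


def instrument_for(name: str) -> str:
    """Pick the instrument type from the metric name's suffix."""
    return "histogram" if name.endswith(HISTOGRAM_SUFFIXES) else "counter"


class ToyTelemetry:
    """One emit_metric() entry point, two kinds of storage."""

    def __init__(self) -> None:
        self.counters = {}
        self.histograms = {}

    def emit_metric(self, name: str, value: float = 1.0) -> None:
        if instrument_for(name) == "histogram":
            # Histograms keep every observation so distributions can be computed.
            self.histograms.setdefault(name, []).append(value)
        else:
            # Counters only accumulate a running total.
            self.counters[name] = self.counters.get(name, 0.0) + value


telemetry = ToyTelemetry()
telemetry.emit_metric("llm.latency_ms", 120.0)
telemetry.emit_metric("tool.calls")
telemetry.emit_metric("tool.calls")
```

Encoding the instrument in the name keeps call sites uniform: the caller never chooses between a counter API and a histogram API.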
Uses ScriptedModel so the run is deterministic. No API key. For
production, swap the sink for OTelTelemetry; the agent code
doesn’t change.
See: Telemetry.
Read: examples/13_telemetry.py
Reasoning effort
14. Effort dial across providers (0.9.36+)
Runs the same hard reasoning question (the classic 3L / 5L jug
puzzle, minimum-steps variant) at each effort tier on Claude Opus
4.7, the only regime that honours the full enum including xhigh
and max.
Five things this demonstrates:
- Dict-config form for model=. Set the model name, default effort, and strict_effort together in one dict: model={"name": "claude-opus-4-7", "effort": "medium"}. Same shape philosophy as audit_log={...}. Read it once, configure it once.
- Equivalent explicit kwargs. Agent(model="claude-opus-4-7", effort="medium") does the same thing. Top-level kwargs win when both forms are present, so you can layer environment overrides on top of a shared config dict.
- The dial actually moves cost / latency. Token usage printed at each tier. Output tokens grow with effort because the model spends more on internal thinking.
- Per-call override. run(..., effort="high") wins over the agent default for that specific run. Lets one Agent serve cheap chit-chat and occasional deep reasoning without spinning up two.
- strict_effort=True. Wiring effort to claude-haiku-3-5 (which doesn’t support thinking) raises EffortNotSupportedError. The wiring mistake surfaces immediately. The default warn-and-drop behaviour would let it pass silently. Same dict shape: model={"name": "claude-haiku-3-5", "effort": "high", "strict_effort": True}.
Requires ANTHROPIC_API_KEY. Opus 4.7 is the showcase model
because it accepts the full enum. The dial works on any reasoning
model; this example just uses the one that lets every tier through
without clamping.
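The precedence and strictness rules above can be written out as a small resolver. This is a sketch under assumptions, not the framework’s code: `resolve_effort` is a hypothetical function, `THINKING_MODELS` is a made-up capability table, and the exception class is a local stand-in for the one named in the text.

```python
import warnings


class EffortNotSupportedError(ValueError):
    """Local stand-in for the error named in the text."""


# Hypothetical capability table, for illustration only.
THINKING_MODELS = {"claude-opus-4-7"}


def resolve_effort(model, effort=None, run_effort=None):
    """Return the effort to send, or None if there is none to send."""
    cfg = {"name": model} if isinstance(model, str) else dict(model)
    # Precedence: per-call override, then top-level kwarg, then the config dict.
    candidates = (run_effort, effort, cfg.get("effort"))
    chosen = next((e for e in candidates if e is not None), None)
    if chosen is None or cfg["name"] in THINKING_MODELS:
        return chosen
    if cfg.get("strict_effort", False):
        raise EffortNotSupportedError(f"{cfg['name']} does not accept effort={chosen!r}")
    # Default behaviour: warn and drop the setting.
    warnings.warn(f"dropping effort={chosen!r} for {cfg['name']}")
    return None
```

With strict mode off, a misconfigured model still runs, just without the requested effort; strict mode turns that silent downgrade into an immediate failure.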
See: Reasoning effort · Agent reference: effort.
Read: examples/14_effort_dial.py
Declarative config
15. Build an Agent from a TOML / dict config (0.9.37+)
Writes a complete agent.toml, builds an Agent from it with
Agent.from_config(path), and asserts every backend resolved to the
expected concrete class. Then does the same with Agent.from_dict(cfg)
to show the in-memory form. Runs offline against EchoModel. No API
key needed.
Five things this demonstrates:
- Agent.from_config(path). Reads a TOML file. One declaration covers model, memory, runtime, telemetry, audit log, permissions, budget, architecture, effort, skills, and MCP servers.
- Agent.from_dict(cfg). Same shape, parsed in memory. Useful when your config comes from Pydantic BaseSettings, a YAML file you already parsed, or a service-config endpoint.
- Backend tables. [memory] / [runtime] / [telemetry] / [audit_log] / [permissions] / [budget] each go through the same resolver Agent(...) uses for kwargs. String shorthand works too: telemetry = "memory", permissions = "strict".
- Arrays of tables. [[skills]] loads skill bundles by path. [[mcp]] connects to MCP servers (stdio or HTTP transport).
- Kwarg override. Real callables (tools, hooks, secret stores, retry policies) come through Python kwargs since TOML can’t express them. When a kwarg and a config entry collide, the kwarg wins. That’s the per-environment override path.
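The kwarg-wins rule amounts to a dict merge in which explicit arguments are applied last. The sketch below is not Agent.from_dict; `build_settings` and `lookup_order` are hypothetical, and the config dict just stands in for what a parsed TOML file would look like.

```python
def build_settings(cfg: dict, **kwargs) -> dict:
    """Merge a shared config with per-call kwargs; kwargs win on collision."""
    settings = dict(cfg)     # copy, so the shared config is never mutated
    settings.update(kwargs)  # explicit kwargs override config entries
    return settings


def lookup_order(order_id: str) -> str:
    """A real callable: something a TOML file cannot express."""
    return f"order {order_id}: shipped"


shared = {"model": "gpt-4.1-mini", "telemetry": "memory", "permissions": "strict"}

# Per-environment override: swap the model and attach a tool in code.
prod = build_settings(shared, model="claude-opus-4-7", tools=[lookup_order])
```

Entries that are not overridden, such as permissions here, pass through unchanged, so one shared file can serve several environments.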
See: Config file (TOML / dict) · Agent reference: from_config.
Read: examples/15_config_file.py
Multi-agent workspace
16. Shared notebook for multi-agent teams (0.9.39+)
A research team — four specialists plus a synthesizer —
collaborates on one question through a shared notebook. Each
specialist runs in parallel and writes exactly one findings note.
None of them sees the others’ transcripts. The synthesizer then
reads everyone’s notes via list_notes / read_note and writes
the final recommendation into the same notebook.
What this shows:
- No history sharing. Specialists run hermetically. Only their curated notes cross between agents, not raw transcripts.
- Synthesizer context stays small. It reads N ~100-token notes instead of N full transcripts.
- Auto-attribution. Each agent’s notes are tagged with its team role because the author identity is baked into the workspace tools’ closure. The agent never types its own name.
- Filesystem-mounted. WORKSPACE.md regenerates atomically. You can cat it during or after the run.
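Auto-attribution through a closure can be sketched as follows. This is a stand-in, not the workspace API: `make_note_tools` is hypothetical, and the tool signatures are assumptions that merely reuse the list_notes / read_note names from the text.

```python
def make_note_tools(notes: list, author: str):
    """Build note tools for one agent; `author` is captured by the closure."""

    def write_note(title: str, body: str) -> None:
        # The agent supplies only title and body; authorship comes from the closure.
        notes.append({"author": author, "title": title, "body": body})

    def list_notes() -> list:
        return [(n["author"], n["title"]) for n in notes]

    def read_note(title: str) -> str:
        return next(n["body"] for n in notes if n["title"] == title)

    return write_note, list_notes, read_note


notebook = []  # shared by the whole team
write_as_analyst, _, _ = make_note_tools(notebook, author="market-analyst")
write_as_lawyer, _, _ = make_note_tools(notebook, author="legal")
_, list_notes, read_note = make_note_tools(notebook, author="synthesizer")

write_as_analyst("Market size", "Roughly flat year over year.")
write_as_lawyer("Licensing", "No blocking restrictions found.")
```

Since the author is fixed when the tools are built, an agent cannot mislabel a note even if its prompt is wrong about its own role.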
Uses gpt-4.1-mini, so the demo is cheap (~$0.02 per run).
Requires OPENAI_API_KEY.
See: Workspace.
Read: examples/16_shared_workspace.py
17. Prompt caching (0.9.41+)
Runs the same big system prompt twice, back to back, with
prompt_caching=True. The second run shows non-zero
cached_tokens_in and a markedly lower cost_usd. Live cost
evidence, not a claim.
Covers the boolean form on both Anthropic and OpenAI, the dict form
for advanced control (ttl="1h", cache_key), and the per-provider
behavior (Anthropic injects cache_control markers; OpenAI is
automatic and the flag buys accurate accounting plus routing).
Requires ANTHROPIC_API_KEY or OPENAI_API_KEY.
See: Prompt caching.
Read: examples/17_prompt_caching.py
18. Living plan (0.9.42+)
Walks a living_plan=True agent through the TodoWrite discipline:
commit a plan with plan_write, do work with a tool, rewrite the
plan to mark the step done with a finding, emit a final message.
After the run it inspects the workspace and confirms the plan
mirrored to a kind="plan" note.
Also shows pre-seeding: pass a constructed LivingPlan so the next
run starts with a plan already in place. Uses a ScriptedModel, so
it runs offline. No API key.
See: Living plan.
Read: examples/18_living_plan.py
19. Workspace lifecycle (0.10.0+)
Exercises all eight v0.10 workspace features in one offline script:
namespacing, versioning, archive, questions, semantic search (with
a tiny deterministic stub embedder), citation tracking plus outcome
attribution, relevance-aware search, and citation-aware prune().
Fully offline. Uses InMemoryWorkspace. No API key.
See: Workspace lifecycle.
Read: examples/19_workspace_lifecycle.py
Sample data
The PDF-based examples (01, 02) generate small sample PDFs on
first run (via reportlab) and cache them under examples/data/.
The on-disk Chroma indices are also cached, so subsequent runs only
re-execute the agent loop against the model.
Examples 06–11 ship with the workflow primitive overhaul. They
exercise the developer-controlled DAG side of the framework, the
peer to the LLM-controlled Agent. Each is meant to be readable
end-to-end in under five minutes. For more focused snippets see
Recipes; for the conceptual overview see
Workflow.