RAG
Loom ships its own loader + vector-store stack so you can
build retrieval-augmented agents without pulling in LangChain.
Everything is async-first, typed, and works with the same
@tool-as-retriever pattern an Agent already understands.
from loomflow import Agent, tool
from loomflow.memory.embedder import OpenAIEmbedder
from loomflow.vectorstore import ChromaVectorStore
from loomflow.loader import load, RecursiveChunker

# The awaits below need an async context (a notebook, or a coroutine run with asyncio.run).

# 1. Load + chunk
doc = load("research.pdf")
chunks = RecursiveChunker(chunk_size=600).split(doc.content)

# 2. Embed + store
store = ChromaVectorStore.local("./chroma-db", embedder=OpenAIEmbedder())
await store.add(chunks)

# 3. Retriever as a tool
@tool
async def search_docs(query: str) -> str:
    """Search the indexed PDFs."""
    hits = await store.search(query, k=4)
    return "\n\n---\n\n".join(h.chunk.content for h in hits)

# 4. Agent uses the tool
agent = Agent(
    "Answer using the indexed docs. Cite the source filename.",
    model="claude-opus-4-7",
    tools=[search_docs],
)
result = await agent.run("What's our retention metric definition?")

Three layers, one pipeline
Document loaders: PDF / DOCX / Excel / CSV / TSV / Markdown / HTML / text. All normalize to Document(content=<markdown>).
Chunkers: Recursive · Markdown · Sentence · Token. Strategy depends on the source format.
Vector stores: InMemory / Chroma / Postgres / FAISS. Async surface, MMR diversity, BM25 hybrid search, Mongo-style filters.
End-to-end tutorial: Build a RAG agent over a folder of PDFs from scratch.

Optional dependencies
The RAG primitives are gated behind extras so the base install stays lean:
pip install 'loomflow[loader]' # all loaders (unstructured for PDF)
pip install 'loomflow[loader-pdf]' # unstructured[pdf]
pip install 'loomflow[loader-pdf-docling]' # docling — alt PDF backend
pip install 'loomflow[loader-docx]' # python-docx
pip install 'loomflow[loader-excel]' # openpyxl
pip install 'loomflow[loader-html]' # beautifulsoup4
pip install 'loomflow[vectorstore]' # all vector stores
pip install 'loomflow[vectorstore-chroma]' # chromadb
pip install 'loomflow[vectorstore-postgres]' # asyncpg + pgvector
pip install 'loomflow[vectorstore-faiss]' # faiss-cpu + numpy

Each loader / store raises a helpful ImportError if its dependency isn’t installed.
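The "helpful ImportError" behaviour is a common pattern for optional extras. What follows is a minimal sketch of that general pattern, not Loom's actual code: the helper name `require` and the wording of the message are assumptions made up for illustration.

```python
import importlib
from types import ModuleType


def require(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper: the name and message format are illustrative only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with the extra that would pull the dependency in.
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install it with: pip install 'loomflow[{extra}]'"
        ) from exc


# A module that is present imports normally.
json_mod = require("json", "loader")

# A missing module surfaces the install hint instead of a bare traceback.
try:
    require("surely_not_installed_pkg_xyz", "vectorstore-chroma")
except ImportError as err:
    hint = str(err)
```

Deferring the import to the point of use is what keeps the base install lean: the dependency is only needed when the corresponding loader or store is actually constructed.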
Memory vs Vector stores. Memory (the memory= kwarg on
Agent) is for conversational state: episodes, working blocks,
and Facts extracted from chats. VectorStore is for arbitrary
document corpora: the things your agent retrieves from but
doesn’t build up. They’re separate primitives with separate
backends.
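On the retrieval side, the MMR diversity listed among the vector-store features refers to maximal marginal relevance: each pick trades relevance to the query against similarity to results already chosen. The sketch below is the textbook formulation in plain Python, assuming cosine similarity and a weight `lam`. It is not Loom's implementation, and the names `cosine` and `mmr` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query: Vector, candidates: list[Vector], k: int, lam: float = 0.5) -> list[int]:
    """Return indices of up to k candidates chosen by maximal marginal relevance."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Penalise closeness to anything already picked.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [0.8, 0.6]
docs = [
    [1.0, 0.0],   # 0: relevant
    [0.98, 0.2],  # 1: most relevant, near-duplicate of 0
    [0.0, 1.0],   # 2: less relevant, but covers a different direction
]

# Plain top-k by relevance returns the two near-duplicates.
top_k = sorted(range(len(docs)), key=lambda i: cosine(query, docs[i]), reverse=True)[:2]

# MMR keeps the best hit, then prefers the candidate that adds something new.
picked = mmr(query, docs, k=2, lam=0.5)
```

With these toy vectors, plain top-k selects indices 1 and 0, while MMR selects 1 and 2, which is the point of the diversity option: fewer near-identical chunks in the context handed to the agent.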