RAG

Loom ships its own loader + vector-store stack so you can build retrieval-augmented agents without pulling in LangChain. Everything is async-first, typed, and works with the same @tool-as-retriever pattern an Agent already understands.


from loomflow import Agent, tool
from loomflow.memory.embedder import OpenAIEmbedder
from loomflow.vectorstore import ChromaVectorStore
from loomflow.loader import load, RecursiveChunker
 
# 1. Load + chunk
doc = load("research.pdf")
chunks = RecursiveChunker(chunk_size=600).split(doc.content)
 
# 2. Embed + store
store = ChromaVectorStore.local("./chroma-db", embedder=OpenAIEmbedder())
await store.add(chunks)
 
# 3. Retriever as a tool
@tool
async def search_docs(query: str) -> str:
    """Search the indexed PDFs."""
    hits = await store.search(query, k=4)
    return "\n\n---\n\n".join(h.chunk.content for h in hits)
 
# 4. Agent uses the tool
agent = Agent(
    "Answer using the indexed docs. Cite the source filename.",
    model="claude-opus-4-7",
    tools=[search_docs],
)
result = await agent.run("What's our retention metric definition?")

Three layers, one pipeline

PDF / DOCX / Excel / CSV / TSV / Markdown / HTML / text. All normalize to Document(content=<markdown>).Document loaders Recursive · Markdown · Sentence · Token. Strategy depends on the source format.Chunkers InMemory / Chroma / Postgres / FAISS. Async surface, MMR diversity, BM25 hybrid search, Mongo-style filters.Vector stores Build a RAG agent over a folder of PDFs from scratch.End-to-end tutorial

Optional dependencies

The RAG primitives are gated behind extras so the base install stays lean:


pip install 'loomflow[loader]'              # all loaders (unstructured for PDF)
pip install 'loomflow[loader-pdf]'          # unstructured[pdf]
pip install 'loomflow[loader-pdf-docling]'  # docling — alt PDF backend
pip install 'loomflow[loader-docx]'         # python-docx
pip install 'loomflow[loader-excel]'        # openpyxl
pip install 'loomflow[loader-html]'         # beautifulsoup4
 
pip install 'loomflow[vectorstore]'         # all vector stores
pip install 'loomflow[vectorstore-chroma]'  # chromadb
pip install 'loomflow[vectorstore-postgres]'# asyncpg + pgvector
pip install 'loomflow[vectorstore-faiss]'   # faiss-cpu + numpy

Each loader / store raises a helpful ImportError if its dependency isn’t installed.

Memory vs Vector stores. Memory (the memory= kwarg on Agent) is for conversational state. Episodes, working blocks, and Facts extracted from chats. VectorStore is for arbitrary document corpora. The things your agent retrieves from but doesn’t build up. They’re separate primitives with separate backends.