Skip to Content
DocsRAGVector stores

Vector stores

Unified async interface modeled on LangChain’s VectorStore but properly async-first and typed against the framework’s Chunk / Document.

Backends

BackendUse when
InMemoryVectorStoreDefault. Zero deps. Cosine over a Python list. Up to ~10K chunks before search latency bites.
ChromaVectorStorePersistent on-disk or hosted Chroma. Lazy import.
PostgresVectorStoreProduction durable. pgvector + asyncpg. Lazy import.
FAISSVectorStoreFast in-memory ANN over large corpora. Lazy import.
from loomflow.memory.embedder import OpenAIEmbedder from loomflow.vectorstore import ChromaVectorStore, FAISSVectorStore, InMemoryVectorStore, PostgresVectorStore # In-memory store = InMemoryVectorStore(embedder=OpenAIEmbedder()) # Chroma — persistent on-disk store = ChromaVectorStore.local("./chroma-db", embedder=OpenAIEmbedder()) # Postgres — production durable store = await PostgresVectorStore.connect( dsn="postgres://user:pw@host/db", table="my_chunks", embedder=OpenAIEmbedder(), ) # FAISS — fast ANN store = FAISSVectorStore(embedder=OpenAIEmbedder())

The protocol

Every backend supports:

async def add(chunks: list[Chunk]) -> list[str]: ... # returns ids async def delete(ids: list[str]) -> None: ... async def search( query: str, *, k: int = 4, filter: Mapping[str, Any] | None = None, diversity: float | None = None, # MMR, [0, 1] ) -> list[SearchResult]: ... async def search_by_vector( vector: list[float], *, k: int = 4, filter: Mapping[str, Any] | None = None, ) -> list[SearchResult]: ... async def count() -> int: ... async def get_by_ids(ids: list[str]) -> list[Chunk]: ...

A SearchResult carries the matched chunk plus the similarity score (cosine in [-1, 1]; backend-specific for other metrics).

Filters (Mongo-style)

# equality results = await store.search("how to ship", k=5, filter={"source": "release.md"}) # range results = await store.search("...", filter={"page": {"$gte": 5}}) # membership results = await store.search("...", filter={"tag": {"$in": ["draft", "final"]}}) # composition results = await store.search("...", filter={ "$and": [{"source": "spec.md"}, {"section": "auth"}], })

Operators: $eq (default), $ne, $gt, $gte, $lt, $lte, $in, $nin, $and, $or, $not.

Diversity (MMR)

Pass diversity in [0, 1] to rerank with Maximal Marginal Relevance:

results = await store.search("...", k=8, diversity=0.4)
  • None (default). Plain top-k by similarity.
  • 0.0. Same as None.
  • 1.0. Maximum diversity (may sacrifice relevance).
  • 0.3..0.5. What most users want when they want diversity.

The framework picks a 0..1 diversity scale (not LangChain’s inverted lambda_mult) because more diverse → bigger number is intuitive.

Hybrid search (in-memory only, today)

InMemoryVectorStore.search_hybrid() combines BM25 lexical scores with vector similarity via Reciprocal Rank Fusion:

results = await store.search_hybrid("recurring billing error", k=8)

Useful when the query has acronyms / IDs / proper nouns that embeddings miss but BM25 catches.

Persistence (in-memory only)

InMemoryVectorStore.save("./store.json") / .load("./store.json") round-trips the store to JSON. Useful for tests and small reproducible demos. For production, use Chroma / Postgres / FAISS instead.

Embedders

Vector stores need an embedder that satisfies the Embedder protocol:

from loomflow import HashEmbedder from loomflow.memory.embedder import CohereEmbedder, OpenAIEmbedder, VoyageEmbedder # Production: embedder = OpenAIEmbedder("text-embedding-3-small") embedder = OpenAIEmbedder("text-embedding-3-large") embedder = VoyageEmbedder("voyage-3") embedder = CohereEmbedder("embed-english-v3.0") # Zero-key dev / tests: embedder = HashEmbedder() # deterministic, no network, low quality

For custom embedders see Custom embedder recipe.

Embedder used at add() AND search(). Both code paths embed the text against the same embedder; mismatched embedders between writes and reads silently produces near-zero similarity. Pin the embedder at store construction.

Last updated on